In early July, Meta made headlines by offering a staggering USD 200 million annual package to lure Pang Ruoming, head of Apple’s foundation model team, away from the company. The move ignited speculation across Silicon Valley and Chinese tech circles alike: why would Meta splash out that kind of cash on someone who, from the outside, appeared to have achieved little at Apple?
Two years into Apple’s artificial intelligence endeavors, what has the company actually accomplished?
According to information obtained by 36Kr, Pang led Apple’s foundation model (AFM) team, which achieved promising results in model development. But because of Apple’s highly closed ecosystem, nothing could be revealed to the public until the technology shipped in a product. As a result, the team’s work never saw the light of day.
No one feels this wasted potential more than Pang himself.
Apple’s slow AI march
Pang joined Apple in 2021 to head its in-house foundation model efforts. From the birth of Apple Intelligence in 2023 through 2025, the field of large models evolved at breakneck speed, but Apple failed to keep pace. Its AI efforts drew mounting criticism for being lackluster and out of step.
On July 12, just before his departure, Pang reportedly made a final appeal to Apple’s software engineering (SWE) division, asking for permission to publicly share the results of his team’s work. The request was denied.
While the world races ahead, Apple’s AI strategy has remained cautious. Since 2023, multiple promises around AI functionality have gone unfulfilled. Though AI features are now embedded in Apple’s operating systems, they are widely perceived as uninspired. Public expectations for Apple’s AI have shifted largely from hope to disillusionment.
But insiders insist this isn’t due to a lack of talent. Rather, it was the system that failed.
And so, Pang left in disappointment.
A DeepSeek-like team that’s 80% Chinese
The AFM team, led by Pang, was Apple’s boldest bet on foundation model development. Following Meta’s offer, Pang is now the most expensive Chinese AI scientist in Silicon Valley.
Pang spent 15 years at Google as a principal software engineer. While there, he co-led the development of the Babelfish/Lingvo framework alongside Yonghui Wu and Zhifeng Chen. The framework was eventually adopted by over 1,000 internal projects at Google, surpassing even DeepMind and AdBrain in usage. Pang also made core contributions to Google’s neural text-to-speech (TTS) systems.
“Ruoming is the kind of scientist who makes everyone around him better,” one former colleague told 36Kr.
After joining Apple in 2021, Pang led a team of around 80 engineers under the AFM group. While most members stayed out of the spotlight, the lineup included a who’s who of former Google researchers. In addition to Wu and Chen, it included Zirui Wang, Chung-Cheng Chiu, Guoli Yin, Yinfei Yang, Nan Du, Chong Wang, Mark Lee, Bowen Zhang, and Tom Gunter. Most of them were among the team’s earliest members.
With Pang’s exit, leadership has now shifted to Chen. Apple’s AFM group is moving from a centralized model, in which most engineers reported directly to Pang, to a more distributed structure with multiple managers reporting upward.
Chen, a Fudan University alumnus, earned his master’s and doctoral degrees from Princeton University and the University of Illinois Urbana-Champaign, respectively. He joined Google in 2005. Along with Pang and Wu, Chen was one of the earliest Chinese leaders at Google Brain. His academic work, particularly in machine learning and distributed systems, has garnered over 110,000 citations and includes key contributions to TensorFlow.
According to sources, Apple had briefly considered outsourcing large model development to players like OpenAI or Anthropic. But after internal deliberations, the company decided to double down on its own models. It was during this crisis of confidence in early 2025 that Pang called in Chen, his longtime collaborator from Google, to bolster the AFM effort.
Zirui Wang, another early AFM member, had worked with Pang at Google Brain. After joining Google in 2020, Wang reported to Wu and collaborated with Jiahui Yu, the multimodal specialist now at Meta Superintelligence Labs (MSL), who was also offered a USD 100 million annual package.
At Google, Wang helped develop the Contrastive Captioner (CoCa) vision-language model alongside Yu and Wu. The model, cited over 1,700 times, made important strides in aligning visual and textual data, advancing multimodal AI.
After joining Apple, Wang developed the first version of the AFM model, which later evolved into Apple Intelligence. In 2024, he moved to Elon Musk’s xAI to lead post-training for Grok 3. He now oversees post-training efforts at Apple’s AI division.
Other members of the original AFM team included Guoli Yin and Tom Gunter, both longtime Apple engineers. Yin, a Stanford graduate, was part of the early Apple inference engine team and later helped build the company’s suggestion system for internal search. He joined the AFM team when it had just five members and now leads efforts on agents, APIs, and post-training.
Key contributors also included multimodal lead Yinfei Yang, whose 2021 paper on vision-language representation learning with noisy text supervision has been cited more than 4,000 times, and Zhe Gan, who recently joined the AFM effort. Gan’s work on UNITER (universal image-text representation) and image-to-text generation has been cited over 26,000 times.
Nan Du, a specialist in mixture-of-experts (MoE) architectures, is a principal researcher at Apple and was previously a senior scientist at Google, where he worked on the GLaM model as well as PaLM 2 and Magi.
A story of hesitation and missed timing
Meta’s USD 200 million offer might seem exorbitant, but it reflects the depth of talent within Apple’s AI division, much of which has gone unnoticed because of the company’s internal secrecy.
One source told 36Kr that the AFM team had been training large models since early 2023 and had already built models with hundreds of billions of parameters. Their capabilities were said to be close to DeepSeek’s.
Despite the team’s technical wins, internal obstacles continued to mount. Pang, who originally reported to Daphne Luong, a deputy to Apple’s AI chief John Giannandrea, ended up appealing to SWE head Craig Federighi for public release rights.
Neither Giannandrea nor Luong responded publicly or internally, according to email exchanges reviewed by sources.
Federighi, senior vice president of SWE at Apple, has long been one of the company’s most powerful voices. Though Giannandrea was recruited from Google in 2018 with the aim of supercharging Apple’s AI efforts, he found his initiatives repeatedly blocked—often by Federighi.
According to Bloomberg reporter Mark Gurman, Federighi was deeply skeptical of AI and hesitant to approve large-scale investments. Even as ChatGPT gained mainstream traction in 2023, Apple’s internal AI budget remained conservative compared with what rivals such as OpenAI were spending.
Giannandrea’s Siri team was transferred to the SWE division in March 2025, a sign that Tim Cook had lost confidence in his execution.
Loyalty to hardware, disregard for talent
Alibaba co-founder Jack Ma once said companies lose talent for one of two reasons: either the money’s not right, or they feel underappreciated.
Pang’s departure was arguably driven by both.
One source close to Apple noted that Federighi insists on maintaining “Apple standards,” which means no AI research can be published unless it ships in a product. Even then, what ships is often a scaled-down version of what the engineers built.
“Craig would rather let people say Apple is bad at AI than let them say Apple builds bad AI services,” the source said.
Apple’s closed ecosystem created another constraint: the models had to run on Apple’s own hardware. But the performance of its custom chips has yet to match that of industry-leading GPUs like Nvidia’s H100. This forced the AFM team to continually downgrade their models just to make them deployable.
Federighi himself acknowledged this in an internal email to Pang, writing that Apple was forcing huge compromises to run models on its own hardware.
While Apple’s chip legacy in mobile and laptop computing is well-known, its AI chip development is still in its early stages. According to internal projections, Apple’s next-generation AI chips may only match today’s H100-level performance.
Ultimately, Apple’s insistence on in-house development, combined with its aversion to publicizing R&D, became a liability. It trapped top-tier scientists like Pang in a system where great work couldn’t be seen or celebrated.
This is Apple’s AI paradox: the company’s desire to guard its legacy now threatens its future.
KrASIA Connection features translated and adapted content that was originally published by 36Kr. This article was written by Chen Jiahui for 36Kr.