Toward the end of September in Hangzhou, the rivalry between Alibaba and ByteDance over dominance in artificial intelligence cloud services filled the city with palpable tension.
During Alibaba Cloud’s annual Apsara Conference, visitors could hardly miss the massive Volcano Engine ads blanketing Hangzhou Xiaoshan International Airport and Hangzhou East Railway Station. The bold text read:
“Holding 46.4% of China’s public cloud large model market share.”
The phrasing was subtle yet deliberate. Volcano Engine avoided superlatives such as “largest” or “number one,” but the message was unmistakable: with roughly ten major cloud providers in China, claiming nearly half the market speaks for itself.
Some industry insiders, however, questioned the claim, noting that Volcano Engine’s figure was based on model-as-a-service (MaaS) usage, which mainly covers closed-source models. Customers who buy infrastructure or platform resources and deploy open-source models are excluded, meaning MaaS consumption alone does not capture the full competitive picture.
But Volcano Engine is hardly unique. In today’s cloud market, nearly every major player can call itself a leader, depending on the metric.
Across ads and research reports, Alibaba, ByteDance, and Baidu have each claimed first place by highlighting different indicators. The contest appears to have reached a standoff, prompting the question: who truly leads China’s AI cloud market?
Some build the “kitchen,” others deliver the “meal”
In a sense, all these companies are right. The debate is less about data accuracy than about definitions—the boundaries each player sets and the metrics they choose to measure success.
Volcano Engine leads in public cloud large model invocation volume, counting every call made to large models on its public cloud platform. That includes ByteDance’s own Doubao model (excluding internal Douyin and Doubao usage) as well as many third-party models. Every token generated on the platform contributes to its market share.
Alibaba Cloud takes a broader approach, measuring total revenue across its full technology stack, from infrastructure to platform and model services.
Baidu, by contrast, focuses on revenue from productized and industry-tailored AI solutions. It has even cited “first place in both contract volume and value for large model projects in the first half of 2025.”
Meanwhile, Z.ai (formerly Zhipu AI) reported that after launching its GLM-4.5 model, its invocation revenue—based on OpenRouter data—surpassed the combined total of all other domestic models.
Each approach is valid in context. Cloud computing encompasses multiple layers and business models, so no single metric captures the full market.
For example, a small enterprise might rent virtual servers monthly, while an individual developer might skip servers entirely and access AI tools via API calls. Likewise, large model usage varies widely, from startups experimenting with APIs to enterprises running integrated AI systems.
Volcano Engine’s approach resembles a vast food delivery chain. It only cares about delivering the dish (the model output). Customers don’t need to know what’s happening in the kitchen, whether the model uses ByteDance’s Doubao, what chips it runs on, or what inference framework powers it. All that matters is how fast and how well the dish arrives.
That’s why Volcano Engine often emphasizes performance. After DeepSeek launched its V3.1 model in August, Volcano Engine promoted its strength in time per output token (TPOT) and tokens per minute (TPM), which are key measures of throughput.
TPM measures how many tokens a model can process per minute, an indicator of serving capacity. In a typical workload such as document summarization or code generation, where each user consumes around 500 tokens per minute, a system with 5 million TPM of throughput can serve roughly 10,000 users simultaneously.
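The capacity arithmetic above can be sketched in a few lines. This is only a back-of-envelope illustration using the article's own numbers (500 tokens per user per minute, 5 million TPM); real platforms also gate concurrency on rate limits and latency targets, not raw throughput alone.

```python
# Rough capacity estimate: TPM (tokens per minute) throughput divided by
# the per-user token consumption rate bounds how many users a platform
# can serve at once. Figures below are the article's illustrative values.

def concurrent_users(tpm: int, tokens_per_user_per_min: int = 500) -> int:
    """Upper bound on simultaneously served users at a given TPM."""
    return tpm // tokens_per_user_per_min

print(concurrent_users(5_000_000))  # → 10000
```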
This model suits traffic-driven AI services and small developer projects where demand is elastic and scaling speed matters most.
Alibaba and Baidu, however, focus on total revenue, akin to running a full-service kitchen. Clients can rent high-end equipment (infrastructure), order ready-made dishes (model services), or combine both.
Selling models alone rarely ensures defensibility. As one AI engineer told 36Kr:
“It’s like someone who shops at a supermarket every week. If they need a cup, unless another store offers something extraordinary, they’ll probably just grab it there.”
Customer stickiness in cloud and AI services comes from integrated solutions: APIs combined with databases, virtual machines, and data management tools.
The age of differentiation
Despite fierce competition, China’s large model market remains relatively small.
In May 2024, Volcano Engine ignited the MaaS market by slashing prices. Its Doubao Pro-32K flagship model dropped to RMB 0.0008 (USD 0.0001) per 1,000 tokens, a 99.3% cut.
The move triggered a price war. Alibaba, Tencent, and Baidu quickly followed, and Volcano Engine's daily token consumption soared from 120 billion to more than 500 billion.
By the first half of 2025, IDC reported that China’s total large model invocations reached 536.7 trillion tokens. Using the Doubao Pro-128K model’s current price of RMB 0.0005 (USD 0.00007) per 1,000 tokens, the entire MaaS market is worth roughly RMB 500–600 million (USD 70–84 million).
For perspective, Alibaba Cloud earned more than RMB 80 billion (USD 11.2 billion) in 2024, underscoring how small MaaS still is within the broader cloud sector.
Today, each company’s strategy reflects its conviction about the future.
Cloud computing is a scale-driven business powered by network effects. Alibaba Cloud, with its established infrastructure and enterprise relationships, focuses on full-stack integration. Volcano Engine, a newer player, faces steeper competition in CPU-based services and is instead betting on GPU-based growth and MaaS scalability.
In May, Volcano Engine CEO Tan Dai said, “The marathon has only just reached the 500-meter mark,” predicting the market could grow a hundredfold. The company recently reported that Doubao’s daily token usage now exceeds 16.4 trillion, 137 times higher than when it first launched in May 2024.
But who’s really number one in the business? There’s still no consensus.
At the Apsara Conference, Alibaba Cloud argued that measuring only public cloud invocation volume is “like seeing just the tip of the iceberg.” Many clients deploy open-source Qwen models privately on their own servers, the company said, so their usage isn’t captured in external data.
Alibaba Cloud instead measures total revenue, including GPU rentals, platform, and model services. By this metric, it claims the top spot.
To further clarify model strength, Alibaba suggested differentiating between providers that sell self-developed models and those that resell third-party ones, as well as between models used for complex reasoning and simpler labeling tasks.
A Frost & Sullivan report published in September echoed this logic. Based on daily invocation shares of self-developed models in the first half of 2025, Alibaba’s Qwen ranked first with 17.7%, followed by ByteDance’s Doubao at 14.1% and DeepSeek at 10.3%.
The price war will not last forever. Large model developers are moving from competition on infrastructure and token volume to competing on performance, service depth, and user experience.
In 2024, price cuts drove token growth. By 2025, that strategy has lost momentum. Tokens are already cheap; businesses now care more about efficiency and outcomes.
For smaller model developers unable to compete on scale, differentiation has become critical.
Recent examples illustrate the trend. Moonshot AI’s K2 model, optimized for coding tasks, saw rapid adoption thanks to its strong price-performance ratio. Z.ai’s “GLM Coding Plan” has also expanded concurrent programming capacity.
Ultimately, business competition in the AI cloud market follows a familiar rule: when you can’t win by your rival’s metrics, redefine the game by your own. And in today’s large model race, the only metric that truly matters is how many customers are willing to pay.
KrASIA Connection features translated and adapted content that was originally published by 36Kr. This article was written by Deng Yongyi for 36Kr.

