
OpenAI Releases Jalapeño, Its First Self-Developed Inference Chip: A Paradigm Shift and Competitive Restructuring of the AI Computing Industry Chain

An in-depth analysis of how Jalapeño, the first self-developed AI inference chip jointly released by OpenAI and Broadcom, affects the semiconductor industry chain, the technology roadmap, the competitive landscape, and regional supply chains.

Event Overview

On June 24, 2026, OpenAI and Broadcom jointly announced their first self-developed AI inference chip, Jalapeño. The chip is an application-specific integrated circuit (ASIC) optimized for large language model inference. OpenAI is responsible for the underlying architecture design, Broadcom handles silicon implementation and networking hardware, and Canadian electronic manufacturing services provider Celestica handles board and rack system integration. OpenAI claims that Jalapeño's energy efficiency ratio (performance per watt) will surpass the current state-of-the-art. Engineering samples have already been running models such as GPT-5.3, Codex, and Spark in the lab at mass production target frequencies and power consumption.

This event is not isolated. On the same day, NVIDIA CEO Jensen Huang emphasized at the annual shareholders' meeting that the "era of practical AI" has arrived, and disclosed that the Vera Rubin platform is in full production. Google, through its self-developed TPU series, has shown significant cost and margin advantages from tighter control of computing costs and hardware-software co-optimization. Anthropic has tied itself closely to the computing infrastructure of Amazon and Google. The "arms race" in AI computing is shifting from model scale to self-reliance in computing infrastructure.

Background: The Inevitable Path from Renting Compute to Self-Developed Chips

In its early days, OpenAI relied entirely on NVIDIA GPUs in Microsoft Azure clusters for training and inference, maintaining its lead by "converting capital into computing power." However, as the parameter count of the GPT series grew exponentially (GPT-5.3 is expected to have over one trillion parameters), inference costs rose sharply. According to industry estimates, electricity accounts for over 30% of the operating expenses of serving inference on a GPT-4-level model. At the same time, NVIDIA GPU supply remains tight: Blackwell architecture capacity is allocated first to large cloud vendors, leaving small and medium AI companies waiting for supply.

In this context, OpenAI's self-development of an inference chip becomes a natural extension of its long-term full-stack infrastructure strategy. The positioning of Jalapeño is not to completely replace NVIDIA GPUs, but to focus on the inference scenario—the most costly and frequent aspect of AI deployment. Through a custom ASIC, OpenAI is expected to improve inference energy efficiency by 3-5 times (based on similar design experience), thereby significantly reducing marginal costs.
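The link between energy efficiency and operating cost can be made concrete with a rough back-of-the-envelope sketch. It assumes the electricity share of operating expenses is exactly 30% (the article says "over 30%"), that an efficiency gain of k times cuts the energy bill for the same workload to 1/k, and that every other cost, including the chips themselves, stays unchanged. The function name `opex_reduction` and its parameters are illustrative, not taken from any OpenAI or Broadcom material.

```python
def opex_reduction(electricity_share: float, efficiency_gain: float) -> float:
    """Fraction of operating expenses saved when only the electricity bill shrinks.

    electricity_share: fraction of opex spent on electricity (0 to 1).
    efficiency_gain:   performance-per-watt multiplier; 3.0 means 3x.
    """
    if not 0.0 <= electricity_share <= 1.0:
        raise ValueError("electricity_share must be between 0 and 1")
    if efficiency_gain <= 0:
        raise ValueError("efficiency_gain must be positive")
    # The same workload at k-times efficiency needs 1/k of the energy,
    # so the saved share of opex is f * (1 - 1/k).
    return electricity_share * (1.0 - 1.0 / efficiency_gain)


# Assumed inputs: 30% electricity share, and the 3-5x range quoted above.
for gain in (3.0, 5.0):
    saved = opex_reduction(0.30, gain)
    print(f"{gain:.0f}x efficiency -> {saved:.0%} lower operating expenses")
```

Under these simplified assumptions, a 3-5x efficiency gain lowers operating expenses by roughly 20-24%. The saving can never exceed the electricity share itself, so any larger reduction in marginal cost would have to come from other sources, such as cheaper hardware or higher utilization.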

In-Depth Analysis

Technology Impact

Jalapeño is an inference-specific ASIC whose technical approach differs fundamentally from GPUs:

  • Architecture: It uses a dataflow architecture instead of SIMT, hardening matrix multiplication and attention mechanisms for Transformer models. This avoids the overhead of general-purpose compute units found in GPUs, but sacrifices flexibility.
  • Process Node: The specific node has not been disclosed, but based on industry practice, Broadcom typically uses TSMC's 5nm or 3nm process. If 3nm is adopted, it will compete with clients like Apple and NVIDIA for capacity.
  • Interconnect: Inter-chip communication relies on Broadcom's Tomahawk series switches and custom silicon photonics, supporting ultra-large-scale cluster deployment. This directly competes with NVIDIA's NVLink.
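To show what "hardening matrix multiplication and attention" refers to, here is a minimal pure-Python sketch of textbook scaled dot-product attention, the core Transformer operation that an inference ASIC of this kind would fix in silicon. It illustrates the computation only. Jalapeño's actual datapath has not been disclosed, and the helper names (`matmul`, `softmax`, `attention`) are illustrative.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])                          # width of each query/key vector
    k_t = [list(col) for col in zip(*k)]   # transpose K
    scores = matmul(q, k_t)                # one score per (query, key) pair
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)              # weighted mix of the value rows


# Tiny example: two tokens, vectors of width 2.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, k, v)
print(out)
```

The whole computation is two matrix multiplications with a row-wise softmax in between. Because that sequence of operations is fixed, a dataflow design can wire it directly instead of scheduling general-purpose threads, which is where the efficiency gain and the loss of flexibility both come from.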

The technical barrier is that the inference ASIC's compiler and runtime must be deeply coupled with the software stack OpenAI uses (e.g., its own Triton compiler and the PyTorch framework). OpenAI has open-sourced part of the Triton backend, but Jalapeño's instruction set will be highly closed, creating a hardware-software lock-in effect similar to Google's TPU.

Supply Chain Impact

Jalapeño's supply chain is divided as follows:

  • Upstream: Chip design relies on Synopsys and Cadence's EDA tools; IP cores may come from ARM (CPU control unit) or SiFive (RISC-V coprocessor).
  • Midstream: Wafer fabrication is likely handled by TSMC or Samsung. Based on Broadcom's long-term cooperation with TSMC (e.g., 5nm AI accelerators), TSMC is more probable. Packaging may use CoWoS or InFO, further squeezing NVIDIA's capacity.
  • Downstream: System integration is handled by Celestica, but OpenAI is building its own data centers (e.g., planned deployment by the end of 2026), reducing reliance on Microsoft Azure.

Beneficiaries: Broadcom (significant revenue increase from custom chips, an estimated incremental $50-80 per Jalapeño), Celestica (system assembly), TSMC (advanced-process orders filling capacity), and related packaging substrate and testing vendors.

Risk parties: NVIDIA (losing OpenAI's inference business; the short-term impact is limited, but the demonstration effect is strong), Microsoft (Azure losing major AI workloads, putting cloud service growth under pressure), and other AI companies (facing competitive pressure from hardware differentiation).

Competitive Landscape

#### NVIDIA: No Worries in Short Term, Under Pressure in Long Term

NVIDIA GPUs still hold a monopoly in the training domain (market share over 80%), and the Vera Rubin platform is fully rolled out. However, OpenAI's departure could trigger a chain reaction: if Google, Anthropic, Meta, etc., further increase the proportion of self-developed chips, NVIDIA will lose the high-margin inference market. Nevertheless, NVIDIA's advantage lies in the stickiness of the CUDA ecosystem: any new chip requires years to adapt to model frameworks, and NVIDIA continues to solidify its moat through acceleration libraries (e.g., cuDNN, TensorRT).

#### Broadcom: Transitioning from Connectivity Chips to the AI Core

Broadcom was previously known mainly as a provider of networking chips (switches, PHY) and custom ASICs (e.g., auxiliary chips for Google TPU). Jalapeño makes it, for the first time, a core supplier of a primary AI chip, marking a move into higher value-added areas. However, over-reliance on a single customer (OpenAI) poses risks, and Broadcom must still compete with custom chip makers like Marvell and MediaTek.

#### Google TPU: A Proven Business Model

Google TPU has iterated to the sixth generation, providing inference services through GCP. Its advantage lies in end-to-end hardware-software integration (own models + own chips + own cloud). OpenAI's follow-up indicates this model has become a standard for AI companies, but Google has a clear first-mover advantage.

#### WiMi and Chinese Chips: Differentiation Breakthrough

Chinese company WiMi (WeiMei Hologram) is building out AI chip clusters and quantum AI, but its focus remains on vertical scenarios like edge computing and holographic AR. The OpenAI case shows that self-developed chips require huge capital expenditure (billions of dollars) and long-term investment. If Chinese AI companies want to compete in general inference, they must rely on advanced processes from domestic foundries (e.g., SMIC), but current process technology lags by more than two generations. Therefore, WiMi and others are more likely to focus on ASICs for specific scenarios (e.g., low power, edge inference), leveraging open-source models (e.g., Llama) to build differentiation.

Regional Supply Chain Impact

  • United States: Strengthening AI hardware self-sufficiency, but intensifying competition for semiconductor talent (especially ASIC design talent).
  • Taiwan, China: TSMC's foundry position is further solidified, but order concentration increases; if geopolitical events occur, the global AI supply chain faces disruption risks.
  • South Korea: Samsung may lose OpenAI orders (due to competition with TSMC), but can pursue other custom chip clients.
  • Japan: Advanced process followers like Rapidus gain a window of opportunity, but it will be difficult to break in the short term.
  • Europe: Demand for ASML lithography machines is further stimulated, but subsidy recipients under the EU Chips Act may be required to give something back in return (e.g., local production commitments).
  • Southeast Asia: Celestica's factory in Malaysia will benefit from system integration orders, but with low technical content and limited added value.

Investment Perspective

  • Short term: NVIDIA's stock price faces pressure but limited correction (training demand remains); Broadcom receives buy rating upgrades; TSMC ADR is stable due to full capacity.
  • Long term: The AI chip market will shift from "general GPU monopoly" to "heterogeneous divergence." Investment should focus on:
      - Custom chip design services (Broadcom, Marvell, MediaTek)
      - Advanced packaging and interconnects (TSMC, JCET, Shanghai Micro Electronics)
      - Data center optical interconnects (Broadcom, InnoLight)
      - AI accelerator startups offering alternatives to GPUs (e.g., Groq, Cerebras)

Long-Term Outlook

  • In 3 years: OpenAI's Jalapeño is deployed at scale; NVIDIA maintains training leadership through Vera Rubin; the inference market becomes fragmented.
  • In 5 years: AI giants (Google, Microsoft, Meta, Amazon, OpenAI) almost all have self-developed inference chips; NVIDIA transitions into a training + connectivity platform provider.
  • In 10 years: Edge AI devices (phones, cars, robots) begin adopting dedicated NPUs; general GPUs' share in data centers drops below 50%.

Comprehensive Industry Chain Analysis

### Upstream: EDA and IP

  • Synopsys and Cadence face a surge in custom chip design demand, but their per-seat licensing model is challenged (as chip companies expand their in-house teams).
  • ARM architecture may be eroded by RISC-V (if OpenAI adopts an open-source ISA to reduce costs).

### Midstream: Wafer Foundry and Packaging

  • TSMC's 5nm/3nm capacity remains tight, and CoWoS packaging capacity will be difficult to alleviate before 2027.
  • Samsung Foundry needs to accelerate acquiring custom chip customers, otherwise it will be marginalized.
  • Packaging and testing companies such as JCET and Tongfu Microelectronics will benefit from the mass production demand of domestic AI chips (e.g., Cambricon, Hygon).

### Downstream: Cloud Computing and AI Services

  • Microsoft Azure loses OpenAI's main workload, but can recover some revenue by providing Jalapeño cluster hosting services.
  • Google Cloud's self-sufficiency model becomes a benchmark, attracting other companies to follow suit (e.g., Oracle, IBM).
  • Small and medium-sized AI companies face a "chip selection dilemma": renting GPUs is expensive, self-developing chips has high barriers, and they may turn to FPGAs or buyout ASICs.

Conclusion

The launch of OpenAI's Jalapeño is not a single product event, but a sign that the AI industry is moving from an era of "algorithm innovation" into one of "infrastructure integration." Its impact on the industry chain is far-reaching: custom ASIC design service providers are rising, advanced packaging is in short supply, and regional supply chains are diverging faster. Investors should focus on the investment logic of "AI infrastructure" rather than "AI models." Chinese companies need to achieve deep coupling of ASICs and the software stack in specific scenarios, rather than chasing general-purpose computing power. Whoever first gains control of the supply chain for low-cost, high-stability, zero-carbon computing power will have the decisive say in the global AI industry. However, Jalapeño's success depends on mass production yield and the maturity of its software ecosystem, and its real impact will only become apparent after 2027.

Source links

  1. https://www.moomoo.com/community/feed/openai-launches-first-self-developed-ai-inference-chip-boosting-nvidia-116813940457481 (Primary)
