NVIDIA GPUs vs. Custom Chips for AI Compute: Who Wins Out? (2025-2035)
Will NVIDIA fall from GPU dominance between 2025-2035 as a result of hyperscalers making custom chips?
Mohit Agarwal wrote an article called “The Future of Compute: NVIDIA’s Crown is Slipping” in October 2024, and it’s well worth reading.
While I appreciate Mohit’s comprehensive research and analysis, I don’t necessarily agree with the premise that “NVIDIA’s Crown is Slipping” (or will slip).
NVIDIA’s Crown is Slipping Thesis (Key Points)
Included below is a summary of the thesis presented by Mohit.
Thesis: NVIDIA, the current leader in AI compute with its GPU dominance, faces significant long-term challenges. The consolidation of AI workloads among hyperscalers, the rise of custom silicon, and the evolution of distributed, vertically integrated systems threaten NVIDIA’s position. While the company will likely maintain dominance in the medium term due to current hardware constraints, its ability to adapt to these structural shifts remains uncertain.
1. NVIDIA’s Current Dominance
Success on the Scaling Hypothesis: NVIDIA has grown rapidly, adding $2 trillion in value in 13 months, driven by the AI boom and its GPU monopoly.
Peak Pricing Power: Current H100 GPUs represent peak pricing, but declining margins (e.g., B200 generation) and increasing competition could erode profitability.
Revenue Sources: ~50% of data center demand comes from hyperscalers like Google, Microsoft, Amazon, and Meta, making NVIDIA heavily reliant on a few large customers.
2. Demand Consolidation
Shrinking Market for Small Buyers: Startups and smaller companies, previously significant buyers, are now shifting to cloud-based AI solutions, reducing their direct hardware demand.
Independent Clouds Struggling: NVIDIA-supported independent cloud providers (e.g., CoreWeave, Lambda) are under pressure from reduced demand and declining GPU rental prices, leading to unsustainable ROE.
Hyperscaler Dominance: Hyperscalers are consolidating demand and have better infrastructure, scale, and economics to support large AI workloads, creating a major revenue risk for NVIDIA.
3. Rise of Custom Silicon
Hyperscalers’ Silicon Efforts:
Google (TPUs): Outpacing NVIDIA GPUs for many internal workloads; cheaper and more efficient for training and inference.
Amazon (Trainium, Inferentia): Competitive in price-performance for AI workloads; co-developed with Anthropic.
Microsoft (Maia): Early but promising, leveraging Triton as a CUDA replacement.
Meta (MTIA Chips): Transitioning workloads to in-house silicon, focusing on inference.
Threats from Vertical Integration: Hyperscalers have a history of designing out suppliers (e.g., AWS displacing Intel with Graviton and Nitro), signaling potential existential risks for NVIDIA’s business model.
4. Distributed & Vertically Integrated Systems
Shift in Compute Design: Hyperscalers are moving toward smaller, interconnected data centers and away from monolithic, centralized designs.
Optimizations Beyond Chips: Gains in AI compute now come from system-level innovations (e.g., advanced cooling, networking, rack design) that NVIDIA is ill-prepared to compete with.
Distributed Training: Asynchronous training across interconnected data centers is becoming standard, reducing reliance on NVIDIA’s centralized hardware solutions.
5. Competitive Alternatives
Emerging Startups: Companies like Cerebras Systems, Groq, and Rain AI are offering differentiated AI accelerator solutions.
AMD’s Rising Competitiveness: Improvements in AMD’s ROCm software and hardware (e.g., MI300x) are making it a viable alternative, especially for cost-conscious buyers.
Chinese Competitors: Companies like Huawei (Ascend chips) and Alibaba (Hanguang chips) are gaining traction in specific markets, although broader success remains uncertain.
6. Structural & Technological Challenges
Infrastructural Weaknesses:
NVIDIA’s networking (InfiniBand) struggles with fault tolerance in large-scale deployments.
Hyperscalers have better proprietary software stacks for cluster management and fault handling.
Cooling & Power Efficiency: NVIDIA lags behind hyperscalers like Google, which have pioneered efficient cooling systems and data center designs since 2018.
7. Attempts to Adapt
New Hardware and Software Initiatives:
Liquid cooling and backplane integration in GB200 servers to improve efficiency.
Spectrum-X networking solutions and improved diagnostic tools.
Value Chain Expansion: Efforts like DGX Cloud and NGC signal NVIDIA’s intent to move up the stack, but their narrow focus may not align with the broader trends toward distributed systems.
8. Medium-Term Moat
Secured Advanced Packaging Capacity: NVIDIA has locked in advanced packaging and wafer capacity, ensuring medium-term dominance.
Hyperscaler Commitment: Despite custom silicon efforts, hyperscalers still rely on NVIDIA for many workloads, but the relationship is increasingly precarious.
9. The Innovator’s Dilemma
Reliance on Unspecialized Platforms: NVIDIA’s platform-agnostic design limits its ability to deeply integrate with hyperscaler systems, leaving it vulnerable to being designed out.
Commoditization of AI Compute: Custom silicon and alternative frameworks (e.g., Triton) are reducing the need for NVIDIA-specific solutions like CUDA.
10. Long-Term Uncertainty
Future of AI Compute: Distributed training, asynchronous workflows, and custom silicon trends point to a future where NVIDIA may no longer dominate.
Strategic Imperatives: NVIDIA must address infrastructure gaps, diversify its revenue base, and innovate to compete in a rapidly evolving landscape.
Scenario #1: NVIDIA’s Chip Decline (Hypothetical Timeline)
As a thought experiment, let’s assume the points by Mohit are accurate and that NVIDIA will experience a reasonable fall from chip dominance.
What would this timeline look like based on the specific data presented?
2025–2026: Competitors Begin to Emerge
Rising Competition: Hyperscalers ramp up custom silicon (Google TPUs, AWS Trainium, Microsoft Maia) for AI inference workloads, targeting cost-performance parity with NVIDIA.
Market Shift: By 2025, ~70% of GPU hours transition to inference, where custom chips excel, reducing NVIDIA’s competitive edge.
AMD’s Rise: The MI300x series chips launched in 2023, directly challenging NVIDIA’s market share in training and inference workloads.
Market Cap Impact: Increased competition, pricing pressures, and reduced hyperscaler reliance shrink NVIDIA’s market cap from $3.42T to ~$3T by 2026.
2027–2030: Accelerating Decline
Demand Consolidation:
Hyperscalers like Google, AWS, and Meta consolidate AI workloads, shifting ~50% of their purchases to custom silicon.
Smaller customers (startups, independent clouds) struggle to justify GPU costs amid declining profitability, further reducing NVIDIA’s long-tail demand.
Erosion of Pricing Power:
GPU rental prices decline by ~50%, reflecting increased competition and commoditization.
NVIDIA’s ability to sustain high margins (~70% today) weakens, further pressuring revenues.
Structural Weaknesses:
NVIDIA’s reliance on hyperscalers (~50% of revenue) becomes a liability as they design out third-party suppliers.
Distributed AI systems and asynchronous training reduce dependency on centralized GPU hardware.
Market Cap Impact: Shrinking revenue (~30–50% in datacenter sales) and margin compression drive NVIDIA’s valuation down to ~$2T–$2.5T by 2030.
2030 & Beyond: Loss of Dominance
Commoditization of AI Compute:
Custom silicon achieves cost-performance parity, making NVIDIA GPUs less attractive for hyperscaler workloads.
CUDA loses its moat as alternative software stacks (e.g., Triton, JAX) mature and gain adoption.
Diminished Strategic Role:
NVIDIA shifts from a market leader to a niche supplier akin to a “component vendor,” losing strategic leverage over customers.
Alternative AI architectures (e.g., photonic or wafer-scale chips) further reduce reliance on NVIDIA.
Global Impact: NVIDIA’s datacenter revenue and valuation growth stagnate as competitors dominate key AI verticals.
Market Cap Impact: NVIDIA’s valuation could drop to $1T–$1.5T by 2035, reflecting its reduced relevance in a commoditized AI compute market.
Counter-Thesis: How NVIDIA Remains King of Future Compute
I appreciate Mohit’s research and publication, but I think there’s a reasonable probability that NVIDIA’s crown won’t slip – and instead ends up fortified with Gorilla Glue (i.e. zero slip).
Why? Numerous reasons… or at least here are some I dreamt up (some might be a bit farfetched).
Annual rate of innovation: NVIDIA is outpacing everyone, delivering major gains with each GPU upgrade that save hyperscalers money while keeping performance unrivaled. This is driven by a virtuous circle (best GPUs -> massive profits -> more R&D $ to engineer the best GPUs, repeat).
Desire to use the best: Companies with a desire to use the “best-of-the-best” technology will continue using NVIDIA. Why risk lagging behind because you want to experiment with some custom chips? By the time you get custom chips made NVIDIA’s next chip may have blown it out of the water.
Inefficiencies: Large companies working on custom chips need to: hire specific talent (e.g. engineers, designers, etc.), forge relationships, increase efficiency, etc. But they are diverting their resources and attention away from what made them great. They aren’t chip-first companies – so probably won’t beat someone who’s sole goal is to make the best chips.
Supply chain relationships: NVIDIA has arguably better relationships with the entire semiconductor supply chain than any company. This means they’ll likely get preferential priority/treatment even over hyperscalers. They know how to get better GPUs to market more efficiently than anyone.
Prisoner’s dilemma: If major hyperscalers consider opting out of NVIDIA GPUs but competition decides to go “all in” on NVIDIA GPUs – the competition may gain a significant advantage in performance and/or capabilities – such that they’re able to offer services that others can’t as a result of sticking with NVIDIA.
Custom chips aren’t an alternative (historical evidence): Hyperscalers have been using custom chips for a while now and still buy NVIDIA chips en masse whenever possible. Google introduced its first TPU in 2016, yet acquired ~50K NVIDIA H100s in 2023. It is also reported that Google has loaded up on Blackwell GPUs for Google Cloud.
Strategic pricing adjustments: NVIDIA can significantly increase or decrease the prices of GPUs – selling them for whatever will make them more competitive. If certain hyperscalers ramp down the number of GPUs they buy, NVIDIA may actually increase pricing (e.g. GPU elitism). On the other hand, they might undercut competitors to keep their moat entrenched.
Expanding non-hyperscaler market: Many large companies benefit from GPUs but don’t want to make custom chips and/or lack the resources to make them. Sure, the hyperscalers are NVIDIA’s biggest customers at the moment, but its customer base may expand if hyperscalers opt out. Remember – they keep selling out.
Consumer-grade GPU popularization: It’s possible that we could see a boom in consumer-grade GPUs driven by people who want custom LLMs (using open-source tech: Llama, Mixtral, Qwen, etc.), custom AI agents, and/or complete data protection/privacy from major companies. If you can use a powerful LLM locally, you may not want to use ChatGPT or Claude (data protection).
Integration efficiency (plug-&-play): NVIDIA’s hardware is engineered for ease of integration into existing infrastructure. Its GPUs are highly modular and efficient to replace, minimizing disruption and cost. Even if alternative solutions are more cost-effective, the hassle of reworking entire datacenters discourages switching.
Custom chip potential: NVIDIA could pivot toward manufacturing custom GPUs or chips tailored to hyperscaler needs, providing cost-effective and specialized solutions while retaining its technological advantage.
CUDA software: Although there’s competition for CUDA software (Triton, ROCm, etc.), CUDA is still the industry leader. It offers superior performance, ecosystem maturity, and is deeply integrated with AI frameworks (TensorFlow, PyTorch, etc.). There are also CUDA integrations for alt forms of computing (e.g. CUDA-Q for quantum).
Potential acquisitions: NVIDIA may acquire one or more key companies within: the semiconductor sector (e.g. ARM, Intel, Marvell); AI infrastructure sector; and/or alternative computing space (e.g. quantum to bridge with GPUs). Certain acquisitions could solidify NVIDIA’s dominance in the AI hardware ecosystem.
More Detail: How NVIDIA’s Crown Remains Intact
This is a more comprehensive breakdown of my ideas courtesy of ChatGPT.
1. Unmatched Chip Efficiency & Annual Performance Gains
Cutting-Edge Innovation:
NVIDIA’s Blackwell B200 represents a significant leap forward, delivering 4,500 teraFLOPS (FP16/BF16), 8TB/s memory bandwidth, and a 25x reduction in energy costs compared to previous architectures.
Hyperscaler custom chips, such as Google TPU v4 and AWS Trainium2, trail far behind in compute performance and memory bandwidth.
Broader Applicability: Unlike hyperscaler chips optimized for specific tasks, NVIDIA GPUs address a wider range of workloads, including AI training, inference, and HPC.
Efficiency as a Competitive Advantage: NVIDIA GPUs offer 20–30% lower TCO for AI training by optimizing power usage and training times, enabling faster deployment and greater cost savings for enterprises.
Relentless Iteration: NVIDIA’s annual updates ensure its GPUs consistently outperform competitors, while hyperscaler chips, constrained by longer iteration cycles, fall behind in efficiency and applicability.
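The TCO claim above can be made concrete with a toy calculation: a faster, more efficient GPU can cost less per training run even at double the sticker price, because it completes more runs over its lifetime and burns less energy per run. Every number below is a hypothetical illustration I chose for the sketch, not an actual NVIDIA or competitor figure:

```python
# Toy TCO-per-training-run comparison. All inputs are hypothetical.
def cost_per_run(price, lifetime_hours, power_kw, hours_per_run, kwh_cost=0.10):
    runs = lifetime_hours / hours_per_run        # runs completed over the GPU's life
    hardware = price / runs                      # amortized capex per run
    energy = power_kw * hours_per_run * kwh_cost # electricity per run
    return hardware + energy

# "Fast" GPU: 2x the price, but finishes a run in 100h instead of 250h.
fast_gpu = cost_per_run(price=30_000, lifetime_hours=20_000, power_kw=1.0, hours_per_run=100)
cheap_gpu = cost_per_run(price=15_000, lifetime_hours=20_000, power_kw=0.7, hours_per_run=250)

print(f"fast: ${fast_gpu:.2f}/run, cheap: ${cheap_gpu:.2f}/run")
print(f"fast GPU TCO advantage: {1 - fast_gpu / cheap_gpu:.0%}")
```

With these assumed inputs the faster GPU comes out roughly 20% cheaper per run, which is the same shape as the 20–30% TCO advantage claimed above.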
2. Record-Setting Research & Development Spending
Aggressive Investment Growth: NVIDIA’s fiscal 2024 R&D spending of $8.68 billion reflects a 47.78% YoY increase, sustaining its technological lead over competitors. These investments drive innovations like 208 billion transistors on the Blackwell B200 and advanced features such as the Second-Generation Transformer Engine.
Strategic Impact: NVIDIA’s R&D investments fuel breakthroughs that keep its 80% AI chip market share intact, while competitors struggle to match its pace of innovation.
Focused on AI Dominance: Funding supports developments in memory technology (e.g., HBM3e), GPU architectures, and interconnect technologies like NVLink (1.8TB/s bandwidth), ensuring sustained leadership.
3. CUDA Ecosystem: A Decent Moat
Technical Superiority: CUDA delivers 10–30% better performance than alternatives like AMD ROCm by offering extensive optimization for AI frameworks, support for 2,000+ native operators, and advanced multi-GPU support. CUDA’s mature direct memory access and operator optimization enable scalable performance for demanding workloads.
Developer Lock-In: Tens of thousands of developers trained on CUDA create significant inertia, as switching to alternatives like Triton or PrimTorch involves substantial retraining and risks lower efficiency.
Entrenched Ecosystem: CUDA’s integration with TensorFlow, PyTorch, and other major frameworks makes it indispensable for enterprises, further reinforcing NVIDIA’s market position.
Narrowing Gap: Efforts like OpenAI’s Triton and UXL Foundation initiatives are reducing CUDA’s monopoly, but these alternatives remain years behind in ecosystem maturity, tool support, and optimization depth.
4. Plug-and-Play Ecosystem
Ease of Integration: NVIDIA GPUs are designed for seamless replacement and compatibility with existing infrastructure, minimizing downtime and reducing integration costs. Advanced interconnect technologies, such as NVLink bandwidth (1.8TB/s) and PCIe Gen6, support scalable performance without requiring significant architectural changes.
Broad Platform Support: NVIDIA’s hardware operates across major hyperscaler clouds (AWS, Azure, GCP) and on-premise datacenters, ensuring access for enterprises of all sizes.
5. Adjunct Role of Custom Chips
Coexistence with Hyperscaler Silicon: Hyperscalers like Google (TPU) and Amazon (Trainium) continue to rely on NVIDIA GPUs for high-flexibility workloads and inference, where their custom silicon is less effective. NVIDIA remains the gold standard for workloads requiring versatility and high performance.
Semi-Custom Solutions: NVIDIA’s ability to develop tailored GPUs for specific hyperscaler needs ensures relevance and adaptability in competitive markets.
6. Expanding AI Demand Beyond Hyperscalers
Non-Hyperscaler Adoption: Enterprises, startups, and smaller companies that lack the resources for custom silicon depend on NVIDIA’s plug-and-play GPUs for cutting-edge AI capabilities.
Decentralized AI Potential: Consumer-grade GPUs, such as NVIDIA’s GeForce line, enable localized AI workloads like LLaMA and other open-source models, allowing privacy-conscious users to avoid cloud dependencies.
7. Pricing Power & Strategic Adaptability
Flexibility in Pricing: NVIDIA’s high gross margins (~70%) allow it to adjust pricing to undercut competitors if needed, without sacrificing profitability.
Strategic Acquisitions and Partnerships: Financial strength enables NVIDIA to pursue key acquisitions (e.g., Cerebras, Marvell) or co-develop custom silicon with hyperscalers, reinforcing its competitive positioning.
8. Challenges for Custom Chip Alternatives
Development Costs: Designing and deploying custom chips requires billions in R&D and years of iteration, making them less cost-effective than NVIDIA’s high-volume GPUs.
Scaling Limitations: Custom chips are narrowly optimized for specific tasks, whereas NVIDIA GPUs handle diverse workloads, ensuring broader applicability and utilization.
9. Hardware Complexity as a Competitive Barrier
Advanced Manufacturing: NVIDIA’s partnerships with TSMC and ASML provide access to cutting-edge fabrication technologies like EUV lithography, making replication by competitors nearly impossible.
Software & Data Vulnerability: While open-source AI tools reduce software moats, NVIDIA’s hardware expertise remains a significant and enduring barrier to entry.
Scenario #2: NVIDIA Maintains Elite Chip Status (Hypothetical Timeline)
2024–2026: Innovation-Driven Momentum
Innovation Leadership: Blackwell GPUs dominate with cutting-edge performance, reinforcing NVIDIA’s leadership in training and inference workloads.
Massive Demand: Hyperscalers (Google, AWS, Microsoft) and enterprises drive ~30% YoY growth in data center revenue, while consumer-grade GPUs for localized AI see burgeoning adoption.
Speculative Investment: AI fervor among retail and institutional investors continues to elevate NVIDIA's valuation.
Market Cap Growth: NVIDIA grows from $3.42T to ~$5T by 2026, driven by expanding use cases, sustained pricing power, and speculative inflows.
2027–2030: Diversified Expansion & Market Confidence
AI Democratization: Consumer GPUs for private AI workloads (e.g., LLMs, generative AI agents) gain mainstream traction, significantly broadening NVIDIA's customer base.
New Revenue Streams:
Automotive AI: Contributes ~$15B annually by 2030, as NVIDIA’s autonomous driving systems become industry standards.
Enterprise AI Software: Adds $20B+ in recurring revenue from cloud-based AI tools and solutions.
Robotics & Edge AI: Generates $10B+ annually from NVIDIA-powered IoT and automation systems. (Read: NVIDIA’s Robotics Ecosystem)
Market Sentiment: Retail and institutional investors speculate on NVIDIA’s role as the backbone of AI, driving a valuation premium similar to Tesla’s speculative run.
Market Cap Growth: NVIDIA’s diversified revenue base and expanding TAM push its valuation to $7T–$8T by 2030.
2030–2035: Entrenched Global Leadership
Continued Innovation: Annual GPU and system-level upgrades keep NVIDIA 1–2 generations ahead, ensuring relevance in emerging fields like quantum computing and AI-specific cloud services.
Expanded Ecosystem: CUDA's ecosystem remains deeply entrenched across industries, while NVIDIA's modular hardware and software solutions enable plug-and-play adoption at scale.
Speculative AI Boom: Investor enthusiasm for AI-driven growth continues, buoyed by the idea of NVIDIA as a core infrastructure provider akin to utilities or energy giants.
Market Cap Potential: With strong execution, NVIDIA could surpass $10T by 2035, driven by: sustained AI adoption globally, expansion into high-growth markets (e.g., AI healthcare, decentralized AI), ongoing investor speculation and confidence in its innovation leadership.
Q&A: Questions to think about (re: NVIDIA dominance)
Below are some questions I’ve posed that are worth contemplating for NVIDIA investors.
What are the odds that NVIDIA loses, gains, or maintains GPU dominance (2025-2035)?
A.) 2025-2026
Odds of Maintaining Dominance: 90-95%
Reasons for Confidence:
Blackwell B200 GPUs deliver unmatched performance (4,500 teraFLOPS, 208 billion transistors).
Energy efficiency improved by 25x, reducing operational costs.
Strong pre-orders (H100 GPUs booked 12 months in advance).
Market share dominance: 98% of data center GPUs and 80% of AI chips.
Odds of Gaining Market Share: 5-10%
Opportunities:
Continued growth of AI chip demand.
Expansion into untapped markets through advanced software ecosystems like CUDA.
Odds of Losing Market Share: 5-10%
Risks:
Early-stage custom silicon investments by hyperscalers.
Emerging competition from AMD and other players.
B.) 2027-2030
Odds of Maintaining Dominance: 70-80%
Reasons for Confidence:
Incremental performance improvements (1.5-2x annually).
Advanced wafer and packaging capabilities secured through 2026, with scope for extension.
CUDA software ecosystem expansion locks in customer loyalty.
Odds of Gaining Market Share: 10-15%
Opportunities:
AI chip market projected to grow to $400 billion by 2027.
Strong revenue growth ($111.3 billion forecast by 2025) allows for reinvestment in R&D and capacity.
Odds of Losing Market Share: 20-30%
Risks:
Custom silicon efforts by hyperscalers like Google and Amazon gaining momentum.
Increased adoption of alternative architectures, including AMD’s accelerator roadmap and hyperscaler custom chips.
C.) 2031-2035
Odds of Maintaining Dominance: 55-65%
Reasons for Confidence:
Sustained leadership in innovation with annual GPU performance improvements.
Continued relevance of the CUDA ecosystem as a competitive moat.
Slower development cycles for custom chips (3-5 years for significant iterations).
Odds of Gaining Market Share: 10-15%
Opportunities:
Ongoing advancements in energy efficiency and scalability could attract new markets.
Expansion into related markets (e.g., edge computing, autonomous vehicles).
Odds of Losing Market Share: 35-45%
Risks:
Increasing maturity of hyperscaler custom silicon efforts.
Broader adoption of alternative architectures reducing dependence on GPUs.
Potential regulatory or supply chain disruptions impacting production.
How Would GPU Market Share Decline Impact NVIDIA’s Market Cap?
This assumes Mohit is fully or mostly accurate and that NVIDIA’s crown will slip significantly.
If this is the case, how much market share will NVIDIA likely lose as a result?
Short-Term (1-2 Years): NVIDIA's reliance on GPUs (76% of revenue, 70-80% margins) makes it vulnerable.
A 10% GPU market share loss could lead to:
Revenue drop: ~7-8%
Profit decline: ~12-15%
Market cap impact: ~20-25% decline
Mid to Long-Term (3-6+ Years): NVIDIA’s diversification into high-growth sectors could mitigate GPU losses.
Automotive AI: $10-15B annual revenue potential
Robotics & Edge AI: $5-8B annually
Enterprise AI Software: $15-20B annually
Total New Revenue Potential by 2030: $38-55B
With strong execution and market adoption, NVIDIA’s market cap could recover or grow, stabilizing within 4-6 years.
The trajectory may follow a “U-curve”: short-term decline, stabilization, and eventual recovery.
That said, this also assumes that NVIDIA continues to execute effectively in other biz segments (self-driving tech, robotics/edge AI, enterprise software, etc.) relative to competition.
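The short-term arithmetic above can be sketched as a toy sensitivity model. The revenue mix (76%) comes from the text; the operating-leverage and multiple-compression factors are assumptions I picked so the output lands near the ranges quoted above, not derived from NVIDIA’s actual financials:

```python
# Back-of-envelope sensitivity to a GPU market-share loss.
# gpu_revenue_mix is from the text; the other two inputs are my assumptions.
def share_loss_impact(share_loss=0.10, gpu_revenue_mix=0.76,
                      operating_leverage=1.7, multiple_compression=0.10):
    """Return (revenue_drop, profit_drop, market_cap_drop) as fractions."""
    revenue_drop = share_loss * gpu_revenue_mix       # 10% share loss -> ~7.6% revenue
    profit_drop = revenue_drop * operating_leverage   # fixed costs amplify the hit
    # A valuation-multiple haircut compounds with the profit decline:
    market_cap_drop = 1 - (1 - profit_drop) * (1 - multiple_compression)
    return revenue_drop, profit_drop, market_cap_drop

rev, profit, mcap = share_loss_impact()
print(f"Revenue: -{rev:.1%}, Profit: -{profit:.1%}, Market cap: -{mcap:.1%}")
```

The point of the model is the compounding: a single-digit revenue hit becomes a double-digit profit hit, which becomes a ~20%+ market cap hit once the market also trims the multiple.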
Do companies benefit significantly from the latest NVIDIA tech (annual GPU upgrades)?
The decision to upgrade to NVIDIA’s latest technology depends on whether the benefits of superior performance, ROI, and competitive protection outweigh the appeal of custom solutions.
For most companies, particularly those focused on staying competitive in AI applications (i.e. the hyperscalers), NVIDIA’s annual upgrades provide significant value and long-term advantages.
This is evidenced by the fact that there’s basically zero slowdown in NVIDIA chip purchases from the hyperscalers AND that demand for NVIDIA GPUs is higher than ever.
Jensen Huang (on CNBC “Closing Bell” 2024 re: Blackwell GPUs):
Demand for Blackwell AI chip is “insane”
“Everybody wants to have the most and everybody wants to be first” (to get NVIDIA’s latest GPUs)
“At a time when the technology is moving so fast, it gives us an opportunity to triple down, to really drive the innovation cycle so that we can increase capabilities, increase our throughput, decrease our costs, decrease our energy consumption”
Which companies have the most custom chips to threaten NVIDIA GPU dominance?
1. Hyperscalers with Custom Silicon Efforts
Google:
Dominates with its TPU (Tensor Processing Unit) lineup, competitive in both training and inference.
Current TPU versions, such as Trillium, deliver significant performance and efficiency gains. Nearly all internal AI workloads have transitioned to TPUs.
Investment in software (e.g., JAX) and infrastructure further strengthens its position.
Amazon:
Trainium and Inferentia chips, developed with Annapurna Labs, are increasingly competitive for cost-efficient inference and training.
Partnerships with companies like Anthropic highlight Amazon's ability to co-develop cutting-edge hardware.
Microsoft:
Early stages with the Maia accelerator and Cobalt CPU, but strong partnerships (e.g., AMD) and proprietary software (Triton) position it as a long-term competitor.
Meta:
Focused on chips for massive inference workloads (e.g., Instagram, WhatsApp AI) and training for LLaMA models.
Commitment to custom silicon and infrastructure improvements suggests scalability in the future.
2. Alternative Silicon Providers
AMD:
Gaining ground with MI300x chips, designed for training and inference.
Software advancements (ROCm) and competitive price-performance ratios make AMD a credible alternative.
Startups and Innovators:
Cerebras: Wafer-scale chips for high-throughput AI workloads.
Groq: Deterministic VLIW architecture optimized for latency-sensitive workloads.
SambaNova: Integrated systems offering flexible scaling and innovative memory solutions.
Lightmatter: Photonic chips prioritizing energy efficiency and high-performance inference.
Tenstorrent: RISC-V-based AI accelerators tailored for inference workloads.
3. Chinese Companies
Huawei: Ascend chips are highly competitive and widely adopted by domestic hyperscalers. (Read: Is Huawei Catching NVIDIA’s Chips?)
Alibaba (T-head): Active in inference workloads with the Hanguang chip lineup.
Baidu (Kunlun): Focused on AI solutions and autonomous vehicle applications.
What are the best investments to make (scenario-based)?
Your call. Nothing here is investment advice.
Drew’s idea: SMH or SOXX + MAG7. (I prefer SMH over SOXX).
A simple way to avoid getting burned is to have money in the MAG7 with a semiconductor ETF. (Read: Semiconductor Sector Outlook: 2025-2030)
Could do a 50/50 allocation or whatever fits your own thesis.
I like the idea of SMH ETF given: past-performance (even pre-NVIDIA), quarterly rebalancing, top-heavy to leaders (NVIDIA), max allocation of 20%, exposure to most of the top companies in semi supply chain (TSMC, ASML, SNPS, MRVL, MU, AVGO, LRCX).
The only major risks to semi ETFs are: (1) a private company suddenly dominating the sector (unlikely) and (2) a novel form of computing rapidly taking over (e.g. neuromorphic, optical/photonic, etc.).
1. Probability-based approach: Investors could consider “what’s most likely to happen” based on weighted-probabilities of future scenarios (NVIDIA, custom chips, etc.). My data suggests NVDA is ~70% likely to maintain dominance, ~20% likely to lose dominance, and ~10% likely to become more dominant (2025-2035).
Investments: TSMC, ASML, NVDA, AVGO, GOOGL, MRVL, AMD, MSFT.
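The weighted-probability idea above can be sketched as a simple expected-value calculation. The probabilities are the ones stated above; the market-cap endpoints reuse the hypothetical 2035 figures from Scenarios #1 and #2, and the “gains dominance” figure is purely my own assumption since no number is given for it:

```python
# Probability-weighted 2035 market cap (all figures in $ trillions).
# Probabilities are from the text; caps are hypothetical scenario endpoints.
scenarios = {
    "maintains dominance": (0.70, 10.00),  # ~$10T (Scenario #2, 2035)
    "loses dominance":     (0.20, 1.25),   # midpoint of $1T-$1.5T (Scenario #1)
    "gains dominance":     (0.10, 12.00),  # assumed upside case (not in the text)
}

expected_cap = sum(p * cap for p, cap in scenarios.values())
print(f"Probability-weighted 2035 market cap: ~${expected_cap:.2f}T")
```

Even with a 20% chance of the bear case, the weighted outcome stays well above today’s valuation, which is why a probability-based allocation still tilts toward NVDA and its supply chain.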
2. NVIDIA maintains market share: This scenario assumes NVIDIA maintains GPU market share.
Investments: NVDA, TSMC, ASML, AVGO, CDNS, SNPS, AMD, MRVL.
3. NVIDIA loses market share: If we assume NVIDIA loses significant market share, these are companies that would likely be good investments.
Investments: TSMC, ASML, GOOGL, AMZN, MRVL, AVGO, AMD, MSFT.
4. NVIDIA gains market share: This scenario assumes NVDA gains AI compute market share.
Investments: NVDA, TSMC, ASML, AVGO, MSFT, CDNS, SNPS, MRVL.
5. Scenario-agnostic investments: These should be relatively “safe” no matter what happens in the NVIDIA vs. custom chip war. (Each company still carries its own specific risks, e.g. geopolitical – this just means they should do well if custom chips are the only variable.)
Investments: TSMC, ASML, AVGO, MRVL, SNPS, CDNS, AMAT, LRCX.
Final thoughts: NVIDIA GPUs vs. Custom Chips
I think it’s most likely that AI companies remain in a longstanding “hardware war” (all want the newest/best to avoid losing).
What about things like data, software, engineers, etc.? Data can be stolen or hacked. Engineers can be poached. Software architectures can be copied, cloned, refined, improved upon, etc.
It’s nearly impossible to replicate efficient hardware (AI chip) production at the rapid pace of NVIDIA while simultaneously maintaining insane rates of annual chip improvement.
Making custom chips has worked well for mega-cap tech companies like Apple, Google, and Amazon – but they still haven’t come close to usurping NVIDIA… and seem like they’ll continue functioning as adjuncts to NVIDIA GPUs going forward.
Mohit could be correct in his thesis that NVIDIA’s crown is slipping… and I won’t mind if he’s correct because I’m not 100% “all-in” on NVIDIA… I just find it fun to think about this stuff.