Field Dispatch · Newsletter RSS · 7 · 2026-05-27

Nvidia Earnings, The AI Stack, Nvidia’s New Reporting

Stratechery by Ben Thompson

Published
2026/05/26
Feed
Stratechery by Ben Thompson
Categories
1
Excerpt
170
Pain-point analysis · Published 2026/05/26

The pain points are an AI's preliminary distillation of upstream raw evidence; no additional China-market research is included.

Pain points

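This Stratechery news analysis is not itself a pain point expressed directly by users; it is an industry observation on Nvidia's earnings and the AI stack. Judging from the title and excerpt, it discusses how Nvidia's new reporting structure reflects the evolution of the AI stack, not friction in the day-to-day tasks of any specific user group. On the current evidence, therefore, no clear user scenario or pain point can be identified. One possible weak signal: the complexity of AI infrastructure (for example, stack layering and changes in financial disclosure) may make it hard for investors or technical decision-makers to grasp market dynamics quickly, but there is no direct evidence that this wastes time or makes decisions harder.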

Article

Newsletter article

Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.

§ Dossier

Feed context

Feed title
Stratechery by Ben Thompson
Author
Ben Thompson
Published
2026/05/26
Categories
Daily Update
Source data · Raw Archive
source
Newsletter
upstream_source
newsletter_rss
upstream_item_id
stratechery:a70c8d7dff439279f8f92ad4
daily_ranking_item_id
7b2f0fb7-c939-44e5-bb95-64459796f3d6
rank_date
2026-05-27
rank
7
name
Nvidia Earnings, The AI Stack, Nvidia’s New Reporting
tagline
Stratechery by Ben Thompson
description
Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.
votes_count
0
comments_count
0
created_at_on_source
2026-05-26T10:00:00.000Z
topics
Stratechery by Ben Thompson, Daily Update
media / source-specific data
{
  "newsletter_rss": {
    "author": "Ben Thompson",
    "feed_id": "stratechery",
    "feed_url": "https://stratechery.com/feed/",
    "categories": [
      "Daily Update"
    ],
    "feed_title": "Stratechery by Ben Thompson",
    "published_at": "2026-05-26T10:00:00.000Z"
  }
}
raw_payload
{
  "link": "https://stratechery.com/2026/nvidia-earnings-the-ai-stack-nvidias-new-reporting/",
  "title": "Nvidia Earnings, The AI Stack, Nvidia’s New Reporting",
  "author": "Ben Thompson",
  "feed_id": "stratechery",
  "entry_id": "https://stratechery.com/?p=19130",
  "feed_url": "https://stratechery.com/feed/",
  "categories": [
    "Daily Update"
  ],
  "feed_title": "Stratechery by Ben Thompson",
  "fetched_at": "2026-05-26T22:02:28.638Z",
  "raw_excerpt": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.",
  "summary_raw": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.",
  "published_at": "2026-05-26T10:00:00.000Z",
  "feed_site_url": "https://stratechery.com",
  "content_excerpt": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.",
  "feed_description": "On the business, strategy, and impact of technology.",
  "snapshot_version": "newsletter_rss_entry_v1",
  "content_raw_excerpt": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack."
}
source_raw_snapshot
{
  "id": "402a6df6-75c8-48ae-a53a-e86492c73d30",
  "daily_ranking_item_id": "7b2f0fb7-c939-44e5-bb95-64459796f3d6",
  "source": "newsletter_rss",
  "external_id": "stratechery:a70c8d7dff439279f8f92ad4",
  "feed_id": "stratechery",
  "feed_url": "https://stratechery.com/feed/",
  "fetched_at": "2026-05-26T22:02:28.638Z",
  "feed_raw": {
    "rss": {
      "channel": {
        "item": [
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19130",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/nvidia-earnings-the-ai-stack-nvidias-new-reporting/",
            "title": "Nvidia Earnings, The AI Stack, Nvidia’s New Reporting",
            "post-id": {
              "#text": "19130",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Tue, 26 May 2026 10:00:00 +0000",
            "category": "Daily Update",
            "dc:creator": "Ben Thompson",
            "description": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.",
            "content:encoded": "<p>Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.</p>"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19102",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/the-data-center-veto/",
            "title": "2026.21: The Data Center Veto",
            "post-id": {
              "#text": "19102",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Fri, 22 May 2026 17:12:32 +0000",
            "category": "This Week in Stratechery",
            "dc:creator": "Ben Thompson",
            "description": "The best Stratechery content from the week of May 18, 2026, including data center discontent, agent economics, and slime mold.",
            "content:encoded": "\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" decoding=\"async\" width=\"1024\" height=\"683\" src=\"https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2243421278.jpg?resize=1024%2C683&#038;ssl=1\" alt=\"\" class=\"wp-image-19117\" srcset=\"https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2243421278.jpg?w=1024&amp;ssl=1 1024w, https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2243421278.jpg?resize=300%2C200&amp;ssl=1 300w, https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2243421278.jpg?resize=768%2C512&amp;ssl=1 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\" /><figcaption class=\"wp-element-caption\"><em>(Lexi Critchett/Bloomberg via Getty Images)</em></figcaption></figure>\n\n\n\n<p class=\"wp-block-paragraph\">Welcome back to This Week in Stratechery!</p>\n\n\n\n<p class=\"wp-block-paragraph\">As a reminder, each week, every Friday, we&#8217;re sending out this overview of content in the Stratechery bundle; <span class=\"passport-tokenized-link\">highlighted links are free for everyone</span>. Additionally, you have complete control over what we send to you. If you don&#8217;t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in <a href=\"https://stratechery.passport.online/member/account/delivery\">your delivery settings</a>.</p>\n\n\n\n<p class=\"wp-block-paragraph\">On that note, here were a few of our favorites this week.</p>\n\n\n\n<ol class=\"wp-block-list is-style-default\">\n<li><strong>Data Center Discontent.</strong> The impact of AI is, at least for now, being felt digitally: that is where AI is useful, and the more digital a job, the more it is threatened by LLMs. AI, however, depends on data centers in the physical world, and building data centers needs permission. 
This gives normal people the sort of veto power over AI they didn&#8217;t have in the face of globalization; I make the case in <a href=\"https://stratechery.com/2026/data-center-discontent-understanding-the-opposition-fixing-the-problem/\" data-type=\"post\" data-id=\"19061\">Monday&#8217;s Update</a> and on <a href=\"https://sharptech.fm/member/episode/much-ado-about-data-centers-what-tech-gets-wrong-about-its-critics-q-a-on-space-x-chinese-ai-elon-musk\">Sharp Tech</a> that understanding this dynamic is more important that trying to correct misinformation, which is a symptom, not a cause, of data center opposition. — <em>Ben Thompson</em><br></li>\n</ol>\n\n\n\n<span id=\"more-19102\"></span>\n\n\n\n<ol start=\"2\" class=\"wp-block-list is-style-default\">\n<li><strong>Agent Economics. </strong>What will the internet look like when ad-supported models are rendered obsolete by shifting user behavior and the rise of agentic web traffic? Ben considered this question last summer with <a href=\"https://stratechery.com/2025/the-agentic-web-and-original-sin/\"><span class=\"passport-tokenized-link\">The Agentic Web and Original Sin</span></a>, and I was surprised to learn this week that Parag Agarwal, former CEO of Twitter, is now focused on devising solutions for exactly this reality. <a href=\"https://stratechery.com/2026/an-interview-with-parallel-founder-parag-agarwal-about-valuing-content-on-the-agentic-web/\">This week&#8217;s Stratechery Interview with Agarwal</a> dives deep into the economics of content on the Internet, why ads make sense for humans, and why incentivizing content for agents will be different, and how Agarwal and Parallel are trying to solve them. I learned a ton from this interview, and I bet you will, too — and don&#8217;t worry, we did get a few bonus questions on the ride at Twitter.  <em>—</em> <em>Andrew Sharp</em><br><br></li>\n\n\n\n<li><strong>Never Count Out the Slime Mold. 
</strong>Wednesday&#8217;s Daily Update on Google I/O reminded me of an iconic <a href=\"https://www.inc.com/minda-zetlin/in-just-2-very-unpleasant-words-a-former-google-exec-pinpointed-companys-biggest-problem.html\" target=\"_blank\" rel=\"noreferrer noopener\">leaked memo</a> about the ungovernable&nbsp;and poorly coordinated mold in Mountain View, as the company seems to be throwing 10 different types of AI spaghetti at the wall to see what sticks. Then again, Google is now a nearly $5 trillion company and its transformer architecture&nbsp;supercharged the AI era. That second part is why, <a href=\"https://stratechery.com/2026/google-i-o-world-models-i-o-spaghetti/\" target=\"_blank\" rel=\"noreferrer noopener\">when Ben highlights a DeepMind approach to building AGI</a> that&#8217;s distinct from the approaches at OpenAI and Anthropic, I&#8217;m compelled to both pay attention, and remember: for all of Google&#8217;s faults and misses, they do in fact have plenty of historic hits.&nbsp; <em>—</em> <em>AS</em></li>\n</ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"/>\n\n\n\n<h3 class=\"wp-block-heading\">Stratechery Articles and Updates</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://stratechery.com/2026/data-center-discontent-understanding-the-opposition-fixing-the-problem/\">Data Center Discontent, Understanding the Opposition, Fixing the Problem</a> <em>— There are understandable reasons for people to oppose data centers; the only solution that will work is simply paying them off.</em></li>\n\n\n\n<li><a href=\"https://stratechery.com/2026/google-i-o-world-models-i-o-spaghetti/\">Google I/O, World Models, I/O Spaghetti</a> <em>— Google I/O put AI everywhere, for better and for worse. 
Meanwhile, is DeepMind aligned with Google’s business objectives?</em></li>\n\n\n\n<li><a href=\"https://stratechery.com/2026/an-interview-with-parallel-founder-parag-agarwal-about-valuing-content-on-the-agentic-web/\">An Interview with Parallel Founder Parag Agarwal About Valuing Content on the Agentic Web</a> <em>— An interview with Parallel founder Parag Agarwal about valuing content and incentivizing its creation in a world of agents (plus questions about Twitter).</em></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Dithering with Ben Thompson and Daring Fireball&#8217;s John Gruber</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://dithering.passport.online/member/episode/data-center-unpopularity\">Data Center Unpopularity</a></li>\n\n\n\n<li><a href=\"https://dithering.passport.online/member/episode/google-being-google\">Google Being Google</a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Asianometry with Jon Yu</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://asianometry.passport.online/member/episode/the-little-vertical-laser-that-everyone-uses\">The Little Vertical Laser That Everyone Uses</a></li>\n\n\n\n<li><a href=\"https://asianometry.passport.online/member/episode/intels-30-years-in-costa-rica\">Intel’s 30 Years in Costa Rica</a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Sharp China with Andrew Sharp and Sinocism&#8217;s Bill Bishop</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://sharpchina.fm/member/episode/constructing-us-china-stability-trumps-taiwan-comments-and-more-summit-takeaways-putin-in-china\">Constructing US-China Stability; Trump’s Taiwan Comments and More Summit Takeaways; Putin in China</a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Greatest of All Talk</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://goat.passport.online/member/episode/wemby-harper-and-an-instant-classic-from-the-spurs-in-game-1-vs-okc\">Wemby, Harper and an Instant Classic from the Spurs 
in Game 1 vs. OKC</a></li>\n\n\n\n<li><a href=\"https://goat.passport.online/member/episode/a-note-on-the-future-of-goat-and-an-emergency-top-five\">A Note on the Future of GOAT and An Emergency Top Five</a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Sharp Tech with Andrew Sharp and Ben Thompson</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://sharptech.fm/member/episode/much-ado-about-data-centers-what-tech-gets-wrong-about-its-critics-q-a-on-space-x-chinese-ai-elon-musk\">Much Ado About Data Centers, What Tech Gets Wrong About Its Critics, Q&amp;A on SpaceX, Chinese AI, Elon Musk</a></li>\n</ul>\n\n\n\n<p class=\"wp-block-paragraph\">This week&#8217;s Stratechery video is on <a href=\"https://stratechery.com/2026/the-inference-shift/\"><span class=\"passport-tokenized-link\">The Inference Shift</span></a>.</p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"The Inference Shift | Stratechery by Ben Thompson\" width=\"500\" height=\"281\" src=\"https://www.youtube.com/embed/xyLr6XcktTo?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>\n</div></figure>\n\n\n\n<p class=\"wp-block-paragraph\"></p>\n"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19093",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/an-interview-with-parallel-founder-parag-agarwal-about-valuing-content-on-the-agentic-web/",
            "title": "An Interview with Parallel Founder Parag Agarwal About Valuing Content on the Agentic Web",
            "post-id": {
              "#text": "19093",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Thu, 21 May 2026 10:00:00 +0000",
            "category": [
              "Daily Update",
              "Interviews"
            ],
            "dc:creator": "Ben Thompson",
            "description": "An interview with Parallel founder Parag Agarwal about valuing content and incentivizing its creation in a world of agents (plus questions about Twitter).",
            "content:encoded": "<p>An interview with Parallel founder Parag Agarwal about valuing content and incentivizing its creation in a world of agents (plus questions about Twitter).</p>"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19078",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/google-i-o-world-models-i-o-spaghetti/",
            "title": "Google I/O, World Models, I/O Spaghetti",
            "post-id": {
              "#text": "19078",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Wed, 20 May 2026 10:00:00 +0000",
            "category": "Daily Update",
            "enclosure": {
              "@_url": "https://videos.files.wordpress.com/N1hICMxJ/demis-singularity.mp4",
              "@_type": "video/mp4",
              "@_length": "4298812"
            },
            "dc:creator": "Ben Thompson",
            "description": "Google I/O put AI everywhere, for better and for worse. Meanwhile, is DeepMind aligned with Google's business objectives?",
            "content:encoded": "<p>Google I/O put AI everywhere, for better and for worse. Meanwhile, is DeepMind aligned with Google's business objectives?</p>"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19061",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/data-center-discontent-understanding-the-opposition-fixing-the-problem/",
            "title": "Data Center Discontent, Understanding the Opposition, Fixing the Problem",
            "post-id": {
              "#text": "19061",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Mon, 18 May 2026 10:00:00 +0000",
            "category": "Daily Update",
            "dc:creator": "Ben Thompson",
            "description": "There are understandable reasons for people to oppose data centers; the only solution that will work is simply paying them off.",
            "content:encoded": "<p>There are understandable reasons for people to oppose data centers; the only solution that will work is simply paying them off.</p>"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19040",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/shifting-alliances-in-a-changing-world/",
            "title": "2026.20: Shifting Alliances in a Changing World",
            "post-id": {
              "#text": "19040",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Fri, 15 May 2026 17:00:00 +0000",
            "category": "This Week in Stratechery",
            "dc:creator": "Ben Thompson",
            "description": "The best Stratechery content from the week of May 11, 2026, including a new kind of computing, Elon Musk, and 360 degrees of US-China relations.",
            "content:encoded": "\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"1700\" height=\"1133\" src=\"https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2276124586.jpg?resize=1700%2C1133&#038;ssl=1\" alt=\"\" class=\"wp-image-19050\" srcset=\"https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2276124586-scaled.jpg?resize=1700%2C1133&amp;ssl=1 1700w, https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2276124586-scaled.jpg?resize=300%2C200&amp;ssl=1 300w, https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2276124586-scaled.jpg?resize=768%2C512&amp;ssl=1 768w, https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2276124586-scaled.jpg?resize=1536%2C1024&amp;ssl=1 1536w, https://i0.wp.com/stratechery.com/wp-content/uploads/2026/05/GettyImages-2276124586-scaled.jpg?resize=2048%2C1365&amp;ssl=1 2048w\" sizes=\"auto, (max-width: 1000px) 100vw, 1000px\" /><figcaption class=\"wp-element-caption\">(Photo by Alex Wong/Getty Images)</figcaption></figure>\n\n\n\n<p class=\"wp-block-paragraph\">Welcome back to This Week in Stratechery!</p>\n\n\n\n<p class=\"wp-block-paragraph\">As a reminder, each week, every Friday, we&#8217;re sending out this overview of content in the Stratechery bundle; <span class=\"passport-tokenized-link\">highlighted links are free for everyone</span>. Additionally, you have complete control over what we send to you. If you don&#8217;t want to receive This Week in Stratechery emails (there is no podcast), please uncheck the box in <a href=\"https://stratechery.passport.online/member/account/delivery\">your delivery settings</a>.</p>\n\n\n\n<p class=\"wp-block-paragraph\">On that note, here were a few of our favorites this week.</p>\n\n\n\n<ol class=\"wp-block-list is-style-default\">\n<li><strong>A New Kind of Computing</strong>. 
AI compute has been divided into two categories: training, and inference. However, in <a href=\"https://stratechery.com/2026/the-inference-shift/\" data-type=\"post\" data-id=\"18991\"><span class=\"passport-tokenized-link\">The Inference Shift</span></a> (and on <a href=\"https://sharptech.fm/member/episode/inference-in-the-agentic-future-x-ai-is-two-companies-in-one-q-a-on-elons-lawsuit-intel-apple\">this week&#8217;s Sharp Tech</a>), I make the case that there are two kinds of inference: the one we know today is &#8220;answer inference&#8221;, where humans are in the loop, and speed matters; the inference that will matter most in the future, at least in terms of market size, will be &#8220;agentic inference&#8221;, where humans aren&#8217;t involved at all. That will lead to very different trade-offs in architectures, and is good news for both China and space (but maybe not Nvidia). <em>—</em> <em>Ben Thompson</em><br></li>\n</ol>\n\n\n\n<span id=\"more-19040\"></span>\n\n\n\n<ol start=\"2\" class=\"wp-block-list is-style-default\">\n<li><strong>All About Elon. </strong>A week on from the news that Anthropic has secured compute from xAI, <a href=\"https://stratechery.com/2026/spacex-and-anthropic-xais-two-companies-elon-musk-and-spacexais-future/\">Tuesday&#8217;s Daily Update examined the deal from both sides</a>. On one hand, Anthropic&#8217;s side of the deal is a reminder that markets actually work quite well, much to the relief of Claude users all over the world. On the other, the logic of the deal for xAI raises an interesting question about whether Musk will listen to what the market has told him, as well as the future of space data centers and who exactly SpaceX will be serving. 
Finally, if you can&#8217;t get enough Elon, on Sharp Text this week I wrote about his ongoing lawsuit with OpenAI, and <a href=\"https://sharptext.net/2026/elons-openai-lawsuit-is-boring-and-insulting-and-its-already-a-success/\"><span class=\"passport-tokenized-link\">why I find the case both boring and insulting</span></a>, even as it&#8217;s clear that win or lose, Musk has already succeeded. <em>—</em> <em>Andrew Sharp</em><br><br></li>\n\n\n\n<li><strong>360 Degrees of US-China Relations.</strong> With a U.S. President visiting Beijing for the first time in nine years, <a href=\"https://sharpchina.fm/member/episode/10-questions-and-modest-expectations-with-trump-in-china-to-meet-xi-jinping\"><span class=\"passport-tokenized-link\">this week&#8217;s episode of Sharp China</span></a> asked 10 Questions about the US-China summit and what might be achieved. Trump has already left Beijing as you read this, and as predicted on the podcast, the deliverables from his visit were underwhelming (at least so far). Nevertheless, Wednesday&#8217;s conversation doubles as a great window into the state of the relationship generally, including why &#8220;upper hand&#8221; analysis tends to be overblown, why both sides are incentivized to play for time and stability, and the ways in which China&#8217;s posture has changed since the 90s and 2000s. Also: <a href=\"https://x.com/emilylgoodin/status/2054375216403611943?s=46\">Jensen Huang standing on a runway in Alaska</a>, and fun memories of a US-China fistfight in the Great Hall, back in 2017. 
<em>—</em> <em>AS</em></li>\n</ol>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"/>\n\n\n\n<h3 class=\"wp-block-heading\">Stratechery Articles and Updates</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://stratechery.com/2026/the-inference-shift/\"><span class=\"passport-tokenized-link\">The Inference Shift</span></a> <em>— Agentic inference is going to be different than the inference we use today, and it will change compute infrastructure because speed won’t matter when humans aren’t involved.</em></li>\n\n\n\n<li><a href=\"https://stratechery.com/2026/spacex-and-anthropic-xais-two-companies-elon-musk-and-spacexais-future/\">SpaceX and Anthropic, xAI’s Two Companies, Elon Musk and SpaceXAI’s Future</a> <em>— The Anthropic xAI deal is shocking but not surprising: Musk should double down on serving other companies.</em></li>\n\n\n\n<li><a href=\"https://stratechery.com/2026/the-deployment-company-back-to-the-70s-apple-and-intel/\">The Deployment Company, Back to the 70s, Apple and Intel</a> <em>— OpenAI is forming a new company to deploy AI, and the other labs aren’t far behind, reinforcing the thesis that AI’s impact will require top-down implementation. 
Then, Apple has economic reasons to work with Intel.</em></li>\n\n\n\n<li><a href=\"https://stratechery.com/2026/an-interview-with-ben-thompson-at-the-moffettnathanson-media-internet-communications-conference/\">An Interview with Ben Thompson at the MoffettNathanson Media, Internet &amp; Communications Conference</a> <em>— An interview with me about the implications of the compute shortage on Aggregation Theory, consumer AI, and more.</em></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Sharp Text by Andrew Sharp</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://sharptext.net/2026/elons-openai-lawsuit-is-boring-and-insulting-and-its-already-a-success/\"><span class=\"passport-tokenized-link\">Elon&#8217;s OpenAI Lawsuit Is Boring and Insulting, and It&#8217;s Already a Success</span></a> — <em>On the OpenAI trial and Elon Musk telling the world the same few Sam Altman stories that everyone knows and loves</em>.</li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Dithering with Ben Thompson and Daring Fireball&#8217;s John Gruber</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://dithering.passport.online/member/episode/apples-supply-squeeze\">Apple’s Supply Squeeze</a></li>\n\n\n\n<li><a href=\"https://dithering.passport.online/member/episode/apples-ai-land-grab\">Apple’s AI Land Grab</a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Asianometry with Jon Yu</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://asianometry.passport.online/member/episode/general-motors-dreamt-of-robots\">General Motors Dreamt of Robots</a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Sharp China with Andrew Sharp and Sinocism&#8217;s Bill Bishop</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://sharpchina.fm/member/episode/10-questions-and-modest-expectations-with-trump-in-china-to-meet-xi-jinping\"><span class=\"passport-tokenized-link\">10 Questions and Modest Expectations With Trump in China to Meet Xi 
Jinping</span></a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Greatest of All Talk with Andrew Sharp and Ben Golliver</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://goat.passport.online/member/episode/the-wizards-win-the-nba-lottery-post-lottery-reactions-and-questions-a-wemby-ejection-and-a-wolves-resurrection\">The Wizards Win the NBA Lottery, Post-Lottery Reactions and Questions, A Wemby Ejection and a Wolves Resurrection</a></li>\n\n\n\n<li><a href=\"https://goat.passport.online/member/episode/cavs-and-spurs-approach-the-conference-finals-the-knicks-look-better-than-ever-morey-out-and-whats-next-for-le-bron\">Cavs and Spurs Approach the Conference Finals, The Knicks Look Better Than Ever, Morey Out and What’s Next for LeBron</a></li>\n</ul>\n\n\n\n<h3 class=\"wp-block-heading\">Sharp Tech with Andrew Sharp and Ben Thompson</h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https://sharptech.fm/member/episode/inference-in-the-agentic-future-x-ai-is-two-companies-in-one-q-a-on-elons-lawsuit-intel-apple\">Inference in the Agentic Future, xAI Is Two Companies in One, Q&amp;A on Elon’s Lawsuit, Intel, Apple</a></li>\n</ul>\n\n\n\n<p class=\"wp-block-paragraph\">This week&#8217;s Stratechery video is on <a href=\"https://stratechery.com/2026/amazons-durability/\"><span class=\"passport-tokenized-link\">Amazon’s Durability</span></a>.</p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Amazon&#039;s Durability | Stratechery by Ben Thompson\" width=\"500\" height=\"281\" src=\"https://www.youtube.com/embed/KSJA5JFb25w?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>\n</div></figure>\n\n\n\n<p 
class=\"wp-block-paragraph\"></p>\n"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19032",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/an-interview-with-ben-thompson-at-the-moffettnathanson-media-internet-communications-conference/",
            "title": "An Interview with Ben Thompson at the MoffettNathanson Media, Internet &#038; Communications Conference",
            "post-id": {
              "#text": "19032",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Thu, 14 May 2026 10:00:00 +0000",
            "category": "Daily Update",
            "dc:creator": "Ben Thompson",
            "description": "An interview with me about the implications of the compute shortage on Aggregation Theory, consumer AI, and more.",
            "content:encoded": "<p>An interview with me about the implications of the compute shortage on Aggregation Theory, consumer AI, and more.</p>"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19023",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/the-deployment-company-back-to-the-70s-apple-and-intel/",
            "title": "The Deployment Company, Back to the 70s, Apple and Intel",
            "post-id": {
              "#text": "19023",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Wed, 13 May 2026 10:00:00 +0000",
            "category": "Daily Update",
            "dc:creator": "Ben Thompson",
            "description": "OpenAI is forming a new company to deploy AI, and the other labs aren't far behind, reinforcing the thesis that AI's impact will require top-down implementation. Then, Apple has economic reasons to work with Intel.",
            "content:encoded": "<p>OpenAI is forming a new company to deploy AI, and the other labs aren't far behind, reinforcing the thesis that AI's impact will require top-down implementation. Then, Apple has economic reasons to work with Intel.</p>"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=19012",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/spacex-and-anthropic-xais-two-companies-elon-musk-and-spacexais-future/",
            "title": "SpaceX and Anthropic, xAI&#8217;s Two Companies, Elon Musk and SpaceXAI&#8217;s Future",
            "post-id": {
              "#text": "19012",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Tue, 12 May 2026 10:00:00 +0000",
            "category": "Daily Update",
            "dc:creator": "Ben Thompson",
            "description": "The Anthropic xAI deal is shocking but not surprising: Musk should double down on serving other companies.",
            "content:encoded": "<p>The Anthropic xAI deal is shocking but not surprising: Musk should double down on serving other companies.</p>"
          },
          {
            "guid": {
              "#text": "https://stratechery.com/?p=18991",
              "@_isPermaLink": "false"
            },
            "link": "https://stratechery.com/2026/the-inference-shift/",
            "title": "The Inference Shift",
            "post-id": {
              "#text": "18991",
              "@_xmlns": "com-wordpress:feed-additions:1"
            },
            "pubDate": "Mon, 11 May 2026 10:00:00 +0000",
            "category": "Articles",
            "dc:creator": "Ben Thompson",
            "description": "Agentic inference is going to be different than the inference we use today, and it will change compute infrastructure because speed won't matter when humans aren't involved.",
            "content:encoded": "\n<div class=\"wp-block-passport-restricted-content\">\n\n\n\n\n<div class=\"wp-block-passport-logged-out-view\">\n<div class=\"wp-block-group wp-block-passport-podcast-player-logged-out-view has-border-color has-global-padding is-layout-constrained wp-container-core-group-is-layout-27c52d2c wp-block-group-is-layout-constrained\" style=\"border-color:#DDDDDD;border-width:1px;border-radius:5px;padding-top:var(--wp--preset--spacing--20);padding-right:var(--wp--preset--spacing--30);padding-bottom:var(--wp--preset--spacing--20);padding-left:var(--wp--preset--spacing--30)\">\n<div class=\"wp-block-group passport-podcast-player-logged-out-view-underlay has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-nowrap is-layout-flex wp-container-core-group-is-layout-878226d7 wp-block-group-is-layout-flex\">\n<div class=\"wp-block-group wp-container-content-9cfa9a5a has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\">\n<p class=\"has-small-font-size wp-block-paragraph\"><strong>Listen to this <strong>post</strong>:</strong></p>\n\n\n\n<div class=\"wp-block-group has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\" style=\"margin-top:0.5rem\">\n<audio controls disabled style=\"width:100%;margin-top:0.15rem\"></audio>\n</div>\n</div>\n</div>\n</div>\n\n\n\n<div class=\"wp-block-group is-style-default passport-podcast-player-logged-out-view-overlay is-vertical is-content-justification-center is-nowrap is-layout-flex wp-container-core-group-is-layout-470faa6e wp-block-group-is-layout-flex\" style=\"margin-top:0\">\n<div class=\"wp-block-group wp-block-buttons has-global-padding is-layout-constrained wp-block-group-is-layout-constrained\" style=\"margin-top:var(--wp--preset--spacing--20)\"><div style=\"text-decoration:none\" class=\"passport-logged-out align wp-block-passport-login-link\">\n<div class=\"wp-block-buttons is-layout-flex 
wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a href=\"https://stratechery.com/wp-json/passport/v1/oauth/authlogin?signup_redirect_uri=https%3A%2F%2Fstratechery.com%2Fverify-your-email%2F\" class=\"wp-block-button__link wp-element-button\">Log in to listen</a></div>\n</div>\n</div></div>\n</div>\n</div>\n</div>\n\n\n</div>\n\n\n\n<p class=\"wp-block-paragraph\">If you were looking for the ideal time to IPO, being a chip company in May 2026 is hard to beat. <a href=\"https://www.reuters.com/legal/transactional/cerebras-raise-ipo-price-range-150-160-demand-surges-sources-say-2026-05-10/\">Reuters reported over the weekend</a>:</p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">Cerebras Systems is set to raise the size and price of its initial public offering as soon as Monday, as demand for the artificial intelligence chipmaker&#8217;s shares continues to climb, two people familiar with the matter told Reuters on Sunday. The company is considering a new IPO price range of $150-$160 a share, up from $115-$125 a share, and raising the number of shares marketed to 30 million from 28 million, said the sources, who asked not to be identified because the information isn&#8217;t public yet.</p>\n</blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">The fundamental driver of the ongoing surge in semiconductor stocks is, of course, AI, particularly the realization that <a href=\"https://stratechery.com/2026/agents-over-bubbles/\">agents are going to need a lot of compute</a>. 
What Cerebras represents, however, is something broader: while the compute story for AI has been largely about GPUs, particularly from Nvidia, the future is going to look increasingly heterogeneous.</p>\n\n\n\n<h3 class=\"wp-block-heading\">The GPU Era</h3>\n\n\n\n<p class=\"wp-block-paragraph\">The story of how Graphics Processing Units became the center of AI is a well-trodden one, but in brief:</p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Just as drawing pixels on a computer screen was a parallel process, which meant there was a direct connection between the number of processing units and graphics speed, making AI-related calculations was a parallel process, which meant there was a direct connection between the number of processing units and calculation speed.</li>\n\n\n\n<li>Nvidia enabled this dual-usage by making its graphics processors programmable, and created an entire software ecosystem called CUDA to make this programming accessible.</li>\n\n\n\n<li>The big difference between graphics and AI has been the size of the problem being solved — models are a lot bigger than video game textures — which has led to a dramatic expansion in high-bandwidth memory (HBM) per GPU, and dramatic innovations in terms of chip-to-chip networking to allow multiple chips to work together as one addressable system. Nvidia has been the leader in both.</li>\n</ul>\n\n\n\n<p class=\"wp-block-paragraph\">The number one use case for GPUs has been training, which stresses the third point in particular. While the calculations within each training step are massively parallel, the steps themselves are serial: every GPU has to share its results with every other GPU before the next step can begin. This is why a trillion-parameter model needs to fit in the aggregate memory of tens of thousands of GPUs that can communicate as one system. 
Nvidia dominates both problem spaces, first by securing HBM ahead of the rest of the industry, and second thanks to its investments in networking.</p>\n\n\n\n<p class=\"wp-block-paragraph\">Of course training isn&#8217;t the only AI workload: the other is inference. Inference has three main parts:</p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Prefill encodes everything the LLM needs to know into an understandable state; this is highly parallelizable and compute matters.</li>\n\n\n\n<li>The first part of decode entails reading the KV cache — which stores context, including the output of the prefill step — to make an attention calculation. This is a serial step where bandwidth matters, but the memory requirements are variable and increasingly large.</li>\n\n\n\n<li>The second part of decode is the feed-forward computation over the model weights; this is also a serial step where bandwidth matters, and the memory requirements are defined by the size of the model.</li>\n</ul>\n\n\n\n<p class=\"wp-block-paragraph\">The two decode steps alternate for every layer of the model (they&#8217;re interleaved, not in sequence), which is to say that decode is serial and memory-bandwidth bound. For every token generated, two distinct memory pools must be read: the KV cache, which stores context and grows with each token, and the model weights themselves. Both must be read in full to produce a single output token.</p>\n\n\n\n<p class=\"wp-block-paragraph\">GPUs handle all three needs: high compute for prefill, abundant HBM for KV cache and model weights, and chip-to-chip networking to pool memory across multiple chips when a single GPU isn&#8217;t enough. In other words, what works for training works for inference — look no further than the deal SpaceX made with Anthropic. 
From <a href=\"https://www.anthropic.com/news/higher-limits-spacex\">Anthropic&#8217;s blog</a>:</p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">We’ve signed an agreement with SpaceX to use all of the compute capacity at their Colossus 1 data center. This gives us access to more than 300 megawatts of new capacity (over 220,000 NVIDIA GPUs) within the month. This additional capacity will directly improve capacity for Claude Pro and Claude Max subscribers.</p>\n</blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">SpaceX retains Colossus 2 — presumably for both training of future models and inference of existing ones — and can afford to do both in the same data center precisely because xAI&#8217;s models aren&#8217;t getting much usage; more pertinently to this piece, they can do both in the same data center because both training and inference can be done on GPUs. Indeed, the GPUs Anthropic is contracting for at Colossus 1 were originally used for training as well; the fact that GPUs are so flexible is a big advantage.</p>\n\n\n\n<h3 class=\"wp-block-heading\">Understanding Cerebras</h3>\n\n\n\n<p class=\"wp-block-paragraph\">Cerebras makes something completely different. While a silicon wafer has a diameter of 300mm, the &#8220;reticle limit&#8221; — the maximum area that a lithography tool can expose on that wafer — is around 26mm x 33mm. This is the effective size limit for chips; going beyond that entails linking two separate chips together over a chip-to-chip interposer, which is exactly what Nvidia has done with the B200. 
Cerebras, on the other hand, has invented a way to lay down wiring across the so-called &#8220;scribe lines&#8221; that are the boundary between reticle exposures, making the entire wafer into a single chip with no need for relatively slow chip-to-chip linkages.</p>\n\n\n\n<p class=\"wp-block-paragraph\">The net result is a chip with a lot of compute and a lot of SRAM that is blisteringly fast to access. To put it in numbers, the WSE-3 (Cerebras&#8217; latest chip) has 44GB of on-chip SRAM at 21 PB/s of bandwidth; an H100 has 80GB of HBM at 3.35 TB/s. In other words, the WSE-3 has just over half the memory of an H100, but more than 6,000 times the memory bandwidth.</p>\n\n\n\n<p class=\"wp-block-paragraph\">The reason to compare the WSE-3 to an H100 is that the H100 is the chip most used for inference, and inference is clearly what Cerebras is best suited for. You can use Cerebras chips for training, but the chip-to-chip networking story isn&#8217;t very compelling, which is to say that all of that compute and on-chip memory is mostly just sitting around; what is much more interesting is the idea of getting a stream of tokens at a dramatically faster speed than you can from a GPU.</p>\n\n\n\n<p class=\"wp-block-paragraph\">Note, however, that the limitation in terms of training also potentially applies in terms of inference: as long as everything fits in on-chip memory, Cerebras&#8217; speed is an incredible experience; the moment you need more memory, whether that be for a larger model or, more likely, a larger KV cache, then Cerebras doesn&#8217;t make much sense, particularly given the price. 
That whole-wafer-as-chip technique means high yields are a massive challenge, which hugely drives up costs.</p>\n\n\n\n<p class=\"wp-block-paragraph\">At the same time, I do think there will be a market for Cerebras-style chips: right now <a href=\"https://www.cerebras.ai/blog/why-the-ai-race-shifted-to-speed\">the company is highlighting the usefulness of speed for coding</a> — reasoning means a lot of tokens, which means that dramatically scaling up tokens-per-second equals faster thinking — but I think this is a temporary use case, for reasons I&#8217;ll explain in a bit. What does matter is how long humans are waiting for an answer, and as products like AI wearables become more of a thing, the speed of interaction, particularly for voice — which will be a function of token generation speed — will have a tangible effect on the user experience.</p>\n\n\n\n<h3 class=\"wp-block-heading\">Agentic Inference</h3>\n\n\n\n<p class=\"wp-block-paragraph\">I have previously made the case, including in <a href=\"https://stratechery.com/2026/agents-over-bubbles/\">Agents Over Bubbles</a>, that we have gone through three inflection points in the LLM era:</p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>ChatGPT demonstrated the utility of token prediction.</li>\n\n\n\n<li>o1 introduced the idea of reasoning, where more tokens meant better answers.</li>\n\n\n\n<li>Opus 4.5 and Claude Code introduced the first usable agents, which could actually accomplish tasks, using a combination of reasoning models and a harness that utilized tools, verified work, etc.</li>\n</ul>\n\n\n\n<p class=\"wp-block-paragraph\">All of this falls under the banner of &#8220;inference&#8221;, but I think it will be increasingly clear that there is a difference between providing an answer — what I will call &#8220;answer inference&#8221; — and doing a task — what I will call &#8220;agentic inference.&#8221; Cerebras&#8217; target market is &#8220;answer inference&#8221;; in the long run, I think the 
architecture for &#8220;agentic inference&#8221; will look a lot different, not just from Cerebras&#8217; approach, but from the GPU approach as well.</p>\n\n\n\n<p class=\"wp-block-paragraph\">I mentioned above that fast inference for coding is a temporary use case. Specifically, coding with LLMs requires a human in the loop. It&#8217;s the human that defines what is to be coded, checks the work, commits the pull request, etc.; it&#8217;s not hard to envision a future, however, where all of this is completely handled by machines. This will apply to agentic work broadly: the true power of agents will not be that they do work for humans, but rather that they do work without human involvement at all.</p>\n\n\n\n<p class=\"wp-block-paragraph\">This, by extension, will mean that the likely best approach to solving agentic inference will look a lot different than answer inference. The most important aspect for answer inference is token speed; the most important aspect for agentic inference, however, is memory. Agents need context, state, and history. Some of that will live as active KV cache; some will live in host memory or SSDs; much of it will live in databases, logs, embeddings, and object stores. The important point is that agentic inference will be less about GPUs answering a question and more about the memory hierarchy wrapped around a model.</p>\n\n\n\n<p class=\"wp-block-paragraph\">Critically, this articulation of an agentic-specific memory hierarchy implies a necessary trade-off of speed for capacity. Here&#8217;s the thing, though: lower speed isn&#8217;t nearly as important a consideration if there isn&#8217;t a human in the loop. 
If an agent is waiting around for a job that is being run overnight, the agent doesn&#8217;t know or care about the user experience impact; what is most important is being able to accomplish a task, and if entirely new approaches to memory make that possible, then delays are fine.</p>\n\n\n\n<p class=\"wp-block-paragraph\">Meanwhile, if delays are fine, then all of the focus on pure compute power and high-bandwidth memory seems out of place: if latency isn&#8217;t the top priority, then slower and cheaper memory — like traditional DRAM, for example — makes a lot more sense. And if the entire system is mostly waiting on memory, then chips don&#8217;t need to be as fast as the cutting edge either. This represents a profound shift in future architectures, but it also doesn&#8217;t mean that current architectures are going away:</p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Training will continue to matter, and Nvidia&#8217;s current architecture, including high-speed compute, large amounts of high-bandwidth memory, and high-speed networking, will likely continue to dominate.</li>\n\n\n\n<li>Answer inference will be a meaningful market, albeit a relatively small one, and speed from chips like Cerebras or Groq (I explained how Nvidia is deploying Groq&#8217;s LPUs <a href=\"https://stratechery.com/2026/jensen-huang-and-andy-grove-groq-lpus-and-vera-cpus-hotel-california/\">here</a>) will be very useful.</li>\n\n\n\n<li>Agentic inference will gradually unbundle the GPU, which alternates between stranding high-bandwidth memory (during the prefill process) and stranding compute (during the decode process), in favor of increasingly sophisticated memory hierarchies dominated by high capacity and relatively lower cost memory types, with &#8220;good enough&#8221; compute; indeed, if anything it will be the speed of CPUs for things like tool use that will matter more than the speed of GPUs.</li>\n</ul>\n\n\n\n<p class=\"wp-block-paragraph\">At the same time, these categories 
won&#8217;t be equal in size or importance. Specifically, agentic inference will be the largest market by far, because that is the market that won&#8217;t be limited by humans or time. Today&#8217;s agents are fancy answer inference; in the future, true agentic inference will be work done by computers according to dictates given by other computers, and the market size scales not with humans but with compute.</p>\n\n\n\n<h3 class=\"wp-block-heading\">The Implications of Agentic Inference on Compute</h3>\n\n\n\n<p class=\"wp-block-paragraph\">To date the invocation of &#8220;scaling with compute&#8221; has implicitly meant Nvidia bullishness. However, much of Nvidia&#8217;s relative advantage to date has been a function of latency: Nvidia chips have fast compute, but keeping that compute busy has required big investments in ever-expanding HBM and networking. If latency isn&#8217;t the key constraint, however, then Nvidia&#8217;s approach seems less worth paying a premium for.</p>\n\n\n\n<p class=\"wp-block-paragraph\">Nvidia does recognize this shift: the company launched an inference framework called <a href=\"https://www.nvidia.com/en-us/ai/dynamo/\">Dynamo</a> that helps disaggregate different parts of inference, and is shipping products like standalone memory and CPU racks to enable increasingly large KV caches and faster tool use, the better to keep its expensive GPUs busy. Ultimately, however, it&#8217;s easy to see cost and simplicity being increasingly attractive to hyperscalers for agentic inference that isn&#8217;t remotely GPU-bound.</p>\n\n\n\n<p class=\"wp-block-paragraph\">China, meanwhile, for all of its lack of leading-edge compute, has everything it needs for agentic inference: fast-enough (but not leading-edge) GPUs, fast-enough (but not leading-edge) CPUs, DRAM, hard drives, etc. 
The challenge, of course, is compute for training; it&#8217;s also possible that answer inference is more important for national security, at least when it comes to military applications.</p>\n\n\n\n<p class=\"wp-block-paragraph\">The other interesting angle is space: slower chips actually make space data centers more viable for a number of reasons. First, if memory can be offloaded, chips can be made much simpler and run much cooler. Second, older nodes, by virtue of being physically larger, will better withstand space radiation. Third, older nodes require less power, which means there will be less heat to dissipate via radiation. Fourth, not being on the bleeding edge will mean higher reliability, an important consideration given that satellites won&#8217;t be repairable.</p>\n\n\n\n<p class=\"wp-block-paragraph\">Nvidia CEO Jensen Huang regularly says that &#8220;Moore&#8217;s Law is Dead&#8221;; what he means is that the future of computing speed-ups will be a function of systems innovation, which is exactly what Nvidia has done. Maybe the most profound implication of agents that act without humans in the loop, however, will be that Moore&#8217;s Law doesn&#8217;t matter, and that the way we get more compute is by realizing that the compute we have is already good enough.</p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"/>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"The Inference Shift | Stratechery by Ben Thompson\" width=\"500\" height=\"281\" src=\"https://www.youtube.com/embed/xyLr6XcktTo?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen></iframe>\n</div></figure>\n"
          }
        ],
        "link": "https://stratechery.com",
        "site": {
          "#text": "188043415",
          "@_xmlns": "com-wordpress:feed-additions:1"
        },
        "image": {
          "url": "https://i0.wp.com/stratechery.com/wp-content/uploads/2018/03/cropped-android-chrome-512x512-1.png?fit=32%2C32&#038;ssl=1",
          "link": "https://stratechery.com",
          "title": "Stratechery by Ben Thompson",
          "width": "32",
          "height": "32"
        },
        "title": "Stratechery by Ben Thompson",
        "language": "en-US",
        "atom:link": {
          "@_rel": "self",
          "@_href": "https://stratechery.com/feed/",
          "@_type": "application/rss+xml"
        },
        "description": "On the business, strategy, and impact of technology.",
        "lastBuildDate": "Tue, 26 May 2026 08:47:34 +0000",
        "sy:updatePeriod": "hourly",
        "sy:updateFrequency": "1"
      },
      "@_version": "2.0",
      "@_xmlns:dc": "http://purl.org/dc/elements/1.1/",
      "@_xmlns:sy": "http://purl.org/rss/1.0/modules/syndication/",
      "@_xmlns:wfw": "http://wellformedweb.org/CommentAPI/",
      "@_xmlns:atom": "http://www.w3.org/2005/Atom",
      "@_xmlns:slash": "http://purl.org/rss/1.0/modules/slash/",
      "@_xmlns:content": "http://purl.org/rss/1.0/modules/content/"
    },
    "?xml": {
      "@_version": "1.0",
      "@_encoding": "UTF-8"
    }
  },
  "entry_raw": {
    "guid": {
      "#text": "https://stratechery.com/?p=19130",
      "@_isPermaLink": "false"
    },
    "link": "https://stratechery.com/2026/nvidia-earnings-the-ai-stack-nvidias-new-reporting/",
    "title": "Nvidia Earnings, The AI Stack, Nvidia&#8217;s New Reporting",
    "post-id": {
      "#text": "19130",
      "@_xmlns": "com-wordpress:feed-additions:1"
    },
    "pubDate": "Tue, 26 May 2026 10:00:00 +0000",
    "category": "Daily Update",
    "dc:creator": "Ben Thompson",
    "description": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.",
    "content:encoded": "<p>Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.</p>"
  },
  "stats_raw": {
    "published_at": "2026-05-26T10:00:00.000Z",
    "categories_count": 1,
    "content_excerpt_length": 170
  },
  "aux_raw": {
    "author": "Ben Thompson",
    "categories": [
      "Daily Update"
    ],
    "feed_title": "Stratechery by Ben Thompson",
    "raw_excerpt": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.",
    "feed_site_url": "https://stratechery.com",
    "content_excerpt": "Nvidia is changing its reporting to delineate between hyperscaler sales — where Nvidia is fighting commoditization — and everyone else, where Nvidia runs the whole stack.",
    "feed_description": "On the business, strategy, and impact of technology."
  },
  "selection_meta": {
    "feed_id": "stratechery",
    "feed_priority": 7,
    "lookback_days": 7,
    "snapshot_version": "newsletter_rss_entry_v1",
    "max_items_per_feed": 5,
    "max_normalized_items": 40
  },
  "created_at": "2026-05-26T22:02:29.891Z",
  "updated_at": "2026-05-26T22:02:29.891Z"
}