星火 SparkCN

Pain point analysis, published 2026/05/29

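Pain points are a preliminary AI-generated distillation of the upstream raw evidence; no additional China-market research is included.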

Pain point

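User discussion centers on the ratio of training data to model size. One commenter feels that 38T tokens is a lot for an 8B model and describes it as "overtraining". Another measures the run against the Chinchilla scaling rule of thumb (about 20 training tokens per active parameter), recalls Mistral as roughly 2x Chinchilla, and puts this model at about 1800x. This suggests that model developers lack a settled way to choose the ratio of training tokens to parameters: established heuristics such as Chinchilla may no longer apply, which risks wasted compute or models that fall short of expectations. That uncertainty raises the trial-and-error cost and time spent on model tuning.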
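The arithmetic behind the commenters' figures can be reproduced with a minimal sketch. The parameter counts are assumptions read off the model name ("8B" as roughly 8e9 total parameters, "A1B" as roughly 1e9 active parameters); the excerpt does not give exact counts. The 20 tokens per parameter figure is the rule of thumb quoted in the thread, and the function names are illustrative only.

```python
# Sketch of the tokens-to-parameters arithmetic discussed in the thread.
# ASSUMPTIONS (not stated in the source): nominal parameter counts taken
# from the model name, i.e. ~8e9 total and ~1e9 active parameters.

CHINCHILLA_TOKENS_PER_PARAM = 20  # rule of thumb quoted by the commenter


def tokens_per_param(train_tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return train_tokens / params


def chinchilla_multiple(train_tokens: float, params: float,
                        rule: float = CHINCHILLA_TOKENS_PER_PARAM) -> float:
    """How many times the rule-of-thumb token budget was actually used."""
    return train_tokens / (rule * params)


TRAIN_TOKENS = 38e12   # 38T tokens, from the article excerpt
ACTIVE_PARAMS = 1e9    # assumed from "A1B"
TOTAL_PARAMS = 8e9     # assumed from "8B"

active_ratio = tokens_per_param(TRAIN_TOKENS, ACTIVE_PARAMS)
total_ratio = tokens_per_param(TRAIN_TOKENS, TOTAL_PARAMS)
active_multiple = chinchilla_multiple(TRAIN_TOKENS, ACTIVE_PARAMS)
total_multiple = chinchilla_multiple(TRAIN_TOKENS, TOTAL_PARAMS)

if __name__ == "__main__":
    print(f"tokens per active param: {active_ratio:,.0f}")
    print(f"tokens per total param:  {total_ratio:,.0f}")
    print(f"x Chinchilla (active):   {active_multiple:,.1f}")
    print(f"x Chinchilla (total):    {total_multiple:,.1f}")
```

Under these nominal counts the active-parameter multiple comes out near 1900x, the same order as the commenter's "1800 x"; the exact value depends on the true active parameter count. Measured against total parameters the multiple is far smaller, so which count goes in the denominator changes how extreme the run looks.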

External Article

External article summary

Today, we’re releasing LFM2.5-8B-A1B, a high-throughput edge model optimized for fast, reliable tool calling and complex instruction following on consumer hardware, delivering compressed performance competitive with much larger models and day-one support across major inference frameworks.

External article source

Article title: LFM2.5-8B-A1B: an Even Better on-Device Mixture-of-Experts | Liquid AI
Source URL: https://www.liquid.ai/blog/lfm2-5-8b-a1b
Host: www.liquid.ai

§ Dossier

Selected HN comments

The small models are getting really impressive. I recently realized that Qwen3.5:4B is way more capable than I thought a model that size could be. Combine that with the work Liquid puts into RL and fine tuning, and you get models that perform extremely well on minimal hardware. Combine that with your own fine tuning, and you get a specialized tool that is fast, private, and doesn’t require internet connection.

chabes

Hmm, I asked it who made it, and it says Google?

kilroy123

Woah, chinchilla scaling is 20 x active_params. I think mistral was 2 x Chinchilla. This is 1800 x

irthomasthomas

Anybody use their localcowork [1] before? That is where the demo lives. Or not? [1] https://github.com/Liquid4All/cookbook/tree/main/examples/lo...

SubiculumCode

Liquid does amazing work, but I kinda feel like they are overtraining their models. 38T tokens seems like a lot for an 8B model

Ifkaluva

Source data · Raw Archive

source: Hacker News
upstream_source: hacker_news
upstream_item_id: 48325306
daily_ranking_item_id: 93681b17-40f4-4ae6-97b7-5cc2ca2308e6
rank_date: 2026-05-30
rank: 10
name: Liquid AI reveals 8B-A1B MoE trained on 38T
tagline: www.liquid.ai
votes_count: 113
comments_count: 34
created_at_on_source: 2026-05-29T16:19:54.000Z
source_url: https://news.ycombinator.com/item?id=48325306
website_url: https://www.liquid.ai/blog/lfm2-5-8b-a1b

media / source-specific data

{
  "author": "simjnd",
  "hn_item_id": 48325306,
  "external_url": "https://www.liquid.ai/blog/lfm2-5-8b-a1b"
}

raw_payload

{
  "by": "simjnd",
  "id": 48325306,
  "url": "https://www.liquid.ai/blog/lfm2-5-8b-a1b",
  "kids": [
    48328322,
    48329846,
    48329703,
    48328511,
    48329676,
    48328307,
    48327949,
    48329706,
    48328208,
    48328281,
    48328261,
    48328092,
    48328194
  ],
  "time": 1780071594,
  "type": "story",
  "score": 113,
  "title": "Liquid AI reveals 8B-A1B MoE trained on 38T",
  "descendants": 34
}

source_raw_snapshot

{
  "id": "c209503f-a46e-470d-b29a-1857c47e35d3",
  "daily_ranking_item_id": "93681b17-40f4-4ae6-97b7-5cc2ca2308e6",
  "source": "hacker_news",
  "external_id": "48325306",
  "fetched_at": "2026-05-29T22:01:20.908Z",
  "story_raw": {
    "by": "simjnd",
    "id": 48325306,
    "url": "https://www.liquid.ai/blog/lfm2-5-8b-a1b",
    "kids": [
      48328322,
      48329846,
      48329703,
      48328511,
      48329676,
      48328307,
      48327949,
      48329706,
      48328208,
      48328281,
      48328261,
      48328092,
      48328194
    ],
    "time": 1780071594,
    "type": "story",
    "score": 113,
    "title": "Liquid AI reveals 8B-A1B MoE trained on 38T",
    "descendants": 34
  },
  "stats_raw": {
    "time": 1780071594,
    "score": 113,
    "descendants": 34
  },
  "aux_raw": {
    "external_url": "https://www.liquid.ai/blog/lfm2-5-8b-a1b",
    "hn_comment_url": "https://news.ycombinator.com/item?id=48325306",
    "normalized_text": null,
    "external_article": {
      "title": "LFM2.5-8B-A1B: an Even Better on-Device Mixture-of-Experts | Liquid AI",
      "excerpt": "Today, we're releasing LFM2.5-8B-A1B , an edge model built for fast, reliable tool calling on consumer hardware.\n\nIt builds on our LFM2-8B-A1B release from October 2025, with an expanded 128K context window, scaled-up pretraining (from 12T to 38T tokens), and large-scale reinforcement learning. We also doubled its vocabulary to improve tokenization efficiency for non-Latin languages. The result is a model that chains tool calls, achieves tasks, and fits comfortably even on an entry-level laptop.\n\nThe base (LFM2.5-8B-A1B-Base) and post-trained (LFM2.5-8B-A1B) models are available today on Hugging Face and our Playground . Check out our docs on how to run and fine-tune them locally.\n\nCompared to LFM2-8B-A1B, this new version expands the context window from 32,768 to 128,000 tokens . This allows the model to process longer documents and reason for longer. Its vocabulary size was also scaled up from 65,536 to 128,000 to tokenize non-Latin scripts more efficiently . We see particularly strong compression gains in Hindi, Thai, Vietnamese, Indonesian, and Arabic. The rest of the architecture follows the same combination of MoE, GQA, and gated short convolution blocks as LFM2-8B-A1B, as sh",
      "final_url": "https://www.liquid.ai/blog/lfm2-5-8b-a1b",
      "fetched_at": "2026-05-29T22:01:18.268Z",
      "description": "Today, we’re releasing LFM2.5-8B-A1B, a high-throughput edge model optimized for fast, reliable tool calling and complex instruction following on consumer hardware, delivering compressed performance competitive with much larger models and day-one support across major inference frameworks."
    },
    "selected_comments": [
      {
        "id": 48328322,
        "raw": {
          "by": "chabes",
          "id": 48328322,
          "kids": [
            48328383
          ],
          "text": "The small models are getting really impressive.<p>I recently realized that Qwen3.5:4B is way more capable than I thought a model that size could be.<p>Combine that with the work Liquid puts into RL and fine tuning, and you get models that perform extremely well on minimal hardware.<p>Combine that with your own fine tuning, and you get a specialized tool that is fast, private, and doesn’t require internet connection.",
          "time": 1780084352,
          "type": "comment",
          "parent": 48325306
        },
        "body": "The small models are getting really impressive. I recently realized that Qwen3.5:4B is way more capable than I thought a model that size could be. Combine that with the work Liquid puts into RL and fine tuning, and you get models that perform extremely well on minimal hardware. Combine that with your own fine tuning, and you get a specialized tool that is fast, private, and doesn’t require internet connection.",
        "is_op": false,
        "author": "chabes",
        "raw_body": "The small models are getting really impressive.<p>I recently realized that Qwen3.5:4B is way more capable than I thought a model that size could be.<p>Combine that with the work Liquid puts into RL and fine tuning, and you get models that perform extremely well on minimal hardware.<p>Combine that with your own fine tuning, and you get a specialized tool that is fast, private, and doesn’t require internet connection.",
        "created_at": 1780084352,
        "reply_count": 1
      },
      {
        "id": 48329846,
        "raw": {
          "by": "kilroy123",
          "id": 48329846,
          "text": "Hmm, I asked it who made it, and it says Google?",
          "time": 1780091823,
          "type": "comment",
          "parent": 48325306
        },
        "body": "Hmm, I asked it who made it, and it says Google?",
        "is_op": false,
        "author": "kilroy123",
        "raw_body": "Hmm, I asked it who made it, and it says Google?",
        "created_at": 1780091823,
        "reply_count": 0
      },
      {
        "id": 48329703,
        "raw": {
          "by": "irthomasthomas",
          "id": 48329703,
          "text": "Woah, chinchilla scaling is 20 x active_params. I think mistral was 2 x Chinchilla. This is 1800 x",
          "time": 1780090938,
          "type": "comment",
          "parent": 48325306
        },
        "body": "Woah, chinchilla scaling is 20 x active_params. I think mistral was 2 x Chinchilla. This is 1800 x",
        "is_op": false,
        "author": "irthomasthomas",
        "raw_body": "Woah, chinchilla scaling is 20 x active_params. I think mistral was 2 x Chinchilla. This is 1800 x",
        "created_at": 1780090938,
        "reply_count": 0
      },
      {
        "id": 48328511,
        "raw": {
          "by": "SubiculumCode",
          "id": 48328511,
          "text": "Anybody use their localcowork [1] before? \nThat is where the demo lives. Or not?<p>[1] <a href=\"https:&#x2F;&#x2F;github.com&#x2F;Liquid4All&#x2F;cookbook&#x2F;tree&#x2F;main&#x2F;examples&#x2F;localcowork\" rel=\"nofollow\">https:&#x2F;&#x2F;github.com&#x2F;Liquid4All&#x2F;cookbook&#x2F;tree&#x2F;main&#x2F;examples&#x2F;lo...</a>",
          "time": 1780085260,
          "type": "comment",
          "parent": 48325306
        },
        "body": "Anybody use their localcowork [1] before? That is where the demo lives. Or not? [1] https://github.com/Liquid4All/cookbook/tree/main/examples/lo...",
        "is_op": false,
        "author": "SubiculumCode",
        "raw_body": "Anybody use their localcowork [1] before? \nThat is where the demo lives. Or not?<p>[1] <a href=\"https:&#x2F;&#x2F;github.com&#x2F;Liquid4All&#x2F;cookbook&#x2F;tree&#x2F;main&#x2F;examples&#x2F;localcowork\" rel=\"nofollow\">https:&#x2F;&#x2F;github.com&#x2F;Liquid4All&#x2F;cookbook&#x2F;tree&#x2F;main&#x2F;examples&#x2F;lo...</a>",
        "created_at": 1780085260,
        "reply_count": 0
      },
      {
        "id": 48329676,
        "raw": {
          "by": "Ifkaluva",
          "id": 48329676,
          "kids": [
            48329781
          ],
          "text": "Liquid does amazing work, but I kinda feel like they are overtraining their models. 38T tokens seems like a lot for an 8B model",
          "time": 1780090811,
          "type": "comment",
          "parent": 48325306
        },
        "body": "Liquid does amazing work, but I kinda feel like they are overtraining their models. 38T tokens seems like a lot for an 8B model",
        "is_op": false,
        "author": "Ifkaluva",
        "raw_body": "Liquid does amazing work, but I kinda feel like they are overtraining their models. 38T tokens seems like a lot for an 8B model",
        "created_at": 1780090811,
        "reply_count": 1
      }
    ],
    "presentation_fields": {
      "title": "Liquid AI reveals 8B-A1B MoE trained on 38T",
      "tagline": "www.liquid.ai",
      "website_url": "https://www.liquid.ai/blog/lfm2-5-8b-a1b",
      "canonical_url": "https://news.ycombinator.com/item?id=48325306"
    },
    "external_url_hostname": "www.liquid.ai",
    "selected_comments_raw": [
      {
        "by": "chabes",
        "id": 48328322,
        "kids": [
          48328383
        ],
        "text": "The small models are getting really impressive.<p>I recently realized that Qwen3.5:4B is way more capable than I thought a model that size could be.<p>Combine that with the work Liquid puts into RL and fine tuning, and you get models that perform extremely well on minimal hardware.<p>Combine that with your own fine tuning, and you get a specialized tool that is fast, private, and doesn’t require internet connection.",
        "time": 1780084352,
        "type": "comment",
        "parent": 48325306
      },
      {
        "by": "kilroy123",
        "id": 48329846,
        "text": "Hmm, I asked it who made it, and it says Google?",
        "time": 1780091823,
        "type": "comment",
        "parent": 48325306
      },
      {
        "by": "irthomasthomas",
        "id": 48329703,
        "text": "Woah, chinchilla scaling is 20 x active_params. I think mistral was 2 x Chinchilla. This is 1800 x",
        "time": 1780090938,
        "type": "comment",
        "parent": 48325306
      },
      {
        "by": "SubiculumCode",
        "id": 48328511,
        "text": "Anybody use their localcowork [1] before? \nThat is where the demo lives. Or not?<p>[1] <a href=\"https:&#x2F;&#x2F;github.com&#x2F;Liquid4All&#x2F;cookbook&#x2F;tree&#x2F;main&#x2F;examples&#x2F;localcowork\" rel=\"nofollow\">https:&#x2F;&#x2F;github.com&#x2F;Liquid4All&#x2F;cookbook&#x2F;tree&#x2F;main&#x2F;examples&#x2F;lo...</a>",
        "time": 1780085260,
        "type": "comment",
        "parent": 48325306
      },
      {
        "by": "Ifkaluva",
        "id": 48329676,
        "kids": [
          48329781
        ],
        "text": "Liquid does amazing work, but I kinda feel like they are overtraining their models. 38T tokens seems like a lot for an 8B model",
        "time": 1780090811,
        "type": "comment",
        "parent": 48325306
      }
    ]
  },
  "selection_meta": {
    "discussion_depth": "top_comments_v1",
    "external_article": {
      "status": "ok",
      "final_url": "https://www.liquid.ai/blog/lfm2-5-8b-a1b",
      "status_code": 200,
      "content_type": "text/html; charset=utf-8",
      "failure_reason": null
    },
    "snapshot_version": "hn_story_v3",
    "selected_comments_count": 5,
    "external_article_resolved": true,
    "text_normalization_applied": false
  },
  "created_at": "2026-05-29T22:01:21.179Z",
  "updated_at": "2026-05-29T22:01:21.179Z"
}