星火 SparkCN

I'm writing a Python script that calls OpenAI through its API. However, I'm getting this error:
    openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details
My script is the following:
    #!/usr/bin/env python3.8
    # -*- coding: utf-8 -*-
    import openai
    openai.api_key = "<My API Key>"
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": "Tell the world about the ChatGPT API in the style of a pirate."}
        ]
    )
    print(completion.choices[0].message.content)
I'm declaring python3.8 in the shebang because I'm using pyenv. I think it should work, since I have made 0 API requests so far, so I'm assuming there's an error in my code.
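The traceback above reports a quota problem under the same exception class that is used for ordinary throttling, so a script may want to tell the two apart before retrying. The following is a minimal sketch of that idea, not the OpenAI SDK's own behavior: `RateLimitError` here is a stand-in class so the example runs without the package, and `is_quota_exhausted` and `call_with_retry` are hypothetical helper names. It assumes quota errors can be recognised by the wording of the message quoted in the question.

```python
import time


class RateLimitError(Exception):
    """Stand-in for openai.error.RateLimitError so the sketch runs without the SDK."""


def is_quota_exhausted(err: Exception) -> bool:
    # Assumption: quota problems carry the wording quoted in the question.
    return "exceeded your current quota" in str(err).lower()


def call_with_retry(request, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run request(); back off on throttling, give up at once on quota errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimitError as err:
            if is_quota_exhausted(err) or attempt == max_attempts:
                raise  # retrying cannot fix a billing problem
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
```

With the real library, `request` would be a small function wrapping the `openai.ChatCompletion.create(...)` call from the script. A quota error that surfaces on the very first request suggests checking the plan and billing settings rather than the code.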

python prompt openai-api

5 answers · Original

DEV.TO NO.01

How Are Developers Actually Using AI At Work?

JSNation is coming soon, and besides my talk (I’ll drop the link in the comments so I don’t spam you...

150 reactions · Original

145 points

STACK OVERFLOW NO.05

Differences between Langchain & LlamaIndex

I'm currently working on developing a chatbot powered by a Large Language Model (LLM), and I want it to provide responses based on my own documents. I understand that using a fine-tuned model on my documents might not yield direct responses, so I'm exploring the concept of Retrieval-Augmented Generation (RAG) to enhance its performance. In my research, I've come across two tools, Langchain and LlamaIndex, that seem to facilitate RAG. However, I'm struggling to understand the main differences between them. I've noticed that some tutorials and resources use both tools simultaneously, and I'm curious about why one might choose to use one over the other or when it makes sense to use them together. Could someone please provide insights into the key distinctions between Langchain and LlamaIndex for RAG, and when it is beneficial to use one tool over the other or combine them in chatbot development?

chatbot langchain large-language-model

3 answers · Original

86 points

HACKER NEWS NO.10

The Speed of Prototyping in the Age of AI

50 discussions · Original

DEV.TO NO.03

An LLM API call, in 4 GIFs

This is the first post of the series Building TinyAgent, where we are going to build a small agent from...

80 reactions · Original

DEV.TO NO.05

You Don’t Need Microservices (Yet): A Reality Check for Devs

How Premature...

41 reactions · Original

DEV.TO NO.02

I Spent 10x Longer Debugging AI Code Than Writing It

AI wrote the code in 30 seconds. Three lines. A simple function. I prompted, it generated, I copied. It...

40 reactions · Original

DEV.TO NO.06

Reviving a 12K+ Star Abandoned Library: toastr-next v3 🍞

This is a submission for the GitHub Finish-Up-A-Thon Challenge. What I...

37 reactions · Original

37 points

HACKER NEWS NO.08

Show HN: Streambed – Stream Postgres to Iceberg on S3, Supports Postgres Wire

1 discussion · Original

DEV.TO NO.08

Minimal Code Doesn’t Mean Stable Code

The argument sounds reasonable: fewer lines of code mean fewer bugs. Simpler to review, easier to...

32 reactions · Original

DEV.TO NO.09

Your AI Agent Should Text You First

A practical, slightly opinionated Hermes Agent use case: an always-on chief of staff that remembers your work, schedules itself, uses tools, and reports back with receipts.

31 reactions · Original

DEV.TO NO.07

AI Agents Are Great at 80% of Our Code. The Other 20% Is Why We Still Need Seniors.

We let AI agents loose on a payment platform. They crushed the boring stuff. Then they silently broke...

24 reactions · Original

DEV.TO NO.04

Toward a Standard Model for Agent Memory

Most agent memory systems are digital attics. You put things in. You hope to find them later. You...

22 reactions · Original

DEV.TO NO.10

I Made My AI Models Argue, Then Let Hermes Be the Judge

A $0 multi-model decision agent: three LLMs debate, Hermes judges, and it learns who to trust.

20 reactions · Original

19 points

HACKER NEWS NO.04

The Four Programming Questions from My 1994 Microsoft Internship Interview (2023)

4 discussions · Original

14 points

HACKER NEWS NO.06

ChatGPT for Google Sheets Exfiltrates Workbooks

0 discussions · Original

6 points

STACK OVERFLOW NO.03

How to use AI as programmer

How can I efficiently use AIs like Claude and ChatGPT to improve my programming abilities without losing critical thinking? I feel that we are so hyped up about AI's ability to give a good response that we get carried away and stop thinking critically.

artificial-intelligence openai-api claude

6 answers · Original

4 points

STACK OVERFLOW NO.07

Grouping text rows in equal-sized groups

I have a table containing a lot of typed/named items and I want to create a (materialized?) view to group them together so that a UI can later make it easier for users to select.
Description
The goal is: if there are N elements of the same type, make sqrt(N) groups, each of size sqrt(N). Instead of having e.g. 10k elements to scroll through, the UI would show 100 lines (the groups), each of which can be expanded into 100 lines (the items inside each group): 1 more click to make, but a lot less scrolling.
The UI will display group names as the range containing the named elements inside it. For instance, if a group contains all the items whose names start with A and B (but not C and beyond), then the group should be named A-B. If a group contains all the elements starting with C until e.g. EG, but elements starting with EH are in the next group, the group should be named C-EG.
The idea is: instead of comparing the texts and deciding where the lists should be split into groups, I am using a function to score the items. Why I do that will make sense in a minute. Anyway, a heuristic turns that whole mess into a relatively common problem to solve, namely searching for the local minima of a function.
Here is the twist (that explains why I am building a function): I could make it so that each group has exactly the same size (ignoring the rounding of sqrt(N) to the nearest integer) but instead, I want to find a compromise between group size and length of the group name. As groups are named with the pattern <start of group> - <end of group>, I ideally want <start of group> and <end of group> to be of length 3 at most.
Since I am fine with counts being "reasonably" different from sqrt(N), I am also fine with them being slightly out of date. My idea therefore is to feed the result of this query into a materialized view that is refreshed e.g. every night.
The function works this way:
1. For each item, count how many first letters it has in common with the item immediately before it.
2. Take a SIN squared function, with parameters so that it oscillates sqrt(N) times in the range [1-N].
3. Insert parametrization into the 2 steps above (add constants, multiply by constant, raise to some power ... whatever seems to give good results once I figure the problem out).
4. Multiply the 2 together to get a "score".
5. Look for a minimum score; it will act as the end of the group.
If that helps, here is what it looks like for 100 items (names generated randomly with the queries below), with SIN in blue and the score in black.
If I want, e.g., shorter group names at the cost of consistent group sizes, I can tweak the score so that the SIN is not just squared anymore but instead raised to a higher power: that will flatten the troughs of the sine wave and sharpen the crests. By doing so, more items will be placed inside the troughs, therefore I will get more chances that a shorter but off-target boundary will be preferred over a longer boundary that lands on sqrt(N).
Here is a zoomed-in version of the 2 charts. Here, the heuristic determines the list should be split around rtetx. The group right before this split will be named [...] - RC and the group after this split will be named RT - [...].
Raising the SIN function to a higher power (note: the software I used to draw the chart interpolates the points into a smooth curve; the curve is only correct on the points, not between them) increases the tolerance, like I said. Now, the group can equally be split around rcEYu or slkkE. The group names will then be either:
[...] - Q and R - [...]
or
[...] - R and S - [...]
Now, 1 letter is required for the common boundary of both groups whereas 2 letters were required before.
Queries
I tried the following query.
Building the data sample with random data:
    CREATE TABLE Test (
        ItemType INTEGER,
        ItemName VARCHAR
    );
    WITH Chars(chars) AS (
        SELECT ARRAY[
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'
        ]
    )
    INSERT INTO Test
    SELECT t, string_agg(chars[random(0, 165)], '')
    FROM chars
    CROSS JOIN generate_series(1, 3) t
    CROSS JOIN generate_series(1, 5) l
    CROSS JOIN generate_series(1, 400) i
    GROUP BY t, i;
Finding the text to use in group names (the query takes < 5 sec, which suggests it needs a materialized view):
    WITH Items(ItemType, ItemName) AS (
        SELECT ItemType, TRIM(BOTH FROM upper(ItemName))
        FROM Test
    ),
    stats1(ItemType, NameStarts) AS (
        SELECT DISTINCT Items.ItemType,
               TRIM(BOTH FROM upper(left(ItemName, generate_series(1, 3)))) AS btrim
        FROM Items
    ),
    stats2(ItemType, NameStarts, NameCount) AS (
        SELECT s1.ItemType, s1.NameStarts,
               count(*) FILTER (WHERE left(i.ItemName, length(s1.NameStarts)) <= s1.NameStarts) AS count
        FROM stats1 s1
        JOIN items i ON s1.ItemType = i.ItemType
        GROUP BY s1.ItemType, s1.NameStarts
    ),
    stats3(ItemType, NameStarts, NameCount, totalnamecount) AS (
        SELECT stats2.ItemType, min(stats2.NameStarts) AS min, stats2.NameCount,
               max(stats2.NameCount) OVER (PARTITION BY stats2.ItemType) AS max
        FROM stats2
        GROUP BY stats2.ItemType, stats2.NameCount
    ),
    stats4(ItemType, NameStarts, NameCount, score, groupnumber) AS (
        SELECT stats3.ItemType, stats3.NameStarts, stats3.NameCount,
               length(stats3.NameStarts) * (.05 + .95 * power(sin(pi() * stats3.NameCount / sqrt(stats3.totalnamecount)::integer), 2)),
               ((stats3.NameCount - sqrt(stats3.totalnamecount) / 2) / sqrt(stats3.totalnamecount))::integer AS int4
        FROM stats3
        WHERE stats3.NameCount > 4
          AND (stats3.totalnamecount - stats3.NameCount) > 4
          AND stats3.totalnamecount > 50
    )
    SELECT s1.ItemType, s1.NameStarts, s1.NameCount
    FROM stats4 s1
    JOIN (
        SELECT stats4.ItemType, min(stats4.score) AS bestscore, stats4.groupnumber
        FROM stats4
        GROUP BY stats4.ItemType, stats4.groupnumber
    ) s2 ON s1.ItemType = s2.ItemType AND s1.groupnumber = s2.groupnumber
    WHERE s1.score = s2.bestscore
Problem
In the above query, I have added the NameCount column to track how many items belong to 1 group. As the sample data consists of 3 groups of 400 items each, I expect 3 types, each with 20 groups, each group having around 20 items (thus the 3rd column should show 0, 20, 40, 60, ...). However, while I get the correct rows, I also get additional groups with 1-2 items in them.
For instance, I will see:
For the first group: NameCount = 20
For the second group: NameCount = 39
For the third group: NameCount = 41
When this happens, it means the function calculates 2 group boundaries with equal value; one of them needs to go. I can't seem to find an easy way to remove those extra records. Has anyone got an idea?
Edit: To clarify what I am after (some of which will be a repetition, bear with me):
An application displays the result of the query in its UI. It can (and probably will) process the output of the query.
Ultimately, my goal is to decrease the amount of scrolling required to view any item in the list by grouping them into a 2-level hierarchy.
For N items, there should be sqrt(N) groups, each with around sqrt(N) items. The number of items in each group depends on the heuristic being strict (aims at sqrt(N)) or tolerant (aims at a compromise between sqrt(N) and the length of the group name).
sqrt(N) is not guaranteed to be an integer. For 10000, I expect exactly 100 groups of 100 items. For 10001, I can accept having 100 or 101 groups. If you choose to create 101 groups, I do not mind items being spread across all those groups. The approach in my query (at least in 1 version before I try to fix its issues) was to ignore the issue completely: 100 items per group in the first 100 groups, 1 item in the final group (it is only fair that the final group is smaller, as it appears last and requires more scrolling to reach than any of the other groups).
There are 3 reasons why I currently save data in a materialized view:
1. It is faster for the application.
2. New data is not inserted often enough in the table for it to matter (think about 10 records inserted in a 10k table, which will often not change the groups).
3. Users visiting the same list several times in a day will be guaranteed to get the same grouping. It is only after they go to sleep that the grouping will be different in the UI.
Because of the delay between data insertion and grouping update, the query does not need to (and probably should not) return both the lower and upper bound of each group. For example, if 2 consecutive groups are A-B and E-F , the application will show contiguous segments, such as A-D and E-F . If it did not, then items starting with C would not be findable.
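The scoring heuristic and the duplicate-boundary problem described above can be illustrated outside SQL. The following is a minimal Python sketch, not a translation of the query: the function names, the `+ 1` added to the shared-prefix count (so that a boundary between unrelated names does not score zero), the choice to centre each window on a trough of the wave, and the tie-break rule are all assumptions made for illustration. Ties are resolved by comparing `(score, distance to the trough, position)`, so each window yields exactly one boundary.

```python
import math


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_boundaries(names, floor=0.05, power=2):
    """Return 1-based positions after which a group ends, one per window."""
    items = sorted(name.strip().upper() for name in names)
    total = len(items)
    size = max(1, round(math.sqrt(total)))   # target group size, about sqrt(N)
    best = {}                                # window -> (score, distance, position)
    for pos in range(1, total):              # candidate split between item pos and pos + 1
        # Letters needed to tell the two neighbours apart (assumed "+ 1" constant).
        label_len = common_prefix_len(items[pos - 1], items[pos]) + 1
        wave = floor + (1 - floor) * abs(math.sin(math.pi * pos / size)) ** power
        window = round(pos / size)           # window centred on the trough at window * size
        if window == 0 or window * size >= total:
            continue                         # too close to either end to be a useful split
        key = (label_len * wave, abs(pos - window * size), pos)
        if window not in best or key < best[window]:
            best[window] = key
    return [best[w][2] for w in sorted(best)]
```

In SQL, the same "keep one row per group" step is usually expressed with a ranking such as `row_number()` over each group ordered by score and a tie-breaker, instead of joining on the minimum score, which keeps every row that ties for it.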

sql postgresql

1 answer · Original

1 point

STACK OVERFLOW NO.09

Why does my query insist on a full table scan?

I have a PostgreSQL table partitioned by a datetime field called journeyTime . If I have a query where the selection criteria is... WHERE date_trunc('day', journeyTime) = '01-May-2025' I would expect the explain plan to show that the query optimiser has immediately zoomed in on the single partition that all the records for that date are in. It doesn't. Instead it does a full table scan on all partitions in the table. Why?
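One likely explanation, offered as a sketch and not a confirmed diagnosis of this table: partition pruning compares the partition key itself against the constants in the query, and wrapping the key in `date_trunc()` hides it from that comparison. A commonly suggested rewrite is a half-open range on the raw column. The example below builds such a range in Python; the table name `journeys` and the `%(name)s` placeholder style (as used by psycopg) are assumptions.

```python
from datetime import date, datetime, timedelta


def day_range(day: date):
    """Return the half-open interval [start, end) covering one calendar day."""
    start = datetime(day.year, day.month, day.day)
    return start, start + timedelta(days=1)


# "journeys" is a hypothetical table name; journeyTime is the partition key
# from the question. The key is compared directly, with no function around it.
QUERY = (
    "SELECT * FROM journeys "
    "WHERE journeyTime >= %(start)s AND journeyTime < %(end)s"
)

start, end = day_range(date(2025, 5, 1))
params = {"start": start, "end": end}
```

Running `EXPLAIN` on the rewritten query is the way to check whether only the expected partition is scanned.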

postgresql query-optimization

7 answers · Original

0 points

STACK OVERFLOW NO.10

How to "sync" Prisma with sessions?

I have Node.js code that authenticates with cookies and sessions:
    import express from "express";
    import session from "express-session";
    import postgreSession from "connect-pg-simple";
    import { Pool } from "pg";
    const app = express();
    const PostgreSession = postgreSession(session);
    const pool = new Pool({ connectionString: process.env.DATABASE_URL });
    app.use(
      session({
        store: new PostgreSession({ pool: pool, createTableIfMissing: true }),
        secret: process.env.SESSION_SECRET,
        resave: false,
        saveUninitialized: false,
        cookie: { httpOnly: true, secure: false, maxAge: 60 * 60 * 1000 },
      }),
    );
and a Prisma schema. Do I really have to create a new Session model to keep the database and the schema in sync? Is there a better approach, something where Prisma (or maybe some other framework) handles everything related to keeping sessions in sync between the database and the schema by itself? Some say to manually create a model in Prisma:
    model Session {
      sid    String   @id
      sess   Json
      expire DateTime
      @@map("session")
    }
I don't know why, it just looks a bit ugly.

javascript node.js session

2 answers · Original

0 points

STACK OVERFLOW NO.06

Does inputs_embeds bypass the positional encoding step of the model?

Using the Hugging Face Transformers library, I want to feed logits (i.e., a vector such that, if the softmax is taken, it gives the probability of each token) as the input to a model. Currently, I convert the logits to an embedding with the following code:
    token_probabilities = F.softmax(logits, dim=-1)
    embeddings = token_probabilities @ model.embed_tokens.weight
    out = model(inputs_embeds=embeddings, attention_mask=attention_mask, labels=labels)
However, I am concerned that this throws away the positional encoding that the model needs. Does inputs_embeds bypass the part of the model where positional data is attached (for example, when the RoPE is added)? Ideally, this would be answered with the general interface that Hugging Face models use, but I specifically care about LlamaForCausalLM if there is no general answer.
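To make the conversion in the question concrete, here is a toy pure-Python version of `softmax(logits) @ embedding_matrix` for a single position. It is a sketch with hypothetical names (`softmax`, `soft_embedding`) and a made-up three-token table, not Hugging Face code. It shows that the result is a probability-weighted blend of token vectors and carries no position information by itself. Whether the model then adds position information depends on the architecture; in RoPE models the rotation is, as far as I know, applied to queries and keys inside attention and not to the input embeddings, but that is worth checking against the model source.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def soft_embedding(logits, embedding_table):
    """Probability-weighted average of the rows of embedding_table."""
    probs = softmax(logits)
    width = len(embedding_table[0])
    return [
        sum(p * row[d] for p, row in zip(probs, embedding_table))
        for d in range(width)
    ]


# Made-up table: 3 tokens, 2 embedding dimensions.
table = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
blended = soft_embedding([0.0, 0.0, 0.0], table)   # uniform probabilities
peaked = soft_embedding([100.0, 0.0, 0.0], table)  # almost all mass on token 0
```

With uniform logits the output is the plain average of the rows; with one dominant logit it approaches that token's own embedding.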

huggingface-transformers large-language-model

1 answer · Original

NEWSLETTER NO.01

The solution might be cancelling my AI subscription

I find this post by David Wilson very relatable. David lists 16+ projects he's spun up with AI tooling, and concludes: I didn't mean to build most of these things. Usually the Claude session started with something like "write a quick script for X", and one hour later the result is not a quick script for X, nor in the usual case is my problem solved, whatever the original itch happened to be. On that last point, this technology is horrific for attention. It's a thermonuclear ADHD amplifier and I have seen the same effect in every single one of my adult friends. Folk running 3 screens simultaneously working on totally unrelated "projects" they have little hope of maintaining, and such little commitment to the outcome that the time is obv

2026-06-01 · Read one · Original

NEWSLETTER NO.02

A rational conversation on where AI is actually going | Benedict Evans

Benedict Evans on why AI is a 1997 internet moment, what everyone gets wrong about job loss, and where the real value may accrue

2026-06-01 · Read one · Original

NEWSLETTER NO.03

Quoting Karen Kwok for Reuters Breakingviews

Anthropic defines “run-rate revenue” in two parts. Use the last 28 days of sales from customers charged on a consumption basis and multiply it by 13. Then, multiply the monthly subscription take by 12, and add the two together. - Karen Kwok for Reuters Breakingviews, citing "a person familiar with the matter". Tags: anthropic, ai
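The two-part definition quoted above is simple arithmetic, sketched here with a hypothetical function name and made-up figures. The multiplier 13 annualises a 28-day window (13 × 28 = 364 days), and 12 annualises a monthly figure.

```python
def run_rate_revenue(consumption_last_28_days: float, monthly_subscriptions: float) -> float:
    """Annualised run rate as described in the quote: 28-day sales x 13 plus monthly take x 12."""
    return consumption_last_28_days * 13 + monthly_subscriptions * 12


# Made-up inputs, in arbitrary currency units.
example = run_rate_revenue(100, 50)  # 100 * 13 + 50 * 12 = 1900
```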

2026-06-01 · Read one · Original

NEWSLETTER NO.04

How we contain Claude across products

A complaint I often have about sandboxing products is that they are rarely thoroughly documented, and in the absence of detailed documentation it's hard to know how much I can trust them. Anthropic just published a fantastic overview of how their various sandbox techniques work across Claude.ai, Claude Code, and Cowork. We constrain where and how an agent can act with process sandboxes, VMs, filesystem boundaries, and egress controls. The goal is to set a hard boundary on what an agent can reach. For example, if credentials never enter the sandbox, they can't be exfiltrated, regardless of whether the cause is a user, a model finding a “creative” path, or an attacker. Claude.ai uses gVisor. Claude Code, run locally, uses Seatbelt on macOS and Bubblewr

2026-06-01 · Read one · Original

NEWSLETTER NO.05

Running Python ASGI apps in the browser via Pyodide + a service worker

Research: Running Python ASGI apps in the browser via Pyodide + a service worker Datasette Lite is my version of Datasette that runs entirely in the browser using Pyodide in WebAssembly. When I first built it four years ago I used Web Workers and code that intercepts navigation operations and fetches the generated HTML by running the Python app. This worked, but had the disadvantage that any JavaScript in <script> tags would not be executed - breaking some Datasette functionality and a whole lot of Datasette plugins. This morning I set Claude Opus 4.8 the task (in Claude Code for web) of figuring out how to run Python ASGI apps in Pyodide using Service Workers instead, and it seems to work! Here's a basic ASGI FastCGI demo and here's a demo that runs Datasette 1.0a31 . I'm still getting my
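The approach described above depends on calling a Python ASGI application in-process and turning what it sends back into a response, with no network server involved. The following is a minimal sketch of that idea in plain Python; it is not Datasette Lite's code, and `app` and `fetch` are hypothetical names. In the browser setup from the post, something like `fetch` would be driven from a service worker through Pyodide.

```python
import asyncio


async def app(scope, receive, send):
    """Tiny ASGI application: answers every HTTP request with its path."""
    assert scope["type"] == "http"
    body = f"<h1>Hello from {scope['path']}</h1>".encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/html; charset=utf-8")],
    })
    await send({"type": "http.response.body", "body": body})


async def fetch(asgi_app, path):
    """Call an ASGI app directly and return (status, body) without any server."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "scheme": "http",
        "path": path,
        "raw_path": path.encode(),
        "query_string": b"",
        "headers": [],
    }
    await asgi_app(scope, receive, send)
    status = next(m["status"] for m in sent if m["type"] == "http.response.start")
    body = b"".join(m.get("body", b"") for m in sent if m["type"] == "http.response.body")
    return status, body


status, body = asyncio.run(fetch(app, "/demo"))
```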

2026-06-01 · Read one · Original

NEWSLETTER NO.06

I Am Retiring from Tech to Live Offline

I've seen a lot of posts on forums from people threatening to quit their careers over AI. This is not one of those: Chad Whitacre is taking concrete steps, starting with this typewritten, scanned letter: I'm retiring from tech. Well, "retiring" is euphemistic. I'm stepping away from tech, and that includes Open Source. [...] AI was the last straw. Have you heard of that island off India where the indigenous population kills any outsiders fool-hardy enough to land? They are doing the rest of us a favor by preserving a way of life we may need again someday, or at the very least should not want to see completely extinguished. A reminder. Never forget your roots. Here in Pennsylvania we have the Amish performing a similar function. Significantly less host

2026-06-01 · Read one · Original

NEWSLETTER NO.07

🧠 Community Wisdom: Catching people using AI during an interview, org design when everything lives in one person’s head, when to rename your product, from nurse to health-tech PM, and more

Community Wisdom 187

2026-06-01 · Read one · Original

NEWSLETTER NO.08

[AINews] Founders and Forward Deployed Engineers

a quiet day lets us highlight the new AIE WF focuses

2026-06-01 · Read one · Original

NEWSLETTER NO.09

2026.22: Luceing Their Mind

The best Stratechery content from the week of May 25, 2026, including why everyone hates Luce, how to monetize AI answers, and social mobility in China.

2026-06-01 · Read one · Original

NEWSLETTER NO.10

[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode

Total Anthropic victory!

2026-06-01 · Read one · Original

0 points

STACK OVERFLOW NO.01

How should I structure autonomous AI agent workflows for production reliability in a TypeScript/Next.js fintech platform?

I’m building an AI-driven workflow platform using TypeScript, Next.js, Node.js, and GitHub-integrated deployment pipelines. The system coordinates multiple autonomous agents that handle orchestration, API actions, validation layers, and async task execution.
Current architecture includes: a Next.js frontend; Node.js backend services; GitHub-connected CI/CD; webhook/event-driven workflows; AI agent task routing; API validation + retry logic; fintech-oriented security requirements.
I’m trying to determine best practices for: preventing cascading failures between autonomous agents; structuring agent-to-agent communication; managing retries/idempotency for webhook events; logging and observability across distributed workflows; safely deploying iterative AI workflow updates to production.
For developers who have worked on production AI orchestration systems: What architectural patterns worked best? Did you use queues/event buses/service meshes? How did you handle state management and rollback strategies? I would appreciate examples, frameworks, or lessons learned from scaling similar systems.

node.js typescript next.js

0 answers · Original

0 points

STACK OVERFLOW NO.04

What's the best "FreeTier" resource, Generative-ai, text-prompt/image-upload_to_video fullstack llmm application that's pipelined through HuggingFace?

As an academic in the field of computer science with advanced specialization in artificial intelligence and prompt engineering, I am currently exploring the most effective open-source generative AI frameworks suitable for constructing a comprehensive full-stack large language model (LLM) application. The intended system must be capable of accepting both textual and visual inputs and autonomously generating coherent video outputs. Moreover, this architecture should seamlessly integrate with the Hugging Face ecosystem to ensure streamlined model deployment, inference, and dataset management. An area of particular interest is the development of a continuous optimization workflow that dynamically identifies, retrieves, and deploys the most current and performant model iterations based on specified input parameters. This would effectively ensure uninterrupted system operation and adaptive scalability. Additionally, I aim to investigate the integration of supplementary digital hardware interfaces to enhance computational efficiency, enabling the system to support more complex generative tasks in real time.

react-native computer-vision artificial-intelligence

0 answers · Original

~ /trending

REPOSITORY NO.07

affaan-m/ECC

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

199k · 2026-05-31

git · Trending today · Original

REPOSITORY NO.01

microsoft/markitdown

Python tool for converting files and office documents to Markdown.

132k · 2026-05-31

git · Trending today · Original

REPOSITORY NO.03

anthropics/claude-code

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows - all through natural language commands.

128k · 2026-05-31

git · Trending today · Original

REPOSITORY NO.02

harry0703/MoneyPrinterTurbo

Generate high-definition short videos with one click using AI large language models.

71.8k · 2026-05-31

git · Trending today · Original

REPOSITORY NO.10

Crosstalk-Solutions/project-nomad

Project N.O.M.A.D, is a self-contained, offline survival computer packed with critical tools, knowledge, and AI to keep you informed and empowered—anytime, anywhere.

27.3k · 2026-05-31

git · Trending today · Original

REPOSITORY NO.08

OpenBMB/VoxCPM

VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

22.7k · 2026-05-31

git · Trending today · Original

REPOSITORY NO.06

EveryInc/compound-engineering-plugin

Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more