2026-02-13

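LlamaIndex and PostHog launch LLM analytics observability integration

LlamaIndex has partnered with PostHog on an LLM analytics integration that automatically tracks OpenAI token usage, cost, and latency, helping developers monitor the performance of agent workflows.

Product launch
@llama_index Read →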

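OpenAI: GPT-5.2 overturns a decades-old particle physics assumption, finds nonzero gluon amplitude

The paper shows that a specific gluon interaction (the tree-level single-minus helicity amplitude), long believed to have zero amplitude, is in fact nonzero when the particles' motion satisfies a particular alignment condition. The finding corrects an assumption that physicists have held for decades.

Research
@OpenAI Read →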

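OpenAI: GPT-5.2 achieves a breakthrough in particle physics, simplifying gluon scattering formulas

OpenAI and collaborating physicists have published a preprint in which GPT-5.2 simplified complex expressions for gluon interactions and conjectured a general formula valid for any number of gluons. A separate internal OpenAI model independently derived the same formula after about 12 hours of reasoning.

Research
@OpenAI Read →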

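n8n: new quick-start tutorial shows how to build a Q&A AI agent with a knowledge base

n8n has released an in-depth technical tutorial in which core team member Max demonstrates how to build a question-answering AI agent with a knowledge base. It covers key fundamentals that most tutorials skip, such as data items and loops.

Product launch
@n8n_io Read →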

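swyx: the application layer has a structural advantage over the model layer, and agent developers can outrun the big labs

swyx argues that agent labs and open-source developers have two advantages over the big model labs: they can pick the best of all available models (argmax), and they are free to explore capability limits without safety-review constraints.

Opinion
@swyx Read →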

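OpenAI: GPT-5.2 finds a new formula in particle physics, overturning a decades-old theoretical assumption

OpenAI and collaborating physicists have published a preprint in which GPT-5.2 simplified complex expressions for gluon interactions and derived a general formula, overturning the long-held assumption in particle physics that single-minus amplitudes are zero. A separate internal model independently reached the same conclusion.

LLMs
@OpenAI Read →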

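Browserbase + Cerebras: open-source template for ultra-fast browser agents

Browserbase and Cerebras have released a browser agent template that uses the latest open-source models to launch browsers in bulk, crawl documentation, and check it for consistency with the codebase, all within minutes.

Product launch
@browserbase Read →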

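Latent Space: in-depth Jeff Dean interview on Gemini Ultra and key numbers for AI engineering

swyx has published an in-depth Latent Space podcast interview with Jeff Dean, Google's Chief Scientist, covering Gemini Deep Think, what became of Gemini Ultra, and the "numbers every AI engineer should know."

Opinion
@swyx Read →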

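Replit: user feedback becomes features in one click, with Agent implementing requests automatically

Replit has launched a feedback widget: when users submit suggestions inside a published app, Agent can automatically turn them into shipped features, closing the loop from request to release.

Product launch
@Replit Read →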

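Forge: scalable agent reinforcement learning framework released

Forge, a new open-source project, provides a scalable reinforcement learning framework and algorithms for agents, offering training infrastructure for building AI agents.

Research
@_akhaliq Read →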

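MiniMax-M2.5 model lands on Hugging Face

The MiniMax-M2.5 model weights have been published on Hugging Face and an API service is also available, so developers can download and use the model directly.

LLMs
@_akhaliq Read →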

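LMSys: MiniMax-M2.5 released with SOTA coding and agent capabilities

MiniMax has released the M2.5 model, which reaches SOTA in coding, agentic tool use, and office scenarios. Trained with large-scale RL in real-world environments, the model offers architecture-level programming ability and efficient search and reasoning. SGLang provides day-0 support.

LLMs
@lmsysorg Read →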

AMA with MiniMax — Ask Us Anything!

https://www.reddit.com/r/LocalLLaMA/comments/1r3t775/ama_with_minimax_ask_us_anything/

Research
Reddit r/LocalLLaMA Read →

[D] Teaching AI to Reason With Just 13 Parameters

Made with Paperglide (https://paperglide.net/) ✨ - digest research papers faster. TL;DR: ...

Product launch
Reddit r/MachineLearning Read →

[D] How do you control video resolution and fps for a R(2+1)D model?

So I am using a R(2+1)D with kinetics 400 weights to train a classifier on two sets of videos. The problem is that one of the two classes has all videos of the same resolution and fps, forcing the model to learn those features instead of...

LLMs
Reddit r/MachineLearning Read →

[D] Has anyone received their ICML papers to review yet?

I thought the reviewing period should have started yesterday, but it still says "You have no assigned papers. Please check again after the paper assignment process is complete."

Research
Reddit r/MachineLearning Read →

[P] SoproTTS v1.5: A 135M zero-shot voice cloning TTS model trained for ~$100 on 1 GPU, running...

I released a new version of my side project: SoproTTS. A 135M parameter TTS model trained for ~$100 on 1 GPU, running ~20× real-time on a base MacBook M3 CPU. v1.5 highlights (on CPU): ...

Product launch
Reddit r/MachineLearning Read →

[R] Has anyone experimented with MHC on traditional autoencoders/convolutional architectures?

I'm currently making a baseline autoencoder for this super freaking huge hyperspectral image dataset I have. It's a really big pain to work with and to get decent results, and I had to basically pull all stops including...

LLMs
Reddit r/MachineLearning Read →

[D] Benchmarking Deep RL Stability Capable of Running on Edge Devices

This post details my exploration for a "stable stack" for streaming deep RL (ObGD, SparseInit, LayerNorm, and online normalization) using 433,000 observations of real, non-stationary SSH attack traffic. ...

Research
Reddit r/MachineLearning Read →

[R] Higher effort settings reduce deep research accuracy for GPT-5 and Gemini Flash 3

We evaluated 22 model configurations across different effort/thinking levels on Deep Research Bench (169 web research tasks, human-verified answers). For two of the most capable models, higher effort settings scored worse. ...

Research
Reddit r/MachineLearning Read →

[D] ICML: every paper in my review batch contains prompt-injection text embedded in the PDF

I’m reviewing for ICML (Policy A, where LLM use is not allowed) and noticed that in my assigned batch, if you copy/paste the full PDF text into a text editor, every single paper contains prompt-injection style instructions embedded...

Research
Reddit r/MachineLearning Read →

Show HN: Skill that lets Claude Code/Codex spin up VMs and GPUs

I've been working on CloudRouter, a skill + CLI that gives coding agents like Claude Code and Codex the ability to start cloud VMs and GPUs. When an agent writes code, it usually needs to start a dev server, run tests, open a browser to verify its work. Today that all happens on your local...

Product launch
Hacker News Read →

I'm not worried about AI job loss

Industry
Hacker News Read →

The "AI agent hit piece" situation clarifies how dumb we are acting

Previously: "An AI agent published a hit piece on me" - https://news.ycombinator.com/item?id=46990729 - Feb 2026 (916 comments). "AI agent opens a PR write a blogpost to shames the maintainer who...

Industry
Hacker News Read →