The essence of AI "slacking off" is attention drift caused by LLM output length limits. Wide Research addresses this by having multiple lightweight models process subtasks in parallel while a primary LLM aggregates the results; this post shares the design approach for building that capability into Codex.
How to Stop AI from Slacking Off: Building Systematic Wide Research Capabilities for Codex
Why AI slacks off on large tasks: LLM output length limits cause attention drift. Wide Research solves this by fanning subtasks out to lightweight models in parallel, then aggregating the results with a primary LLM.
Why the OpenAI Apps SDK's Support for MCP Is Actually a Crisis for MCP
Analyzing how the OpenAI Apps SDK's practice of bypassing the context window via the _meta field violates MCP's design philosophy, and the looming risk of the protocol fragmenting into dialects.
Why OpenAI's Apps SDK Signals a Crisis for MCP
Analyzing how OpenAI's Apps SDK extension via the _meta field violates MCP's design philosophy, creating dialects that may fragment the standard the way SQL and CSS dialects did.
Kimi K2: A Deep Evaluation Beyond the Chat Box
Evaluating Kimi K2's agentic capabilities on real programming and research tasks: its execution resilience is excellent, making it well suited as an information-gathering frontend, but tool-call stability and ecosystem compatibility still have room to improve.