
Paper Layer — 2026-05-17

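Paper-digest for 5/17: 32 candidates, 8 top picks (HF Daily dominates with 7 of 8; only LC-MAPF was hit by both arXiv and S2). The main thread is a trio of agents × multi-agent collaboration × evaluation: FutureSim (grounded simulation eval), FrontierSmith (synthesis of open-ended coding training data), and Beyond Individual Intelligence (a survey of multi-agent failure attribution) landed on the same day and together form an agent eval / training / debug loop, with LiSA supplying the matching deployment guardrail thread. On the vision side, Realiz3D and PanoWorld provide two independent signals, one for 3D and one for 360° panoramas. Nexus extends agentic forecasting to time series. The affiliations and similar_papers fields are all empty, so the institutional-attribution and further-reading sections are downgraded.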

Papers (from paper-digest top_picks)

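2605.15188 FutureSim: Replaying World Events to Evaluate Adaptive Agents. Replays real-world events to evaluate the predictions of adaptive agents. An independent third-party link in agent eval: it uses ‘replay’ as ground truth to assess an agent's predictive ability, moving agent benchmarks from static question banks to grounded simulation. Together with FrontierSmith (synthetic training data) and Beyond Individual Intelligence (a survey of multi-agent failure attribution), it anchors the same-day agent evaluation loop, against the backdrop of OpenAI ChatGPT Personal Finance and the Anthropic PwC enterprise agent GA scaling into production.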
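2605.14454 LiSA: Lifelong Safety Adaptation via Conservative Policy Induction. Conservative policy induction for lifelong safety adaptation of agents. With CAISI / Mythos / EU AI Office still within their in-effect window over the weekend, this is a sample of the medium- to long-term alignment / guardrail methodology that frontier labs need to backfill.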
2605.07637 Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding. Pretraining for local multi-agent communication improves coordination in pathfinding; the day's only top pick hit by both HF Daily and S2.
2605.13852 Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning. Domain-aware fine-tuning gives 3D generation both photorealism and controllability; a methodological reserve for the Google Gemini Omni / Meta Muse Spark multimodal thread.
2605.14445 FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale. Synthesizes open-ended coding problems at scale to train LLMs; aligned with tip-of-the-iceberg metrics such as SWE-Bench Verified / LiveCodeBench, it is the training-side supply for the agentic coding thread (Anthropic Claude Code, Qwen 3.6-35B-A3B, Cursor SDK, Grok Build, and others).
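2605.14389 Nexus: An Agentic Framework for Time Series Forecasting. An agentic framework that fuses time-series foundation models with textual context for forecasting. It is the methodological underpinning for financial time-series Q&A in OpenAI ChatGPT Personal Finance × Plaid, and it forms a research → application gradient with the PwC × Anthropic AI Native Finance thread.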
2605.14892 Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems. A survey of collaboration, failure attribution, and self-evolution in LLM multi-agent systems; a methodological anchor for root-cause analysis as enterprise agent deployments (PwC × Anthropic / Microsoft Agent 365 / OpenAI Deployment Co) move from pilot to production.
2605.13169 PanoWorld: Towards Spatial Supersensing in 360° Panorama World. Extends MLLM spatial perception to 360° panoramas; methodological supply for the Ray-Ban Meta Glasses / Vision Pro / robotics thread.

Technical signals (this week's open-source and tooling signals outside the paper-digest scope)

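OpenHuman tops GitHub Trending: github.com/tinyhumansai/openhuman, v0.53.35 on 5/13 and still at the top through 5/16. It is positioned as ‘private personal AI superintelligence’, licensed under GNU GPLv3, and built as a multi-agent desktop agent whose ‘agent reads you first’ approach inverts the traditional prompt pattern. Community / social coverage: TechTimes, AIToolly.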
Simon Willison, ‘LLM in shebang line’: 5/11, simonwillison.net/2026/May/11/llm-shebang/. Puts tool calls and the model directly into a script's shebang line, continuing the minimalist agent-harness paradigm; reference point is the LLM CLI 0.32a2 docs build of 5/17.
Mistral Vibe Remote Agents × Coding Stack: DevOps.com and StartupFortune push the agent harness toward a ‘cloud-resident + observability + human-in-the-loop approvals’ production paradigm; coverage kept spreading over 5/13-17.
