Paper Radar Daily | 2026-05-13

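One-line takeaway: across today's 134 candidates, three threads intertwine: world models, agents, and inference optimization. The Top 8 brings together a robotics world-model survey (co-authored by Abbeel and Malik), a critical look at world models in enterprise settings, and papers at HF trending #2 to #12 on multimodal reasoning, multi-agent reasoning, and a long-context KV-cache redesign. The strongest signal is that the world-model theme appears on the same day as both a theory survey (Top pick #1) and an enterprise-context counterexample (Top pick #5), which suggests the methodology has entered a second-order phase of questioning whether dynamics still need to be learned.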

Summary

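All three sources returned normally: arXiv listing, HF Daily, and Semantic Scholar, for 134 candidates in total. None hit the 14-day seen-pool, so all are fresh.
The Top 8 cluster around three hot keywords, agent, reasoning, and inference (each with on the order of 30 hits); watchlist hits on world model, quantization, preference optimization, and vla form the second tier.
Tracked authors hit 2 papers: Pieter Abbeel and Jitendra Malik co-author the robotics world-model survey (Top pick #1), and Tri Dao appears on the block floating-point quantization paper (ranked #9, just outside the Top 8).
Two of the HF trending top 3 made the Top picks (HF trending #2 UniPath and HF trending #3 Beyond Reasoning); the HF trending #1, ORBIT, fell to the watchlist tier with a ranking_score of 5.4.
S2 similar_papers came back empty for all 134 candidates, so the further-reading section is empty. The coverage gap is recorded as s2_similar_unavailable, and the "Sources and cross-validation notes" section explains that today the S2 citation graph supplied only tldr, with no neighbor graph.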
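The 14-day seen-pool check mentioned above can be sketched roughly as follows. This is a minimal illustration, not the digest's actual code: the function name `split_fresh`, the `arxiv_id` key, the shape of `seen_pool` (paper ID mapped to the date it last appeared), and the sample IDs are all assumptions.

```python
from datetime import date, timedelta


def split_fresh(candidates, seen_pool, today, window_days=14):
    """Split candidates into (fresh, seen).

    seen_pool maps a paper ID to the date it last appeared in a digest
    (an assumed layout). A candidate counts as seen only when that date
    falls inside the look-back window.
    """
    cutoff = today - timedelta(days=window_days)
    fresh, seen = [], []
    for paper in candidates:
        last_seen = seen_pool.get(paper["arxiv_id"])
        if last_seen is not None and last_seen >= cutoff:
            seen.append(paper)
        else:
            fresh.append(paper)
    return fresh, seen


# Hypothetical IDs and dates, for illustration only.
today = date(2026, 5, 13)
candidates = [{"arxiv_id": "2605.00001"}, {"arxiv_id": "2605.00002"}]
seen_pool = {
    "2605.00001": date(2026, 4, 1),   # older than 14 days, counts as fresh again
    "2605.00002": date(2026, 5, 10),  # inside the window, counts as seen
}
fresh, seen = split_fresh(candidates, seen_pool, today)
```

Under this sketch, "0 hits, all fresh" simply means the second list is empty for the day's 134 candidates.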

📌 Top picks (cross-source hits)

World Model for Robot Learning: A Comprehensive Survey (HF upvotes 9 / tracked-author: Pieter Abbeel + Jitendra Malik) → A systematic survey of world-model paradigms and challenges in robot learning.
- reason: watchlist_keyword agent,world model + nice_to_have benchmark,evaluation,embodied + tracked_author (both tracked authors hit); the only methodology-level survey of the day.
- evidence: arxiv / hf / s2
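Large Language Models over Networks: Collaborative Intelligence under Resource Constraints (HF trending #12 / HF upvotes 1) → Collaborative LLM inference across device, edge, and cloud in resource-constrained networks.
- reason: hf_trending_rank:12 + watchlist_keyword agent,inference,dpo; maps to the long-running thread of on-device LLMs × network resource scheduling.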
Beyond Reasoning: Reinforcement Learning Unlocks Parametric Knowledge in LLMs (HF trending #3 / HF upvotes 5) → RL acts as a tool for activating parametric knowledge the LLM already holds, not for adding new knowledge.
- reason: hf_trending_rank:3 + watchlist_keyword reasoning,inference + nice_to_have benchmark; repositions the role of RL in LLMs.
- evidence: arxiv / hf / s2
TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection (cs.CR/CL/LG) → A dual-key Gumbel-max watermark that is compatible with speculative decoding at zero overhead.
- reason: watchlist_keyword reasoning,inference,speculative decoding + nice_to_have benchmark,evaluation; claims to strictly dominate SynthID-text.
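Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics (HF upvotes 39, the day's highest) → Enterprise systems can infer dynamics from readable rules, so a learned world model may not be needed.
- reason: watchlist_keyword reasoning,agent,inference + nice_to_have benchmark,evaluation; a same-day second-order challenge to the survey in Top pick #1.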
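UniPath: Adaptive Coordination of Understanding and Generation for Unified Multimodal Reasoning (HF trending #2 / HF upvotes 2) → Multimodal reasoning through adaptive coordination of understanding and generation paths.
- reason: hf_trending_rank:2 + watchlist_keyword reasoning,inference; targets the inference-time coordination gap in unified multimodal models (UMMs).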
TacoMAS: Test-Time Co-Evolution of Topology and Capability in LLM-based Multi-Agent Systems (HF trending #9 / HF upvotes 2) → A multi-agent framework that co-evolves topology and capability at test time.
- reason: hf_trending_rank:9 + watchlist_keyword agent,inference + nice_to_have benchmark; argues that the two axes must co-evolve on different timescales.
KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference (cs.LG/AI/CL) → Training-free long-context inference via a one-step left fold over the KV-cache.
- reason: watchlist_keyword agent,inference,kv cache + nice_to_have benchmark; frames KV accumulation through a functional foldl lens.

🏷 Watchlist hits by category

Second-tier papers that missed the Top 8 but show strong signals on watchlist topics:

Quantization / systems acceleration | Search Your Block Floating Point Scales! (score 6.5, tracked_author: Tri Dao; cs.LG/AR/PF): scale search for block FP quantization; the systems-level highlight just outside the Top 8.
VLA × world models | Reinforcing VLAs in Task-Agnostic World Models and World Action Models: The Next Frontier in Embodied AI (HF trending #23) together bring the thread of plugging VLAs into task-agnostic world models to 3 papers.
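Unified multimodal models | SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture (cs.CV, SenseTime) rounds out today's UMM camp and, alongside Top pick #6 UniPath, gives a two-sided view: architecture versus coordination.
Preference optimization / alignment | Semantic Reward Collapse and the Preservation of Epistemic Integrity in Adaptive RLHF: a cross-tag hit on preference_optimization × rlhf; the main concern is semantic collapse of the reward.
Long-context SFT | FocuSFT: Bilevel Optimization for Dilution-Aware Long-Context Fine-Tuning: the only dedicated hit for the watchlist keyword long context.
CoT reliability | Reliable Chain-of-Thought via Prefix Consistency (HF trending #5): reduces CoT reliability to prefix consistency; complements the activate-versus-learn distinction in Top pick #3 Beyond Reasoning.
Agentic RL / asynchronous training | Missing Old Logits in Asynchronous Agentic RL: Semantic Mismatch and Repair (HF trending #20): fills in the engineering detail of mismatched old logits in asynchronous agentic RL training.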

Every entry above was fetched raw and fresh, matched a watchlist keyword without making the Top picks, and has ranking_score ≥ 4.0.
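The selection rule for this tier reads naturally as a filter. The sketch below is one possible rendering under stated assumptions: `ranking_score` and the 4.0 threshold come from the text, while the function name `watchlist_tier`, the other field names (`id`, `fresh`, `watchlist_keywords`), and the sample records are hypothetical.

```python
def watchlist_tier(candidates, top_pick_ids, min_score=4.0):
    """Return fresh, keyword-matched candidates that are not Top picks.

    Results are sorted by ranking_score, highest first.
    """
    tier = [
        p for p in candidates
        if p["fresh"]                        # not in the seen-pool
        and p["watchlist_keywords"]          # at least one keyword hit
        and p["id"] not in top_pick_ids      # did not make the Top picks
        and p["ranking_score"] >= min_score  # score threshold from the text
    ]
    return sorted(tier, key=lambda p: p["ranking_score"], reverse=True)


# Hypothetical records, for illustration only.
candidates = [
    {"id": "a", "fresh": True, "watchlist_keywords": ["quantization"], "ranking_score": 6.5},
    {"id": "b", "fresh": True, "watchlist_keywords": [], "ranking_score": 7.0},
    {"id": "c", "fresh": True, "watchlist_keywords": ["agent"], "ranking_score": 3.9},
    {"id": "d", "fresh": True, "watchlist_keywords": ["agent"], "ranking_score": 8.1},
    {"id": "e", "fresh": True, "watchlist_keywords": ["vla"], "ranking_score": 5.4},
]
tier = watchlist_tier(candidates, top_pick_ids={"d"})
```

Here `b` drops out for lacking a keyword hit, `c` for scoring below the threshold, and `d` because it is already a Top pick.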

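🔗 Further reading (Semantic Scholar similar papers)

This section has no high-confidence incremental signal today (S2 returned no similar papers). Of the 134 candidates, 8 received an S2 paper_id and tldr, but the similar_papers field was an empty array in every case: today the S2 citation graph served no neighbors for preprints. We stick to the principle of leaving the section empty rather than padding it, and do no extra external fetching.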
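One way to derive the coverage figures and the gap flag from the candidate list is sketched below. The field names `s2_tldr` and `similar_papers` and the flag `s2_similar_unavailable` appear in this digest; the function name `s2_coverage`, the return layout, and the synthetic candidates are assumptions made for illustration.

```python
def s2_coverage(candidates):
    """Summarize Semantic Scholar coverage and flag a missing neighbor graph."""
    total = len(candidates)
    with_tldr = sum(1 for p in candidates if p.get("s2_tldr"))
    with_similar = sum(1 for p in candidates if p.get("similar_papers"))
    gaps = []
    if total and with_similar == 0:
        # No candidate has any neighbors, so further reading cannot be built.
        gaps.append("s2_similar_unavailable")
    return {
        "tldr": f"{with_tldr}/{total}",
        "similar_papers": f"{with_similar}/{total}",
        "gaps": gaps,
    }


# Synthetic candidates that mirror today's counts: 134 total, 8 with a tldr,
# and an empty similar_papers array everywhere.
candidates = [
    {"s2_tldr": "placeholder" if i < 8 else None, "similar_papers": []}
    for i in range(134)
]
report = s2_coverage(candidates)
```

With these inputs the report carries `tldr` as `8/134`, `similar_papers` as `0/134`, and a single gap entry.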

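🧑‍🔬 Newly surfaced authors / teams

Today's discovery scan found no candidates that meet the bar. The authors field in the candidate JSON comes only from arXiv and HF metadata and carries no affiliations field; reliably cross-checking "first watchlist hit in the past 48h" against institutional seeds would require external sources, which is beyond the scope of a single paper-digest fetch. For tracked regulars (Pieter Abbeel / Jitendra Malik / Tri Dao), see the Top picks and category-hit sections.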

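📉 Coverage gaps and uncertainty

s2_similar_unavailable: Semantic Scholar returned only tldr today and no similar_papers (empty for 134/134 candidates), so the further-reading section cannot be populated.
The candidate JSON lacks affiliations and published_at. The arxiv_id month prefix (2605) is used to infer a 2026-05 submission batch; institutional affiliation cannot be machine-verified today.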
s2_tldr is present for only 8/134; the 76 Top 30 papers need tldr_cn generated from the first sentence of the abstract, which risks drift when compressing into Chinese.
hf_upvotes returned a number for only 49/134 entries and None for the rest, so HF signal strength is ranked by trending_rank rather than upvotes.
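The fallback of inferring the submission month from the arxiv_id prefix can be sketched as below. It relies on the new-style arXiv identifier layout (YYMM, a dot, then a 4- or 5-digit sequence number, optionally followed by a version suffix); the function name `submission_month` and the sample IDs are hypothetical.

```python
import re

# New-style arXiv IDs: YYMM.NNNN or YYMM.NNNNN, optionally followed by vN.
_ARXIV_ID = re.compile(r"(\d{2})(\d{2})\.\d{4,5}(?:v\d+)?")


def submission_month(arxiv_id):
    """Infer (year, month) from a new-style arXiv ID, or return None."""
    match = _ARXIV_ID.fullmatch(arxiv_id)
    if match is None:
        return None  # old-style or malformed ID
    year = 2000 + int(match.group(1))
    month = int(match.group(2))
    if not 1 <= month <= 12:
        return None
    return year, month


# Hypothetical ID carrying the 2605 prefix discussed above.
inferred = submission_month("2605.01234")
```

This yields only a month-level batch, which is why published_at remains listed as a gap.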

Sources and cross-validation notes

Source	Role	Status this issue
arXiv listing	`primary`: paper PDF and abstract	Normal, 134 fresh
HuggingFace Daily Papers	`curated`: community heat	Normal, hf_url fully covered; 28 entries have a trending_rank in the top 30
Semantic Scholar	`metadata`: tldr / citation graph	Partially degraded: tldr 8/134; `similar_papers` 0/134

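Anchor priority for conclusions: primary > metadata > curated. HF trending serves only as a signal boost, not as evidence for a paper's results; every Top pick in this issue can be verified against its arXiv abstract. The missing S2 neighbor graph is recorded as a coverage_gap and does not affect the stable contract fields of the Top picks.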
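The priority order can be made concrete with a small lookup. In this sketch the role names and their order come from the text and table above; the short source keys (`arxiv`, `s2`, `hf`) follow the evidence lines in the Top picks, and the function name `anchor_source` is an assumption.

```python
# Role of each source, as given in the table above.
ROLE_BY_SOURCE = {"arxiv": "primary", "s2": "metadata", "hf": "curated"}
# Highest priority first: primary > metadata > curated.
ROLE_PRIORITY = ["primary", "metadata", "curated"]


def anchor_source(evidence):
    """Return the highest-priority known source in evidence, or None."""
    known = [s for s in evidence if s in ROLE_BY_SOURCE]
    if not known:
        return None
    return min(known, key=lambda s: ROLE_PRIORITY.index(ROLE_BY_SOURCE[s]))


anchor = anchor_source(["hf", "s2", "arxiv"])
```

For an entry whose evidence is arxiv / hf / s2, the anchor resolves to arxiv, consistent with verifying Top picks against the arXiv abstract.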

Hanzhi's BLOG

[Papers · 2026-05-13]