Paper Radar Daily | 2026-05-11

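One-line takeaway: three tracks produced dense hits at once today, Agent × Multimodal Search × Inference acceleration. The DTap red-teaming platform (co-signed by Percy Liang) pushes agent safety/evaluation to the top of the list, SpecBlock reports a hard number of +8-13% mean speedup over EAGLE-3, and ReasonMaxxer puts forward an RL-free counter-thesis that challenges the RLVR paradigm.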

Summary

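All three sources are available for today's 47 candidates (arXiv 47 / HF Daily 47 cross-listed / Semantic Scholar 32 hits). The main line is carried by three tracks, Agent × Multimodal Search × Inference acceleration: the DTap red-teaming platform (co-signed by Percy Liang) and LLMs-Improving-LLMs take the top two spots, pushing agent evaluation and TTS (test-time scaling) automation to the top; HyperEyes and InterLV-Search deliver, on the same day, a method-side signal and a benchmark-side signal for multimodal agentic search; SpecBlock provides the only hard benchmark number on today's inference-acceleration track (+8-13% mean speedup vs EAGLE-3). ReasonMaxxer puts forward an RL-free counter-thesis that challenges the RLVR paradigm, a fairly strong method-level challenge. MoE has 2 hits (MACE-Dance is the first concrete application), and DPO / long context / speculative decoding have 1-2 hits each. The S2 similar-papers lookup returned nothing for any candidate, so the further-reading section is empty; the affiliations field is empty for all candidates, so no institution- or team-level attribution is possible.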

📌 Top picks (cross-source hits)

2605.04808 DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents
- Quick read: The DTap red-teaming platform stress-tests AI agents across 14 domains and 50+ environments.
- S2 TLDR: The DecodingTrust-Agent Platform (DTap) is introduced, the first controllable and interactive red-teaming platform for AI agents, spanning 14 real-world domains and over 50 simulation environments that replicate widely used systems such as Google Workspace, Paypal, and Slack.
- Why selected: hf_trending_rank:14 + watchlist:agent + tracked_author:Percy Liang; the first controllable, interactive red-teaming platform for agents, hitting both the safety and evaluation tracks. (score=6.6, hf_upvotes=14, reasons=hf_trending_rank:14; watchlist_keyword:agent; nice_to_have:evaluation; tracked_author:percy liang)
- Authors: Zhaorun Chen, Xun Liu, Haibo Tong, Chengquan Guo, Yuzhou Nie, Jiawei Zhang, et al.
- Links: https://arxiv.org/abs/2605.04808 / https://huggingface.co/papers/2605.04808 / https://www.semanticscholar.org/paper/8b2bd7e1a717663be85a78f1486a8f3f415c551c
2605.08083 LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling
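- Quick read: An agent automatically discovers TTS (test-time scaling) strategies that beat hand-crafted heuristic scheduling.
- Why selected: triple hit on watchlist:reasoning,agent,inference + 51 HF upvotes (no trending rank, but among the higher upvote counts listed today); the first to hand TTS design itself over to an LLM agent for iteration. (score=6.5, hf_upvotes=51, reasons=watchlist_keyword:reasoning,agent,inference; nice_to_have:benchmark)
- Authors: Tong Zheng, Haolin Liu, Chengsong Huang, Huiwen Bao, Sheng Zhang, Rui Liu, et al.
- Links: https://arxiv.org/abs/2605.08083 / https://huggingface.co/papers/2605.08083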
2605.06716 From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms
- Quick read: A survey of LLM agent memory that proposes a three-stage Storage → Reflection → Experience framework.
- S2 TLDR: This survey proposes a novel evolutionary framework for LLM agent memory mechanisms, formalizing the development process into three stages: Storage (trajectory preservation), Reflection (trajectory refinement), and Experience (trajectory abstraction).
- Why selected: watchlist:agent + citation_velocity:4.0; it systematizes scattered research on memory mechanisms and has high value as an engineering reference. (score=6.0, hf_upvotes=5, reasons=watchlist_keyword:agent; citation_velocity:4.0)
- Authors: Jinghao Luo, Yuchen Tian, Chuxue Cao, Ziyang Luo, Hongzhan Lin, Kaixin Li, et al.
- Links: https://arxiv.org/abs/2605.06716 / https://huggingface.co/papers/2605.06716 / https://www.semanticscholar.org/paper/ed20847a506473433843fe31f9024667c0f47325
2512.18181 MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation
- Quick read: Cascaded MoE: a motion expert plus an appearance expert synthesize music-driven dance video.
- S2 TLDR: MACE-Dance is presented, a music-driven dance video generation framework with cascaded Mixture-of-Experts (MoE), where the Motion Expert performs music-to-3D motion generation while enforcing kinematic plausibility and artistic expressiveness, whereas the Appearance Expert carries out motion- and reference-conditioned video synthesis.
- Why selected: hf_trending_rank:12 + 80 upvotes + watchlist:moe + multiple benchmark/fine-tuning/evaluation hits; the first concrete application on the MoE track this week. (score=5.596, hf_upvotes=80, reasons=hf_trending_rank:12; watchlist_keyword:moe; nice_to_have:benchmark,fine-tuning,evaluation; citation_velocity:0.296)
- Authors: Kaixing Yang, Jiashu Zhu, Xulong Tang, Ziqiao Peng, Xiangyue Zhang, Puwei Wang, et al.
- Links: https://arxiv.org/abs/2512.18181 / https://huggingface.co/papers/2512.18181 / https://www.semanticscholar.org/paper/042c61783b406feb5ca8489f34213f837f8474a1
2605.06241 Rethinking RL for LLM Reasoning: It’s Sparse Policy Selection, Not Capability Learning
- Quick read: ReasonMaxxer: an entropy-gated contrastive loss replaces RL, with training in minutes on a single GPU.
- S2 TLDR: ReasonMaxxer, a minimal RL-free method that applies contrastive loss only at entropy-gated decision points, matches or exceeds full RL performance while requiring only tens of problems and minutes of single-GPU training, a reduction in training cost of roughly three orders of magnitude.
- Why selected: hf_trending_rank:4 + watchlist:reasoning; it puts forward an RL-free counter-thesis to the current RLVR paradigm and cuts training cost by roughly three orders of magnitude, a fairly strong method-level challenge. (score=5.1, hf_upvotes=2, reasons=hf_trending_rank:4; watchlist_keyword:reasoning; nice_to_have:benchmark)
- Authors: Ömer Faruk Akgül, Rajgopal Kannan, Willie Neiswanger, Viktor Prasanna
- Links: https://arxiv.org/abs/2605.06241 / https://huggingface.co/papers/2605.06241 / https://www.semanticscholar.org/paper/3fd6e403b398fa1ecf2618cce026272724ab6a5e
2605.07177 HyperEyes: Dual-Grained Efficiency-Aware Reinforcement Learning for Parallel Multimodal Search Agents
- Quick read: HyperEyes writes efficiency into the RL objective and turns multimodal search into parallel atomic actions.
- S2 TLDR: This work presents HyperEyes, a parallel multimodal search agent that fuses visual grounding and retrieval into a single atomic action, enabling concurrent search across multiple entities while treating inference efficiency as a first-class training objective and introduces IMEB, a human-curated benchmark of 300 instances that jointly evaluates search capability and efficiency.
- Why selected: hf_trending_rank:24 + 54 upvotes + watchlist:agent,inference + benchmark; it introduces the IMEB benchmark and, together with today's InterLV-Search, forms a double signal for multimodal agentic search. (score=5.1, hf_upvotes=54, reasons=hf_trending_rank:24; watchlist_keyword:agent,inference; nice_to_have:benchmark)
- Authors: Guankai Li, Jiabin Chen, Yi Xu, Xichen Zhang, Yuan Lu
- Links: https://arxiv.org/abs/2605.07177 / https://huggingface.co/papers/2605.07177 / https://www.semanticscholar.org/paper/21877966de98d8e37e7b5e0c7de4834ed2f9c8ad
2605.07243 SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting
- Quick read: SpecBlock's block-iterative speculative decoding improves mean speedup by 8-13% over EAGLE-3.
- S2 TLDR: This paper proposes SpecBlock, a block-iterative drafter that combines path dependence with cheap drafting, and shows that SpecBlock improves mean speedup by 8-13% over EAGLE-3 at 44-52% of its drafting cost, and cost-aware adaptation extends this lead to 11-19%.
- Why selected: hf_trending_rank:20 + watchlist:inference,speculative decoding; the only hard benchmark number on today's inference-acceleration track, at only 44-52% of EAGLE-3's drafting cost. (score=5.0, hf_upvotes=2, reasons=hf_trending_rank:20; watchlist_keyword:inference,speculative decoding)
- Authors: Weijie Shi, Qiang Xu, Fan Deng, Yaguang Wu, Jiarun Liu, Yehong Xu, et al.
- Links: https://arxiv.org/abs/2605.07243 / https://huggingface.co/papers/2605.07243 / https://www.semanticscholar.org/paper/0a8c0922e16a6c1fd098dc663278c9a2acb13986
2605.07510 InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search
- Quick read: InterLV-Search: the first benchmark for interleaved multimodal agentic search.
- Why selected: watchlist:agent,dpo + benchmark/evaluation; it brings visual evidence into the search trajectory and is today's second signal on the agent-evaluation side. (score=5.0, hf_upvotes=5, reasons=watchlist_keyword:agent,dpo; nice_to_have:benchmark,evaluation)
- Authors: Bohan Hou, Jiuning Gu, Jiayan Guo, Ronghao Dang, Sicong Leng, Xin Li, et al.
- Links: https://arxiv.org/abs/2605.07510 / https://huggingface.co/papers/2605.07510
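
Each entry above carries a `reasons=` string of `key:value` pairs separated by semicolons, with comma-separated values. As a minimal sketch of how such a string could be turned into a lookup table, the snippet below parses it into a dictionary. The function name `parse_reasons` is hypothetical and the format is inferred only from the entries shown here; this is not the code the digest pipeline actually uses.

```python
from typing import Dict, List


def parse_reasons(reasons: str) -> Dict[str, List[str]]:
    """Split a reasons string such as 'a:1; b:x,y' into {key: [values]}."""
    parsed: Dict[str, List[str]] = {}
    for part in reasons.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first colon only, so a value may itself contain colons.
        key, _, value = part.partition(":")
        parsed[key.strip()] = [v.strip() for v in value.split(",") if v.strip()]
    return parsed


# The reasons string of the DTap entry, copied from the list above.
dtap = parse_reasons(
    "hf_trending_rank:14; watchlist_keyword:agent; "
    "nice_to_have:evaluation; tracked_author:percy liang"
)
print(dtap["watchlist_keyword"])  # ['agent']
print(dtap["tracked_author"])     # ['percy liang']
```

Values stay as strings (for example `'14'` for the trending rank), so any numeric comparison would need an explicit conversion.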

🏷 Watchlist category hits

Entries already listed under Top picks are excluded; each bucket lists at most 4 fallback candidates.
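
The bucket rule just stated (skip Top picks, cap each bucket at 4) can be sketched as follows. The record shape, the function name `bucket_fallbacks`, and the choice to order each bucket by descending score are assumptions made for illustration; the digest does not say how the pipeline orders or truncates buckets.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (arxiv_id, score, watchlist keywords): a hypothetical record shape.
Candidate = Tuple[str, float, List[str]]


def bucket_fallbacks(
    candidates: Iterable[Candidate],
    top_pick_ids: Set[str],
    max_per_bucket: int = 4,
) -> Dict[str, List[str]]:
    """Group non-Top-pick candidates by keyword, best score first, capped."""
    buckets: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for arxiv_id, score, keywords in candidates:
        if arxiv_id in top_pick_ids:
            continue  # already shown under Top picks
        for keyword in keywords:
            buckets[keyword].append((score, arxiv_id))
    result: Dict[str, List[str]] = {}
    for keyword, items in buckets.items():
        # Sort by score, highest first; sorted() is stable, so ties keep input order.
        ranked = sorted(items, key=lambda item: -item[0])
        result[keyword] = [arxiv_id for _, arxiv_id in ranked[:max_per_bucket]]
    return result


candidates = [
    ("2605.04808", 6.6, ["agent"]),                      # DTap, a Top pick
    ("2605.05997", 5.0, ["reasoning", "inference"]),     # 4DThinker
    ("2605.07363", 4.0, ["long context", "inference"]),  # MISA
    ("2605.08043", 3.6, ["reasoning"]),                  # SCOPE
]
buckets = bucket_fallbacks(candidates, top_pick_ids={"2605.04808"})
print(buckets["reasoning"])  # ['2605.05997', '2605.08043']
print(buckets["inference"])  # ['2605.05997', '2605.07363']
```

A candidate with several watchlist keywords lands in several buckets, which matches how 4DThinker and MISA each appear twice in the lists that follow.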

`agent` (4 fallback entries)

2605.03353 SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents — score=4.5, hf_upvotes=6; reasons: hf_trending_rank:5; watchlist_keyword:agent
2604.25325 R^3-SQL: Ranking Reward and Resampling for Text-to-SQL — score=4.5, hf_upvotes=1; reasons: hf_trending_rank:10; watchlist_keyword:agent; nice_to_have:benchmark
2605.06455 PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors — score=4.2, hf_upvotes=2; reasons: hf_trending_rank:8; watchlist_keyword:agent
2605.07447 Sparse Autoencoders as Plug-and-Play Firewalls for Adversarial Attack Detection in VLMs — score=4.1, hf_upvotes=1; reasons: hf_trending_rank:9; watchlist_keyword:agent

`reasoning` (3 fallback entries)

2605.05997 4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding — score=5.0, hf_upvotes=15; reasons: watchlist_keyword:reasoning,inference; nice_to_have:benchmark,fine-tuning
2605.08043 SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation — score=3.6, hf_upvotes=7; reasons: hf_trending_rank:19; watchlist_keyword:reasoning; nice_to_have:benchmark
2605.06139 Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex — score=2.0, hf_upvotes=57; reasons: watchlist_keyword:reasoning

`inference` (4 fallback entries)

2605.05997 4DThinker: Thinking with 4D Imagery for Dynamic Spatial Understanding — score=5.0, hf_upvotes=15; reasons: watchlist_keyword:reasoning,inference; nice_to_have:benchmark,fine-tuning
2605.08044 Fast Byte Latent Transformer — score=4.0, hf_upvotes=5; reasons: watchlist_keyword:inference,speculative decoding
2605.07363 MISA: Mixture of Indexer Sparse Attention for Long-Context LLM Inference — score=4.0, hf_upvotes=11; reasons: watchlist_keyword:long context,inference
2605.06105 Shallow Prefill, Deep Decoding: Efficient Long-Context Inference via Layer-Asymmetric KV Visibility — score=3.9, hf_upvotes=1; reasons: hf_trending_rank:16; watchlist_keyword:inference; nice_to_have:benchmark

`moe` (1 fallback entry)

2602.03473 Scaling Continual Learning to 300+ Tasks with Bi-Level Routing Mixture-of-Experts — score=3.1, hf_upvotes=7; reasons: hf_trending_rank:29; watchlist_keyword:moe; nice_to_have:benchmark,evaluation

`dpo` (1 fallback entry)

2605.00933 CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining — score=4.4, hf_upvotes=1; reasons: hf_trending_rank:6; watchlist_keyword:dpo

`long context` (1 fallback entry)

2605.07363 MISA: Mixture of Indexer Sparse Attention for Long-Context LLM Inference — score=4.0, hf_upvotes=11; reasons: watchlist_keyword:long context,inference

`speculative decoding` (1 fallback entry)

2605.08044 Fast Byte Latent Transformer — score=4.0, hf_upvotes=5; reasons: watchlist_keyword:inference,speculative decoding

🔗 Further reading (Semantic Scholar similar papers)

No high-confidence incremental signal in this section today (S2 similar papers were not returned). Coverage gap: s2_similar_unavailable.

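🧑‍🔬 Newly appearing authors / teams

With affiliations / categories empty for every candidate, today's attribution relies solely on the tracked_author tag in ranking_reasons. The DTap red-teaming paper (arxiv:2605.04808) is co-signed by Percy Liang, the only identifiable activity signal from a known watchlist author today; no new authors or teams meeting the threshold were found among the remaining candidates.

Percy Liang: co-author on 2605.04808, "DecodingTrust-Agent Platform (DTap)". A known tracked_author from the watchlist signed a paper today, recorded as an activity signal for an already-tracked person; the affiliations field is empty for the remaining candidates, so new authors cannot be screened.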

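📉 Coverage gaps and uncertainty

s2_similar_unavailable: the S2 similar_papers field is None for all candidates, so the further-reading section is empty.
affiliations_unavailable: affiliations[] is empty for all 47 candidates, so institution- or team-level attribution of new findings is not possible.
s2_partial_coverage: 15/47 candidates lack an s2_url (including popular entries such as LLMs-Improving-LLMs and InterLV-Search); their tldr_en is left empty, and no substitute translation was produced.
confidence_flags: ranking_relies_on_hf_upvotes_and_keyword_only / no_tracked_lab_attribution_today.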
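
The gap counts above follow from simple bookkeeping over the candidate records. The sketch below reproduces the arithmetic with placeholder records that only mirror today's totals (47 candidates, 32 with an S2 match, affiliations empty everywhere); the helper `missing_count` and the record layout are hypothetical, not the pipeline's own.

```python
from typing import Any, Dict, List


def missing_count(records: List[Dict[str, Any]], field: str) -> int:
    """Count records whose field is absent, None, or empty."""
    return sum(1 for record in records if not record.get(field))


# Placeholder records: only the counts are taken from the digest.
records = [
    {"s2_url": "placeholder" if i < 32 else None, "affiliations": []}
    for i in range(47)
]

missing_s2 = missing_count(records, "s2_url")
coverage = 1 - missing_s2 / len(records)

print(missing_s2)                              # 15
print(missing_count(records, "affiliations"))  # 47
print(f"{coverage:.1%}")                       # 68.1%
```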

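Sources and cross-validation notes

arXiv (primary): 47 entries, used as the anchor for conclusions; arxiv_url is cited.
HuggingFace Daily Papers (curated): all 47 entries are cross-listed; hf_upvotes / hf_trending_rank serve only as attention indicators.
Semantic Scholar (metadata): 32/47 hits, providing tldr_en / citation_velocity; similar_papers was not returned for any candidate.

The tldr_cn for each Top pick is condensed from s2_tldr or the first sentence of the abstract, with no external fetch or translation triggered; ranking_score is produced in a single pass by paper_fetch.py and is not re-ranked afterwards. Source mix: arXiv 47 / HF 47 / S2 32 (primary>metadata>curated>other).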
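
One way to read the tier order `primary>metadata>curated>other` is as a tie-breaker when several sources supply the same field. The sketch below assumes that reading; the digest itself does not state how the order is applied, and the names `TIER_ORDER`, `SOURCE_TIER`, and `pick_value` are hypothetical.

```python
from typing import Dict, Optional

# Tier order copied from the source-mix note; a lower index wins.
TIER_ORDER = ["primary", "metadata", "curated", "other"]
# Tier labels as given in the source list above; unknown sources fall to "other".
SOURCE_TIER = {
    "arxiv": "primary",
    "semantic_scholar": "metadata",
    "hf_daily": "curated",
}


def pick_value(values_by_source: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the first non-empty value, scanning sources by tier priority."""

    def rank(source: str) -> int:
        return TIER_ORDER.index(SOURCE_TIER.get(source, "other"))

    for source in sorted(values_by_source, key=rank):
        value = values_by_source[source]
        if value:
            return value
    return None


# Placeholder values, used only to show which source wins.
print(pick_value({"hf_daily": "from HF", "arxiv": "from arXiv"}))
# from arXiv
print(pick_value({"arxiv": None, "hf_daily": "from HF", "semantic_scholar": "from S2"}))
# from S2
```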

Hanzhi's BLOG
