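Paper Radar Daily | 2026-04-28

One-line takeaway: VLA / embodied safety and process-level rewards are today's densest overlapping signals. KV cache, long-context upcycling, and agent self-organization also have papers worth reading, but the S2 similar-paper graph was not returned, so the further-reading dimension is degraded.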

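Summary

The three sources (arXiv + HF Daily + Semantic Scholar) yielded 129 raw candidates. None matched the seen-pool from the past 14 days, so all are fresh.
The VLA direction shows both a method-level signal (CF-VLA, coarse-to-fine two-stage action generation) and a safety-level signal (the VLA Safety survey), making it today's only dual-signal cluster.
Process reward models (PRMs) extend in two different directions: perception-centric (Perceval) and data-analysis agents (Scientific Process). PRMs are moving beyond the math domain.
On inference efficiency, DepthKV provides empirical evidence for layer-dependent KV pruning. On multi-agent systems, ReTAS uses dialectical alignment to mitigate Actor-Observer attribution bias.
The S2 similar_papers field was not returned, so the further-reading section is degraded. There were no tracked-author or tracked-affiliation hits, so the new-author discovery section is empty.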
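
The fresh-versus-seen check mentioned in the summary can be sketched as below. This is a minimal illustration under stated assumptions, not the pipeline's actual code: the function name `filter_fresh`, the `seen_pool` mapping from paper id to last-reported date, and the `"id"` field are all hypothetical. Only the 14-day window comes from the report.

```python
from datetime import date, timedelta

def filter_fresh(candidates, seen_pool, today, window_days=14):
    """Split candidates into (fresh, seen) using a rolling seen-pool.

    candidates: list of dicts with an "id" key (hypothetical schema).
    seen_pool:  dict mapping paper id -> date it was last reported (hypothetical).
    """
    cutoff = today - timedelta(days=window_days)
    fresh, seen = [], []
    for paper in candidates:
        last_seen = seen_pool.get(paper["id"])
        # A paper counts as "seen" only if it was reported inside the window.
        if last_seen is not None and last_seen >= cutoff:
            seen.append(paper)
        else:
            fresh.append(paper)
    return fresh, seen
```

With an empty or stale `seen_pool`, every candidate lands in `fresh`, which matches the "0 seen-pool hits, all fresh" situation reported for this issue.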

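📌 Top picks (cross-source hits)

CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies (cs.CV/cs.AI · watchlist: inference+vla+dpo · ranking_score 7.00) → Restructures VLA action generation into two coarse-to-fine stages: first initialize a coarse skeleton, then refine it, shortening the real-time inference path.
Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms (HF trending #17, 40 upvotes · watchlist: vla+inference · ranking_score 6.80) → A systematic survey of embodied VLA's multimodal attack surface, long-horizon error propagation, and data supply-chain vulnerabilities.
Improving Vision-language Models with Perception-centric Process Reward Models (cs.CV · HF trending #9 · ranking_score 6.60) → Proposes Perceval: checks each image-related claim in a VLM response against the visual evidence, localizing perception errors rather than judging only the final answer.
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis (HF trending #27, 14 upvotes · watchlist: reasoning+agent+inference · ranking_score 6.30) → Extends PRMs to data-analysis agents, identifying silent errors while avoiding wrongly penalizing exploratory actions.
DepthKV: Layer-Dependent KV Cache Pruning for Long-Context LLM Inference (cs.CL/cs.AI · watchlist: kv cache+inference · ranking_score 6.00) → Challenges the uniform pruning-ratio assumption and prunes the KV cache differentially by per-layer importance, reducing GPU memory use for long contexts.
Discovering Agentic Safety Specifications from 1-Bit Danger Signals (HF trending #10 · watchlist: reasoning+agent · indexed by S2 · ranking_score 6.00) → EPO-Safe: using only sparse binary danger signals, the LLM reflects and self-evolves natural-language safety specifications.
Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment (HF trending #15, 8 upvotes · watchlist: reasoning+agent · indexed by S2 · ranking_score 6.00) → ReTAS trains with thesis-antithesis-synthesis dialectical alignment, enforcing perspective-invariant reasoning to mitigate multi-agent attribution bias.
RaV-IDP: A Reconstruction-as-Validation Framework for Faithful Intelligent Document Processing (HF trending #2 · watchlist: inference · ranking_score 5.80) → Uses reconstruction as a reverse check so that document extraction carries its own faithfulness validation, catching silent errors.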
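
To make the idea of layer-dependent KV pruning from the DepthKV entry concrete, here is a toy sketch. It is not DepthKV's algorithm: the proportional allocation rule, the function names `allocate_layer_budgets` and `prune_layer`, and the per-layer importance and per-token scores are all assumptions chosen for illustration. It only shows how a fixed total cache budget might be split unevenly across layers instead of uniformly.

```python
def allocate_layer_budgets(layer_importance, total_budget, min_per_layer=1):
    """Split a total KV-token budget across layers in proportion to importance.

    layer_importance: non-negative scores, one per layer (hypothetical input).
    total_budget:     total number of cached tokens to keep across all layers.
    """
    n = len(layer_importance)
    total = sum(layer_importance)
    if total <= 0 or total_budget < n * min_per_layer:
        raise ValueError("need positive importance and a budget covering every layer")
    spare = total_budget - n * min_per_layer
    budgets = [min_per_layer + int(spare * w / total) for w in layer_importance]
    # Rounding down can leave a few tokens unassigned; give them to the
    # most important layers first.
    leftover = total_budget - sum(budgets)
    order = sorted(range(n), key=lambda i: layer_importance[i], reverse=True)
    for i in order[:leftover]:
        budgets[i] += 1
    return budgets

def prune_layer(token_scores, budget):
    """Return the positions (in original order) of the top-`budget` cached tokens."""
    ranked = sorted(range(len(token_scores)), key=lambda i: token_scores[i], reverse=True)
    return sorted(ranked[:budget])
```

When every layer has the same importance, `allocate_layer_budgets` reduces to a uniform split, which is the baseline assumption the paper is described as arguing against.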

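🏷 Watchlist category hits

ProEval: Proactive Failure Discovery and Efficient Performance Estimation (HF #24) → Uses transfer learning to discover failure modes proactively and estimate performance.
Disentangled Robot Learning via Separate Forward and Inverse Dynamics Pretraining (HF #4 · S2 citation velocity 0.22) → DeFI decouples the pretraining of visual forward dynamics and inverse dynamics.
UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing (HF #5) → Frame-decoupled geometric reference injection improves cross-view consistency in camera-controllable image editing.
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company (HF #13, 73 upvotes) → Organizes a multi-agent system as a self-evolving "company", moving from a static pipeline to self-organization.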

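🔗 Further reading (Semantic Scholar similar papers)

No high-confidence incremental signal for this section today (S2 similar papers were not returned and the candidate JSON lacks the similar_papers field; per the hard constraint, no external search is performed).

🧑‍🔬 Newly appearing authors / teams

Today's discovery scan found no qualifying candidates: the candidates' ranking_reasons contain 0 tracked_author / tracked_affiliation hits, and neither the HF Daily nor the arXiv abs metadata includes an affiliations field. Per discovery_rules, external search results are not forced in just to fill the section.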

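📉 Coverage gaps and uncertainty

s2_similar_unavailable: the candidate JSON has no similar_papers, so the further-reading section is empty.
affiliations_missing: all 129 candidates have an empty affiliations array, so the affiliation and new-author discovery sections are degraded. A follow-up could add an OpenReview / S2 author lookup in the paper_fetch stage.
All three sources (arXiv, HF Daily, Semantic Scholar) fetched OK (/tmp/paper_fetch.err is empty); the 14-day rolling seen-pool had 0 hits; no source was degraded.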
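
The two gap flags above could be derived from the candidate records roughly as follows. This is a sketch under assumptions: the function name `coverage_gaps` and the dict-based candidate schema are hypothetical, and only the flag names and the `similar_papers` / `affiliations` field names are taken from the report.

```python
def coverage_gaps(candidates):
    """Return coverage-gap flags for a list of candidate dicts (hypothetical schema)."""
    flags = []
    # No candidate carries a non-empty similar_papers list.
    if not any(c.get("similar_papers") for c in candidates):
        flags.append("s2_similar_unavailable")
    # Every candidate has a missing or empty affiliations list.
    if all(not c.get("affiliations") for c in candidates):
        flags.append("affiliations_missing")
    return flags
```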

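Sources and cross-validation notes

Weights for this issue: arXiv (primary, contributes categories and watchlist keywords to ranking) + HF Daily (curated, provides trending rank and upvotes) + Semantic Scholar (metadata, provides s2_tldr and citation_velocity; similar_papers is missing this issue). Of the Top picks, 6/8 were hit by both arXiv and HF Daily (dual-source cross hits); the other 2 (CF-VLA, DepthKV) were hit by arXiv only but strongly match watchlist keywords. Conclusions are anchored to the primary source, i.e. the arXiv abstract text. A citation_count of 0/None across the board is normal for new preprints and is not used as grounds for down-weighting.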
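
The 6/8 dual-source count can be reproduced from the Top picks list with a small tally. The record layout and the `hf_rank` field are hypothetical; the HF trending ranks are the ones quoted in the Top picks entries, and the sketch assumes (as the note above states) that every pick is present on arXiv.

```python
# HF trending rank per Top pick, as quoted in this issue (None = not on HF Daily).
top_picks = [
    {"title": "CF-VLA", "hf_rank": None},
    {"title": "VLA Safety", "hf_rank": 17},
    {"title": "Perceval", "hf_rank": 9},
    {"title": "Scientific Process PRM", "hf_rank": 27},
    {"title": "DepthKV", "hf_rank": None},
    {"title": "EPO-Safe", "hf_rank": 10},
    {"title": "ReTAS", "hf_rank": 15},
    {"title": "RaV-IDP", "hf_rank": 2},
]

def cross_source_summary(picks):
    """Split picks into dual-source (arXiv + HF Daily) and arXiv-only titles."""
    dual = [p["title"] for p in picks if p["hf_rank"] is not None]
    arxiv_only = [p["title"] for p in picks if p["hf_rank"] is None]
    return dual, arxiv_only

dual, arxiv_only = cross_source_summary(top_picks)
print(f"{len(dual)}/{len(top_picks)} dual-source; arXiv-only: {arxiv_only}")
```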

