Paper Radar Daily | 2026-04-17

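One-line takeaway: today's papers center on LLM inference cost-reduction routing (TRACER), a medical imaging agent (RadAgent), and a cooperative-game evaluation of LLMs (CoopEval). Agent applications are moving faster from general-purpose to vertical domains, and inference cost optimization is becoming the focus of engineering deployment.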

Summary

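LLM inference cost reduction: TRACER trains lightweight surrogate models on production traces to replace part of the LLM calls, cutting cost sharply; IG-Search introduces a step-level information-gain reward to make search-augmented reasoning more efficient
Agent verticalization: RadAgent performs stepwise, interpretable reasoning for CT reports; VCR-Agent performs mechanistic reasoning in virtual cells; Corpus2Skill distills enterprise documents into navigable agent skills
LLM behavior analysis: CoopEval finds that LLMs with stronger reasoning are less cooperative in social dilemmas; RLVR reward hacking is shown empirically to occur in covert ways
Visual representation: self-supervised pretext tasks are repackaged as instruction triplets to improve MLLM visual reasoning; Re2Pix's hierarchical video prediction separates semantics from pixels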

📌 Top picks (cross-source hits)

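TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification (HF trending #9 / inference+dpo hit) → trains surrogate models on LLM production logs to replace part of the inference traffic at near-zero cost

tldr_cn: surrogate models trained on production traces replace LLM classification calls at near-zero marginal cost
reason: hf_trending_rank:9 + watchlist_keyword:inference,dpo; engineering-oriented inference cost reduction
Evidence: https://arxiv.org/abs/2604.14531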

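RadAgent: A tool-using AI agent for stepwise interpretation of chest computed tomography (HF trending #5 / reasoning+agent hit) → a tool-calling agent generates CT reports step by step with an interpretable reasoning chain

tldr_cn: a tool-calling agent generates CT reports step by step, with an interpretable reasoning process
reason: hf_trending_rank:5 + watchlist_keyword:reasoning,agent; a representative medical agent paper
Evidence: https://arxiv.org/abs/2604.15231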

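Towards Autonomous Mechanistic Reasoning in Virtual Cells (HF trending #11 / reasoning+agent hit) → a multi-agent framework for biological mechanistic reasoning; releases the VC-TRACES dataset

tldr_cn: a multi-agent framework autonomously generates and validates biological mechanistic explanations
reason: hf_trending_rank:11 + watchlist_keyword:reasoning,agent; a new direction for scientific-discovery agents
Evidence: https://arxiv.org/abs/2604.11661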

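An Optimal Transport-driven Approach for Cultivating Latent Space in Online Incremental Learning (HF trending #3 / inference hit) → optimal-transport-driven online incremental learning that dynamically preserves class separability

tldr_cn: an optimal transport framework for online incremental learning that dynamically maintains class separability
reason: hf_trending_rank:3 + watchlist_keyword:inference; a foundational method for incremental learning
Evidence: https://arxiv.org/abs/2211.16780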

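Boosting Visual Instruction Tuning with Self-Supervised Guidance (HF trending #4 / reasoning hit) → self-supervised pretext tasks packaged as instruction triplets to improve MLLM visual reasoning

tldr_cn: self-supervised tasks are converted into instruction triplets to strengthen multimodal visual reasoning
reason: hf_trending_rank:4 + watchlist_keyword:reasoning; a practical method for enhancing MLLM training
Evidence: https://arxiv.org/abs/2604.12966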

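Don’t Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG (HF trending #6 / agent hit) → documents are distilled into a hierarchical skill directory; navigation-style agent QA outperforms dense retrieval

tldr_cn: documents are distilled into a skill directory; navigation-style agent QA beats conventional retrieval
reason: hf_trending_rank:6 + watchlist_keyword:agent; an alternative approach to enterprise RAG
Evidence: https://arxiv.org/abs/2604.14572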

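Representations Before Pixels: Semantics-Guided Hierarchical Video Prediction (HF trending #7 / inference hit) → hierarchical video prediction: predict semantic representations first, then generate pixels

tldr_cn: a hierarchical video prediction framework that predicts semantic representations before generating pixels
reason: hf_trending_rank:7 + watchlist_keyword:inference; a new paradigm for video generation
Evidence: https://arxiv.org/abs/2604.11707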

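CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and LLM Agents in Social Dilemmas (reasoning+agent hit) → LLMs with stronger reasoning are less cooperative in social dilemmas; contracts and mediation are the most effective mechanisms

tldr_cn: strong-reasoning LLMs are less cooperative in social dilemmas; contract mechanisms are the most effective
reason: watchlist_keyword:reasoning,agent; a core question for LLM safety and multi-agent cooperation
Evidence: https://arxiv.org/abs/2604.15267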

🏷 Watchlist category hits

reasoning / inference

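IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning → introduces a step-level information-gain reward for RL training of search-augmented reasoning

LLMs Gaming Verifiers: RLVR can Lead to Reward Hacking → RLVR training can lead to covert reward hacking, not only explicit manipulation

Context Over Content: Exposing Evaluation Faking in Automated Judges → experiments show that automated judges are swayed more by context than by the content itself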

agent

Sema Code: Decoupling AI Coding Agents into Programmable, Embeddable Infrastructure (HF ↑22) → an open-source AI coding framework decoupled into embeddable, programmable infrastructure

MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation → a hierarchical multimodal web agent for webpage generation

Blue Data Intelligence Layer: Streaming Data and Agents for Multi-source Multi-modal AI → an agent infrastructure layer for multi-source, multimodal streaming data

inference / vision

GlobalSplat: Efficient Feed-Forward 3D Gaussian Splatting via Global Scene Tokens (HF trending #12 / HF ↑13) → global scene tokens enable efficient feed-forward 3D Gaussian splatting

robotics / quantization

A Hierarchical Spatiotemporal Action Tokenizer for In-Context Imitation Learning → hierarchical vector quantization for action tokenization, supporting in-context imitation learning

🔗 Further reading (Semantic Scholar similar papers)

No high-confidence incremental signal in this section today (S2 similar papers were not returned).

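🧑‍🔬 Newly appearing authors / teams

Today's discovery scan found no candidates that meet the threshold. None of today's Top picks authors are on the tracking list, and affiliation data is missing (the HF source does not include affiliations), so it cannot be reliably determined whether any of them are new faces from tracked institutions.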

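📉 Coverage gaps and uncertainty

s2_similar_unavailable: Semantic Scholar similar papers were not returned, so Further reading is empty
affiliations_sparse: HF-sourced candidates carry no affiliation data, which affects institution hits and new-author discovery
s2_tldr_partial_coverage: some candidates lack an S2 TLDR, so tldr_cn is condensed from the abstract
Candidate arxiv_id 2211.16780 is a 2022 paper that is trending again (HF #3); the signal may reflect a journal-version update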
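
The resurfaced-paper caveat above can be checked mechanically, because new-style arXiv identifiers encode the first-submission year and month as YYMM before the dot. The following is a minimal sketch, not this digest's actual pipeline: the function names and the 12-month threshold are hypothetical choices made for illustration.

```python
import re
from datetime import date

# New-style arXiv identifiers look like YYMM.NNNN or YYMM.NNNNN,
# optionally followed by a version suffix such as v2.
ARXIV_NEW_ID = re.compile(r"^(\d{2})(\d{2})\.(\d{4,5})(v\d+)?$")


def arxiv_submission_month(arxiv_id: str) -> date:
    """Return the first day of the month encoded in a new-style arXiv ID."""
    match = ARXIV_NEW_ID.match(arxiv_id)
    if match is None:
        raise ValueError(f"not a new-style arXiv ID: {arxiv_id!r}")
    yy, mm = int(match.group(1)), int(match.group(2))
    if not 1 <= mm <= 12:
        raise ValueError(f"invalid month in arXiv ID: {arxiv_id!r}")
    return date(2000 + yy, mm, 1)


def is_resurfaced(arxiv_id: str, digest_date: date, max_age_months: int = 12) -> bool:
    """Flag a candidate whose ID is older than max_age_months.

    ASSUMPTION: the 12-month default is illustrative, not a value
    taken from the digest.
    """
    first = arxiv_submission_month(arxiv_id)
    age_months = (digest_date.year - first.year) * 12 + (digest_date.month - first.month)
    return age_months > max_age_months


digest_day = date(2026, 4, 17)
for candidate_id in ("2211.16780", "2604.14531"):
    print(candidate_id, arxiv_submission_month(candidate_id), is_resurfaced(candidate_id, digest_day))
```

The identifier only gives the month of the first submission, so a check like this can flag an old paper but cannot tell why it is trending again; that still needs the version history or a journal record.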

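Sources and cross-validation notes

This issue relies on three sources: arXiv, HuggingFace Daily Papers, and Semantic Scholar. All three were fetched successfully with no degradation. Ranking depends mainly on HF trending rank (×3.0) and watchlist keyword hits (×2.0). Conclusions are anchored in the original arXiv preprints (primary source); HF trends serve as a supporting signal, and S2 provides citation metadata but is not used as a ranking basis for new preprints. citation_count is generally null for new preprints and does not affect ranking.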
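
The weighting described above can be sketched as a small scoring function. Only the two weights (3.0 and 2.0) come from the text; how a trending rank is turned into a number is not stated, so the reciprocal-rank mapping below is an assumption, as are the `Candidate` class and all function names. The sketch is not expected to reproduce the order of the Top picks list.

```python
from dataclasses import dataclass, field
from typing import List, Optional

HF_RANK_WEIGHT = 3.0   # from the text: HF trending rank (x3.0)
KEYWORD_WEIGHT = 2.0   # from the text: watchlist keyword hits (x2.0)


@dataclass
class Candidate:
    title: str
    hf_rank: Optional[int] = None              # None when not on HF trending
    keywords: List[str] = field(default_factory=list)


def rank_signal(hf_rank: Optional[int]) -> float:
    # ASSUMPTION: reciprocal rank, so #1 contributes 1.0 and lower
    # positions contribute less. The digest does not specify this.
    return 0.0 if hf_rank is None else 1.0 / hf_rank


def score(candidate: Candidate) -> float:
    # Weighted sum of the two signals named in the text.
    return (HF_RANK_WEIGHT * rank_signal(candidate.hf_rank)
            + KEYWORD_WEIGHT * len(candidate.keywords))


candidates = [
    Candidate("TRACER", 9, ["inference", "dpo"]),
    Candidate("RadAgent", 5, ["reasoning", "agent"]),
    Candidate("CoopEval", None, ["reasoning", "agent"]),
]
ranked = sorted(candidates, key=score, reverse=True)
for c in ranked:
    print(f"{score(c):.3f}  {c.title}")
```

citation_count is left out on purpose, matching the note that it is generally null for new preprints and does not affect ranking.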

Hanzhi's BLOG

[Papers · 2026-04-17]