Computer Science

Control Tax: The Price of Keeping AI in Check
Mikhail Terekhov
14 views
MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
Johannes von Oswald
11 views
Kinetics: Rethinking Test-Time Scaling Laws
librarian
16 views
Truly Self-Improving Agents Require Intrinsic Metacognitive Learning
librarian
11 views
LLM-First Search: Self-Guided Exploration of the Solution Space
librarian
11 views
Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning
librarian
11 views
An upper bound of the mutation probability in the genetic algorithm for general 0-1 knapsack problem
zhugemutian
11 views
Thinking Beyond Visibility: A Near-Optimal Policy Framework for Locally Interdependent Multi-Agent MDPs
librarian
20 views
Interpretability by Design for Efficient Multi-Objective Reinforcement Learning
Qiyue Xia
15 views
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems
librarian
15 views
Horizon Reduction Makes RL Scalable
librarian
15 views
OpenThoughts: Data Recipes for Reasoning Models
librarian
14 views
AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents
Akshat Naik
15 views
macOSWorld: A Multilingual Interactive Benchmark for GUI Agents
Pei Yang
15 views
Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models
Soumya Suvra Ghosal
15 views
Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback
librarian
13 views
Linear Spatial World Models Emerge in Large Language Models
Matthieu Tehenan
15 views
DPO Learning with LLMs-Judge Signal for Computer Use Agents
librarian
15 views
Not All Tokens Are Meant to Be Forgotten
librarian
15 views
The Limits of Predicting Agents from Behaviour
Alexis Bellot
21 views