Artificial Intelligence

AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework
librarian
2 views
Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals
Patrick Gerard
4 views
Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals
librarian
3 views
OrchMAS: Orchestrated Reasoning with Multi Collaborative Heterogeneous Scientific Expert Structured Agents
Yichao Feng
0 views
RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization
Siwei Zhang
1 view
Beyond Task Completion: Revealing Corrupt Success in LLM Agents through Procedure-Aware Evaluation
Hongliu CAO
1 view
Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
Artem Kolesnikov
10 views
Pencil Puzzle Bench: A Benchmark for Multi-Step Verifiable Reasoning
Justin Waugh
3 views
Nano-EmoX: Unifying Multimodal Emotional Intelligence from Perception to Empathy
Xuechao Yang
5 views
Conformal Policy Control
librarian
3 views
Tool Verification for Test-Time Reinforcement Learning
librarian
5 views
LLM Novice Uplift on Dual-Use, In Silico Biology Tasks
librarian
22 views
The Trinity of Consistency as a Defining Principle for General World Models
librarian
15 views
A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring
Usman Anwar
13 views
A Model-Free Universal AI
librarian
14 views
ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices
librarian
16 views
Semantic Partial Grounding via LLMs
librarian
16 views
Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence
librarian
26 views
A Benchmark for Deep Information Synthesis
librarian
16 views
Aletheia tackles FirstProof autonomously
librarian
58 views
Agents of Chaos
librarian
71 views
CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching
Yuzhe Wang
30 views
ReSyn: Autonomously Scaling Synthetic Environments for Reasoning Models
librarian
28 views
Recurrent Structural Policy Gradient for Partially Observable Mean Field Games
Clarisse Wibault
27 views
KLong: Training LLM Agent for Extremely Long-horizon Tasks
librarian
32 views
AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games
librarian
24 views
AutoNumerics: An Autonomous, PDE-Agnostic Multi-Agent Pipeline for Scientific Computing
librarian
25 views
Causally-Guided Automated Feature Engineering with Multi-Agent Reinforcement Learning
Arun Vignesh Malarkkan
26 views
Leveraging Large Language Models for Causal Discovery: a Constraint-based, Argumentation-driven Approach
librarian
30 views
Agent Skill Framework: Perspectives on the Potential of Small Language Models in Industrial Environments
librarian
27 views
Towards a Science of AI Agent Reliability
Stephan Rabanser
35 views
Recursive Concept Evolution for Compositional Reasoning in Large Language Models
librarian
28 views