Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2509.08755

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 18
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 30
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 33
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3 • 75
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7 • 105
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 497
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6 • 30

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56
The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning

Paper • 2509.03646 • Published Sep 3 • 30

Interessting papers

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Paper • 2508.21104 • Published Aug 28 • 35
FNet: Mixing Tokens with Fourier Transforms

Paper • 2105.03824 • Published May 9, 2021 • 1
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83
RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 28

Provable Benefits of In-Tool Learning for Large Language Models

Paper • 2508.20755 • Published Aug 28 • 11
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28 • 63
How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench

Paper • 2508.20931 • Published Aug 28 • 15
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

about 14 hours ago

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published 8 days ago • 48
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140
Video Reasoning without Training

Paper • 2510.17045 • Published Oct 19 • 7
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design

Paper • 2311.13743 • Published Nov 23, 2023 • 1
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading

Paper • 2509.09995 • Published Sep 12 • 14
TradingAgents: Multi-Agents LLM Financial Trading Framework

Paper • 2412.20138 • Published Dec 28, 2024 • 14
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs

Paper • 2509.09677 • Published Sep 11 • 34

Infra & tools for Agentic Systems

AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22 • 53
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

Good agents related space, model, dataset

Good agents related space, model, dataset collection

zai-org/GLM-4.5

Text Generation • 358B • Updated Aug 11 • 23.7k • • 1.39k
Running

26

GLM 4.5V Demo App

🏃

26

Demo App of dmg file
nvidia/Cosmos-Reason1-7B

Image-Text-to-Text • 8B • Updated about 15 hours ago • 117k • 218
Running

MCP

Featured

139

Web Search MCP

🔎

139

Search and extract web content for LLM ingestion

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 18
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 30
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 33
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

about 14 hours ago

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published 8 days ago • 48
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140
Video Reasoning without Training

Paper • 2510.17045 • Published Oct 19 • 7
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3 • 75
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7 • 105
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 497
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6 • 30

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56
The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning

Paper • 2509.03646 • Published Sep 3 • 30

FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design

Paper • 2311.13743 • Published Nov 23, 2023 • 1
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading

Paper • 2509.09995 • Published Sep 12 • 14
TradingAgents: Multi-Agents LLM Financial Trading Framework

Paper • 2412.20138 • Published Dec 28, 2024 • 14
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs

Paper • 2509.09677 • Published Sep 11 • 34

Interessting papers

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Paper • 2508.21104 • Published Aug 28 • 35
FNet: Mixing Tokens with Fourier Transforms

Paper • 2105.03824 • Published May 9, 2021 • 1
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83
RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 28

Infra & tools for Agentic Systems

AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22 • 53
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

Provable Benefits of In-Tool Learning for Large Language Models

Paper • 2508.20755 • Published Aug 28 • 11
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28 • 63
How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench

Paper • 2508.20931 • Published Aug 28 • 15
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56

Good agents related space, model, dataset

Good agents related space, model, dataset collection

zai-org/GLM-4.5

Text Generation • 358B • Updated Aug 11 • 23.7k • • 1.39k
Running

26

GLM 4.5V Demo App

🏃

26

Demo App of dmg file
nvidia/Cosmos-Reason1-7B

Image-Text-to-Text • 8B • Updated about 15 hours ago • 117k • 218
Running

MCP

Featured

139

Web Search MCP

🔎

139

Search and extract web content for LLM ingestion

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs