- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 29
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 14
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23

Collections
Collections including paper arxiv:2506.06395
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 263
- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
  Paper • 2506.06395 • Published • 133
- Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
  Paper • 2506.05176 • Published • 74
- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 277

- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 429
- Training Language Models to Self-Correct via Reinforcement Learning
  Paper • 2409.12917 • Published • 140
- StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
  Paper • 2409.12576 • Published • 16
- Transformer Explainer: Interactive Learning of Text-Generative Models
  Paper • 2408.04619 • Published • 172

- Snowflake/Arctic-Text2SQL-R1-7B
  8B • Updated • 11.8k • 56
- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 277
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 263
- Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
  Paper • 2506.16406 • Published • 128

- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
  Paper • 2506.06395 • Published • 133
- Discrete Diffusion in Large Language and Multimodal Models: A Survey
  Paper • 2506.13759 • Published • 43
- LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning
  Paper • 2506.18841 • Published • 56
- SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning
  Paper • 2506.19767 • Published • 15

- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
  Paper • 2506.06395 • Published • 133
- Magistral
  Paper • 2506.10910 • Published • 65
- Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
  Paper • 2506.07240 • Published • 7
- Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
  Paper • 2506.09991 • Published • 55

- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
  Paper • 2506.06395 • Published • 133
- MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation
  Paper • 2506.14028 • Published • 93
- UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities
  Paper • 2507.19766 • Published • 14