Agent tuning
zai-org/SWE-Dev-train • Updated Jul 9 • 20.1k • 210 • 10
SWE-Gym/OpenHands-SFT-Trajectories • Updated May 10 • 491 • 167 • 14
lmarena-ai/webdev-arena-preference-10k • Updated Mar 10 • 10.5k • 125 • 14
SWE-bench/SWE-smith-trajectories • Updated Jul 19 • 76k • 2.46k • 34

Agent Benchmarks
xw27/scibench • Updated May 6, 2024 • 692 • 674 • 21
google/frames-benchmark • Updated Oct 15, 2024 • 824 • 12.7k • 235
gaia-benchmark/GAIA • Updated Oct 28 • 932 • 13.1k • 493
HuggingFaceH4/MATH-500 • Updated Nov 15, 2024 • 500 • 74.8k • 223