Josh Fourie (JoshFourie)
AI & ML interests: None yet
Organizations: None yet
Other collections: Transformers, Healthcare
Benchmarks
- Rethinking FID: Towards a Better Evaluation Metric for Image Generation
  Paper • 2401.09603 • Published • 17
- LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
  Paper • 2402.10524 • Published • 23
- Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming
  Paper • 2402.14261 • Published • 11
- RewardBench: Evaluating Reward Models for Language Modeling
  Paper • 2403.13787 • Published • 22
LLM-Stuff
- Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
  Paper • 2401.02994 • Published • 52
- Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
  Paper • 2401.05566 • Published • 30
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69
- Zero Bubble Pipeline Parallelism
  Paper • 2401.10241 • Published • 25