OpenAGI: an open platform that lets LLMs plan and call specialist models to solve multi-step tasks
OpenAGI shows that existing specialist models can be composed under LLM control, and that RL-style tuning can make smaller, cheaper models competitive. This is useful for building product workflows that call vision, text, or web tools.
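To make the idea of composition under LLM control concrete, here is a minimal sketch, not OpenAGI's actual code. It assumes the planner returns an ordered list of model names and that each specialist model maps one input to one output. The names `REGISTRY`, `fake_planner`, `run_plan`, and the three stub models are hypothetical and stand in for a real LLM call and real vision or text models.

```python
from typing import Callable, Dict, List

# Hypothetical stubs standing in for specialist models (assumption, not OpenAGI's API).
def denoise(x: str) -> str:
    return x.replace("~", "")

def caption(x: str) -> str:
    return f"caption({x})"

def translate(x: str) -> str:
    return f"translate({x})"

# Name -> callable lookup so a plan can refer to models by name.
REGISTRY: Dict[str, Callable[[str], str]] = {
    "denoise": denoise,
    "caption": caption,
    "translate": translate,
}

def fake_planner(task: str) -> List[str]:
    """Stand-in for the LLM planner: returns an ordered list of model names."""
    # A real system would prompt an LLM with the task description here.
    return ["denoise", "caption", "translate"]

def run_plan(plan: List[str], data: str) -> str:
    """Apply each planned model in order, feeding each output into the next step."""
    for name in plan:
        if name not in REGISTRY:
            raise KeyError(f"planner chose unknown model: {name}")
        data = REGISTRY[name](data)
    return data

plan = fake_planner("Describe a noisy image in German")
result = run_plan(plan, "im~age")
print(result)
```

Looking up models by name keeps the planner's output as plain text, so a check like the `KeyError` above can reject a plan that names a model that does not exist.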
Key finding
A large, general LLM (GPT-4) achieves the highest overall OpenAGI scores in both the zero-shot and few-shot settings.
Numbers: GPT-4 overall score: 0.2378 (zero-shot) -> 0.5281 (few-shot)
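As a quick check on the size of that jump, the arithmetic can be worked out directly from the two reported GPT-4 scores. The variable names are illustrative only.

```python
# Reported GPT-4 overall OpenAGI scores from the summary above.
zero_shot = 0.2378
few_shot = 0.5281

absolute_gain = few_shot - zero_shot   # difference in overall score
relative_gain = few_shot / zero_shot   # few-shot score as a multiple of zero-shot

print(f"absolute gain: {absolute_gain:.4f}")           # absolute gain: 0.2903
print(f"few-shot / zero-shot: {relative_gain:.2f}x")   # few-shot / zero-shot: 2.22x
```

Adding a few in-context examples therefore roughly doubles GPT-4's overall score on this benchmark.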

