A mid-size LLM trained on a large curated finance corpus delivers large real-world gains on finance tasks while staying useful on general tasks, so firms can get domain accuracy without running very large models.
Key finding
Mixed training (curated finance data plus public data) yields strong finance performance without sacrificing general abilities.
Numbers: the training corpus holds 363B financial tokens plus 345B public tokens (≈ 709B tokens in total); the model was trained on 569B tokens.
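As a quick check on these figures, the short Python sketch below derives the finance/public split and the share of the corpus covered by the 569B training tokens. It uses only the rounded counts quoted above; the function name `corpus_summary` and the dictionary keys are illustrative and do not come from the source. Note that the rounded components sum to 708B, so the quoted ≈ 709B total presumably reflects rounding of the underlying exact counts.

```python
# Token counts in billions, taken from the rounded figures quoted above.
FINANCE_TOKENS_B = 363
PUBLIC_TOKENS_B = 345
TRAINED_TOKENS_B = 569


def corpus_summary(finance_b, public_b, trained_b):
    """Return the corpus total, the mix shares, and the fraction of the corpus trained on.

    Hypothetical helper for illustration only; all inputs are in billions of tokens.
    """
    total_b = finance_b + public_b
    return {
        "total_b": total_b,
        "finance_share": finance_b / total_b,
        "public_share": public_b / total_b,
        "fraction_seen": trained_b / total_b,
    }


summary = corpus_summary(FINANCE_TOKENS_B, PUBLIC_TOKENS_B, TRAINED_TOKENS_B)
print(f"total corpus:   {summary['total_b']}B tokens")
print(f"finance share:  {summary['finance_share']:.1%}")
print(f"public share:   {summary['public_share']:.1%}")
print(f"fraction seen:  {summary['fraction_seen']:.1%}")
```

On these rounded inputs the mix is close to an even split, with finance slightly ahead, and the training run covers roughly four fifths of the available tokens.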

