In September 2024, OpenAI released its o1 model, trained with large-scale reinforcement learning, giving it “advanced reasoning” capabilities. The details of how they pulled this off were never shared publicly.

How to Train LLMs to “Think” (o1 & DeepSeek-R1)
Article Summary
Advanced reasoning models explained