How to Train LLMs to “Think” (o1 & DeepSeek-R1)

Article Summary

Advanced reasoning models explained

Last updated: January 13, 2026

In September 2024, OpenAI released its o1 model, trained with large-scale reinforcement learning, which gave it “advanced reasoning” capabilities. The details of how they pulled this off were never shared publicly.
