Less is More: Recursive Reasoning with Tiny Networks
Title: Detailed Description of Less is More: Recursive Reasoning with Tiny Networks
Section 1: Overview
- This document describes the core ideas, design, and experimental landscape surrounding the work titled “Less is More: Recursive Reasoning with Tiny Networks,” which introduces the Tiny Recursion Model (TRM).
- TRM presents a paradigm where recursive reasoning is performed by a compact neural network, challenging the common assumption that solving hard problems requires large, expensive models.
- The central claim is that a tiny model, trained from scratch and iteratively refining its own answers through recursion, can achieve competitive results on challenging tasks such as ARC-AGI-1 and ARC-AGI-2 with far fewer parameters (approximately 7 million).
- The narrative emphasizes efficiency, accessibility, and the potential to democratize high-quality reasoning systems by avoiding dependence on massive foundation models trained with vast resources.
- While acknowledging the Hierarchical Reasoning Model (HRM) as prior work, TRM argues for a simplified, self-improving loop that does not rely on fixed-point theorems or a hierarchical architecture.
- The accompanying materials outline installation prerequisites, dataset preparation pipelines for multiple tasks, and a suite of experiments designed to highlight TRM’s performance across diverse problem domains, including Sudoku variants and maze challenges.
Section 2: Motivation and Philosophical Underpinnings
- The motivation centers on a critique of current trends that tie achievement on hard tasks to the scale of the model and the scale of the training infrastructure.
- The authors advocate for “less is more”: a tiny model, when endowed with a robust recursive reasoning mechanism, can outperform expectations by iteratively refining its own outputs rather than blindly enlarging the parameter count.
- TRM is proposed as a minimal yet effective approach to reasoning: a model that, starting from a basic embedded representation of a question and an initial guess, repeatedly revisits and revises its own latent state and answer.
- The work is positioned as a response to the perception that breakthroughs are inseparable from big budgets, offering a counterpoint in which clever architectural dynamics and an iterative improvement loop compensate for small parameter budgets.
- The inspiration drawn from Hierarchical Reasoning Model (HRM) is acknowledged, but TRM seeks to distill the essence of recursive reasoning into a streamlined process that forgoes certain complex dependencies, mathematical fixed-point guarantees, or hierarchical scaffolding.
Section 3: TRM: A Concise Description
TRM stands for Tiny Recursion Model. It recursively refines its predicted answer y using a compact neural network.
The model takes as input:
- The embedded input question x.
- An initial embedded answer y.
- A latent z, which acts as an internal state that develops during recursion.
The core loop runs for up to K improvement steps. Each step has two stages:
1) Recursive reasoning stage: update the latent z given the question x, the current answer y, and the current latent z. This update is repeated several times within each improvement step to refine the internal representation.
2) Answer update stage: produce a new answer y from the current answer y and the updated latent z.
Through these alternating updates, TRM incrementally improves its output and can correct errors from its previous answer, all while keeping a tiny parameter footprint.
The mechanism is designed to limit overfitting and to yield robust improvements even when network capacity is small.
Visual reference: The TRM figure (TRM_fig.png) illustrates the high-level data flow: x feeds into the system with an initial y and z, multiple recursive updates of z are performed, followed by an updated y, iterating over K cycles.
Image reference:
Figure: TRM
Source: https://AlexiaJM.github.io/assets/images/TRM_fig.png
Alt text: TRM diagram showing recursive refinement of the answer through latent state updates.
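The loop described in this section can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the TRM implementation: `make_toy_net`, `trm_refine`, the inner-loop count `n`, the vector size, and the input values are all hypothetical, and reusing one toy network for both stages is a simplification made here for brevity. The sketch also runs all K steps with no early stopping.

```python
import math
import random


def make_toy_net(dim, seed):
    """Build a toy one-layer network (a stand-in, not the TRM architecture)."""
    rng = random.Random(seed)
    # Each output unit sees the concatenation [x, y, z], i.e. 3 * dim inputs.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(3 * dim)]
               for _ in range(dim)]

    def net(x, y, z):
        inputs = x + y + z  # list concatenation
        return [math.tanh(sum(w * v for w, v in zip(row, inputs)))
                for row in weights]

    return net


def trm_refine(x, y, z, net, K=3, n=6):
    """Alternate n latent updates with one answer update, K times."""
    no_question = [0.0] * len(x)  # the answer update does not look at x
    answers = [y]
    for _ in range(K):
        # Stage 1: recursive reasoning, refine z from (x, y, z).
        for _ in range(n):
            z = net(x, y, z)
        # Stage 2: answer update, refine y from (y, z) only.
        y = net(no_question, y, z)
        answers.append(y)
    return y, z, answers


dim = 4
net = make_toy_net(dim, seed=0)
x = [0.1, -0.2, 0.3, 0.4]   # embedded question (made-up values)
y0 = [0.0] * dim            # initial answer embedding
z0 = [0.0] * dim            # initial latent state
y, z, answers = trm_refine(x, y0, z0, net, K=3, n=6)
print(len(answers))  # 4: the initial answer plus one per improvement step
```

Keeping every intermediate answer in `answers` makes it easy to inspect how the output changes from one improvement step to the next.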
Section 4: Architectural and Procedural Details
- The design hinges on four core components:
1) Input embedding: The question x is embedded into a latent representation suitable for subsequent processing, while the initial answer y is also embedded to provide a starting point for refinement.
2) Latent state dynamics: A latent z is maintained and updated within each recursion step, enabling the model to carry information across iterations and to refine its internal understanding of the problem.
3) Recursive reasoning subroutine: The update of z uses the current x, y, and z to produce a richer latent representation that better informs the next stage of answer refinement.
4) Answer refinement subroutine: The model uses the updated latent z to construct a new version of the answer y, so that improvements in the latent representation translate into more accurate outputs.
- The recursive loop can be executed for a user-specified number of improvements, K, allowing researchers to trade off compute for potential gains in answer quality.
- The entire process is designed to be highly parameter-efficient. The small parameter count (on the order of millions) is a deliberate design choice to demonstrate that iterative self-improvement, rather than sheer scale, can drive significant performance.
- By avoiding reliance on fixed-point theorems or rigid hierarchical scaffolding, TRM emphasizes practical, implementable recursion that can be trained from scratch without leveraging pre-trained large models.
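The compute-for-quality trade-off controlled by K can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every latent update and every answer update costs one forward pass of the network and that each improvement step performs `n` latent updates; `forward_passes` and `n` are hypothetical names, not part of the project.

```python
def forward_passes(K, n):
    """Network evaluations for K improvement steps with n latent updates each.

    Assumes (for illustration) one forward pass per latent update and one per
    answer update, so each improvement step costs n + 1 passes.
    """
    if K < 0 or n < 0:
        raise ValueError("K and n must be non-negative")
    return K * (n + 1)


for K in (1, 2, 4, 8):
    print(K, forward_passes(K, n=6))
# prints: 1 7, 2 14, 4 28, 8 56 (one pair per line)
```

Under these assumptions the cost grows linearly in K while the parameter count stays fixed, which is the sense in which recursion substitutes compute for model size.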
Section 5: Requirements and Setup
- The project’s setup notes indicate that installation takes a few minutes, while training runs range from hours to days depending on the task.
- Hardware considerations:
- Sudoku-Extreme experiments can be run on a single GPU (e.g., an L40S with 48 GB of GPU memory), with training runs lasting on the order of hours.
- More demanding tasks may benefit from multi-GPU configurations (e.g., 4× L40S or equivalent) to accelerate training and experimentation.
- Software prerequisites and environment:
- Python 3.10 (or compatible).
- CUDA toolkit that matches the chosen PyTorch build (e.g., CUDA 12.6 in the example setup).
- Package installation:
- A modern PyTorch nightly build aligned with the CUDA version (e.g., pip install --pre --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu126).
- A requirements file with specific dependencies (requirements.txt) to be installed with pip install -r requirements.txt.
- Optional logging integration with Weights & Biases (wandb) via wandb login YOUR-LOGIN to synchronize results.
- Practical notes:
- The exact requirement versions are captured in a specific_requirements.txt resource.
- If version conflicts arise, users are advised to adjust the environment to mirror the documented configuration.
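Before installing the remaining dependencies, it can help to confirm that the interpreter and the PyTorch build match the prerequisites above. The following is a minimal sketch, assuming only the standard library plus PyTorch if it happens to be installed; `check_environment` is a hypothetical helper, not part of the repository.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10)):
    """Collect basic facts about the Python and PyTorch setup."""
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        try:
            import torch
        except ImportError:
            # Package is present but cannot be imported (e.g. broken install).
            report["torch_installed"] = False
        else:
            report["torch_version"] = torch.__version__
            # CUDA version the wheel was built against (None for CPU builds).
            report["torch_cuda_build"] = torch.version.cuda
            report["cuda_available"] = torch.cuda.is_available()
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If `torch_cuda_build` does not match the installed CUDA toolkit, reinstalling PyTorch from the index URL that corresponds to the local CUDA version is the usual remedy.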
Section 6: Dataset Preparation
- The repository provides scripts to prepare multiple datasets used in TRM experiments. Each dataset has particular characteristics and subsets that must be respected to avoid leakage between training and evaluation data.
- ARC-AGI-1:
- Data preparation is performed via a Python module that builds the ARC-AGI-1 dataset from a Kaggle-based input source.
- The output directory is data/arc1concept-aug-1000. The build uses the subsets training, evaluation, and concept, and designates evaluation as the test set.
- ARC-AGI-2:
- ARC-AGI-2 is prepared with the same build script, using its own output directory data/arc2concept-aug-1000, the subsets training2, evaluation2, and concept, and evaluation2 as the test set.
- Important caveat:
- The source explicitly notes that ARC-AGI-2 training data contains some ARC-AGI-1 evaluation data. A single model therefore cannot be trained on both ARC-AGI-1 and ARC-AGI-2 and then evaluated on both, because of cross-contamination between the datasets.
- Sudoku-Extreme:
- The dataset: data/sudoku-extreme-1k-aug-1000.
- The build subsamples 1000 puzzles and generates 1000 augmentations of each, giving a compact yet varied training set for puzzle solving.
- Maze-Hard:
- The dataset: data/maze-30x30-hard-1k.
- The repository provides a script to generate approximately 1000 examples and 8 augments, designed to test spatial reasoning and planning under challenging conditions.
- Setup notes:
- Each dataset comes with a dedicated script (e.g., dataset/build_arc_dataset.py, dataset/build_sudoku_dataset.py, dataset/build_maze_dataset.py) that accepts parameters such as the output directory, input prefixes, and subset selections.
- This structured approach enables reproducible experiments and consistent comparisons across tasks.
- Practical considerations:
- The prepared datasets serve as diverse testbeds to demonstrate TRM’s ability to generalize from small-scale training data to reasoning tasks that require careful, iterative refinement.
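The dataset preparation steps above can be collected into one place as argument lists, which makes the subset and test-set choices easy to review before anything is run. This is a sketch under assumptions: the script and module names are written with underscores, and the flag spellings (`--output-dir`, `--subsets`, `--test-set-name`, `--subsample-size`, `--num-aug`) are assumed from the parameter names described in this section, so they should be checked against the scripts' own help output. The ARC input prefix is omitted because its value is not given here.

```python
import shlex


def build_dataset_commands():
    """Return argument lists for the dataset build steps (nothing is executed)."""
    return {
        # NOTE: flag names are assumptions; the input-prefix argument is
        # left out because its value is not stated in this document.
        "arc-agi-1": [
            "python", "-m", "dataset.build_arc_dataset",
            "--output-dir", "data/arc1concept-aug-1000",
            "--subsets", "training", "evaluation", "concept",
            "--test-set-name", "evaluation",
        ],
        "arc-agi-2": [
            "python", "-m", "dataset.build_arc_dataset",
            "--output-dir", "data/arc2concept-aug-1000",
            "--subsets", "training2", "evaluation2", "concept",
            "--test-set-name", "evaluation2",
        ],
        "sudoku-extreme": [
            "python", "dataset/build_sudoku_dataset.py",
            "--output-dir", "data/sudoku-extreme-1k-aug-1000",
            "--subsample-size", "1000",
            "--num-aug", "1000",
        ],
        "maze-hard": ["python", "dataset/build_maze_dataset.py"],
    }


commands = build_dataset_commands()
for name, args in commands.items():
    print(f"# {name}")
    print(shlex.join(args))
```

Keeping the ARC-AGI-1 and ARC-AGI-2 builds as separate entries with separate output directories mirrors the contamination caveat: each benchmark gets its own training data and its own test set.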
Section 7: Experiments and Expected Outcomes
- The experimental plan targets multiple tasks to illustrate the versatility and effectiveness of TRM in different problem domains. Each task has its expected performance window, runtime considerations, and architectural configuration.
- Sudoku-Extreme experiments:
- Pretraining configuration (example):
- run_name: pretrain_mlp_t_sudoku
- arch: trm
- data_paths: [data/sudoku-extreme-1k-aug-1000]
- evaluators: []
- epochs: 50000
- eval_interval: 5000
- learning rates: lr 1e-4, puzzle_emb_lr 1e-4
- regularization: weight_decay 1.0 for both standard and puzzle embedding components
- architectural settings: L_layers=2, H_cycles=3, L_cycles=6
- expected outcome: approximately 87% exact-accuracy, plus or minus about 2 percentage points
- Alternative pretraining: pretrain_att_sudoku
- Same training length and evaluation cadence as above, with an architectural variant to explore robustness (the run names suggest an attention-based "att" variant in place of the MLP-based "mlp" one).
- Runtime expectation: under 20 hours on the specified hardware.
- Maze-Hard experiments:
- Pretraining configuration for four-GPU setups:
- run_name: pretrain_att_maze30x30
- multi-GPU orchestration using torchrun with 4 processes
- data_paths: [data/maze-30x30-hard-1k]
- epochs: 50000
- eval_interval: 5000
- learning rates: lr 1e-4, puzzle_emb_lr 1e-4
- regularization: weight_decay 1.0 for both standard and puzzle embeddings
- architectural settings: L_layers=2, H_cycles=3, L_cycles=4
- expected outcome: if successful, results should demonstrate strong spatial reasoning capabilities within the 30x30 maze space.
- A variant enabling single-GPU training by reducing batch size:
- run_name: pretrain_att_maze30x30_1gpu
- data_paths: [data/maze-30x30-hard-1k]
- global_batch_size: 128
- same architectural settings as above
- runtime: under 24 hours
- ARC-AGI-1 experiments:
- Configuration exploring 4-GPU training:
- run_name: pretrain_att_arc1concept_4
- torchrun with 4 processes
- data_paths: [data/arc1concept-aug-1000]
- arch: trm
- architectural settings: L_layers=2, H_cycles=3, L_cycles=4
- ema: True
- expected runtime: approximately 3 days
- ARC-AGI-2 experiments:
- Configuration with 4-GPU setup:
- run_name: pretrain_att_arc2concept_4
- torchrun setup similar to ARC-AGI-1
- data_paths: [data/arc2concept-aug-1000]
- architectural settings: L_layers=2, H_cycles=3, L_cycles=4
- ema: True
- expected runtime: approximately 3 days
- General observations:
- Across tasks, the emphasis is on demonstrating that a small model, through iterative self-improvement, can achieve meaningful performance without resorting to massive-scale training.
- Runtime estimates reflect the computational intensity of recursive reasoning at scale, especially when combined with multi-GPU configurations and extended training durations.
- The design decisions emphasize reproducibility and clarity of training pipelines, enabling other researchers to replicate and compare TRM’s performance on comparable tasks.
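The run configurations listed in this section are key=value settings passed to a training script, launched either directly or through torchrun for multi-GPU runs. The sketch below shows one way such a command line could be assembled. It is illustrative only: the entry-point name `pretrain.py`, the flat key names, and the helpers `to_overrides` and `build_command` are assumptions, and the real project may nest some keys (for example under an architecture prefix).

```python
def to_overrides(config):
    """Turn a config dict into key=value command-line overrides."""
    parts = []
    for key, value in config.items():
        if isinstance(value, (list, tuple)):
            # Lists are written in bracket form, e.g. data_paths=[a,b].
            value = "[" + ",".join(str(v) for v in value) + "]"
        parts.append(f"{key}={value}")
    return parts


def build_command(config, script="pretrain.py", num_gpus=1):
    """Build the launch command; `script` is a hypothetical entry point."""
    if num_gpus == 1:
        launcher = ["python"]
    else:
        # torchrun starts one process per GPU on the local machine.
        launcher = ["torchrun", f"--nproc-per-node={num_gpus}"]
    return launcher + [script] + to_overrides(config)


# Values taken from the Maze-Hard four-GPU configuration above.
maze_config = {
    "data_paths": ["data/maze-30x30-hard-1k"],
    "epochs": 50000,
    "eval_interval": 5000,
    "lr": 1e-4,
    "puzzle_emb_lr": 1e-4,
    "weight_decay": 1.0,
    "L_layers": 2,
    "H_cycles": 3,
    "L_cycles": 4,
}
cmd = build_command(maze_config, num_gpus=4)
print(" ".join(cmd))
```

Holding each experiment as a dictionary makes the differences between runs explicit, for instance L_cycles=6 for Sudoku-Extreme versus L_cycles=4 for the maze and ARC-AGI runs.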
Section 8: Reference and Context
- The work provides bibliographic references for further reading and context:
- Primary paper and arXiv entry:
- Jolicoeur-Martineau, Alexia. “Less is More: Recursive Reasoning with Tiny Networks.” 2025. arXiv:2510.04871.
- Related work in Hierarchical Reasoning Model (HRM):
- Wang, Guan; Li, Jin; Sun, Yuhao; Chen, Xing; Liu, Changling; Wu, Yue; Lu, Meng; Song, Sen; Yadkori, Yasin Abbasi. “Hierarchical Reasoning Model.” 2025. arXiv:2506.21734.
- The codebase references established repositories and codes:
- HRM: a source of inspiration and related implementation ideas.
- HRM Analysis code: a separate repository illustrating hierarchical reasoning model analyses.
- The TRM project is presented as a practical extension of ideas from HRM, with a focus on minimal parameterization and recursive refinement rather than hierarchical design complexity.
- Readers are encouraged to consult the cited arXiv entries for detailed mathematical formulations, experimental results, and broader discussions about recursive reasoning and model efficiency.
Section 9: Notes on Archival Status and Access
- The source notes a temporary archival status for this repository and several others, due to a combination of issues and limited maintenance capacity.
- The archival status implies:
- The repository is read-only for the time being.
- New issues or updates may be limited, and ongoing work may be paused to some extent.
- Despite the archival status, the description and instructions remain valuable for understanding the TRM concept, its intended methodology, and the experimental design, as well as for replicating experiments in environments with similar resources.
- Users who wish to explore further or reproduce results can rely on:
- The provided dataset preparation scripts and configuration guidelines.
- The documented hardware and software prerequisites.
- The explicit run configurations for Sudoku-Extreme, Maze-Hard, and ARC-AGI tasks.
- Effective collaboration and ongoing development may resume if and when repository maintenance capacity is restored, or when forks and mirrored repositories provide updated resources and clearer guidance.
Section 10: Summary of Key Concepts
- Core idea: A tiny neural network (approximately 7 million parameters) can perform recursive reasoning to refine its own answers over multiple improvement steps, achieving competitive performance without large-scale models.
- Process: Starting from an embedded input x, an initial answer y, and a latent state z, the model repeatedly updates z (recursive reasoning) and then updates y (answer refinement) for K improvement steps.
- Philosophy: “Less is more” in model size, with emphasis on an iterative, self-correcting reasoning loop rather than expanding the number of parameters or relying on fixed-point mathematical structures.
- Practical deployment: A clear pathway for installation, dataset preparation, and experiments across Sudoku, maze navigation, and ARC-AGI tasks, with explicit runtime expectations and hardware considerations.
- Visual aid: The TRM figure encapsulates the data flow and recursive refinement process, providing a tangible reference for how the latent state and answer evolve through the loop.
Section 11: Closing Remarks
- The TRM work contributes a distinctive perspective to the ongoing discourse on model efficiency and generalization: that a carefully engineered, recursive, and compact model can achieve meaningful reasoning capabilities without the crutch of enormous parameter budgets.
- The combination of accessible dataset preparation, explicit experimental protocols, and a clear justification for the architectural choices offers a compelling blueprint for researchers seeking to explore recursive reasoning with small networks.
- The included figure and reference materials help readers connect visually and conceptually with the mechanism behind the Tiny Recursion Model, reinforcing the notion that iterative self-improvement can be a powerful paradigm in machine learning research.
Appendix: Image and Resource Reference
- Figure: TRM (TRM_fig.png)
- URL: https://AlexiaJM.github.io/assets/images/TRM_fig.png
- Description: A schematic depiction of the recursive reasoning process used by the Tiny Recursion Model, illustrating how x, y, and z interact across recursive steps to refine the final answer.
Repository: https://github.com/SamsungSAILMontreal/TinyRecursiveModels