AI Research · 1 min read · Jun 22, 2026, 5:58 PM

AIR: Adaptive Interleaved Reasoning with Code in MLLMs


30-second summary

A new paper introduces AIR, a method enabling multimodal LLMs to adaptively interleave reasoning with code execution, addressing numerical computation gaps in current MLLM tool-use approaches.

Key takeaways

›AIR introduces adaptive interleaved reasoning with code execution for multimodal LLMs (MLLMs), addressing gaps in numerical computation and dynamic problem-solving.
›Existing MLLM tool-use methods rely on predefined heuristics for visual tasks and fail to handle numerical computations effectively.
›The method uses extended reinforcement learning to train MLLMs for adaptive reasoning and code execution.
›The paper positions AIR as a response to the paradigm shift initiated by OpenAI's o3 model.
›AIR aims to enable MLLMs to tackle complex, multi-step tasks requiring both visual and numerical reasoning.

Full story

Researchers propose Adaptive Interleaved Reasoning (AIR), a framework that uses extended reinforcement learning to train multimodal large language models (MLLMs) to alternate dynamically between reasoning steps and code execution. Prior tool-use methods in MLLMs focus on visual perception tasks and rely on predefined heuristics; AIR instead targets numerical computation and adaptive problem-solving, so that models can handle complex, multi-step tasks requiring both visual and numerical reasoning. The paper highlights the limitations of existing MLLM tool-use paradigms and demonstrates AIR's potential to bridge these gaps.
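The alternation between reasoning and code execution can be pictured with a minimal Python sketch. It is not the paper's implementation: the `<code>`, `<output>`, and `<answer>` tags, the `scripted_model` stand-in for an MLLM, and the helper names are all assumptions made for illustration, and the bare `exec` call is not a safe sandbox.

```python
import contextlib
import io
import re

# Hypothetical tag format; the paper's actual prompt format is not given here.
CODE_PATTERN = re.compile(r"<code>(.*?)</code>", re.DOTALL)
OUTPUT_PATTERN = re.compile(r"<output>(.*?)</output>", re.DOTALL)


def run_snippet(source: str) -> str:
    """Execute a Python snippet and return what it printed, or the error."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(source, {})  # illustration only: this is NOT a sandbox
    except Exception as exc:
        return f"error: {exc!r}"
    return buffer.getvalue().strip()


def scripted_model(transcript: str) -> str:
    """Hypothetical stand-in for an MLLM: returns the next turn as text."""
    if "<output>" not in transcript:
        # First turn: reason about the (imagined) image, then request code.
        return (
            "The chart shows bars of height 12, 7 and 5; I should sum them.\n"
            "<code>print(12 + 7 + 5)</code>"
        )
    # Later turn: read the latest execution result and answer.
    result = OUTPUT_PATTERN.findall(transcript)[-1]
    return f"<answer>{result}</answer>"


def interleaved_solve(question: str, model=scripted_model, max_turns: int = 4) -> str:
    """Alternate model turns with code execution until no code is requested."""
    transcript = question
    for _ in range(max_turns):
        turn = model(transcript)
        transcript += "\n" + turn
        match = CODE_PATTERN.search(turn)
        if match is None:
            break  # the model chose to answer without running code
        transcript += f"\n<output>{run_snippet(match.group(1))}</output>"
    return transcript


print(interleaved_solve("What is the total height of the bars?"))
```

The point of the loop is that the model, not a fixed heuristic, decides on each turn whether to emit code; the execution result is appended to the transcript so the next reasoning step can use it.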

Source: AIR: Adaptive Interleaved Reasoning with Code in MLLMs.

Why this matters

Developers

Provides a new framework for training MLLMs to handle numerical and adaptive reasoning tasks, expanding their utility beyond visual perception.

Businesses

Could lead to more capable AI systems for industries requiring multimodal and numerical reasoning, such as robotics, automation, and data analysis.

Investors

Signals progress in MLLM capabilities, potentially increasing investment interest in companies developing advanced multimodal AI systems.

Students

Offers a novel approach to training multimodal models, relevant for research in AI, machine learning, and robotics.

Everyone

Demonstrates advancements in AI's ability to perform complex, multi-step reasoning tasks, bringing us closer to more versatile AI systems.

Glossary

MLLM: Multimodal Large Language Model, an AI system capable of processing and reasoning across multiple types of data, such as text, images, and code.
Interleaved reasoning: A problem-solving approach in which reasoning steps alternate with actions such as code execution or tool use, rather than all reasoning being completed before any action is taken.
Reinforcement learning: A machine learning paradigm where models learn to make decisions by receiving rewards or penalties for their actions.
Tool-use in AI: The ability of AI models to interact with external tools, such as code interpreters or APIs, to enhance their problem-solving capabilities.
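
As a rough illustration of the reinforcement learning entry above, the toy Python sketch below lets an agent learn from rewards and penalties whether to answer directly or run code. It is a two-action epsilon-greedy bandit, not the training procedure from the paper; the action names, success rates, and reward values are invented for this example.

```python
import random


def train_policy(episodes: int = 2000, epsilon: float = 0.1, seed: int = 0) -> dict:
    """Learn action values from +1 / -1 rewards with epsilon-greedy exploration."""
    rng = random.Random(seed)
    actions = ("answer_directly", "run_code")
    # Hypothetical success rates on a numerical task; invented for this sketch.
    success_rate = {"answer_directly": 0.4, "run_code": 0.9}
    value = {action: 0.0 for action in actions}  # running mean reward per action
    count = {action: 0 for action in actions}

    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)  # explore
        else:
            action = max(actions, key=value.get)  # exploit the best estimate
        # Reward for a correct answer, penalty for a wrong one.
        reward = 1.0 if rng.random() < success_rate[action] else -1.0
        count[action] += 1
        # Incremental update of the mean reward observed for this action.
        value[action] += (reward - value[action]) / count[action]
    return value


values = train_policy()
print(max(values, key=values.get), values)
```

Under these assumed success rates the learned value of `run_code` ends up higher, which is the sense in which rewards steer the choice of action.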

AI bias estimate: Neutral academic framing; no overt bias detected. (Automated estimate, not a definitive judgement.)

Summary and analysis generated by AI (mistral). Always verify against the original sources.
