AI Research · 84% · 1 min read · Jun 22, 2026, 5:58 PM

AIR: Adaptive Interleaved Reasoning with Code in MLLMs

30-second summary

A new paper introduces AIR, a method enabling multimodal LLMs to adaptively interleave reasoning with code execution, addressing numerical computation gaps in current MLLM tool-use approaches.

Key takeaways
  • AIR introduces adaptive interleaved reasoning with code execution for multimodal LLMs (MLLMs), addressing gaps in numerical computation and dynamic problem-solving.
  • Existing MLLM tool-use methods rely on predefined heuristics for visual tasks and fail to handle numerical computations effectively.
  • AIR extends reinforcement learning training so that MLLMs learn to alternate between reasoning and code execution as the task requires.
  • The paper positions AIR as a response to the paradigm shift initiated by OpenAI's o3 model.
  • AIR aims to enable MLLMs to tackle complex, multi-step tasks requiring both visual and numerical reasoning.
Full story

Researchers propose Adaptive Interleaved Reasoning (AIR), a framework that extends reinforcement learning to train multimodal large language models (MLLMs) to alternate dynamically between reasoning steps and code execution. Prior tool-use methods for MLLMs focus on visual perception tasks and rely on predefined heuristics; AIR instead targets numerical computation and adaptive problem-solving, so that the model can handle complex, multi-step tasks requiring both visual and numerical reasoning. The paper highlights the limitations of existing MLLM tool-use paradigms and presents AIR as a way to close these gaps.
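To make the idea of interleaving concrete, here is a minimal Python sketch of a loop in which a model alternates between text reasoning and code execution. It is an illustration under stated assumptions, not the paper's implementation: the `<code>` and `<output>` tags, the function names, and the `scripted_model` stand-in (which ignores any image and simply replays two fixed steps) are all hypothetical, and the reinforcement learning training itself is not shown.

```python
import contextlib
import io
import re

# Hypothetical tag format; the summary does not describe the paper's actual format.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)


def run_code(source: str) -> str:
    """Execute a code snippet and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})  # illustration only: a real system would sandbox this
    return buffer.getvalue().strip()


def scripted_model(transcript: str) -> str:
    """Stand-in for an MLLM: returns the next step given the transcript so far."""
    if "<output>" not in transcript:
        # The "model" decides that exact arithmetic is needed and emits code.
        return (
            "The chart shows bars of 12.5, 7.25 and 30. "
            "<code>print(12.5 + 7.25 + 30)</code>"
        )
    # A tool result is available, so the "model" reads it and answers.
    result = transcript.rsplit("<output>", 1)[1].split("</output>")[0]
    return f"Final answer: {result}"


def interleaved_reasoning(question: str, model, max_steps: int = 4) -> str:
    """Alternate between model steps and code execution until no code is requested."""
    transcript = question
    for _ in range(max_steps):
        step = model(transcript)
        transcript += "\n" + step
        match = CODE_BLOCK.search(step)
        if match is None:
            return step  # no code requested: treat this step as the final answer
        # Run the requested code and feed its output back to the model.
        transcript += f"\n<output>{run_code(match.group(1))}</output>"
    return transcript.splitlines()[-1]  # step budget exhausted


print(interleaved_reasoning("What is the total of the three bars?", scripted_model))
```

The adaptive part in this sketch lives entirely in the model: each step may or may not contain code, and the loop only executes code when the model asks for it. In a trained system that decision would be learned rather than scripted.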

Source: AIR: Adaptive Interleaved Reasoning with Code in MLLMs. Read the full piece at the source.

Why this matters
Developers

Provides a new framework for training MLLMs to handle numerical and adaptive reasoning tasks, expanding their utility beyond visual perception.

Businesses

Could lead to more capable AI systems for industries requiring multimodal and numerical reasoning, such as robotics, automation, and data analysis.

Investors

Signals progress in MLLM capabilities, potentially increasing investment interest in companies developing advanced multimodal AI systems.

Students

Offers a novel approach to training multimodal models, relevant for research in AI, machine learning, and robotics.

Everyone

Demonstrates advancements in AI's ability to perform complex, multi-step reasoning tasks, bringing us closer to more versatile AI systems.

Glossary
MLLM
Multimodal Large Language Model, an AI system capable of processing and reasoning across multiple types of data, such as text, images, and code.
Interleaved reasoning
A problem-solving approach in which reasoning steps alternate with actions such as code execution or tool use, rather than proceeding as a single uninterrupted chain of reasoning.
Reinforcement learning
A machine learning paradigm where models learn to make decisions by receiving rewards or penalties for their actions.
Tool-use in AI
The ability of AI models to interact with external tools, such as code interpreters or APIs, to enhance their problem-solving capabilities.

AI bias estimate: Neutral academic framing; no overt bias detected. (Automated estimate, not a definitive judgement.)


Summary and analysis generated by AI (mistral). Always verify against the original sources.


© 2026 TickrWire. Summaries and analysis are AI-generated and may contain errors.