AI Research · 84% · 1 min read · Jun 26, 2026, 4:05 PM

HAT-4D: Lifting Monocular Video for 4D Multi-Object Interactions via Human-Agent Collaboration

30-second summary

Researchers propose HAT-4D, an agentic framework that uses vision-language models (VLMs) to reconstruct the 3D geometry, temporal dynamics, and physical interactions of multiple objects from a single monocular video. It targets the severe occlusions and complex dynamics that arise in multi-object interactions.

Full story

Extracting dynamic 4D object interactions from massive, in-the-wild monocular videos offers a highly efficient data collection pathway for scaling Embodied AI and training VLAs. However, existing monocular 4D reconstruction methods primarily focus on isolated objects, often failing under the severe occlusions and complex dynamics inherent in multi-object interactions. To bridge this gap, we propose HAT-4D, the first agentic framework designed to reconstruct the 3D geometry, temporal dynamics, and physical interactions of multiple objects from a single video. By integrating VLMs with a multi-le

Source: HAT-4D: Lifting Monocular Video for 4D Multi-Object Interactions via Human-Agent Collaboration. Read the full piece at the source.

Summary and analysis generated by AI (mistral). Always verify against the original sources.
