From Aerial Imagery to Multimodal Reasoning: TerraSense Presents AVION at CVPR 2026.
- Mohsen Zardadi
- 4 days ago
TerraSense Presents AVION Research at CVPR 2026

TerraSense Analytics is proud to share that our paper, “AVION: Aerial Vision–Language Instruction from Offline Teacher to Prompt-Tuned Network,” was presented at CVPR 2026 in Denver.
CVPR is one of the world’s leading conferences for computer vision research, bringing together researchers, engineers, and industry leaders advancing the way machines perceive, interpret, and reason about visual information. For TerraSense, presenting at CVPR was an opportunity to contribute to an important and fast-moving area of AI research: how vision-language models can be better adapted to aerial and remote sensing imagery.
Watch the AVION presentation below. For more technical detail, the full AVION paper is available through the official CVPR 2026 program.
Advancing Multimodal Understanding for Remote Sensing
Modern vision-language models have made major progress in connecting visual information with language. However, adapting those models to remote sensing introduces unique challenges.
Aerial and satellite imagery often contains complex scenes, subtle visual differences, shifting perspectives, and objects that may look very different depending on scale, angle, sensor type, or environmental conditions. In these scenarios, a simple class label is rarely enough to capture the full meaning of what is being observed.
AVION was developed to address these challenges.

The paper introduces a knowledge distillation framework designed specifically for remote sensing adaptation. In simple terms, AVION uses a larger “teacher” model to help construct richer semantic understanding, then transfers that knowledge to a lightweight “student” model that can operate more efficiently during inference.
This approach helps the model align visual and textual representations in remote sensing environments, where both visual adaptability and semantic precision matter.
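To make the teacher-student idea concrete, here is a minimal sketch of a generic, textbook-style distillation loss. It is not the AVION training objective, which is described in the paper. The function names, the temperature value, and the example scores are all hypothetical and chosen only for illustration. The sketch compares the teacher's and the student's softened distributions over a few candidate captions and returns a smaller value when the student agrees with the teacher.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = math.log(sum(math.exp(x - m) for x in scaled))
    return [x - m - log_total for x in scaled]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic knowledge-distillation loss; an illustrative assumption,
    not the objective used in the AVION paper.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2

# Hypothetical image-text similarity scores for one aerial image
# against three candidate captions (made-up numbers).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers a different caption

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

Minimizing a loss of this kind pushes the lightweight student toward the teacher's ranking of captions, so only the student needs to run at inference time.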
Why This Matters
For TerraSense, this research connects directly to the broader challenge of helping systems interpret complex visual information in real-world operational environments.
Remote sensing, EO/IR data, aerial imagery, and multimodal AI are becoming increasingly important across defence, public safety, infrastructure monitoring, environmental assessment, and other mission-critical domains. But these applications require more than object detection alone. They require systems that can support better understanding, reasoning, and decision-making across diverse visual inputs.
AVION contributes to this direction by exploring how AI models can better bridge what they see with what they understand.
A Strong Showing From the Research Community
Presenting AVION at CVPR was also a valuable opportunity to connect with the broader computer vision community, exchange ideas, and discuss ongoing advances in multimodal EO/IR understanding and reasoning.
We are grateful to everyone who visited, asked questions, and engaged with the research. Conferences like CVPR are an important reminder that progress in AI comes not only from technical development, but also from collaboration, discussion, and shared curiosity.
The exhibition floor also offered a few memorable moments, including a dancing robot that drew plenty of attention from attendees. It was a fitting reminder that even in a field defined by rigorous research, innovation is still full of curiosity, experimentation, and a little bit of spectacle.
Thank You
A big thank you to our team, collaborators, and everyone who contributed to this work.
Congratulations to the AVION research team:
Yu Hu, Jianyang Gu, Hao Liu, Yue Cao, Jozsef Hamari, Zheng Liu, and Mohsen Zardadi.
We are excited to continue contributing to the future of computer vision, remote sensing, and multimodal AI.

