

Oct 17, 2025
Company / Press Releases
Panasonic Holdings Corporation (Panasonic HD) and Panasonic R&D Company of America (PRDCA), in collaboration with researchers at Stanford University, have developed UniEgoMotion, a technology that enables both current motion estimation and future motion forecasting from egocentric video and head trajectories.
With the rise of first-person cameras and smart glasses capable of recording video, there is growing interest in technologies that can understand and predict human motion from the user’s own perspective. However, estimating motion from egocentric video or head trajectories alone has been technically challenging, often requiring additional third-person views or surrounding scene information, which has limited practical applications. Our newly developed approach, UniEgoMotion, overcomes these limitations: it enables highly accurate 3D motion reconstruction, forecasting, and generation solely from egocentric video and head trajectories, without relying on third-person views or surrounding scene information (Figure 1).
In recognition of its technical contribution, this work has been accepted at the IEEE/CVF International Conference on Computer Vision (ICCV) 2025, a top conference in AI and computer vision, and will be presented at the main conference, to be held in Hawaii, USA, from October 19 to 23, 2025.
Figure 1 Overview of UniEgoMotion
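To make the problem setup concrete, the sketch below shows roughly what goes in and what comes out: a window of first-person frames plus the per-frame head pose is the input, and a sequence of 3D body poses is the output. The array shapes, joint count, and variable names are assumptions for illustration, not the released UniEgoMotion data format.

```python
# A minimal sketch of the problem setup; shapes, joint count, and names are
# assumptions for illustration, not the released UniEgoMotion data format.
import numpy as np

T = 64   # number of frames in the observed window
J = 22   # number of body joints in the predicted skeleton (assumed)

ego_frames = np.zeros((T, 224, 224, 3), dtype=np.uint8)       # first-person RGB video
head_traj = np.tile(np.eye(4, dtype=np.float32), (T, 1, 1))   # head pose per frame as an SE(3) matrix

# The model maps (ego_frames, head_traj) to full-body motion:
#   reconstruction: body poses for the observed frames
#   forecasting:    body poses for frames after the observed window
#   generation:     plausible body poses when frame-level video conditioning is withheld
body_motion = np.zeros((T, J, 3), dtype=np.float32)           # 3D joint positions over time
```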
Panasonic HD and PRDCA are conducting research on egocentric motion modeling. With the growing adoption of wearable devices and smart glasses, the demand for analyzing and predicting human motion from first-person video has been rapidly increasing. However, egocentric videos typically capture only partial views of the user’s body and are affected by camera motion, making accurate motion estimation in real-world scenarios highly challenging.
Addressing this challenge is important for enabling personalized motion assistance and advancing applications in VR/AR. Conventional approaches to motion estimation from egocentric video often require extensive third-person scene information or explicit 3D data such as point clouds and meshes, which makes real-world deployment difficult.
Our newly developed method, UniEgoMotion, introduces an integrated motion diffusion model that reconstructs, forecasts, and generates motion using only egocentric video and head trajectory data from wearable devices, without relying on 3D scene information. By incorporating a head-centric motion representation well suited to head-mounted devices and leveraging a DINOv2-based (*1) image encoder for strong feature extraction, UniEgoMotion achieves higher accuracy and better generalization than previous methods. Furthermore, the model employs a masking strategy during training, enabling a single AI model to handle not only “current motion estimation” but also “future motion forecasting” and “novel motion generation” (Figure 2).
Figure 2 The architecture of UniEgoMotion
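The key idea behind the single-model design is that the conditioning signal is partially masked during training, so the same denoiser learns reconstruction, forecasting, and generation depending on which frames it is allowed to see. Below is a minimal PyTorch-style sketch of that idea, assuming hypothetical module names, tensor shapes, and masking rules; it is not the authors' released implementation.

```python
# Hedged sketch of task-dependent condition masking in a motion diffusion
# model. Module names, shapes, and the masking rules are assumptions for
# illustration, not the released UniEgoMotion code.
import torch
import torch.nn as nn

class UnifiedEgoDenoiser(nn.Module):
    def __init__(self, motion_dim=135, img_feat_dim=384, head_dim=9, hidden=512):
        super().__init__()
        self.cond_proj = nn.Linear(img_feat_dim + head_dim, hidden)
        self.motion_proj = nn.Linear(motion_dim, hidden)
        self.time_embed = nn.Sequential(nn.Linear(1, hidden), nn.SiLU(), nn.Linear(hidden, hidden))
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.out = nn.Linear(hidden, motion_dim)

    def forward(self, noisy_motion, t, img_feats, head_traj, cond_mask):
        # cond_mask: (B, T, 1), 1 where the frame-level condition is visible to the model.
        cond = torch.cat([img_feats, head_traj], dim=-1) * cond_mask
        x = self.motion_proj(noisy_motion) + self.cond_proj(cond)
        x = x + self.time_embed(t.float().view(-1, 1, 1))  # broadcast diffusion-step embedding
        return self.out(self.backbone(x))

def sample_task_mask(batch_size, num_frames, device="cpu"):
    """Pick a task per sample and mask the frame-level conditioning accordingly."""
    mask = torch.ones(batch_size, num_frames, 1, device=device)
    for b in range(batch_size):
        task = torch.randint(0, 3, (1,)).item()
        if task == 1:                     # forecasting: hide conditions for future frames
            mask[b, num_frames // 2:] = 0.0
        elif task == 2:                   # generation: hide all frame-level conditions
            mask[b, :] = 0.0
    return mask

# Usage: the same denoiser is trained with whichever mask was sampled.
B, T = 4, 64
model = UnifiedEgoDenoiser()
pred = model(
    noisy_motion=torch.randn(B, T, 135),
    t=torch.randint(0, 1000, (B,)),
    img_feats=torch.randn(B, T, 384),
    head_traj=torch.randn(B, T, 9),
    cond_mask=sample_task_mask(B, T),
)
```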
In our evaluation experiments, we compared UniEgoMotion with existing methods on the task of “motion reconstruction,” which estimates the current motion from egocentric video. UniEgoMotion achieved superior accuracy in pose reproduction and better scores on metrics assessing the naturalness of motion, outperforming conventional approaches (Figure 3).
Figure 3 The results of the experiment
For all metrics except Semantic Sim., lower scores indicate better performance.
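As an illustration of what a lower-is-better pose-accuracy metric typically measures, the snippet below computes a mean per-joint position error between predicted and ground-truth joint positions. This is a common metric in this area and is shown only as an example; the exact metrics reported in Figure 3 are defined in the paper.

```python
# Illustrative computation of a typical lower-is-better pose metric
# (mean per-joint position error); the specific metrics in Figure 3 are
# defined in the paper and may differ in detail.
import numpy as np

def mean_per_joint_position_error(pred, gt):
    """pred, gt: arrays of shape (T, J, 3) holding 3D joint positions in metres."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Toy example: a prediction offset by 5 cm on every joint gives an error of 0.05 m.
gt = np.zeros((10, 22, 3))
pred = gt + np.array([0.05, 0.0, 0.0])
print(mean_per_joint_position_error(pred, gt))  # 0.05
```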
The newly developed UniEgoMotion is the world’s first integrated motion diffusion model capable of reconstructing, forecasting, and generating high-precision 3D motion from egocentric video and head trajectory—all within a single model. This technology is expected to be applied across a wide range of business domains—not only by expanding the scope of on-site operation visualization and efficiency improvement, but also through real-time motion analysis, on-site work assistance, and motion monitoring in rehabilitation and healthcare fields.
Going forward, Panasonic HD will continue to accelerate the social implementation of AI and promote research and development of AI technologies that are useful in our customers' lives and workplaces.
*1 DINOv2 is a large-scale model capable of extracting high-quality features from images.
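For reference, the publicly released DINOv2 models can be loaded through torch.hub to extract a feature vector from a single frame, as sketched below. How UniEgoMotion integrates such features into its encoder follows the paper, not this snippet.

```python
# Minimal example of extracting DINOv2 image features with the public
# torch.hub release; how these features are used inside UniEgoMotion is
# described in the paper, not shown here.
import torch

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

# A dummy egocentric frame: batch of 1, RGB, 224x224 (a multiple of the
# 14-pixel patch size), assumed to be already normalized as the model expects.
frame = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    features = model(frame)   # global image feature, shape (1, 384) for ViT-S/14

print(features.shape)
```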
Paper “UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation”
This paper is the result of joint research by Chaitanya Patel, Juan Carlos Niebles, and Ehsan Adeli (Stanford University), Yuta Kyuragi (PRDCA), and Hiroki Nakamura and Kazuki Kozuka (Panasonic HD).
arXiv Link https://arxiv.org/abs/2508.01126
Additionally, the collaborative research “Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection” was also accepted for ICCV 2025.
For more details, please refer to the press release below.
[Press Release] Panasonic Holdings develops “Reflect-DiT”, an Image Generation Technology where AI reflects on its own output to improve (Oct 17, 2025)
https://news.panasonic.com/global/press/en251017-2
Also, the following research paper from Panasonic Connect was accepted for the ICCV 2025 LIMIT Workshop (Representation Learning with Very Limited Resources: When Data, Modalities, Labels, and Computing Resources are Scarce).
Intraclass Compactness: A Metric for Evaluating Models Pre-Trained on Various Synthetic Data
Tomoki Suzuki, Kazuki Maeno, Yasunori Ishii, Takayoshi Yamashita
https://openreview.net/pdf?id=G2WCuOVVti