Omnidirectional cameras are a suitable and cost-effective choice for Visual Place Recognition (VPR), as they capture the full surrounding scene regardless of the robot's orientation. However, vision sensors are vulnerable to changes in environmental appearance (e.g., illumination, season). While multi-modal approaches can overcome these challenges, they add significant cost and system complexity. This paper introduces a novel fusion framework that enhances VPR robustness by integrating visual data with geometric features derived from monocular depth estimation, while retaining a single-camera setup. An ablation study evaluates both early and late fusion strategies for optimally combining appearance-based and depth-derived descriptors. Extensive evaluation on challenging indoor and outdoor datasets demonstrates that the proposed method consistently boosts retrieval performance across multiple state-of-the-art VPR backbones. Furthermore, this improvement is achieved without requiring end-to-end retraining, allowing our method to function as a pluggable module for pre-trained models. Consequently, this work presents a powerful, practical, and low-cost solution for robust VPR, with strong potential to scale as monocular depth estimation and VPR models continue to improve.
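
The abstract does not specify how the two fusion strategies operate; as a minimal illustrative sketch (not the paper's actual implementation), assuming the appearance-based and depth-derived descriptors are fixed-length vectors and using a hypothetical blending weight `w`, early fusion might concatenate normalized descriptors before retrieval, while late fusion might blend per-modality similarity scores:

```python
import numpy as np

def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each descriptor to unit L2 norm so modalities contribute comparably."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def early_fusion(visual_desc: np.ndarray, depth_desc: np.ndarray, w: float = 0.5) -> np.ndarray:
    """Early fusion (assumed form): weight and concatenate the normalized
    appearance and depth descriptors into one global descriptor before retrieval."""
    v = np.sqrt(w) * l2_normalize(visual_desc)
    d = np.sqrt(1.0 - w) * l2_normalize(depth_desc)
    return l2_normalize(np.concatenate([v, d], axis=-1))

def late_fusion_scores(query_v, db_v, query_d, db_d, w: float = 0.5) -> np.ndarray:
    """Late fusion (assumed form): retrieve with each modality separately,
    then blend the two cosine-similarity score matrices."""
    sim_v = l2_normalize(query_v) @ l2_normalize(db_v).T
    sim_d = l2_normalize(query_d) @ l2_normalize(db_d).T
    return w * sim_v + (1.0 - w) * sim_d

# Toy usage: 3 queries, 10 database images, 256-D visual / 128-D depth descriptors.
rng = np.random.default_rng(0)
qv, dv = rng.normal(size=(3, 256)), rng.normal(size=(10, 256))
qd, dd = rng.normal(size=(3, 128)), rng.normal(size=(10, 128))

fused_q, fused_db = early_fusion(qv, qd), early_fusion(dv, dd)
print(np.argmax(fused_q @ fused_db.T, axis=1))                 # top-1 match per query (early fusion)
print(np.argmax(late_fusion_scores(qv, dv, qd, dd), axis=1))   # top-1 match per query (late fusion)
```

Because both strategies operate on descriptors produced by frozen backbones, neither requires end-to-end retraining, which is what allows the approach to act as a pluggable module for pre-trained VPR models.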