Multi-Sense-Rescuer: Multi-Target Audio-Visual Learning
and Navigation in Search and Rescue Scenarios
IROS Learning Robot Super Autonomy Workshop 2023
Under Review at ICRA 2024

Abstract


In many autonomous navigation applications, audio is an essential source of information. This study investigates how effectively transfer learning supports optimal path planning through multiple sound-emitting destinations. The problem is challenging because features must be extracted from mixed audio signals and because multi-destination path planning is combinatorially complex. Expanding beyond the existing reinforcement learning work on the single-sound-source scenario, we present a multi-target formulation and explore how effectively fine-tuning a pre-trained agent adapts its performance to the multi-sound-source scenario. We rigorously evaluate the proposed multi-source approach on the widely adopted Matterport3D dataset. The test results show that transfer learning accelerates training by more than one order of magnitude.
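To make the combinatorial complexity mentioned above concrete, the following minimal sketch exhaustively searches every visiting order for a handful of targets on a grid. It is only an illustration and not the method studied here, which uses reinforcement learning. The grid coordinates, the Manhattan distance, and the function names are assumptions made for the example.

```python
from itertools import permutations
from math import factorial


def manhattan(a, b):
    """Grid distance between two (x, y) cells (an assumed stand-in for path length)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def best_visiting_order(start, targets):
    """Try every visiting order and return (lowest total cost, order achieving it)."""
    best_cost, best_order = None, None
    for order in permutations(targets):
        cost, pos = 0, start
        for target in order:
            cost += manhattan(pos, target)
            pos = target
        if best_cost is None or cost < best_cost:
            best_cost, best_order = cost, order
    return best_cost, best_order


# Hypothetical start cell and sound-source locations.
start = (0, 0)
targets = [(5, 5), (1, 0), (2, 4)]

cost, order = best_visiting_order(start, targets)
print(cost, order)

# The number of candidate orders grows factorially with the number of targets.
print(factorial(len(targets)), factorial(10))
```

With three targets there are only 6 orders to check, but ten targets already give 3,628,800, which is why exhaustive search does not scale and a learned policy is attractive.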

Hypotheses

H.1 An audio-visual navigation policy pre-trained in single-target scenarios transfers to multi-target scenarios. In particular, pre-training an agent on a single-target task speeds up convergence in multi-target scenarios and reaches optimal performance in a fraction of the training updates.
H.2 An audio-visual navigation policy pre-trained in single-target scenarios transfers to scenarios with a random number of targets. Specifically, an agent pre-trained on the single-target task generalizes effectively to an arbitrary number of destinations.
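The intuition behind H.1 can be sketched with a deliberately simple toy: gradient descent on a one-dimensional quadratic loss, started either from scratch or from a warm-start value that is already close to the optimum. This is not the agent, environment, or training procedure used in this work. The loss, the learning rate, and all numeric values are hypothetical and chosen only to show why a good initialization can reduce the number of updates.

```python
def updates_to_converge(w0, optimum, lr=0.1, tol=1e-3, max_updates=10_000):
    """Run gradient descent on (w - optimum)^2 and count updates until loss < tol."""
    w = w0
    for step in range(max_updates):
        loss = (w - optimum) ** 2
        if loss < tol:
            return step
        w -= lr * 2 * (w - optimum)  # gradient of the quadratic loss
    return max_updates


# Hypothetical values: the "multi-target" optimum and two starting points.
multi_target_optimum = 10.0
scratch_init = 0.0      # training from scratch
pretrained_init = 9.0   # assumed to be close because of single-target pre-training

scratch_updates = updates_to_converge(scratch_init, multi_target_optimum)
warm_updates = updates_to_converge(pretrained_init, multi_target_optimum)
print(scratch_updates, warm_updates)
```

The warm-started run needs fewer updates only because its starting point is assumed to lie near the new optimum. Whether a single-target policy actually provides such a starting point for the multi-target task is what the hypotheses test.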

Results


Comparison of training cost with and without transfer learning


Test trajectories


Videos






Acknowledgements

We thank the Research Computing department of University of South Carolina for providing compute support.

The website template was borrowed from Michaël Gharbi.