See installation instructions.
| Dataset | Format | Type | URL |
|---|---|---|---|
| EigenScape | em32 | real | Link |
| STARSS23 | mic & em32 | real | Link |
| LOCATA | em32 | real | Link |
| SpatialScaper Simulated Audio | mic & em32 | synthetic | Link |
See more details on how to generate the HDF dataset.
Use `train.py` to train the model.
- `-h`, display help information
- `-C, --config`, specify the configuration file required for training
- `-R, --resume`, continue training from the checkpoint of the last saved model
Please refer to config/train/README to understand how to set up your training config.
Example:
# The configuration file used to train the model is "config/train/train.json"
python train.py -C config/train/train.json
# continue training from the last saved model checkpoint
python train.py -C config/train/train.json -R
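For readers who want to see how flags like these are typically wired up, here is a minimal sketch using Python's standard `argparse`. It is an illustration only: `build_parser` is a hypothetical helper name, and the actual `train.py` may define its arguments differently. Note that `argparse` provides `-h` automatically.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented training flags."""
    parser = argparse.ArgumentParser(
        description="Train the model (illustrative sketch, not the real train.py)."
    )
    # -C / --config: path to the JSON training configuration (assumed required).
    parser.add_argument("-C", "--config", required=True, type=str,
                        help="Configuration file required for training (*.json).")
    # -R / --resume: boolean switch, off unless the flag is passed.
    parser.add_argument("-R", "--resume", action="store_true",
                        help="Continue training from the last saved checkpoint.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(["-C", "config/train/train.json", "-R"])
print(args.config, args.resume)
```

Modeling `--resume` as a `store_true` switch matches the usage above, where `-R` is passed without a value.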
Use `infer.py` to run inference with a pre-trained model.
- `-h`, display help information
- `-D, --device`, GPU index to use (default: 0, which is also the value for a single GPU)
- `-C, --config`, configuration for k-means inference (*.json)
Please refer to config/infer/README to understand how to set up your inference config.
python infer.py -C /path/to/config/inference.json -D 0
Example:
python infer.py -C config/inference/inference.json -D 0
python doa_metrics.py -C /path/to/config/inference.json
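If you run inference and metrics back to back, the two commands above can be chained from Python. The sketch below assumes, as the commands above suggest, that `doa_metrics.py` takes the same config file as `infer.py`; `build_commands` and `run_all` are hypothetical helper names, not part of this repository.

```python
import subprocess
import sys
from typing import List


def build_commands(config_path: str, device: int = 0) -> List[List[str]]:
    """Return the inference command followed by the metrics command."""
    # sys.executable keeps both steps on the current Python interpreter.
    infer_cmd = [sys.executable, "infer.py", "-C", config_path, "-D", str(device)]
    metrics_cmd = [sys.executable, "doa_metrics.py", "-C", config_path]
    return [infer_cmd, metrics_cmd]


def run_all(config_path: str, device: int = 0) -> None:
    """Run inference, then metrics; stop at the first failing step."""
    for cmd in build_commands(config_path, device):
        # check=True raises subprocess.CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling `run_all("config/inference/inference.json", device=0)` from the repository root would then execute both steps in order.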
Use LAM's spherical acoustic maps (SAMs) as input features for a SELD network (DCASE-style). Please refer to the seld directory, where you can perform batch feature extraction of SAMs and then train a network to perform DOA estimation on datasets like STARSS23 or LOCATA.
# Run tensorboard pointing to your directory of logs generated during training
tensorboard --logdir train
# You can use --port to specify the port of the tensorboard static server
tensorboard --logdir train --port <port> --bind_all
| Model | Input | Checkpoint |
|---|---|---|
| UpLAM | 4-channel | UpLAM.pth |
| LAM | 32-channel | LAM.pth |
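
As a small convenience, the table can be encoded as a lookup keyed by the number of input channels. This is only a sketch: `CHECKPOINTS` and `select_checkpoint` are hypothetical names, and the file names are taken from the table without any directory, since the download location depends on your setup.

```python
from typing import Dict, Tuple

# Input channel count -> (model name, checkpoint file name), copied from the table.
CHECKPOINTS: Dict[int, Tuple[str, str]] = {
    4: ("UpLAM", "UpLAM.pth"),
    32: ("LAM", "LAM.pth"),
}


def select_checkpoint(num_channels: int) -> Tuple[str, str]:
    """Return (model name, checkpoint file) for a given input channel count."""
    try:
        return CHECKPOINTS[num_channels]
    except KeyError:
        supported = sorted(CHECKPOINTS)
        raise ValueError(
            f"No checkpoint for {num_channels}-channel input; supported: {supported}"
        ) from None


print(select_checkpoint(32))
```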
If you find our work useful, please cite our paper:
@article{roman2025latent,
title={Latent Acoustic Mapping for Direction of Arrival Estimation: A Self-Supervised Approach},
author={Roman, Adrian S and Roman, Iran R and Bello, Juan P},
journal={IEEE Workshop on Applications of Signal Processing to Audio and Acoustics},
year={2025}
}