Latent Acoustic Mapping for Direction of Arrival Estimation: A Self-Supervised Approach

Installation

Datasets

Dataset	Format	Type	URL
EigenScape	em32	real	Link
STARSS23	mic & em32	real	Link
LOCATA	em32	real	Link
SpatialScaper Simulated Audio	mic & em32	synthetic	Link

Generate dataset

See more details on how to generate the HDF dataset.

Training

Use train.py to train the model.

-h, display help information
-C, --config, specify the configuration file required for training
-R, --resume, continue training from the checkpoint of the last saved model

Please refer to config/train/README to understand how to set up your training config.

Example:

# The configuration file used to train the model is "config/train/train.json"
python train.py -C config/train/train.json

# continue training from the last saved model checkpoint
python train.py -C config/train/train.json -R
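
As a rough illustration of how the flags above behave, here is a minimal sketch of an argparse parser with the same options. It is not the actual contents of train.py; the function names `build_parser` and `parse_cli` are hypothetical and exist only for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented train.py flags."""
    parser = argparse.ArgumentParser(description="Training CLI (illustrative sketch)")
    # -C / --config: path to the JSON training configuration (assumed required here)
    parser.add_argument("-C", "--config", required=True, type=str,
                        help="configuration file required for training (*.json)")
    # -R / --resume: boolean switch that stays False unless the flag is given
    parser.add_argument("-R", "--resume", action="store_true",
                        help="continue training from the last saved checkpoint")
    return parser


def parse_cli(argv):
    """Parse an argument list and return (config_path, resume)."""
    args = build_parser().parse_args(argv)
    return args.config, args.resume


# Equivalent of: python train.py -C config/train/train.json -R
config_path, resume = parse_cli(["-C", "config/train/train.json", "-R"])
print(config_path, resume)
```

Under this sketch, omitting -R leaves `resume` set to False, which corresponds to starting a fresh training run.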

Inference

Use infer.py to run inference with a pre-trained model.

-h, display help information
-D, --device, GPU index to use (default: 0, for a single GPU)
-C, --config, specify the configuration file for k-means inference (*.json)

Please refer to config/infer/README to understand how to set up your inference config.

python infer.py -C /path/to/config/inference.json -D 0

Example:

python infer.py -C config/inference/inference.json -D 0
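
The options above mention k-means inference, but this README does not describe the procedure that infer.py follows. Purely to illustrate the general idea of clustering the high-energy directions of an acoustic map into source estimates, here is a minimal sketch in plain Python. Everything in it is an assumption: the function `cluster_directions`, the input format (azimuth in degrees, elevation in degrees, non-negative intensity), the relative threshold, and the deterministic farthest-point initialisation are hypothetical and are not taken from the repository.

```python
import math


def to_unit(az_deg, el_deg):
    """Convert azimuth/elevation in degrees to a unit vector (x, y, z)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def to_angles(v):
    """Convert a unit vector back to (azimuth, elevation) in degrees."""
    x, y, z = v
    return (math.degrees(math.atan2(y, x)),
            math.degrees(math.asin(max(-1.0, min(1.0, z)))))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cluster_directions(samples, k, rel_threshold=0.5, iters=20):
    """Cluster strong map samples into k directions (hypothetical helper).

    samples: iterable of (azimuth_deg, elevation_deg, intensity) tuples.
    Only samples with intensity >= rel_threshold * max intensity are kept.
    """
    samples = list(samples)
    top = max(w for _, _, w in samples)
    pts = [(to_unit(az, el), w) for az, el, w in samples if w >= rel_threshold * top]
    # Deterministic initialisation: start at the strongest sample, then
    # repeatedly add the point that is farthest from all chosen centroids.
    centroids = [max(pts, key=lambda p: p[1])[0]]
    while len(centroids) < k:
        centroids.append(min(pts, key=lambda p: max(dot(p[0], c) for c in centroids))[0])
    for _ in range(iters):
        sums = [[0.0, 0.0, 0.0] for _ in centroids]
        for v, w in pts:
            # Assign each point to the centroid with the largest cosine similarity.
            j = max(range(len(centroids)), key=lambda i: dot(v, centroids[i]))
            for d in range(3):
                sums[j][d] += w * v[d]
        updated = []
        for s, c in zip(sums, centroids):
            norm = math.sqrt(dot(s, s))
            # Intensity-weighted mean direction; keep the old centroid if empty.
            updated.append(tuple(x / norm for x in s) if norm > 0 else c)
        centroids = updated
    return [to_angles(c) for c in centroids]


# Two synthetic sources plus one weak sample that falls below the threshold.
samples = [(28, 0, 1.0), (30, 0, 1.0), (32, 0, 1.0),
           (-122, 40, 0.9), (-120, 40, 0.9), (-118, 40, 0.9),
           (90, -60, 0.1)]
estimates = cluster_directions(samples, k=2)
print(estimates)
```

Clustering on unit vectors rather than raw angles avoids the wrap-around problem at ±180° azimuth, which is why the sketch converts to Cartesian coordinates before averaging.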

DoA Metrics from Inferred K-means Output

python doa_metrics.py -C /path/to/config/inference.json
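
This README does not list which metrics doa_metrics.py reports. A common quantity in DoA evaluation is the angular error between an estimated and a reference direction, measured as the great-circle angle on the unit sphere. The sketch below shows that computation; the function names and the (azimuth, elevation) degree convention are assumptions made for illustration and may differ from the script.

```python
import math


def angular_error_deg(az_est, el_est, az_ref, el_ref):
    """Great-circle angle in degrees between two directions given in degrees."""
    a1, e1, a2, e2 = map(math.radians, (az_est, el_est, az_ref, el_ref))
    # Spherical law of cosines applied to azimuth/elevation coordinates.
    cos_angle = (math.sin(e1) * math.sin(e2)
                 + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def mean_angular_error_deg(estimates, references):
    """Mean angular error over already-paired (azimuth, elevation) tuples."""
    errors = [angular_error_deg(ae, ee, ar, er)
              for (ae, ee), (ar, er) in zip(estimates, references)]
    return sum(errors) / len(errors)


print(angular_error_deg(0.0, 0.0, 90.0, 0.0))
print(mean_angular_error_deg([(10.0, 0.0), (0.0, 45.0)], [(20.0, 0.0), (0.0, 40.0)]))
```

The sketch assumes estimates and references are already matched one to one; with multiple simultaneous sources, an assignment step would be needed before averaging.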

Sound Event Localization using LAM

Use LAM's spherical acoustic maps (SAMs) as input features for a DCASE-style SELD network. Please refer to the seld directory, where you can batch-extract SAM features and then train a network to perform DoA estimation on datasets such as STARSS23 or LOCATA.

Visualization

# Run tensorboard pointing to your directory of logs generated during training
tensorboard --logdir train

# You can use --port to specify the port of the tensorboard static server
tensorboard --logdir train --port <port> --bind_all

Pre-trained Models

Model	Input	Checkpoint
UpLAM	4-channel	UpLAM.pth
LAM	32-channel	LAM.pth

Citation

If you find our work useful, please cite our paper:

@article{roman2025latent,
  title={Latent Acoustic Mapping for Direction of Arrival Estimation: A Self-Supervised Approach},
  author={Roman, Adrian S and Roman, Iran R and Bello, Juan P},
  journal={IEEE Workshop on Applications of Signal Processing to Audio and Acoustics},
  year={2025}
}
