Hello everyone.
I am currently working on audio source separation using the DPRNNTasNet model, and I have run into an issue with the output when performing separation. The training loss (a PITLossWrapper around pairwise_neg_sisdr) reaches about -10 during training.
I am using the following setup:
Model architecture: DPRNNTasNet
Loss function: PITLossWrapper with pairwise_neg_sisdr (set up as in the sketch after this list)
Checkpoint: Trained model from checkpoint
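For reference, the loss is constructed along these lines; the pit_from="pw_mtx" mode is an assumption based on the standard Asteroid recipes:

from asteroid.losses import PITLossWrapper, pairwise_neg_sisdr

# Permutation-invariant negative SI-SDR: a training loss of -10 corresponds
# to roughly 10 dB SI-SDR on the training batches.
loss_func = PITLossWrapper(pairwise_neg_sisdr, pit_from="pw_mtx")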
After performing the separation, the two output signals (separated_1.wav and separated_2.wav) are nearly identical, so there may be an issue with how the separation is performed or how the sources are generated.
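One quick way to quantify that similarity (a sketch, assuming mono files at the paths written by the code below):

import numpy as np
import soundfile as sf

s1, _ = sf.read("./output/separated_1.wav")
s2, _ = sf.read("./output/separated_2.wav")
# Normalized correlation close to 1.0 means the model returns near-copies
corr = np.dot(s1, s2) / (np.linalg.norm(s1) * np.linalg.norm(s2) + 1e-8)
print(f"Correlation between the separated sources: {corr:.3f}")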
Would you be able to provide any insights or suggestions on why this might be happening? Perhaps there are specific parameters I need to adjust or additional steps I might have missed during the separation process?
Thanks
Code:
------------------------------------------------------------------------
import os
import torch
import soundfile as sf
from asteroid.models import DPRNNTasNet

def separate_and_save(input_wav, model_path, output_dir, use_gpu=True):
    """
    Separate the sources from a mixed audio file using a trained DPRNNTasNet model.

    Args:
        input_wav (str): Path to the mixed audio file.
        model_path (str): Path to the trained model checkpoint.
        output_dir (str): Directory to save the separated audio files.
        use_gpu (bool): Whether to use GPU for inference or not.
    """
    # Rebuild the model from the hyperparameters stored in the checkpoint
    checkpoint = torch.load(model_path, map_location='cpu')
    params = checkpoint['training_config']
    model = DPRNNTasNet(
        n_src=params['masknet']['n_src'],
        n_repeats=params['masknet']['n_repeats'],
        bn_chan=params['masknet']['bn_chan'],
        hid_size=params['masknet']['hid_size'],
        chunk_size=params['masknet']['chunk_size'],
        hop_size=params['masknet']['hop_size'],
        mask_act=params['masknet']['mask_act'],
        bidirectional=params['masknet']['bidirectional'],
        dropout=params['masknet']['dropout'],
        in_chan=params['masknet']['in_chan'],
        out_chan=params['masknet']['out_chan'],
        n_filters=params['filterbank']['n_filters'],
        kernel_size=params['filterbank']['kernel_size'],
        stride=params['filterbank']['stride'],
    )
    model.load_state_dict(checkpoint['state_dict'], strict=False)
    model.eval()
    device = torch.device("cuda" if use_gpu and torch.cuda.is_available() else "cpu")
    model.to(device)

    # Load the mixed audio file (assumed to be mono)
    print(f"Loading input audio from {input_wav}...")
    mix_wav, sample_rate = sf.read(input_wav)
    mix_tensor = torch.tensor(mix_wav, dtype=torch.float32).unsqueeze(0).to(device)

    # Perform the separation
    with torch.no_grad():
        print("Separating sources...")
        est_sources = model(mix_tensor)  # shape: (batch, n_src, n_samples)
        # est_sources = model.separate(input_wav)
    print("------------------------------------------------------------")
    print(est_sources)
    print("------------------------------------------------------------")

    # Save each separated source as a WAV file
    os.makedirs(output_dir, exist_ok=True)
    output_paths = [f"{output_dir}/separated_{i + 1}.wav" for i in range(est_sources.shape[1])]
    for i, separated_signal in enumerate(est_sources[0]):
        separated_signal_np = separated_signal.cpu().numpy()
        sf.write(output_paths[i], separated_signal_np, sample_rate)
        print(f"Separated audio saved as {output_paths[i]}")
    print("Separation completed!")
input_wav = "13.wav" # مثال: "input/test_mixed.wav"
output_dir = "./output" # مثال: "separated_output/"
model_path = "./exp/tmp/checkpoints/epoch=131-step=351252.ckpt"
separate_and_save(input_wav, model_path, output_dir)
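If the recipe also produced a serialized model file, Asteroid can rebuild the architecture and weights in one call, which sidesteps the manual config handling entirely. A sketch, assuming such a file exists (the best_model.pth path is hypothetical):

from asteroid.models import DPRNNTasNet

# from_pretrained reads the architecture and weights saved by model.serialize()
model = DPRNNTasNet.from_pretrained("./exp/tmp/best_model.pth")
# Given a file path, separate() writes the estimated sources next to the
# input file (e.g. 13_est1.wav, 13_est2.wav).
model.separate("13.wav")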