joint_norm in spectral whitening does not work with missing data on multiple cores #89

@jojomale

Avoid duplicates

  • I searched existing issues

Bug Summary

I encountered an AssertionError when computing cross-correlations using gappy data and spectral whitening with joint_norm=True on multiple cores.

I used two 3-component stations, one of which is missing a couple of hours of data on all 3 components. When the Correlator processes the first time window with missing data (i.e. 3 traces fewer), it ends up with a (3, Nsamples) matrix instead of a (6, Nsamples) one. These 3 rows are the data from one station. The matrix is still split across the 2 cores, which receive a (2, Nsamples) and a (1, Nsamples) block. But joint normalization obviously needs all three channels of a station on the same core, which is (roughly) checked by the assertion that the number of rows (i.e. data traces) is a multiple of 3. So I believe the problem is that the streams are simply converted into a matrix (technically an np.array), after which the rows no longer have an identity and are distributed evenly across the cores.
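To illustrate the splitting described above, here is a minimal sketch in plain Python. It is not SeisMIC's actual code: `split_rows_evenly` and `passes_joint_norm_check` are hypothetical helpers that only mimic an even distribution of row indices and the `B.shape[0] % 3 == 0` check from the traceback.

```python
def split_rows_evenly(n_rows, n_cores):
    """Hypothetical even split of row indices across cores (not SeisMIC code)."""
    base, extra = divmod(n_rows, n_cores)
    chunks, start = [], 0
    for rank in range(n_cores):
        # The first `extra` cores receive one additional row.
        size = base + (1 if rank < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


def passes_joint_norm_check(chunk):
    """Mirror of the assertion in the traceback: row count must be a multiple of 3."""
    return len(chunk) % 3 == 0


# Two complete 3-component stations on 2 cores: each core gets one station.
full = split_rows_evenly(6, 2)
# One station missing: the remaining 3 rows are torn apart across the cores.
gappy = split_rows_evenly(3, 2)

full_ok = all(passes_joint_norm_check(c) for c in full)
gappy_ok = all(passes_joint_norm_check(c) for c in gappy)
```

With 6 rows every core holds a complete station, whereas with 3 rows the single remaining station is split into 2 + 1 rows and the check fails on both cores.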

It seems that if joint normalization is requested, the distribution of the matrix rows to the different cores needs to be managed more carefully. Maybe the mapping indices could be derived from a list of seedids?
Besides, in my case all three channels of one station were missing, but I suppose there could also be just one broken channel, and then things might get really messy. Maybe one could fill the missing channel with NaNs, if the processing functions can deal with that, or remove the station entirely for the affected time window.
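A rough sketch of what deriving the mapping from seedids could look like, in plain Python. Everything here is an assumption for illustration: the helper names, the example station codes, and the choice to drop incomplete stations rather than pad them with NaNs. It also assumes seedids follow the usual `NET.STA.LOC.CHA` pattern and appear in the same order as the matrix rows.

```python
from collections import defaultdict


def group_rows_by_station(seed_ids):
    """Map (net, sta, loc) -> row indices, assuming 'NET.STA.LOC.CHA' ids."""
    groups = defaultdict(list)
    for row, seed_id in enumerate(seed_ids):
        net, sta, loc, _cha = seed_id.split(".")
        groups[(net, sta, loc)].append(row)
    return dict(groups)


def assign_stations_to_cores(seed_ids, n_cores, n_components=3):
    """Hypothetical mapping that keeps all channels of a station on one core.

    Stations with fewer than `n_components` channels are dropped for this
    time window (one of the two options suggested above).
    """
    groups = group_rows_by_station(seed_ids)
    complete = [rows for rows in groups.values() if len(rows) == n_components]
    per_core = [[] for _ in range(n_cores)]
    # Round-robin over whole stations instead of over single rows.
    for i, rows in enumerate(complete):
        per_core[i % n_cores].extend(rows)
    return per_core


# Made-up example: STA1 is complete, STA2 has lost its N component.
example_ids = [
    "XX.STA1..HHE", "XX.STA1..HHN", "XX.STA1..HHZ",
    "XX.STA2..HHE", "XX.STA2..HHZ",
]
mapping = assign_stations_to_cores(example_ids, n_cores=2)
```

In this example the first core receives rows 0 to 2 and the second core receives nothing, so every core holds a row count that is a multiple of 3. Whether an empty block is handled gracefully downstream would need to be checked separately.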

Code to Reproduce

Error Traceback

Traceback (most recent call last):
  File "/home/jlehr/seismic/playground/debug_jointnorm_error/run_seismic.py", line 57, in <module>
    main()
  File "/home/jlehr/seismic/playground/debug_jointnorm_error/run_seismic.py", line 49, in main
    st = c.pxcorr()
         ^^^^^^^^^^
  File "/home/jlehr/seismic/SeisMIC/src/seismic/correlate/correlate.py", line 354, in pxcorr
    cst.extend(self._pxcorr_inner(st, inv))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jlehr/seismic/SeisMIC/src/seismic/correlate/correlate.py", line 390, in _pxcorr_inner
    A, startlags = self._pxcorr_matrix(A)
                   ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jlehr/seismic/SeisMIC/src/seismic/correlate/correlate.py", line 723, in _pxcorr_matrix
    B[ind, :] = func(B[ind, :], proc['args'], params)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jlehr/seismic/SeisMIC/src/seismic/correlate/preprocessing_fd.py", line 110, in spectralWhitening
    assert B.shape[0] % 3 == 0, "for joint normalization the number\
           ^^^^^^^^^^^^^^^^^^^
AssertionError: for joint normalization the number                      of traces needs to the multiple of 3: 46657
--------------------------------------------------------------------------
prterun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:

SeisMIC Version?

dev-jojomale

Operating System?

Ubuntu

Python Version?

3.11.10

Installation Method?

developer installation / from source / git checkout
