
Issue with RawDataStore Selection and Usage for Custom Data in NoisePy #149

@GeoGarfias


I encountered issues when trying to use my own data with NoisePy after successfully following the tutorial. To prepare my data for cross-correlation, I used "S0B_to_ASDF.py" to convert my mseed data to h5 format, which took care of the necessary pre-processing steps.

Now, I need to perform cross-correlation using the "cross_correlate" function, which requires a RawDataStore, config_parameters, and a CrossCorrelationDataStore as inputs. The tutorial explains three options to create a RawDataStore:

ASDFRawDataStore
PNWDataStore
SCEDCS3DataStore

However, I am struggling to apply any of these options to my data, which is already preprocessed and needs no further preprocessing steps.

I attempted to use ASDFRawDataStore and SCEDCS3DataStore with my preprocessed .h5 data, but neither worked. With ASDFRawDataStore, I followed the tutorial and passed the path of the h5 file as input: "raw_store = ASDFRawDataStore(raw_data_path)". This step didn't raise an error, but an error occurred when calling the cross_correlate function: "cross_correlate(raw_store, config, cc_store)". I tried to understand and resolve the error by examining the ASDFRawDataStore class, but without success.

Regarding SCEDCS3DataStore, I followed the tutorial: I set up the stations with stations = "HAUP,PYKE".split(",") and loaded the StationXML catalog with catalog = XMLStationChannelCatalog(S3_STATION_XML). Then, I created the SCEDCS3DataStore as follows:

S3_DATA = '/Volumes/GeoPhysics_23/users-data/juarezilma/Noisepy/RAWDATA/' (this is the folder containing the preprocessed .h5 file)
raw_store = SCEDCS3DataStore(S3_DATA, catalog, channel_filter(stations, "HH1"), range).

However, the output showed:
2023-07-11 11:07:35,157 INFO scedc_s3store._load_channels(): Loading 0 files from /Volumes/GeoPhysics_23/users-data/juarezilma/Noisepy/RAWDATA/2022/2022_001/
2023-07-11 11:07:35,157 INFO scedc_s3store._load_channels(): Init: 0 timespans and 0 channels

I suspect that SCEDCS3DataStore expects raw data in .h5 format within directories following the pattern "2022/2022_001/". Since the files in that directory are not raw .h5 files, it couldn't find any.
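To check this suspicion independently of NoisePy, here is a minimal sketch that rebuilds the per-day folder seen in the log line and lists whatever files are actually there. The YEAR/YEAR_DOY layout is inferred only from that log output, and the helper names day_directory and list_day_files are hypothetical; this is not a description of how SCEDCS3DataStore itself scans directories.

```python
from datetime import date
from pathlib import Path
from typing import List


def day_directory(root: str, day: date) -> Path:
    """Build <root>/<YYYY>/<YYYY>_<DDD>, the layout inferred from the log line."""
    day_of_year = day.timetuple().tm_yday  # 1-based day of the year
    return Path(root) / f"{day.year}" / f"{day.year}_{day_of_year:03d}"


def list_day_files(root: str, day: date) -> List[str]:
    """Return sorted file names in the day folder, or [] if the folder is missing."""
    folder = day_directory(root, day)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


if __name__ == "__main__":
    # Root taken from the issue text; adjust to the local setup.
    root = "/Volumes/GeoPhysics_23/users-data/juarezilma/Noisepy/RAWDATA/"
    print(day_directory(root, date(2022, 1, 1)))
    print(list_day_files(root, date(2022, 1, 1)))
```

If the second print shows the files but the store still reports "Loading 0 files", the mismatch is more likely in the file naming or format than in the directory layout.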

My data is structured as follows:
juarezilma@:/Volumes/GeoPhysics_23/users-data/juarezilma/Noisepy/RAWDATA/2022/2022_001$ ls
2022.001.HAUP.10-HH1.ZX.D.IRremoved 2022.001.HOST.10-HH1.ZX.D.IRremoved 2022.001.PYKE.10-HH1.ZX.D.IRremoved
2022.001.HAUP.10-HH2.ZX.D.IRremoved 2022.001.HOST.10-HH2.ZX.D.IRremoved 2022.001.PYKE.10-HH2.ZX.D.IRremoved
2022.001.HAUP.10-HHZ.ZX.D.IRremoved 2022.001.HOST.10-HHZ.ZX.D.IRremoved 2022.001.PYKE.10-HHZ.ZX.D.IRremoved
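For reference, the file names above can be split into their parts with a short sketch like the following. The meaning of each field (year, day of year, station, location-channel, network) is an assumption read off the listing, and DayFile and parse_day_file are hypothetical names for illustration, not part of NoisePy.

```python
from typing import NamedTuple, Optional


class DayFile(NamedTuple):
    year: int
    day_of_year: int
    station: str
    location: str
    channel: str
    network: str


def parse_day_file(name: str) -> Optional[DayFile]:
    """Parse names like '2022.001.HAUP.10-HH1.ZX.D.IRremoved'.

    Assumed field order: year.doy.station.location-channel.network.<rest>.
    Returns None when the name does not fit that pattern.
    """
    parts = name.split(".")
    if len(parts) < 5 or "-" not in parts[3]:
        return None
    location, channel = parts[3].split("-", 1)
    try:
        return DayFile(int(parts[0]), int(parts[1]), parts[2],
                       location, channel, parts[4])
    except ValueError:  # year or day of year is not numeric
        return None


if __name__ == "__main__":
    for name in ("2022.001.HAUP.10-HH1.ZX.D.IRremoved",
                 "2022.001.PYKE.10-HHZ.ZX.D.IRremoved"):
        print(parse_day_file(name))
```

Grouping the parsed entries by station and channel gives a quick view of which station and channel combinations exist for a given day.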

I would greatly appreciate your assistance in creating a RawDataStore using my own data.

Labels: help wanted
