The issue is not just concatenation itself; it is recursive concatenation.
For example, if the 2026042200 file is made by concatenating:
* 2026042100
* 2026042106
* 2026042112
* 2026042118
* 2026042200
then that is fine only if all five are raw, non-concatenated source files.
But if each of those input files was already produced by concatenating the previous window, then we are re-ingesting observations that were already included in earlier cycles.
So the duplication pattern looks like this:
cycle 00 = raw 00 + raw 18 + raw 12 + raw 06 + raw 00_prev
cycle 06 = cycle 00 + raw 06 + raw 00 + raw 18 + raw 12
At that point, cycle 06 contains observations from cycle 00, and cycle 00 already contains several earlier files. The overlap compounds every cycle.
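To make the compounding concrete, here is a toy model of the pattern above. It reads no files: cycles are just integers, each "file" is a counter of how many copies of each raw cycle it holds, and the five-file window is taken from the example. The function name `build_windows` and its parameters are made up for illustration.

```python
from collections import Counter

def build_windows(n_cycles, window=5, recursive=True):
    """Return one Counter per cycle: raw-cycle index -> number of copies held.

    recursive=True  : each input is an earlier concatenated product.
    recursive=False : each input is the raw per-cycle file.
    """
    files = []
    for i in range(n_cycles):
        contents = Counter({i: 1})  # this cycle's own raw observations
        for j in range(max(0, i - (window - 1)), i):
            if recursive:
                contents.update(files[j])  # pulls in everything file j already held
            else:
                contents[j] += 1  # exactly one copy of raw cycle j
        files.append(contents)
    return files

recursive = build_windows(5, recursive=True)
clean = build_windows(5, recursive=False)
print(sum(recursive[4].values()), sum(clean[4].values()))
```

In this model the fifth file holds 16 raw-cycle copies when the inputs are concatenated products, against 5 when they are raw, and the oldest cycle alone appears 8 times.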
The obs_source_files global attribute is consistent with this:
obs_source_files =
gdas.t00z.insitu_profile_argo.2026042100.nc,
gdas.t06z.insitu_profile_argo.2026042106.nc,
gdas.t12z.insitu_profile_argo.2026042112.nc,
gdas.t18z.insitu_profile_argo.2026042118.nc,
gdas.t00z.insitu_profile_argo.2026042200.nc
If any of those source files are themselves concatenated products, then we are not building a 24-hour window from unique raw inputs; we are recursively accumulating prior windows.
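One way to check this is to look at the obs_source_files attribute of each input itself. The sketch below assumes the attribute values have already been read into a dict (for example with a NetCDF reader), and it assumes that a raw file either lacks the attribute or lists only itself; both assumptions need checking against the real files. The function names are hypothetical.

```python
def parse_source_files(attr_value):
    """Split a comma-separated obs_source_files value into clean file names."""
    if not attr_value:
        return []
    return [name.strip() for name in attr_value.split(",") if name.strip()]

def find_concatenated_inputs(attrs_by_file):
    """Return {file name: other files it lists} for inputs that look concatenated.

    attrs_by_file maps each input file name to its obs_source_files value,
    or None when the attribute is absent.
    """
    flagged = {}
    for fname, attr_value in attrs_by_file.items():
        # Assumption: a file that lists only itself counts as raw.
        others = [s for s in parse_source_files(attr_value) if s != fname]
        if others:
            flagged[fname] = others
    return flagged

attrs = {
    "gdas.t00z.insitu_profile_argo.2026042100.nc": None,
    "gdas.t06z.insitu_profile_argo.2026042106.nc":
        "gdas.t00z.insitu_profile_argo.2026042100.nc,\n"
        " gdas.t06z.insitu_profile_argo.2026042106.nc",
}
print(find_concatenated_inputs(attrs))
```

Any file that comes back flagged is a product of an earlier concatenation and should not be used as a window input.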
The fix is to concatenate only from the original per-cycle source files, or to de-duplicate explicitly after concatenation using a stable observation key such as platform/profile ID + dateTime + lat/lon + depth, depending on what metadata are available in the IODA file.
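The de-duplication option could look roughly like the sketch below. It works on plain dicts, not on an IODA file, and the field names (`platform_id`, `date_time`, `latitude`, `longitude`, `depth`) are placeholders for whatever the file actually provides. Rounding floats before comparing is an added assumption, there to absorb small representation differences.

```python
def dedup_observations(
    records,
    key_fields=("platform_id", "date_time", "latitude", "longitude", "depth"),
    ndigits=4,
):
    """Keep the first record for each observation key, preserving input order."""
    seen = set()
    unique = []
    for rec in records:
        # Build the key; floats are rounded so near-identical values compare equal.
        key = tuple(
            round(rec[f], ndigits) if isinstance(rec[f], float) else rec[f]
            for f in key_fields
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

obs = [
    {"platform_id": "A1", "date_time": "2026-04-21T00:00:00Z",
     "latitude": 10.0, "longitude": -30.0, "depth": 5.0, "value": 20.1},
    {"platform_id": "A1", "date_time": "2026-04-21T00:00:00Z",
     "latitude": 10.0, "longitude": -30.0, "depth": 5.0, "value": 20.1},
    {"platform_id": "A1", "date_time": "2026-04-21T00:00:00Z",
     "latitude": 10.0, "longitude": -30.0, "depth": 10.0, "value": 19.4},
]
print(len(dedup_observations(obs)))
```

Concatenating only from raw files is still the cleaner fix. De-duplication depends on the key really being unique per observation, which has to be verified for the data at hand.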