flush_file I/O errors on NFS are silently swallowed, producing corrupt output with exit code 0 #229

@mtcaton

Problem

When writing to an NFS/Lustre filesystem under high concurrency, flush_file in platform.cpp can fail with
"Input/output error". The error is logged but not raised, so the write returns success to the caller, the process
exits 0, and the resulting shard files are silently corrupt.

Log output observed:
platform.cpp:93 flush_file: Failed to flush file: Input/output error

Reading back the written shards then raises:
RuntimeError: error during blosc decompression: 0

The corruption is partial and timepoint-dependent: shards written during peak NFS contention are corrupt while earlier
ones are fine. This makes it extremely hard to detect without reading back every chunk.
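
Until the error propagates, one workaround is to read every timepoint back after the run. The sketch below is illustrative only: find_unreadable_timepoints is a hypothetical helper, not part of acquire-zarr or zarr, and it assumes each index along axis 0 maps to its own chunk, as in the reproduction further down.

```python
from typing import Any, List, Tuple


def find_unreadable_timepoints(arr: Any) -> List[Tuple[int, str]]:
    """Try to read every index along axis 0 and report the ones that fail.

    `arr` is anything with a `shape` attribute and NumPy-style indexing,
    for example the object returned by zarr.open(..., mode="r").
    """
    failures = []
    for t in range(arr.shape[0]):
        try:
            # Indexing forces the store to fetch and decompress the data.
            arr[t]
        except Exception as exc:
            # In this report the failure surfaced as a RuntimeError from blosc.
            failures.append((t, f"{type(exc).__name__}: {exc}"))
    return failures
```

Passing the array opened with zarr.open("/nfs/archive/test.zarr/0", mode="r") would return a list of (timepoint, error message) pairs, empty if every timepoint decodes. This costs a full read of the dataset, which is why it only works as a stopgap.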

Environment: NFS/Lustre archive filesystem, 200+ parallel writers, acquire-zarr 0.8.0

Expected behavior
If flush_file fails, the error should propagate up as a C++ exception (and surface as a Python exception), so the
caller knows the write failed and can exit non-zero, retry, or abort cleanly.
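
As a rough illustration of the requested behavior, here is the same pattern in Python rather than in the library's C++. FlushError and flush_or_raise are hypothetical names; this is not how platform.cpp is written, only a sketch of "raise instead of log and continue".

```python
import os


class FlushError(OSError):
    """Hypothetical exception type for a failed flush-to-disk."""


def flush_or_raise(file_obj) -> None:
    """Flush Python buffers and fsync; raise instead of logging on failure."""
    try:
        file_obj.flush()
        os.fsync(file_obj.fileno())
    except OSError as exc:
        # Re-raise with context rather than logging and returning success.
        raise FlushError(
            exc.errno, f"failed to flush {file_obj.name}: {exc.strerror}"
        ) from exc
```

A caller can then catch FlushError (or OSError) around the write and decide whether to retry, abort, or exit non-zero, instead of finding out later from a decompression error.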

Suggested fixes

  1. Propagate the error: Raise from flush_file rather than logging and continuing, so callers always know when a write
    has failed.
  2. Atomic shard writes: Write each shard to a temp file and rename it into place only on success. This prevents a
    partially-flushed file from ever appearing as a valid (but corrupt) shard, regardless of whether the exception
    propagates correctly.

Reproduction

The failure is not easily reproducible without an NFS filesystem under write contention, but the pattern is:

  import acquire_zarr as aqz
  import numpy as np
  import zarr

  settings = aqz.StreamSettings(
      store_path="/nfs/archive/test.zarr",  # NFS path under high concurrency
      arrays=[
          aqz.ArraySettings(
              output_key="0",
              shape=(1, 1, 1, 2048, 2048),
              chunks=(1, 1, 1, 2048, 2048),
              dtype=aqz.DataType.UINT16,
              version=aqz.ZarrVersion.V3,
          )
      ],
      version=aqz.ZarrVersion.V3,
  )

  stream = aqz.ZarrStream(settings)
  for t in range(141):
      block = np.random.randint(0, 65535, (1, 1, 1, 2048, 2048), dtype=np.uint16)
      stream.append(block, key="")

  stream.close()  # Returns without raising, even when flush failed.
  # stderr shows: platform.cpp:93 flush_file: Failed to flush file: Input/output error

Reading back the written data then raises:

  arr = zarr.open("/nfs/archive/test.zarr/0", mode="r")
  np.asarray(arr[70, 0, 0, :4, :4])  # RuntimeError: error during blosc decompression: 0
