add wav write support for patches with non-distance dimensions #488

Merged
d-chambers merged 4 commits into master from wav_non_distance
Feb 18, 2025

@d-chambers d-chambers commented Feb 7, 2025

This PR simply allows patches that have a time dimension and a non-distance dimension to be written as a single wav file or a directory of wav files.

Checklist

I have (if applicable):

  • referenced the GitHub issue this PR closes.
  • documented the new feature with docstrings or appropriate doc page.
  • included a test. See testing guidelines.
  • your name has been added to the contributors page (docs/contributors.md).
  • added the "ready_for_review" tag once the PR is ready to be reviewed.

Summary by CodeRabbit

  • New Features
    • Enhanced audio file writing to dynamically support patches with additional dimensions beyond time, improving overall flexibility.
  • Tests
    • Introduced new tests to ensure correct processing of audio patches with non-standard dimension configurations, including non-distance dimensions.

@d-chambers d-chambers added the IO Work for reading/writing different formats label Feb 7, 2025

coderabbitai bot commented Feb 7, 2025

Walkthrough

The changes update the WAV file writing logic in the WavIO class within the core module. The write method now dynamically identifies non-time dimensions, replacing the hardcoded "distance" reference. The _get_wav_data method has been adjusted to handle 2D patches containing only a time dimension and to transpose data for additional dimensions correctly, using the time coordinate’s step for sample rate calculation. New tests and fixtures have been added to ensure proper handling of patches with non-distance dimensions, such as "microphone."
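The data preparation described in the walkthrough can be sketched in plain Python. This is a minimal illustration, not the DASCore implementation: the helper names `non_time_dims`, `time_first`, and `sample_rate` are hypothetical, the data is a nested list rather than a NumPy array, and the time step is assumed to be a float in seconds (the PR itself divides `ONE_SECOND` by the time coordinate's step).

```python
def non_time_dims(dims):
    """Return the names of all dimensions other than "time", in order."""
    if "time" not in dims:
        raise ValueError(f"dims {dims} do not include a time dimension")
    return tuple(d for d in dims if d != "time")


def time_first(data, dims):
    """Reorder a rectangular 2D nested list so that time is the first axis."""
    if dims[0] == "time":
        return [list(row) for row in data]
    # zip(*rows) transposes a rectangular list of lists.
    return [list(col) for col in zip(*data)]


def sample_rate(time_step_seconds):
    """Sample rate in Hz implied by the spacing of the time coordinate."""
    return int(round(1.0 / time_step_seconds))
```

With dims `("microphone", "time")`, `time_first` returns one row per time sample and one column per microphone, which is the layout a multi-channel WAV writer typically expects.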

Changes

| File(s) | Change Summary |
|---|---|
| dascore/io/wav/core.py | Modified WavIO.write to dynamically identify non-time dimensions instead of using a hardcoded "distance". Updated _get_wav_data to handle 2D patches with time only and adjusted sample rate calculation. |
| tests/test_io/test_wav/test_wav.py | Added fixture audio_patch_non_distance_dim that renames the coordinate from "distance" to "microphone" and a new test test_write_non_distance_dims to verify proper file writing for non-distance dimensions. |

Sequence Diagram(s)

sequenceDiagram
    participant C as Caller
    participant W as WavIO
    participant FS as File System

    C->>W: write(patch, resource, ...)
    W->>W: Determine non-time dimension dynamically
    W->>W: Process patch data & compute sample rate using time step
    W->>W: Invoke _get_wav_data(patch, resample)
    W->>FS: Write WAV data to directory based on patch dimensions
    FS-->>W: Confirmation of write
    W-->>C: Return success

Poem

In the code garden, I hop with glee,
Tweaking WAV files so wild and free.
Non-time dims now lead the way,
With tests ensuring a brighter day.
Hoppy vibes in every line—code carrots, oh so fine! 🐇


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a218c3 and 5944eac.

📒 Files selected for processing (1)
  • tests/test_io/test_wav/test_wav.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_io/test_wav/test_wav.py
⏰ Context from checks skipped due to timeout of 90000ms (15)
  • GitHub Check: test_code (windows-latest, 3.12)
  • GitHub Check: test_code (windows-latest, 3.11)
  • GitHub Check: test_code (windows-latest, 3.10)
  • GitHub Check: test_code_min_deps (windows-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.12)
  • GitHub Check: test_code_min_deps (windows-latest, 3.12)
  • GitHub Check: test_code_min_deps (macos-latest, 3.13)
  • GitHub Check: test_code (macos-latest, 3.11)
  • GitHub Check: test_code_min_deps (macos-latest, 3.12)
  • GitHub Check: test_code (macos-latest, 3.10)
  • GitHub Check: test_code (ubuntu-latest, 3.12)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.13)
  • GitHub Check: test_code (ubuntu-latest, 3.11)
  • GitHub Check: test_code (ubuntu-latest, 3.10)
  • GitHub Check: test_code_min_deps (ubuntu-latest, 3.12)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (1)
dascore/io/wav/core.py (1)

20-55: Update docstring to reflect non-distance dimension support.

The docstring still references "distance" dimension in multiple places. Update it to reflect that any non-time dimension is supported.

Example enhancement:

-            If a path that ends with .wav, write all the distance channels
+            If a path that ends with .wav, write all channels from the non-time dimension
             to a single file. If not, assume the path is a directory and write
-            each distance channel to its own wav file.
+            each channel to its own wav file.

             ...

-            the output the patch has more than one len along the distance
-            dimension, a multi-channel wavefile is created. There may be some
+            the output patch has more than one value along the non-time
+            dimension, a multi-channel wavefile is created. There may be some
🧹 Nitpick comments (1)
tests/test_io/test_wav/test_wav.py (1)

53-60: Enhance test coverage for non-distance dimension WAV writing.

While the test verifies basic functionality, consider adding assertions to:

  1. Verify the number of WAV files matches the number of microphone coordinates
  2. Check sample rate and data integrity of written files
  3. Validate the naming pattern of generated files

Example enhancement:

 def test_write_non_distance_dims(
     self, audio_patch_non_distance_dim, tmp_path_factory
 ):
     """Ensure any non-time dimension still works."""
     path = tmp_path_factory.mktemp("wav_resample")
     patch = audio_patch_non_distance_dim
     patch.io.write(path, "wav")
     assert path.exists()
+    # Verify number of WAV files
+    wavs = list(path.rglob("*.wav"))
+    assert len(wavs) == len(patch.coords.get_array("microphone"))
+    # Verify file naming
+    for mic_val in patch.coords.get_array("microphone"):
+        assert path / f"microphone_{mic_val}.wav" in wavs
+    # Verify content of first file
+    sr, data = read_wav(str(wavs[0]))
+    assert sr == int(ONE_SECOND / patch.get_coord("time").step)
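
The three checks listed above (file count, sample rate, file naming) can also be tried without DASCore. The following is a rough, standard-library-only sketch: `write_channels` and `read_channel` are hypothetical helpers, samples are assumed to be 16-bit integers already, and only the `{dim_name}_{value}.wav` naming pattern is taken from the PR.

```python
import struct
import tempfile
import wave
from pathlib import Path


def write_channels(directory, dim_name, coord_values, columns, rate):
    """Write one mono 16-bit WAV file per coordinate value."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for value, samples in zip(coord_values, columns):
        path = directory / f"{dim_name}_{value}.wav"
        with wave.open(str(path), "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)  # 2 bytes per sample: 16-bit PCM
            wav.setframerate(rate)
            # Pack the samples as little-endian signed 16-bit integers.
            wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
        paths.append(path)
    return paths


def read_channel(path):
    """Return (sample_rate, samples) for a mono 16-bit WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    return rate, list(struct.unpack(f"<{len(frames) // 2}h", frames))


if __name__ == "__main__":
    out = Path(tempfile.mkdtemp())
    mics = [0, 1]
    paths = write_channels(out, "microphone", mics, [[0, 1, 2], [3, 4, 5]], 8000)
    assert len(list(out.rglob("*.wav"))) == len(mics)  # file count
    assert out / "microphone_1.wav" in paths           # naming pattern
    assert read_channel(paths[0])[0] == 8000           # sample rate
```

Reading the files back is what makes the sample-rate and data checks meaningful; asserting only that the directory exists would pass even if no file had been written.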
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4117475 and 0172c51.

📒 Files selected for processing (2)
  • dascore/io/wav/core.py (1 hunks)
  • tests/test_io/test_wav/test_wav.py (2 hunks)
🔇 Additional comments (3)
tests/test_io/test_wav/test_wav.py (1)

28-32: LGTM! Well-structured fixture for testing non-distance dimensions.

The fixture correctly creates a test patch by renaming the "distance" coordinate to "microphone", enabling testing of WAV writing functionality with non-distance dimensions.

dascore/io/wav/core.py (2)

82-95: LGTM! Robust handling of patch data preparation.

The changes correctly:

  1. Verify time dimension presence using check_patch_coords
  2. Ensure proper data shape with time as first dimension
  3. Calculate sample rate from time coordinate step

63-77: Consider handling multiple non-time dimensions.

The current implementation assumes exactly one non-time dimension. Consider handling cases where patches might have multiple non-time dimensions.

Example enhancement:

-    non_time_name = next(
-        iter(
-            set(patch.dims)
-            - {
-                "time",
-            }
-        )
-    )
+    non_time_dims = set(patch.dims) - {"time"}
+    if len(non_time_dims) > 1:
+        raise ValueError(
+            f"Writing WAV files with multiple non-time dimensions {non_time_dims} "
+            "is not supported"
+        )
+    non_time_name = next(iter(non_time_dims))

codecov bot commented Feb 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.85%. Comparing base (4117475) to head (5944eac).
Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #488   +/-   ##
=======================================
  Coverage   99.85%   99.85%           
=======================================
  Files         118      118           
  Lines        9693     9696    +3     
=======================================
+ Hits         9679     9682    +3     
  Misses         14       14           
Flag Coverage Δ
unittests 99.85% <100.00%> (+<0.01%) ⬆️


coderabbitai bot added a commit that referenced this pull request Feb 7, 2025
Docstrings generation was requested by @d-chambers.

* #488 (comment)

The following files were modified:

* `dascore/io/wav/core.py`
* `tests/test_io/test_wav/test_wav.py`

coderabbitai bot commented Feb 7, 2025

Note

Generated docstrings for this pull request at #489

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/test_io/test_wav/test_wav.py (1)

55-71: Add verification of WAV data content.

The test thoroughly checks file existence, count, and naming, but it's missing verification of the actual WAV data content. Consider adding assertions to verify that the data in the WAV files matches the expected values from the patch.

     def test_write_non_distance_dims(
         self, audio_patch_non_distance_dim, tmp_path_factory
     ):
         """Ensure any non-time dimension still works."""
         path = tmp_path_factory.mktemp("wav_resample")
         patch = audio_patch_non_distance_dim
         patch.io.write(path, "wav")
         assert path.exists()
         # Verify number of WAV files
         wavs = list(path.rglob("*.wav"))
         assert len(wavs) == len(patch.coords.get_array("microphone"))
         # Verify file naming
         for mic_val in patch.coords.get_array("microphone"):
             assert path / f"microphone_{mic_val}.wav" in wavs
         # Verify content of first file
         sr, data = read_wav(str(wavs[0]))
         assert sr == int(ONE_SECOND / patch.get_coord("time").step)
+        # Verify data content
+        for wav_path, mic_val in zip(sorted(wavs), patch.coords.get_array("microphone")):
+            sr, data = read_wav(str(wav_path))
+            expected_data = patch.sel(microphone=mic_val).data
+            # Compare data after accounting for normalization
+            expected_data = (expected_data - expected_data.mean()) / abs(expected_data).max()
+            np.testing.assert_allclose(data, expected_data, rtol=1e-5)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0172c51 and 283675e.

📒 Files selected for processing (2)
  • dascore/io/wav/core.py (3 hunks)
  • tests/test_io/test_wav/test_wav.py (3 hunks)
🔇 Additional comments (3)
tests/test_io/test_wav/test_wav.py (1)

30-34: LGTM! Well-structured fixture.

The fixture is well-documented and follows pytest fixture patterns. The scope matches other fixtures, and the implementation is clear.

dascore/io/wav/core.py (2)

29-31: LGTM! Clear and accurate docstring updates.

The docstring updates accurately reflect the changes in functionality, maintaining clarity and consistency.

Also applies to: 48-49


77-80: LGTM! Robust handling of non-time dimensions.

The changes correctly handle non-time dimensions while maintaining proper validation and sample rate calculation. The transposition ensures consistent data layout.

Also applies to: 82-82, 88-88

Comment on lines +63 to 71
         else:  # write data to directory, one file for each non-time
             resource.mkdir(exist_ok=True, parents=True)
-            distances = patch.coords.get_array("distance")
-            for ind, dist in enumerate(distances):
+            non_time_set = set(patch.dims) - {"time"}
+            non_time_name = next(iter(non_time_set))
+            non_time = patch.coords.get_array(non_time_name)
+            for ind, val in enumerate(non_time):
                 sub_data = np.take(data, ind, axis=1)
-                sub_path = resource / f"{dist}.wav"
+                sub_path = resource / f"{non_time_name}_{val}.wav"
                 write(filename=str(sub_path), rate=int(sr), data=sub_data)

⚠️ Potential issue

Add validation for multiple non-time dimensions.

The code assumes only one non-time dimension exists but doesn't validate this assumption. This could lead to cryptic errors if a patch has multiple non-time dimensions.

         else:  # write data to directory, one file for each non-time
             resource.mkdir(exist_ok=True, parents=True)
             non_time_set = set(patch.dims) - {"time"}
+            if len(non_time_set) != 1:
+                raise ValueError(
+                    f"Expected exactly one non-time dimension, got {len(non_time_set)}: {non_time_set}"
+                )
             non_time_name = next(iter(non_time_set))
             non_time = patch.coords.get_array(non_time_name)
             for ind, val in enumerate(non_time):
                 sub_data = np.take(data, ind, axis=1)
                 sub_path = resource / f"{non_time_name}_{val}.wav"
                 write(filename=str(sub_path), rate=int(sr), data=sub_data)
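
The validation suggested above can be isolated into a small helper that is easy to test on its own. This is only a sketch under assumptions: `single_non_time_dim` is a hypothetical name, not part of DASCore, and it uses a list rather than a set so that the dimension order in the error message is deterministic.

```python
def single_non_time_dim(dims):
    """Return the one non-time dimension name, or raise if there is not exactly one."""
    non_time = [d for d in dims if d != "time"]
    if len(non_time) != 1:
        raise ValueError(
            f"Expected exactly one non-time dimension, got {len(non_time)}: {non_time}"
        )
    return non_time[0]
```

For example, `single_non_time_dim(("microphone", "time"))` returns `"microphone"`, while dims such as `("x", "y", "time")` raise a `ValueError` that names the offending dimensions instead of silently picking one of them.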

d-chambers and others added 2 commits February 7, 2025 13:07
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@d-chambers d-chambers merged commit 1dbb10e into master Feb 18, 2025
19 checks passed
@d-chambers d-chambers deleted the wav_non_distance branch February 18, 2025 23:23