In Data Loading, Clip to Layer BoundingBox #8551

Merged
merged 9 commits into from
Apr 28, 2025

Conversation

fm3
Member

@fm3 fm3 commented Apr 23, 2025

Steps to test:

  • Edit the datasource-properties.json of an existing dataset to have a smaller bounding box than the existing data on disk
  • When viewing the dataset (after cache clear), the loaded data for that layer should be clipped accordingly
  • Zoom out to test different mags

Issues:


  • Updated changelog
  • Removed dev-only changes like prints and application.conf edits
  • Considered common edge cases
  • Needs datastore update after deployment

@fm3 fm3 self-assigned this Apr 23, 2025
Contributor

coderabbitai bot commented Apr 23, 2025

📝 Walkthrough

"""

Walkthrough

This change introduces logic to ensure that data returned from a data layer is clipped to its defined bounding box. Specifically, when a data request includes regions outside the layer’s bounding box, those out-of-bounds regions are zeroed out before further processing. This is implemented by adding a clipping function in the backend service, modifying the data conversion pipeline to apply this clipping step, and updating related geometry utility classes to facilitate bounding box operations. Additional utility methods are added to support bounding box containment checks and coordinate transformations.
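As a rough illustration of the zeroing step described above, here is a minimal sketch. It is not the code from this PR: `Vec3`, `Box`, and `ClipSketch.clip` are hypothetical stand-ins, and the buffer is assumed to be a flat byte array laid out x-fastest over the request cuboid. Rows inside the intersection are copied into a zero-filled buffer, so everything else stays zero.

```scala
// Hypothetical minimal stand-ins; not the actual webknossos classes.
final case class Vec3(x: Int, y: Int, z: Int)

final case class Box(topLeft: Vec3, width: Int, height: Int, depth: Int) {
  // Exclusive bottom-right corner (assumption of this sketch).
  lazy val bottomRight: Vec3 = Vec3(topLeft.x + width, topLeft.y + height, topLeft.z + depth)

  // Overlap of two boxes, or None if they do not overlap in a non-empty volume.
  def intersection(other: Box): Option[Box] = {
    val x0 = math.max(topLeft.x, other.topLeft.x)
    val y0 = math.max(topLeft.y, other.topLeft.y)
    val z0 = math.max(topLeft.z, other.topLeft.z)
    val x1 = math.min(bottomRight.x, other.bottomRight.x)
    val y1 = math.min(bottomRight.y, other.bottomRight.y)
    val z1 = math.min(bottomRight.z, other.bottomRight.z)
    if (x1 > x0 && y1 > y0 && z1 > z0) Some(Box(Vec3(x0, y0, z0), x1 - x0, y1 - y0, z1 - z0))
    else None
  }
}

object ClipSketch {
  // Returns a new buffer in which every element outside `layer` is zero.
  // `data` is assumed to cover `request` in x-fastest order.
  def clip(data: Array[Byte], request: Box, layer: Box, bytesPerElement: Int): Array[Byte] = {
    require(data.length == request.width * request.height * request.depth * bytesPerElement)
    val out = new Array[Byte](data.length) // JVM arrays start zero-filled
    request.intersection(layer).foreach { inter =>
      val rowBytes = inter.width * bytesPerElement
      for {
        z <- inter.topLeft.z until inter.bottomRight.z
        y <- inter.topLeft.y until inter.bottomRight.y
      } {
        // Position of the row start relative to the request cuboid.
        val rx = inter.topLeft.x - request.topLeft.x
        val ry = y - request.topLeft.y
        val rz = z - request.topLeft.z
        val offset = (rx + request.width * (ry + request.height * rz)) * bytesPerElement
        System.arraycopy(data, offset, out, offset, rowBytes)
      }
    }
    out
  }
}
```

Copying whole x-rows keeps the inner loop to one `System.arraycopy` call per (y, z) pair; a request that misses the layer entirely simply returns the zero-filled buffer.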

Changes

Files/Paths | Change Summary
CHANGELOG.unreleased.md | Added a changelog entry documenting the new clipping behavior for data outside the bounding box.
util/src/main/scala/com/scalableminds/util/geometry/BoundingBox.scala | Made bottomRight a lazy val, added isFullyContainedIn and move methods for bounding box operations.
util/src/main/scala/com/scalableminds/util/geometry/Vec3Int.scala | Added unary negation operator (unary_-), removed the negate method.
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/Cuboid.scala | Added methods to convert a Cuboid to a BoundingBox, either in the cuboid's own mag or in mag 1.
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/DataServiceRequests.scala | Added a mag method to expose the Cuboid's mag in DataServiceDataRequest.
webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataService.scala | Introduced a clipToLayerBoundingBox method to zero out data outside the bounding box, integrated this step into the data conversion pipeline, and simplified offset calculations in cuboid extraction.

Assessment against linked issues

Objective (Issue #) Addressed Explanation
Ensure data outside the layer bounding box is zeroed/clipped before being returned (#5775)
Provide backend-side solution for cropping at bounding box, not just frontend (#5775)
Prevent exposure of data outside bounding box even if chunks are not aligned (#5775)

Poem

A bounding box, a tidy frame,
Now keeps stray bytes from wild acclaim.
If data dares to cross the line,
It’s zeroed out—no chance to shine!
With bounding checks and tidy code,
The rabbit hops a safer road.
🐇✨ Boundaries set, the data’s right—
All thanks to code that clips so tight!
"""


@fm3 fm3 added the bug label Apr 23, 2025
@fm3 fm3 marked this pull request as ready for review April 23, 2025 11:06
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
CHANGELOG.unreleased.md (1)

16-16: Minor language improvement suggestion.

Consider simplifying "outside of the bounding box" to "outside the bounding box" to remove redundancy.

- When loading data from a data layer that has data stored beyond the bounding box specified in the datasource-properties.json, data outside of the bounding box is now zeroed. (the layer is "clipped"). [#8551](https://github.com/scalableminds/webknossos/pull/8551)
+ When loading data from a data layer that has data stored beyond the bounding box specified in the datasource-properties.json, data outside the bounding box is now zeroed. (the layer is "clipped"). [#8551](https://github.com/scalableminds/webknossos/pull/8551)
🧰 Tools
🪛 LanguageTool

[style] ~16-~16: This phrase is redundant. Consider using “outside”.
Context: ...in the datasource-properties.json, data outside of the bounding box is now zeroed. (the la...

(OUTSIDE_OF)

webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/Cuboid.scala (1)

55-60: Document coordinate system & inclusive/exclusive semantics

These helpers are valuable, but future maintainers need to know

  • whether bottomRight is treated as inclusive or exclusive, and
  • whether width/height/depth here are in current‑mag voxel units or already scaled.

Adding a brief Scaladoc (1‑2 lines) to toBoundingBoxInMag / toMag1BoundingBox would prevent off‑by‑one misunderstandings later.

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataService.scala (1)

109-137: Skip allocation when nothing needs to be clipped

clipToLayerBoundingBox allocates a full‑sized zero buffer even if the intersection coincides with the request cuboid (possible when isFullyContainedIn was overly conservative due to rounding).

A cheap guard avoids the extra work:

@@
   val outputArray = Array.fill[Byte](inputArray.length)(0)
-  intersectionOpt.foreach { intersection =>
+  intersectionOpt match {
+    case Some(intersection) if
+      intersection.width  == requestBboxInMag.width &&
+      intersection.height == requestBboxInMag.height &&
+      intersection.depth  == requestBboxInMag.depth =>
+      // Nothing to clip – reuse original array
+      return Full(inputArray)
+    case Some(intersection) =>
       for {
         z <- intersection.topLeft.z until intersection.bottomRight.z
@@
-  }
+  }
+  case None => // keep all zeros (completely outside)
+}

This keeps memory steady and avoids an unnecessary copy in borderline cases.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4f597d and 4a58957.

📒 Files selected for processing (6)
  • CHANGELOG.unreleased.md (1 hunks)
  • util/src/main/scala/com/scalableminds/util/geometry/BoundingBox.scala (3 hunks)
  • util/src/main/scala/com/scalableminds/util/geometry/Vec3Int.scala (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/Cuboid.scala (2 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/DataServiceRequests.scala (2 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataService.scala (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/DataServiceRequests.scala (2)
util/src/main/scala/com/scalableminds/util/geometry/Vec3Int.scala (2)
  • Vec3Int (7-89)
  • Vec3Int (91-157)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/Cuboid.scala (1)
  • mag (46-46)
util/src/main/scala/com/scalableminds/util/geometry/BoundingBox.scala (1)
util/src/main/scala/com/scalableminds/util/geometry/Vec3Int.scala (3)
  • move (53-56)
  • move (56-59)
  • other (32-35)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: backend-tests
  • GitHub Check: build-smoketest-push
  • GitHub Check: frontend-tests
🔇 Additional comments (7)
webknossos-datastore/app/com/scalableminds/webknossos/datastore/models/requests/DataServiceRequests.scala (1)

3-3: Accessor method looks good.

The addition of the mag method provides convenient access to the cuboid's mag (magnification) vector, maintaining a consistent pattern of delegating geometric information through the object hierarchy.

Also applies to: 25-25

util/src/main/scala/com/scalableminds/util/geometry/Vec3Int.scala (1)

26-28: Good choice replacing with standard Scala operator.

Replacing a custom negate method with the standard Scala unary negation operator unary_- improves code readability and follows idiomatic Scala conventions. This allows for more natural syntax when negating vectors (using -vec instead of a method call).
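For readers unfamiliar with the convention, a minimal sketch of how a `unary_-` member enables prefix negation in Scala. `Vec3Sketch` is a hypothetical stand-in, not the `Vec3Int` source from this repository.

```scala
// Hypothetical stand-in for a 3D integer vector.
final case class Vec3Sketch(x: Int, y: Int, z: Int) {
  def +(other: Vec3Sketch): Vec3Sketch = Vec3Sketch(x + other.x, y + other.y, z + other.z)

  // A member named `unary_-` is what the compiler calls for the prefix expression `-v`.
  def unary_- : Vec3Sketch = Vec3Sketch(-x, -y, -z)
}

object UnaryDemo {
  val v: Vec3Sketch = Vec3Sketch(1, -2, 3)
  val negated: Vec3Sketch = -v            // same as v.unary_-
  val shiftedBack: Vec3Sketch = v + (-v)  // translate by the negated vector
}
```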

util/src/main/scala/com/scalableminds/util/geometry/BoundingBox.scala (3)

9-9: Good optimization with lazy evaluation.

Making bottomRight a lazy val is a good optimization, especially if this field isn't always accessed. It defers the computation until the first time the field is accessed, potentially improving performance.


48-50: Elegant containment check implementation.

The isFullyContainedIn method provides a clean way to check if a bounding box is fully contained within another, leveraging the existing intersection method. This approach is both concise and logically clear.


67-69: Efficient implementation of move operation.

The move method provides a convenient way to translate a bounding box by a vector delta. Using copy with just the topLeft parameter is an efficient approach that maintains immutability while only modifying the necessary field.
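The three BoundingBox points above (lazy `bottomRight`, containment via intersection, `move` via `copy`) can be sketched roughly as follows. The names `P3` and `Bbox` are hypothetical, and the exclusive `bottomRight` convention is an assumption of this sketch rather than something confirmed by the diff.

```scala
// Hypothetical stand-ins; not the classes from util/.../geometry.
final case class P3(x: Int, y: Int, z: Int) {
  def +(o: P3): P3 = P3(x + o.x, y + o.y, z + o.z)
}

final case class Bbox(topLeft: P3, width: Int, height: Int, depth: Int) {
  // Computed on first access, then cached for this instance.
  lazy val bottomRight: P3 = topLeft + P3(width, height, depth)

  // Overlap of two boxes, or None if the overlap would be empty.
  def intersection(other: Bbox): Option[Bbox] = {
    val x0 = math.max(topLeft.x, other.topLeft.x)
    val y0 = math.max(topLeft.y, other.topLeft.y)
    val z0 = math.max(topLeft.z, other.topLeft.z)
    val x1 = math.min(bottomRight.x, other.bottomRight.x)
    val y1 = math.min(bottomRight.y, other.bottomRight.y)
    val z1 = math.min(bottomRight.z, other.bottomRight.z)
    if (x1 > x0 && y1 > y0 && z1 > z0) Some(Bbox(P3(x0, y0, z0), x1 - x0, y1 - y0, z1 - z0))
    else None
  }

  // Contained exactly when intersecting with `other` leaves this box unchanged.
  def isFullyContainedIn(other: Bbox): Boolean = intersection(other).contains(this)

  // Translation only changes topLeft; the extents are carried over by copy.
  def move(delta: P3): Bbox = copy(topLeft = topLeft + delta)
}
```

Note that in this sketch an empty box (zero width, height, or depth) is never reported as contained, because its intersection with anything is `None`.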

webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataService.scala (2)

141-146: Verify mixed coordinate spaces in containment check vs. clipping

Containment is evaluated in mag‑1 space (toMag1BoundingBox) whereas clipping operates in the request’s own mag (toBoundingBoxInMag).
When integer division rounds coordinates differently across mags, a cuboid might be considered fully contained at mag‑1 yet still leak a few voxels outside the layer bbox at the coarser mag.

If the asymmetry is deliberate, please add an explanatory comment; otherwise align both steps to the same coordinate system.
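To make the rounding concern concrete, here is a small one-dimensional sketch. It assumes coordinates are converted between mags by plain integer division, which is an assumption for illustration and not a statement about how the datastore converts them; `MagRounding` and its methods are hypothetical names.

```scala
object MagRounding {
  // Converts a non-negative mag-1 coordinate to a coarser mag by integer division.
  def toMag(coordMag1: Int, mag: Int): Int = coordMag1 / mag

  // Mag-1 interval [start, end) covered by the single voxel `coord` in the given mag.
  def footprintInMag1(coord: Int, mag: Int): (Int, Int) = (coord * mag, (coord + 1) * mag)

  // Example: a layer bounding box that starts at x = 3 in mag 1.
  val layerStartMag1: Int = 3
  val layerStartMag2: Int = toMag(layerStartMag1, 2)
  val firstVoxelFootprint: (Int, Int) = footprintInMag1(layerStartMag2, 2)
}
```

Here the first mag-2 voxel of the layer covers mag-1 positions 2 and 3, so it straddles the mag-1 boundary at x = 3. Whether that voxel counts as inside or outside depends on which coordinate space the check is done in.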


227-230: Cleaner offset computation – looks good

Swapping the division‑based offsets for straight coordinate deltas (rx, ry, rz) removes dead code (subsamplingStrides) and clarifies intent. Nice simplification.

@fm3 fm3 requested a review from frcroth April 23, 2025 11:34
@fm3 fm3 changed the title Clip Data to Layer BoundingBox In Data Loading, Clip to Layer BoundingBox Apr 23, 2025
Member

@frcroth frcroth left a comment

LGTM 👍

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
CHANGELOG.unreleased.md (1)

18-18: Refine phrasing for clarity and conciseness

Consider removing the redundant “of” in “outside of the bounding box” and streamlining the sentence. Also, replace the parentheses with a more integrated explanation. For example:

- - When loading data from a data layer that has data stored beyond the bounding box specified in the datasource-properties.json, data outside of the bounding box is now zeroed. (the layer is “clipped”). [#8551](https://github.com/scalableminds/webknossos/pull/8551)
+ - For data layers storing voxels beyond the bounding box defined in `datasource-properties.json`, any voxel data outside the bounding box is now zeroed out (clipped). [#8551](https://github.com/scalableminds/webknossos/pull/8551)
🧰 Tools
🪛 LanguageTool

[style] ~18-~18: This phrase is redundant. Consider using “outside”.
Context: ...in the datasource-properties.json, data outside of the bounding box is now zeroed. (the la...

(OUTSIDE_OF)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a58957 and ab97fb2.

📒 Files selected for processing (1)
  • CHANGELOG.unreleased.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build-smoketest-push
  • GitHub Check: backend-tests
  • GitHub Check: frontend-tests

@fm3 fm3 enabled auto-merge (squash) April 28, 2025 09:21
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
CHANGELOG.unreleased.md (1)

21-21: Refine phrasing to improve clarity and remove redundancy.

The phrase “outside of” is redundant; use “outside”. Also consider folding the parenthetical into the main sentence for smoother reading.

Apply this diff:

- When loading data from a data layer that has data stored beyond the bounding box specified in the datasource-properties.json, data outside of the bounding box is now zeroed. (the layer is “clipped”). [#8551]
+ When loading data from a data layer that has data stored beyond the bounding box specified in datasource-properties.json, data outside the bounding box is now zeroed, effectively clipping the layer. [#8551]
🧰 Tools
🪛 LanguageTool

[style] ~21-~21: This phrase is redundant. Consider using “outside”.
Context: ...in the datasource-properties.json, data outside of the bounding box is now zeroed. (the la...

(OUTSIDE_OF)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ab97fb2 and cac3521.

📒 Files selected for processing (2)
  • CHANGELOG.unreleased.md (1 hunks)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataService.scala (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • webknossos-datastore/app/com/scalableminds/webknossos/datastore/services/BinaryDataService.scala
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build-smoketest-push
  • GitHub Check: frontend-tests
  • GitHub Check: backend-tests

@fm3 fm3 merged commit 7275b8b into master Apr 28, 2025
5 checks passed
@fm3 fm3 deleted the clip-to-bbox branch April 28, 2025 09:28
MichaelBuessemeyer added a commit that referenced this pull request Apr 28, 2025
MichaelBuessemeyer added a commit that referenced this pull request Apr 28, 2025
@MichaelBuessemeyer MichaelBuessemeyer restored the clip-to-bbox branch April 28, 2025 14:28
Contributor

@MichaelBuessemeyer MichaelBuessemeyer left a comment

Hej, I looked for the bug a bit but couldn't reproduce the error, so here are my guesses based only on reading the code.

@@ -51,4 +51,10 @@ case class Cuboid(topLeft: VoxelPosition, width: Int, height: Int, depth: Int) {
height * mag.y,
depth * mag.z
)

def toBoundingBoxInMag: BoundingBox =
BoundingBox(Vec3Int(topLeft.voxelXInMag, topLeft.voxelYInMag, topLeft.voxelZInMag), width, height, depth)
Contributor

shouldn't width, height, depth also be in mag?

Member Author

they already are, they are just not named as explicitly

Comment on lines +131 to +135
System.arraycopy(inputArray,
offset,
outputArray,
offset,
(intersection.bottomRight.x - intersection.topLeft.x) * bytesPerElement)
Contributor

IMO this is the most likely cause for the nullpointer

Contributor

Some out-of-bounds access, maybe. Possibly even due to my comment above about the dimensions not being converted to the respective mag.

Member Author

For the record, the null was explicitly set for request.dataSource, compare #8573 (comment)

Development

Successfully merging this pull request may close these issues.

Layer should be clipped at bounding box
3 participants