Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ways to remove 0-byte pyramid files #1486

Closed
wants to merge 2 commits into from

Conversation

melissalinkert
Copy link
Member

See https://trac.openmicroscopy.org.uk/ome/ticket/11395

There are two approaches in this PR: one is to automatically remove 0-byte pyramid files automatically when a BfPixelBuffer attempts to load them, and the other is the bin/omero admin fixpyramids command which will remove all 0-byte pyramid files that do not have a lock or temp file.

bin/omero admin fixpyramids should satisfy the ticket requirements, but I'd be happy to move that logic to a different one of the CLI plugins and/or change fixpyramids to something else.

I'm not 100% sure of the BfPixelBuffer changes - the impact shouldn't be huge, but I don't know to what extent that will cause an infinite pyramid generation loop with seriously broken data. If it's going too far, happy to revert the corresponding commit.

If a pixel buffer is instantiated with a pyramid file of length 0, then
the file is deleted so that the pyramid file can be regenerated.

See ticket #11395.
'bin/omero admin fixpyramids' will now remove any 0-length pyramid files
that do not have a corresponding lock file or temp file.  This allows
administrators to force regeneration of pyramid files that could not be
created before (e.g. due to OOMs as a result of the server
configuration).

See ticket #11395.
def fixpyramids(self, args, config):
self.check_access();
config = config.as_map()
omero_data_dir = '/OMERO'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack! For diagnostics, this pattern (`/OMERO') makes sense, but for fixpyramids (and cleanse!) it's scary. I can fix in another PR if you want.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can fix here if told what the correct alternative is, otherwise yes, an additional PR (or a PR against this branch) would be appreciated since I'm a little out of my depth here.

@joshmoore
Copy link
Member

I haven't had a chance to test this, but conceptually:

  • +1 on the CLI tool. I wonder though if it should be a flag to cleanse.
  • I'm certainly worried about the infinite loop, though it'll probably taking faking something to properly test it. Can you suggest a fake setting (or similar) that would always fail to generate a pyramid? (Preferably with and without an OOM).

@melissalinkert
Copy link
Member Author

This should give you an invalid (0-length) pyramid with no OOM: pyramid-no-oom&sizeX=2147483647&sizeY=10.fake

I haven't reliably been able to come up with a fake file that will generate an OOM, but test_images_good/jpeg/8kx8k.jpg with a relatively low memory setting usually does the trick.

@melissalinkert
Copy link
Member Author

If the original import was not successful (as would be the case with the above .fake file), then pyramid regeneration is not attempted. If the import was successful, then regeneration is attempted when the image is clicked in Insight/web.

I tried this first with test_images_good/png/4kx4k.png (and Blitz memory settings reduced to 256M). The original pyramid generation fails with an OOM; upon attempting to view the image in Insight, the 0-byte pyramid file is deleted, pyramid generation is attempted again and is successful. Subsequent viewing of the image does not alter the pyramid file.

I then tried with test_images_good/png/8kx8k.png. As with 4kx4k.png, the original pyramid generation fails with an OOM; every attempt to view the image in Insight results in the 0-byte pyramid file being deleted and a single attempt at pyramid regeneration occurring (which also fails nearly instantly with an OOM). There doesn't appear to be a background infinite loop of deleting/regenerating, so I'd only expect this to be a problem if the user is going in and clicking the image more or less continuously.

@joshmoore
Copy link
Member

Hmmm....ok. That may be the best way forward, but we are certainly in the case that a file which OOMs will do so on every view, right? (At least in PixelData). Assuming a larger file and more realistic memory settings, that would be quiet the explosion (i.e. server overhead) on each kaboom.

@melissalinkert
Copy link
Member Author

Yes, you'd still see an OOM on every view, which I realize is not ideal (though hopefully the user would learn pretty quickly to stop clicking the button). Still happy to revert that commit, though we are at least not in the situation of having infinite loops of OOMs behind the scenes.

@joshmoore
Copy link
Member

While testing this PR (not related) on a server with 200MB of memory:

$ bin/omero import 'b&sizeX=4000&sizeY=4000.fake'
...
    serverStackTrace = "ome.conditions.InternalException:  Wrapped Exception: (java.lang.OutOfMemoryError):
                        Java heap space
                            at ome.io.bioformats.BfPixelsWrapper.getTimepoint(BfPixelsWrapper.java:326)
                            at ome.io.bioformats.BfPixelsWrapper.getMessageDigest(BfPixelsWrapper.java:79)
                            at ome.io.bioformats.BfPixelBuffer.calculateMessageDigest(BfPixelBuffer.java:127)
                            at ome.io.bioformats.BfPyramidPixelBuffer.calculateMessageDigest(BfPyramidPixelBuffer.java:607)
                            at ome.services.RawPixelsBean.save(RawPixelsBean.java:182)
                            at ome.services.RawPixelsBean.close(RawPixelsBean.java:210)
...

@joshmoore
Copy link
Member

@joshmoore
Copy link
Member

melissalinkert#3 opened. I also saw the behavior of re-clicking on an OOM'd image re-attempts the generation. Let's discuss possible pitfalls regarding that with @chris-allan. Seems like me may only need one of these two solutions.

@joshmoore
Copy link
Member

@chris-allan and I both worry about the loop logic. I'd propose:

  • we omit your BfPixelBuffer commit for 4.4.9
  • encourage the use of fixpyramids (possibly adding a check for a running PixelData service)
  • double check the *.tmp and *.lock logic
  • and later possible store the reason for the 0-byte file (e.g. OOM)

If so, I'll close this PR and open another one with both your and my fixpyramids commits.

@melissalinkert
Copy link
Member Author

No objections - that sounds like a good plan.

@joshmoore joshmoore mentioned this pull request Oct 5, 2013
@joshmoore
Copy link
Member

Closing in favor of #1589

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants