-
Notifications
You must be signed in to change notification settings - Fork 15
feat(docker): blackwell GPU compatibility #112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Amadeusz Szymko <[email protected]>
e65651b
to
e98e25b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@amadeuszsz
|
Signed-off-by: Amadeusz Szymko <[email protected]>
@vividf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File "/opt/conda/lib/python3.11/site-packages/mmengine/runner/runner.py", line 2127, in load_checkpoint
checkpoint = _load_checkpoint(filename, map_location=map_location)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/mmengine/runner/checkpoint.py", line 548, in _load_checkpoint
return CheckpointLoader.load_checkpoint(filename, map_location, logger)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/mmengine/runner/checkpoint.py", line 330, in load_checkpoint
return checkpoint_loader(filename, map_location)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/mmengine/runner/checkpoint.py", line 347, in load_from_local
checkpoint = torch.load(filename, map_location=map_location)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/serialization.py", line 1529, in load
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.
WeightsUnpickler error: Unsupported global: GLOBAL mmengine.logging.history_buffer.HistoryBuffer was not an allowed global by default. Please use `torch.serialization.add_safe_globals([mmengine.logging.history_buffer.HistoryBuffer])` or the `torch.serialization.safe_globals([mmengine.logging.history_buffer.HistoryBuffer])` context manager to allowlist this global if you trust this class/function.
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
I think some additional work needs to be done (trying now)
@vividf |
Summary
Update core dependencies.
Note
Test performed