SCORM contentstore sync for OLX export/import#113
Conversation
Open edX's OLX export bundles contentstore assets into the export tarball, but SCORM zips are unpacked into Django's default_storage and never make it into the bundle. As a result, a course exported and re-imported into a new course keeps the SCORM block's metadata but not its content. PR overhangio#71 made this more visible by keying the extraction directory on sha1(usage_key) instead of block_id; that fix prevented cross-course corruption but left imported blocks pointing at an empty directory. Add an opt-in sync that closes the gap without fighting PR overhangio#71: - studio_submit: after extract_package, mirror the original zip into the current course's contentstore as scorm_packages/<sha1>.zip (locked). - index_page_url and assets_proxy: before serving content, if package_meta is set but the unpacked tree is missing from default_storage, fetch the zip from the current course's contentstore and re-extract it. Gated behind XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"], defaulting to False; legacy behavior is unchanged unless an operator opts in. Imports of xmodule.contentstore are lazy and every contentstore call is wrapped in try/except so uploads still succeed when the contentstore is unavailable (tests, non-platform environments). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
| @@ -0,0 +1 @@ | |||
| - [Feature] Optionally mirror the SCORM zip into the course contentstore on upload and rehydrate the unpacked tree from the contentstore on first access. This makes blocks survive OLX course export/import without breaking the extraction path keyed by `sha1(usage_key)` introduced in PR #71. Disabled by default; enable with `XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"] = True`. (by @Syed-Ali-Abbas-568) No newline at end of file | |||
There was a problem hiding this comment.
nit: The changelog entry can be reworded to better describe the change. We don't need to link the PR either.
| # If this block was just imported via OLX, the unpacked files may be | ||
| # missing from default_storage. Rehydrate from the contentstore before | ||
| # serving. No-op when the feature is disabled or the tree exists. | ||
| self._rehydrate_from_contentstore() |
There was a problem hiding this comment.
assets_proxy is called on every request to the LMS and page load. We don't need to extract scorm files every time. Unless, there is a specific reason for this, we should only extract once when the xblock is initialized or on course import.
| Disabled by default; opt in by setting | ||
| XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"] = True. | ||
| """ | ||
| return bool(self.xblock_settings.get("CONTENTSTORE_SYNC_ENABLED", False)) |
There was a problem hiding this comment.
Why are we type casting this? We've provided a default value here.
| from xmodule.contentstore.django import contentstore | ||
| from xmodule.contentstore.content import StaticContent |
There was a problem hiding this comment.
Unless these packages have been deprecated or renamed somehow, I believe xmodule is available in the openedx-platform itself and we don't need to put a try catch here as we can assume openedx is installed.
Do correct me if I'm wrong. My knowledge on this might be a little stale.
| contentstore().save(content) | ||
| logger.info("Mirrored SCORM zip to contentstore at %s", asset_key) | ||
| except Exception: | ||
| logger.exception("Failed to mirror SCORM zip to contentstore") |
There was a problem hiding this comment.
We should log the actual error that we face for debugging purposes.
| if not self.contentstore_sync_enabled: | ||
| return None | ||
| try: | ||
| from xmodule.contentstore.django import contentstore |
|
|
||
| course_key = getattr(self.runtime, "course_id", None) | ||
| asset_path = self._contentstore_asset_path() | ||
| if course_key is None or asset_path is None: |
There was a problem hiding this comment.
We should log this issue here rather than just exiting.
| ) | ||
| return True | ||
| except Exception: | ||
| logger.exception("Failed to rehydrate SCORM package from contentstore") |
| * On upload, the original zip is written to the current course's contentstore as ``scorm_packages/<sha1>.zip`` (locked). Course export bundles contentstore assets into the OLX tarball; course import copies them under the new ``course_key`` automatically. | ||
| * On read, if ``package_meta`` is set but the extracted tree is missing from ``default_storage``, the XBlock fetches the zip from the current course's contentstore and re-extracts it into ``default_storage``. | ||
|
|
||
| This composes with the per-block extraction path introduced in PR #71 (extraction is still keyed by ``sha1(usage_key)``, so courses cannot share extraction directories). Pre-existing blocks already have a populated ``default_storage`` cache, so the rehydrate branch never fires for them. The feature is off by default; if anything misbehaves in your environment, flipping it back to ``False`` restores the prior behavior exactly. |
There was a problem hiding this comment.
No need for such details. We can just mention that this is off by default for xyz reason.
| Disabled by default; opt in by setting | ||
| XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"] = True. | ||
| """ | ||
| return bool(self.xblock_settings.get("CONTENTSTORE_SYNC_ENABLED", False)) |
There was a problem hiding this comment.
I am still not sure whether this should be False by default or not considering it is a request feature. @ahmed-arb What do you think?
| data, | ||
| locked=True, | ||
| ) | ||
| contentstore().save(content) |
There was a problem hiding this comment.
We should be deleting the previous scorm file when uploading the new one. This replicates the behaviour already defined here
| """ | ||
| Mirror the original SCORM zip into the course's contentstore. | ||
|
|
||
| Best-effort: any failure (missing imports, no course context, contentstore |
There was a problem hiding this comment.
nit: Lets try not to use obvious AI generated comments
| content = contentstore().find(asset_key) | ||
| return content.data | ||
| except Exception as exc: | ||
| if NotFoundError is not None and isinstance(exc, NotFoundError): |
There was a problem hiding this comment.
This line seems very confusing. Maybe it can be made more readable?
Danyal-Faheem
left a comment
There was a problem hiding this comment.
I am not sure if this is the best approach for this as I'd like to stay away from contentstore as much as possible but since this is an important change, I'll agree with it for now.
What this does
SCORM zips live in
default_storagewhich is never included in OLX course tarballs. This means exported courses keep their SCORM metadata (filename, sha1, size) but lose the actual package — imported blocks load broken.This PR adds an opt-in sync that mirrors uploaded SCORM zips into the course's contentstore. Since contentstore assets are already bundled into OLX exports and restored on import, the block survives a full export → import round-trip without any changes to the export/import pipeline.
Gated behind
XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"](defaultFalse) so existing deployments are unaffected.Changes
studio_submit— afterextract_package, mirrors the zip into contentstore asscorm_packages/<sha1>.zip(locked). All contentstore calls are wrapped intry/exceptso uploads still succeed if contentstore is unavailable (tests, non-platform environments).index_page_url/assets_proxy— ifpackage_metais populated but the unpacked tree is missing fromdefault_storage, fetches the zip from contentstore and re-extracts. Fires once after import; after that the cache is warm and behavior is identical to today.Pre-existing blocks are unaffected — they already have a populated
default_storagecache so the rehydrate branch never fires for them.README + changelog updated.
Why this approach
sha1(usage_key)extraction path is fully preservedCo-authored-by: Claude Code claude-code@anthropic.com