Skip to content

SCORM contentstore sync for OLX export/import#113

Closed
Syed-Ali-Abbas-568 wants to merge 2 commits into
overhangio:masterfrom
Syed-Ali-Abbas-568:syed-ali/contentstore-sync-branch
Closed

SCORM contentstore sync for OLX export/import#113
Syed-Ali-Abbas-568 wants to merge 2 commits into
overhangio:masterfrom
Syed-Ali-Abbas-568:syed-ali/contentstore-sync-branch

Conversation

@Syed-Ali-Abbas-568
Copy link
Copy Markdown
Collaborator

What this does

SCORM zips live in default_storage which is never included in OLX course tarballs. This means exported courses keep their SCORM metadata (filename, sha1, size) but lose the actual package — imported blocks load broken.

This PR adds an opt-in sync that mirrors uploaded SCORM zips into the course's contentstore. Since contentstore assets are already bundled into OLX exports and restored on import, the block survives a full export → import round-trip without any changes to the export/import pipeline.

Gated behind XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"] (default False) so existing deployments are unaffected.

Note: When enabled, uploaded SCORM packages will appear as scorm_packages/<sha1>.zip under Files & Uploads in Studio. These assets are locked and should not be deleted manually — they are what the block pulls from during an import.


Changes

studio_submit — after extract_package, mirrors the zip into contentstore as scorm_packages/<sha1>.zip (locked). All contentstore calls are wrapped in try/except so uploads still succeed if contentstore is unavailable (tests, non-platform environments).

index_page_url / assets_proxy — if package_meta is populated but the unpacked tree is missing from default_storage, fetches the zip from contentstore and re-extracts. Fires once after import; after that the cache is warm and behavior is identical to today.

Pre-existing blocks are unaffected — they already have a populated default_storage cache so the rehydrate branch never fires for them.

README + changelog updated.


Why this approach

Co-authored-by: Claude Code claude-code@anthropic.com

Syed-Ali-Abbas-568 and others added 2 commits May 4, 2026 11:33
Open edX's OLX export bundles contentstore assets into the export tarball,
but SCORM zips are unpacked into Django's default_storage and never make it
into the bundle. As a result, a course exported and re-imported into a new
course keeps the SCORM block's metadata but not its content. PR overhangio#71 made
this more visible by keying the extraction directory on sha1(usage_key)
instead of block_id; that fix prevented cross-course corruption but left
imported blocks pointing at an empty directory.

Add an opt-in sync that closes the gap without fighting PR overhangio#71:

- studio_submit: after extract_package, mirror the original zip into the
  current course's contentstore as scorm_packages/<sha1>.zip (locked).
- index_page_url and assets_proxy: before serving content, if package_meta
  is set but the unpacked tree is missing from default_storage, fetch the
  zip from the current course's contentstore and re-extract it.

Gated behind XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"],
defaulting to False; legacy behavior is unchanged unless an operator opts
in. Imports of xmodule.contentstore are lazy and every contentstore call
is wrapped in try/except so uploads still succeed when the contentstore
is unavailable (tests, non-platform environments).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1 @@
- [Feature] Optionally mirror the SCORM zip into the course contentstore on upload and rehydrate the unpacked tree from the contentstore on first access. This makes blocks survive OLX course export/import without breaking the extraction path keyed by `sha1(usage_key)` introduced in PR #71. Disabled by default; enable with `XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"] = True`. (by @Syed-Ali-Abbas-568) No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The changelog entry can be reworded to better describe the change. We don't need to link the PR either.

# If this block was just imported via OLX, the unpacked files may be
# missing from default_storage. Rehydrate from the contentstore before
# serving. No-op when the feature is disabled or the tree exists.
self._rehydrate_from_contentstore()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assets_proxy is called on every request to the LMS and page load. We don't need to extract scorm files every time. Unless, there is a specific reason for this, we should only extract once when the xblock is initialized or on course import.

Disabled by default; opt in by setting
XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"] = True.
"""
return bool(self.xblock_settings.get("CONTENTSTORE_SYNC_ENABLED", False))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we type casting this? We've provided a default value here.

Comment on lines +934 to +935
from xmodule.contentstore.django import contentstore
from xmodule.contentstore.content import StaticContent
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless these packages have been deprecated or renamed somehow, I believe xmodule is available in the openedx-platform itself and we don't need to put a try catch here as we can assume openedx is installed.

Do correct me if I'm wrong. My knowledge on this might be a little stale.

contentstore().save(content)
logger.info("Mirrored SCORM zip to contentstore at %s", asset_key)
except Exception:
logger.exception("Failed to mirror SCORM zip to contentstore")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should log the actual error that we face for debugging purposes.

if not self.contentstore_sync_enabled:
return None
try:
from xmodule.contentstore.django import contentstore
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above.


course_key = getattr(self.runtime, "course_id", None)
asset_path = self._contentstore_asset_path()
if course_key is None or asset_path is None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should log this issue here rather than just exiting.

)
return True
except Exception:
logger.exception("Failed to rehydrate SCORM package from contentstore")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above.

Comment thread README.rst
* On upload, the original zip is written to the current course's contentstore as ``scorm_packages/<sha1>.zip`` (locked). Course export bundles contentstore assets into the OLX tarball; course import copies them under the new ``course_key`` automatically.
* On read, if ``package_meta`` is set but the extracted tree is missing from ``default_storage``, the XBlock fetches the zip from the current course's contentstore and re-extracts it into ``default_storage``.

This composes with the per-block extraction path introduced in PR #71 (extraction is still keyed by ``sha1(usage_key)``, so courses cannot share extraction directories). Pre-existing blocks already have a populated ``default_storage`` cache, so the rehydrate branch never fires for them. The feature is off by default; if anything misbehaves in your environment, flipping it back to ``False`` restores the prior behavior exactly.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for such details. We can just mention that this is off by default for xyz reason.

Disabled by default; opt in by setting
XBLOCK_SETTINGS["ScormXBlock"]["CONTENTSTORE_SYNC_ENABLED"] = True.
"""
return bool(self.xblock_settings.get("CONTENTSTORE_SYNC_ENABLED", False))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not sure whether this should be False by default or not considering it is a request feature. @ahmed-arb What do you think?

data,
locked=True,
)
contentstore().save(content)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be deleting the previous scorm file when uploading the new one. This replicates the behaviour already defined here

"""
Mirror the original SCORM zip into the course's contentstore.

Best-effort: any failure (missing imports, no course context, contentstore
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Lets try not to use obvious AI generated comments

content = contentstore().find(asset_key)
return content.data
except Exception as exc:
if NotFoundError is not None and isinstance(exc, NotFoundError):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line seems very confusing. Maybe it can be made more readable?

Copy link
Copy Markdown
Contributor

@Danyal-Faheem Danyal-Faheem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this is the best approach for this as I'd like to stay away from contentstore as much as possible but since this is an important change, I'll agree with it for now.

@Syed-Ali-Abbas-568 Syed-Ali-Abbas-568 moved this from Pending Triage to In Progress in Tutor project management May 12, 2026
@ahmed-arb ahmed-arb moved this from In Progress to Won't fix in Tutor project management May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Won't fix

Development

Successfully merging this pull request may close these issues.

4 participants