forked from sigmf/sigmf-python
Multi recording archive #1
Draft: jhazentia wants to merge 29 commits into main from multi-recording-archive
Commits (29 total; this view shows changes from 20 commits)
208d74b add support for multiple recordings in archives (jhazentia)
44462f1 fix SigMFArchiveReader error (jhazentia)
832b731 support single or multiple sigmffiles in archive __init__() (jhazentia)
8d25adf renamed archive "name" to "path", allow os.PathLike (jhazentia)
4f58453 Fixed bug in checking sigmffiles type (jhazentia)
89242c8 add test for missing name (jhazentia)
0c503ab require name in SigMFFile constructor (jhazentia)
d234ddf return single or list of SigMFFiles in fromarchive (jhazentia)
348bed8 fix some formatting, unused imports, docstrings, rename archivereader… (jhazentia)
b6df262 add support for collections in archives, check for path and fileobj i… (jhazentia)
4cfc8c2 rename collectionfile to collection (jhazentia)
ea4e633 make json end of file new line consistent, add support for collection… (jhazentia)
68c6825 add README examples for archives with multiple recordings (jhazentia)
454dd34 fix archive docstring, remove unneeded variables from archivereader (jhazentia)
af9002d simplify SigMFCollection archive tests (jhazentia)
f1d108b organize SigMFFile constructor doc string (jhazentia)
a631eb3 clarify different ways to do the same thing in README (jhazentia)
74a7b86 fix typo (jhazentia)
ae4c424 Merge branch 'main' of https://github.com/NTIA/sigmf-python into mult… (jhazentia)
93ab02b add support for passing SigMFFile objects to SigMFCollection to impro… (jhazentia)
5376ece fix SigMFCollection docstring (jhazentia)
46e7d8f SigMFCollection set_streams() will check type for each element of met… (jhazentia)
660ba82 break up and simplify archive examples in README (jhazentia)
e2919d8 fix docstring, add ability to control pretty print JSON for archive (jhazentia)
e4e1775 update docstrings, formatting (jhazentia)
3131683 improve docstrings, remove duplicative test, add test for fromarchive… (jhazentia)
29827af fix error message (jhazentia)
b81289b make archives work when using folders (jhazentia)
15ca451 folders in archives are no longer created by default to maintain cons… (jhazentia)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -6,12 +6,17 @@ | |
|
|
||
| """Create and extract SigMF archives.""" | ||
|
|
||
| import collections | ||
| import os | ||
| import shutil | ||
| import tarfile | ||
| import tempfile | ||
| from typing import BinaryIO, Iterable, Union | ||
|
|
||
| from .error import SigMFFileError | ||
| import sigmf | ||
|
|
||
|
|
||
| from .error import SigMFFileError, SigMFValidationError | ||
|
|
||
|
|
||
| SIGMF_ARCHIVE_EXT = ".sigmf" | ||
|
|
@@ -21,59 +26,75 @@ | |
|
|
||
|
|
||
| class SigMFArchive(): | ||
| """Archive a SigMFFile. | ||
| """Archive one or more `SigMFFile`s. A collection file can | ||
| optionally be included. | ||
|
|
||
| A `.sigmf` file must include both valid metadata and data. | ||
| If `self.data_file` is not set or the requested output file | ||
| is not writable, raise `SigMFFileError`. | ||
|
|
||
| Parameters: | ||
|
|
||
| sigmffile -- A SigMFFile object with valid metadata and data_file | ||
|
|
||
| name -- path to archive file to create. If file exists, overwrite. | ||
| If `name` doesn't end in .sigmf, it will be appended. | ||
| For example: if `name` == "/tmp/archive1", then the | ||
| following archive will be created: | ||
| /tmp/archive1.sigmf | ||
| - archive1/ | ||
| - archive1.sigmf-meta | ||
| - archive1.sigmf-data | ||
|
|
||
| fileobj -- If `fileobj` is specified, it is used as an alternative to | ||
| a file object opened in binary mode for `name`. It is | ||
| supposed to be at position 0. `name` is not required, but | ||
| if specified will be used to determine the directory and | ||
| file names within the archive. `fileobj` won't be closed. | ||
| For example: if `name` == "archive1" and fileobj is given, | ||
| a tar archive will be written to fileobj with the | ||
| following structure: | ||
| - archive1/ | ||
| - archive1.sigmf-meta | ||
| - archive1.sigmf-data | ||
| sigmffiles -- A single SigMFFile or an iterable of SigMFFile objects with | ||
| valid metadata and data_files | ||
|
|
||
| collection -- An optional SigMFCollection. | ||
|
|
||
| path -- Path to archive file to create. If file exists, overwrite. | ||
| If `path` doesn't end in .sigmf, it will be appended. The | ||
| `self.path` instance variable will be updated upon | ||
| successful writing of the archive to point to the final | ||
| archive path. | ||
|
|
||
|
|
||
| fileobj -- If `fileobj` is specified, it is used as an alternative to | ||
| a file object opened in binary mode for `path`. If | ||
| `fileobj` is an open tarfile, it will be appended to. It is | ||
| supposed to be at position 0. `fileobj` won't be closed. If | ||
| `fileobj` is given, `path` has no effect. | ||
| """ | ||
| def __init__(self, sigmffile, name=None, fileobj=None): | ||
| self.sigmffile = sigmffile | ||
| self.name = name | ||
| def __init__(self, | ||
| sigmffiles: Union["sigmf.sigmffile.SigMFFile", | ||
| Iterable["sigmf.sigmffile.SigMFFile"]], | ||
| collection: "sigmf.sigmffile.SigMFCollection" = None, | ||
| path: Union[str, os.PathLike] = None, | ||
| fileobj: BinaryIO = None): | ||
|
|
||
| if (not path) and (not fileobj): | ||
| raise SigMFFileError("'path' or 'fileobj' required for creating " | ||
| "SigMF archive!") | ||
|
|
||
| if isinstance(sigmffiles, sigmf.sigmffile.SigMFFile): | ||
| self.sigmffiles = [sigmffiles] | ||
| elif (hasattr(collections, "Iterable") and | ||
| isinstance(sigmffiles, collections.Iterable)): | ||
| self.sigmffiles = sigmffiles | ||
| elif isinstance(sigmffiles, collections.abc.Iterable): # python 3.10 | ||
| self.sigmffiles = sigmffiles | ||
| else: | ||
| raise SigMFFileError("Unknown type for sigmffiles argument!") | ||
|
|
||
| if path: | ||
| self.path = str(path) | ||
| else: | ||
| self.path = None | ||
| self.fileobj = fileobj | ||
| self.collection = collection | ||
|
|
||
| self._check_input() | ||
|
|
||
| archive_name = self._get_archive_name() | ||
| mode = "a" if fileobj is not None else "w" | ||
|
Review comment: The eventual PR should call out this change in behavior and note that it could be changed to preserve the original behavior of writing over the archive.
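The append-with-fallback behavior that this comment refers to can be tried in isolation with the standard library alone. The sketch below is not the PR's code: `open_for_append` and `add_bytes` are hypothetical helper names, and an in-memory `io.BytesIO` stands in for the caller's `fileobj`. It relies on the same assumption as the `except tarfile.ReadError` branch in the diff, namely that opening an empty file object in `"a"` mode raises `tarfile.ReadError`.

```python
import io
import tarfile


def open_for_append(fileobj):
    """Open fileobj as a tar archive for appending; fall back to 'w' if it is empty.

    Hypothetical helper mirroring the try/except shown in the diff.
    """
    try:
        return tarfile.TarFile(mode="a", fileobj=fileobj,
                               format=tarfile.PAX_FORMAT)
    except tarfile.ReadError:
        # fileobj holds no tar data yet, so start a fresh archive
        return tarfile.TarFile(mode="w", fileobj=fileobj,
                               format=tarfile.PAX_FORMAT)


def add_bytes(archive, name, payload):
    """Add an in-memory payload to the archive under the given member name."""
    info = tarfile.TarInfo(name=name)
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))


buf = io.BytesIO()
for name in ("rec1.sigmf-meta", "rec2.sigmf-meta"):
    archive = open_for_append(buf)
    add_bytes(archive, name, b"{}\n")
    archive.close()   # closes the TarFile, not the caller's fileobj
    buf.seek(0)       # rewind so the next open can read this as a tar

with tarfile.open(fileobj=buf, mode="r") as archive:
    names = archive.getnames()
```

In this sketch the second pass appends to what the first pass wrote, so a caller who reuses one file object ends up with both members in a single archive. Preserving the original overwrite behavior would amount to always opening in `"w"` mode.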
||
| sigmf_fileobj = self._get_output_fileobj() | ||
| sigmf_archive = tarfile.TarFile(mode="w", | ||
| fileobj=sigmf_fileobj, | ||
| format=tarfile.PAX_FORMAT) | ||
| tmpdir = tempfile.mkdtemp() | ||
| sigmf_md_filename = archive_name + SIGMF_METADATA_EXT | ||
| sigmf_md_path = os.path.join(tmpdir, sigmf_md_filename) | ||
| sigmf_data_filename = archive_name + SIGMF_DATASET_EXT | ||
| sigmf_data_path = os.path.join(tmpdir, sigmf_data_filename) | ||
|
|
||
| with open(sigmf_md_path, "w") as mdfile: | ||
| self.sigmffile.dump(mdfile, pretty=True) | ||
|
|
||
| shutil.copy(self.sigmffile.data_file, sigmf_data_path) | ||
| try: | ||
| sigmf_archive = tarfile.TarFile(mode=mode, | ||
| fileobj=sigmf_fileobj, | ||
| format=tarfile.PAX_FORMAT) | ||
| except tarfile.ReadError: | ||
| # fileobj doesn't contain any archives yet, so reopen in 'w' mode | ||
| sigmf_archive = tarfile.TarFile(mode='w', | ||
| fileobj=sigmf_fileobj, | ||
| format=tarfile.PAX_FORMAT) | ||
|
|
||
| def chmod(tarinfo): | ||
| if tarinfo.isdir(): | ||
|
|
@@ -82,47 +103,102 @@ def chmod(tarinfo): | |
| tarinfo.mode = 0o644 # -rw-r--r-- | ||
| return tarinfo | ||
|
|
||
| sigmf_archive.add(tmpdir, arcname=archive_name, filter=chmod) | ||
| if collection: | ||
| with tempfile.NamedTemporaryFile(mode="w") as tmpfile: | ||
| collection.dump(tmpfile, pretty=True) | ||
| tmpfile.flush() | ||
| collection_filename = archive_name + SIGMF_COLLECTION_EXT | ||
| sigmf_archive.add(tmpfile.name, | ||
| arcname=collection_filename, | ||
| filter=chmod) | ||
|
|
||
| for sigmffile in self.sigmffiles: | ||
| with tempfile.TemporaryDirectory() as tmpdir: | ||
| sigmf_md_filename = sigmffile.name + SIGMF_METADATA_EXT | ||
| sigmf_md_path = os.path.join(tmpdir, sigmf_md_filename) | ||
| sigmf_data_filename = sigmffile.name + SIGMF_DATASET_EXT | ||
| sigmf_data_path = os.path.join(tmpdir, sigmf_data_filename) | ||
|
|
||
| with open(sigmf_md_path, "w") as mdfile: | ||
| sigmffile.dump(mdfile, pretty=True) | ||
|
|
||
| shutil.copy(sigmffile.data_file, sigmf_data_path) | ||
| sigmf_archive.add(tmpdir, arcname=sigmffile.name, filter=chmod) | ||
|
|
||
| sigmf_archive.close() | ||
| if not fileobj: | ||
| sigmf_fileobj.close() | ||
|
|
||
| shutil.rmtree(tmpdir) | ||
| else: | ||
| sigmf_fileobj.seek(0) # ensure next open can read this as a tar | ||
|
|
||
| self.path = sigmf_archive.name | ||
|
|
||
| def _check_input(self): | ||
| self._ensure_name_has_correct_extension() | ||
| self._ensure_data_file_set() | ||
| self._validate_sigmffile_metadata() | ||
|
|
||
| def _ensure_name_has_correct_extension(self): | ||
| name = self.name | ||
| if name is None: | ||
| self._ensure_path_has_correct_extension() | ||
| for sigmffile in self.sigmffiles: | ||
| self._ensure_sigmffile_name_set(sigmffile) | ||
| self._ensure_data_file_set(sigmffile) | ||
| self._validate_sigmffile_metadata(sigmffile) | ||
| if self.collection: | ||
| self._validate_sigmffile_collection(self.collection, | ||
| self.sigmffiles) | ||
|
|
||
| def _ensure_path_has_correct_extension(self): | ||
| path = self.path | ||
| if path is None: | ||
| return | ||
|
|
||
| has_extension = "." in name | ||
| has_correct_extension = name.endswith(SIGMF_ARCHIVE_EXT) | ||
| has_extension = "." in path | ||
| has_correct_extension = path.endswith(SIGMF_ARCHIVE_EXT) | ||
| if has_extension and not has_correct_extension: | ||
| apparent_ext = os.path.splitext(name)[-1] | ||
| apparent_ext = os.path.splitext(path)[-1] | ||
| err = "extension {} != {}".format(apparent_ext, SIGMF_ARCHIVE_EXT) | ||
| raise SigMFFileError(err) | ||
|
|
||
| self.name = name if has_correct_extension else name + SIGMF_ARCHIVE_EXT | ||
| self.path = path if has_correct_extension else path + SIGMF_ARCHIVE_EXT | ||
|
|
||
| @staticmethod | ||
| def _ensure_sigmffile_name_set(sigmffile): | ||
| if not sigmffile.name: | ||
| err = "the `name` attribute must be set to pass to `SigMFArchive`" | ||
| raise SigMFFileError(err) | ||
|
|
||
| def _ensure_data_file_set(self): | ||
| if not self.sigmffile.data_file: | ||
| @staticmethod | ||
| def _ensure_data_file_set(sigmffile): | ||
| if not sigmffile.data_file: | ||
| err = "no data file - use `set_data_file`" | ||
| raise SigMFFileError(err) | ||
|
|
||
| def _validate_sigmffile_metadata(self): | ||
| self.sigmffile.validate() | ||
| @staticmethod | ||
| def _validate_sigmffile_metadata(sigmffile): | ||
| sigmffile.validate() | ||
|
|
||
| @staticmethod | ||
| def _validate_sigmffile_collection(collectionfile, sigmffiles): | ||
| if len(collectionfile) != len(sigmffiles): | ||
| raise SigMFValidationError("Mismatched number of recordings " | ||
| "between sigmffiles and collection " | ||
| "file!") | ||
| streams_key = collectionfile.STREAMS_KEY | ||
| streams = collectionfile.get_collection_field(streams_key) | ||
| sigmf_meta_hashes = [s["hash"] for s in streams] | ||
| if not streams: | ||
| raise SigMFValidationError("No recordings in collection file!") | ||
| for sigmffile in sigmffiles: | ||
| with tempfile.NamedTemporaryFile(mode="w") as tmpfile: | ||
| sigmffile.dump(tmpfile, pretty=True) | ||
| tmpfile.flush() | ||
| meta_path = tmpfile.name | ||
| sigmf_meta_hash = sigmf.sigmf_hash.calculate_sha512(meta_path) | ||
| if sigmf_meta_hash not in sigmf_meta_hashes: | ||
| raise SigMFValidationError("SigMFFile given that " | ||
| "is not in collection file!") | ||
|
|
||
| def _get_archive_name(self): | ||
| if self.fileobj and not self.name: | ||
| if self.fileobj and not self.path: | ||
| pathname = self.fileobj.name | ||
| else: | ||
| pathname = self.name | ||
| pathname = self.path | ||
|
|
||
| filename = os.path.split(pathname)[-1] | ||
| archive_name, archive_ext = os.path.splitext(filename) | ||
|
|
@@ -135,7 +211,7 @@ def _get_output_fileobj(self): | |
| if self.fileobj: | ||
| err = "fileobj {!r} is not byte-writable".format(self.fileobj) | ||
| else: | ||
| err = "can't open {!r} for writing".format(self.name) | ||
| err = "can't open {!r} for writing".format(self.path) | ||
|
|
||
| raise SigMFFileError(err) | ||
|
|
||
|
|
@@ -146,6 +222,6 @@ def _get_open_fileobj(self): | |
| fileobj = self.fileobj | ||
| fileobj.write(bytes()) # force exception if not byte-writable | ||
| else: | ||
| fileobj = open(self.name, "wb") | ||
| fileobj = open(self.path, "wb") | ||
|
|
||
| return fileobj | ||
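The new `__init__` accepts either a single `SigMFFile` or an iterable of them, then walks the recordings once in `_check_input` and again when writing; the collection check also calls `len()` on them. A minimal sketch of that normalization follows, assuming `collections.abc.Iterable` alone is enough on the supported Python versions. `FakeRecording` and `normalize_recordings` are hypothetical names used only for illustration and are not part of sigmf-python. Copying the input into a list and rejecting strings are choices made by this sketch, not something the diff does.

```python
from collections.abc import Iterable


class FakeRecording:
    """Hypothetical stand-in for SigMFFile; only used to illustrate the check."""

    def __init__(self, name):
        self.name = name


def normalize_recordings(recordings, single_type=FakeRecording):
    """Return a list of recordings from a single object or an iterable of them."""
    if isinstance(recordings, single_type):
        return [recordings]
    # str and bytes are iterable but are never a valid set of recordings
    if isinstance(recordings, (str, bytes)) or not isinstance(recordings, Iterable):
        raise TypeError("expected a recording or an iterable of recordings")
    # materialize so generators can be traversed more than once and measured
    items = list(recordings)
    for item in items:
        if not isinstance(item, single_type):
            raise TypeError("unexpected element type: {!r}".format(type(item)))
    return items


single = normalize_recordings(FakeRecording("rec1"))
several = normalize_recordings(FakeRecording(n) for n in ("rec1", "rec2"))
```

With the input copied into a list, a generator argument survives both the validation pass and the writing pass, which a bare generator would not.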