-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Clean up test data generation * Almanac data generation. * Linting
- Loading branch information
1 parent
2a35ae8
commit 7f5e6ca
Showing
28 changed files
with
451 additions
and
202 deletions.
There are no files selected for viewing
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,170 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# CLOUD unit test data\n", | ||
"\n", | ||
"There are two types of data used in unit tests in this repo: local and cloud. This notebook concerns itself only with the CLOUD versions of test data, so you can re-generate it.\n", | ||
"\n", | ||
"This also works to initialize data in a new cloud provider, instead of simply copying an existing data set.\n", | ||
"\n", | ||
"## Object catalog: small sky\n", | ||
"\n", | ||
"This is the same \"object catalog\" with 131 randomly generated radec values inside the order0-pixel11 healpix pixel that is used in hipscat and LSDB unit test suites." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import hipscat_import.pipeline as runner\n", | ||
"from hipscat_import.catalog.arguments import ImportArguments\n", | ||
"from hipscat_import.index.arguments import IndexArguments\n", | ||
"from hipscat_import.margin_cache.margin_cache_arguments import MarginCacheArguments\n", | ||
"import tempfile\n", | ||
"from pathlib import Path\n", | ||
"import os\n", | ||
"\n", | ||
"tmp_path = tempfile.TemporaryDirectory()\n", | ||
"tmp_dir = tmp_path.name\n", | ||
"\n", | ||
"storage_options = {\n", | ||
" \"account_key\": os.environ.get(\"ABFS_LINCCDATA_ACCOUNT_KEY\"),\n", | ||
" \"account_name\": os.environ.get(\"ABFS_LINCCDATA_ACCOUNT_NAME\"),\n", | ||
"}\n", | ||
"storage_options" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### small_sky\n", | ||
"\n", | ||
"This catalog was generated with the following snippet:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"args = ImportArguments(\n", | ||
" input_path=\"small_sky_parts\",\n", | ||
" highest_healpix_order=1,\n", | ||
" file_reader=\"csv\",\n", | ||
" output_path=\"abfs://hipscat/pytests/data\",\n", | ||
" output_artifact_name=\"small_sky\",\n", | ||
" output_storage_options=storage_options,\n", | ||
" overwrite=True,\n", | ||
" tmp_dir=tmp_dir,\n", | ||
" dask_tmp=tmp_dir,\n", | ||
")\n", | ||
"runner.pipeline(args)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### small_sky_order1\n", | ||
"\n", | ||
"This catalog has the same data points as other small sky catalogs, but is coerced to spreading these data points over partitions at order 1, instead of order 0.\n", | ||
"\n", | ||
"This means there are 4 leaf partition files, instead of just 1, and so can be useful for confirming reads/writes over multiple leaf partition files.\n", | ||
"\n", | ||
"NB: Setting `constant_healpix_order` coerces the import pipeline to create leaf partitions at order 1.\n", | ||
"\n", | ||
"This catalog was generated with the following snippet:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"args = ImportArguments(\n", | ||
" input_path=\"small_sky_parts\",\n", | ||
" file_reader=\"csv\",\n", | ||
" constant_healpix_order=1,\n", | ||
" output_path=\"abfs://hipscat/pytests/data\",\n", | ||
" output_storage_options=storage_options,\n", | ||
" output_artifact_name=\"small_sky_order1\",\n", | ||
" tmp_dir=tmp_dir,\n", | ||
" dask_tmp=tmp_dir,\n", | ||
" overwrite=True,\n", | ||
")\n", | ||
"runner.pipeline(args)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Almanac info\n", | ||
"\n", | ||
"For the above catalogs, create almanac data." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from hipscat.inspection.almanac import Almanac\n", | ||
"from hipscat.inspection.almanac_info import AlmanacInfo\n", | ||
"\n", | ||
"almanac_info = AlmanacInfo.from_catalog_dir(\n", | ||
" \"abfs://hipscat/pytests/data/small_sky\", storage_options=storage_options\n", | ||
")\n", | ||
"almanac_info.write_to_file(\n", | ||
" directory=\"abfs://hipscat/pytests/data/almanac\", default_dir=False, storage_options=storage_options\n", | ||
")\n", | ||
"\n", | ||
"almanac_info = AlmanacInfo.from_catalog_dir(\n", | ||
" \"abfs://hipscat/pytests/data/small_sky_order1\", storage_options=storage_options\n", | ||
")\n", | ||
"almanac_info.write_to_file(\n", | ||
" directory=\"abfs://hipscat/pytests/data/almanac\", default_dir=False, storage_options=storage_options\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"tmp_path.cleanup()" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "hipscatenv", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.13" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
Oops, something went wrong.