Add migration scripts integration tests #55

trevorbonas · 2025-05-20T19:56:44Z

Changes:

Integration tests have been added that run all scripts end-to-end.
All scripts now use absolute import paths, meaning they can be run from any directory.
All scripts have a new main() function that accepts a list of arguments. This enables testing and a single wrapper script in the future.
validator.py now returns exit code 1 if any exceptions occur or if the counts don't match.
The method sync_line_protocol_to_storage has been added to s3_utils.py and allows line protocol files to be downloaded to a local directory. This is crucial for integration tests to work.
A test_scripts directory has been added, with scripts that set up a local InfluxDB Docker container for testing.
timestream-export-logs added to .gitignore.
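
The `main()` entry point and the validator exit-code behaviour listed above could look roughly like the sketch below. This is not the code from this PR: the `--expected-count` and `--actual-count` options and the `counts_match` helper are hypothetical stand-ins for the real validation logic. The point is only that a `main()` taking an argument list can be called in-process by a test, and that its return value can serve as the exit code.

```python
import argparse
import sys
from typing import List, Optional


def counts_match(expected: int, actual: int) -> bool:
    # Hypothetical stand-in for the real validation logic.
    return expected == actual


def main(argv: Optional[List[str]] = None) -> int:
    """Parse argv (argparse falls back to sys.argv[1:] when it is None)
    and return a process exit code: 0 on success, 1 on failure."""
    parser = argparse.ArgumentParser(description="Sketch of a testable entry point")
    parser.add_argument("--expected-count", type=int, required=True)
    parser.add_argument("--actual-count", type=int, required=True)
    args = parser.parse_args(argv)

    try:
        ok = counts_match(args.expected_count, args.actual_count)
    except Exception as exc:
        # Any exception during validation is reported and mapped to exit code 1.
        print(f"Validation failed: {exc}", file=sys.stderr)
        return 1
    return 0 if ok else 1


# A script would typically end with:
#     if __name__ == "__main__":
#         sys.exit(main())
```

A test can then call `main(["--expected-count", "3", "--actual-count", "3"])` directly and assert on the returned code, without spawning a subprocess.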

Refining structure for migration scripts

Changes:

- The cardinality calculation script has been added.
  - `main.py` and `cardinality.py` have been combined into a single `cardinality.py` file.
  - A README has been added for the script.
- `timestream_utils.py` has been updated.
  - The class `timestreamUtility` has been renamed to `TimestreamUtility`.
  - Static functions for validating dimension, database, and table names have been added to `TimestreamUtility`.
  - The function `comma_separated_list` has been added to `TimestreamUtility`.
  - Broken `utils` imports in `timestream_utils.py` have been fixed.
  - `timestream_client` has been renamed to `timestream_write_client`, for clarity.
  - On line 105, the undefined variable `parition_count` has been corrected to `partition_count`.
  - `timestream_utils.py` has been reformatted with `ruff`.
- `s3_utils.py` has been updated.
  - The broken `utils` import has been fixed.
  - The class `s3Utility` has been renamed to `S3Utility`.
- `unload.py` has been updated.
  - References to `s3Utility` and `timestreamUtility` have been updated.
  - The wording of various CLI options has been fixed.
  - All CLI options use `-` instead of `_` between words.
  - The `datetime` object is now used properly.
  - The file has been reformatted with `ruff`.
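
The commit message does not show how `comma_separated_list` is implemented. One plausible shape, sketched here as a module-level function rather than a member of `TimestreamUtility`, is an `argparse` `type` callable. The `--tables` option and the rule that empty items are rejected are assumptions made for illustration.

```python
import argparse
from typing import List


def comma_separated_list(value: str) -> List[str]:
    """Split 'a, b,c' into ['a', 'b', 'c'], rejecting empty items."""
    items = [item.strip() for item in value.split(",")]
    if not all(items):
        raise argparse.ArgumentTypeError(f"empty item in list: {value!r}")
    return items


parser = argparse.ArgumentParser()
# Hypothetical option name, used only to demonstrate the type callable.
parser.add_argument("--tables", type=comma_separated_list, default=[])
args = parser.parse_args(["--tables", "cpu, mem,disk"])
print(args.tables)  # ['cpu', 'mem', 'disk']
```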

* Initial commit for integrating ingestion script.
* Fixing ingestion_test naming for ingestion script.
* Removing unnecessary deprecation warning removal.
* Fixing dead references to tests.
* Update to using python3 for virtual environment.
* Adding directories to the appropriate migration scripts in parent README.
* Configuring docker to delete influx image after successful execution. Minor documentation revisions.
* Adding test for resume functionality.

---------

Signed-off-by: forestmvey <[email protected]>

Adding validation script

Changes:

- `transform.py` has been added, allowing already-unloaded Timestream for LiveAnalytics data to be transformed into line protocol using Athena.
- `athena_utils.py` has been added.
- `TimestreamUtility` now has a wrapper function for `list_tables`.
- `S3Utility` now has the method `s3_bucket_exists` for checking the existence of an S3 bucket.
- The region for the `S3Utility` constructor defaults to `None`, causing the AWS configured default region to be used. An `S3Utility` object can now be created simply with `S3Utility()`.
- A single `requirements.txt` file has been added to `liveanalytics_migration_scripts/` and is intended to be used for all scripts within.
- All other `requirements.txt` files have been removed.
- The path where data would be unloaded by `unload.py` used to include a space, for example `unload-2025-05-07 22:31:43`. This space has been replaced with `-`.
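
The space in the old unload path is what you get when a `datetime` is converted with `str()`. A minimal sketch of building the prefix without the space follows. The `unload_prefix` helper is hypothetical, and whether the real script keeps the colons in the time portion is not stated in the commit message.

```python
from datetime import datetime


def unload_prefix(now: datetime) -> str:
    # Hypothetical helper. str(now) would yield 'YYYY-MM-DD HH:MM:SS' (with a
    # space); an explicit format string puts '-' between the date and time.
    return "unload-" + now.strftime("%Y-%m-%d-%H:%M:%S")


print(unload_prefix(datetime(2025, 5, 7, 22, 31, 43)))  # unload-2025-05-07-22:31:43
```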

… README (#49)

* Initial commit for migration README, starting timestream for InfluxDB migration README.
* Initial version of the Timestream for InfluxDB migration README.
* Minor revisions to workflow stages and updating script names and parameters.

---------

Signed-off-by: forestmvey <[email protected]>

Changes:

- Add `--start-time` and `--end-time` to `cardinality.py`.
- Fix typos in `validation/README.md`.
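
For readers unfamiliar with the pattern, a time-range pair like `--start-time` and `--end-time` can be parsed and sanity-checked with `argparse` along these lines. This is a sketch under assumptions: the timestamp format is borrowed from the `unload.py` invocation quoted later in this conversation, and the check that the start precedes the end is an illustration, not something the commit message confirms for `cardinality.py`.

```python
import argparse
from datetime import datetime

# Assumed format, matching the '2020-01-01 11:11:11' style used with unload.py.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_time(value: str) -> datetime:
    """argparse type callable that converts a timestamp string to a datetime."""
    try:
        return datetime.strptime(value, TIME_FORMAT)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected '{TIME_FORMAT}', got {value!r}")


def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--start-time", type=parse_time, default=None)
    parser.add_argument("--end-time", type=parse_time, default=None)
    args = parser.parse_args(argv)
    # Illustrative check: reject an empty or reversed range.
    if args.start_time and args.end_time and args.start_time >= args.end_time:
        parser.error("--start-time must be earlier than --end-time")
    return args
```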

Consolidating installation docs, minor cleanup

…eam for InfluxDB (awslabs#213)

* Initial structure for Timestream to InfluxDB migration

  Refining structure for migration scripts

* Add cardinality calculation script (#45)

  Changes:
  - The cardinality calculation script has been added.
    - `main.py` and `cardinality.py` have been combined into a single `cardinality.py` file.
    - A README has been added for the script.
  - `timestream_utils.py` has been updated.
    - The class `timestreamUtility` has been renamed to `TimestreamUtility`.
    - Static functions for validating dimension, database, and table names have been added to `TimestreamUtility`.
    - The function `comma_separated_list` has been added to `TimestreamUtility`.
    - Broken `utils` imports in `timestream_utils.py` have been fixed.
    - `timestream_client` has been renamed to `timestream_write_client`, for clarity.
    - On line 105, the undefined variable `parition_count` has been corrected to `partition_count`.
    - `timestream_utils.py` has been reformatted with `ruff`.
  - `s3_utils.py` has been updated.
    - The broken `utils` import has been fixed.
    - The class `s3Utility` has been renamed to `S3Utility`.
  - `unload.py` has been updated.
    - References to `s3Utility` and `timestreamUtility` have been updated.
    - The wording of various CLI options has been fixed.
    - All CLI options use `-` instead of `_` between words.
    - The `datetime` object is now used properly.
    - The file has been reformatted with `ruff`.

* Adding InfluxDB ingestion script (#46)

  * Initial commit for integrating ingestion script.
  * Fixing ingestion_test naming for ingestion script.
  * Removing unnecessary deprecation warning removal.
  * Fixing dead references to tests.
  * Update to using python3 for virtual environment.
  * Adding directories to the appropriate migration scripts in parent README.
  * Configuring docker to delete influx image after successful execution. Minor documentation revisions.
  * Adding test for resume functionality.

  Signed-off-by: forestmvey <[email protected]>

* Add migration validation script (#47)

  Adding validation script

* Add transform script (#48)

  Changes:
  - `transform.py` has been added, allowing already-unloaded Timestream for LiveAnalytics data to be transformed into line protocol using Athena.
  - `athena_utils.py` has been added.
  - `TimestreamUtility` now has a wrapper function for `list_tables`.
  - `S3Utility` now has the method `s3_bucket_exists` for checking the existence of an S3 bucket.
  - The region for the `S3Utility` constructor defaults to `None`, causing the AWS configured default region to be used. An `S3Utility` object can now be created simply with `S3Utility()`.
  - A single `requirements.txt` file has been added to `liveanalytics_migration_scripts/` and is intended to be used for all scripts within.
  - All other `requirements.txt` files have been removed.
  - The path where data would be unloaded by `unload.py` used to include a space, for example `unload-2025-05-07 22:31:43`. This space has been replaced with `-`.

* Add LiveAnalytics migration readme and Timestream for InfluxDB target README (#49)

  * Initial commit for migration README, starting timestream for InfluxDB migration README.
  * Initial version of the Timestream for InfluxDB migration README.
  * Minor revisions to workflow stages and updating script names and parameters.

  Signed-off-by: forestmvey <[email protected]>

* additional rebase cleanup

* Consolidating env, fixing links

* Add time range to cardinality script (#52)

  Changes:
  - Add `--start-time` and `--end-time` to `cardinality.py`.
  - Fix typos in `validation/README.md`.

* Consolidating installation docs, minor cleanup (#54)

  Consolidating installation docs, minor cleanup

* restructuring

---------

Signed-off-by: forestmvey <[email protected]>
Co-authored-by: Trevor Bonas <[email protected]>
Co-authored-by: Forest Vey <[email protected]>

forestmvey · 2025-05-22T15:50:06Z

tools/python/liveanalytics_migration_scripts/unload/__init__.py

@@ -0,0 +1 @@
+from .unload import *


I am having issues executing any of the scripts using the non-module method:

python3 unload/unload.py --export-table --database InfluxDBMetrics --table boltdb_reads_total --start-time '2020-01-01 11:11:11' --enable-dynamodb-logger true
Traceback (most recent call last):
  File "/Users/.../liveanalytics_migration_scripts/unload/unload.py", line 11, in <module>
    from unload.utils.logger_utils import create_logger
  File "/Users/.../liveanalytics_migration_scripts/unload/unload.py", line 11, in <module>
    from unload.utils.logger_utils import create_logger
ModuleNotFoundError: No module named 'unload.utils'; 'unload' is not a package

I am however able to run the scripts as modules:

python3 -m unload.unload --export-table --database InfluxDBMetrics --table boltdb_reads_total --start-time '2020-01-01 11:11:11' --enable-dynamodb-logger true

trevorbonas · 2025-05-28

This should be fixed now. All scripts can be executed directly, from any directory. Let me know if you still have any issues.
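
For context on the traceback above: when a file is run as `python3 unload/unload.py`, Python puts the script's own directory (`unload/`) at the front of `sys.path`, so `import unload` resolves to `unload.py` itself rather than to the `unload` package, hence "'unload' is not a package". One common way to make package-absolute imports work for direct execution is to put the project root on `sys.path` before those imports run. The sketch below shows that general technique only; the thread does not show how the fix was actually implemented, and `ensure_on_sys_path` is a hypothetical helper.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(script_file: str, levels_up: int = 1) -> str:
    """Move the directory `levels_up` levels above the script's own directory
    to the front of sys.path and return it."""
    root = str(Path(script_file).resolve().parents[levels_up])
    if root in sys.path:
        sys.path.remove(root)
    sys.path.insert(0, root)
    return root


# Hypothetical usage at the top of unload/unload.py, before any
# `from unload...` import:
#     ensure_on_sys_path(__file__)
```

With the project root first on `sys.path`, `unload` is found as a package and `unload.utils` can be imported whether the script is run directly or with `python3 -m unload.unload`.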

fredjoonpark · 2025-05-22T17:26:03Z

tools/python/liveanalytics_migration_scripts/README.md

@@ -81,3 +81,40 @@ Create a virtual environment using `venv` and install required dependencies.
    - [Timestream for InfluxDB](./targets/timestream_for_influxdb/README.md)
    - [RDS for PostgreSQL](./targets/rds_for_postgresql/README.md)

+## Testing


Thoughts on moving this testing section out into its own readme in the tests directory?

Also, I think it might make more sense to separate the integration tests for the common utilities (cardinality, unload) from the migration integration tests. Then we could add end-to-end tests that combine unload + migration. This kind of structure makes it easier for me to understand which flow is being tested, and it distinguishes integration/functional tests from end-to-end/smoke tests.

fredjoonpark · 2025-05-22

For additional context on how I'm defining integration and end-to-end tests: something like `integ/common.py` (which tests cardinality and unload) and `integ/timestream_for_influxdb` (which tests transform + ingestion + validation), and in the future maybe `integ/postgres`. For end-to-end, have `e2e/la_to_influx` (cardinality + unload + transform/ingestion/validation) and maybe `e2e/la_to_postgres`. This is more a suggestion than anything, and I'm happy to discuss further. Hopefully it will make it easier for us to think about how to design the wrapper script, if it comes to that.

trevorbonas · 2025-05-22

Sure, I can create a `README.md` in `tests/`.

We can separate tests by creating multiple test cases. Currently, there is only one, MigrationTest.

trevorbonas · 2025-05-28

I have created a `README.md` for testing, and the tests have now been separated into test cases.
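
As a rough illustration of what "separated into test cases" can mean, the sketch below uses `unittest` with a shared base class. `BaseIntegrationTestCase` is a name that appears in this PR's commit list, but its contents here are assumed; `CommonUtilsTest`, `InfluxDBMigrationTest`, and the scratch-directory fixture are hypothetical and only mirror the common-utils versus migration split suggested in the review.

```python
import shutil
import tempfile
import unittest


class BaseIntegrationTestCase(unittest.TestCase):
    """Shared fixture (assumed): each test class gets its own scratch directory."""

    @classmethod
    def setUpClass(cls):
        cls.work_dir = tempfile.mkdtemp(prefix="migration-test-")

    @classmethod
    def tearDownClass(cls):
        shutil.rmtree(cls.work_dir, ignore_errors=True)


class CommonUtilsTest(BaseIntegrationTestCase):
    def test_unload(self):
        # Placeholder: a real test would call the unload script's main([...]).
        self.assertTrue(self.work_dir)


class InfluxDBMigrationTest(BaseIntegrationTestCase):
    def test_transform_ingest_validate(self):
        # Placeholder for the transform + ingestion + validation flow.
        self.assertTrue(self.work_dir)


def run_suite() -> unittest.TestResult:
    """Load both test cases into one suite and run them."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (CommonUtilsTest, InfluxDBMigrationTest):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping each flow in its own `TestCase` lets a runner select one flow at a time, which is the distinction the review comment asks for.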

bobigbal and others added 30 commits May 2, 2025 21:45

migration scripts

131e59b

fixing images

0d90054

fixing images-1

80df8ae

fixing images-2

ad95e52

minor changes in README

dceaa1f

Delete tools/python/migration-timestream-s3 directory

21bfd72

minor changes in README

f384c9e

minor changes in README

3de2b07

fix syntax error

92c0c2e

additional validations

00fb6de

adding order_by_asc arg

4033cad

minor change

c509683

postgres ingestion

1e5111d

postgres ingestion 1

5420ebc

Initial structure for Timestream to InfluxDB migration

f6f2c9c

Refining structure for migration scripts

Add migration validation script (#47)

66987bd

Adding validation script

additional rebase cleanup

65c813b

Consolidating env, fixing links

884c60e

Add time range to cardinality script (#52)

9d1b58d

Changes:

- Add `--start-time` and `--end-time` to `cardinality.py`.
- Fix typos in `validation/README.md`.

Consolidating installation docs, minor cleanup (#54)

afbd7e3

Consolidating installation docs, minor cleanup

restructuring

d58be01

Remove unused imports in unload.py

613c851

Fix escaped help message in unload.py

7a21b91

Merge branch 'migration-scripts' of github.com:awslabs/amazon-timestream-tools into dev-migration-scripts-integration-tests

8178ef1

Improve failed Athena query error message

807d323

bobigbal and others added 16 commits May 15, 2025 22:24

changes in unload

8e3a4d1

changes in unload

ffd94d3

Fixing codeql scans. (awslabs#220)

40d20aa

Signed-off-by: forestmvey <[email protected]>

addr comments (awslabs#221)

2281eff

Allow S3 bucket paths to be flexible

66107a3

Check s3_lp_output_path before calling rstrip

67e9fe6

Update README with new options

5531034

Update Timestream for InfluxDB target README

08ddf3e

Remove s3:// from S3 bucket paths as early as possible

2ad59d6

Merge branch 'migration-scripts' of github.com:awslabs/amazon-timestream-tools into dev-migration-scripts-integration-tests

52a4c16

Merge branch 'dev-transform-bucket-parsing' of github.com:Bit-Quill/amazon-timestream-tools into dev-migration-scripts-integration-tests

193963a

Add basic tests

735e449

Merge branch 'mainline' of github.com:awslabs/amazon-timestream-tools into dev-migration-scripts-integration-tests

01f66ba

Add first version of basic test

3eb7f3d

Remove export logs

10d43f7

Add more test cases

4249bd2

trevorbonas marked this pull request as ready for review May 20, 2025 22:33

Remove empty __init__.py

119fff8

forestmvey reviewed May 21, 2025

tools/python/liveanalytics_migration_scripts/test_scripts/influxdb-restart.sh (outdated)

forestmvey reviewed May 21, 2025

tools/python/liveanalytics_migration_scripts/pytest.ini

forestmvey reviewed May 21, 2025

tools/python/liveanalytics_migration_scripts/integration_test.py (outdated)

trevorbonas and others added 2 commits May 21, 2025 09:38

Update tools/python/liveanalytics_migration_scripts/test_scripts/influxdb-restart.sh

f11edfb

Co-authored-by: Forest Vey <[email protected]>

Add silence_cleanup_logging flag to integration_test.py

11173d4

forestmvey reviewed May 22, 2025

tools/python/liveanalytics_migration_scripts/integration_test.py (outdated)

forestmvey reviewed May 22, 2025

Move integration_test.py into tests/integration

1112aa6

fredjoonpark reviewed May 22, 2025

trevorbonas added 3 commits May 27, 2025 17:12

Reorganize test files

0c838c2

Add documentation for testing

dc14e3e

Fix BaseIntegrationTestCase comment

a7b79ca
