Releases: aws/aws-sdk-pandas

AWS Data Wrangler 0.2.1

09 Jan 18:31

Enhancements

  • Support for empty DataFrames in Pandas.read_sql_athena(ctas_approach=True)
  • Clean up temporary S3 files created by Pandas.read_sql_athena(ctas_approach=True)
  • Swap the order of the file format and file compression extensions in the key suffix (Hadoop/Spark/Hive compatibility)
  • Aurora ingestion revisited
  • Bumping dependency versions
  • Add Pandas.read_csv_prefix()
  • Improve Athena._normalize_name() rules
  • Improving autocomplete support
  • Simplifying everything on SageMaker
  • Adding Glue.get_connection()
  • Adapt Pandas.read_sql_athena(ctas_approach=True) to handle S3 eventual consistency caveats
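
The notes above mention improved Athena._normalize_name() rules without listing them. As a rough illustration only, here is a minimal sketch of what name normalization for Athena-friendly identifiers can look like. The function normalize_name and its rules (strip accents, lowercase, replace anything outside a-z, 0-9 and underscore) are assumptions made for this example, not the library's actual implementation.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Hypothetical normalizer, NOT the library's Athena._normalize_name()."""
    # Decompose accented characters and drop the combining marks ("ç" -> "c").
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Lowercase the identifier (assumed rule for this sketch).
    lowered = without_accents.lower()
    # Replace every run of characters outside [a-z0-9_] with one underscore.
    return re.sub(r"[^a-z0-9_]+", "_", lowered)


print(normalize_name("Order ID"))     # order_id
print(normalize_name("Preço-Total"))  # preco_total
```

The real rules may differ (for example in how they treat camelCase or leading digits), so check the library source before relying on any specific output.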

Bugfixes

  • Fixing bug when fetching Glue table comments
  • Fixing Spark for default Session

Docs

  • Add athena_nested.ipynb tutorial
  • Add catalog_and_metadata.ipynb tutorial

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

P.P.S. Never used Layers before? Check the step-by-step guide.

P.P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so Glue PySpark is not supported for now (only Glue Python Shell).

AWS Data Wrangler 0.2.0

03 Jan 00:25

Enhancements

  • Add description, parameters, and column comments as arguments to all methods that create Glue tables (metadata).
  • Add several methods to explore the Glue Catalog.

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

P.P.S. Never used Layers before? Check the step-by-step guide.

P.P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so Glue PySpark is not supported for now (only Glue Python Shell).

AWS Data Wrangler 0.1.4

31 Dec 12:21

Enhancements

  • Pandas -> Aurora (MySQL/PostgreSQL) (Append/Overwrite) (Via S3)
  • Aurora -> Pandas (MySQL) (Via S3)
  • Aurora -> CSV (S3) (MySQL)
  • Smaller lambda layers

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

P.P.S. Never used Layers before? Check the step-by-step guide.

P.P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so Glue PySpark is not supported for now (only Glue Python Shell).

AWS Data Wrangler 0.1.3

20 Dec 22:04

Bugfixes

  • Fix Default Session bug for environments without credentials

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

AWS Data Wrangler 0.1.1

20 Dec 15:07

Enhancements

  • Pandas to Redshift with upsert mode
  • Load SageMaker Job outputs
  • Default Session
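
These notes do not describe how the upsert mode is implemented. For readers unfamiliar with the idea, the sketch below shows the common staging-table pattern (load new rows into a staging table, delete the matching keys from the target, then insert). It uses Python's built-in sqlite3 as a stand-in for Redshift, and the table names, columns, and the helper upsert_via_staging are all hypothetical.

```python
import sqlite3


def upsert_via_staging(conn, rows):
    """Staging-table upsert: delete matching keys, then insert the new rows."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("CREATE TEMP TABLE staging (id INTEGER, value TEXT)")
        conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)
        # Remove target rows that are about to be replaced.
        conn.execute("DELETE FROM target WHERE id IN (SELECT id FROM staging)")
        # Insert both the replacements and the brand-new rows.
        conn.execute("INSERT INTO target SELECT id, value FROM staging")
        conn.execute("DROP TABLE staging")


# Demo with an in-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "old"), (2, "keep")])
conn.commit()

upsert_via_staging(conn, [(1, "new"), (3, "added")])
result = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
print(result)  # [(1, 'new'), (2, 'keep'), (3, 'added')]
```

Row 1 is replaced, row 2 is untouched, and row 3 is added, which is the behavior an upsert is expected to have regardless of the engine.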

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

AWS Data Wrangler 0.1.0

13 Dec 12:59

Enhancements

  • Read Parquet tables from Glue Catalog directly to Pandas DataFrame
  • Read Athena's results to Pandas DataFrame via CTAS (Blazing fast 🚀)
  • Export Redshift's results to S3 as Parquet
  • Read Redshift's results to Pandas DataFrame via Parquet export (Blazing fast 🚀)

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

AWS Data Wrangler 0.0.25

07 Dec 17:05

Enhancements

  • Read Parquet data from S3 directly to Pandas DataFrame #73

Bugfixes

  • Fix Pandas.read_sql_athena() usage with the Session() default s3_output

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

AWS Data Wrangler 0.0.24

05 Dec 10:50

Enhancements

  • Add support for Decimal data type #58
  • Add more Athena settings to Session() (defaults)
  • Add a PyArrow toggle option to EMR.create_cluster()

Bugfixes

  • Fix Pandas.read_sql_athena() issues with array data types #72

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

AWS Data Wrangler 0.0.23

23 Nov 02:25

Enhancements

  • Improving casting of date columns

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!

AWS Data Wrangler 0.0.22

22 Nov 21:26
8e5853b

Bugfixes

  • Setting null date values as None for Pandas.read_sql_athena() #69
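
To illustrate the kind of handling this fix refers to, here is a minimal sketch of parsing a date field so that missing values become None instead of raising an error. It assumes null dates arrive as empty strings (or None) and that dates use the YYYY-MM-DD format. The helper parse_date_or_none is hypothetical and is not the library's code.

```python
from datetime import date, datetime
from typing import Optional


def parse_date_or_none(raw: Optional[str]) -> Optional[date]:
    """Return a date for 'YYYY-MM-DD' input, or None for a missing value."""
    # Treat None, empty, and whitespace-only strings as null (assumption).
    if raw is None or raw.strip() == "":
        return None
    return datetime.strptime(raw.strip(), "%Y-%m-%d").date()


print(parse_date_or_none("2019-11-22"))  # 2019-11-22
print(parse_date_or_none(""))            # None
```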

P.S. Lambda Layer's bundle and Glue's wheel/egg are available below. Just upload it and run!