Releases: aws/aws-sdk-pandas

AWS Data Wrangler 1.6.2

01 Jul 19:39

Enhancements

  • wr.s3.to_parquet() now casts columns before appending to an existing table only when necessary.
  • Add a retry mechanism for InternalError on S3 object deletion.
  • Add handling of immutable NumPy arrays (flags.writeable == False).
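
The retry behaviour mentioned above can be pictured with a small sketch. This is not the library's implementation: the `InternalError` class, the `delete_with_retry` helper, and the backoff values are all hypothetical names chosen for illustration, and exponential backoff is an assumed policy.

```python
import time


class InternalError(Exception):
    """Hypothetical stand-in for a transient InternalError returned by S3."""


def delete_with_retry(delete_fn, keys, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call delete_fn(keys), retrying with exponential backoff on InternalError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return delete_fn(keys)
        except InternalError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Wait base_delay, then twice that, then four times that, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting.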

P.S. The Lambda Layer zip file and the Glue wheel/egg are available below. Just upload them and run!

P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so Glue PySpark is not supported for now (only Glue Python Shell).

AWS Data Wrangler 1.6.1

26 Jun 02:55

Enhancements

  • Support casting any column type to string through the dtype argument of wr.s3.to_parquet().

Bug Fix

  • Fix general bugs related to the Athena cache. 🐞

Docs

  • General small updates.

AWS Data Wrangler 1.6.0

24 Jun 19:26

New Functionalities

  • Amazon Athena CACHE 🚀 #285
  • Initial AWS STS module
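
As a rough picture of what a query cache does, the sketch below reuses a previous result when the same SQL text ran recently. It is a toy model under stated assumptions, not the library's code: `QueryCache`, its `run` method, and the age threshold are illustrative names, and matching on whitespace-normalized SQL text is an assumed strategy.

```python
import time


class QueryCache:
    """Toy cache: reuse a previous result if the same SQL ran recently."""

    def __init__(self, max_cache_seconds, clock=time.time):
        self.max_cache_seconds = max_cache_seconds
        self._clock = clock
        self._entries = {}  # normalized SQL -> (finished_at, result)

    @staticmethod
    def _normalize(sql):
        # Collapse whitespace and drop a trailing semicolon so that
        # trivially different strings map to the same cache key.
        return " ".join(sql.split()).rstrip(";").strip()

    def run(self, sql, execute):
        key = self._normalize(sql)
        hit = self._entries.get(key)
        now = self._clock()
        if hit is not None and now - hit[0] <= self.max_cache_seconds:
            return hit[1]  # recent enough: skip re-running the query
        result = execute(sql)
        self._entries[key] = (self._clock(), result)
        return result
```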

Enhancements

  • Numpy 1.19.0
  • Add auto_create and db_groups arguments to get_redshift_temp_engine #288
  • Add validate_schema argument to wr.s3.read_parquet_table
  • Add safe argument to read_parquet #296
  • Refactor naming of pandas kwargs #291
  • Allow providing suffix to s3.store_parquet_metadata #295
  • Add last_modified_begin and last_modified_end to list_objects, read_csv, read_json, read_fwf and read_parquet
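
The last-modified filter in the final item above amounts to keeping only objects whose timestamp falls inside a window. A minimal sketch, assuming an inclusive window and a plain list of `(key, datetime)` pairs in place of a real S3 listing (the helper name `filter_by_last_modified` is hypothetical):

```python
from datetime import datetime, timezone


def filter_by_last_modified(objects, last_modified_begin=None, last_modified_end=None):
    """Return keys whose timestamp lies in the inclusive [begin, end] window.

    A bound left as None is treated as open on that side.
    """
    selected = []
    for key, modified in objects:
        if last_modified_begin is not None and modified < last_modified_begin:
            continue
        if last_modified_end is not None and modified > last_modified_end:
            continue
        selected.append(key)
    return selected


# Example listing with timezone-aware timestamps (made-up keys).
listing = [
    ("data/a.csv", datetime(2020, 6, 1, tzinfo=timezone.utc)),
    ("data/b.csv", datetime(2020, 6, 15, tzinfo=timezone.utc)),
    ("data/c.csv", datetime(2020, 7, 1, tzinfo=timezone.utc)),
]
```

Whether the library treats the bounds as inclusive is not stated in these notes, so check its documentation before relying on edge cases.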

Bug Fix

  • Fix bug in get_table_description on tables without a description #294

Docs

Thanks

We thank the following contributors/users for their work on this release:

@koiker, @patrick-muller, @flaviomax, @acere, @jarretg, @bryanyang0528, @schrobot, @kinghuang, @igorborgest.


AWS Data Wrangler 1.5.0

14 Jun 01:39
01b7bef

New Functionalities

  • Amazon QuickSight support! 🎉
  • Add create/delete database on wr.glue

Enhancements

  • General improvements in the tutorials
  • New Amazon S3 path check
  • Add sanitize_columns arg for s3.to_parquet and s3.to_csv #278 #279
  • Remove memory copy of DataFrame for to_parquet and to_csv

Bug Fix

  • Force index=False for wr.db.to_sql() with Redshift

Thanks

We thank the following contributors/users for their work on this release:

@ywang103, @patrick-muller, @tuliocasagrande, @sarojdongol, @sdknij, @ilyanoskov, @igorborgest.


AWS Data Wrangler 1.4.0

02 Jun 22:58
0dcda71

New Functionalities

  • Add support for reading CSV, JSON and FWF partitions. #265
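
Reading partitions typically means recovering column values from the object path. The sketch below assumes Hive-style `key=value` path segments, which these notes do not spell out, and uses a hypothetical helper name and made-up paths:

```python
def parse_partitions(path, root):
    """Extract key=value partition segments found between root and the file name."""
    relative = path[len(root):] if path.startswith(root) else path
    # Drop the final segment, which is the file name itself.
    segments = relative.strip("/").split("/")[:-1]
    partitions = {}
    for segment in segments:
        key, sep, value = segment.partition("=")
        if sep and key:  # ignore segments that are not key=value
            partitions[key] = value
    return partitions
```

For a path such as `s3://bucket/data/year=2020/month=06/part-0.csv` under the root `s3://bucket/data/`, this yields the partition values as strings, which a reader could then attach as extra columns.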

Enhancements

  • General improvement of moto tests

Bug Fix

  • Fix encoding arg support for reading CSV, JSON and FWF. #271

Thanks

We thank the following contributors/users for their work on this release:

@bryanyang0528, @dwbelliston, @patrick-muller, @sdknij, @igorborgest.


AWS Data Wrangler 1.3.0

28 May 01:33
57b6133

New Functionalities

  • Support for Athena Partition Projection [TUTORIAL]

Enhancements

  • Bumping SQLAlchemy version to 1.3.15 #259
  • General improvement of moto tests #254

Bug Fix

  • Fix dtype (cast) on wr.s3.to_parquet with nested types #263
  • Fix EMR utilities for regions other than us-east-1 #252
  • Fix wr.s3.to_parquet for partitions in reverse order #264

Thanks

We thank the following contributors/users for their work on this release:

@bryanyang0528, @zachmoshe, @buseynehannes, @jiajie999, @igorborgest.


AWS Data Wrangler 1.2.0

20 May 12:08
34608fa

New Functionalities

Enhancements

Bug Fix

Thanks

We thank the following contributors/users for their work on this release:

@mrshu, @bryanyang0528, @JPFrancoia, @jaidisido, @qemtek, @dwbelliston, @mbiemann, @parasml, @BrainMonkey, @hyperloglog, @igorborgest.


AWS Data Wrangler 1.1.2

08 May 21:11
85b00d5

New Functionalities

  • Add support for uint8, uint16, uint32 and uint64 on Parquet. #76
  • Add get_table_parameters, upsert_table_parameters and overwrite_table_parameters on wr.catalog. #224

Enhancements

  • Add readahead cache for s3fs.

Bug Fix

  • Fixing type hints for sortkey. #226
  • Fix s3.to_parquet overwriting with different partition schema.

Thanks

We thank the following contributors/users for their work on this release:

@robertaves, @jar-no1, @JPFrancoia, @igorborgest.


AWS Data Wrangler 1.1.1

06 May 19:12

Bug Fix

  • Removing objects ending with "/" from wr.s3.list_objects()
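
The effect of this fix can be illustrated with a one-line filter. This is only a sketch of the idea, using a hypothetical helper name and made-up keys, and it assumes that keys ending with "/" are the zero-byte "folder" placeholders that some tools create in S3:

```python
def drop_folder_placeholders(keys):
    """Keep only keys that name real objects, not "folder" placeholders."""
    return [key for key in keys if not key.endswith("/")]


keys = ["s3://bucket/data/", "s3://bucket/data/a.csv", "s3://bucket/data/sub/"]
files = drop_folder_placeholders(keys)
```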

AWS Data Wrangler 1.1.0

05 May 20:12

New Functionalities

  • Support for nested arrays and structs on wr.s3.to_parquet() #206
  • Support for Read Parquet/Athena/Redshift chunked by number of rows #192
  • Add custom_classifications to wr.emr.create_cluster() #193
  • Support for Docker on EMR #193
  • Add kms_key_id, max_file_size, region arguments to wr.db.unload_redshift() #197
  • Add catalog_versioning argument to wr.s3.to_csv() and wr.s3.to_parquet() #198
  • Add keep_files and ctas_temp_table_name arguments to wr.athena.read_sql_*() #203
  • Add replace_filenames argument to wr.s3.copy_objects() #215
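
Chunking by number of rows, as in the second item above, means yielding fixed-size batches instead of one large result. A minimal pure-Python sketch of the idea, with plain lists standing in for DataFrames and a hypothetical helper name:

```python
from itertools import islice


def iter_chunks(rows, chunksize):
    """Yield successive lists of at most chunksize rows from any iterable."""
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return  # source exhausted
        yield chunk
```

Because the source is consumed lazily, only one chunk needs to be held in memory at a time; the last chunk may be shorter than the rest.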

Enhancements

  • wr.s3.to_csv() and wr.s3.to_parquet() no longer need delete table permission to overwrite catalog table #198
  • Added support for UUID on wr.db.read_sql_query() (PostgreSQL) #200
  • Refactoring of Athena encryption and workgroup support #212

Bug Fix

  • Support for reading columns that are entirely NULL from PostgreSQL, MySQL, and Redshift #218

Thanks

We thank the following contributors/users for their work on this release:

@robkano, @luigift, @parasml, @OElesin, @jar-no1, @keatmin, @pmleveque, @sapientderek, @jadayn, @igorborgest.

