Modin 0.17.0

@mvashishtha mvashishtha released this 11 Nov 14:45
e50cec1

This release includes support for pyhdk 0.2. It also includes many bug fixes and some performance enhancements.

Key Features and Updates Since 0.16.0

  • Stability and Bugfixes
    • FIX-#3764: Ensure df.loc with a scalar out of bounds appends to df (#3765)
    • FIX-#4016, FIX-#4086, FIX-#4039: Fall back to pandas in case of duplicate column names (#4896)
    • FIX-#4023: Fall back to pandas in case of MultiIndex columns (#5149)
    • FIX-#4660: Fix fillna when Modin series object is an argument (#4674)
    • FIX-#5034: Handle lists in df.get() (#5035)
    • FIX-#5097: Stop using deprecated mangle_dup_cols. (#5104)
    • FIX-#5098: Stop using append internally. (#5100)
    • FIX-#5099: Fix PandasQueryCompiler.groupby_mean with timestamp in by (#5140)
    • FIX-#5112: Allow empty partitions to be passed into query_compiler.dt_prop_map (#5133)
    • FIX-#5128: Fix reading parquet directory from s3. (#5129)
    • FIX-#5150: Sync row labels after read_csv when index_col is False (#5151)
    • FIX-#5158: Synchronize metadata before to_parquet (#5161)
    • FIX-#5168: module 'collections' has no attribute 'Sequence' in dataframe protocol (#5169)
    • FIX-#5174: Pin xgboost < 1.7. (#5175)
    • FIX-#5180: Do not set OMP_NUM_THREADS=1 on modin.pandas init (#5181)
    • FIX-#5184: Fix get_dummies to respect passed columns to be encoded (#5185)
    • FIX-#5188: Fix getitem_bool when the key is Series with empty partition (#5189)
    • FIX-#5206: pin mypy<0.990 (#5207)
    • FIX-#5208: pin ray version under 2.1.0 (#5209)
  • Performance enhancements
    • PERF-#5029: Don't use _compute_axis_labels_and_lengths for computing _row_lengths/_column_widths (#5030)
    • PERF-#5087: use cache for widths/lengths/index/columns if possible (#5031)
    • PERF-#5162: precompute new row/column lengths in '._reorder_labels' (#5144)
  • Refactor Codebase
    • REFACTOR-#4631: Add mypy checks for modin.distributed (#5109)
    • REFACTOR-#5079: Add mypy checks for modin.core.dataframe.base (#5110)
    • REFACTOR-#5092: Fix future warning for set_axis function (#5093)
  • Update testing suite
    • TEST-#4982: Require format for PR descriptions instead of commit descriptions (#5117)
    • TEST-#5124: Disable codecov comments. (#5125)
    • TEST-#5135: Return CI back after accidental removal (#5136)
    • TEST-#5172: Add fuzzydata logs to artifacts (#5173)
  • Benchmarking enhancements
    • BENCH: add some cases for join and merge ops from pandas (#5021)
    • TEST-#5102: Add HDK benchmarks to github workflows (#5063)
  • Documentation improvements
    • DOCS-#3634: Fix examples related to ProgressBar usage (#5119)
    • DOCS-#5019: Update HDK on native documentation (#5088)
    • DOCS-#5095: Remove release note checkbox from PR template (#5096)
    • DOCS-#5105: Update release procedure (#5106)
  • New Features
    • FEAT-#5120: Update to pyhdk 0.2 (#5121)
    • FEAT-#5141: Implement 2D insertion of Modin DFs in .__setitem__ (#5142)
    • FEAT-#5145: Upgrade pandas to 1.5.1 (#5146)
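
A few of the user-facing entries above (FIX-#3764, FIX-#5034, FEAT-#5141) concern indexing behavior that Modin aims to match with pandas. The sketch below illustrates what that behavior looks like. It is written against plain pandas as a stand-in, on the assumption that replacing the import with `import modin.pandas as pd` gives the same results in this release; the column names and values are made up for illustration.

```python
# Illustrative sketch only: pandas is used as a stand-in (assumption).
# To try this with Modin, replace the import with `import modin.pandas as pd`.
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# FIX-#3764: assigning through .loc with a scalar label that is not in the
# index enlarges the frame by one row instead of raising.
df.loc[2] = [5, 6]

# FIX-#5034: get() accepts a list of column labels and returns a DataFrame.
subset = df.get(["a", "b"])

# FEAT-#5141: a 2D object (here another DataFrame) can be assigned to
# several new columns at once through __setitem__.
new_cols = pd.DataFrame({"c": [7, 8, 9], "d": [10, 11, 12]})
df[["c", "d"]] = new_cols

print(df)
```

The example keeps the default integer index so that the row labels of `new_cols` line up with those of `df` during the multi-column assignment.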

Contributors

@AndreyPavlenko
@Billy2551
@RehanSD
@YarShev
@anmyachev
@dchigarev
@devin-petersohn
@ienkovich
@mvashishtha
@noloerino
@pyrito
@rosdyana
@shalearkane
@suhailrehman
@vnlitvinov