Modin 0.17.0
This release includes support for pyhdk 0.2. It also includes many bug fixes and some performance enhancements.
Key Features and Updates Since 0.16.0
- Stability and Bugfixes
- FIX-#3764: Ensure df.loc with a scalar out of bounds appends to df (#3765)
- FIX-#4016, FIX-#4086, FIX-#4039: Fall back to pandas in case of duplicate column names (#4896)
- FIX-#4023: Fall back to pandas in case of MultiIndex columns (#5149)
- FIX-#4660: Fix fillna when Modin series object is an argument (#4674)
- FIX-#5034: Handle lists in df.get() (#5035)
- FIX-#5097: Stop using deprecated mangle_dup_cols. (#5104)
- FIX-#5098: Stop using append internally. (#5100)
- FIX-#5099: Fix PandasQueryCompiler.groupby_mean with timestamp in by (#5140)
- FIX-#5112: allows empty partition to be passed into query_compiler.dt_prop_map (#5133)
- FIX-#5128: Fix reading parquet directory from s3. (#5129)
- FIX-#5150: Sync row labels after read_csv when index_col is False (#5151)
- FIX-#5158: Synchronize metadata before to_parquet (#5161)
- FIX-#5168: module 'collections' has no attribute 'Sequence' in dataframe protocol (#5169)
- FIX-#5174: Pin xgboost < 1.7. (#5175)
- FIX-#5180: Do not set OMP_NUM_THREADS=1 on modin.pandas init (#5181)
- FIX-#5184: Fix get_dummies to respect passed columns to be encoded (#5185)
- FIX-#5188: Fix getitem_bool when the key is Series with empty partition (#5189)
- FIX-#5206: pin mypy<0.990 (#5207)
- FIX-#5208: pin ray version under 2.1.0 (#5209)
- Performance enhancements
- Refactor Codebase
- Update testing suite
- Benchmarking enhancements
- Documentation improvements
- New Features
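
For readers who hit the error quoted in FIX-#5168 in their own code: Python 3.10 removed the abstract base class aliases from the top-level collections module, so collections.Sequence no longer resolves there and collections.abc.Sequence has to be used instead. The snippet below is only a minimal sketch of that general pattern. The helper name is_sequence is hypothetical, and this is not the patch applied in #5169.

```python
import collections
import collections.abc


def is_sequence(obj):
    """Hypothetical helper: True for list-like sequences, excluding str and bytes."""
    # collections.abc.Sequence is the supported location of the ABC;
    # the collections.Sequence alias was removed in Python 3.10.
    return isinstance(obj, collections.abc.Sequence) and not isinstance(
        obj, (str, bytes)
    )


# Whether the legacy alias still exists depends on the interpreter version.
has_legacy_alias = hasattr(collections, "Sequence")

print(is_sequence([1, 2, 3]))
print(is_sequence("abc"))
print("legacy collections.Sequence alias present:", has_legacy_alias)
```

Importing from collections.abc works on every Python 3 version, so code written this way does not need a version check.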
Contributors
@AndreyPavlenko
@Billy2551
@RehanSD
@YarShev
@anmyachev
@dchigarev
@devin-petersohn
@ienkovich
@mvashishtha
@noloerino
@pyrito
@rosdyana
@shalearkane
@suhailrehman
@vnlitvinov