First usable version of the package. We decided on the API (a usage sketch follows the list):

- `pipeasy_spark.build_pipeline(column_transformers={'column': []})` is the core function, where you define a list of transformers for each column.
- `pipeasy_spark.build_pipeline_by_dtypes(df, string_transformers=[])` lets you define a list of transformers for two types of columns: `string` and `numeric`.
- `pipeasy_spark.build_default_pipeline(df, exclude_columns=['target'])` builds a default pipeline for the `df` dataframe.
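
A minimal sketch of how this API can be wired together, not taken from the changelog itself: it assumes a running `SparkSession`, an illustrative dataframe (the `gender`, `age`, `target` column names are made up), and that the transformer lists hold plain `pyspark.ml.feature` stages whose input/output columns the package configures per column, which is an assumption about the library rather than documented behaviour.

```python
# Sketch only: column names and the automatic input/output-column wiring
# by pipeasy_spark are assumptions, not documented behaviour.
from pyspark.sql import SparkSession
from pyspark.ml.feature import StringIndexer, VectorAssembler, StandardScaler

import pipeasy_spark

spark = SparkSession.builder.getOrCreate()

# Illustrative dataframe with one string column, one numeric column and a target.
df = spark.createDataFrame(
    [("male", 35.0, 1), ("female", 42.0, 0)],
    ["gender", "age", "target"],
)

# Core function: one list of transformers per column.
pipeline = pipeasy_spark.build_pipeline(
    column_transformers={
        "gender": [StringIndexer()],
        "age": [VectorAssembler(), StandardScaler()],
    }
)
df_transformed = pipeline.fit(df).transform(df)

# Or let the package pick defaults for every column except the target.
default_pipeline = pipeasy_spark.build_default_pipeline(df, exclude_columns=["target"])
```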
- I am still learning how all these tools interact with each other.
- First release on PyPI.