Releases: dlt-hub/dlt
0.3.22
Core Library
- Fix: make Load single-thread compatible by @codingcyclist in #698
- Enable S3-compatible storages like R2: support the AWS config `endpoint_url` with fsspec by @steinitzu in #701
- performance improvements in Arrow loading by @rudolfix in #707
- datatype autodetection for unix timestamp removed from defaults by @rudolfix in #707
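
For the `endpoint_url` item above, a minimal sketch of supplying the setting through environment variables. It assumes dlt's usual convention of upper-cased config sections joined with double underscores, and that the filesystem destination reads `endpoint_url` from its credentials section. `dlt_env_key` is a hypothetical helper written for this example (it is not part of dlt), and the bucket and endpoint values are placeholders.

```python
import os


def dlt_env_key(*sections: str) -> str:
    """Join config sections into an environment variable name.

    Hypothetical helper for illustration only, not a dlt function.
    """
    return "__".join(part.upper() for part in sections)


# Placeholder values: replace them with your own bucket and account endpoint.
settings = {
    dlt_env_key("destination", "filesystem", "bucket_url"): "s3://my-bucket",
    dlt_env_key("destination", "filesystem", "credentials", "endpoint_url"):
        "https://<account_id>.r2.cloudflarestorage.com",
}

# Export the settings so a pipeline started from this process can pick them up.
os.environ.update(settings)

for key in sorted(settings):
    print(key)
```

The same keys can be written as nested sections in a config file instead; the environment variable form is used here only to keep the sketch in one place.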
Full Changelog: 0.3.21...0.3.22
0.3.21
What's Changed
- athena iceberg by @sh-rp in #659
  Use the new table hint table format to store selected Athena tables in iceberg format (https://dlthub.com/docs/dlt-ecosystem/destinations/athena#iceberg-data-tables)
- Pyarrow direct loading by @steinitzu and @tomsej in #679
  Allows passing Arrow tables and pandas frames to the `run` method and loading them directly (via parquet) without data copy (https://dlthub.com/docs/dlt-ecosystem/verified-sources/arrow-pandas), which should result in large speedups in many loading cases
- Features/dbt cloud by @AstrakhantsevaAA in #694
  Run dbt jobs in the cloud (https://dlthub.com/docs/dlt-ecosystem/transformations/dbt/dbt_cloud)
- enables duckdb 0.9.1 and improves motherduck docs by @rudolfix in #695
  Please also read the updated Motherduck documentation (https://dlthub.com/docs/dlt-ecosystem/destinations/motherduck) - you may want to reduce load parallelism if you are on a weak internet connection
- allows providing a custom implementation of `DltSource` to the source decorator by @rudolfix in #687
Bugfixes
- change source schema handling and normalizer `root_key` propagation. Fixes various problems where merge and replace write dispositions were subsequently used in the same pipeline by @sh-rp in #686
- fixes bug in drop command by @sh-rp in #693
Full Changelog: 0.3.19...0.3.21
0.3.19
What's Changed
- easy renaming of resources by @rudolfix in #671
- standalone resources and transformers intended to be used outside of the source by @rudolfix in #671
- Building blocks for reading and manipulating files in buckets available in `dlt.sources.filesystem` by @rudolfix in #671
New dlt sources
- Read files from buckets, stream large json, csv, parquet and other files - also incrementally
- Read messages and attachments from an e-mail inbox
Docs
- Code Examples for docs by @sh-rp in #616
- Holistic integration blogpost by @zem360
- Improved sources documentation by @AstrakhantsevaAA @dat-a-man
Bugfixes
- Update min snowflake-connector version by @steinitzu in #664
- Fix: read pkey as DER format, not PEM by @codingcyclist in #680
Full Changelog: 0.3.18...0.3.19
0.3.18
Core Library
- Support for precision, scale in column schema by @steinitzu in #646
- validation of extract data with Pydantic models by @steinitzu in #638
- moves fsspec support to common code by @rudolfix in #626
- Allow base64 encoded private keys for Snowflake destination by @codingcyclist in #637
- duck case naming convention that allows for emojis and other special characters in identifiers by @rudolfix in #660
- Typing fixes and enable mypy in tests by @steinitzu in #661
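
For the base64 private key item above, a minimal sketch of producing a single-line base64 value from key material using only the standard library. A multi-line PEM key is awkward to pass through environment variables, which is the usual motivation for encoding it. The key below is a dummy placeholder, `encode_private_key` is a hypothetical helper, and which config field receives the value and how dlt decodes it are not covered by these notes, so check the Snowflake destination docs.

```python
import base64


def encode_private_key(key_bytes: bytes) -> str:
    """Return the key material as a single-line base64 string.

    Hypothetical helper for illustration only, not a dlt function.
    """
    return base64.b64encode(key_bytes).decode("ascii")


# Dummy placeholder, not a real key.
pem = b"-----BEGIN PRIVATE KEY-----\nZHVtbXk=\n-----END PRIVATE KEY-----\n"

encoded = encode_private_key(pem)

# The encoded value contains no line breaks and decodes back to the original bytes.
print("\n" in encoded)
print(base64.b64decode(encoded) == pem)
```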
Bugfixes
- Allows table and resource names like `state` by fixing set attribute bug by @steinitzu in #657
- fixes `dlt pipeline show` streamlit app start-up by @rudolfix in #645
- replaces `depends_on` with `data_from` in resource decorator by @rudolfix in #645
- fixes json logger reinit and drops json-logger dependency by @rudolfix in #645
- forces local duckdb version when creating dbt runner venv to prevent storage version clashes by @rudolfix in #645
- detects when motherduck does not support local duckdb version by @rudolfix in #645
Docs
- Rework 'Understanding the tables' by @burnash in #629
- Added Airtable docs by @dat-a-man in #635
- `dlt` API Reference by @AstrakhantsevaAA in #642
- Add a Zendesk to Weaviate walkthrough by @burnash in #641
- Add a blog post: Load Zendesk tickets to Verba by @burnash in #654
- Added Slack Docs! by @dat-a-man in #643
Full Changelog: 0.3.17...0.3.18
0.3.17
Core Library
- Add support for TIME data type by @steinitzu in #606
- mssql destination by @steinitzu in #611
- fixes duckdb parallel parquet loads and JSON compression problem by @rudolfix in #619
- allows tables to be created upfront, e.g. to use custom data types, by allowing hints (e.g. NOT NULL) in ALTER TABLE ADD by @rudolfix in #621
- fixes apply hints and schema merge behavior by @rudolfix in #621
- case insensitive naming convention for Weaviate by @rudolfix in #621
Docs
- Added MongoDB documentation. by @dat-a-man in #607
- Updates performance docs: performance tips, memory finetuning, running in parallel and more, with working examples, by @rudolfix in #619
Full Changelog: 0.3.16...0.3.17
0.3.16
Core Library
- add default user agent header to `dlt` requests client by @sh-rp in #595
- Add pydantic support by @steinitzu in #589
  You can use pydantic to define table schemas. You can load pydantic instances like you can load dictionaries
- `NormalizerInfo`: item counts in table present in trace by @sh-rp in #582
  Get counts of items added to table from normalization stage
- Add azure blob storage filesystem/staging destination by @steinitzu in #592
  Also includes Snowflake stage support
- general state sync interface by @sh-rp in #564
  You can restore state and schemas from Weaviate now (filesystem comes later)
- uses botocore instead of boto3 in AwsCredentials by @rudolfix in #590
Docs
- Update pseudonymizing_columns.md by @wtfzambo in #598
- Docs: getting started by @AstrakhantsevaAA in #568
  We have a really nice getting started guide now: https://dlthub.com/docs/getting-started
New Verified Sources
- really nice `airtable` source by @willi-mueller in dlt-hub/verified-sources#218
  thx for amazing contribution!
Full Changelog: 0.3.13...0.3.16
0.3.13
Core Library
- Feat: don't require AWS credentials for external Snowflake stage by @codingcyclist in #587
- connecting to local Weaviate made easy by @rudolfix in #591
- allows setting table name via property on DltResource by @rudolfix in #593
- destination tests refactored by @sh-rp in #572
Docs
- docs snippet and examples will be now linted and tested by @sh-rp in #559
- several blog posts and verified sources docs updates by @adrianbr and @dat-a-man
New Verified Sources
- MongoDB source working in the same way as sql database by @sehnem in dlt-hub/verified-sources#239
Full Changelog: 0.3.12...0.3.13
0.3.12
Core Library
In this version we release two new types of destinations:
- Add a Weaviate destination by @burnash in #479
  A vector data store: load and query vectorized text data
- Basic AWS Athena support by @sh-rp in #522
  A data lake destination which works together with `filesystem` as a staging destination
Apart from that, bug fixes:
- fixes airflow provider init sequence by @rudolfix in #569
- fixes transformer decorator typings by @rudolfix in #554
Docs
- We improved documentation for many verified sources (thx @dat-a-man and @AstrakhantsevaAA)
- updates contribution and readme + small docs fixes by @rudolfix in #553
- Edit weaviate docs by @hsm207 in #566
Full Changelog: 0.3.10...0.3.12
0.3.10
Core Library
- Fix config dataclasses on python 3.11 by @steinitzu in #541
  Now Python 3.11 is fully tested on CI
- removes optional dependencies by @rudolfix in #552
  `sentry-sdk` and several dependencies used by the `dlt deploy` command were moved to extras. Several others (including `fsspec`) have their minimal version set to earlier versions
- PR above is also fixing #539 and #540
Full Changelog: 0.3.9...0.3.10
0.3.9
Bugfix Release
When a replace with a staging dataset was used in version 0.3.8, tables with other write dispositions were also truncated (in other words, all the tables in the schema could be truncated). Note that the default replace strategy does not use a staging dataset, so if you didn't explicitly change it you were not affected.
This release fixes that bug. If you use the replace strategy above, update the library.
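
The condition described above can be sketched as a small check. This is an illustration only: `is_affected` is a hypothetical helper, and the strategy names in `STAGING_STRATEGIES` are assumptions about which replace strategies use a staging dataset, so verify them against the destination docs for your version.

```python
from typing import Optional

# Assumed names of replace strategies that use a staging dataset.
STAGING_STRATEGIES = {"insert-from-staging", "staging-optimized"}


def is_affected(dlt_version: str, replace_strategy: Optional[str]) -> bool:
    """Return True if the version/strategy pair matches the bug described above.

    Hypothetical helper for illustration only. None stands for
    "replace strategy left at its default".
    """
    return dlt_version == "0.3.8" and replace_strategy in STAGING_STRATEGIES


print(is_affected("0.3.8", "staging-optimized"))
print(is_affected("0.3.8", None))
print(is_affected("0.3.9", "staging-optimized"))
```

If the check is true for your setup, updating to 0.3.9 is the fix these notes recommend.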
Full Changelog: 0.3.8...0.3.9