Add Trino 477 release notes #26350
First review done. I will look again when it's all rejigged and updated. Ping me.
@@ -0,0 +1,202 @@
# Release 477 (dd MMM 2025)
Reminder
* Prevent creating multiple Alluxio client which can cause excessive resource usage. ({issue}`26121`)
* Fix reading specifying format for date partition projection. ({issue}`25642`)
* Fix skipping of row groups when the trino type is different from logical types in case of parquet files. ({issue}`26203`)
* Add `azure.multipart-write-enabled` that enables multipart uploads for large files. ({issue}`26225`)
Not really an "upload" from the user's perspective, right? What does that do for the user?
* Fix failure when aggregation exists in other expressions in `GROUP BY AUTO`. ({issue}`25987`)
* Improve spilling reliability for join queries. ({issue}`25976`)
* Add support for `ALTER MATERIALIZED VIEW ... SET AUTHORIZATION`. ({issue}`25910`)
* Fix incorrect results with spatial joins. ({issue}`26021`)
It should say "for queries involving joins using x, y, and z functions".
## Delta Lake connector

* Add support for using GCS without credentials. ({issue}`25810`)
* Add ability to detect resource leakage in the runtime. ({issue}`26087`)
What kind of resource leakage? Which runtime? What happens when leakage is detected? What's the user-visible behavior?
Removing this entry. It's a low level detail meant for debugging and troubleshooting.
The resource is closed automatically. This should generally increase FS throughput, since more connections are available.
about the authorization for given entities. ({issue}`25907`)
* Add `debug_adaptive_planner` session property which allows gathering extra
diagnostics information regarding adaptive planner operation. ({issue}`26274`)
* Add the `coordinatorId` to the `/v1/info` endpoint. ({issue}`23910`)
Add the `coordinatorId` to the information exposed by `/v1/info` endpoint.
## CLI

* Send detailed client information such as user-agent in the source.
is that user visible?
Removing this entry
{issue}`26263`)
* Prevent workers from going into full GC or crashing when decoding unusually
large Parquet footers. ({issue}`25973`)
* Prevent creating multiple Alluxio client which can cause excessive resource
typo: client -> clients
or use client instances as below. Let's make it consistent
Rephrased
large Parquet footers. ({issue}`25973`)
* Prevent creating multiple Alluxio client which can cause excessive resource
usage. ({issue}`26121`)
* Release native filesystem resources/prevent leaks. ({issue}`26085`)
Ensure native filesystem resources are released?
Rephrased
({issue}`26145`)
* Fix failure when reading `null` values on `json` type columns.
({issue}`26184`)
* Fix skipping of row groups when the trino type is different from logical types
trino -> Trino
logical types -> logical type
Rephrased
* Add support for using GCS without credentials. ({issue}`25810`)
* Add support for reading tables using the ESRI JSON format. ({issue}`25241`)
* Add ability to detect resource leakage. ({issue}`26087`)
user visible?
Removing this entry. See above.
* Add support for using GCS without credentials. ({issue}`25810`)
* Add ability to detect resource leakage in the runtime. ({issue}`26087`)
* Add `azure.multipart-write-enabled` that enables multipart uploads for large
What's the user-visible impact of multipart uploads? cc @wendigo
I think this improves performance .. but @wendigo should confirm
The connection is not held for the duration of the single write, which should improve throughput.
How about:
- Improve throughput for write-heavy queries on Azure when the `azure.multipart-write-enabled` config option is set to `true`.
@mosabua, I made a bunch of edits. Can you take another look?
* {{breaking}} Improve precision and scale inference for arithmetic operations between
decimal values. The previous behavior can be restored by setting the
`deprecated.legacy-arithmetic-decimal-operators` config property to `true`. ({issue}`26422`)
* Add support for more complex Python UDFs. ({issue}`26515`)
Is there a link or some more detail to add? It's really vague like this.
cc: @wendigo
* Add physical data scan tracking to resource groups. ({issue}`25003`)
* Add `internal_network_input_bytes` column to `system.runtime.tasks` table. ({issue}`26524`)
* Add support for `Geometry` type in {func}`to_geojson_geometry`. ({issue}`26451`)
* Add support for group name case conversion in the group provider properties. ({issue}`25396`)
What's "group name case conversion" ? cc @kokosing
I think that this is case-sensitivity handling.
`system.runtime.tasks` table. ({issue}`26524`)
* Remove `totalBytes` and `totalRows`
from `io.trino.spi.eventlistener.QueryStatistics`. ({issue}`26524`)
* Don't encode crs in GeoJSON. ({issue}`26499`)
In what context? What functions?
geometry area. ({issue}`26459`)
* Fix over-reporting the amount of memory used when aggregating over `ROW`
values when nested inside of an `ARRAY` type ({issue}`26405`)
* Preserve client trace token in `CREATE VIEW` ({issue}`25716`)
What's the user visible behavior?
Previously the trace token supplied by the client would not be tracked when running CREATE VIEW.
I believe trace tokens end up in http request logs in the servers and in query event listener events.
I'm not sure I understand how it relates to CREATE VIEW. Views don't store the trace token. If this is about http logs or event listener, how is it specific to CREATE VIEW and does not affect everything else?
The fix is a one-liner: 632842e
It's about populating the token in the session created to run CREATE VIEW.
Ok, so that's not just for CREATE VIEW. It's for any context where a view is analyzed/executed.
* Add user identifying fields to the OpenLineage `trino_query_context` facet. ({issue}`26074`)
* Add `query_id` field to `trino_metadata` facet. ({issue}`26074`)
* Add `openlineage-event-listener.transport.compression` config property. ({issue}`26535`)
Describe the feature that's being added, not the property.
@@ -0,0 +1,225 @@
# Release 477 (dd MMM 2025)
24 Sep
As discussed, good to go so we can ship a release. We will have to follow up with details and improvements.
Description
Assemble the release notes for the upcoming Trino 477 release.
Additional context and related issues
See https://github.com/trinodb/trino/pulls?q=is%3Apr+is%3Aclosed+milestone%3A477
Release notes
(x) This is not user-visible or is docs only, and no release notes are required.
Verification for each pull request
Format: PR/issue number, ✅ / ❌ rn ✅ / ❌ docs
✅ rn - release note added and verified, or assessed to be not necessary, set to ❌ rn before completion
✅ docs - need for docs assessed and merged, or assessed to be not necessary, set to ❌ docs before completion
06 Jun 2025
08 Jun 2025
09 Jun 2025
10 Jun 2025
11 Jun 2025
12 Jun 2025
`parquet_max_read_block_row_count` session property in Hudi #25981 ✅ rn ✅ docs
13 Jun 2025
14 Jun 2025
15 Jun 2025
16 Jun 2025
17 Jun 2025
18 Jun 2025
19 Jun 2025
`parquetColumnFieldsBuilder` with `ImmutableList.copyOf` #26032 ✅ rn ✅ docs
`coordinatorId` to `/v1/info` #23910 ✅ rn ✅ docs
20 Jun 2025
21 Jun 2025
23 Jun 2025
24 Jun 2025
25 Jun 2025
26 Jun 2025
`String.format` and `formatted` wrapping #26070 ✅ rn ✅ docs
27 Jun 2025
28 Jun 2025
29 Jun 2025
`TestIcebergParquetConnectorTest` using `computeScalar` #26098 ✅ rn ✅ docs
30 Jun 2025
01 Jul 2025
02 Jul 2025
03 Jul 2025
`BaseConnectorSmokeTest` #26127 ✅ rn ✅ docs
07 Jul 2025
`RunLengthEncodedBlock` for null column appends #26140 ✅ rn ✅ docs
08 Jul 2025
09 Jul 2025
10 Jul 2025
`information_schema.columns` table #26163 ✅ rn ✅ docs
14 Jul 2025
`MysqlEventListener` #26192 ✅ rn ✅ docs
15 Jul 2025
16 Jul 2025
`threeten.bp.Duration` with `java.time.Duration` #26210 ✅ rn ✅ docs
`@Experimental` annotation #26200 ✅ rn ✅ docs
17 Jul 2025
`max.request.size` and `batch.size` configuration for Kafka Event Listener #26132 ✅ rn ✅ docs
18 Jul 2025
`optimize_manifests` failure caused by `NULL` in first partition #26227 ✅ rn ✅ docs
21 Jul 2025
22 Jul 2025
23 Jul 2025
`toString` method in InformationSchemaTableHandle and InformationSchemaColumnHandle #26268 ✅ rn ✅ docs
24 Jul 2025
25 Jul 2025
`TestKafkaWithConfluentSchemaRegistryMinimalFunctionality` #26280 ✅ rn ✅ docs
26 Jul 2025
28 Jul 2025
29 Jul 2025
`TransactionLogTail#loadNewTail` method #25856 ✅ rn ✅ docs
30 Jul 2025
31 Jul 2025
01 Aug 2025
03 Aug 2025
04 Aug 2025
05 Aug 2025
`LazyLoadedProtobufSchemaProvider` #26322 ✅ rn ✅ docs
06 Aug 2025
07 Aug 2025
08 Aug 2025
`getEntriesFromJson` in Delta Lake #26351 ✅ rn ✅ docs
09 Aug 2025
11 Aug 2025
12 Aug 2025
13 Aug 2025
14 Aug 2025
15 Aug 2025
16 Aug 2025
17 Aug 2025
18 Aug 2025
19 Aug 2025
20 Aug 2025
22 Aug 2025
23 Aug 2025
24 Aug 2025
25 Aug 2025
26 Aug 2025
27 Aug 2025
`IcebergFileSystemModule` constructor #26498 ✅ rn ✅ docs
28 Aug 2025
29 Aug 2025
01 Sep 2025
02 Sep 2025
`TransactionLogEntryIterator` #26527 ✅ rn ✅ docs
`DeltaLakeFileSystemFactory` and `VendedCredentialsProvider` in Delta Lake #26281 ✅ rn ✅ docs
03 Sep 2025
04 Sep 2025
05 Sep 2025
07 Sep 2025
08 Sep 2025
`mongodb.implicit-row-field-prefix` in Mongodb #26576 ✅ rn ✅ docs
09 Sep 2025
`getRetainedSizeInBytes` for AddFileEntry #26594 ✅ rn ✅ docs
10 Sep 2025
11 Sep 2025
12 Sep 2025
13 Sep 2025
15 Sep 2025
16 Sep 2025
17 Sep 2025
`bigquery.arrow-serialization.max-allocation` property #26409 ✅ rn ✅ docs