[SPARK-53890][SDP] Test (and fix) read/readstream options are respected for pipelines #53073

AnishMahto · 2025-11-14T20:13:55Z

What changes were proposed in this pull request?

Today, read options attached to any UnresolvedRelation that is analyzed by the pipelines flow analyzer are dropped. This PR fixes that bug, and in doing so also makes the following micro refactors:

Get rid of StreamingReadOptions/BatchReadOptions. Previously, none of the fields of either class were ever populated; the classes were used only to determine whether a streaming read or a batch read was being executed.
Propagate the streaming or batch dataframe reader as the sole source of truth for options to execute reads with, rather than passing in both a reader and read options side-by-side.
Correct the Table class hierarchy. Table is a GraphElement but it is not an Input. Because it was previously inheriting Input it had a load override, but that was dead code; logically a Table could never be passed into the polymorphic call sites of Input.load.
Get rid of AnalysisWarning, whose exceptions were also dead code.
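
To illustrate the option-dropping bug and the "reader as sole source of truth" idea, here is a minimal sketch in plain Scala. Everything in it (Relation, Reader, BatchReader, StreamingReader, ResolvedRead, analyzeDroppingOptions, analyze) is a hypothetical stand-in invented for illustration; none of these are the actual classes or methods touched by this PR.

```scala
// Simplified, hypothetical model; none of these types are Spark's actual classes.
object ReadOptionsSketch {
  // Stand-in for an unresolved relation carrying user-supplied read options.
  final case class Relation(name: String, options: Map[String, String])

  // The reader is the single source of truth: its kind says whether the read is
  // batch or streaming, and it alone carries the options to read with.
  sealed trait Reader {
    def options: Map[String, String]
    def streaming: Boolean
    def withOptions(extra: Map[String, String]): Reader
  }
  final case class BatchReader(options: Map[String, String] = Map.empty) extends Reader {
    val streaming = false
    def withOptions(extra: Map[String, String]): Reader = copy(options = options ++ extra)
  }
  final case class StreamingReader(options: Map[String, String] = Map.empty) extends Reader {
    val streaming = true
    def withOptions(extra: Map[String, String]): Reader = copy(options = options ++ extra)
  }

  // Outcome of resolving a relation: which kind of read ran, and with which options.
  final case class ResolvedRead(name: String, streaming: Boolean, options: Map[String, String])

  // Shape of the bug: the relation's options never reach the read.
  def analyzeDroppingOptions(rel: Relation, reader: Reader): ResolvedRead =
    ResolvedRead(rel.name, reader.streaming, reader.options)

  // Shape of the fix: the relation's options are folded into the reader first.
  def analyze(rel: Relation, reader: Reader): ResolvedRead = {
    val effective = reader.withOptions(rel.options)
    ResolvedRead(rel.name, effective.streaming, effective.options)
  }
}
```

In this model there is no separate "read options" object to keep in sync with the reader: the reader's type decides batch versus streaming, and its option map is the only place options live.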

Why are the changes needed?

Prior to these changes, any options specified in UnresolvedRelation.options were dropped when the relation was analyzed via FlowAnalysis.analyze. To my knowledge, in a vanilla installation of Spark (e.g. without Delta io) there are currently no options that could be dropped that would otherwise have been respected when creating an UnresolvedRelation (e.g. via spark.read.table), but at the very least this change future-proofs against a definite bug.

How was this patch tested?

org.apache.spark.sql.pipelines.analysis.ReadOptionsPropagationOnAnalysisSuite

AnishMahto · 2025-11-14T23:22:35Z

...ipelines/src/test/scala/org/apache/spark/sql/pipelines/graph/ConnectValidPipelineSuite.scala

      mem.addData(1, 2)
      registerPersistedView("complete-view", query = dfFlowFunc(Seq(1, 2).toDF("x")))
      registerPersistedView("incremental-view", query = dfFlowFunc(mem.toDF()))
-      registerTable("`complete-table`", query = Option(readFlowFunc("complete-view")))


With my changes, the identifier passed into readFlowFunc is now actually parsed by the Catalyst parser, which is why this invalid flow function name only now throws an invalid name exception. I am simply quoting the name to resolve this.
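
As a rough illustration of why quoting helps, the sketch below uses a deliberately simplified identifier check written in plain Scala. It is not the Catalyst parser and does not reproduce its grammar or error messages; IdentifierSketch and parseIdentifier are hypothetical names. It only assumes that an unquoted identifier cannot contain a hyphen, while a backquoted one can.

```scala
// Hypothetical, simplified stand-in for identifier parsing; NOT Catalyst's grammar.
object IdentifierSketch {
  // Unquoted identifiers: letters, digits and underscores only (assumption).
  private val Unquoted = "[A-Za-z0-9_]+".r
  // Backquoted identifiers: any characters, with `` as an escaped backtick.
  private val Quoted = "`((?:[^`]|``)+)`".r

  // Returns the identifier's name, or an error message if the text is invalid.
  def parseIdentifier(text: String): Either[String, String] = text match {
    case Unquoted()   => Right(text)
    case Quoted(body) => Right(body.replace("``", "`"))
    case _            => Left(s"invalid identifier: $text")
  }
}
```

Under this model, parseIdentifier("complete-view") is rejected because of the hyphen, while the backquoted form of the same name is accepted and yields the name complete-view.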

AnishMahto · 2025-11-14T23:33:29Z

@sryza Ready for review

tests and refactoring

9f6b5dc

github-actions bot added the SQL label Nov 14, 2025

AnishMahto changed the title ~~[SPARK-53890] Test (and fix) read/readstream options are respected for pipelines~~ [SPARK-53890][SDP] Test (and fix) read/readstream options are respected for pipelines Nov 14, 2025

vrozov mentioned this pull request Nov 14, 2025

[SPARK-52408][BUILD][SQL] Upgrade Hive to 4.1 #52099

Open

anishm-db added 2 commits November 14, 2025 22:28

fix tests

1e6b74d

test fix

72a6891
