
make_Fetch_Possible_And_Deterministic with Arel.sql #1319

Open
andyundso opened this issue Mar 23, 2025 · 9 comments

@andyundso
Member

solid_cache uses a couple of queries that look like this:

Entry.largest_byte_sizes(samples).pick(Arel.sql("sum(byte_size), count(*), min(byte_size)"))

Performance-wise, this is the best approach, since it fetches multiple values within a single query.

But since pick uses limit under the hood, the sqlserver adapter has to add an ORDER BY clause in order to use OFFSET. By default this is ORDER BY [id], which in the case above throws an error:

ActiveRecord::StatementInvalid: TinyTds::Error: Column "solid_cache_entries.id" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.
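
For illustration, the failing statement has roughly this shape (a sketch only, not the adapter's literal output):

Entry.largest_byte_sizes(5).pick(Arel.sql("sum(byte_size), count(*), min(byte_size)"))
# pick adds limit(1), which the adapter translates into OFFSET/FETCH, and
# OFFSET/FETCH needs an ordering, so the adapter injects one. Roughly:
#
#   SELECT sum(byte_size), count(*), min(byte_size)
#   FROM ( ...ordered, limited subquery over byte_size... )
#   ORDER BY [solid_cache_entries].[id]     -- injected default ordering
#   OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY    -- from pick, i.e. limit(1)
#
# The outer SELECT contains only aggregates, so SQL Server rejects the
# injected ORDER BY [id] with the error above.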

I looked a bit into make_Fetch_Possible_And_Deterministic, where this ORDER BY clause is added. In theory, it should be possible to find out which columns are referenced in the SELECT statement by looking into o.cores.first.projections. I think there are two cases: either the projection is already an Arel::Attributes::Attribute, or it is a String.

If it's a string, we would have to extract the column name in order to get an Arel::Attributes::Attribute from the Arel::Table. We could split it on each ,, then check whether any known column (using the schema_cache) is mentioned, likely with a regex similar to the one in query_requires_identity_insert?, which searches for the identity columns.
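
A rough sketch of that idea (not the adapter's actual code; table and known_columns are placeholders for the Arel::Table and a schema_cache lookup of the table's column names):

def column_for_implicit_order(projection, table, known_columns)
  case projection
  when Arel::Attributes::Attribute
    projection                                # already a column reference
  when String, Arel::Nodes::SqlLiteral
    # e.g. "sum(byte_size), count(*), min(byte_size)" -> ["sum(byte_size)", " count(*)", ...]
    projection.split(",").each do |fragment|
      name = known_columns.find { |column| fragment.match?(/\b#{Regexp.escape(column)}\b/i) }
      return table[name] if name              # e.g. table[:byte_size]
    end
    nil
  end
end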

I am not sure if this is a smart approach, so I wanted to get your feedback before actually looking into implementing it.

@aidanharan
Contributor

At first glance, this looks like a bug in the adapter. From solid_cache:

scope :largest_byte_sizes, -> (limit) { from(order(byte_size: :desc).limit(limit).select(:byte_size)) }
...
Entry.largest_byte_sizes(samples).pick(Arel.sql("sum(byte_size), count(*), min(byte_size)"))

Which expanded out is:

Entry.from(order(byte_size: :desc).limit(samples).select(:byte_size)).pick(Arel.sql("sum(byte_size), count(*), min(byte_size)"))

Since there is already an order in the query, I wouldn't have thought the default order (ORDER BY [id]) would have been added. I guess the issue arises because the order is within a from (from(order(byte_size: :desc))).
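
A quick, illustrative way to see this via the relation's order_values:

inner = Entry.order(byte_size: :desc).limit(5).select(:byte_size)
outer = Entry.from(inner)

inner.order_values.any? # => true
outer.order_values.any? # => false -- the outer relation carries no ordering,
                        #    so the adapter falls back to its default ORDER BY [id]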

@aidanharan
Contributor

@andyundso I had a quick look and was not able to recreate the same error that you were seeing. Would you be able to create a bug script that recreates your error? https://github.com/rails-sqlserver/activerecord-sqlserver-adapter/wiki/How-to-report-a-bug

@andyundso
Member Author

@aidanharan Can do. I am also not sure whether it is really a bug, as there are other cases (mostly failures in the minitest suite) where I run into the ORDER BY issue.

@aidanharan
Contributor

@andyundso Could you try updating the solid_cache query as follows and see if it works? From:

Entry.largest_byte_sizes(samples).pick(Arel.sql("sum(byte_size), count(*), min(byte_size)"))

To:

Entry.largest_byte_sizes(samples).order(Arel.sql("count(*)")).pick(Arel.sql("sum(byte_size), count(*), min(byte_size)"))

If it works, then this is a known SQL Server/adapter limitation: if a query has an OFFSET/FETCH, then it must have an ordering too. See

it "exception thrown if FROM subquery is provided without an order" do

@andyundso
Member Author

I will give you feedback on the weekend; there are so many failing tests for solid_cache that I'm having a hard time sorting them out, and my time during the work week is a bit limited.

Maybe I will even check solid_cable first before going back to solid_cache.

@andyundso
Member Author

@aidanharan So, solid_cable works fine.

For solid_cache, I played around some more with the statements and feel the "easiest" way to fix it would be to use standard ActiveRecord statements. So instead of pick(Arel.sql("max(id) - min(id) + 1")) || 0, it would look like:

current_maximum = maximum(:id)
current_maximum.blank? ? 0 : current_maximum - minimum(:id) + 1

Performance-wise, this code is worse than before, so I do not think it has any chance of getting accepted upstream. I will likely provide a separate gem with a bunch of refinements to replace these queries.

So I think we can close this issue.

@aidanharan
Contributor

@andyundso I meant to comment earlier. I had a look and I think your initial suggestion might be best. Currently, if an explicit ordering isn't given, an implicit ordering using the primary key is used. Instead, an implicit ordering using the first usable projection could be used. This would be a change to the adapter, but should hopefully mean that solid_cache works with minimal changes.

I've worked on a PR but am still trying to get all of the adapter's tests passing. Once that's done, I'll run the solid_cache tests to see what other changes might be needed.

@andyundso
Member Author

@aidanharan Here is a gist with the changes necessary to run the solid_cache test suite against MSSQL. For the tests themselves, you invoke TARGET_DB=mssql bin/rails db:setup and TARGET_DB=mssql bin/rails test. Docker run statement for the MSSQL container:

docker run -e "ACCEPT_EULA=Y" -e 'MSSQL_SA_PASSWORD=yourStrong(!)Password' -e "MSSQL_PID=Evaluation" -p 11433:1433  --name sqlpreview --hostname sqlpreview -d mcr.microsoft.com/mssql/server:2022-preview-ubuntu-22.04

If the projection approach works, it would definitely be a win.

@aidanharan
Contributor

@andyundso The changes in #1322 allow the tests in solid_cache to pass. I need to ensure the changes don't have any unintended consequences before I merge it.
