Member

@b41sh b41sh commented Nov 8, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Problem: Databend's query parsing function relied on a simplistic, custom parser. This implementation proved insufficient for handling complex query strings, particularly those involving multiple nested logical operators (e.g., AND, OR).

For example, the following query fails:

SELECT doc_id, video_created_at
FROM ods_video_text
WHERE
QUERY('(doc_body.videoInfo.typeProp.project:E260S AND doc_body.videoInfo.createdAtEpoch:[1735689600000 TO 1735776000000]) OR doc_body.videoInfo.extraData.category:camera')
LIMIT 10;
error: APIError: QueryFailed: [1065]error:
  --> SQL:4:7
  |
1 | SELECT doc_id, video_created_at
2 | FROM ods_video_text
3 | WHERE
4 | QUERY('(doc_body.videoInfo.typeProp.project:E260S AND doc_body.videoInfo.createdAtEpoch:[1735689600000 TO 1735776000000]) OR doc_body.videoInfo.extraData.category:camera')
  |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ column (doc_body doesn't exist

This PR addresses the parsing limitations by replacing the custom parser with the tantivy_query_grammar parser. tantivy_query_grammar is a well-established parser designed for full-text search query languages. It parses complex and nested query conditions correctly, which eliminates the parsing errors shown above.
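
To illustrate why nested groups need a grammar-aware parser, here is a minimal, hand-written recursive-descent sketch for a small subset of this query syntax (field:value terms, bracketed ranges, parentheses, AND, OR). It is not the tantivy_query_grammar API and not the code in this PR; the names `Ast`, `tokenize`, `Parser`, and `parse` are hypothetical, and the rule that AND binds tighter than OR is an assumption of this sketch.

```rust
// Toy AST for a subset of Lucene-style queries (hypothetical, not tantivy's types).
#[derive(Debug, PartialEq)]
enum Ast {
    Term { field: String, value: String },
    And(Box<Ast>, Box<Ast>),
    Or(Box<Ast>, Box<Ast>),
}

// Split the input into "(", ")" and word tokens. A bracketed range such as
// `[1 TO 2]` stays inside its word even though it contains spaces.
fn tokenize(input: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    let mut in_range = false;
    for c in input.chars() {
        if in_range {
            word.push(c);
            if c == ']' {
                in_range = false;
            }
        } else if c == '[' {
            in_range = true;
            word.push(c);
        } else if c == '(' || c == ')' || c.is_whitespace() {
            if !word.is_empty() {
                tokens.push(std::mem::take(&mut word));
            }
            if !c.is_whitespace() {
                tokens.push(c.to_string());
            }
        } else {
            word.push(c);
        }
    }
    if !word.is_empty() {
        tokens.push(word);
    }
    tokens
}

struct Parser {
    tokens: Vec<String>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Option<&str> {
        self.tokens.get(self.pos).map(|s| s.as_str())
    }

    // or_expr := and_expr ("OR" and_expr)*
    fn or_expr(&mut self) -> Result<Ast, String> {
        let mut left = self.and_expr()?;
        while self.peek() == Some("OR") {
            self.pos += 1;
            let right = self.and_expr()?;
            left = Ast::Or(Box::new(left), Box::new(right));
        }
        Ok(left)
    }

    // and_expr := primary ("AND" primary)*
    fn and_expr(&mut self) -> Result<Ast, String> {
        let mut left = self.primary()?;
        while self.peek() == Some("AND") {
            self.pos += 1;
            let right = self.primary()?;
            left = Ast::And(Box::new(left), Box::new(right));
        }
        Ok(left)
    }

    // primary := "(" or_expr ")" | field ":" value
    fn primary(&mut self) -> Result<Ast, String> {
        let tok = self.peek().ok_or("unexpected end of query")?.to_string();
        self.pos += 1;
        if tok == "(" {
            // Recurse so that a group can contain any nested expression.
            let inner = self.or_expr()?;
            if self.peek() != Some(")") {
                return Err("expected ')'".to_string());
            }
            self.pos += 1;
            return Ok(inner);
        }
        match tok.split_once(':') {
            Some((field, value)) => Ok(Ast::Term {
                field: field.to_string(),
                value: value.to_string(),
            }),
            None => Err(format!("expected field:value, got '{}'", tok)),
        }
    }
}

// Parse a whole query string, rejecting trailing tokens.
fn parse(input: &str) -> Result<Ast, String> {
    let mut p = Parser { tokens: tokenize(input), pos: 0 };
    let ast = p.or_expr()?;
    if p.pos != p.tokens.len() {
        return Err(format!("unexpected token '{}'", p.tokens[p.pos]));
    }
    Ok(ast)
}
```

Calling `parse` on the query string from the failing example yields an `Or` node whose left child is the parenthesized `And` group. The opening parenthesis is treated as grouping syntax, not as part of a column name such as `(doc_body`.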

  • fixes: #[Link the issue here]

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@b41sh b41sh requested a review from sundy-li November 8, 2025 02:20
@github-actions github-actions bot added the pr-bugfix this PR patches a bug in codebase label Nov 8, 2025
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
