Skip to content

Conversation

Copilot
Copy link

@Copilot Copilot AI commented Oct 5, 2025

Fixes an issue where SPARQL queries with aggregation functions would create inconsistent parent references in the query algebra tree, causing problems for query optimizers that need to replace nodes.

Problem

When parsing SPARQL queries with aggregation functions, the TupleExprBuilder was sharing the same aggregate operator instance between multiple parent nodes (GroupElem and ExtensionElem). This caused parent reference conflicts that would be detected by the ParentReferenceChecker:

String query = "SELECT (COUNT(?s) AS ?count) WHERE { ?s ?p ?o }";
TupleExpr expr = new SPARQLParser().parseQuery(query, null).getTupleExpr();
new ParentReferenceChecker(null).optimize(expr, new SimpleDataset(), new EmptyBindingSet());
// Throws: AssertionError: After query parsing there was an unexpected parent for node Count

The issue occurred in both simple aggregations like COUNT(?s) and complex expressions like COUNT(?s)/30.

Root Cause

In TupleExprBuilder.visit(ASTSelect), when processing aggregations for implicit grouping:

  1. The aggregate operator was added to a GroupElem for grouping semantics
  2. The same operator instance was also used in an ExtensionElem for SELECT projection
  3. Both parent nodes called setParentNode() on the same operator, creating conflicting parent references

Solution

Fixed the issue by strategically cloning aggregate operators at three key points:

  1. Simple aggregations: Clone the operator when adding to GroupElem (line 572)
  2. Complex aggregations: Clone the operator when adding to ExtensionElem in anonymous extensions (line 581)
  3. ExtensionElem creation: Clone valueExpr if it contains aggregates when creating the main extension (line 600)

This ensures each parent node has its own copy of the aggregate operator, eliminating parent reference conflicts while preserving query semantics.

Testing

  • Added comprehensive test cases covering both simple and complex aggregation scenarios
  • Verified all existing tests continue to pass (494 evaluation tests, 82 parser tests)
  • Query execution functionality remains unaffected (as expected, since this only fixes tree structure consistency)

Fixes #[issue-number]

Original prompt

This section details on the original issue you should resolve

<issue_title>Unexpected parent for node after query parsing with aggregation</issue_title>
<issue_description>It seems that there is a bug in constructing the tuple expression for queries with aggregations. E.g., this code throws an exception:

String query =
        """
        SELECT (COUNT(?s) AS ?count)
        WHERE { ?s ?p ?o  }
        """;

TupleExpr expr = new SPARQLParser().parseQuery(query, null).getTupleExpr();
new ParentReferenceChecker(null).optimize(expr, new SimpleDataset(), new EmptyBindingSet());

The exception message:

java.lang.AssertionError: After query parsing there was an unexpected parent for node Count
   Var (name=s)
: ExtensionElem (count)
   Count
      Var (name=s)
 (expected GroupElem (count)
   Count
      Var (name=s)
)

There aren't any issues with the query execution. But it does cause issues with optimisers that e.g. replace nodes. The exact 'location' the parent reference is mixed up seems to depend on the query 'shape' a bit. E.g., with a more complex expression in the projection like SELECT (COUNT(?s)/30 AS ?count) it's the ?s ?p ?o StatementPattern that gets the wrong parent reference.

Version

4.3.8

Are you interested in contributing a solution yourself?

Perhaps?</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #4992

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

@Copilot Copilot AI changed the title [WIP] Unexpected parent for node after query parsing with aggregation Fix parent reference conflicts in SPARQL aggregation query parsing Oct 5, 2025
@Copilot Copilot AI requested a review from hmottestad October 5, 2025 20:30
Copilot finished work on behalf of hmottestad October 5, 2025 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unexpected parent for node after query parsing with aggregation

2 participants