Table: Support SELECT aliases in GROUP BY and ORDER BY #17843
I found two correctness issues in the alias resolution path:
SELECT x,
(SELECT COUNT(*) FROM table1 GROUP BY x)
FROM table_with_x
or an inner query with
SELECT row_number() OVER (ORDER BY s1) AS rn
FROM table1
ORDER BY rn
JackieTien97
left a comment
Found one correctness gap in the GROUP BY alias resolution path.
  column = outputExpressions.get(toIntExact(ordinal - 1));
  verifyNoAggregateWindowOrGroupingFunctions(column, "GROUP BY clause");
} else {
  column = resolveGroupBySelectAlias(column, scope, selectAliases);
Could we apply this same alias-resolution step to the GroupingSets branch below as well? Right now only SimpleGroupBy rewrites SELECT aliases, so queries such as SELECT s1 AS x, COUNT(*) FROM table1 GROUP BY ROLLUP(x) still reach analyzeExpression(column, scope) with x unresolved and fail with Column 'x' cannot be resolved. Since ROLLUP, CUBE, and GROUPING SETS are still GROUP BY grouping elements, they should follow the same input-column precedence and SELECT-alias fallback rule.
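To make the request concrete, here is a minimal sketch of applying one resolution rule to every column of every grouping set. It is not the analyzer's code: `GroupingAliasSketch`, `resolve`, and `resolveGroupingSets` are hypothetical names, and columns and expressions are modeled as plain strings rather than `Expression` nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the analyzer; none of these names are IoTDB APIs. */
final class GroupingAliasSketch {

  /** Resolves one unqualified GROUP BY name: input column first, SELECT alias second. */
  static String resolve(String name, Set<String> inputColumns, Map<String, String> selectAliases) {
    if (inputColumns.contains(name)) {
      return name; // input columns take precedence in GROUP BY
    }
    String aliased = selectAliases.get(name);
    if (aliased != null) {
      return aliased; // fall back to the expression behind the SELECT alias
    }
    throw new IllegalArgumentException("Column '" + name + "' cannot be resolved");
  }

  /** Applies the same rule to each column of each grouping set (ROLLUP, CUBE, GROUPING SETS). */
  static List<List<String>> resolveGroupingSets(
      List<List<String>> sets, Set<String> inputColumns, Map<String, String> selectAliases) {
    List<List<String>> resolved = new ArrayList<>();
    for (List<String> set : sets) {
      List<String> columns = new ArrayList<>();
      for (String column : set) {
        columns.add(resolve(column, inputColumns, selectAliases));
      }
      resolved.add(columns);
    }
    return resolved;
  }
}
```

Under these assumptions, the expansion of ROLLUP(x) into the sets (x) and () resolves x to s1 in both branches instead of failing in the grouping-set branch only.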
Pull request overview
This PR updates the table-model SQL analyzer to allow explicit SELECT ... AS <alias> aliases to be reused by name in GROUP BY and ORDER BY, following the documented precedence rules (GROUP BY prefers input columns; ORDER BY prefers output aliases). It also adds focused unit tests and test metadata to validate alias resolution, ambiguity detection, and scope boundaries.
Changes:
- Implement SELECT alias collection during analysis and reuse those aliases in GROUP BY / ORDER BY resolution.
- Add unit tests covering precedence rules, ambiguity errors, invalid alias usages, and subquery scoping.
- Extend test metadata with a new table schema used for name-collision scenarios.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/relational/analyzer/StatementAnalyzer.java | Adds alias capture + resolution logic for GROUP BY / ORDER BY during semantic analysis. |
| iotdb-core/datanode/src/test/java/org/apache/iotdb/db/queryengine/plan/relational/analyzer/SelectAliasReuseTest.java | New test suite verifying alias reuse semantics and error cases. |
| iotdb-core/datanode/src/test/java/org/apache/iotdb/db/queryengine/plan/relational/analyzer/TestMetadata.java | Adds table_with_x schema to test alias-vs-input-column precedence. |
for (SortItem item : sortItems) {
  Expression expression = item.getSortKey();
  Scope expressionScope = sourceScope;
JackieTien97
left a comment
Please add the related test cases to an existing IT class, and make sure they cover all the normal and corner cases. All new functionality needs ITs in addition to UTs.
Scope sourceScope,
Scope orderByScope,
Why do we need another sourceScope?
private static final class SelectAlias {
  private final String canonicalName;
  private final Expression expression;
Do we need to store this? For resolveGroupBySelectAlias, we only need the position as well. With the position, you can pick the expression from outputExpressions, just like the earlier LongLiteral case.
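A rough sketch of what a position-only alias record could look like. This is not the PR's code: `SelectAliasPositionSketch` and its helpers are hypothetical names, the position is assumed to be a 0-based index into the SELECT list, and expressions are modeled as strings.

```java
import java.util.List;

/** Hypothetical sketch; the real SelectAlias lives inside StatementAnalyzer. */
final class SelectAliasPositionSketch {

  /** Keeps only the canonical alias name and its 0-based position in the SELECT list. */
  static final class SelectAlias {
    final String canonicalName;
    final int position;

    SelectAlias(String canonicalName, int position) {
      this.canonicalName = canonicalName;
      this.position = position;
    }
  }

  /** Looks up the aliased expression from the output expressions by position. */
  static String expressionFor(SelectAlias alias, List<String> outputExpressions) {
    return outputExpressions.get(alias.position);
  }

  /** Mirrors the ordinal case: GROUP BY 1 refers to the first output expression. */
  static String expressionForOrdinal(long ordinal, List<String> outputExpressions) {
    return outputExpressions.get(Math.toIntExact(ordinal - 1));
  }
}
```

With this shape, an alias at position 0 and the ordinal 1 go through the same lookup, so no separate copy of the expression needs to be kept on the alias.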
Description
This PR implements Part 1 of #17797 for the table model SQL analyzer.
It allows explicit SELECT aliases to be referenced in GROUP BY and ORDER BY. For example:
The alias is resolved during analysis, so existing semantic checks still apply after alias resolution.
Alias precedence rules
This PR documents and implements the name resolution rules discussed in #17797:
- GROUP BY prefers current-query input columns over SELECT aliases. If an unqualified name does not resolve to a local input column, it may resolve to a matching SELECT alias.
- ORDER BY prefers SELECT output aliases over input columns. If no SELECT alias matches, it falls back to the existing ORDER BY name resolution behavior.
- ORDER BY alias resolution also applies to SELECT DISTINCT, for example SELECT DISTINCT s1 AS x FROM table1 ORDER BY x.
Scope
This PR only handles Part 1 of #17797:
- GROUP BY
- ORDER BY
The following items are intentionally left out of scope for a follow-up PR:
- WHERE
- HAVING
Refs #17797
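The two precedence rules listed under "Alias precedence rules" can be contrasted with a small sketch. It is illustrative only: `AliasPrecedenceSketch` and its methods are hypothetical names, and the ORDER BY fallback is simplified here to a plain input-column lookup rather than the analyzer's existing resolution logic.

```java
import java.util.Map;
import java.util.Set;

/** Illustrative only; the class and method names are hypothetical, not IoTDB APIs. */
final class AliasPrecedenceSketch {

  /** Where an unqualified name ended up resolving. */
  enum Source { INPUT_COLUMN, SELECT_ALIAS }

  /** GROUP BY: a local input column wins; a SELECT alias is only a fallback. */
  static Source resolveGroupBy(String name, Set<String> inputColumns, Map<String, String> aliases) {
    if (inputColumns.contains(name)) {
      return Source.INPUT_COLUMN;
    }
    if (aliases.containsKey(name)) {
      return Source.SELECT_ALIAS;
    }
    throw new IllegalArgumentException("Column '" + name + "' cannot be resolved");
  }

  /** ORDER BY: a SELECT alias wins; otherwise fall back to a (simplified) input-column lookup. */
  static Source resolveOrderBy(String name, Set<String> inputColumns, Map<String, String> aliases) {
    if (aliases.containsKey(name)) {
      return Source.SELECT_ALIAS;
    }
    if (inputColumns.contains(name)) {
      return Source.INPUT_COLUMN;
    }
    throw new IllegalArgumentException("Column '" + name + "' cannot be resolved");
  }
}
```

When a table already has a column named x and the query also declares an alias x, the same name resolves to the input column in GROUP BY and to the alias in ORDER BY under this sketch.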
This PR has:
Key changed/added classes (or packages if there are too many classes) in this PR
- StatementAnalyzer
- SelectAliasReuseTest
- TestMetadata
Test
./mvnw test -pl iotdb-core/datanode -am -Dtest=SelectAliasReuseTest -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false -DskipITs
Result: