Feat/core/replace rowstream with arrow tpch #20

Open
wants to merge 103 commits into
base: main
Commits
849acf4
setup allocator for RequestContext
aqni Sep 14, 2024
cf95b00
store
aqni Sep 14, 2024
82cab1b
add some TODO
aqni Sep 18, 2024
e984b88
feat(arrow): add framework of operator executor and impl row-to-colum…
aqni Sep 19, 2024
5d04f2b
feat(core): impl Projector (#452)
jzl18thu Sep 20, 2024
728ff46
feat(core): release resources correctly && adjust some sessions (#455)
jzl18thu Oct 5, 2024
2c5bd8d
start to finish stage 2 (#464)
aqni Oct 14, 2024
e50d7dc
remove type declaration in RowTransform
aqni Oct 15, 2024
12bca48
feat(arrow): row mapping implementation in arrow version (#466)
shinyano Oct 16, 2024
816f7fb
feat(arrow):revert multiply expr (#467)
shinyano Oct 17, 2024
2d33e22
feat(core): impl filter (#468)
aqni Oct 17, 2024
fa2c9cc
feat: refactor accumulator (#469)
aqni Oct 17, 2024
6ee3c0d
refactor: cast (#471)
aqni Oct 18, 2024
cd2e14c
refactor time accumulator (#472)
aqni Oct 18, 2024
f666df3
fix insert (#474)
aqni Oct 18, 2024
72a882d
fix insert again (#475)
aqni Oct 18, 2024
987f91c
feat(arrow): implement constant & bracket expr; bool filter (#479)
shinyano Oct 23, 2024
e203ce2
feat(arrow): impl group-by without distinct (#480)
aqni Oct 23, 2024
537da40
feat(arrow): impl sort inner batch (#482)
aqni Oct 24, 2024
2cb7bb5
feat(arrow): logical & casewhen (#483)
shinyano Oct 31, 2024
f218db1
feat(arrow): impl tpch-h q3 (#486)
aqni Oct 31, 2024
4bcb4d8
fix (#487)
aqni Oct 31, 2024
7eb903b
feat(arrow): impl TPC-H q4 (#489)
aqni Nov 1, 2024
67ba828
feat(arrow): stage4 (#490)
aqni Nov 5, 2024
a9ae96d
feat(arrow): short circuit logic, ConstantPool and constant function …
aqni Nov 6, 2024
f4171b0
feat(arrow): short circuit logic, constant pool, constant function an…
aqni Nov 14, 2024
32fae0c
fix(arrow): max, min, first, last
aqni Nov 14, 2024
0695627
fix(arrow): add prefixed key column in join result
aqni Nov 14, 2024
1b19a0d
feat(arrow): impl remain operator by converting batch to rows (#499)
aqni Nov 18, 2024
3fca3c5
refactor: add hasNext method for BatchStream (#500)
aqni Nov 20, 2024
50aa52f
feat(arrow): fetch async (#501)
aqni Nov 21, 2024
7ea3312
feat(arrow): parallel pipeline task for RowTransform (#504)
aqni Nov 26, 2024
1c20218
feat(arrow): impl value2meta (#506)
aqni Nov 27, 2024
c326dbb
fix(arrow): fix most of SQLSessionIT (#508)
aqni Nov 29, 2024
c21b1c5
fix(arrow): most of SQLSessionIT (#509)
aqni Dec 2, 2024
6d11412
Feat/core/replace rowstream with arrow feat/core/replace rowstream wi…
aqni Dec 2, 2024
b82a042
fix missing bracket (#511)
aqni Dec 3, 2024
3b582cb
fix SQLSessionIT for some of new PR (#512)
aqni Dec 3, 2024
5b13ddc
fix: sqlsessionit and udf (#513)
aqni Dec 4, 2024
8371051
fix unit-test (#515)
aqni Dec 5, 2024
6fbcceb
fix: tpc-h (#519)
aqni Dec 6, 2024
208cc51
fix: logic short-circuit in OrNode
aqni Dec 6, 2024
1cb22e3
fix(arrow): copy logic operator in physical engine (#532)
aqni Dec 12, 2024
02c81ce
fix(arrow): filter push down & add RemoveNullColumnExecutor info (#533)
aqni Dec 13, 2024
a344e76
feat(arrow): refactor & session implements (#521)
shinyano Dec 19, 2024
b7c2144
fix: count * contains key (#548)
aqni Dec 27, 2024
44d81a9
merge main (#550)
aqni Dec 30, 2024
f2b64e8
feat(arrow): merge main (#560)
aqni Jan 23, 2025
43eeff0
feat(arrow): merge main (#561)
aqni Jan 23, 2025
2d10b6a
Merge branch 'main' into feat/core/replace-rowstream-with-arrow
aqni Jan 23, 2025
bbf4921
merge main (#568)
aqni Feb 9, 2025
9e4036d
Merge branch 'main' into feat/core/replace-rowstream-with-arrow
aqni Feb 9, 2025
60fee23
fix
aqni Feb 9, 2025
64089e9
feat(arrow): use arrow impl cross-join (#573)
aqni Feb 14, 2025
0c33230
fix(arrow): fix stream query && insert with select (#574)
shinyano Feb 21, 2025
3ec8b9e
Merge branch 'main' into feat/core/replace-rowstream-with-arrow
aqni Feb 21, 2025
0965874
fix(arrow): remove unuseful code (#575)
aqni Feb 21, 2025
5422cfc
Revert "fix(arrow): fix stream query && insert with select (#574)" (#…
aqni Feb 21, 2025
7ff3da5
fix(arrow): fix stream query && insert with select (#577)
shinyano Feb 24, 2025
b278628
merge main
aqni Feb 24, 2025
068db5d
fix some problems after merge main
aqni Feb 24, 2025
b3e2f46
empty stream (#579)
shinyano Feb 25, 2025
a22ce17
fix: removeNullColumn and HashJoin (#580)
aqni Feb 25, 2025
18e771a
feat(arrow): export byte && csv (#581)
shinyano Feb 25, 2025
7140d48
empty res
shinyano Feb 25, 2025
8b31813
feat(arrow): empty result (#583)
shinyano Feb 26, 2025
891ecc9
fix: compile (#584)
aqni Feb 27, 2025
7afe655
fix(arrow): fix empty res with header && insert from select (#585)
shinyano Feb 27, 2025
f7ead63
fix: count point 0
aqni Feb 27, 2025
e8b5d6d
fix: parallel RowTransform
aqni Feb 27, 2025
c13bce5
fix: NewSessionIT
aqni Feb 27, 2025
b485068
fix: TagIT
aqni Feb 27, 2025
ae37792
feat(arrow): transform && empty res (#586)
shinyano Feb 27, 2025
de99493
fix: RestAnnotationIT
aqni Feb 27, 2025
3036300
fix: RestAnnotationIT
aqni Feb 27, 2025
72abcbb
fix: CompactionIT
aqni Feb 27, 2025
a776a99
fix: UDFIT
aqni Feb 27, 2025
1b5a0c8
fix: TPCH
aqni Feb 27, 2025
257f820
fix: SQLSessionIT
aqni Feb 27, 2025
05c2d47
fix: UDFIT
aqni Feb 27, 2025
be6f701
fix: TPCH
aqni Feb 27, 2025
79247d1
fix: DBCE
aqni Feb 27, 2025
df3cdec
fix: IT of Influxdb
aqni Feb 28, 2025
266f4bf
fix: avoid transfer Arrow to Row in TPCH Q15
aqni Feb 28, 2025
b1af7cf
fix: add TPCH Warmup repeats and records
aqni Feb 28, 2025
e0cbe8c
Update SQLWarmupIT.java
aqni Mar 2, 2025
18786b4
fix last commit
aqni Mar 4, 2025
cdc80b6
feat(arrow): export & PySessionIT (#588)
shinyano Mar 4, 2025
e41cf90
format
aqni Mar 4, 2025
c1cb3fc
remove incorrect opt rule
aqni Mar 4, 2025
40b042e
fix(arrow): install pyarrow (#590)
shinyano Mar 5, 2025
5b150a8
merge main
aqni Mar 6, 2025
d24e8fa
fix: UdfIT (#592)
aqni Mar 7, 2025
bc68b9a
test tpch only
aqni Mar 4, 2025
1481ca3
avoid building Parallel Pipeline for no udf RowTransform
aqni Mar 4, 2025
af87e71
avoid extra fetch async
aqni Mar 4, 2025
ec6f7dd
revert max_repetitions_num
aqni Mar 4, 2025
7190d8f
Revert "Update SQLWarmupIT.java"
aqni Mar 4, 2025
e18b4f4
Revert "fix: add TPCH Warmup repeats and records"
aqni Mar 4, 2025
d648c3a
refactor: join physical planner
aqni Mar 4, 2025
6b50cc5
refactor: arch of physical join
aqni Mar 7, 2025
a8fca7a
feat: impl arrow NestedLoopJoin
aqni Mar 7, 2025
82f60fc
refactor: dedup physical join code
aqni Mar 7, 2025
1 change: 1 addition & 0 deletions .github/actions/dependence/action.yml
@@ -83,6 +83,7 @@ runs:
 echo "tqdm" >> requirements.txt
 echo "requests" >> requirements.txt
 echo "torch" >> requirements.txt
+echo "pyarrow" >> requirements.txt

 - if: inputs.scope=='all' && inputs.iginx-conda-flag != 'true'
 name: Set up Python ${{ inputs.python-version }}
2 changes: 2 additions & 0 deletions .github/actions/service/freeThreadConda/action.yml
@@ -27,10 +27,12 @@ runs:
 python -VV
 python -c "import sys;print(sys._is_gil_enabled())"
 curl -L -O https://github.com/IGinX-THU/IGinX-resources/raw/refs/heads/main/resources/python/pandas-3.0.0.dev0+1654.g32a97a969a-cp313-cp313t-win_amd64.whl
+curl -L -O https://pypi.fury.io/arrow-nightlies/-/ver_1AsSYu/pyarrow-19.0.0.dev286-cp313-cp313t-win_amd64.whl#md5=3c940f7b4ffd0debaa8b23d06bc9b2ea
 ls -l
 if [ "$RUNNER_OS" == "Windows" ]; then
 python -m pip install numpy thrift pemjax
 python -m pip install pandas*.whl
+python3.13t -m pip install pyarrow*.whl
 else
 python -m pip install pandas==2.2.3 numpy thrift pemjax
 fi
14 changes: 12 additions & 2 deletions .github/actions/tpchSingleTest/action.yml
@@ -49,17 +49,26 @@ runs:
 - name: Run SQL Warmup on Old IGinX
 if: inputs.status != 'ok'
 shell: bash -el {0}
+working-directory: IGinX
 run: mvn test -q -Dtest=SQLWarmupIT -DfailIfNoTests=false -P-format

 - name: Run TPCH Test on Old IGinX
 if: inputs.status != 'ok'
 shell: bash -el {0}
-run: mvn test -q -Dtest=TPCHOldIT -DfailIfNoTests=false -P-format
+working-directory: IGinX
+run: |
+mkdir -p test/src/test/resources/tpch/runtimeInfo
+cp -f ../test/src/test/resources/tpch/runtimeInfo/newTimeCosts.txt test/src/test/resources/tpch/runtimeInfo/newTimeCosts.txt
+mvn test -q -Dtest=TPCHOldIT -DfailIfNoTests=false -P-format
+mkdir -p ../test/src/test/resources/tpch/runtimeInfo
+cp -f test/src/test/resources/tpch/runtimeInfo/failedQueryIds.txt ../test/src/test/resources/tpch/runtimeInfo/failedQueryIds.txt
+cp -f test/src/test/resources/tpch/runtimeInfo/iterationTimes.txt ../test/src/test/resources/tpch/runtimeInfo/iterationTimes.txt

 - name: Show Old IGinX log
 if: always() && inputs.status != 'ok'
 shell: bash
-run: cat IGinX/iginx-*.log
+working-directory: IGinX
+run: cat iginx-*.log

 - name: Stop Old IGinX
 if: inputs.status != 'ok'
@@ -70,6 +79,7 @@ runs:
 - name: Get Test Result
 id: get
 shell: bash
+working-directory: IGinX
 run: |
 if [ -f test/src/test/resources/tpch/runtimeInfo/status.txt ]; then
 STATUS=$(cat test/src/test/resources/tpch/runtimeInfo/status.txt)
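The "Run TPCH Test on Old IGinX" step above copies `newTimeCosts.txt` from the parent directory into the `IGinX` checkout, runs the test there, and then copies `failedQueryIds.txt` and `iterationTimes.txt` back out. The following is a minimal sketch of that exchange as a standalone bash function, for trying the file movement locally. The function name `exchange_runtime_info`, its arguments, and the `REL` variable are hypothetical and not part of the PR; only the paths and file names come from the step.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and arguments are illustrative assumptions.
set -euo pipefail

# Relative location of the runtime info files, as used in the workflow step.
REL=test/src/test/resources/tpch/runtimeInfo

# $1: parent directory, $2: old checkout (IGinX in the step), $3...: test command
exchange_runtime_info() {
  local parent_root=$1 old_root=$2
  shift 2

  # Copy newTimeCosts.txt from the parent directory into the old checkout.
  mkdir -p "$old_root/$REL"
  cp -f "$parent_root/$REL/newTimeCosts.txt" "$old_root/$REL/newTimeCosts.txt"

  # Run the test command inside the old checkout (mvn test in the real step).
  (cd "$old_root" && "$@")

  # Copy the files written during the run back to the parent directory.
  mkdir -p "$parent_root/$REL"
  cp -f "$old_root/$REL/failedQueryIds.txt" "$parent_root/$REL/failedQueryIds.txt"
  cp -f "$old_root/$REL/iterationTimes.txt" "$parent_root/$REL/iterationTimes.txt"
}
```

A call such as `exchange_runtime_info "$PWD" "$PWD/IGinX" mvn test -q -Dtest=TPCHOldIT -DfailIfNoTests=false -P-format` would mirror the step. Because of `set -e`, a missing input or output file stops the script, which matches how a failing `cp` would fail the workflow step.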
162 changes: 81 additions & 81 deletions .github/workflows/standard-test-suite.yml
@@ -11,72 +11,72 @@ concurrency:
 cancel-in-progress: true

 jobs:
-unit-test:
-uses: ./.github/workflows/unit-test.yml
-unit-mds:
-uses: ./.github/workflows/unit-mds.yml
-case-regression:
-uses: ./.github/workflows/case-regression.yml
-with:
-metadata-matrix: '["zookeeper"]'
-standalone-test:
-uses: ./.github/workflows/standalone-test.yml
-with:
-metadata-matrix: '["zookeeper"]'
-standalone-test-no-optimizer:
-uses: ./.github/workflows/standalone-test-no-optimizer.yml
-with:
-metadata-matrix: '["zookeeper"]'
-db-ce:
-uses: ./.github/workflows/DB-CE.yml
-with:
-metadata-matrix: '["zookeeper"]'
-db-ce-no-optimizer:
-uses: ./.github/workflows/DB-CE.yml
-with:
-metadata-matrix: '["zookeeper"]'
-close-optimizer: "true"
-standalone-test-vectordb:
-uses: ./.github/workflows/standalone-test.yml
-with:
-metadata-matrix: '["zookeeper"]'
-os-matrix: '["ubuntu-latest", "windows-latest"]'
-db-matrix: '["VectorDB"]'
-timeout-minutes: 300
-standalone-test-no-optimizer-vectordb:
-uses: ./.github/workflows/standalone-test-no-optimizer.yml
-with:
-metadata-matrix: '["zookeeper"]'
-os-matrix: '["ubuntu-latest", "windows-latest"]'
-db-matrix: '["VectorDB"]'
-timeout-minutes: 300
-db-ce-vectordb:
-uses: ./.github/workflows/DB-CE.yml
-with:
-metadata-matrix: '["zookeeper"]'
-os-matrix: '["ubuntu-latest", "windows-latest"]'
-db-matrix: '["VectorDB"]'
-functest: "NewSessionIT,SQLCompareIT,TagIT,RestIT,TransformIT,UDFIT,RestAnnotationIT,SQLSessionIT,SQLSessionPoolIT,SessionV2IT,CompactionIT,TimePrecisionIT,PySessionIT"
-timeout-minutes: 360
-db-ce-no-optimizer-vectordb:
-uses: ./.github/workflows/DB-CE.yml
-with:
-metadata-matrix: '["zookeeper"]'
-os-matrix: '["ubuntu-latest", "windows-latest"]'
-db-matrix: '["VectorDB"]'
-functest: "NewSessionIT,SQLCompareIT,TagIT,RestIT,TransformIT,UDFIT,RestAnnotationIT,SQLSessionIT,SQLSessionPoolIT,SessionV2IT,CompactionIT,TimePrecisionIT,PySessionIT"
-timeout-minutes: 360
-close-optimizer: "true"
-remote-test:
-uses: ./.github/workflows/remote-test.yml
-with:
-metadata-matrix: '["zookeeper"]'
-assembly-test:
-uses: ./.github/workflows/assembly-test.yml
-free-thread-test:
-uses: ./.github/workflows/free-thread-test.yml
-with:
-metadata-matrix: '["zookeeper"]'
+# unit-test:
+# uses: ./.github/workflows/unit-test.yml
+# unit-mds:
+# uses: ./.github/workflows/unit-mds.yml
+# case-regression:
+# uses: ./.github/workflows/case-regression.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# standalone-test:
+# uses: ./.github/workflows/standalone-test.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# standalone-test-no-optimizer:
+# uses: ./.github/workflows/standalone-test-no-optimizer.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# db-ce:
+# uses: ./.github/workflows/DB-CE.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# db-ce-no-optimizer:
+# uses: ./.github/workflows/DB-CE.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# close-optimizer: "true"
+# standalone-test-vectordb:
+# uses: ./.github/workflows/standalone-test.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# os-matrix: '["ubuntu-latest", "windows-latest"]'
+# db-matrix: '["VectorDB"]'
+# timeout-minutes: 300
+# standalone-test-no-optimizer-vectordb:
+# uses: ./.github/workflows/standalone-test-no-optimizer.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# os-matrix: '["ubuntu-latest", "windows-latest"]'
+# db-matrix: '["VectorDB"]'
+# timeout-minutes: 300
+# db-ce-vectordb:
+# uses: ./.github/workflows/DB-CE.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# os-matrix: '["ubuntu-latest", "windows-latest"]'
+# db-matrix: '["VectorDB"]'
+# functest: "NewSessionIT,SQLCompareIT,TagIT,RestIT,TransformIT,UDFIT,RestAnnotationIT,SQLSessionIT,SQLSessionPoolIT,SessionV2IT,CompactionIT,TimePrecisionIT,PySessionIT"
+# timeout-minutes: 360
+# db-ce-no-optimizer-vectordb:
+# uses: ./.github/workflows/DB-CE.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# os-matrix: '["ubuntu-latest", "windows-latest"]'
+# db-matrix: '["VectorDB"]'
+# functest: "NewSessionIT,SQLCompareIT,TagIT,RestIT,TransformIT,UDFIT,RestAnnotationIT,SQLSessionIT,SQLSessionPoolIT,SessionV2IT,CompactionIT,TimePrecisionIT,PySessionIT"
+# timeout-minutes: 360
+# close-optimizer: "true"
+# remote-test:
+# uses: ./.github/workflows/remote-test.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
+# assembly-test:
+# uses: ./.github/workflows/assembly-test.yml
+# free-thread-test:
+# uses: ./.github/workflows/free-thread-test.yml
+# with:
+# metadata-matrix: '["zookeeper"]'
 tpc-h-regression-test:
 uses: ./.github/workflows/tpc-h.yml
 with:
@@ -88,18 +88,18 @@ jobs:
 os-matrix: '["ubuntu-latest"]'
 metadata-matrix: '["zookeeper"]'
 close-optimizer: "true"
-tpc-h-regression-test-vectordb:
-uses: ./.github/workflows/tpc-h.yml
-with:
-os-matrix: '["ubuntu-latest"]'
-metadata-matrix: '["zookeeper"]'
-db-matrix: '["VectorDB"]'
-timeout-minutes: 360
-tpc-h-regression-test-no-optimizer-vectordb:
-uses: ./.github/workflows/tpc-h.yml
-with:
-os-matrix: '["ubuntu-latest"]'
-metadata-matrix: '["zookeeper"]'
-db-matrix: '["VectorDB"]'
-close-optimizer: "true"
-timeout-minutes: 360
+# tpc-h-regression-test-vectordb:
+# uses: ./.github/workflows/tpc-h.yml
+# with:
+# os-matrix: '["ubuntu-latest"]'
+# metadata-matrix: '["zookeeper"]'
+# db-matrix: '["VectorDB"]'
+# timeout-minutes: 360
+# tpc-h-regression-test-no-optimizer-vectordb:
+# uses: ./.github/workflows/tpc-h.yml
+# with:
+# os-matrix: '["ubuntu-latest"]'
+# metadata-matrix: '["zookeeper"]'
+# db-matrix: '["VectorDB"]'
+# close-optimizer: "true"
+# timeout-minutes: 360
47 changes: 47 additions & 0 deletions arrow-patch/pom.xml
@@ -0,0 +1,47 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--

IGinX - the polystore system with high performance
Copyright (C) Tsinghua University
[email protected]

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>cn.edu.tsinghua</groupId>
<artifactId>iginx</artifactId>
<version>${revision}</version>
</parent>

<artifactId>arrow-0patch</artifactId>

<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencies>
<dependency>
<groupId>org.apache.arrow</groupId>
<artifactId>arrow-vector</artifactId>
</dependency>
</dependencies>

</project>