[cherry-pick][feature](load) new ingestion load (#45937) #46092

Open
wants to merge 1 commit into
base: branch-3.0

@gnehil gnehil commented Dec 27, 2024

Cherry-picked from #45937

### What problem does this PR solve?

Problem Summary:

Ingestion Load loads pre-processed data into Doris.

Here, preprocessing means that an external system processes the data
according to the partitioning, bucketing, and aggregation methods defined
by the Doris table, then writes the result to an external storage system.

Once the external system has finished preprocessing, the BE reads the
result data, converts it into segment files, and saves them.

The basic flow is as follows:

![ingestion_load](https://github.com/apache/doris/assets/30104232/aa468cd4-90bf-4d9d-b69b-0425b66b15f4)
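
To make the preprocessing step described above more concrete, here is a minimal Python sketch of what an external system might do before handing data to the BE. Everything in it is an assumption for illustration: the monthly partitioning rule, the bucket count, the `zlib.crc32` stand-in hash, the SUM aggregation, and the names `partition_of`, `bucket_of`, and `preprocess` are hypothetical and do not come from this PR or from Doris itself.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, not taken from the PR


def partition_of(row):
    # Hypothetical partitioning rule: one partition per month of the date.
    return row["dt"][:7]


def bucket_of(row):
    # Stand-in hash for illustration; not Doris's actual bucketing function.
    return zlib.crc32(row["user_id"].encode("utf-8")) % NUM_BUCKETS


def preprocess(rows):
    """Group rows by (partition, bucket) and pre-aggregate `cost` with SUM."""
    out = defaultdict(lambda: defaultdict(int))
    for row in rows:
        key = (row["dt"], row["user_id"])  # assumed key columns
        out[(partition_of(row), bucket_of(row))][key] += row["cost"]
    return out


rows = [
    {"dt": "2024-12-01", "user_id": "u1", "cost": 3},
    {"dt": "2024-12-01", "user_id": "u1", "cost": 4},
    {"dt": "2024-11-30", "user_id": "u2", "cost": 5},
]

result = preprocess(rows)
for (partition, bucket), aggregated in sorted(result.items()):
    # In a real pipeline each group would be written to external storage.
    print(partition, bucket, dict(aggregated))
```

In this sketch the two `u1` rows share the same partition, bucket, and key, so they collapse into a single pre-aggregated row before anything is written out.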

### Release note
[feature](load) new ingestion load

(cherry picked from commit 6580f6b)
@hello-stephen

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

gnehil commented Dec 27, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 41058 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit a0b16e643c07b38c2168c132c36ae4a5252d4109, data reload: false

------ Round 1 ----------------------------------
q1	17586	7536	7348	7348
q2	2063	177	176	176
q3	10760	1061	1205	1061
q4	10559	745	701	701
q5	7745	2862	2869	2862
q6	235	148	146	146
q7	977	607	611	607
q8	9361	1974	2061	1974
q9	6680	6506	6474	6474
q10	7042	2271	2311	2271
q11	471	272	261	261
q12	403	216	208	208
q13	17785	2987	3013	2987
q14	234	209	206	206
q15	559	535	523	523
q16	711	624	620	620
q17	992	595	570	570
q18	7383	6727	6657	6657
q19	1419	1013	1003	1003
q20	475	204	200	200
q21	4160	3213	3220	3213
q22	1089	1023	990	990
Total cold run time: 108689 ms
Total hot run time: 41058 ms
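
The Round 1 totals above appear to follow directly from the rows. The sketch below assumes the numeric columns are the cold run, two hot runs, and the better of the two hot runs, all in milliseconds. That reading is inferred from the numbers, not stated by the bot, and the `totals` helper is hypothetical. It sums the cold column and the smaller hot run per query.

```python
# Round 1 rows copied from the table above: query, cold, hot run 1, hot run 2 (ms).
# The fourth column of the table is omitted and recomputed as min(hot1, hot2).
ROUND_1 = """
q1 17586 7536 7348
q2 2063 177 176
q3 10760 1061 1205
q4 10559 745 701
q5 7745 2862 2869
q6 235 148 146
q7 977 607 611
q8 9361 1974 2061
q9 6680 6506 6474
q10 7042 2271 2311
q11 471 272 261
q12 403 216 208
q13 17785 2987 3013
q14 234 209 206
q15 559 535 523
q16 711 624 620
q17 992 595 570
q18 7383 6727 6657
q19 1419 1013 1003
q20 475 204 200
q21 4160 3213 3220
q22 1089 1023 990
"""


def totals(table):
    """Return (total cold ms, total hot ms) for a whitespace-separated table."""
    cold_total = 0
    hot_total = 0
    for line in table.splitlines():
        if not line.strip():
            continue  # skip the blank lines around the data
        _query, cold, hot1, hot2 = line.split()
        cold_total += int(cold)
        hot_total += min(int(hot1), int(hot2))  # best of the two hot runs
    return cold_total, hot_total


cold_total, hot_total = totals(ROUND_1)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under that assumption the computed values match the reported 108689 ms cold and 41058 ms hot totals.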

----- Round 2, with runtime_filter_mode=off -----
q1	7421	7240	7280	7240
q2	322	231	230	230
q3	2911	2927	2977	2927
q4	2062	1888	1791	1791
q5	5714	5757	5743	5743
q6	224	147	143	143
q7	2275	1792	1832	1792
q8	3395	3577	3584	3577
q9	8816	8899	8918	8899
q10	3606	3545	3572	3545
q11	588	505	493	493
q12	801	602	642	602
q13	9064	3131	3133	3131
q14	293	273	288	273
q15	572	528	522	522
q16	721	677	670	670
q17	1863	1636	1611	1611
q18	8217	7662	7898	7662
q19	1688	1602	1604	1602
q20	2137	1876	1894	1876
q21	5634	5392	5308	5308
q22	1198	1035	1040	1035
Total cold run time: 69522 ms
Total hot run time: 60672 ms

@doris-robot

TPC-DS: Total hot run time: 197853 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a0b16e643c07b38c2168c132c36ae4a5252d4109, data reload: false

query1	1317	924	931	924
query2	6223	2117	2114	2114
query3	10959	4596	4631	4596
query4	33281	23503	23336	23336
query5	3538	470	445	445
query6	268	171	197	171
query7	3987	314	312	312
query8	297	236	226	226
query9	9463	2694	2693	2693
query10	465	284	258	258
query11	18041	15273	15184	15184
query12	158	100	99	99
query13	1539	438	418	418
query14	8595	7654	7088	7088
query15	272	193	190	190
query16	7999	511	444	444
query17	1714	591	591	591
query18	2152	332	325	325
query19	383	166	164	164
query20	134	120	128	120
query21	68	56	56	56
query22	4810	4838	4624	4624
query23	35233	34491	34593	34491
query24	11142	2978	2934	2934
query25	646	415	413	413
query26	1167	174	171	171
query27	2230	305	300	300
query28	7680	2515	2470	2470
query29	848	471	456	456
query30	287	170	170	170
query31	1064	843	860	843
query32	99	51	55	51
query33	833	283	280	280
query34	1315	511	497	497
query35	860	744	737	737
query36	1126	988	985	985
query37	136	69	66	66
query38	4154	3928	4021	3928
query39	1530	1452	1481	1452
query40	142	85	84	84
query41	49	48	49	48
query42	105	94	100	94
query43	539	509	489	489
query44	1257	846	826	826
query45	189	167	172	167
query46	1151	748	709	709
query47	2027	1918	1948	1918
query48	462	379	376	376
query49	952	392	383	383
query50	826	438	427	427
query51	7529	7234	7100	7100
query52	104	96	87	87
query53	249	185	174	174
query54	1338	457	454	454
query55	77	80	73	73
query56	259	241	243	241
query57	1297	1158	1149	1149
query58	247	214	199	199
query59	3304	3031	2875	2875
query60	287	253	246	246
query61	107	111	108	108
query62	886	664	670	664
query63	219	196	186	186
query64	4160	653	650	650
query65	3276	3218	3166	3166
query66	828	303	296	296
query67	16175	15746	15497	15497
query68	4979	563	581	563
query69	441	256	267	256
query70	1184	1128	1087	1087
query71	336	251	282	251
query72	6417	4002	3972	3972
query73	741	338	350	338
query74	10270	9000	9075	9000
query75	3441	2614	2700	2614
query76	2711	977	1066	977
query77	379	260	261	260
query78	10671	9666	9655	9655
query79	2393	610	613	610
query80	1211	459	410	410
query81	558	241	243	241
query82	771	114	113	113
query83	221	145	142	142
query84	227	75	77	75
query85	1586	299	300	299
query86	487	294	294	294
query87	4427	4344	4367	4344
query88	4296	2372	2333	2333
query89	414	290	293	290
query90	2023	183	183	183
query91	181	152	147	147
query92	59	51	47	47
query93	2231	540	545	540
query94	896	296	287	287
query95	353	252	252	252
query96	597	279	269	269
query97	3330	3185	3187	3185
query98	233	222	201	201
query99	1571	1310	1312	1310
Total cold run time: 303689 ms
Total hot run time: 197853 ms

@doris-robot

ClickBench: Total hot run time: 32.67 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a0b16e643c07b38c2168c132c36ae4a5252d4109, data reload: false

query1	0.03	0.03	0.04
query2	0.07	0.04	0.03
query3	0.23	0.07	0.06
query4	1.63	0.11	0.11
query5	0.52	0.50	0.51
query6	1.14	0.73	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.57	0.51	0.50
query10	0.55	0.55	0.56
query11	0.14	0.10	0.10
query12	0.14	0.12	0.11
query13	0.62	0.61	0.60
query14	2.89	2.99	3.07
query15	0.91	0.82	0.82
query16	0.36	0.39	0.39
query17	1.00	1.07	1.05
query18	0.23	0.22	0.22
query19	1.85	1.88	1.98
query20	0.02	0.01	0.00
query21	15.36	0.60	0.60
query22	2.41	2.50	1.55
query23	17.12	0.96	0.74
query24	3.44	0.76	0.87
query25	0.25	0.11	0.10
query26	0.47	0.14	0.15
query27	0.06	0.03	0.04
query28	10.60	1.12	1.06
query29	12.51	3.28	3.26
query30	0.25	0.06	0.06
query31	2.85	0.39	0.38
query32	3.25	0.47	0.48
query33	2.98	3.02	3.06
query34	17.03	4.49	4.50
query35	4.55	4.57	4.51
query36	0.69	0.52	0.50
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.12
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.27 s
Total hot run time: 32.67 s
