[Improvement](fragment) Use partitioned hash map to manage contexts (… #46282

Merged
Gabriel39 merged 2 commits into apache:branch-3.0 on Jan 3, 2025


Gabriel39
Contributor

#46235)

Contexts in fragment_mgr are managed by a global map that multiple threads access concurrently under a single global lock, which introduces noticeable overhead. To reduce contention on that lock, this PR uses a partitioned hash table.
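
For readers unfamiliar with the technique, here is a minimal sketch of a partitioned (sharded) hash map. It is not the code in this PR: the class name `PartitionedMap`, the default of 16 shards, the use of `std::shared_mutex`, and the `std::string` keys in the usage note are all assumptions made for illustration. The point is only that each key hashes to one shard, so threads working on different shards do not wait on the same lock.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical illustration; names and shard count are assumptions,
// not taken from the Doris code base.
template <typename Key, typename Value, std::size_t NumShards = 16>
class PartitionedMap {
public:
    // Insert or overwrite; only the shard that owns `key` is locked.
    void insert(const Key& key, std::shared_ptr<Value> value) {
        Shard& shard = shard_for(key);
        std::unique_lock<std::shared_mutex> lock(shard.mutex);
        shard.map[key] = std::move(value);
    }

    // Look up under a shared (reader) lock; returns nullptr when absent.
    std::shared_ptr<Value> find(const Key& key) const {
        const Shard& shard = shard_for(key);
        std::shared_lock<std::shared_mutex> lock(shard.mutex);
        auto it = shard.map.find(key);
        if (it == shard.map.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Remove the entry; returns true if something was erased.
    bool erase(const Key& key) {
        Shard& shard = shard_for(key);
        std::unique_lock<std::shared_mutex> lock(shard.mutex);
        return shard.map.erase(key) > 0;
    }

    // Sum of shard sizes. Shards are locked one at a time, so the result
    // is not an atomic snapshot while writers are active.
    std::size_t size() const {
        std::size_t total = 0;
        for (const Shard& shard : shards_) {
            std::shared_lock<std::shared_mutex> lock(shard.mutex);
            total += shard.map.size();
        }
        return total;
    }

private:
    struct Shard {
        mutable std::shared_mutex mutex;
        std::unordered_map<Key, std::shared_ptr<Value>> map;
    };

    std::size_t index_of(const Key& key) const {
        return std::hash<Key>{}(key) % NumShards;
    }
    Shard& shard_for(const Key& key) { return shards_[index_of(key)]; }
    const Shard& shard_for(const Key& key) const { return shards_[index_of(key)]; }

    std::array<Shard, NumShards> shards_;
};
```

With this sketch, `PartitionedMap<std::string, int> m; m.insert("q1", std::make_shared<int>(1));` locks only the shard that `"q1"` hashes to. Operations that must see every entry, such as `size()`, still visit all shards, which is the usual trade-off of this design.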

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jan 2, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Gabriel39
Contributor Author

run buildall

…pache#46235)

Contexts in `fragment_mgr` are managed by a global map that multiple
threads access concurrently under a single global lock, which introduces
noticeable overhead. To reduce contention on that lock, this PR uses a
partitioned hash table.
@Gabriel39
Contributor Author

run buildall

@Gabriel39
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41079 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 39730dfb4db970ffd44c2533d86ef480a1e01351, data reload: false

------ Round 1 ----------------------------------
q1	17581	7455	7266	7266
q2	2072	171	170	170
q3	10572	1116	1199	1116
q4	10525	798	786	786
q5	7750	2901	2827	2827
q6	238	143	139	139
q7	966	611	604	604
q8	9356	2011	2070	2011
q9	6642	6429	6424	6424
q10	7010	2291	2345	2291
q11	474	267	264	264
q12	417	218	216	216
q13	17788	3029	2996	2996
q14	243	209	207	207
q15	579	518	524	518
q16	719	615	608	608
q17	983	605	561	561
q18	7338	6725	6575	6575
q19	1395	1094	1107	1094
q20	485	204	198	198
q21	4035	3249	3251	3249
q22	1131	996	959	959
Total cold run time: 108299 ms
Total hot run time: 41079 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7268	7272	7241	7241
q2	325	234	245	234
q3	2975	2949	3003	2949
q4	2064	1886	1824	1824
q5	5774	5778	5795	5778
q6	225	145	146	145
q7	2244	1864	1855	1855
q8	3379	3594	3535	3535
q9	8891	8988	8914	8914
q10	3612	3595	3558	3558
q11	618	500	505	500
q12	845	636	636	636
q13	10620	3202	3157	3157
q14	314	272	273	272
q15	566	517	515	515
q16	709	691	685	685
q17	1839	1646	1599	1599
q18	8327	7791	7610	7610
q19	1678	1566	1490	1490
q20	2132	1874	1897	1874
q21	5611	5510	5379	5379
q22	1230	992	1039	992
Total cold run time: 71246 ms
Total hot run time: 60742 ms
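
The bot output does not label its columns. The Round 1 TPC-H numbers above are consistent with reading each row as one cold run, two hot runs, and the lower of the two hot times, with the totals being the sums of the first and last columns. That reading is inferred from the figures, not stated by the bot. A small sketch that reproduces both Round 1 totals under that assumption (the names `QueryTimes`, `kTpchRound1`, `total_cold_ms`, and `total_hot_ms` are made up for this example):

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// One row of the Round 1 table: cold run, then two hot runs (milliseconds).
// The column meaning is an assumption inferred from the reported totals.
struct QueryTimes {
    int cold_ms;
    int hot1_ms;
    int hot2_ms;
};

// q1..q22, copied from the first three numeric columns of Round 1.
const std::array<QueryTimes, 22> kTpchRound1 = {{
    {17581, 7455, 7266}, {2072, 171, 170},    {10572, 1116, 1199},
    {10525, 798, 786},   {7750, 2901, 2827},  {238, 143, 139},
    {966, 611, 604},     {9356, 2011, 2070},  {6642, 6429, 6424},
    {7010, 2291, 2345},  {474, 267, 264},     {417, 218, 216},
    {17788, 3029, 2996}, {243, 209, 207},     {579, 518, 524},
    {719, 615, 608},     {983, 605, 561},     {7338, 6725, 6575},
    {1395, 1094, 1107},  {485, 204, 198},     {4035, 3249, 3251},
    {1131, 996, 959},
}};

// Sum of the cold-run column.
int total_cold_ms(const std::array<QueryTimes, 22>& rows) {
    int total = 0;
    for (const QueryTimes& row : rows) {
        total += row.cold_ms;
    }
    return total;
}

// Sum of the faster of the two hot runs for each query.
int total_hot_ms(const std::array<QueryTimes, 22>& rows) {
    int total = 0;
    for (const QueryTimes& row : rows) {
        total += std::min(row.hot1_ms, row.hot2_ms);
    }
    return total;
}
```

Calling `total_cold_ms(kTpchRound1)` and `total_hot_ms(kTpchRound1)` yields 108299 and 41079, matching the "Total cold run time" and "Total hot run time" lines reported for Round 1.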

@doris-robot

TPC-DS: Total hot run time: 198633 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 39730dfb4db970ffd44c2533d86ef480a1e01351, data reload: false

query1	1329	939	919	919
query2	6216	2046	2074	2046
query3	10938	4427	4587	4427
query4	67325	29166	23445	23445
query5	4932	471	469	469
query6	419	175	174	174
query7	5518	312	314	312
query8	319	241	242	241
query9	8836	2724	2718	2718
query10	444	283	267	267
query11	17295	15378	15952	15378
query12	159	103	107	103
query13	1503	446	449	446
query14	10076	7644	7748	7644
query15	203	183	184	183
query16	7121	484	491	484
query17	1319	603	597	597
query18	1842	332	343	332
query19	211	177	163	163
query20	119	114	115	114
query21	65	50	45	45
query22	4875	4383	4607	4383
query23	34875	34659	34161	34161
query24	6103	3018	2935	2935
query25	531	418	428	418
query26	653	174	174	174
query27	1707	311	313	311
query28	4532	2543	2517	2517
query29	737	473	474	473
query30	242	172	180	172
query31	1002	836	829	829
query32	70	56	58	56
query33	399	297	281	281
query34	937	535	522	522
query35	859	750	753	750
query36	1094	952	977	952
query37	124	71	76	71
query38	4103	4229	4057	4057
query39	1530	1453	1516	1453
query40	147	83	79	79
query41	50	49	50	49
query42	109	99	102	99
query43	538	502	502	502
query44	1203	858	848	848
query45	188	167	171	167
query46	1138	732	753	732
query47	2009	1873	1911	1873
query48	464	397	392	392
query49	712	393	399	393
query50	872	436	439	436
query51	7405	7292	7112	7112
query52	96	88	87	87
query53	250	178	188	178
query54	554	438	445	438
query55	80	77	79	77
query56	261	246	248	246
query57	1218	1117	1110	1110
query58	217	216	209	209
query59	3126	3184	2996	2996
query60	272	252	262	252
query61	113	108	102	102
query62	776	678	665	665
query63	222	203	187	187
query64	1398	690	668	668
query65	3276	3196	3201	3196
query66	705	301	302	301
query67	15839	15634	15607	15607
query68	4102	597	591	591
query69	453	271	270	270
query70	1233	1063	1151	1063
query71	339	252	254	252
query72	6527	4164	4037	4037
query73	750	351	359	351
query74	10187	8947	9084	8947
query75	3370	2693	2615	2615
query76	1826	1063	977	977
query77	480	268	273	268
query78	10566	9618	9626	9618
query79	1260	604	620	604
query80	870	438	459	438
query81	503	245	241	241
query82	1272	126	117	117
query83	237	148	142	142
query84	286	83	76	76
query85	887	318	296	296
query86	339	296	304	296
query87	4437	4376	4261	4261
query88	3759	2435	2396	2396
query89	416	303	294	294
query90	2032	193	189	189
query91	194	149	155	149
query92	63	49	51	49
query93	1537	549	548	548
query94	842	305	299	299
query95	364	258	263	258
query96	618	284	293	284
query97	3387	3204	3283	3204
query98	214	210	190	190
query99	1575	1322	1290	1290
Total cold run time: 318725 ms
Total hot run time: 198633 ms

@doris-robot

ClickBench: Total hot run time: 33.33 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 39730dfb4db970ffd44c2533d86ef480a1e01351, data reload: false

query1	0.04	0.03	0.03
query2	0.06	0.03	0.03
query3	0.22	0.07	0.07
query4	1.63	0.10	0.10
query5	0.52	0.51	0.52
query6	1.13	0.73	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.04
query9	0.57	0.50	0.50
query10	0.55	0.55	0.55
query11	0.14	0.10	0.10
query12	0.14	0.13	0.11
query13	0.62	0.59	0.59
query14	3.01	3.00	3.01
query15	0.92	0.84	0.84
query16	0.37	0.39	0.38
query17	1.08	1.05	1.02
query18	0.24	0.22	0.22
query19	1.91	1.81	1.92
query20	0.01	0.01	0.02
query21	15.36	0.62	0.58
query22	2.52	2.70	2.36
query23	17.06	0.92	0.95
query24	3.26	1.29	0.38
query25	0.13	0.25	0.13
query26	0.36	0.14	0.14
query27	0.04	0.04	0.04
query28	10.73	1.13	1.08
query29	12.56	3.29	3.28
query30	0.24	0.06	0.06
query31	2.87	0.38	0.40
query32	3.23	0.48	0.47
query33	3.01	3.02	3.04
query34	17.26	4.53	4.51
query35	4.62	4.63	4.57
query36	0.66	0.48	0.48
query37	0.09	0.06	0.06
query38	0.04	0.03	0.04
query39	0.03	0.02	0.02
query40	0.16	0.12	0.12
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.6 s
Total hot run time: 33.33 s

@Gabriel39 Gabriel39 merged commit b2caa40 into apache:branch-3.0 Jan 3, 2025
18 of 21 checks passed
3 participants