The HashJoin operator has two versions: HashJoinV1 and HashJoinV2. You can specify the desired version using the [`tidb_hash_join_version`](/system-variables.md#tidb_hash_join_version-new-in-v840) system variable. The following sections describe the execution process of each version respectively.

#### HashJoinV1

The `HashJoin` operator has an inner worker, an outer worker, and N join workers. The detailed execution process is as follows:
1. The inner worker reads inner table rows and constructs a hash table.
- `probe`: The total time consumed for joining with outer table rows and the hash table.
- `fetch`: The total time that the join worker waits to read the outer table rows data.

#### HashJoinV2

On the build side, the `HashJoin` operator has one fetcher, N row table builders, and N hash table builders. On the probe side, it has one fetcher and N workers. The detailed execution process is as follows:

1. The fetcher on the build side reads data from the downstream executor and dispatches data to each row table builder.
2. Each row table builder receives data chunks, splits them into several partitions, and builds row tables.
3. The process waits until all row tables are built.
4. Hash table builders build hash tables using row tables.
5. The fetcher on the probe side reads data from the downstream executor and dispatches it to workers.
6. After receiving data, workers look up hash tables, build the final results, and dispatch the results to the result channel.
7. The main thread of `HashJoin` retrieves the join results from the result channel.

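The steps above can be illustrated with a minimal, single-threaded Python sketch. It is not TiDB's implementation: the function names, the `NUM_PARTITIONS` value, and the tuple-based row format are assumptions made for illustration, and the concurrency (fetchers, N builders, N workers, and the result channel) is deliberately left out so that only the data flow remains.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen only for illustration


def build_row_tables(build_rows, key_index, num_partitions=NUM_PARTITIONS):
    """Steps 1-3: split build-side rows into partitions (the "row tables")."""
    row_tables = [[] for _ in range(num_partitions)]
    for row in build_rows:
        row_tables[hash(row[key_index]) % num_partitions].append(row)
    return row_tables


def build_hash_tables(row_tables, key_index):
    """Step 4: build one hash table per partition from its row table."""
    hash_tables = []
    for rows in row_tables:
        table = defaultdict(list)
        for row in rows:
            table[row[key_index]].append(row)
        hash_tables.append(table)
    return hash_tables


def probe(probe_rows, hash_tables, key_index):
    """Steps 5-7: look up each probe-side row and collect the joined rows."""
    results = []
    for row in probe_rows:
        key = row[key_index]
        # Pick the partition the same way the build side did.
        table = hash_tables[hash(key) % len(hash_tables)]
        for match in table.get(key, []):
            results.append(match + row)
    return results


def hash_join(build_rows, probe_rows, build_key=0, probe_key=0):
    """Run the build phase, then the probe phase (inner join on one key)."""
    row_tables = build_row_tables(build_rows, build_key)
    hash_tables = build_hash_tables(row_tables, build_key)
    return probe(probe_rows, hash_tables, probe_key)


if __name__ == "__main__":
    build_side = [(1, "a"), (2, "b"), (2, "c")]
    probe_side = [(2, "x"), (3, "y"), (1, "z")]
    for joined_row in hash_join(build_side, probe_side):
        print(joined_row)
```

Partitioning before building hash tables is what allows each row table builder and hash table builder to work on its own partition independently. In this sketch the partitions are simply processed one after another.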
The `HashJoin` operator contains the following execution information:
- `build_hash_table`: The execution information of reading data from the downstream operator and building hash tables.
    - `time`: The total time spent building hash tables.
    - `fetch`: The total time spent reading data from the downstream.
    - `max_partition`: The longest execution time among all row table builders.
    - `total_partition`: The total execution time taken by all row table builders.
    - `max_build`: The longest execution time among all hash table builders.
    - `total_build`: The total execution time taken by all hash table builders.
- `probe`: The execution information of reading data from the downstream operator and performing probe operations.
    - `time`: The total time spent probing.
    - `fetch_and_wait`: The total time spent reading data from the downstream and waiting for the data to be received by the upstream.
    - `max_worker_time`: The longest execution time among all workers, including reading data from the downstream, executing probe operations, and waiting for the data to be received by the upstream.
    - `total_worker_time`: The total execution time of all workers.
    - `max_probe`: The longest probe time among all workers.
    - `total_probe`: The total probing time of all workers.
    - `probe_collision`: The number of hash collisions encountered during probing.
- `spill`: The execution information during the spill.
    - `round`: The number of spill rounds.
    - `spilled_partition_num_per_round`: The number of spilled partitions per round, formatted as `x/y`, where `x` is the number of spilled partitions and `y` is the total number of partitions.
    - `total_spill_GiB_per_round`: The total size of data written to disk in each spill round.
    - `build_spill_row_table_GiB_per_round`: The total size of row table data written to disk in each spill round on the build side.
    - `build_spill_hash_table_per_round`: The total size of hash table data written to disk in each spill round on the build side.

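As a small illustration of the `x/y` format used by `spilled_partition_num_per_round`, the following Python sketch parses a single entry and computes the fraction of partitions that spilled. The helper names are hypothetical. The list above does not specify how entries for multiple rounds are separated, so the sketch handles one `x/y` entry at a time.

```python
def parse_spilled_partitions(value):
    """Parse one `x/y` entry into (spilled, total) partition counts."""
    spilled_text, total_text = value.strip().split("/")
    spilled, total = int(spilled_text), int(total_text)
    if total <= 0 or not 0 <= spilled <= total:
        raise ValueError(f"invalid spilled partition entry: {value!r}")
    return spilled, total


def spilled_fraction(value):
    """Return the share of partitions that spilled in one round."""
    spilled, total = parse_spilled_partitions(value)
    return spilled / total


if __name__ == "__main__":
    print(parse_spilled_partitions("3/8"))
    print(spilled_fraction("4/16"))
```

A fraction close to 1 means that most partitions were written to disk in that round.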
### TableFullScan (TiFlash)
The `TableFullScan` operator executed on a TiFlash node contains the following execution information: