forked from secretflow/secretflow
translation.json
{
".": {
"secretflow": "secretflow",
"First-party SecretFlow components.": "官方SecretFlow组件"
},
"data_filter/condition_filter:0.0.1": {
"data_filter": "数据过滤",
"condition_filter": "行级过滤",
"Filter the table based on a single column's values and condition.\nWarning: the party responsible for condition filtering will directly send the sample distribution to other participants.\nMalicious participants can obtain the distribution of characteristics by repeatedly calling with different filtering values.\nAudit the usage of this component carefully.": "根据单个列的值和条件筛选表。\n警告:负责条件过滤的一方将直接将样本分发发送给其他参与者。\n恶意参与者可以通过使用不同的过滤值重复调用来获得特征的分布。\n仔细审核此组件的使用情况。",
"0.0.1": "0.0.1",
"comparator": "比较条件",
"Comparator to use for comparison. Must be one of '==','<','<=','>','>=','IN'": "用于比较的条件。必须是'=='、'<'、'<='、'>'、'>='、'IN'之一",
"value_type": "值类型",
"Type of the value to compare with. Must be one of ['STRING', 'FLOAT']": "要与之进行比较的值的类型。必须是“STRING”、“FLOAT”中的一个",
"bound_value": "条件值",
"Input a str with values separated by ','. List of values to compare with. If comparator is not 'IN', we only support one element in this list.": "输入一个str,其值以“,”分隔。 表示比较的值的列表。如果比较条件不是“IN”,则此列表中应该仅含一个元素。",
"float_epsilon": "浮点数误差值",
"Epsilon value for floating point comparison. WARNING: due to floating point representation in computers, set this number slightly larger if you want filter out the values exactly at desired boundary. for example, abs(1.001 - 1.002) is slightly larger than 0.001, and therefore may not be filter out using == and epsilson = 0.001": "用于浮点比较的Epsilon值。警告:由于计算机中的浮点表示,如果您想在所需的边界处过滤掉值,请将此数字设置得稍大一些。例如,abs(1.001-1.002)略大于0.001,因此可能无法使用==和epsilson=0.001进行过滤",
"in_ds": "输入数据集",
"Input vertical table.": "输入竖排表格。",
"features": "特征",
"Feature(s) to operate on.": "要操作的特征。",
"out_ds": "输出数据集",
"Output vertical table that satisfies the condition.": "输出满足条件的垂直表格。",
"out_ds_else": "输出数据集",
"Output vertical table that does not satisfies the condition.": "输出不满足条件的垂直表格。"
},
"data_filter/feature_filter:0.0.1": {
"data_filter": "数据过滤",
"feature_filter": "列级过滤",
"Drop features from the dataset.": "从数据集中删除特征",
"0.0.1": "0.0.1",
"in_ds": "输入数据集",
"Input vertical table.": "输入联合表",
"drop_features": "删除的特征",
"Features to drop.": "删除的特征",
"out_ds": "输出数据集",
"Output vertical table.": "输出联合表"
},
"data_prep/psi:0.0.2": {
"data_prep": "数据准备",
"psi": "隐私求交",
"PSI between two parties.": "双方之间的PSI",
"0.0.2": "0.0.2",
"protocol": "协议",
"PSI protocol.": "PSI协议",
"disable_alignment": "禁用对齐",
"It true, output is not promised to be aligned. Warning: enable this option may lead to errors in the following components. DO NOT TURN ON if you want to append other components.": "如果打开,双方输出不承诺对齐。警告:启用此选项可能会导致后续组件出现错误。如果要附加其他组件,请不要打开该选项。",
"skip_duplicates_check": "禁用重复值检查",
"If true, the check of duplicated items will be skiped.": "如果为true,将跳过重复值检查",
"check_hash_digest": "检查hash摘要",
"Check if hash digest of keys from parties are equal to determine whether to early-stop.": "检查各方求交键的哈希摘要是否相等,以确定是否提前停止",
"ecdh_curve": "ECDH 曲线类型",
"Curve type for ECDH PSI.": "ECDH PSI的曲线类型。",
"receiver_input": "接收方的输入",
"Individual table for receiver": "接收方的输入",
"key": "主键",
"Column(s) used to join.": "用于求交的列",
"sender_input": "发送方的输入",
"Individual table for sender": "发送方的样本表",
"psi_output": "PSI 输出",
"Output vertical table": "输出联合表"
},
"data_prep/train_test_split:0.0.1": {
"data_prep": "数据准备",
"train_test_split": "随机分割",
"Split datasets into random train and test subsets.\n- Please check: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html": "将数据集拆分为随机的训练子集和测试子集\n- 请检查:https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html",
"0.0.1": "0.0.1",
"train_size": "训练子集大小",
"Proportion of the dataset to include in the train subset. The sum of test_size and train_size should be in the (0, 1] range.": "要包含在训练子集中的数据集的比例。测试子集大小 和 训练子集大小 的总和应在 (0, 1] 范围内。",
"test_size": "测试子集大小",
"Proportion of the dataset to include in the test subset. The sum of test_size and train_size should be in the (0, 1] range.": "要包含在测试子集中的数据集的比例。测试子集大小 和 训练子集大小 的总和应在 (0, 1] 范围内。",
"random_state": "数据打乱的随机种子",
"Specify the random seed of the shuffling.": "指定数据打乱的随机种子",
"shuffle": "数据打乱",
"Whether to shuffle the data before splitting.": "拆分前是否对数据进行数据打乱",
"input_data": "输入数据集",
"Input vertical table.": "输入联合表",
"train": "训练数据子集",
"Output train dataset.": "输出训练数据子集",
"test": "测试数据子集",
"Output test dataset.": "输出测试数据子集"
},
"feature/vert_binning:0.0.2": {
"feature": "特征",
"vert_binning": "常规分箱",
"Generate equal frequency or equal range binning rules for vertical partitioning datasets.": "为垂直分区数据集生成等频或等距分箱规则",
"0.0.2": "0.0.2",
"binning_method": "分箱方式",
"How to bin features with numeric types: \"quantile\"(equal frequency)/\"eq_range\"(equal range)": "如何对特征进行分箱:“quantile”(等频)/“eq_range”(等距)",
"bin_num": "bin_num",
"Max bin counts for one features.": "一个特征的最大分箱数",
"report_rules": "报告规则",
"Whether report binning rules.": "是否有报表装箱规则。",
"input_data": "输入数据集",
"Input vertical table.": "输入垂直表",
"feature_selects": "特征列",
"which features should be binned.": "应对哪些特征进行分箱",
"bin_rule": "分箱规则",
"Output bin rule.": "输出分箱规则",
"report": "报告",
"report rules details if report_rules is true": "如果report_rules为true,则报告规则详细信息"
},
"feature/vert_woe_binning:0.0.2": {
"feature": "特征",
"vert_woe_binning": "WOE分箱",
"Generate Weight of Evidence (WOE) binning rules for vertical partitioning datasets.": "为垂直分割数据集生成 Weight of Evidence (WOE) 分箱规则",
"0.0.2": "0.0.2",
"secure_device_type": "安全设备类型",
"Use SPU(Secure multi-party computation or MPC) or HEU(Homomorphic encryption or HE) to secure bucket summation.": "使用 SPU(安全多方计算, MPC)或 HEU(同态加密, HE)来保护桶求和",
"binning_method": "分箱方式",
"How to bin features with numeric types: \"quantile\"(equal frequency)/\"chimerge\"(ChiMerge from AAAI92-019: https://www.aaai.org/Papers/AAAI/1992/AAAI92-019.pdf)/\"eq_range\"(equal range)": "如何使用数值类型对特征进行分箱:“分位数”(等频)/“chimerge”(来自 AAAI92-019 的 ChiMerge:https://www.aaai.org/Papers/AAAI/1992/AAAI92-019.pdf)/“等宽”(等宽分箱)",
"bin_num": "分箱个数",
"Max bin counts for one features.": "一个特征的最大分箱数",
"positive_label": "正值标签",
"Which value represent positive value in label.": "哪个值表示标签中的正值",
"chimerge_init_bins": "chimerge初始分箱数",
"Max bin counts for initialization binning in ChiMerge.": "在 ChiMerge 中初始化分箱的最大分箱数",
"chimerge_target_bins": "chimerge目标分箱数",
"Stop merging if remaining bin counts is less than or equal to this value.": "在 ChiMerge 中如果剩余箱计数小于或等于此值,则停止合并",
"chimerge_target_pvalue": "chimerge目标 p-value 值",
"Stop merging if biggest pvalue of remaining bins is greater than this value.": "在 ChiMerge 中如果剩余分箱的最大 p-value 大于此值,则停止合并",
"report_rules": "报告规则",
"Whether report binning rules.": "是否有报表装箱规则。",
"input_data": "输入数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"which features should be binned.": "应对哪些特征进行分箱",
"label": "标签",
"Label of input data.": "输入数据的标签",
"bin_rule": "分箱规则",
"Output WOE rule.": "输出 WOE 规则",
"report": "报告",
"report rules details if report_rules is true": "如果report_rules为true,则报告规则详细信息"
},
"io/identity:0.0.1": {
"io": "IO系列",
"identity": "恒等",
"map any input to output": "将任何输入映射到输出",
"0.0.1": "0.0.1",
"input_data": "input_data",
"Input data": "输入数据",
"output_data": "output_data",
"Output data": "输出数据"
},
"io/read_data:0.0.1": {
"io": "io",
"read_data": "读取数据",
"read model or rules from sf cluster": "从sf集群读取模型或规则",
"0.0.1": "0.0.1",
"input_dd": "输入模型或者规则文件",
"Input dist data": "输入模型或者规则文件",
"output_data": "输出数据",
"Output rules or models in DistData.meta": "在DistData.meta中输出规则或模型"
},
"io/write_data:0.0.1": {
"io": "io",
"write_data": "写入数据",
"write model or rules back to sf cluster": "将模型或规则写回sf集群",
"0.0.1": "0.0.1",
"rule or model protobuf by json formate": "json格式的规则或模型protobuf",
"write_data_type": "写入数据类型",
"which rule or model is writing": "正在编写哪个规则或模型",
"input_dd": "输入模型或者规则文件",
"Input dist data. Rule reconstructions may need hidden info in original rule for security considerations.": "输入模型或者规则文件。出于安全考虑,规则重建可能需要原始规则中的隐藏信息。",
"output_model": "输出模型",
"Output rules or models in sf cluster format": "以sf集群格式输出规则或模型"
},
"ml.eval/biclassification_eval:0.0.1": {
"ml.eval": "模型评估",
"biclassification_eval": "二分类评估",
"Statistics evaluation for a bi-classification model on a dataset.\n1. summary_report: SummaryReport\n2. group_reports: List[GroupReport]\n3. eq_frequent_bin_report: List[EqBinReport]\n4. eq_range_bin_report: List[EqBinReport]\n5. head_report: List[PrReport]\nreports for fpr = 0.001, 0.005, 0.01, 0.05, 0.1, 0.2": "数据集上二分类模型的统计评估\n1. summary_report: 总结报告\n2. group_reports: 分组报告\n3. eq_frequent_bin_report: 等频分箱报告\n4. eq_range_bin_report: 等距分箱报告\n5. head_report: \nFPR = 0.001, 0.005, 0.01, 0.05, 0.1, 0.2 的精度报告",
"0.0.1": "0.0.1",
"bucket_size": "分桶数",
"Number of buckets.": "分桶数",
"min_item_cnt_per_bucket": "每个桶的最小项目数",
"Min item cnt per bucket. If any bucket doesn't meet the requirement, error raises. For security reasons, we require this parameter to be at least 5.": "每个桶的最小项目数量;如果任何一个分桶不符合要求,则会引发错误出于安全原因,我们要求此参数至少为 5",
"in_ds": "输入数据表",
"Input table with prediction and label, usually is a result from a prediction component.": "包含预测和标签的输入数据表,通常是预测组件的结果",
"label": "标签",
"The label name to use in the dataset.": "数据集中要使用的标签名称",
"prediction": "预测",
"The prediction result column name to use in the dataset.": "要在数据集中使用的预测结果列名",
"reports": "报告",
"Output report.": "输出报告"
},
"ml.eval/prediction_bias_eval:0.0.1": {
"ml.eval": "模型评估",
"prediction_bias_eval": "预测偏差评估",
"Calculate prediction bias, ie. average of predictions - average of labels.": "计算预测偏差,即 预测平均值 - 标签平均值",
"0.0.1": "0.0.1",
"bucket_num": "分桶数",
"Num of bucket.": "分桶数",
"min_item_cnt_per_bucket": "每个桶的最小项目数",
"Min item cnt per bucket. If any bucket doesn't meet the requirement, error raises. For security reasons, we require this parameter to be at least 2.": "每个桶的最小项目数量;如果任何一个分桶不符合要求,则会引发错误出于安全原因,我们要求此参数至少为 2",
"bucket_method": "分桶方法",
"Bucket method.": "分桶方法",
"in_ds": "输入数据表",
"Input table with prediction and label, usually is a result from a prediction component.": "带有预测和标签的输入数据表,通常是预测组件的结果。",
"label": "标签",
"The label name to use in the dataset.": "要在数据集中使用的标签名称。",
"prediction": "预测",
"The prediction result column name to use in the dataset.": "要在数据集中使用的预测结果列名。",
"result": "结果",
"Output report.": "输出报告"
},
"ml.eval/regression_eval:0.0.1": {
"ml.eval": "评估模型",
"regression_eval": "回归模型评估",
"Statistics evaluation for a regression model on a dataset.\nContained Statistics:\nR2 Score (r2_score): It is a statistical measure that represents the proportion of the variance in the dependent variable that can be predicted from the independent variables. It ranges from 0 to 1, where a higher value indicates a better fit.\nMean Absolute Error (mean_abs_err): It calculates the average absolute difference between the predicted and actual values. It provides a measure of the average magnitude of the errors.\nMean Absolute Percentage Error (mean_abs_percent_err): It calculates the average absolute percentage difference between the predicted and actual values. It measures the average magnitude of the errors in terms of percentages.\nSum of Squared Errors (sum_squared_errors): It calculates the sum of the squared differences between the predicted and actual values. It provides an overall measure of the model's performance.\nMean Squared Error (mean_squared_errors): It calculates the average of the squared differences between the predicted and actual values. It is widely used as a loss function in regression problems.\nRoot Mean Squared Error (root_mean_squared_errors): It is the square root of the mean squared error. It provides a measure of the average magnitude of the errors in the original scale of the target variable.\nMean of True Values (y_true_mean): It calculates the average of the actual values in the target variable. It can be useful for establishing a baseline for the model's performance.\nMean of Predicted Values (y_pred_mean): It calculates the average of the predicted values. It can be compared with the y_true_mean to get an idea of the model's bias.\nResidual Histograms (residual_hists): It represents the distribution of the differences between the predicted and actual values. 
It helps to understand the spread and pattern of the errors.": "在数据集上对回归模型进行统计评估。\n包含的统计信息:\nR2得分(r2_score):它是一种统计度量,表示因变量中可以从自变量中预测的方差比例。它的取值范围从0到1,值越高表示拟合效果越好。\n平均绝对误差(mean_abs_err):它计算预测值和实际值之间的平均绝对差。它提供了误差的平均大小的度量。\n平均绝对百分比误差(mean_abs_percent_err):它计算预测值和实际值之间的平均绝对百分比差。它以百分比的形式衡量误差的平均大小。\n误差平方和(sum_squared_errors):它计算预测值和实际值之间的差值的平方和。它提供了对模型性能的整体衡量。\n均方误差(mean_squared_errors):它计算预测值和实际值之间的差值的平方的平均值。它广泛用作回归问题中的损失函数。\n均方根误差(root_mean_squared_errors):它是均方误差的平方根。它提供了目标变量原始标度中误差的平均幅度的测量。\n真值的平均值(y_true_mean):它计算目标变量中实际值的平均值。它可以用于建立模型性能的基准。\n预测值的平均值(y_pred_mean):它计算预测值的平均值。它可以与y_true_mean进行比较,以了解模型的偏差。\n残差直方图(residual_hists):它表示预测值和实际值之间的差异分布。它有助于理解误差的传播和模式。",
"0.0.1": "版本号:0.0.1",
"bucket_size": "桶大小",
"Number of buckets for residual histogram.": "残差直方图的桶数。",
"in_ds": "输入数据表",
"Input table with prediction and label, usually is a result from a prediction component.": "带有预测和标签的输入数据表,通常是预测组件的结果。",
"label": "标签",
"The label name to use in the dataset.": "要在数据集中使用的标签名称。",
"prediction": "预测",
"The prediction result column name to use in the dataset.": "要在数据集中使用的预测结果列名。",
"reports": "报告",
"Output report.": "输出报告。"
},
"ml.eval/ss_pvalue:0.0.1": {
"ml.eval": "模型评估",
"ss_pvalue": "P-VALUE评估",
"Calculate P-Value for LR model training on vertical partitioning dataset by using secret sharing.\nFor large dataset(large than 10w samples & 200 features),\nrecommend to use [Ring size: 128, Fxp: 40] options for SPU device.": "使用秘密共享计算垂直分区数据集上 LR 模型训练的 P 值\n对于大型数据集(大于10w样本和200个特征),\n建议对 SPU 设备使用 [Ring size: 128, Fxp: 40] 的选项",
"0.0.1": "0.0.1",
"model": "模型",
"Input model.": "输入模型",
"input_data": "输入数据集",
"Input vertical table.": "输入联合表",
"report": "报告",
"Output P-Value report.": "输出P-VALUE结果表"
},
"ml.predict/sgb_predict:0.0.2": {
"ml.predict": "模型预测",
"sgb_predict": "SecureBoost预测",
"Predict using SGB model.": "使用 SGB 模型进行预测",
"0.0.2": "0.0.2",
"receiver": "结果接收方",
"Party of receiver.": "结果接收方",
"pred_name": "预测结果列名",
"Name for prediction column": "预测结果列名",
"save_ids": "保存id列",
"Whether to save ids columns into output prediction table. If true, input feature_dataset must contain id columns, and receiver party must be id owner.": "是否将 id 列保存到输出预测表中;如果为 true,则输入feature_dataset必须包含 id 列,并且接收方必须是 id 所有者",
"save_label": "保存标签列",
"Whether or not to save real label columns into output pred file. If true, input feature_dataset must contain label columns and receiver party must be label owner.": "是否将真实的标签列保存到输出预测文件中;如果为 true,则输入feature_dataset必须包含标签列,并且接收方必须是标签所有者",
"model": "模型",
"feature_dataset": "特征数据集",
"Input vertical table.": "输入联合表",
"saved_features": "保存特征列",
"which features should be saved with prediction result": "哪些特征应该与预测结果一起保存",
"pred": "预测",
"Output prediction.": "输出预测结果表"
},
"ml.predict/slnn_predict:0.0.1": {
"ml.predict": "模型预测",
"slnn_predict": "拆分学习NN预测",
"Predict using the SLNN model.\nThis component is not enabled by default, it requires the use of the full version\nof secretflow image and setting the ENABLE_NN environment variable to true.": "使用拆分学习NN模型进行预测。\n此组件默认情况下未启用,它需要使用完整版本的secretflow镜像,并将ENABLE_NN环境变量设置为true。",
"0.0.1": "0.0.1",
"batch_size": "预测批数据量",
"The number of examples per batch.": "每个批次的数据量",
"receiver": "结果接收方",
"Party of receiver.": "结果接收方",
"pred_name": "预测结果列名",
"Column name for predictions.": "预测结果列名",
"save_ids": "保存id列",
"Whether to save ids columns into output prediction table. If true, input feature_dataset must contain id columns, and receiver party must be id owner.": "是否将 id 列保存到输出预测表中;如果为 true,则输入feature_dataset必须包含 id 列,并且接收方必须是 id 所有者",
"save_label": "保存标签列",
"Whether or not to save real label columns into output pred file. If true, input feature_dataset must contain label columns and receiver party must be label owner.": "是否将真实的标签列保存到输出 pred 文件中;如果为 true,则输入feature_dataset必须包含标签列,并且接收方必须是标签所有者",
"model": "模型",
"Input model.": "输入模型。",
"feature_dataset": "特征数据集",
"Input vertical table.": "输入联合表",
"saved_features": "保存特征列",
"which features should be saved with prediction result": "哪些特征应该与预测结果一起保存",
"pred": "预测",
"Output prediction.": "输出预测结果表"
},
"ml.predict/ss_glm_predict:0.0.1": {
"ml.predict": "模型预测",
"ss_glm_predict": "SSGLM预测",
"Predict using the SSGLM model.": "使用 SSGLM 模型进行预测",
"0.0.1": "0.0.1",
"receiver": "结果接收方",
"Party of receiver.": "结果接收方",
"pred_name": "预测结果列名",
"Column name for predictions.": "预测结果列名",
"save_ids": "保存id列",
"Whether to save ids columns into output prediction table. If true, input feature_dataset must contain id columns, and receiver party must be id owner.": "是否将 id 列保存到输出预测表中;如果为 true,则输入feature_dataset必须包含 id 列,并且接收方必须是 id 所有者",
"save_label": "保存标签列",
"Whether or not to save real label columns into output pred file. If true, input feature_dataset must contain label columns and receiver party must be label owner.": "是否将真实的标签列保存到输出 pred 文件中;如果为 true,则输入feature_dataset必须包含标签列,并且接收方必须是标签所有者",
"model": "模型",
"Input model.": "输入模型",
"feature_dataset": "特征数据集",
"Input vertical table.": "输入联合表",
"saved_features": "保存特征列",
"which features should be saved with prediction result": "哪些特征应该与预测结果一起保存",
"pred": "预测",
"Output prediction.": "输出预测结果表"
},
"ml.predict/ss_sgd_predict:0.0.1": {
"ml.predict": "模型预测",
"ss_sgd_predict": "逻辑回归预测",
"Predict using the SS-SGD model.": "使用 SS-SGD 模型进行预测",
"0.0.1": "0.0.1",
"batch_size": "训练批数据量",
"The number of training examples utilized in one iteration.": "一次迭代中使用的训练示例数",
"receiver": "结果接收方",
"Party of receiver.": "接收结果接收方",
"pred_name": "预测结果列名",
"Column name for predictions.": "预测结果列名",
"save_ids": "保存id列",
"Whether to save ids columns into output prediction table. If true, input feature_dataset must contain id columns, and receiver party must be id owner.": "是否将 id 列保存到输出预测表中;如果为 true,则输入feature_dataset必须包含 id 列,并且接收方必须是 id 所有者",
"save_label": "保存标签列",
"Whether or not to save real label columns into output pred file. If true, input feature_dataset must contain label columns and receiver party must be label owner.": "是否将真实的标签列保存到输出预测文件中;如果为 true,则输入feature_dataset必须包含标签列,并且接收方必须是标签所有者",
"model": "模型",
"Input model.": "输入模型",
"feature_dataset": "特征数据集",
"Input vertical table.": "输入联合表",
"saved_features": "保存特征列",
"which features should be saved with prediction result": "哪些特征应该与预测结果一起保存",
"pred": "预测",
"Output prediction.": "输出预测结果表"
},
"ml.predict/ss_xgb_predict:0.0.1": {
"ml.predict": "模型预测",
"ss_xgb_predict": "SS-XGB预测",
"Predict using the SS-XGB model.": "使用 SS-XGB 模型进行预测",
"0.0.1": "0.0.1",
"receiver": "结果接收方",
"Party of receiver.": "结果接收方",
"pred_name": "预测结果列名",
"Column name for predictions.": "预测结果列名",
"save_ids": "保存id列",
"Whether to save ids columns into output prediction table. If true, input feature_dataset must contain id columns, and receiver party must be id owner.": "是否将 id 列保存到输出预测表中;如果为 true,则输入feature_dataset必须包含 id 列,并且接收方必须是 id 所有者",
"save_label": "保存标签列",
"Whether or not to save real label columns into output pred file. If true, input feature_dataset must contain label columns and receiver party must be label owner.": "是否将真实的标签列保存到输出预测文件中;如果为 true,则输入feature_dataset必须包含标签列,并且接收方必须是标签所有者",
"model": "模型",
"Input model.": "输入模型",
"feature_dataset": "特征数据集",
"Input vertical table.": "输入联合表",
"saved_features": "保存特征列",
"which features should be saved with prediction result": "哪些特征应该与预测结果一起保存",
"pred": "预测",
"Output prediction.": "输出预测结果表"
},
"ml.train/sgb_train:0.0.2": {
"ml.train": "模型训练",
"sgb_train": "SecureBoost训练",
"Provides both classification and regression tree boosting (also known as GBDT, GBM)\nfor vertical split dataset setting by using secure boost.\n- SGB is short for SecureBoost. Compared to its safer counterpart SS-XGB, SecureBoost focused on protecting label holder.\n- Check https://arxiv.org/abs/1901.08755.": "使用Secure Boosting为垂直拆分数据集设置提供分类和回归树Boosting(也称为 GBDT、GBM)SGB是SecureBoost的缩写与更安全的SS-XGB相比, SecureBoost专注于保护标签持有人\n - 详细信息请参阅: https: //arxiv.org/abs/1901.08755",
"0.0.2": "0.0.2",
"num_boost_round": "训练轮数",
"Number of boosting iterations.": "Boosting迭代次数",
"max_depth": "最大深度",
"Maximum depth of a tree.": "树的最大深度",
"learning_rate": "学习率",
"Step size shrinkage used in update to prevent overfitting.": "更新中使用的步长收缩以防止过度拟合",
"objective": "目标",
"Specify the learning objective.": "指定学习目标",
"reg_lambda": "叶子节点权重L2正则项",
"L2 regularization term on weights.": "权重的 L2 正则化项",
"gamma": "最小分裂阈值",
"Greater than 0 means pre-pruning enabled. If gain of a node is less than this value, it would be pruned.": "大于 0 表示已启用预修剪;如果节点的增益小于此值,则将其修剪",
"colsample_by_tree": "每棵树子样本比率",
"Subsample ratio of columns when constructing each tree.": "构造每棵树时列的子样本比率",
"sketch_eps": "分裂系数",
"This roughly translates into O(1 / sketch_eps) number of bins.": "大致相当于 O(1 / 分裂系数) 个箱",
"base_score": "初始预测分数",
"The initial prediction score of all instances, global bias.": "所有实例的初始预测分数,全局偏差",
"seed": "种子",
"Pseudorandom number generator seed.": "伪随机数生成器种子",
"fixed_point_parameter": "HEU定点参数",
"Any floating point number encoded by heu, will multiply a scale and take the round, scale = 2 ** fixed_point_parameter. larger value may mean more numerical accuracy, but too large will lead to overflow problem.": "由heu编码的任何浮点数将乘以一个刻度并四舍五入, 刻度 = 2 ** HEU定点参数; 较大的值可能意味着更高的数值精度, 但太大会导致溢出问题",
"first_tree_with_label_holder_feature": "第一棵树是否使用标签持有人自己的特征",
"Whether to train the first tree with label holder's own features.": "是否使用标签持有人自己的特征训练第一棵树",
"batch_encoding_enabled": "批量编码是否启用",
"If use batch encoding optimization.": "是否使用批量编码优化",
"enable_quantization": "启用量化",
"Whether enable quantization of g and h.": "是否启用g和h的量化",
"quantization_scale": "量化刻度",
"Scale the sum of g to the specified value.": "将g的总和缩放到指定的值",
"max_leaf": "最大叶子",
"Maximum leaf of a tree. Only effective if train leaf wise.": "树的最大叶子;仅在叶子-wise训练时有效",
"rowsample_by_tree": "行采样比率",
"Row sub sample ratio of the training instances.": "训练实例的行子采样比率",
"enable_goss": "启用GOSS",
"Whether to enable GOSS.": "是否启用GOSS",
"top_rate": "顶部比例",
"GOSS-specific parameter. The fraction of large gradients to sample.": "GOSS特定参数;采样的大梯度的比例",
"bottom_rate": "底部比例",
"GOSS-specific parameter. The fraction of small gradients to sample.": "GOSS特定参数;采样的小梯度的比例",
"tree_growing_method": "树生长方法",
"How to grow tree?": "如何生长树",
"enable_monitor": "启用性能监控",
"Whether to enable monitoring performance during training.": "是否在训练期间启用性能监控。",
"enable_early_stop": "启用提前停止",
"Whether to enable early stop during training.": "是否在训练过程中启用提前停止。",
"eval_metric": "评估指标",
"Use what metric for monitoring and early stop? Currently support ['roc_auc', 'rmse', 'mse']": "使用哪种指标进行监控和提前停止?当前支持的指标有['roc_auc', 'rmse', 'mse']。",
"validation_fraction": "验证集比例",
"Early stop specific parameter. Only effective if early stop enabled. The fraction of samples to use as validation set.": "提前停止的特定参数。仅当启用提前停止时有效。用作验证集的样本比例。",
"stopping_rounds": "停止轮次",
"Early stop specific parameter. If more than 'stopping_rounds' consecutive rounds without improvement, training will stop. Only effective if early stop enabled": "提前停止的特定参数。如果连续超过'停止轮次'没有改善,训练将停止。仅当启用提前停止时有效。",
"stopping_tolerance": "停止容忍度",
"Early stop specific parameter. If the metric on the validation set is no longer improving by at least this amount, then consider not improving.": "提前停止的特定参数。如果验证集上的指标改善量不再至少达到该数值,则视为不再进步。",
"save_best_model": "保存最佳模型",
"Whether to save the best model on validation set during training.": "是否在训练期间保存在验证集上的最佳模型。",
"train_dataset": "训练数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"which features should be used for training.": "哪些特征应该用于训练",
"label": "标签",
"Label of train dataset.": "训练数据集的标签",
"output_model": "输出模型",
"Output model.": "输出模型"
},
"ml.train/slnn_train:0.0.1": {
"ml.train": "模型训练",
"slnn_train": "拆分学习NN训练",
"Train nn models for vertical partitioning dataset by split learning.\nThis component is not enabled by default, it requires the use of the full version\nof secretflow image and setting the ENABLE_NN environment variable to true.\nSince it is necessary to define the model structure using python code,\nalthough the range of syntax and APIs that can be used has been restricted,\nthere are still potential security risks. It is recommended to use it in\nconjunction with process sandboxes such as nsjail.": "通过拆分学习训练垂直分割数据集的NN模型。\n此组件默认情况下未启用,它需要使用完整版本的secretflow镜像,并将ENABLE_NN环境变量设置为true。\n由于需要使用python代码来定义模型结构,尽管可以使用的语法和API的范围已经受到限制,但仍然存在潜在的安全风险。建议与进程沙盒(如nsjail)结合使用。",
"0.0.1": "0.0.1",
"models": "模型",
"Define the models for training.": "定义模型结构",
"epochs": "训练轮数",
"The number of complete pass through the training data.": "通过完整训练数据的次数",
"learning_rate": "学习率",
"The step size at each iteration in one iteration.": "一次迭代中每次迭代的步长",
"batch_size": "训练批数据量",
"The number of training examples utilized in one iteration.": "一次迭代中使用的训练样本数",
"validattion_prop": "验证集占比",
"The proportion of validation set to total data set.": "验证集占总数据集的比例",
"loss": "损失函数",
"Loss function.": "损失函数",
"builtin": "内置",
"Builtin loss function.": "内置损失函数",
"custom": "自定义",
"Custom loss function.": "自定义损失函数",
"optimizer": "优化器",
"Optimizer.": "优化器",
"name": "名称",
"Optimizer name.": "优化器名称",
"params": "额外参数",
"Additional optimizer parameters in JSON format.": "JSON格式的其他优化器参数",
"metrics": "指标",
"Metrics.": "指标",
"model_input_scheme": "输入格式",
"Input scheme of base model, tensor: merge all features into one tensor; tensor_dict: each feature as a tensor.": "模型的输入格式,tensor:将所有特征合并为一个 tensor;tensor_dict:每个特征都作为一个 tensor。",
"strategy": "拆分学习策略",
"Split learning strategy.": "拆分学习策略",
"Split learning strategy name.": "拆分学习策略名称",
"Additional strategy parameters in JSON format.": "JSON格式的其他策略参数",
"compressor": "压缩算法",
"Compressor for hiddens and gradients.": "用于压缩隐层和梯度",
"Compressor name.": "压缩算法名称名称。",
"Additional compressor parameters in JSON format.": "JSON格式的其他压缩算法参数",
"train_dataset": "训练数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"which features should be used for training.": "哪些特征应该用于训练",
"label": "标签列",
"Label of train dataset.": "训练数据集的标签",
"output_model": "输出模型",
"Output model.": "输出模型",
"reports": "报告",
"Output report.": "输出报告"
},
"ml.train/ss_glm_train:0.0.2": {
"ml.train": "模型训练",
"ss_glm_train": "SSGLM训练",
"generalized linear model (GLM) is a flexible generalization of ordinary linear regression.\nThe GLM generalizes linear regression by allowing the linear model to be related to the response\nvariable via a link function and by allowing the magnitude of the variance of each measurement to\nbe a function of its predicted value.": "广义线性模型(GLM)是普通线性回归的一种灵活的推广;该模型允许因变量的偏差分布有除了正态分布之外的其它分布;此模型假设实验者所量测的随机变量的分布函数与实验中系统性效应(即非随机的效应)可经由一链接函数(link function)建立可解释其相关性的函数",
"0.0.2": "0.0.2",
"epochs": "训练轮数",
"The number of complete pass through the training data.": "通过完整训练数据的次数",
"learning_rate": "学习率",
"The step size at each iteration in one iteration.": "一次迭代中每次迭代的步长",
"batch_size": "训练批数据量",
"The number of training examples utilized in one iteration.": "一次迭代中使用的训练样本数",
"link_type": "链接函数类型",
"link function type": "链接函数类型",
"label_dist_type": "样本分布类型",
"label distribution type": "样本分布类型",
"tweedie_power": "tweedie_power",
"Tweedie distribution power parameter": "Tweedie分布的power参数",
"dist_scale": "样本分布尺度的猜测值",
"A guess value for distribution's scale": "样本分布尺度的猜测值",
"iter_start_irls": "IRLS初始化轮数",
"run a few rounds of IRLS training as the initialization of w, 0 disable": "运行几轮IRLS训练作为SGD训练的初始化w,0禁用",
"decay_epoch": "decay_epoch",
"decay learning interval": "衰减学习区间",
"decay_rate": "decay_rate",
"decay learning rate": "衰减学习率",
"optimizer": "优化器",
"which optimizer to use: IRLS(Iteratively Reweighted Least Squares) or SGD(Stochastic Gradient Descent)": "使用哪个优化器:IRLS(迭代加权最小二乘法)或SGD(随机梯度下降法)",
"l2_lambda": "l2_lambda",
"L2 regularization term": "L2正则系数",
"infeed_batch_size_limit": "输入批次大小限制",
"size of a single block, default to 10w * 100. increase the size will increase memory cost, but may decrease running time. Suggested to be as large as possible. (too large leads to OOM)": "单个块的大小,默认为10w * 100。增加大小将增加内存成本,但可能会减少运行时间。建议尽可能大。(过大导致内存溢出)",
"fraction_of_validation_set": "验证集占比",
"fraction of training set to be used as the validation set. ineffective for 'weight' stopping_metric": "用于验证集的训练集比例。对于'weight'停止指标无效",
"random_state": "随机状态",
"random state for validation split": "验证集划分的随机状态",
"stopping_metric": "停止指标",
"use what metric as the condition for early stop? Must be one of ['deviance', 'MSE', 'RMSE', 'AUC', 'weight']. only logit link supports AUC metric (note that AUC is very, very expansive in MPC)": "使用哪个指标作为早停的条件?必须是['偏差', '均方误差', '均方根误差', 'AUC', '权重']之一。只有逻辑回归链接支持AUC指标(注意,在多方计算中AUC非常占用资源)",
"stopping_rounds": "停止轮数",
"If the model is not improving for stopping_rounds, the training process will be stopped, for 'weight' stopping metric, stopping_rounds is fixed to be 1": "如果模型在停止轮数设定期间没有提升,则训练过程将被停止。对于'权重'停止指标,停止轮数固定为1",
"stopping_tolerance": "停止容差",
"the model is considered as not improving, if the metric is not improved by tolerance over best metric in history. If metric is 'weight' and tolerance == 0, then early stop is disabled.": "如果模型在度量指标上相对历史最佳指标的改善未达到设定的容忍度,则认为模型未有改进。若度量指标为'weight'并且容忍度为0,则不启用提前停止功能。",
"report_metric": "报告指标",
"Whether to report the value of stopping metric. Only effective if early stop is enabled. If this option is set to true, metric will be revealed and logged.": "是否报告终止指标的值。这仅在启用提前停止时有效。如果此选项被设置为真,则会公开并记录该指标。",
"report_weights": "模型报告",
"If this option is set to true, model will be revealed and model details are visible to all parties": "如果此选项设置为true,模型会被转换到明文,并且模型的详细信息对各方都可见",
"train_dataset": "训练数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"which features should be used for training.": "哪些特征应该用于训练",
"offset": "偏移列",
"Specify a column to use as the offset": "指定要用作偏移量的列",
"weight": "权重列",
"Specify a column to use for the observation weights": "指定用于观测权重的列",
"label": "标签列",
"Label of train dataset.": "训练数据集的标签",
"output_model": "输出模型",
"Output model.": "输出模型",
"report": "报告",
"If report_weights is true, report model details": "如果report_weights为true,则报告模型详细信息"
},
"ml.train/ss_sgd_train:0.0.1": {
"ml.train": "模型训练",
"ss_sgd_train": "逻辑回归训练",
"Train both linear and logistic regression\nlinear models for vertical partitioning dataset with mini batch SGD training solver by using secret sharing.\n- SS-SGD is short for secret sharing SGD training.": "训练线性回归和逻辑回归\n使用秘密共享的具有 mini batch SGD 垂直分区数据集的线性模型\n- SS-SGD是秘密共享SGD训练的缩写",
"0.0.1": "0.0.1",
"epochs": "迭代次数",
"The number of complete pass through the training data.": "通过完整训练数据的次数",
"learning_rate": "学习率",
"The step size at each iteration in one iteration.": "一次迭代中每次迭代的步长",
"batch_size": "训练批数据量",
"The number of training examples utilized in one iteration.": "一次迭代中使用的训练示例数",
"sig_type": "sigmoid 函数拟合方法",
"Sigmoid approximation type.": "sigmoid 函数拟合方法",
"reg_type": "回归类型",
"Regression type": "回归类型",
"penalty": "正则化项类型",
"The penalty(aka regularization term) to be used.": "要使用的penalty(又名正则化项)",
"l2_norm": "L2正则系数",
"L2 regularization term.": "L2正则系数",
"eps": "eps",
"If the change rate of weights is less than this threshold, the model is considered to be converged, and the training stops early. 0 to disable.": "如果权重的变化率小于此阈值,则认为模型已收敛,训练提前停止;0 表示禁用",
"train_dataset": "训练数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"which features should be used for training.": "哪些特征应该用于训练",
"label": "标签",
"Label of train dataset.": "训练数据集的标签",
"output_model": "输出模型",
"Output model.": "输出模型"
},
"ml.train/ss_xgb_train:0.0.1": {
"ml.train": "模型训练",
"ss_xgb_train": "SS-XGB训练",
"This method provides both classification and regression tree boosting (also known as GBDT, GBM)\nfor vertical partitioning dataset setting by using secret sharing.\n- SS-XGB is short for secret sharing XGB.\n- More details: https://arxiv.org/pdf/2005.08479.pdf": "该方法通过使用秘密共享为垂直分割数据集设置提供分类和回归树提升(也称为 GBDT、GBM)\n- SS-XGB是秘密共享XGB的缩写\n- 更多详情: https://arxiv.org/pdf/2005.08479.pdf",
"0.0.1": "0.0.1",
"num_boost_round": "训练轮数",
"Number of boosting iterations.": "Boosting迭代次数",
"max_depth": "最大深度",
"Maximum depth of a tree.": "树的最大深度",
"learning_rate": "学习率",
"Step size shrinkage used in updates to prevent overfitting.": "更新中使用的步长收缩以防止过度拟合",
"objective": "目的",
"Specify the learning objective.": "指定学习目标",
"reg_lambda": "叶子节点权重L2正则项",
"L2 regularization term on weights.": "权重的 L2 正则化项",
"subsample": "子样本比率",
"Subsample ratio of the training instances.": "训练实例的子样本比率",
"colsample_by_tree": "每棵树子样本比率",
"Subsample ratio of columns when constructing each tree.": "构造每棵树时列的子样本比率",
"sketch_eps": "分裂系数",
"This roughly translates into O(1 / sketch_eps) number of bins.": "箱数大致为 O(1 / sketch_eps) 个",
"base_score": "初始预测分数",
"The initial prediction score of all instances, global bias.": "所有实例的初始预测分数,全局偏差",
"seed": "种子",
"Pseudorandom number generator seed.": "伪随机数生成器种子",
"train_dataset": "训练数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"which features should be used for training.": "哪些特征应该用于训练",
"label": "标签",
"Label of train dataset.": "训练数据集的标签",
"output_model": "输出模型",
"Output model.": "输出模型"
},
"model/model_export:0.0.1": {
"model": "模型",
"model_export": "模型导出",
"The model_export component supports converting and packaging the rule files generated by preprocessing and postprocessing components, as well as the model files generated by model operators, into a Secretflow-Serving model package. The list of components to be exported must contain exactly one model train or model predict component, and may include zero or multiple preprocessing and postprocessing components.": "model_export组件支持将预处理和后处理组件生成的规则文件以及模型操作符生成的模型文件转换打包为Secretflow Serving模型包。要导出的组件列表必须仅包含一个模型训练或模型预测组件,并且可能包括零个或多个预处理和后处理组件。",
"0.0.1": "0.0.1",
"model_name": "模型名称",
"model's name": "模型名称",
"model_desc": "模型描述",
"Describe what the model does": "描述模型的作用",
"input_datasets": "输入数据集",
"The input data IDs for all components to be exported. Their order must remain consistent with the sequence in which the components were executed.": "待导出的所有组件的输入数据ID。其顺序必须与执行组件的顺序保持一致。",
"output_datasets": "输出数据集",
"The output data IDs for all components to be exported. Their order must remain consistent with the sequence in which the components were executed.": "待导出的所有组件的输出数据ID。其顺序必须与执行组件的顺序保持一致。",
"component_eval_params": "组件执行参数",
"The eval parameters (in JSON format) for all components to be exported. Their order must remain consistent with the sequence in which the components were executed.": "待导出的所有组件的eval参数(JSON格式)。其顺序必须与执行组件的顺序保持一致。",
"package_output": "打包输出",
"output tar package uri": "输出tar包的uri",
"report": "报告",
"report dumped model's input schemas": "报表转储模型的输入架构"
},
"preprocessing/binary_op:0.0.2": {
"preprocessing": "预处理",
"binary_op": "二元操作",
"Perform binary operation binary_op(f1, f2) and assign the result to f3, f3 can be new or old. Currently f1, f2 and f3 all belong to a single party.": "执行二元操作 binary_op(f1, f2) 并将结果分配给 f3,f3 可以是新的或旧的。目前 f1、f2 和 f3 都属于同一个方。",
"0.0.2": "0.0.2",
"What kind of binary operation we want to do, currently only supports +, -, *, /": "我们想要进行何种二元操作,目前仅支持 +、-、*、/",
"new_feature_name": "新特征名称",
"Name of the newly generated feature.": "新生成特征的名称。",
"as_label": "作为标签",
"If True, the generated feature will be marked as label in schema.": "如果为 True,则生成的特征将在模式中标记为标签。",
"in_ds": "输入数据集",
"Input vertical table.": "输入垂直表格。",
"f1": "特征1",
"Feature 1 to operate on.": "要操作的特征1。",
"f2": "特征2",
"Feature 2 to operate on.": "要操作的特征2。",
"out_ds": "输出数据集",
"Output vertical table.": "输出垂直表格。",
"out_rules": "输出规则",
"feature gen rule": "特征生成规则"
},
"preprocessing/case_when:0.0.1": {
"preprocessing": "预处理",
"case_when": "交叉决策",
"0.0.1": "0.0.1",
"rules": "规则",
"input CaseWhen rules": "输入交叉决策规则",
"input_dataset": "输入数据集",
"Input vertical table.": "输入联合表",
"output_dataset": "输出数据集",
"out_rules": "输出规则",
"case when substitution rule": "预处理替换规则"
},
"preprocessing/feature_calculate:0.0.1": {
"preprocessing": "预处理",
"feature_calculate": "特征计算",
"Generate a new feature by performing calculations on an origin feature": "对原特征进行操作生成新特征",
"0.0.1": "0.0.1",
"rules": "规则",
"input CalculateOpRules rules": "输入特征计算规则",
"in_ds": "输入数据集",
"Input vertical table": "输入联合表",
"features": "特征列",
"Feature(s) to operate on": "要操作的特征列",
"out_ds": "输出数据集",
"output_dataset": "输出数据集",
"out_rules": "输出规则",
"feature calculate rule": "特征计算规则"
},
"preprocessing/fillna:0.0.1": {
"preprocessing": "预处理",
"fillna": "异常值填充",
"0.0.1": "0.0.1",
"strategy": "填充缺失值的方式",
"The imputation strategy. If \"mean\", then replace missing values using the mean along each column. Can only be used with numeric data. If \"median\", then replace missing values using the median along each column. Can only be used with numeric data. If \"most_frequent\", then replace missing using the most frequent value along each column. Can be used with strings or numeric data. If there is more than one such value, only the smallest is returned. If \"constant\", then replace missing values with fill_value. Can be used with strings or numeric data.": "插补策略。如果为“平均值”,则使用每列的平均值替换缺失值。只能与数字数据一起使用。如果为“中值”,则使用每列的中值替换缺失值。只能与数字数据一起使用。如果为“众数”,则使用每列中最频繁的值替换缺失的值。可以与字符串或数字数据一起使用。如果存在多个这样的值,则只返回最小的值。如果为“自定义值”,则用fill_value替换缺失的值。可以与字符串或数字数据一起使用。",
"missing_value_type": "缺失值类型",
"type of missing value. general_na type indicates that only np.nan, None or pandas.NA will be treated as missing values. When the type is not general_na, the type casted missing_value_type(missing_value) will also be treated as missing value as well, in addition to general_na values.": "缺失值的类型。general_na类型表示只有np.nan、None或pandas.NA将被视为缺失值。当类型不是general_na时,除了general_na值之外,类型转换后的missing_value_type(missing_value)也将被视为缺失值",
"missing_value": "缺失值",
"Which value should be treat as missing_value? If missing value type is 'general_na', this field will be ignored, and any np.nan, pd.NA, etc value will be treated as missing value. Otherwise, the type casted missing_value_type(missing_value) will also be treated as missing value as well, in addition to general_na values. In case the cast is not successful, general_na will be used instead. default value is 'custom_missing_value'.": "哪个值应该被视为缺失值?如果缺失值类型是'general_na',则此字段将被忽略,并且任何np.nan、pd.NA等值都将被视为缺失值。否则,除了general_na值之外,类型转换后的missing_value_type(missing_value)也将被视为缺失值。如果转换不成功,则将使用general_na作为缺失值。默认值为'custom_missing_value'。",
"fill_value_int": "整数类型填充值",
"For int type data. If method is 'constant' use this value for filling null.": "对于整数类型的数据,如果方法是“常量”,使用此值来填充null",
"fill_value_float": "浮点类型填充值",
"For float type data. If method is 'constant' use this value for filling null.": "对于浮点类型的数据,如果方法是“常量”,使用此值来填充null",
"fill_value_str": "字符串类型填充值",
"For str type data. If method is 'constant' use this value for filling null.": "对于字符串类型的数据,如果方法是“常量”,使用此值来填充null",
"input_dataset": "输入数据集",
"Input vertical table.": "输入纵向数据表",
"fill_na_features": "填充特征",
"Features to fill.": "要填充的特征。",
"out_ds": "输出数据集",
"Output vertical table.": "输出纵向数据表",
"out_rules": "输出规则",
"fill value rule": "填充值规则"
},
"preprocessing/onehot_encode:0.0.2": {
"preprocessing": "预处理",
"onehot_encode": "onehot_encode",
"0.0.2": "0.0.2",
"drop_first": "丢弃第一个",
"If true drop the first category in each feature. If only one category is present, the feature will be dropped entirely": "不删除占比小于某个频率的枚举值时,如丢弃一个枚举量的编码,可以保证编码后数据的线性无关性;但如只存在一个枚举值,则此列完全删除,下游计算报错",
"min_frequency": "最小频率",
"Specifies the minimum frequency below which a category will be considered infrequent, [0, 1), 0 disable": "指定类别将被视为不频繁的最低频率,[0,1),0禁用",
"report_rules": "报告规则",
"Whether to report rule details": "是否报告规则详细信息",
"input_dataset": "输入数据集",
"Input vertical table.": "输入竖排表格。",
"features": "特征列",
"Features to encode.": "要编码的功能。",
"output_dataset": "输出数据集",
"out_rules": "out_rules",
"onehot rule": "onehot规则",
"report": "报告",
"report rules details if report_rules is true": "如果report_rules为true,则报告规则详细信息"
},
"preprocessing/substitution:0.0.2": {
"preprocessing": "预处理",
"substitution": "特征工程应用",
"unified substitution component": "统一的特征工程规则应用组件(除分箱)",
"0.0.2": "0.0.2",
"input_dataset": "输入数据集",
"Input vertical table.": "输入竖排表格。",
"input_rules": "输入规则",
"Input preprocessing rules": "输入预处理规则",
"output_dataset": "输出数据集"
},
"preprocessing/vert_bin_substitution:0.0.1": {
"preprocessing": "预处理",
"vert_bin_substitution": "分箱转换",
"Substitute datasets' value by bin substitution rules.": "用分箱替换规则替换数据集的值",
"0.0.1": "0.0.1",
"input_data": "输入数据集",
"Vertical partitioning dataset to be substituted.": "要替换的垂直分区数据集",
"bin_rule": "分箱规则",
"Input bin substitution rule.": "输入分箱替换规则",
"output_data": "输出数据表",
"Output vertical table.": "输出垂直表"
},
"stats/groupby_statistics:0.0.3": {
"stats": "统计",
"groupby_statistics": "分组统计",
"Get a groupby of statistics, like pandas groupby statistics.\nCurrently only support VDataframe.": "获取分组统计信息,参考pandas的分组统计。\n目前仅支持 VDataframe。",
"0.0.3": "0.0.3",
"aggregation_config": "聚合配置",
"input groupby aggregation config": "输入聚合配置",
"max_group_size": "最大组数",
"The maximum number of groups allowed": "允许的最大组数",
"input_data": "输入数据",
"Input table.": "输入表",
"by": "特征列",
"by what columns should we group the values": "我们应该按哪些列进行分组",
"report": "报告",
"Output groupby statistics report.": "输出分组统计信息报告"
},
"stats/ss_pearsonr:0.0.1": {
"stats": "统计",
"ss_pearsonr": "相关系数矩阵",
"Calculate Pearson's product-moment correlation coefficient for vertical partitioning dataset\nby using secret sharing.\n- For large dataset(large than 10w samples & 200 features), recommend to use [Ring size: 128, Fxp: 40] options for SPU device.": "通过使用秘密共享计算垂直分区数据集的皮尔逊乘积矩相关系数\n- 对于大型数据集(大于10w样本和200个特征),建议使用SPU设备的[Ring size:128,Fxp:40]选项",
"0.0.1": "0.0.1",
"input_data": "输入数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"Specify which features to calculate correlation coefficient with. If empty, all features will be used": "指定要计算相关系数的特征;如果为空,则将使用所有特征",
"report": "报告",
"Output Pearson's product-moment correlation coefficient report.": "输出相关系数矩阵表"
},
"stats/ss_vif:0.0.1": {
"stats": "统计",
"ss_vif": "VIF指标计算",
"Calculate Variance Inflation Factor(VIF) for vertical partitioning dataset\nby using secret sharing.\n- For large dataset(large than 10w samples & 200 features), recommend to use [Ring size: 128, Fxp: 40] options for SPU device.": "通过使用秘密共享计算垂直分区数据集的方差膨胀因子 (VIF)\n- 对于大型数据集(大于10w样本和200个特征),建议使用SPU设备的[Ring size:128,Fxp:40]选项",
"0.0.1": "0.0.1",
"input_data": "输入数据集",
"Input vertical table.": "输入联合表",
"feature_selects": "特征列",
"Specify which features to calculate VIF with. If empty, all features will be used.": "指定要用于计算 VIF 的特征;如果为空,则将使用所有特征",
"report": "报告",
"Output Variance Inflation Factor(VIF) report.": "输出VIF指标计算结果表"
},
"stats/table_statistics:0.0.2": {
"stats": "统计",
"table_statistics": "全表统计",
"Get a table of statistics,\nincluding each column's\n1. datatype\n2. total_count\n3. count\n4. count_na\n5. na_ratio\n6. min\n7. max\n8. mean\n9. var\n10. std\n11. sem\n12. skewness\n13. kurtosis\n14. q1\n15. q2\n16. q3\n17. moment_2\n18. moment_3\n19. moment_4\n20. central_moment_2\n21. central_moment_3\n22. central_moment_4\n23. sum\n24. sum_2\n25. sum_3\n26. sum_4\n- moment_2 means E[X^2].\n- central_moment_2 means E[(X - mean(X))^2].\n- sum_2 means sum(X^2).": "获取一个统计表,\n包括每列的\n1. datatype(数据类型)\n2. total_count(总数)\n3.count(非nan总数)\n4.count_na(nan总数)\n5.na_ratio(nan比例)\n6.min\n7.max\n8.mean\n9.var\n10.std\n11.sem(standard error of the mean)\n12.skewness(偏度)\n13.kurtosis(峰度)\n14.q1(分位数)\n15.q2\n16.q3\n17.moment_2\n18.moment_3\n19.moment_4\n20.central_ment_2\n21.central_ment_3\n22.central_ment_4\n23.sum\n24.sum_2\n25.sum_3\n26.sum_4\n-moment_2 表示 E[X^2]。\n-central_ment_2 表示 E[(X - mean(X))^2]。\n-sum_2表示 sum(X^2)。",
"0.0.2": "0.0.2",
"input_data": "输入数据",
"Input table.": "输入表",
"features": "特征",
"perform statistics on these columns": "对这些列执行统计",
"report": "报告",
"Output table statistics report.": "输出全表统计结果表"
}
}