<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>Transformer Family on Time series</title>
<link href="/uncategorized/surveys/transformer_in)time_series/"/>
<url>/uncategorized/surveys/transformer_in)time_series/</url>
<content type="html"><![CDATA[<p>当代的时间序列分析任务需要应对各种复杂的数据模式和动态性,传统的统计模型和机器学习方法在处理这些挑战时常常受限。然而,近年来,类Transformer网络在时间序列分析领域取得了显著的突破。Transformer模型的出现为时间序列分析任务提供了一种新的、强大的工具,它能够自适应地捕捉序列中的长期依赖关系,并能够有效地建模非线性和非平稳性特征。</p><p>TODO: 完成重点论文和异常检测论文的精读</p><span id="more"></span><p>在文章中,我们将关注一些近年来在时间序列分析任务上涌现的重要工作,这些工作以类Transformer网络为基础,以其卓越的性能和创新的方法引起了广泛的关注。我们将重点介绍以下几个工作:</p><ol><li>N-BEATS(ICLR 2022)</li><li>LogTrans(NeurIPS 2021)</li><li>Informer ( AAAI 2021 Best Paper)<a href="https://zhuanlan.zhihu.com/p/354916261">📒Informer:用Transformer架构解决LSTF问题 - 知乎</a></li><li>Autoformer(NeuraIPS 2021)<a href="https://zhuanlan.zhihu.com/p/386955393">📒细读好文 之 Autoformer - 知乎</a><ul><li>🌟其中基于Wiener-Khinchin定理的Auto-Correlation Mechanism有点意思,可以单独看看。</li></ul></li><li>FEDformer(ICML 2022)<a href="https://zhuanlan.zhihu.com/p/528131016">📒阿里达摩院最新FEDformer,长程时序预测全面超越SOTA - 知乎</a></li><li>Pyraformer(ICLR 2022)<a href="https://zhuanlan.zhihu.com/p/467765457">📒时间序列预测@Pyraformer - 知乎</a></li><li>Transformer embeddings of irregularly spaced events and their participants(ICLR 2022)</li><li>TranAD(VLDB 2022)</li><li>Probabilistic Transformer For Time Series Analysis(NeurIPS 2021)。</li></ol><p>🌟 此外,我们还将依托论文<a href="https://arxiv.org/abs/2205.13504">Are Transformers Effective for Time Series Forecasting?</a>探讨一个重要的问题,即Transformer网络在时间序列预测中的有效性。</p><p>通过对这些工作的综述和分析,我们将深入了解类Transformer网络在时间序列分析任务中的应用,以及它们的创新之处、优点和局限性。这将有助于我们对该领域的最新研究进展有一个全面的了解,并为未来的研究和应用提供指导和启示。</p><h2 id="Taxonomy-of-Transformers-in-Time-Series"><a href="#Taxonomy-of-Transformers-in-Time-Series" class="headerlink" title="Taxonomy of Transformers in Time Series"></a>Taxonomy of Transformers in Time Series</h2><blockquote><p>Yang, C., Mei, H., & Eisner, J. (2021). Transformer embeddings of irregularly spaced events and their participants. 
<p>🌟 In addition, drawing on the paper <a href="https://arxiv.org/abs/2205.13504">Are Transformers Effective for Time Series Forecasting?</a>, we discuss an important question: how effective Transformer networks actually are for time-series forecasting.</p><p>By reviewing and analyzing these works, we get a closer look at how Transformer-like networks are applied to time-series analysis, what is novel about them, and what their strengths and limitations are. This gives a broad picture of recent progress in the field and offers guidance and inspiration for future research and applications.</p><h2 id="Taxonomy-of-Transformers-in-Time-Series"><a href="#Taxonomy-of-Transformers-in-Time-Series" class="headerlink" title="Taxonomy of Transformers in Time Series"></a>Taxonomy of Transformers in Time Series</h2><blockquote><p>Wen, Q., Zhou, T., Zhang, C., Chen, W., Ma, Z., Yan, J., & Sun, L. (2022). Transformers in time series: A survey. <em>arXiv preprint arXiv:2202.07125</em>.</p></blockquote><p>Following the survey above, the innovations in Transformer-based time-series work fall into two classes: modifications of the network architecture, and adaptations for particular applications. The rest of this post uses that taxonomy to organize how each work improves on the Transformer.</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20230530200210844.png" alt="Taxonomy of Transformers for time series modeling from the perspectives of network modifications and application domains" style="zoom:50%;" /><h2 id="Network-Modifications"><a href="#Network-Modifications" class="headerlink" title="Network Modifications"></a>Network Modifications</h2><h3 id="Positional-Encoding"><a href="#Positional-Encoding" class="headerlink" title="Positional Encoding"></a>Positional Encoding</h3><ol><li>Learnable Positional Encoding<ol><li><a href="https://www.semanticscholar.org/paper/A-Transformer-based-Framework-for-Multivariate-Time-Zerveas-Jayaraman/2051548f7681c96d603de932ee23406c525276f9">THIS WORK</a> introduces an <strong>embedding layer</strong> in Transformer that <strong>learns embedding vectors for each position index</strong> jointly with other model parameters.</li><li><a href="https://www.semanticscholar.org/paper/Temporal-Fusion-Transformers-for-Interpretable-Time-Lim-Arik/6a9d69fb35414b8461573df333dba800f254519f">THIS WORK</a> uses an LSTM network to encode positional embeddings, which can better exploit sequential ordering information in time series.</li></ol></li><li>Timestamp Encoding: encoding <strong>calendar timestamps</strong> (e.g., second, minute, hour, week, month, and year) and <strong>special timestamps</strong> (e.g., holidays and events).<ol><li><strong>Informer / Autoformer / FEDformer</strong> propose to encode <strong>timestamps as additional positional encoding</strong> by using learnable embedding layers.</li></ol></li></ol><h3 id="Attention-Module"><a href="#Attention-Module" class="headerlink" title="Attention Module"></a>Attention Module</h3><p>Work on the attention module mainly aims to reduce the time and memory complexity of self-attention, which is $\mathcal{O}(N^2)$ in the vanilla Transformer. Two main routes (a toy illustration of the first follows this list):</p><ol><li>Introducing a <strong>sparsity bias</strong> into the attention mechanism: LogTrans, Pyraformer</li><li>Exploring the low-rank property of the self-attention matrix to speed up the computation: Informer, FEDformer</li></ol>
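<p>A minimal sketch of a LogTrans-style log-sparse causal mask, assuming the simplest variant (each query attends to itself and to keys at exponentially growing distances, so each row keeps $\mathcal{O}(\log N)$ keys). This is my own toy illustration of the masking pattern only; a real implementation would exploit the sparsity to skip computation rather than build a dense mask:</p><pre><code class="python">import numpy as np

def logsparse_mask(n: int) -> np.ndarray:
    """Boolean causal mask: query i attends to itself and to keys at
    exponentially growing distances i-1, i-2, i-4, i-8, ..."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        mask[i, i] = True
        step = 1
        while i - step >= 0:
            mask[i, i - step] = True
            step *= 2
    return mask

def masked_attention(q, k, v, mask):
    """Plain scaled dot-product attention; disallowed positions are set to
    -inf before the softmax. Only the masking pattern is non-standard."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

n, d = 16, 8
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = masked_attention(q, k, v, logsparse_mask(n))
print(out.shape, int(logsparse_mask(n).sum()), "allowed pairs instead of", n * n)
</code></pre>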
<h3 id="Architecture-based-Attention-Innovation"><a href="#Architecture-based-Attention-Innovation" class="headerlink" title="Architecture-based Attention Innovation"></a>Architecture-based Attention Innovation</h3><p>These works target the <strong>specific properties of time series</strong> directly and modify the overall Transformer architecture.</p><ol><li>Introduce a <strong>hierarchical architecture</strong> into the Transformer to take into account the <strong>multi-resolution aspect</strong> of time series: Informer, Pyraformer<ol><li><strong>Informer</strong>: inserts max-pooling layers with stride 2 between attention blocks, which down-sample the series to half its length (block-wise multi-resolution learning).</li><li><strong>Pyraformer</strong>: designs a <strong>C-ary tree-based</strong> attention mechanism, in which nodes at the finest scale correspond to the original time series, while nodes at the coarser scales represent the series at lower resolutions.<ul><li>Pyraformer develops both <strong>intra-scale</strong> and <strong>inter-scale attentions</strong> in order to better capture temporal dependencies across different resolutions.</li><li>The hierarchical architecture also enjoys the benefits of efficient computation, particularly for long time series.</li></ul></li></ol></li></ol><h2 id="Application-Domains"><a href="#Application-Domains" class="headerlink" title="Application Domains"></a>Application Domains</h2><p>The survey above reviews the work on forecasting, anomaly detection, and classification in detail; here we only collect the parts related to anomaly detection. The main contribution of the Transformer architecture to anomaly detection is still to <strong>improve the ability to model temporal dependency</strong>. Beyond that, common ways of combining models for the anomaly-detection task are:</p><ol><li>Combine Transformer with neural generative models: VAE – <a href="https://www.semanticscholar.org/paper/Variational-Transformer-based-anomaly-detection-for-Wang-Pi/829ff2e0bad77467da0527515e3be91c376738ab">MT-RVAE</a>, <a href="https://www.semanticscholar.org/paper/Unsupervised-Anomaly-Detection-in-Multivariate-Time-Zhang-Xia/07ed7c5bee83b27c8035896e853488e6132cd86c">TransAnomaly</a>; GAN – <a href="https://www.semanticscholar.org/paper/TranAD%3A-Deep-Transformer-Networks-for-Anomaly-in-Tuli-Casale/2d57a3f90adf3fc28f0de61fb4b7b34bccb1b92d">TranAD</a>.</li><li>Combine Transformer with graph-based learning architecture for multivariate time series anomaly detection: <a href="https://www.semanticscholar.org/paper/Learning-Graph-Structures-With-Transformer-for-in-Chen-Chen/95f5870b18d5f894e4f6ec8490d1a39e0963e79e">GTA</a></li><li>Combine Transformer with Gaussian prior-Association: <a href="https://www.semanticscholar.org/paper/Anomaly-Transformer%3A-Time-Series-Anomaly-Detection-Xu-Wu/a46b06a4b8b4deecf96a4e42cd19b4696f999e66">AnomalyTrans</a></li></ol><!-- Plan for these reading notes on recent Transformer-like networks for time-series analysis; works to summarize: N-BEATS: ICLR 2020 / LogTrans: NeurIPS 2019 / Informer: AAAI 2021 (Best Paper) / Autoformer: NeurIPS 2021 / Lite Transformer: ICLR 2020 / FEDformer: ICML 2022 / Pyraformer: ICLR 2022 / Transformer embeddings of irregularly spaced events and their participants: ICLR 2022 / TranAD: VLDB 2022 / Variational transformer-based anomaly detection approach for multivariate time series: Measurement 2022 / Probabilistic Transformer For Time Series Analysis: NeurIPS 2021 (https://openreview.net/forum?id=HfpNVDg3ExA) / Are Transformers Effective for Time Series Forecasting? (https://arxiv.org/abs/2205.13504) -->]]></content>
<tags>
<tag> transformers </tag>
<tag> survey </tag>
<tag> time series </tag>
</tags>
</entry>
<entry>
<title>Causal Inference Meets Large Models</title>
<link href="/uncategorized/notes/causal_meets_LLM/"/>
<url>/uncategorized/notes/causal_meets_LLM/</url>
<content type="html"><![CDATA[<p>本论坛讨论的议题包括:</p><ol><li>大模型能够为因果学习带来什么?因果学习能为大模型带来什么</li><li>因果推断的在业界的应用现状,业界对因果推断的要求是什么,如何评估因果推断方法的效用</li></ol><span id="more"></span><h1 id="论坛:因果推断遇见大模型"><a href="#论坛:因果推断遇见大模型" class="headerlink" title="论坛:因果推断遇见大模型"></a>论坛:因果推断遇见大模型</h1><p>主持人:DataFun 王大川</p><p>主题:因果推断+业界讨论</p><p><strong>分享者</strong></p><ul><li><p>董振华 华为诺亚方舟 技术专家</p></li><li><p>况琨 浙大 计算机学院副教授</p></li><li><p>郑嘉 腾讯OVBU</p></li><li><p>万世想 度小满</p></li></ul><p>可以看看的工作(自己总结的)</p><blockquote><p>🌟大模型在因果推断任务上的<strong>能力边界</strong>:Frohberg, J., & Binder, F. (2021). Crass: A novel data set and benchmark to test counterfactual reasoning of large language models. <em>arXiv preprint arXiv:2112.11941</em>.</p><p>🌟 Causal transformer:Melnychuk, V., Frauen, D., & Feuerriegel, S. (2022, June). Causal transformer for estimating counterfactual outcomes. In <em>International Conference on Machine Learning</em> (pp. 15293-15329). PMLR.</p><p>🌟:IPS方法</p><p>IPS方法是指倾向得分加权法(Inverse Probability of Treatment Weighting, IPTW),是一种处理观测数据中的选择偏倚的方法,目的是通过加权的方式使得处理组和对照组之间的混淆变量平衡,从而估计因果效应[1]。<br>IPS方法的基本思想是利用倾向得分(propensity score)作为每个样本的权重,倾向得分是指样本接受处理的条件概率,可以用逻辑回归或其他机器学习方法来估计。通过倾向得分加权,可以使得处理组和对照组在倾向得分上的分布相同,从而消除混淆变量的影响[2]。<br>IPS方法的优点是简单易用,只需要对样本进行权重调整,不需要匹配或分层。缺点是需要假设所有的混淆变量都已经观测到,并且倾向得分的估计准确无偏。如果倾向得分过大或过小,可能导致权重不稳定或无限大,影响因果效应的估计[3]。</p><p>[1]: 因果推断综述及基础方法介绍(一) - 知乎 <a href="https://zhuanlan.zhihu.com/p/258562953">https://zhuanlan.zhihu.com/p/258562953</a><br>[2]: 因果推断推荐系统工具箱 - Influence Function for Unbiased Recommendation(一) - 简书 <a href="https://www.jianshu.com/p/0b1ef567d592">https://www.jianshu.com/p/0b1ef567d592</a><br>[3]: 【论文笔记】ICML2016 & Cornell | (IPS-MF) Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms - CSDN博客 <a href="https://blog.csdn.net/weixin_45884316/article/details/127267234">https://blog.csdn.net/weixin_45884316/article/details/127267234</a></p><p>🌟:因果和机器学习是一个逆向的手段<br>因果:从因果框架出发,演绎为数据(表现)<br>机器学习:从数据出发,归纳出知识</p></blockquote><h2 id="大模型为因果学习带来了什么"><a href="#大模型为因果学习带来了什么" class="headerlink" title="大模型为因果学习带来了什么"></a>大模型为因果学习带来了什么</h2><p><strong>况琨:</strong></p><ol><li><p>构建benchmark测试大模型在因果推断任务上的<strong>能力边界</strong>,构建评估数据集</p><blockquote><p>【已有工作做这方面的尝试】Frohberg, J., & Binder, F. (2021). Crass: A novel data set and benchmark to test counterfactual reasoning of large language models. 
<h2 id="大模型为因果学习带来了什么"><a href="#大模型为因果学习带来了什么" class="headerlink" title="大模型为因果学习带来了什么"></a>What can large models bring to causal learning?</h2><p><strong>Kuang Kun:</strong></p><ol><li><p>Build benchmarks and evaluation datasets to test the <strong>capability boundary</strong> of large models on causal-inference tasks.</p><blockquote><p>[Existing work already attempts this] Frohberg, J., & Binder, F. (2021). CRASS: A novel data set and benchmark to test counterfactual reasoning of large language models. <em>arXiv preprint arXiv:2112.11941</em>.</p></blockquote></li><li><p>How can we tell whether a large model's causal-reasoning ability is <strong>learned from the data or an emergent capability</strong>?</p></li><li><p>Today's large models are still mostly statistical models, and what they mine is largely association, i.e. associational emergence. The next step for large models may be to <strong>cross the gap from associational emergence to causal emergence</strong>, that is, to mine genuine causal relations.</p></li></ol><p><strong>Zheng Jia:</strong></p><ol><li><p>The large model as a causal-knowledge extractor: an LLM trained on a massive corpus has likely seen unstructured causal knowledge, so we can <strong>use the LLM to generate the counterfactuals that are hard to obtain in causal research</strong>.</p></li><li><p>The large model as knowledge storage: an LLM is itself a user-friendly endpoint, so it can serve as a storage medium for causal knowledge.</p></li><li><p>The large model as a knowledge terminal: its dialogue and generation capabilities provide a user-friendly interface through which humans can access causal knowledge.</p></li></ol><p><strong>Wan Shixiang:</strong></p><ol><li><strong>Starting from business needs in finance</strong>: causal learning in finance is mainly used to "build decision frameworks", i.e. to set policies for credit limits, pricing, benefit allocation, and so on.</li><li>With large models one can:<ol><li>enrich the sources of knowledge, using the unstructured data inside the LLM, or parsing external data that used to be hard to parse, to refine the existing decision framework;</li><li>or even replace the existing causal decision framework.</li></ol></li></ol><p><strong>Dong Zhenhua:</strong></p><ol><li>What a large model learns may be causal knowledge rather than causal-discovery capability:<ol><li>causal knowledge: concrete "because ... therefore ..." facts found in the corpus, or causal chains summarized via Chain-of-Thought;</li><li>causal-discovery capability: unguided, zero-shot causal discovery.</li></ol></li></ol><h2 id="因果学习能为大模型带来什么"><a href="#因果学习能为大模型带来什么" class="headerlink" title="因果学习能为大模型带来什么"></a>What can causal learning bring to large models?</h2><p><strong>Dong Zhenhua (following the "missing causal-discovery capability" argument above):</strong></p><ol><li><p>In LLM training, a random masking strategy can lead the model to absorb causal knowledge; but if a <strong>knowledge graph</strong> guides the masking strategy, training the model to complete causal chains, it can <strong>steer the large model toward acquiring causal-discovery capability</strong>.</p></li><li><p>An LLM will always have knowledge blind spots; knowledge graphs can step in where knowledge is incomplete.</p></li></ol><p><strong>Kuang Kun:</strong></p><ol><li><p>Use large models to bring unstructured, unobserved confounders into the causal-learning process.</p><blockquote><ul><li>An unobserved confounder is a variable that cannot be observed but may affect the relation between the causal variables, producing bias or confusion. For example, when studying whether rising temperature causes higher electricity bills, ignoring an unobserved confounder such as air-conditioner usage may lead to a wrong causal conclusion [1].</li><li>In causal learning, a common way to handle unobserved confounders is latent-variable models, such as structural equation models or Bayesian networks, which model the relations between causal variables and latent variables and estimate the latent distributions and parameters from observational data [2].</li><li>Another approach is counterfactual reasoning, i.e. asking how the observed outcome would change under different interventions. Counterfactual variables can be defined in the latent-variable or potential-outcome framework, and techniques such as matching, propensity scores, and instrumental variables can be used to remove the influence of unobserved confounders [3].</li></ul><p>[1] 大白话谈因果系列文章(一):因果推断简介及论文介绍 - 知乎 <a href="https://zhuanlan.zhihu.com/p/397796913">https://zhuanlan.zhihu.com/p/397796913</a><br>[2] 因果强化学习入门 - 知乎 - 知乎专栏 <a href="https://zhuanlan.zhihu.com/p/363339023">https://zhuanlan.zhihu.com/p/363339023</a><br>[3] 大白话谈因果系列文章(二)因果效应估计及论文介绍 - 知乎 <a href="https://zhuanlan.zhihu.com/p/397974913">https://zhuanlan.zhihu.com/p/397974913</a></p></blockquote></li><li><p>Existing large models inevitably exhibit discrimination (fairness problems), which raises a question: do LLMs actually generalize?</p><ol><li>"Causality" itself, as a highly abstract form of knowledge, generalizes naturally; it may help improve the trustworthiness/generalization of large models.</li></ol></li><li><p>Causality is inherently interpretable.</p></li></ol><p><strong>Wan Shixiang:</strong></p><p>What large models add to causality is mostly on the data side: bringing in unstructured data.</p><p><strong>Zheng Jia:</strong></p><ol><li><p>Causal learning is still largely a graph-based theory, and LLMs are still weak at processing graphs.</p></li><li><p>Symbolizing graphs and then using LLMs to build graph foundation models would strongly promote the development of causal graphs / causal reasoning.</p></li></ol><h2 id="因果推断的应用现状"><a href="#因果推断的应用现状" class="headerlink" title="因果推断的应用现状"></a>The state of causal-inference applications</h2><p><strong>Zheng Jia: causality in user incentives</strong></p><ul><li>Study the effect of treatment combinations on user behavior</li><li>N-core: focuses on crossing treatment dimensions; builds deep models to explore interacting causal relations</li><li>Current industry challenge: sparse user data makes models unstable</li><li>How industry applies causal learning: 1. abstract the business problem; 2. tailor to the business; 3. de-bias the data</li></ul><p><strong>Wan Shixiang (finance):</strong></p><ul><li>Industry status: the difficulty of applying causal methods is positively correlated with the difficulty of acquiring the data</li><li>Industry focus: the interpretability of decisions</li><li>Industry challenge: in causality, randomized experiments are an important source of causal knowledge, but <strong>random data generated by randomized trials (random treatments)</strong> is currently lacking</li><li>How to evaluate business value: offline evaluation (finding a verifiable model of high-value users) and offline simulation (whether the causal model significantly improves on a <u>predictive model</u>)</li></ul><p><strong>Dong Zhenhua (five industry applications of causality):</strong></p><ol><li><p>Causality-based prediction; the main challenge is <strong>debiasing</strong></p><blockquote><p>In causal learning, debiasing means removing or reducing bias, where bias refers to unobserved confounding or selection bias among the causal variables, so that observational data fail to reflect the true causal effect [1].</p><p>There are many debiasing methods: double machine learning or orthogonal machine learning, which use cross-fitting and orthogonalization to remove estimation and model error [2]; counterfactual reasoning, which defines and estimates post-intervention potential outcomes in the latent-variable or potential-outcome framework [3]; and deep causal learning, which uses deep neural networks to extract causal representations, discover causal structure, and infer causal effects [4].</p><p>The goal of debiasing is to make causal learning accurate and reliable, so that valid causal policies and interventions can be inferred from observational data.</p><p>[1] <a href="https://matheusfacure.github.io/python-causality-handbook/22-Debiased-Orthogonal-Machine-Learning.html">https://matheusfacure.github.io/python-causality-handbook/22-Debiased-Orthogonal-Machine-Learning.html</a></p><p>[2] <a href="https://arxiv.org/abs/2211.03374">https://arxiv.org/abs/2211.03374</a></p><p>[3] <a href="https://arxiv.org/abs/1608.00060">https://arxiv.org/abs/1608.00060</a></p><p>[4] <a href="https://dl.acm.org/doi/abs/10.1145/3447548.3467067">https://dl.acm.org/doi/abs/10.1145/3447548.3467067</a></p></blockquote></li><li><p>Business attribution</p><ol><li>Dong Zhenhua's group proposed CDA at AAAI; worth a look.</li><li>Counterfactual sample generation to assist prediction.</li><li>A KDD work analyzing the influence of exogenous data.</li><li>Removing confounder bias.</li></ol></li><li><p>Trustworthiness:</p></li><li><p>Analysis:</p></li><li><p>A causal dataset: a dataset from a recommender system.</p></li></ol><p><strong>Kuang Kun:</strong></p><p>Associations in industry come in three kinds (one kind of spurious association is still rarely discussed):</p><ul><li>causal association</li><li>spurious association caused by confounding bias (the 2021 Nobel Prize in economics on "instrumental variables" also addresses this problem)<ul><li>Challenges: (1) the observed variables are high-dimensional; (2) how to represent confounders and how to construct instrumental variables (an instrumental variable can only be falsified, never verified); (3) treatments are combinatorial and high-dimensional: how to learn their representations and disentangle the effects of crossed treatments</li></ul></li><li>spurious association caused by sample-selection bias (collection bias) <strong>[an open problem]</strong></li></ul><p>Challenges of using causality in industry:</p><ol><li>There are too many causal assumptions, and whether they hold cannot be judged directly, only probed with many tests [when causality fails to work, it is often because the assumptions do not hold].</li><li>How do we verify causality? If a discovered causal relation is obvious (e.g., smoking causes lung cancer), it needs no proof; but if it is not obvious, how do we prove that the causality is valid?</li></ol>]]></content>
<tags>
<tag> causal discovery </tag>
<tag> foundation model </tag>
</tags>
</entry>
<entry>
<title>Hotpaper in 2021-2022 (Anomaly detection / Failure detection)</title>
<link href="/uncategorized/paperlistfile/Hotpaper%20in%202021-2022%20(Anomaly%20detection,%20Failure%20detection)/"/>
<url>/uncategorized/paperlistfile/Hotpaper%20in%202021-2022%20(Anomaly%20detection,%20Failure%20detection)/</url>
<content type="html"><![CDATA[<p>本文整理2021-2022的关注的各大主题的热门论文。记得读后及时做笔记!</p><span id="more"></span><p>❌——与本人研究不太相关</p><p>🌟——需要阅读</p><p>✅——已经阅读</p><table><thead><tr><th align="left">title</th><th>citation</th><th>conf</th><th>year</th><th>need read?</th><th>note</th><th>the point</th></tr></thead><tbody><tr><td align="left">Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning</td><td>76</td><td>TNNLS</td><td>2022</td><td>❌</td><td><a href="https://www.notion.so/Anomaly-Detection-on-Attributed-Networks-via-Contrastive-Self-Supervised-Learning-4c48721dbc7c4783864685a79d5deb5e?pvs=4">link</a></td><td>图神经网络,有监督</td></tr><tr><td align="left">CFLOW-AD: Real-Time Unsupervised Anomaly Detection With Localization via Conditional Normalizing Flows</td><td>71</td><td>WACV</td><td>2022</td><td>🌟</td><td><a href="https://www.notion.so/CFLOW-AD-Real-Time-Unsupervised-Anomaly-Detection-with-Localization-via-Conditional-Normalizing-Flo-37d0f04a817b418eb85ca3ea4f087753?pvs=4">link</a></td><td>无监督、实时、图像、归一化流</td></tr><tr><td align="left">Prior-Based Tensor Approximation for Anomaly Detection in Hyperspectral Imagery</td><td>50</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">A Survey of Single-Scene Video Anomaly Detection</td><td>48</td><td>TPAMI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Towards Total Recall in Industrial Anomaly Detection</td><td>32</td><td>CVPR</td><td>2022</td><td>🌟</td><td><a href="https://www.notion.so/Towards-Total-Recall-in-Industrial-Anomaly-Detection-acfb5bd05a5a4c919a0d2619ab979504?pvs=4">link</a></td><td>图像、p retrain model-based</td></tr><tr><td align="left">Multipixel Anomaly Detection With Unknown Patterns for Hyperspectral Imagery</td><td>30</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection in Quasi-Periodic Time Series Based on Automatic Data Segmentation and Attentional LSTM-CNN</td><td>29</td><td>TKDE</td><td>2022</td><td>✅</td><td><a href="https://www.notion.so/Anomaly-Detection-in-Quasi-Periodic-Time-Series-based-on-Automatic-Data-Segmentation-and-Attentional-4b6a4e9116954a46b9b38e29abd70b88?pvs=4">link</a></td><td>准周期时间序列、分割、有监督</td></tr><tr><td align="left">Developing an Unsupervised Real-Time Anomaly Detection Scheme for Time Series With Multi-Seasonality</td><td>26</td><td>TKDE</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection Based on Zero-Shot Outlier Synthesis and Hierarchical Feature Distillation</td><td>26</td><td>TNNLS</td><td>2022</td><td>✅</td><td><a href="https://www.notion.so/kmdsy/230514-TNNLS-Anomaly-Detection-based-on-Zero-Shot-Outlier-Synthesis-and-Hierarchical-Feature-Dis-da7b078f0d234f899ec19f5018035ff6?pvs=4">link</a></td><td>异常样本的zeroshot合成:在正常分布的边缘采样</td></tr><tr><td align="left">Anomaly Explanation</td><td>24</td><td>IJCAI</td><td>2022</td><td>✅</td><td></td><td>Anomaly Explanation的四种分类方法以及简单比较: explanation by feature importance, feature values, data points comparisons,structure analysis.</td></tr><tr><td align="left">A Deep Multi-View Framework for Anomaly Detection on Attributed Networks</td><td>24</td><td>TKDE</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Feature Encoding With Autoencoders for Weakly Supervised Anomaly Detection</td><td>22</td><td>TNNLS</td><td>2022</td><td>🤔️</td><td></td><td></td></tr><tr><td align="left">MOCCA: Multilayer One-Class Classification for Anomaly 
Detection</td><td>21</td><td>TNNLS</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Towards a Rigorous Evaluation of Time-Series Anomaly Detection</td><td>20</td><td>AAAI</td><td>2022</td><td>✅</td><td><a href="https://www.notion.so/kmdsy/230517-AAAI-Towards-a-Rigorous-Evaluation-of-Time-Series-Anomaly-Detection-ebc26471ba7a4809969a071b91b0f6fa?pvs=4">link</a>,<a href="https://blog.csdn.net/weixin_45385429/article/details/129652748">CSDN</a></td><td>现有的time series anomaly detection的F1评估体系高估了很多方法的性能,导致了不公平比较。本文建议使用PA%K,即标,而是与它们一起使用。<strong>PA%K的思想是仅当某个异常段中正确检测到的异常数量与其长度的比值超过了阈值K时</strong>,声明一个true positive。此外本文还提出了一种<strong>新基线测试方法</strong>。</td></tr><tr><td align="left">Robust Unsupervised Video Anomaly Detection by Multipath Frame Prediction</td><td>19</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">A Synergistic Approach for Graph Anomaly Detection With Pattern Mining and Feature Learning</td><td>19</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Rethinking Video Anomaly Detection - A Continual Learning Approach</td><td>19</td><td>WACV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Neural Contextual Anomaly Detection for Time Series</td><td>16</td><td>IJCAI</td><td>2022</td><td>✅</td><td></td><td>1. 基于窗口的<strong>对比学习</strong>:异常会在嵌入上产生显著的扰动,所以当我们比较两个重叠段的表示时,如果一个包含异常,一个没有,我们期望它们是不同的;2. 利用数据增强方法:outlier exposure、mixup(提高泛化性)</td></tr><tr><td align="left">Anomaly Detection With Bidirectional Consistency in Videos</td><td>16</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Multi-Scale Patch-Based Representation Learning for Image Anomaly Detection and Segmentation</td><td>14</td><td>WACV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Deep Graph-level Anomaly Detection by Glocal Knowledge Distillation</td><td>13</td><td>WSDM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Pixel-Wise Energy-Biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes</td><td>12</td><td>ECCV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Towards Continual Adaptation in Industrial Anomaly Detection</td><td>12</td><td>MM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">FastAno: Fast Anomaly Detection via Spatio-Temporal Patch Transformation</td><td>12</td><td>WACV</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Catching Both Gray and Black Swans: Open-Set Supervised Anomaly Detection</td><td>11</td><td>CVPR</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection</td><td>10</td><td>CVPR</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">A Framework for Anomaly Detection in Time-Driven and Event-Driven Processes Using Kernel Traces</td><td>10</td><td>TKDE</td><td>2022</td><td>❌</td><td><a href="https://www.notion.so/kmdsy/230418-TKDE-A-Framework-for-Anomaly-Detection-in-Time-Driven-and-Event-Driven-Processes-Using-Ke-188bf6b6b42c4c4e89e5f4a9e71f77fd?pvs=4">link</a></td><td>进程视角下的异常检测</td></tr><tr><td align="left">Beyond Dents and Scratches: Logical Constraints in Unsupervised Anomaly Detection and Localization</td><td>9</td><td>IJCV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Cross-Domain Graph Anomaly Detection</td><td>9</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td 
align="left">Weakly Supervised Discriminative Learning With Spectral Constrained Generative Adversarial Network for Hyperspectral Anomaly Detection</td><td>9</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Focus Your Distribution: Coarse-to-Fine Non-Contrastive Learning for Anomaly Detection and Localization</td><td>8</td><td>ICME</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Latent Outlier Exposure for Anomaly Detection with Contaminated Data</td><td>8</td><td>ICML</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Automated Anomaly Detection via Curiosity-Guided Search and Self-Imitation Learning</td><td>8</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Graph Regularized Autoencoder and its Application in Unsupervised Anomaly Detection</td><td>8</td><td>TPAMI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Task-Specific Representation for Video Anomaly Detection with Spatial-Temporal Attention</td><td>7</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Neighborhood Structure Assisted Non-negative Matrix Factorization and Its Application in Unsupervised Point-wise Anomaly Detection</td><td>7</td><td>JMLR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">An Evaluation of Anomaly Detection and Diagnosis in Multivariate Time Series</td><td>7</td><td>TNNLS</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Discrete Neural Representations for Explainable Anomaly Detection</td><td>7</td><td>WACV</td><td>2022</td><td>✅</td><td><a href="https://www.notion.so/kmdsy/230418-WACV22-Discrete-neural-representations-for-explainable-anomaly-detection-bab1a2e48b3249eba245391305cc8c61?pvs=4">link</a></td><td>两段式的异常解释方法:异常检测与定位、基于目标识别的异常解释</td></tr><tr><td align="left">Comprehensive Regularization in a Bi-directional Predictive Network for Video Anomaly Detection</td><td>6</td><td>AAAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Deep Generative model with Hierarchical Latent Factors for Time Series Anomaly Detection</td><td>6</td><td>AISTATS</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Natural Synthetic Anomalies for Self-supervised Anomaly Detection and Localization</td><td>6</td><td>ECCV</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Diffusion Models for Medical Anomaly Detection</td><td>6</td><td>MICCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Local Anomaly Detection for Multivariate Time Series by Temporal Dependency Based on Poisson Model</td><td>6</td><td>TNNLS</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Multi-Branch Neural Networks for Video Anomaly Detection in Adverse Lighting and Weather Conditions</td><td>6</td><td>WACV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">A Causal Inference Look at Unsupervised Video Anomaly Detection</td><td>5</td><td>AAAI</td><td>2022</td><td>✅</td><td><a href="https://www.notion.so/kmdsy/230418-AAAI22-A-Causal-Inference-Look-at-Unsupervised-Video-Anomaly-Detection-30985995c4424858850f990059f61b46?pvs=4">link</a></td><td>从因果推理的角度分析injected label的生成过程,并去除上述伪标签的影响</td></tr><tr><td align="left">DenseHybrid: Hybrid Anomaly Detection for Dense Open-Set Recognition</td><td>5</td><td>ECCV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Deep Variational Graph Convolutional Recurrent 
Network for Multivariate Time Series Anomaly Detection</td><td>5</td><td>ICML</td><td>2022</td><td>🌟</td><td></td><td>和后面的课题有关</td></tr><tr><td align="left"><strong>Matrix Profile XXIV</strong>: Scaling Time Series Anomaly Detection to Trillions of Datapoints and Ultra-fast Arriving Data Streams</td><td>5</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Memory-Augmented Generative Adversarial Networks for Anomaly Detection</td><td>5</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Self-Training Multi-Sequence Learning with Transformer for Weakly Supervised Video Anomaly Detection</td><td>4</td><td>AAAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">AnomalyKiTS: Anomaly Detection Toolkit for Time Series</td><td>4</td><td>AAAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">DSR - A Dual Subspace Re-Projection Network for Surface Anomaly Detection</td><td>4</td><td>ECCV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Raising the Bar in Graph-level Anomaly Detection</td><td>4</td><td>IJCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Experimental Comparison and Survey of Twelve Time Series Anomaly Detection Algorithms (Extended Abstract)</td><td>4</td><td>IJCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Local Evaluation of Time Series Anomaly Detection Algorithms</td><td>4</td><td>KDD</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">AntiBenford Subgraphs: Unsupervised Anomaly Detection in Financial Networks</td><td>4</td><td>KDD</td><td>2022</td><td>🌟</td><td></td><td>金融领域的异常检测,看看</td></tr><tr><td align="left">Toolkit for Time Series Anomaly Detection</td><td>4</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models</td><td>4</td><td>MICCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Memorizing Structure-Texture Correspondence for Image Anomaly Detection</td><td>4</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Unsupervised Anomaly Detection by Robust Density Estimation</td><td>3</td><td>AAAI</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Transferring the Contamination Factor between Anomaly Detection Domains by Shape Similarity</td><td>3</td><td>AAAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Generative Cooperative Learning for Unsupervised Video Anomaly Detection</td><td>3</td><td>CVPR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Video Anomaly Detection via Prediction Network with Enhanced Spatio-Temporal Memory Exchange</td><td>3</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Appearance-Motion Normality for Video Anomaly Detection</td><td>3</td><td>ICME</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection by Leveraging Incomplete Anomalous Knowledge with Anomaly-Aware Bidirectional GANs</td><td>3</td><td>IJCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Reconstruction Enhanced Multi-View Contrastive Learning for Anomaly Detection on Attributed Networks</td><td>3</td><td>IJCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">GRELEN: Multivariate Time Series Anomaly Detection from the Perspective of Graph Relational 
Learning</td><td>3</td><td>IJCAI</td><td>2022</td><td>🌟</td><td></td><td>可能和后面的课题有关</td></tr><tr><td align="left">SmithNet: Strictness on Motion-Texture Coherence for Anomaly Detection</td><td>3</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Comparison of Anomaly Detectors: Context Matters</td><td>3</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Self-Supervised Acoustic Anomaly Detection Via Contrastive Learning</td><td>2</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">CADET: Calibrated Anomaly Detection for Mitigating Hardness Bias</td><td>2</td><td>IJCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">HashNWalk: Hash and Random Walk Based Anomaly Detection in Hyperedge Streams</td><td>2</td><td>IJCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">SoftPatch: Unsupervised Anomaly Detection with Noisy Data</td><td>2</td><td>NIPS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">ComGA: Community-Aware Attributed Graph Anomaly Detection</td><td>2</td><td>WSDM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">TFAD: A Decomposition Time Series Anomaly Detection Architecture with Time-Frequency Analysis</td><td>1</td><td>CIKM</td><td>2022</td><td>🌟</td><td></td><td>可解释性方面看一下</td></tr><tr><td align="left">Self-supervision Meets Adversarial Perturbation: A Novel Framework for Anomaly Detection</td><td>1</td><td>CIKM</td><td>2022</td><td>🌟</td><td></td><td>鲁棒异常检测?</td></tr><tr><td align="left">UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection</td><td>1</td><td>CVPR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Detecting Anomaly in Chemical Sensors via Regularized Contrastive Learning</td><td>1</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Unsupervised Anomaly Detection for Container Cloud Via BILSTM-Based Variational Auto-Encoder</td><td>1</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Multiple Temporal Context Embedding Networks for Unsupervised time Series Anomaly Detection</td><td>1</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Improving Anomaly Detection with a Self-Supervised Task Based on Generative Adversarial Network</td><td>1</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Controlled Sensing and Anomaly Detection Via Soft Actor-Critic Reinforcement Learning</td><td>1</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Object-Guided and Motion-Refined Attention Network for Video Anomaly Detection</td><td>1</td><td>ICME</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Multi-Scale Continuity-Aware Refinement Network for Weakly Supervised Video Anomaly Detection</td><td>1</td><td>ICME</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Rethinking Graph Neural Networks for Anomaly Detection</td><td>1</td><td>ICML</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Constrained Adaptive Projection with Pretrained Features for Anomaly Detection</td><td>1</td><td>IJCAI</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">PAC-Wrap: Semi-Supervised PAC Anomaly Detection</td><td>1</td><td>KDD</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">RCAD: Real-time Collaborative Anomaly 
Detection System for Mobile Broadband Networks</td><td>1</td><td>KDD</td><td>2022</td><td>🌟</td><td></td><td>移动宽带异常检测,可能和后面的课题有关</td></tr><tr><td align="left">Dynamic Network Anomaly Modeling of Cell-Phone Call Detail Records for Infectious Disease Surveillance</td><td>1</td><td>KDD</td><td>2022</td><td>🌟</td><td></td><td>后面的课题相关</td></tr><tr><td align="left">Dual-Distribution Discrepancy for Anomaly Detection in Chest X-Rays</td><td>1</td><td>MICCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Pixel-Level Anomaly Detection via Uncertainty-aware Prototypical Transformer</td><td>1</td><td>MM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Hierarchical Scene Normality-Binding Modeling for Anomaly Detection in Surveillance Videos</td><td>1</td><td>MM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Graph Convolutional Adversarial Networks for Spatiotemporal Anomaly Detection</td><td>1</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Semisupervised Training of Deep Generative Models for High-Dimensional Anomaly Detection</td><td>1</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Attract-Repel Encoder: Learning Anomaly Representation Away From Landmarks</td><td>1</td><td>TNNLS</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">Center-Aware Adversarial Autoencoder for Anomaly Detection</td><td>1</td><td>TNNLS</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">A Semi-Supervised VAE Based Active Anomaly Detection Framework in Multivariate Time Series for Online Systems</td><td>1</td><td>WWW</td><td>2022</td><td>🌟</td><td></td><td></td></tr><tr><td align="left">CCA: An ML Pipeline for Cloud Anomaly Troubleshooting</td><td>0</td><td>AAAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Hypersphere for Few-shot Anomaly Detection on Attributed Networks</td><td>0</td><td>CIKM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Towards an Awareness of Time Series Anomaly Detection Models’ Adversarial Vulnerability</td><td>0</td><td>CIKM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Bayesian Nonparametric Submodular Video Partition for Robust Anomaly Detection</td><td>0</td><td>CVPR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Second Order Local Anomaly for General Face Forgery Detection</td><td>0</td><td>CVPR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Hierarchical Semi-supervised Contrastive Learning for Contamination-Resistant Anomaly Detection</td><td>0</td><td>ECCV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Approaches Toward Physical and General Video Anomaly Detection</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Dictionary Learning with Uniform Sparse Representations for Anomaly Detection</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">An Anomaly Detection Method Based on Self-Supervised Learning with Soft Label Assignment for Defect Visual Inspection</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Contrastive Predictive Coding for Anomaly Detection of Fetal Health from the Cardiotocogram</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Stgat-Mad : Spatial-Temporal Graph Attention 
Network For Multivariate Time Series Anomaly Detection</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">A Closer Look at Autoencoders for Unsupervised Anomaly Detection</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Integration of Anomaly Machine Sound Detection into Active Noise Control to Shape the Residual Sound</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Just Noticeable Learning for Unsupervised Anomaly Localization and Detection</td><td>0</td><td>ICME</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Domain-Generalized Textured Surface Anomaly Detection</td><td>0</td><td>ICME</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Locality-Aware Attention Network with Discriminative Dynamics Learning for Weakly Supervised Anomaly Detection</td><td>0</td><td>ICME</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Understanding and Mitigating Data Contamination in Deep Anomaly Detection: A Kernel-based Approach</td><td>0</td><td>IJCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Framing Algorithmic Recourse for Anomaly Detection</td><td>0</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Subset Node Anomaly Tracking over Large Dynamic Graphs</td><td>0</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Adaptive Model Pooling for Online Deep Anomaly Detection from a Complex Evolving Data Stream</td><td>0</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Augmenting Log-based Anomaly Detection Models to Reduce False Anomalies with Human Feedback</td><td>0</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection for Spatiotemporal Data in Action</td><td>0</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">A Multi-task Network with Weight Decay Skip Connection Training for Anomaly Detection in Retinal Fundus Images</td><td>0</td><td>MICCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Task-Oriented Self-supervised Learning for Anomaly Detection in Electroencephalography</td><td>0</td><td>MICCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly-Aware Multiple Instance Learning for Rare Anemia Disorder Classification</td><td>0</td><td>MICCAI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Evidential Reasoning for Video Anomaly Detection</td><td>0</td><td>MM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Purifier: Plug-and-play Backdoor Mitigation for Pre-trained Models Via Anomaly Activation Suppression</td><td>0</td><td>MM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Warning: Learning and Memorizing Future Semantic Patterns for Unsupervised Ex-ante Potential Anomaly Prediction</td><td>0</td><td>MM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Few-Shot Fast-Adaptive Anomaly Detection</td><td>0</td><td>NIPS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Perturbation Learning Based Anomaly Detection</td><td>0</td><td>NIPS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">GraphAD: A Graph Neural Network for Entity-Wise Multivariate Time-Series Anomaly 
Detection</td><td>0</td><td>SIGIR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Leveraging World Events to Predict E-Commerce Consumer Demand under Anomaly</td><td>0</td><td>WSDM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">MemStream: Memory-Based Streaming Anomaly Detection</td><td>0</td><td>WWW</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Towards Open Set Video Anomaly Detection</td><td>-1</td><td>ECCV</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection for Tabular Data with Internal Contrastive Learning</td><td>-1</td><td>ICLR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Graph-Augmented Normalizing Flows for Anomaly Detection of Multiple Time Series</td><td>-1</td><td>ICLR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy</td><td>-1</td><td>ICLR</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Sparse Latent Graph Representations for Anomaly Detection in Multivariate Time Series</td><td>-1</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">CAT: Beyond Efficient Transformer for Content-Aware Anomaly Detection in Event Sequences</td><td>-1</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">ANDEA: Anomaly and Novelty Detection, Explanation, and Accommodation</td><td>-1</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Dual-discriminative Graph Neural Network for Imbalanced Graph-level Anomaly Detection</td><td>-1</td><td>NIPS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Unsupervised Contextual Anomaly Detection for Database Systems</td><td>-1</td><td>SIGMOD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">Editorial Deep Learning for Anomaly Detection</td><td>-1</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td align="left">CutPaste: Self-Supervised Learning for Anomaly Detection and Localization</td><td>228</td><td>CVPR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection in Video via Self-Supervised and Multi-Task Learning</td><td>116</td><td>CVPR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection</td><td>112</td><td>IJCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Video Anomaly Detection with Sparse Coding Inspired Deep Neural Networks</td><td>107</td><td>TPAMI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation</td><td>98</td><td>CVPR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">DRAEM - A Discriminatively Trained Reconstruction Embedding for Surface Anomaly Detection</td><td>84</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection of Time Series With Smoothness-Inducing Sequential Variational Auto-Encoder</td><td>69</td><td>TNNLS</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Graph Neural Network-Based Anomaly Detection in Multivariate Time Series</td><td>66</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">MIST: Multiple Instance Self-Training Framework for Video 
Anomaly Detection</td><td>56</td><td>CVPR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Multiresolution Knowledge Distillation for Anomaly Detection</td><td>49</td><td>CVPR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Pixel-Wise Anomaly Detection in Complex Driving Scenes</td><td>47</td><td>CVPR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Weakly-Supervised Video Anomaly Detection With Robust Temporal Feature Magnitude Learning</td><td>40</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Driver Anomaly Detection: A Dataset and Contrastive Learning Approach</td><td>40</td><td>WACV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Real-Time Nonparametric Anomaly Detection in High-Dimensional Settings</td><td>39</td><td>TPAMI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">A Hybrid Video Anomaly Detection Framework via Memory-Augmented Flow Reconstruction and Flow-Guided Frame Prediction</td><td>38</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data</td><td>35</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Structural Temporal Graph Neural Networks for Anomaly Detection in Dynamic Graphs</td><td>34</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Neural Transformation Learning for Deep Anomaly Detection Beyond Images</td><td>31</td><td>ICML</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Constrained Contrastive Distribution Learning for Unsupervised Anomaly Detection and Localisation in Medical Images</td><td>30</td><td>MICCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Appearance-Motion Memory Consistency Network for Video Anomaly Detection</td><td>28</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Decoupling Representation Learning and Classification for GNN-based Anomaly Detection</td><td>28</td><td>SIGIR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Glancing at the Patch: Anomaly Localization With Global and Local Feature Comparison</td><td>27</td><td>CVPR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Few-shot Network Anomaly Detection via Cross-network Meta-learning</td><td>26</td><td>WWW</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Causal Temporal Relation and Feature Discrimination for Anomaly Detection</td><td>25</td><td>TIP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">MStream: Fast Anomaly Detection in Multi-Aspect Streams</td><td>25</td><td>WWW</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">G2D: Generate to Detect Anomaly</td><td>22</td><td>WACV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">F-FADE: Frequency Factorization for Anomaly Detection in Edge Streams</td><td>22</td><td>WSDM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Semantic Context from Normal Samples for Unsupervised Anomaly Detection</td><td>21</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Convolutional Transformer based Dual Discriminator Generative Adversarial Networks for Video Anomaly 
Detection</td><td>21</td><td>MM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection</td><td>20</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">FluxEV: A Fast and Effective Unsupervised Framework for Time-Series Anomaly Detection</td><td>20</td><td>WSDM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Deep Learning for Anomaly Detection: Challenges, Methods, and Opportunities</td><td>20</td><td>WSDM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Divide-and-Assemble: Learning Block-Wise Memory for Unsupervised Anomaly Detection</td><td>19</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">ANEMONE: Graph Anomaly Detection with Multi-Scale Contrastive Learning</td><td>18</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Multi-Scale One-Class Recurrent Neural Networks for Discrete Event Sequence Anomaly Detection</td><td>18</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection in Time Series: A Comprehensive Evaluation</td><td>18</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Multivariate Time Series Anomaly Detection and Interpretation using Hierarchical Inter-Metric and Temporal Embedding</td><td>17</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Deep Unsupervised Anomaly Detection</td><td>17</td><td>WACV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">SDFVAE: Static and Dynamic Factorized VAE for Anomaly Detection of Multivariate CDN KPIs</td><td>17</td><td>WWW</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">GAN Ensemble for Anomaly Detection</td><td>16</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Dance With Self-Attention: A New Look of Conditional Random Fields on Anomaly Detection in Videos</td><td>16</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">TSB-UAD: An End-to-End Benchmark Suite for Univariate Time-Series Anomaly Detection</td><td>15</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Time Series Anomaly Detection with Multiresolution Ensemble Decoding</td><td>14</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video</td><td>14</td><td>IJCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Masked Contrastive Learning for Anomaly Detection</td><td>14</td><td>IJCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Transfer-Based Semantic Anomaly Detection</td><td>13</td><td>ICML</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Understanding the Effect of Bias in Deep Anomaly Detection</td><td>13</td><td>IJCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">LREN: Low-Rank Embedded Network for Sample-Free Hyperspectral Anomaly Detection</td><td>12</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Road Anomaly Detection by Partial Image Reconstruction With Segmentation Coupling</td><td>11</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Time Series Anomaly Detection for Cyber-physical Systems via Neural System Identification and 
Bayesian Filtering</td><td>11</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Toward Explainable Deep Anomaly Detection</td><td>11</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Elsa: Energy-based Learning for Semi-supervised Anomaly Detection</td><td>10</td><td>BMVC</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Practical Approach to Asynchronous Multivariate Time Series Anomaly Detection and Localization</td><td>10</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Online Anomaly Detection With Bandwidth Optimized Hierarchical Kernel Density Estimators</td><td>10</td><td>TNNLS</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">ASC-Net: Adversarial-Based Selective Network for Unsupervised Anomaly Segmentation</td><td>9</td><td>MICCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Unsupervised Cross-system Log Anomaly Detection via Domain Adaptation</td><td>8</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Online false discovery rate control for anomaly detection in time series</td><td>8</td><td>NIPS</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Action Sequence Augmentation for Early Graph-based Anomaly Detection</td><td>7</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Implicit Field Learning for Unsupervised Anomaly Detection in Medical Images</td><td>7</td><td>MICCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Regularizing Attention Networks for Anomaly Detection in Visual Question Answering</td><td>6</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Mining: Past, Present and Future</td><td>6</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Subtractive Aggregation for Attributed Network Anomaly Detection</td><td>6</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">An Accuracy Network Anomaly Detection Method Based on Ensemble Model</td><td>6</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Mining - Past, Present and Future</td><td>6</td><td>IJCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">(1 + epsilon)-class Classification: an Anomaly Detection Method for Highly Imbalanced or Incomplete Data Sets</td><td>6</td><td>JMLR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">ELITE: Robust Deep Anomaly Detection with Meta Gradient</td><td>6</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Robust Graph Autoencoder for Hyperspectral Anomaly Detection</td><td>5</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Real-Time Synchronization in Neural Networks for Multivariate Time Series Anomaly Detection</td><td>5</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection</td><td>5</td><td>IJCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Attribution with Likelihood Compensation</td><td>4</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Student-Teacher Feature Pyramid Matching for Anomaly Detection</td><td>4</td><td>BMVC</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Towards 
Anomaly-resistant Graph Neural Networks via Reinforcement Learning</td><td>4</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Voting-Based Ensemble Model for Network Anomaly Detection</td><td>4</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">A Semantic-Enhanced Method Based On Deep SVDD for Pixel-Wise Anomaly Detection</td><td>4</td><td>ICME</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data</td><td>4</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Volume Under the Surface: A New Accuracy Evaluation Measure for Time-Series Anomaly Detection</td><td>4</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Fourier Transformation Autoencoders for Anomaly Detection</td><td>3</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">STEP-GAN: A One-Class Anomaly Detection Model with Applications to Power System Security</td><td>3</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Discriminative Features for Semi-Supervised Anomaly Detection</td><td>3</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Towards Parkinson’s Disease Prognosis Using Self-Supervised Learning and Anomaly Detection</td><td>3</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Weakly Supervised Temporal Anomaly Segmentation With Dynamic Time Warping</td><td>3</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Learning Unsupervised Metaformer for Anomaly Detection</td><td>3</td><td>ICCV</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly Detection with Prototype-Guided Discriminative Latent Embeddings</td><td>3</td><td>ICDM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise</td><td>3</td><td>ICML</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Multiclass Anomaly Detector: the CS++ Support Vector Machine</td><td>3</td><td>JMLR</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">SVD-GAN for Real-Time Unsupervised Video Anomaly Detection</td><td>2</td><td>BMVC</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Looking at the whole picture: constrained unsupervised anomaly segmentation</td><td>2</td><td>BMVC</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Low-Rank on Graphs Plus Temporally Smooth Sparse Decomposition for Anomaly Detection in Spatiotemporal Data</td><td>2</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Hybrid Model for Network Anomaly Detection with Gradient Boosting Decision Trees and Tabtransformer</td><td>2</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Situational Anomaly Detection in Multimedia Data under Concept Drift</td><td>2</td><td>MM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">ESAD: End-to-end Semi-supervised Anomaly Detection</td><td>1</td><td>BMVC</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">TADPOLE: Task ADapted Pre-Training via AnOmaLy DEtection</td><td>1</td><td>EMNLP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Cross-Scene Person 
Trajectory Anomaly Detection Based on Re-Identification</td><td>1</td><td>ICME</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Anomaly and Novelty Detection, Explanation, and Accommodation (ANDEA)</td><td>1</td><td>KDD</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Airway Anomaly Detection by Prototype-Based Graph Neural Network</td><td>1</td><td>MICCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">A New Distributional Treatment for Time Series and An Anomaly Detection Investigation</td><td>1</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">TimeEval: A Benchmarking Toolkit for Time Series Anomaly Detection Algorithms</td><td>1</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Theseus: Navigating the Labyrinth of Time-Series Anomaly Detection</td><td>1</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Sequential Adversarial Anomaly Detection with Deep Fourier Kernel</td><td>0</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">Adversarial Regularized Reconstruction for Anomaly Detection and Generation</td><td>0</td><td>ICDM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">A Demonstration of AutoOD: A Self-tuning Anomaly Detection System</td><td>0</td><td>VLDB</td><td>2021</td><td></td><td></td><td></td></tr><tr><td align="left">USAD: UnSupervised Anomaly Detection on Multivariate Time Series</td><td>231</td><td>KDD</td><td>2020</td><td>✅</td><td><a href="https://www.notion.so/kmdsy/KDD20-UnSupervised-Anomaly-Detection-on-Multivariate-Time-Series-USAD-USAD-a2d01dd33bff4b46a782bdefb1faec59?pvs=4">link</a></td><td>autoencoder + adversarial loss</td></tr><tr><td align="left">Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection</td><td>763</td><td>Machine Learning</td><td>2018</td><td>✅</td><td><a href="https://www.notion.so/Kitsune-An-Ensemble-of-Autoencoders-for-Online-Network-Intrusion-Detection-Kitsune-f63a08f6654b4d5c8d1a93f5291f6522?pvs=4">link</a></td><td>A network anomaly detection framework; much of its engineering implementation is worth borrowing.</td></tr></tbody></table>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Optimization for Nondifferentiable Problem</title>
<link href="/uncategorized/surveys/nondiff_optimization/"/>
<url>/uncategorized/surveys/nondiff_optimization/</url>
<content type="html"><![CDATA[<p><a href="/uncategorized/surveys/bilevel_optimization">上篇</a>调研总结了双层优化问题的解决范式,并梳理了双层优化应用在Sim2real的domain randomization(DR)问题上时的难点在于</p><ul><li>如何解不可微的外层优化问题</li><li><strong>如何转换不可微的外层优化问题为smooth的</strong></li></ul><p>本文将对其中 “不可微问题的求解” 进行调研。</p><p>此外本文也将简要介绍一下关于超参优化(hyperparameter optimization,HO)的内容,调研HO问题的原因是:domain randomization与HO问题具有相似的formulation,但二者又存在区别(最重要的区别在于一般的HO问题是differetiable的,DR问题在没有转换为smooth问题之前是不可微的),因此阅读了一些文献并做了简要总结。</p><span id="more"></span><h2 id="1-Formulation-of-Domain-Randomization"><a href="#1-Formulation-of-Domain-Randomization" class="headerlink" title="1. Formulation of Domain Randomization"></a>1. Formulation of Domain Randomization</h2><p><a href="uncategorized/surveys/sim2real">回顾</a>DR的formulation如下</p><p>$$\begin{array}{rl}\min & F(\phi, \theta) = \mathcal{L}(\theta^{*};\mathcal{D}_{real}) \\s.t. & \theta^{*} = \arg\min_{\theta}f(\phi, \theta) = \mathbb{E}_{x \sim P_{\phi}(x)}[\mathcal{L}(\theta;\mathcal{D}_{x})]\\var. & \phi, \theta\end{array}$$</p>可以看到,外层优化问题可微但**内层优化函数不可微,** 原因在于:<ol><li>采样训练数据 $\mathcal{D}_{x}$ 的分布,其参数受控于优化变量$\phi$ 。</li><li>一般来讲,从仿真器中生成仿真数据的过程(如从图形渲染引擎中生成图像)不单纯是一个分布采样问题,为保证问题的一般性,这个数据生成过程要被视为不可微。</li></ol><blockquote><p>Ruiz, N., Schulter, S., & Chandraker, M. Learning To Simulate. In International Conference on Learning Representations, 2019.</p></blockquote><h2 id="2-Solving-a-Nondifferential-Problem"><a href="#2-Solving-a-Nondifferential-Problem" class="headerlink" title="2. Solving a Nondifferential Problem"></a>2. Solving a Nondifferential Problem</h2><p>一般来说,不可微问题的解决方法有次 <strong>梯度法(subgradient)</strong>、<strong>近端梯度法(Proximal Gradient)</strong>。</p><h3 id="2-1-Subgradient"><a href="#2-1-Subgradient" class="headerlink" title="2.1 Subgradient"></a>2.1 Subgradient</h3><p><strong>次梯度法一般用于求解convex non-smooth问题</strong>,本节的主要参考材料为[1],同时参考了博客[2]</p><blockquote><p>[1] Polyak, B. T. (1978). Subgradient methods: a survey of Soviet research. 
<h4 id="2-1-3-Error-and-Convergence"><a href="#2-1-3-Error-and-Convergence" class="headerlink" title="2.1.3 Error and Convergence"></a>2.1.3 Error and Convergence</h4><p>The convergence rate is $O(1/\epsilon^2)$, slower than the $O(1/\epsilon)$ of gradient descent. The error level after $k$ iterations is $O(1/\sqrt{k})$.</p><h3 id="2-2-Proximal-Gradient"><a href="#2-2-Proximal-Gradient" class="headerlink" title="2.2 Proximal Gradient"></a>2.2 Proximal Gradient</h3><p>The main references for this section:</p><blockquote><p>[1] Schmidt, M., Roux, N., & Bach, F. (2011). Convergence rates of inexact proximal-gradient methods for convex optimization. <em>Advances in neural information processing systems</em>, <em>24</em>.</p><p>[2] <a href="https://zhuanlan.zhihu.com/p/82622940">https://zhuanlan.zhihu.com/p/82622940</a></p></blockquote><p>The proximal gradient method suits objectives that are non-differentiable as a whole but <strong>decompose into a differentiable part plus a non-differentiable part</strong>. Compared with the subgradient method, it achieves faster convergence and smaller error.</p><h4 id="2-2-1-Problem-Statement"><a href="#2-2-1-Problem-Statement" class="headerlink" title="2.2.1 Problem Statement"></a>2.2.1 Problem Statement</h4><p>The proximal gradient method solves a class of composite non-differentiable problems of the form</p><p>$$\underset{x \in \mathbb{R}^d}{\operatorname{minimize}} f(x):=g(x)+h(x)$$</p><p>where $g, h$ are both convex, $g$ is smooth, and $h$ is a non-smooth, non-differentiable term.</p><h4 id="2-2-2-Definition"><a href="#2-2-2-Definition" class="headerlink" title="2.2.2 Definition"></a>2.2.2 Definition</h4><p>The algorithm is built on the following proximity operator, defined as</p><p>$$\operatorname{prox}_L(y)=\underset{x \in \mathbb{R}^d}{\arg \min } \frac{L}{2}\|x-y\|^2+h(x)$$</p><p>where $L$ is the Lipschitz constant of $\nabla g$. In essence, <strong>the proximal operator defines the update direction toward the minimum via a quadratic approximation of the smooth term $g$</strong>.</p><p>Reading the definition: its argument is $y$; given a $y$, we seek the $x$ that minimizes the expression that follows. Since that inner minimization is over $x$, the form of the proximal operator is tightly coupled to the term $h$.</p><p>🌟 For several special forms of $h$, e.g. the $l_1$-norm, the proximal operator above has a closed-form solution (see references [5, 6] in [1] and the Zhihu article). <strong>In many cases, however, the proximal operator has no closed-form solution, or computing it exactly can be very expensive.</strong></p>
<h4 id="2-2-3-Proximal-Gradient-Descent"><a href="#2-2-3-Proximal-Gradient-Descent" class="headerlink" title="2.2.3 Proximal Gradient Descent"></a>2.2.3 Proximal Gradient Descent</h4><p>The parameter update of proximal-gradient optimization is (generalized form)</p><p>$$x_k=\operatorname{prox}_L\left[y_{k-1}-(1 / L)\left(\nabla g\left(y_{k-1}\right)+e_k\right)\right]$$</p><p>where $e_k$ is the error incurred when computing the gradient; in addition, if the proximal operator is solved inexactly, $x_k$ carries a further error term $\epsilon_k$ caused by that inexactness.</p><p>A more digestible form is (basic gradient descent), with step size $t$:</p><p>$$x_k=\operatorname{prox}_L\left[x_{k-1}-t\nabla g\left(x_{k-1}\right)\right]$$</p><p>Compared with the generalized form, basic gradient descent takes $y_k = x_k$, whereas the <strong>accelerated proximal-gradient method</strong> takes $y_k=x_k+\beta_k\left(x_k-x_{k-1}\right)$.</p><h4 id="2-2-4-Error"><a href="#2-2-4-Error" class="headerlink" title="2.2.4 Error"></a>2.2.4 Error</h4><p>The error level after $k$ iterations is $O(1/k)$; for the accelerated proximal-gradient method it is $O(1/k^2)$.</p>
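<p>For the canonical case $h(x)=\lambda\|x\|_1$, the proximal operator has the closed-form soft-thresholding solution, and the basic iteration above becomes ISTA. A minimal sketch on a lasso objective ($A$, $b$, $\lambda$, and the iteration count are illustrative assumptions):</p><pre class="language-python" data-language="python"><code class="language-python"># Proximal gradient (ISTA) for g(x) + h(x) = 0.5 * ||Ax - b||^2 + lam * ||x||_1.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
b = rng.normal(size=50)
lam = 0.1
L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of grad g(x) = A^T(Ax - b)

def prox(y, t):
    # soft-thresholding: closed-form prox of t * lam * ||.||_1
    return np.sign(y) * np.maximum(np.abs(y) - t * lam, 0.0)

x = np.zeros(10)
for _ in range(500):
    grad = A.T @ (A @ x - b)          # gradient of the smooth part g
    x = prox(x - grad / L, 1.0 / L)   # x_k = prox[x_{k-1} - (1/L) * grad g(x_{k-1})]
print(0.5 * np.sum((A @ x - b) ** 2) + lam * np.abs(x).sum())
</code></pre>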
<h2 id="3-Hyperparameter-Optimization"><a href="#3-Hyperparameter-Optimization" class="headerlink" title="3. Hyperparameter Optimization"></a>3. Hyperparameter Optimization</h2><p>This section draws on</p><blockquote><p>Bao, F., Wu, G., Li, C., Zhu, J., & Zhang, B. (2021). Stability and generalization of bilevel programming in hyperparameter optimization. <em>Advances in Neural Information Processing Systems</em>, <em>34</em>, 4529-4541.</p></blockquote><p>The HO problem can be formulated as</p><p>$$\begin{array}{rl}\hat{\lambda}\left(S^{t r}, S^{v a l}\right) & \approx \underset{\lambda \in \Lambda}{\arg \min } \hat{R}^{v a l}\left(\lambda, \hat{\theta}\left(\lambda, S^{t r}\right), S^{v a l}\right) \\\text{where} \quad \hat{\theta}\left(\lambda, S^{t r}\right) & \approx \underset{\theta \in \Theta}{\arg \min } \hat{R}^{t r}\left(\lambda, \theta, S^{t r}\right)\end{array}$$</p><p>The approximate-equality signs appear because in most cases (e.g., when the inner variables are neural-network weights) the global optima of the inner and outer problems above are <strong>unattainable</strong>; they are usually approximated in some way, e.g., with (stochastic) gradient descent.</p><p>🌟 Note: HO does not involve the two stages mentioned for the DR problem, so it can <strong>mostly be treated as differentiable</strong>; it is, however, mostly constrained optimization (e.g., integer constraints).</p><p><strong>The main families of HO methods are:</strong></p><ul><li>Unrolled differentiation: run a finite number of gradient-descent steps on the inner and outer levels. Note that <strong>$\theta$ is differentiable with respect to $\lambda$ here</strong>, so gradient descent applies. Concretely: view the inner variables (network weights) as a sufficiently large matrix; the change in the inner variables induced by a change in the outer variable (a hyperparameter, e.g. the number of layers) can be viewed as the product of that matrix with a mask, so the two are linked by a differentiable function $\theta = g(\lambda)$. A minimal sketch follows after this list.</li><li>Cross-validation: the classical HO method. First obtain a finite set of hyperparameters — usually a subset of $\Lambda$ — by grid or random search; then train the inner problem to obtain the corresponding parameters $\theta$ for each hyperparameter; finally choose the best $(\theta^{\star}, \lambda^{\star})$ pair by <strong>validation error</strong>.</li><li>Implicit gradient: estimate the gradient of the outer problem directly; this usually involves an iterative procedure, such as conjugate gradient or a Neumann approximation, to estimate the inverse of the Hessian.</li><li>Bayesian optimization: treat the outer problem as a <strong>black-box function</strong> sampled from a Gaussian process (GP) and update the GP posterior as new hyperparameters are evaluated.</li><li>Hypernetworks: learn a surrogate function that outputs an approximately optimal hypothesis given the hyperparameters.</li></ul>
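<p>A minimal sketch of unrolled differentiation from the list above: differentiate the validation loss through $T$ inner SGD steps with respect to a hyperparameter (here an L2 regularization weight). The quadratic toy losses, step counts, and learning rates are illustrative assumptions:</p><pre class="language-python" data-language="python"><code class="language-python"># Unrolled differentiation: keep the inner SGD steps on the autograd graph,
# then backpropagate the validation loss to the hyperparameter lam.
import torch

x_tr, y_tr = torch.randn(100, 5), torch.randn(100)
x_val, y_val = torch.randn(50, 5), torch.randn(50)
lam = torch.tensor(0.1, requires_grad=True)     # outer variable (hyperparameter)

w = torch.zeros(5, requires_grad=True)          # inner variable
inner_lr, T = 0.1, 20
for _ in range(T):                              # unrolled, differentiable inner loop
    inner_loss = ((x_tr @ w - y_tr) ** 2).mean() + lam * (w ** 2).sum()
    (g,) = torch.autograd.grad(inner_loss, w, create_graph=True)
    w = w - inner_lr * g                        # differentiable SGD step

val_loss = ((x_val @ w - y_val) ** 2).mean()    # outer objective at approx. theta*(lam)
hypergrad = torch.autograd.grad(val_loss, lam)[0]
print(hypergrad)
</code></pre>]]></content>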
<tags>
<tag> survey </tag>
<tag> optimization </tag>
</tags>
</entry>
<entry>
<title>Paradigm of Bi-level Optimization</title>
<link href="/uncategorized/surveys/bilevel_optimization/"/>
<url>/uncategorized/surveys/bilevel_optimization/</url>
<content type="html"><![CDATA[<p>本文调研了双层优化问题的解决范式,对两种popular的<strong>基于梯度</strong>的方法——AID、ITD作出介绍与总结。最后,本文对如何利用bilevel optimization解决sim2real问题难点作出梳理。</p><p>本调研的motivation来自于<a href="/uncategorized/surveys/sim2real/">sim2real</a>问题——借助少量的仿真数据,如何找到一组仿真器参数$\phi$,在该组参数下的仿真数据上训练一个异常检测模型,将会在现实数据中有较好的效果——这可以类比于一个超参数调优问题。</p><span id="more"></span><h2 id="1-Formulation-of-Bilevel-Optimization"><a href="#1-Formulation-of-Bilevel-Optimization" class="headerlink" title="1. Formulation of Bilevel Optimization"></a>1. Formulation of Bilevel Optimization</h2><p>双层优化问题的形式一般定义为</p><p>$$\begin{array}{rl}\min & \Phi(\boldsymbol{x}):=f(\boldsymbol{x}, \boldsymbol{y}^{\star}) \\\text { s.t. } & \boldsymbol{y}^{\star}=\underset{\boldsymbol{y}}{\arg \min } g\left(\boldsymbol{x}, \boldsymbol{y}\right) \\\text { var. } & \boldsymbol{x}, \boldsymbol{y},\end{array}$$</p>在双层优化中,一个优化问题嵌入或嵌套在另一个问题中。外部优化问题被称为上层优化问题,而内部优化问题被称作下层优化问题,$\boldsymbol{x}, \boldsymbol{y}$ 分别为上下层的优化变量。<p>双层优化问题的解决范式包括以下两类:</p><ol><li><p>Double Loop / gradient-based:内外层问题分别根据其问题特性(可微、平滑、凸等),采用不同的优化方法,两个问题的解通过<strong>超梯度</strong>串联起来,以寻找一个全局的(渐进)最优解。这类方法可以概括为“启发式”——有方向的迭代,一般来说给一个初始可行解然后按照实际问题确定一个下降方向,不断搜索直到gap满足精度要求,但这里必须注意,启发式算法很有可能得到的不是最优解,一定要对结果进行论证。</p><ul><li>此类范式的方法包括:approximate implicit differentiation (AID)、iterative differentiation (ITD),这两类方法均为基于梯度的方法。</li></ul></li><li><p>Single Loop / constrain-based:在这种方法中,下层优化问题可以视为是上层优化问题的约束,因此双层优化也可以视为<strong>约束优化</strong>的特殊情况。这类方法可以概括为“解析式”——接算出解析节,这种方法的逻辑大都使用KKT,对偶,罚函数等将双层规划转化成单层,然后利用单层的方法求解。</p><ul><li>解决带约束优化的方法包括但不限于:Karush–Kuhn–Tucker(KKT,KKT条件表现为拉格朗日和互补约束,并将整体双层优化问题简化为单级约束优化问题),Penalty Function Methods等[1]。</li></ul></li></ol><blockquote><p>[1] Sinha, A., Malo, P., & Deb, K. (2017). A review on bilevel optimization: From classical to evolutionary approaches and applications. <em>IEEE Transactions on Evolutionary Computation</em>, <em>22</em>(2), 276-295.</p><p>[2] <a href="https://www.zhihu.com/question/25059001?sort=created">https://www.zhihu.com/question/25059001?sort=created</a></p></blockquote><h2 id="2-Gradient-based-iterative-methods"><a href="#2-Gradient-based-iterative-methods" class="headerlink" title="2. Gradient-based iterative methods"></a>2. Gradient-based iterative methods</h2><p>Single Loop / constrain-based 方法往往涉及大量约束因此难以在机器学习任务中应用。在下文中,将对两种 Double Loop / gradient-based 的方法——AID / ITD作出总结。主要的formulation参考下文:</p><blockquote><p>Ji, K., Yang, J., & Liang, Y. (2021, July). Bilevel optimization: Convergence analysis and enhanced design. In <em>International conference on machine learning</em> (pp. 4882-4892). PMLR.</p></blockquote><p>这两种算法均涉及到一个概念:<strong>hyper-gradient(超梯度)</strong> <a href="https://github.com/gbaydin/hypergradient-descent">link</a></p><ul><li>梯度(gradient):在一个基于梯度的优化问题中,优化目标对<strong>模型参数</strong>的梯度,被称为 basic gridient,简称为梯度</li><li>超梯度(hyper-gridient):在上述的优化问题中,将优化目标对<strong>优化过程的超参数</strong>(如 learning rate, momentum, regularization parameters, etc.)求梯度,则称为超梯度。</li></ul><p>在双层优化中,待优化的参数往往是内层优化问题的变量(如模型参数),而外层优化问题的变量一般为算法超参数(如模型超参),因此这里<strong>称:外层优化目标对外层变量的梯度——超梯度</strong>。超梯度的复杂度分析可以参考下文</p><blockquote><p>Grazzi, R., Franceschi, L., Pontil, M., & Salzo, S. (2020, November). On the iteration complexity of hypergradient computation. In <em>International Conference on Machine Learning</em> (pp. 3748-3758). 
<h3 id="2-2-Iterative-Differentiation-ITD"><a href="#2-2-Iterative-Differentiation-ITD" class="headerlink" title="2.2 Iterative Differentiation (ITD)"></a>2.2 Iterative Differentiation (ITD)</h3><p>The core of ITD: the approximate hypergradient has an analytical form that can be computed iteratively. The hypergradient is</p><p>$$\nabla \Phi\left(x_k\right)=\frac{\partial f\left(x_k, y^{*}\left(x_k\right)\right)}{\partial x_k}$$</p><p>approximated by $\frac{\partial f\left(x_k, y_k^D\left(x_k\right)\right)}{\partial x_k}$, which admits the analytical form below and is therefore computable:</p><p>$$\begin{aligned}\frac{\partial f\left(x_{k}, y_{k}^{D}\right)}{\partial x_{k}}= & \nabla_{x} f\left(x_{k}, y_{k}^{D}\right)-\alpha \sum_{t=0}^{D-1} \nabla_{x} \nabla_{y} g\left(x_{k}, y_{k}^{t}\right) \\& \times \prod_{j=t+1}^{D-1}\left(I-\alpha \nabla_{y}^{2} g\left(x_{k}, y_{k}^{j}\right)\right) \nabla_{y} f\left(x_{k}, y_{k}^{D}\right)\end{aligned}$$</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20230216213116561.png" alt="Bilevel algorithms via AID or ITD" style="zoom:50%;" /><h3 id="2-3-Other-Methods"><a href="#2-3-Other-Methods" class="headerlink" title="2.3 Other Methods"></a>2.3 Other Methods</h3><p>The essence of gradient-based methods lies in how the hypergradient is computed. Listed here are some methods that do not belong to the paradigms above, worth reading when time permits:</p><blockquote><p>Franceschi, L., Donini, M., Frasconi, P., & Pontil, M. (2017, July). Forward and reverse gradient-based hyperparameter optimization. In <em>International Conference on Machine Learning</em> (pp. 1165-1173). PMLR.</p></blockquote><h2 id="3-利用双层优化解sim2real问题的难点"><a href="#3-利用双层优化解sim2real问题的难点" class="headerlink" title="3. Difficulties of solving sim2real with bilevel optimization"></a>3. Difficulties of solving sim2real with bilevel optimization</h2><p>In brief: AID and ITD are two ways of chaining the inner and outer problems together through the hypergradient, but in actual computation they still impose many restrictions on the <strong>outer optimization problem</strong>, such as <strong>smoothness</strong>, twice differentiability, and an invertible Hessian.</p><p>Comparing with domain randomization, the outer problem of DR remains non-differentiable [1]. Common ways around this non-differentiability are:</p><ul><li>gradient-free algorithms: Bayesian optimization, RL;</li><li>approximating the non-smooth problem by a smooth one, which unlocks most of the bilevel literature.</li></ul><p>Hence the crux of solving DR as a bilevel problem is <strong>how to solve a non-differentiable optimization problem, or how to convert it into a differentiable one</strong> — the focus of the <a href="/uncategorized/surveys/nondiff_optimization">next survey</a>.</p><blockquote><p>[1] Ruiz, N., Schulter, S., & Chandraker, M. 
Learning To Simulate. In International Conference on Learning Representations, 2019.</p></blockquote>]]></content>
<tags>
<tag> survey </tag>
<tag> optimization </tag>
</tags>
</entry>
<entry>
<title>NS3无线网络仿真器</title>
<link href="/uncategorized/notes/NS3/"/>
<url>/uncategorized/notes/NS3/</url>
<content type="html"><![CDATA[<p>本文整理了在用NS3做无线网络仿真的时候,需要掌握的一些基础知识</p><span id="more"></span><h2 id="1-关键名词"><a href="#1-关键名词" class="headerlink" title="1. 关键名词"></a>1. 关键名词</h2><ul><li>Node:应该将 <code>Node</code>视为您将向其添加application的计算机。一个是添加诸如应用程序、协议栈和外围卡及其相关驱动程序之类的东西,以使计算机能够执行有用的工作。<em>我们在ns-3</em>中使用相同的基本模型。</li><li>Application:<em>ns-3</em> <code>application</code>在 <em>ns-3</em> <code>Nodes</code>上运行,以驱动模拟世界中的模拟。</li><li>Channel:在<em>ns-3</em>的模拟世界中,将<code>Node</code>连接到表示通信信道的对象。在这里,基本的通信子网络抽象被称为<code>Channel</code>,并在C++中由Channel类表示。Channel类提供了管理通信子网络对象和将节点连接到这些对象的方法。channel的实体可以建模像导线这样简单的东西,还可以模拟像大型以太网交换机或无线网络中充满障碍物的三维空间这样复杂的事物。在本教程中,我们将使用名为<code>CsmaChannel</code>、<code>PointToPointChannel</code>和<code>WifiChannel</code>的Channel的实体。<ul><li><code>CsmaChannel</code>为实现载波侦听多址通信介质的通信子网络的一个版本建模。这为我们提供了类似以太网的功能。</li></ul></li><li>Net Device:<em>ns-3</em>中的<code>net device</code>抽象类包含了软件驱动与模拟的硬件设备,一个<code>net device</code>被“安装”在<code>Node</code>中,用于使能<code>Node</code>与其他<code>Node</code>之间经由<code>Channel</code>的通信。正如一个真实的计算机,一个<code>Node</code>可以通过多个<code>net device</code>链接多个<code>channel</code>。</li><li>Topology Helpers:在<em>ns-3</em>中,连接<code>Net Devices</code>到<code>Node</code>,连接<code>NetDevice</code>到<code>channel</code>,分配IP地址等都是常见的任务,因此我们提供了我们称之为Topology Helpers的功能,以尽可能简化这一过程。<ul><li>创建网络设备、添加MAC地址、在节点上安装该网络设备、配置节点的协议堆栈,然后将网络设备连接到信道,可能需要许多不同的ns-3核心操作。甚至需要更多的操作来将多个设备连接到多点信道上,然后将单个网络连接到互联网上。我们提供了拓扑帮助器对象,这些对象将这些不同的操作组合成一个易于使用的模型。</li></ul></li></ul>]]></content>
<tags>
<tag> note </tag>
<tag> 5G </tag>
<tag> 4G </tag>
<tag> simulator </tag>
<tag> ns3 </tag>
</tags>
</entry>
<entry>
<title>Sim2Real and Domain Randomization</title>
<link href="/uncategorized/surveys/sim2real/"/>
<url>/uncategorized/surveys/sim2real/</url>
<content type="html"><![CDATA[<p>本文首先对Sim2real问题作出了简要介绍,并简单对其方法作出分类。接着,对于其中的domain randomization方法给出bilevel optimization的形式,最后调研了基于上述优化形式(或类似形式)下的优化问题求解方案。</p><p>此外 <a href="https://lilianweng.github.io/posts/2019-05-05-domain-randomization/">Lilian Weng (Open AI) 的博客</a> 对 DR 问题做了深入探讨,极具价值。</p><span id="more"></span><h2 id="概述"><a href="#概述" class="headerlink" title="概述"></a>概述</h2><p>sim2real的全称是simulation to reality,是强化学习的一个分支,同时也属于transfer learning的一种。主要解决的问题是机器人领域中,直接让机器人或者机械臂在现实环境中与环境进行交互、采样时,会出现以下两个比较严重的问题:</p><ul><li>采样效率太低(在用强化学习算法解决机器人相关问题时,所需要的样本量一般会达到上千万,在现实环境中采集如此数量级的样本要耗费几个月的时间)</li><li>安全问题 (由于强化学习需要通过智能体在环境中进行大范围的随机采样来进行试错,因而在某些时刻其做出的行为可能会损伤机器人自身,例如手臂转动角度过大或者避障任务中由于碰撞造成的不可逆损伤等等;也可能会损害周围的环境甚至生物)</li></ul><p>但是如果我们在模拟器中进行强化学习算法的训练,以上两个问题均可迎刃而解。但是,这里同样会存在一个问题,由于模拟器对于物理环境的建模都是存在误差的,因而在模拟环境中学习到的最优策略是否可以直接在现实环境中应用呢?答案往往是否定的,我们把这个问题称为 “reality gap”。而sim2real的工作就是去尝试解决这个问题。</p><p>这里值得注意的一点是,虽然这个方向叫做sim2real,其实其中的所有的算法都可以直接应用在sim2sim,real2real等的任务中。</p><p>[引自:<a href="https://zhuanlan.zhihu.com/p/106216904]">https://zhuanlan.zhihu.com/p/106216904]</a></p><p>本文找了一篇survey,对其中的内容作出整理,意图对整个sim2real领域有一个大致的了解。</p><h2 id="1-Sim2real-分类"><a href="#1-Sim2real-分类" class="headerlink" title="1. Sim2real 分类"></a>1. Sim2real 分类</h2><p>在下述工作中,Sim2real被分为了以下4个类别</p><blockquote><p>W. Zhao, J. P. Queralta and T. Westerlund, “Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey,” 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, ACT, Australia, 2020, pp. 737-744, doi: 10.1109/SSCI47803.2020.9308468.</p></blockquote><p><strong>tax1.1 Zero-shot Transfer</strong></p><p>建立一个逼真的模拟器,或者有足够的模拟经验,这样模型就可以直接应用到现实环境中。这种策略通常称为zero-shot transfer或direct transfer。建立真实世界精确模型的系统识别(System Identification)和域随机化(Domain Randomization Methods)是可以被视为一次性迁移的技术。我们在第 III-B 节和第III-C节中分别讨论了这两个问题。</p><p><strong>tax1.2 System Identification</strong></p><p>值得注意的是,模拟器不是真实世界的忠实代表。<strong>系统识别[51]正是为了建立物理系统的精确数学模型,并使模拟器更真实,需要仔细校准</strong>。尽管如此,获得足够逼真的模拟器的挑战仍然存在。例如,很难构建高质量的渲染图像来模拟真实的视觉。此外,同一机器人的许多物理参数可能会因温度、湿度、定位或其磨损而发生显著变化,这给系统识别带来了更大的困难。</p><p><strong>tax1.3 Domain Randomization Methods</strong></p><p>领域随机化是这样一种想法[52],我们可以高度随机化模拟,而不是仔细建模真实世界的所有参数,以覆盖真实世界数据的真实分布,尽管模型和真实世界之间存在偏差。图3a显示了域随机化的范例。</p><p><strong>tax1.4 Domain Adaptation Methods</strong></p><p>域自适应方法使用来自源域的数据来提高学习模型在数据总是较少可用的不同目标域上的性能。由于源域和目标域之间通常存在不同的特征空间,为了更好地迁移源数据中的知识,我们应该尝试使这两个特征空间统一。这是域自适应的主要精神,可以用图3b中的图来描述。</p><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20230206202202709.png" alt="image-20230206202202709"></p><p>在此分类下,根据任务目的,还有下述两种分类,参考下文(但这篇文章不怎么样),这里参考了经典Robot policy learning的场景。</p><blockquote><p>Dimitropoulos, K., Hatzilygeroudis, I., & Chatzilygeroudis, K. (2022). A brief survey of Sim2Real methods for robot learning. 
<ol><li><p><strong>Randomizing for Perception</strong>: given a simply programmed controller, build perception models that generalize to the physical world; what gets randomized is mainly environmental factors.</p></li><li><p><strong>Randomizing for Control</strong>: the main goal is to train a dexterous control agent; randomization covers, among others, joint positions, camera pose, random distractors, mass, rolling and torsional friction coefficients, lighting, and object placement.</p></li></ol><p><strong>tax1.5 Learning with Disturbances</strong></p><p>Domain randomization and dynamics randomization focus on introducing perturbations into the simulated environment, with the aim of making the agent less susceptible to the mismatch between simulation and reality [30], [38], [40]. The same concept has been extended in other works, where perturbations are introduced to obtain more robust agents. For instance, in [72] the authors consider noisy rewards; while not directly related to sim-to-real transfer, noisy rewards better model real-world agent training. Also, some of the survey authors' recent works [8], [73] consider environment perturbations affecting different agents learning in parallel — an aspect to take into account when multiple real agents are to be deployed or trained with a common policy.</p><p><strong>tax1.6 Simulation Environments</strong></p><p>A key aspect of sim2real is the choice of <strong>simulator</strong>. Independently of the technique used to transfer knowledge to the real robot effectively, the more realistic the simulator, the better the expected result. The most widely used simulators in the literature are Gazebo [74], Unity3D, and PyBullet [75] or MuJoCo [17]. Gazebo has the advantage of broad integration with the Robot Operating System (ROS) middleware, so it can be used with part of the robot stack present in real robots. PyBullet and MuJoCo, on the other hand, integrate more broadly with DL and RL libraries and Gym environments. In general, Gazebo suits more complex scenarios, while PyBullet and MuJoCo provide faster training. When system identification for one-shot transfer is the goal, researchers usually build or customize specific simulations meeting the particular problem's requirements and constraints [32], [36], [41].</p><h2 id="2-Domain-randomization优化问题表述"><a href="#2-Domain-randomization优化问题表述" class="headerlink" title="2. Domain randomization as an optimization problem"></a>2. Domain randomization as an optimization problem</h2><p>Domain randomization can be stated as the following bilevel optimization problem:</p><p>$$\begin{array}{rl}\min & F(\phi, \theta) = \mathcal{L}(\theta^{*};\mathcal{D}_{real}) \\\text{s.t.} & \theta^{*} = \arg\min_{\theta}f(\phi, \theta) = \mathbb{E}_{x \sim P_{\phi}(x)}[\mathcal{L}(\theta;\mathcal{D}_{x})]\\\text{var.} & \phi, \theta\end{array}$$</p><ul><li><p>$\phi$: the upper-level variable, a parameter <strong>controlling the distribution that generates randomized samples</strong>.</p></li><li><p>$\theta$: the lower-level variable, the parameters of the concrete model being learned (a controller, a neural network, ...).</p></li><li><p>$\mathcal{D}_{real}$: the real-world dataset.</p></li><li><p>$\mathcal{D}_{x}$: the synthetic dataset, generated from the distribution $P_{\phi}$, which is governed by $\phi$.</p></li></ul><blockquote><p>🌟 Franceschi, L., Frasconi, P., Salzo, S., Grazzi, R., & Pontil, M. (2018, July). Bilevel programming for hyperparameter optimization and meta-learning. In <em>International Conference on Machine Learning</em> (pp. 1568-1577). PMLR.</p><p>Marez, D., Borden, S., & Nans, L. (2020, May). UAV detection with a dataset augmented by domain randomization. In <em>Geospatial Informatics X</em> (Vol. 11398, pp. 39-50). SPIE.</p></blockquote><p>This is the direct statement of the DR problem: find a suitable set of simulator parameters whose induced distribution resembles the real one. But it is a <strong>non-convex, non-smooth</strong> problem whose form is unfriendly to optimization; current lines of attack include:</p><ul><li>Gradient-free: Bayesian optimization, RL, and other methods that need no differentiation — a minimal score-function sketch follows this list;</li><li>Gradient-based: borrow the adversarial idea and recast the original problem as a single-level min-max problem.</li></ul>
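<p>As a concrete instance of the gradient-free line: because $\phi$ enters only through the sampling distribution $P_{\phi}$, a score-function (REINFORCE) estimator, $\nabla_{\phi}\mathbb{E}_{x \sim P_{\phi}}[L(x)]=\mathbb{E}[L(x)\nabla_{\phi}\log P_{\phi}(x)]$, can optimize $\phi$ without ever differentiating through data generation or training. A minimal sketch, where $P_{\phi}$ is a 1-D Gaussian over a simulator parameter and <code>inner_loss</code> is a toy stand-in for "train on simulated data, evaluate on real data" (all names and numbers are illustrative assumptions):</p><pre class="language-python" data-language="python"><code class="language-python">import numpy as np

rng = np.random.default_rng(0)
mu, log_sigma = 0.0, 0.0          # phi = (mu, log_sigma)

def inner_loss(x):
    # black-box stand-in: pretend the real-world loss is minimized at x = 2
    return (x - 2.0) ** 2

lr, n_samples = 0.05, 64
for step in range(200):
    sigma = np.exp(log_sigma)
    x = rng.normal(mu, sigma, size=n_samples)   # sample simulator parameters
    L = inner_loss(x)
    L = L - L.mean()                            # baseline for variance reduction
    # gradients of log N(x; mu, sigma) w.r.t. mu and log_sigma
    dlogp_dmu = (x - mu) / sigma ** 2
    dlogp_dlogsigma = ((x - mu) ** 2) / sigma ** 2 - 1.0
    mu -= lr * np.mean(L * dlogp_dmu)
    log_sigma -= lr * np.mean(L * dlogp_dlogsigma)
print(mu, np.exp(log_sigma))      # mu drifts toward 2, sigma shrinks
</code></pre>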
<p>The figure below, from "Goldilocks-curriculum Domain Randomization and Fractal Perlin Noise with Application to Sim2Real Pneumonia Lesion Detection", summarizes the methods.</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20230221203527418.png" style="zoom:50%;" /><h2 id="3-调研:基于Bi-level-Optimization的DR问题求解"><a href="#3-调研:基于Bi-level-Optimization的DR问题求解" class="headerlink" title="3. Survey: solving DR via bilevel optimization"></a>3. Survey: solving DR via bilevel optimization</h2><h3 id="3-1-Gradient-free"><a href="#3-1-Gradient-free" class="headerlink" title="3.1 Gradient-free"></a>3.1 Gradient-free</h3><ol><li>Solving the bilevel problem with <strong>policy gradients</strong> (RL) (no code released)</li></ol><blockquote><p>Ruiz, N., Schulter, S., & Chandraker, M. Learning To Simulate. In <em>2019 International Conference on Learning Representations</em>.</p></blockquote><p>After expressing domain randomization as the bilevel problem of Section 2, this paper argues that gradient-based methods cannot solve it (because of the harsh requirements on the inner problem's properties, because the data-generating function is affected by the optimization target, and because the data-generation process itself is non-differentiable), and proposes <strong>policy gradients</strong> to solve the bilevel problem (see their Sec. 2.2). The paper also argues that in realistic settings the simulator should be modeled as a Bayesian network or something more complex, since "actual data description (e.g., what objects are rendered in an image) is sampled from a distribution $S$ parametrized by the provided simulation parameters $\rho$ and specific rendering settings $\phi$ (e.g., lighting conditions) are sampled from a distribution $P$ also parameterized by $\psi$, i.e. $G(x,y|\psi)=\mathcal{R}(S(\rho|\psi),P(\phi|\psi))$".</p><ol start="2"><li>Solving the outer problem with <strong>Bayesian optimization</strong>: from the formulation above, the outer problem optimizes over a distribution. This work models the real-world <strong>cost function</strong> with a Gaussian process (GP) and adjusts the GP's parameters Bayesianly to approximate the real-world cost function. <strong>(Third-party <a href="https://github.com/wibox/MLDLRL/tree/dee539e60e6c7e5e930776e12083875f107cf45b">code</a> available)</strong></li></ol><blockquote><p>Muratore, F., Eilers, C., Gienger, M., & Peters, J. (2021). Data-efficient domain randomization with bayesian optimization. <em>IEEE Robotics and Automation Letters</em>, <em>6</em>(2), 911-918.</p></blockquote><p>$$\begin{aligned}\phi^{\star} & =\arg \max _{\phi \in \Phi} J^{\text {real }}\left(\theta^{\star}(\phi)\right) \quad \text { with } \\\theta^{\star}(\phi) & =\arg \max _{\theta \in \Theta} \mathbb{E}_{\xi \sim \nu(\xi ; \phi)}[J(\theta, \xi)]\end{aligned}$$</p><p>The outer problem is solved with BO: $\hat{J}^{real}(\theta^{\star}(\phi))$ is modeled as a GP, and the GP's mean and covariance are updated using all recorded inputs $\phi$ and the corresponding observations $\hat{J}^{real}(\theta^{\star}(\phi))$.</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20230213152156628.png" alt="image-20230213152156628" style="zoom: 33%;" /><ol start="3"><li>With a small amount of real-world data, let an <strong>RL</strong> algorithm start from a realistic, valid distribution and <strong>progressively, iteratively</strong> learn policies over broader and wider distributions.</li></ol><blockquote><p>Y. Chebotar et al., "Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience," 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 2019, pp. 8973-8979, doi: 10.1109/ICRA.2019.8793789.</p></blockquote><p>The paper refines the outer objective of the previous section into:</p><p>$$\begin{array}{rl}\min _{\phi_{i+1}} & \mathbb{E}_{P_{\xi_{i+1} \sim p_{\phi_{i+1}}}}\left[\mathbb{E}_{\pi_{\theta, p_{\phi_i}}}\left[D\left(\tau_{\xi_{i+1}}^{o b}, \tau_{\text {real }}^{o b}\right)\right]\right] \\\text { s.t. } & D_{K L}\left(p_{\phi_{i+1}} \| p_{\phi_i}\right) \leq \epsilon\end{array}$$</p><p>which limits how much the simulator may change at each step. The paper points out that $p_{\phi_0}$ should start learning from a realistic, valid distribution. The outer objective is solved with <strong>relative entropy policy search</strong>, a sampling-based, gradient-free method.</p><ol start="4"><li>Solving the outer and inner problems with the <strong>cross-entropy method (CEM)</strong> and proximal policy optimization (PPO), respectively <strong>(dockerized code available)</strong></li></ol><blockquote><p>Vuong, Q., Vikram, S., Su, H., Gao, S., & Christensen, H. I. (2019). How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies?. <em>arXiv preprint arXiv:1903.11774</em>.</p></blockquote><p>The inner problem here is policy learning; the central idea is to calibrate the simulator with a small number of real-world samples. CEM — the key to the domain-randomization part — is an iterative, gradient-free stochastic optimization method; a minimal CEM sketch follows after this list.</p><ol start="5"><li><strong>Active domain randomization (RL-based)</strong>: traditional DR requires an explicit set of randomized parameters and their value ranges, and the classic practice is to sample the ranges uniformly to generate environments of different types. It has been shown, however, that uniform sampling makes the model learn policies for too many rarely occurring environments, yielding inefficient learning. This work proposes to <strong>treat generating parameter values within the simulator's value space as itself an RL problem, and to use a learning algorithm to learn how to sample the space</strong>; the outer objective of the method is again an RL problem.</li></ol><blockquote><p>Raparthy, S. C., Mehta, B., Golemo, F., & Paull, L. (2020). Generating automatic curricula via self-supervised active domain randomization. <em>arXiv preprint arXiv:2002.07911</em>.</p></blockquote>
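<p>To make the CEM of item 4 concrete, a minimal sketch: sample candidate simulator parameters, keep the elite fraction, refit a Gaussian to the elites, and repeat. <code>score</code> is a toy stand-in for "train the policy in simulation under $\phi$ and evaluate it on real rollouts"; all names and numbers are illustrative assumptions:</p><pre class="language-python" data-language="python"><code class="language-python">import numpy as np

rng = np.random.default_rng(0)

def score(phi):
    # toy assumption: real-world return peaks at phi = [1.5, -0.5]
    return -np.sum((phi - np.array([1.5, -0.5])) ** 2)

mu, sigma = np.zeros(2), np.ones(2)         # Gaussian search distribution
n_samples, n_elite = 50, 10
for it in range(30):
    candidates = rng.normal(mu, sigma, size=(n_samples, 2))
    scores = np.array([score(c) for c in candidates])
    elite = candidates[np.argsort(scores)[-n_elite:]]    # keep the top-k candidates
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-3  # refit; jitter avoids collapse
print(mu)
</code></pre>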
<h3 id="3-2-Gradient-based"><a href="#3-2-Gradient-based" class="headerlink" title="3.2 Gradient-based"></a>3.2 Gradient-based</h3><ol><li>Introduce the adversarial idea and convert the bilevel problem into a single-level <strong>min-max problem</strong></li></ol><blockquote><p>Zakharov, S., Kehl, W., Ilic, S.: Deceptionnet: Network-driven domain randomization. In: ICCV (2019)</p></blockquote><ol start="2"><li>Could not locate the full text of this one; a newer version likely exists</li></ol><blockquote><p>Khirodkar, R., Yoo, D., Kitani, K.M.: VADRA: visual adversarial domain randomization and augmentation. CoRR abs/1812.00491 (2018)</p></blockquote><h3 id="3-3-others"><a href="#3-3-others" class="headerlink" title="3.3 others"></a>3.3 others</h3><ol><li>Introduce the <strong>Goldilocks principle</strong>: in curriculum learning, training a model in a meaningful order is highly effective for learning the policy; in other words, during learning there is a sweet spot of task difficulty at which continued learning improves the model the most (roughly so; possibly imprecise). This work's formulation differs from the previous section's.</li></ol><blockquote><p>Suzuki, T., Hanaoka, S., & Sato, I. (2022). Goldilocks-curriculum Domain Randomization and Fractal Perlin Noise with Application to Sim2Real Pneumonia Lesion Detection. <em>arXiv preprint arXiv:2204.13849</em>.</p></blockquote><p>Its inner and outer objectives are</p><p>$$\begin{array}{c}\phi^{t+1}=\arg \max_{\phi \in \Phi}-\left|V(\theta^t, S_\phi)-k\right| \\\theta^t=\arg \min_{\theta} \sum_{i=1}^t L(\theta, S_{\phi^i})\end{array}$$</p><p>Inner: train a classifier whose average performance over all simulator parameters so far is optimal. Outer: unlike other works, this is a curriculum-based method — it tunes $k$ to obtain tasks of different difficulty for the given model, seeking a sweet spot at which the model learns most effectively (quoting the paper: "Goldilocks principle suggests that there might be a <strong>sweet spot</strong> of task difficulty that <strong>is effective to successfully progress the training</strong> of the current model").</p><h2 id="4-提及bilevel-formulation但未给出解决方法的工作"><a href="#4-提及bilevel-formulation但未给出解决方法的工作" class="headerlink" title="4. Works that mention the bilevel formulation without a solution method"></a>4. Works that mention the bilevel formulation without a solution method</h2><p>Some papers do not propose a concrete bilevel solution method but borrow the concept for their exposition. For example:</p><p>[1] merely states that $S \subseteq R $, i.e., the feature space of the synthetic dataset must encompass features from the real-world data.</p><blockquote><p>[1] Shamsuddin, A. F., Ragunathan, K., Abhijith, P., PM, D. R. S., & Sankaran, P. (2022). From synthetic to natural—single natural image dehazing deep networks using synthetic dataset domain randomization. <em>Journal of Visual Communication and Image Representation</em>, <em>89</em>, 103636.</p></blockquote>]]></content>
<tags>
<tag> survey </tag>
<tag> optimization </tag>
<tag> sim2real </tag>
</tags>
</entry>
<entry>
<title>3GPP体系</title>
<link href="/uncategorized/notes/3GPP%E4%BD%93%E7%B3%BB/"/>
<url>/uncategorized/notes/3GPP%E4%BD%93%E7%B3%BB/</url>
<content type="html"><![CDATA[<p>本文主要梳理3GPP在无线通信领域的协议体系,以专家知识赋能「基于3GPP协议语料的foundation model」。初步的,本文将梳理以下几点(持续更新)</p><ul><li>从RAN出发,对梳理相关专业名词</li><li>3GPP相关的协议架构 <ul><li>参考:<a href="https://zhuanlan.zhihu.com/p/102176081">link</a>,对3GPP协议的架构、命名、下载方式等做出了总结 ‼️</li></ul></li></ul><span id="more"></span><h2 id="1-不同时代下的基站-及关键组件名称"><a href="#1-不同时代下的基站-及关键组件名称" class="headerlink" title="1. 不同时代下的基站 及关键组件名称"></a>1. 不同时代下的基站 及关键组件名称</h2><p>构建foundation model时,区分实体在RAN中的位置与身份是至关重要的。以下内容转自 <a href="https://commsbrief.com/radio-access-network-ran-geran-utran-e-utran-and-ng-ran/">link</a>,并对其中的缩写做出注释。</p><h3 id="1-1-RAN-Radio-Access-Network"><a href="#1-1-RAN-Radio-Access-Network" class="headerlink" title="1.1 RAN (Radio Access Network)"></a>1.1 RAN (Radio Access Network)</h3><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/IMG_0054.jpg" alt="IMG_0054"></p><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/IMG_0055.jpg" alt="IMG_0055"></p><p>「1G、2G、3G、4G 和 5G 这些术语的真正含义是什么」: <a href="https://commsbrief.com/what-do-the-terms-1g-2g-3g-4g-and-5g-really-mean/">link</a></p><p>「GSM、UMTS 和 LTE 之间有什么区别」: <a href="https://commsbrief.com/difference-between-gsm-umts-lte/">link</a></p><p>「长期演进:什么是 4G LTE 及其工作原理?」: <a href="https://commsbrief.com/long-term-evolution-what-is-4g-lte-and-how-does-it-work/">link</a></p><p>「什么是移动核心网」: <a href="https://commsbrief.com/what-is-a-mobile-core-network/">link</a></p><p>「GSM 中的基站子系统与网络交换子系统」: <a href="https://commsbrief.com/base-station-subsystem-vs-network-switching-subsystem-in-gsm/">link</a></p><p>「GGSN 和 SGSN 有什么区别」: <a href="https://commsbrief.com/what-is-the-difference-between-ggsn-and-sgsn/">link</a></p><p>「无线接入网 (RAN):GERAN、UTRAN、E-UTRAN 和 NG-RAN」: <a href="https://commsbrief.com/radio-access-network-ran-geran-utran-e-utran-and-ng-ran/">link</a></p><p>「Node B、ENodeB 和 GNB 有什么区别,2G/3G/4G使用的基站有什么区别」: <a href="https://commsbrief.com/what-is-the-difference-between-node-b-enodeb-ng-enb-and-gnb/">link</a></p>]]></content>
<tags>
<tag> note </tag>
<tag> 5G </tag>
<tag> 4G </tag>
<tag> 3GPP </tag>
</tags>
</entry>
<entry>
<title>Transformers in Hugging Face</title>
<link href="/uncategorized/notes/hugging_face/"/>
<url>/uncategorized/notes/hugging_face/</url>
<content type="html"><![CDATA[<p>Hugging Face 的入门教程,目标是从0开始训练自己的大模型。</p><span id="more"></span><h2 id="重点教程"><a href="#重点教程" class="headerlink" title="重点教程"></a>重点教程</h2><ul><li>组装所有的组件:<a href="https://huggingface.co/course/chapter2/6?fw=pt">https://huggingface.co/course/chapter2/6?fw=pt</a></li><li>processing data: <a href="https://huggingface.co/course/chapter3/2?fw=pt">https://huggingface.co/course/chapter3/2?fw=pt</a></li><li>Fine-tune: <a href="https://huggingface.co/course/chapter3/3?fw=pt">https://huggingface.co/course/chapter3/3?fw=pt</a></li><li><strong>Full training</strong>: <a href="https://huggingface.co/course/chapter3/4?fw=pt">https://huggingface.co/course/chapter3/4?fw=pt</a></li><li><strong>Train a new tokenizer from a old one</strong>: <a href="https://huggingface.co/course/chapter6/2?fw=pt">https://huggingface.co/course/chapter6/2?fw=pt</a></li><li><strong>Use open source dataset</strong>: <a href="https://huggingface.co/course/chapter5/1?fw=pt">https://huggingface.co/course/chapter5/1?fw=pt</a></li><li>Check tokenizers is fast or not: <a href="https://huggingface.co/course/chapter6/3?fw=pt">https://huggingface.co/course/chapter6/3?fw=pt</a></li><li>Normalization and pre-tokenization (maybe we won’t use): <a href="https://huggingface.co/course/chapter6/4?fw=pt">https://huggingface.co/course/chapter6/4?fw=pt</a></li></ul><h2 id="1-Hugging-Face"><a href="#1-Hugging-Face" class="headerlink" title="1. Hugging Face"></a>1. Hugging Face</h2><p><code>pipeline</code>: 一个端到端的transformer实现,可以直接用于接收文本信息,得到模型在下游任务上的向量表示,并最终处理为人类可理解的形式。</p><p><code>pipeline</code> = <code>tokenizer</code> + <code>model</code> + <code>post processing</code></p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221123193602726.png" alt="image-20221123193602726" style="zoom: 50%;" /><h3 id="1-1-Tokenizer"><a href="#1-1-Tokenizer" class="headerlink" title="1.1 Tokenizer"></a>1.1 Tokenizer</h3><p>tokenizer: </p><ol><li>[分词] Splitting the input into words, subwords, or symbols (like punctuation) that are called tokens<ul><li>split on spaces</li><li>Character-based</li><li>sub-word tokenization</li></ul></li><li>[查表] Mapping each token to an integer</li><li>[add attention mask, etc] Adding additional inputs that may be useful to the model</li></ol><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># load a pretrained tokenizer</span><span class="token keyword">from</span> transformers <span class="token keyword">import</span> AutoTokenizercheckpoint <span class="token operator">=</span> <span class="token string">"distilbert-base-uncased-finetuned-sst-2-english"</span>tokenizer <span class="token operator">=</span> AutoTokenizer<span class="token punctuation">.</span>from_pretrained<span class="token punctuation">(</span>checkpoint<span class="token punctuation">)</span><span class="token comment"># get result </span>raw_inputs <span class="token operator">=</span> <span class="token punctuation">[</span> <span class="token string">"I've been waiting for a HuggingFace course my whole life."</span><span class="token punctuation">,</span> <span class="token string">"I hate this so much!"</span><span class="token punctuation">,</span><span class="token punctuation">]</span>inputs <span class="token operator">=</span> tokenizer<span class="token punctuation">(</span>raw_inputs<span class="token punctuation">,</span> padding<span class="token operator">=</span><span class="token boolean">True</span><span 
class="token punctuation">,</span> truncation<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span> return_tensors<span class="token operator">=</span><span class="token string">"pt"</span><span class="token punctuation">)</span><span class="token keyword">print</span><span class="token punctuation">(</span>inputs<span class="token punctuation">)</span><span class="token triple-quoted-string string">'''{ 'input_ids': tensor([ [ 101, 1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012, 102], [ 101, 1045, 5223, 2023, 2061, 2172, 999, 102, 0, 0, 0, 0, 0, 0, 0, 0] ]), 分词之后,每个词在词表中的id,注意这里用了word and subword分词方法,即分割词语到不可分割的常见词语为止,其中包含了用于将序列填充为等长序列的占位符 'attention_mask': tensor([ [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0] ]),have the same shape as input ids, }'''</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><p>API:</p><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># load</span><span class="token keyword">from</span> transformers <span class="token keyword">import</span> AutoTokenizertokenizer <span class="token operator">=</span> AutoTokenizer<span class="token punctuation">.</span>from_pretrained<span class="token punctuation">(</span><span class="token string">"bert-base-cased"</span><span class="token punctuation">)</span><span class="token comment"># use</span>tokenizer<span class="token punctuation">(</span><span class="token string">"Using a Transformer network is simple"</span><span class="token punctuation">)</span><span class="token triple-quoted-string string">'''{'input_ids': [101, 7993, 170, 11303, 1200, 2443, 1110, 3014, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1]}'''</span><span class="token comment"># split (tokenize)</span>sequence <span class="token operator">=</span> <span class="token string">"Using a Transformer network is simple"</span>tokens <span class="token operator">=</span> tokenizer<span class="token punctuation">.</span>tokenize<span class="token punctuation">(</span>sequence<span class="token punctuation">)</span><span class="token keyword">print</span><span class="token punctuation">(</span>tokens<span class="token punctuation">)</span><span class="token triple-quoted-string string">'''output: ['Using', 'a', 'transform', '##er', 'network', 'is', 'simple']'''</span><span class="token comment"># From tokens to input IDs</span>ids <span class="token operator">=</span> tokenizer<span class="token punctuation">.</span>convert_tokens_to_ids<span class="token punctuation">(</span>tokens<span class="token punctuation">)</span><span class="token keyword">print</span><span class="token punctuation">(</span>ids<span class="token punctuation">)</span><span class="token triple-quoted-string string">'''output: [7993, 170, 11303, 1200, 2443, 1110, 3014]'''</span><span class="token comment"># decoding</span>decoded_string <span class="token operator">=</span> tokenizer<span class="token punctuation">.</span>decode<span class="token 
punctuation">(</span><span class="token punctuation">[</span><span class="token number">7993</span><span class="token punctuation">,</span> <span class="token number">170</span><span class="token punctuation">,</span> <span class="token number">11303</span><span class="token punctuation">,</span> <span class="token number">1200</span><span class="token punctuation">,</span> <span class="token number">2443</span><span class="token punctuation">,</span> <span class="token number">1110</span><span class="token punctuation">,</span> <span class="token number">3014</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token keyword">print</span><span class="token punctuation">(</span>decoded_string<span class="token punctuation">)</span><span class="token triple-quoted-string string">'''output: 'Using a Transformer network is simple'''</span>'<span class="token comment"># save</span>tokenizer<span class="token punctuation">.</span>save_pretrained<span class="token punctuation">(</span><span class="token string">"directory_on_my_computer"</span><span class="token punctuation">)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><h3 id="1-2-Model"><a href="#1-2-Model" class="headerlink" title="1.2 Model"></a>1.2 Model</h3><p><code>Model</code> = <code>transformer</code> + <code>model heads</code></p><p><code>transformer</code>: input: tokenized raw data; output: high-dimensional output shape like <code>[b, t, d]</code></p><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># load pretrained transformer</span><span class="token keyword">from</span> transformers <span class="token keyword">import</span> AutoModelcheckpoint <span class="token operator">=</span> <span class="token string">"distilbert-base-uncased-finetuned-sst-2-english"</span>model <span class="token operator">=</span> AutoModel<span class="token punctuation">.</span>from_pretrained<span class="token punctuation">(</span>checkpoint<span class="token punctuation">)</span>outputs <span class="token operator">=</span> model<span class="token punctuation">(</span><span class="token operator">**</span>inputs<span class="token punctuation">)</span> <span class="token comment"># tokenized input</span><span class="token keyword">print</span><span class="token punctuation">(</span>outputs<span class="token punctuation">.</span>last_hidden_state<span class="token punctuation">.</span>shape<span class="token punctuation">)</span><span class="token comment"># output: torch.Size([2, 16, 768]), [b, t, d]</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><p><code>model heads</code>: input: output of transformer; output: the result of downstream task, maybe the output of a sigmoid network.</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221123194149899.png" alt="image-20221123194149899" style="zoom: 33%;" /><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># transformer 
with subsequent network</span><span class="token keyword">from</span> transformers <span class="token keyword">import</span> AutoModelForSequenceClassificationcheckpoint <span class="token operator">=</span> <span class="token string">"distilbert-base-uncased-finetuned-sst-2-english"</span>model <span class="token operator">=</span> AutoModelForSequenceClassification<span class="token punctuation">.</span>from_pretrained<span class="token punctuation">(</span>checkpoint<span class="token punctuation">)</span>outputs <span class="token operator">=</span> model<span class="token punctuation">(</span><span class="token operator">**</span>inputs<span class="token punctuation">)</span><span class="token keyword">print</span><span class="token punctuation">(</span>outputs<span class="token punctuation">.</span>logits<span class="token punctuation">.</span>shape<span class="token punctuation">)</span><span class="token comment"># output: torch.Size([2, 2]), we have just two sentences and two labels, the result we get from our model is of shape 2 x 2.</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><p><strong>The Choice of <code>Model</code></strong></p><ul><li><code>*Model</code> <strong>(retrieve the hidden states)</strong></li><li><code>*ForCausalLM</code></li><li><code>*ForMaskedLM</code></li><li><code>*ForMultipleChoice</code></li><li><code>*ForQuestionAnswering</code></li><li><code>*ForSequenceClassification</code></li><li><code>*ForTokenClassification</code></li><li>and others (non-exhaustive list)</li></ul><h3 id="1-3-Post-processing"><a href="#1-3-Post-processing" class="headerlink" title="1.3 Post-processing"></a>1.3 Post-processing</h3><p>Map tensor value output by model head (mentioned above) to text (according to id2text, etc.).</p>]]></content>
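<p>A minimal sketch of this post-processing for the sequence-classification head above, following the Transformers course: apply a softmax to the logits and map class indices back to labels via the model config. <code>outputs</code> and <code>model</code> are the objects from the snippets above:</p><pre class="language-python" data-language="python"><code class="language-python">import torch

# logits -> probabilities
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)

# map class indices back to human-readable labels
print(model.config.id2label)
# output: {0: 'NEGATIVE', 1: 'POSITIVE'}
</code></pre>]]></content>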
<tags>
<tag> note </tag>
<tag> hugging face </tag>
<tag> transformers </tag>
</tags>
</entry>
<entry>
<title>Related Papers in ACL 2022</title>
<link href="/uncategorized/paperlistfile/ACL2022/"/>
<url>/uncategorized/paperlistfile/ACL2022/</url>
<content type="html"><![CDATA[<p><a href="https://www.2022.aclweb.org/papers">Link</a></p><span id="more"></span><h2 id="Anomaly-detection"><a href="#Anomaly-detection" class="headerlink" title="Anomaly detection"></a>Anomaly detection</h2><ul><li><p>Self-Attentive, Multi-Context One-Class Classification for Unsupervised Anomaly Detection on Text</p><p>Lukas RuffYury ZemlyanskiyRobert VandermeulenThomas SchnakeMarius Kloft</p></li></ul><h2 id="Big-model-foundation-model"><a href="#Big-model-foundation-model" class="headerlink" title="Big model / foundation model"></a>Big model / foundation model</h2><ul><li><p>BMInf: An Efficient Toolkit for Big Model Inference and Tuning</p><p>Xu HanGuoyang ZengWeilin ZhaoZhiyuan LiuZhengyan ZhangJie ZhouJun ZhangJia ChaoMaosong Sun</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Foundation models</title>
<link href="/uncategorized/surveys/foundation_models/"/>
<url>/uncategorized/surveys/foundation_models/</url>
<content type="html"><![CDATA[<p>这里列出一些近年来关于大模型的总结、调研,还有相关顶会论文。总结顶会论文的原因在于,在我看来大模型(或基础模型)大多都是在工程领域的创新,如何利用工程创新,助力是科学创新。中间的桥梁应该被找到。</p><span id="more"></span><p>注:有些调研直接截图了平日的工作汇报,注意与最新的工作进展及时同步。</p><h2 id="1-Large-Language-Model-after-GPT-3"><a href="#1-Large-Language-Model-after-GPT-3" class="headerlink" title="1. Large Language Model, after GPT-3"></a>1. Large Language Model, after GPT-3</h2><p>概括:<strong>超大参数规模</strong>的模型,并利用<strong>超大规模数据</strong>,大多以self-supervised方式进行训练,来学习数据的通用表征。后续通过prompt、fine-tune等迁移学习方法适应不同下游任务的通用模型范式。</p><p><strong>目前foundation model的应用领域以及下游任务包括</strong></p><ul><li><p>NLP(成熟)</p><ul><li><p>下游任务:翻译、问答、语义总结,等</p></li><li><p>代表模型:GPT-3,LaMDA、PaLM、BLOOM,等</p></li></ul></li><li><p>CV</p><ul><li><p>下游任务:文生图、文生视频、图片描述、风格迁移,等</p></li><li><p>代表模型:DALL-E 2、Imagen、Parti,等</p></li></ul></li></ul><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221121142447307.png" alt="Explode of LLM"></p><p><strong>Foundation model的特点:emergence, homogenization</strong></p><ul><li><p>Emergence:除“隐生性”,即模型学习到的表征是隐性的,而非人类指定的。还有一种解释为“<strong>涌现性</strong>”,即:模型参数规模上,由量变引起质变的过程,一些模型的特性在小模型上不具备,而当参数规模扩大后才会显露的特性。</p></li><li><p>Homogenization:foundation model的基础模型呈现同质化趋势,目前NLP大模型几乎都由transformer结构中改变而来。</p></li></ul><p><strong>Foundation model对下游任务的适配</strong></p><ul><li><p>Fine-tune:针对特定的任务,利用特定的标签数据对模型参数进行fine-tune,得到的模型将只在<strong>特定任务</strong>上有较好性能,无法用于其他任务</p></li><li><p>Prompt:对输入的文本按照特定模板进行处理,通过恰当的方式<strong>重新定义下游任务</strong>,使之更适配预训练语言模型的形式,使之回忆起预训练时的知识</p><ul><li><p>Few-shot learning setting</p></li><li><p>Zero-shot learning setting</p></li></ul></li></ul><h3 id="1-1-大模型调研"><a href="#1-1-大模型调研" class="headerlink" title="1.1 大模型调研"></a>1.1 大模型调研</h3><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221121143241729.png" alt="LLM-Timeline"></p><p>上述模型的体量总结如下表</p><table><thead><tr><th align="center">模型</th><th align="center">训练时间</th><th align="center">训练空间</th><th align="center">模型大小</th><th align="center">优化器+模型大小</th><th align="center">参数量</th><th align="center">数据量</th><th align="center">模型结构</th></tr></thead><tbody><tr><td align="center">GPT-3 (OpenAI)</td><td align="center">3.14e11 TFLOPS</td><td align="center"></td><td align="center"></td><td align="center"></td><td align="center">175B</td><td align="center">45TB (raw data) 570GB</td><td align="center">Sparse Transformer</td></tr><tr><td align="center">PanGu (Huawei, CN)</td><td align="center"></td><td align="center">2048 Ascend 910 AI processors</td><td align="center"></td><td align="center">750GB</td><td align="center">200B</td><td align="center">1.1T</td><td align="center">Transformer</td></tr><tr><td align="center">GPT-J (EleutherAI)</td><td align="center">1.5e10 TFLOPs</td><td align="center"></td><td align="center">9GB</td><td align="center">61GB</td><td align="center">6B</td><td align="center">825G (raw data)</td><td align="center">Sparse Transformer (like GPT-3)</td></tr><tr><td align="center">Ernie 3.0 Titan (Baidu)</td><td align="center">3.14e11 TFLOPS</td><td align="center">Nvidia V100 GPU and Ascend 910 NPU clusters (分布式)</td><td align="center"></td><td align="center">2.1TB</td><td align="center">260B</td><td align="center"></td><td align="center">Transformer-XL</td></tr><tr><td align="center">GPT-NeoX (EleutherAI)</td><td align="center"></td><td align="center"></td><td align="center">39GB</td><td align="center">268GB</td><td align="center">20B</td><td align="center">825G (raw data)</td><td align="center">Sparse Transformer (like GPT-3)</td></tr><tr><td 
align="center">OPT (Meta)</td><td align="center">4.48e10 TFLOPs*</td><td align="center">992 80GB A100 GPUs</td><td align="center"></td><td align="center"></td><td align="center">175B</td><td align="center">800GB (raw data)</td><td align="center">Transformer</td></tr><tr><td align="center">BLOOM (BigScience)</td><td align="center">3.5 month</td><td align="center">384 A100 80GB GPUs (48 nodes)</td><td align="center">0.33TB</td><td align="center">2.3TB</td><td align="center">176B</td><td align="center"></td><td align="center">Transformer (like GPT-2)</td></tr><tr><td align="center">GLM-130B (Tsinghua)</td><td align="center">2 month</td><td align="center">96 NVIDIA DGX-A100 (8*40G) GPU nodes</td><td align="center"></td><td align="center"></td><td align="center">130B</td><td align="center">2.3T (raw data)</td><td align="center">Transformer (like GLM)</td></tr></tbody></table><h4 id="1-1-1-大模型研究进展更新"><a href="#1-1-1-大模型研究进展更新" class="headerlink" title="1.1.1 大模型研究进展更新"></a>1.1.1 大模型研究进展更新</h4><h5 id="LLaMA-Meta-AI-2023-02-23"><a href="#LLaMA-Meta-AI-2023-02-23" class="headerlink" title="LLaMA [Meta AI, 2023.02.23]"></a>LLaMA [Meta AI, 2023.02.23]</h5><ol><li><p><strong>模型亮点</strong></p><ul><li><p>着眼于<strong>高效推理</strong>的大模型,以<strong>小规模参数</strong>、<strong>大规模</strong> <strong>token</strong>,借助<strong>公开</strong>数据集,使用<strong>常规优化器</strong>训练的<strong>开源</strong>大模型</p></li><li><p>在不同规模的参数量上训练不同规模的模型</p><ul><li><p>LLaMA-13B 的性能优于 GPT3-130B,但仅有约0.1倍参数量</p></li><li><p>LLaMA-65B 的性能比肩 Chinchilla-70B、PaLM-540B</p></li></ul></li><li><p>重点评估了模型的 biases and toxicity</p></li></ul></li><li><p><strong>模型架构</strong></p><ul><li><p>骨架模型:Transformer</p></li><li><p>结构修改:</p><ul><li><p>[GPT3] 在每个 Transformer 子层,normalize (RMSNorm) 输入,而非输出</p></li><li><p>[PaLM] 用 SwiGLU 取代 ReLU;记 attention 层维度为 $d_{model}=d_{head}\times n_{head}$,feed-forward 层维度为 $d_{ff}=\frac{2}{3} 4d_{model}$ ,而非 $4d_{model}$</p></li><li><p>[GPTNeo] Embedding 方法采用 rotary positional embeddings (RoPE),而非absolute positional embeddings</p></li></ul></li></ul></li><li><p><strong>训练、推理空间</strong></p><ul><li><p>训练:2048 A100 GPU with 80GB of RAM</p></li><li><p>推理:LLaMA-13B可推理于<strong>一张</strong> V100 GPU</p></li></ul></li></ol><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20230301154352127.png" alt="Model size" style="zoom: 33%;" /><h4 id="1-1-2-大模型基础架构"><a href="#1-1-2-大模型基础架构" class="headerlink" title="1.1.2 大模型基础架构"></a>1.1.2 大模型基础架构</h4><p>目前在NLP领域被成功训练并大规模应用的模型,都是基于Transformer的self-attention架构的:</p><ol><li><p>Autoregressive(仅包含decoder):自回归模型的代表是GPT。本质上是一个从左到右的语言模型,训练目标是从左到右的文本生成。</p><ul><li>常用于<strong>无条件长文本生成</strong>(对话生成、故事生成等),但缺点是单向注意力机制,不利于NLU(自然语言理解)任务。</li></ul></li><li><p>Autoencoding(仅包含encoder):代表模型是BERT、ALBERT、DeBERTa 。自编码模型是通过去噪任务(如利用掩码语言模型)学习双向的上下文编码器,训练目标是对文本进行随机掩码,然后预测被掩码的词。</p><ul><li>常用于<strong>自然语言理解</strong>(事实推断、语法分析、分类等),缺点是不能直接用于文本生成。</li></ul></li><li><p>Encoder-decoder(完整的Transformer结构):代表模型是T5、BART。包含一个编码器和一个解码器,接受一段文本,从左到右的生成另一段文本。</p><ul><li>常用于<strong>有条件的生成任务</strong>(摘要生成、对话等)。缺点是比BERT-based模型在同性能下需要更多参数。</li></ul></li><li><p>Hybird-model:GLM</p></li></ol><p>还有一些模型结合了transformer-based模型,以及其他模型,用于改善transformer缺乏长期记忆的缺点。</p><ul><li><p>与GNN结合: CogQA [1]</p></li><li><p>与knowledge graph结合: OAG-BERT [2] </p></li></ul><blockquote><p>[1] Ding, M., Zhou, C., Chen, Q., Yang, H., & Tang, J. (2019, July). Cognitive Graph for Multi-Hop Reading Comprehension at Scale. In <em>Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</em> (pp. 
<h4 id="1-1-2-大模型基础架构"><a href="#1-1-2-大模型基础架构" class="headerlink" title="1.1.2 Basic architectures of large models"></a>1.1.2 Basic architectures of large models</h4><p>The models that have so far been successfully trained and deployed at scale in NLP are all built on the Transformer's self-attention architecture:</p><ol><li><p>Autoregressive (decoder only): the representative model is GPT. It is essentially a left-to-right language model whose training objective is left-to-right text generation.</p><ul><li>Commonly used for <strong>unconditional long-text generation</strong> (dialogue generation, story generation, etc.); the drawback is the unidirectional attention mechanism, which is ill-suited to NLU (natural language understanding) tasks.</li></ul></li><li><p>Autoencoding (encoder only): representative models are BERT, ALBERT, and DeBERTa. An autoencoding model learns a bidirectional context encoder through a denoising task (e.g., masked language modeling): tokens are masked at random and the model is trained to predict them.</p><ul><li>Commonly used for <strong>natural language understanding</strong> (fact inference, syntactic analysis, classification, etc.); the drawback is that it cannot be used directly for text generation.</li></ul></li><li><p>Encoder-decoder (the complete Transformer): representative models are T5 and BART. An encoder plus a decoder that takes a piece of text and generates another piece from left to right.</p><ul><li>Commonly used for <strong>conditional generation tasks</strong> (summarization, dialogue, etc.). The drawback is requiring more parameters than BERT-based models at the same performance.</li></ul></li><li><p>Hybrid model: GLM</p></li></ol><p>Some models additionally combine Transformer-based models with other model families to compensate for the Transformer's lack of long-term memory.</p><ul><li><p>Combined with GNNs: CogQA [1]</p></li><li><p>Combined with knowledge graphs: OAG-BERT [2]</p></li></ul><blockquote><p>[1] Ding, M., Zhou, C., Chen, Q., Yang, H., & Tang, J. (2019, July). Cognitive Graph for Multi-Hop Reading Comprehension at Scale. In <em>Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</em> (pp. 2694-2703).</p><p>[2] Liu, X., Yin, D., Zhang, X., Su, K., Wu, K., Yang, H., & Tang, J. (2021). OAG-BERT: Pre-train heterogeneous entity-augmented academic language models. <em>arXiv preprint arXiv:2103.02410</em>.</p></blockquote><h3 id="1-2-大模型复杂度分析"><a href="#1-2-大模型复杂度分析" class="headerlink" title="1.2 Complexity analysis of large models"></a>1.2 Complexity analysis of large models</h3><p>A saved deep-learning model contains the exact values of all trainable variables. Below, taking the sparse transformer as an example, we analyze its space and computational complexity.</p><ul><li><p>The hidden dimension of self-attention is $d_{model}$ and the number of heads is $n_{head}$, so each head has dimension $d_{head}=d_{model}/n_{head}$. The hidden dimension of the feed-forward layer is $d_{ff}$.</p></li><li><p>Denote the input to one sparse-transformer layer discussed below by $\mathbf{X} \in \mathbb{R}^{N \times d_{model}}$, where $N$ is the length of the input sentence.</p></li><li><p>Denote the number of self-attention layers by $n_{layer}$.</p></li></ul><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221121144017190.png" alt="Complexity Analysis"></p>
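<p>As a quick sanity check on how such dimensions translate into model size, the sketch below counts the dominant parameter blocks of a plain dense Transformer stack. It ignores biases, layer norms, and positional embeddings; the GPT-2-small-like dimensions are used only as a familiar reference point and are not taken from the table above.</p><pre><code class="python">
# Back-of-the-envelope parameter count from the dimensions defined above.
# Biases, layer norms, and positional embeddings are ignored.
d_model, n_head, d_ff = 768, 12, 3072     # GPT-2-small-like dimensions
n_layer, vocab_size = 12, 50257

attn_params = 4 * d_model * d_model       # W_Q, W_K, W_V, W_O
ffn_params = 2 * d_model * d_ff           # the two feed-forward maps
embed_params = vocab_size * d_model       # token embedding table

total = n_layer * (attn_params + ffn_params) + embed_params
print(f"approx. {total / 1e6:.0f}M parameters")   # ~124M, GPT-2-small scale
</code></pre>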
<h2 id="2-How-to-train-a-big-model-from-zero-to-one"><a href="#2-How-to-train-a-big-model-from-zero-to-one" class="headerlink" title="2. How to train a big model, from zero to one"></a>2. How to train a big model, from zero to one</h2><p>To build a large model in a specific domain, one first needs to</p><ol><li><p>Determine the downstream tasks</p><ul><li><p>The downstream tasks determine the pre-training tasks: a pre-training task should be challenging and close to the downstream tasks. Analyze whether the downstream tasks mainly involve token-wise or sentence-wise relationships, and choose pre-training tasks accordingly.</p></li><li><p>The downstream tasks determine the model backbone: does the model emphasize NLU or NLG?</p></li></ul></li><li><p>Determine the model backbone: choose a suitable framework from the autoencoding / autoregressive / encoder-decoder families</p></li><li><p>Determine the model's pre-training tasks</p></li><li><p>Starting from the downstream and pre-training tasks, clean and prepare the corpus</p></li></ol><p>Once these elements are fixed, a specific-domain foundation model still faces the problem that "the vocabulary differs from the general domain": some words may be missing from a general corpus, or be ambiguous in it, so the model's word-embedding layer must be replaced with a specific-domain vocabulary. As shown in the figure below, the embeddings are learned with skip-gram (a minimal sketch follows the reference below).</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221123161546624.png" alt="" style="zoom: 33%;" /><p><strong>Main reference</strong><br>Kalyan, K. S., Rajasekharan, A., & Sangeetha, S. (2021). AMMUS: A survey of transformer-based pretrained models in natural language processing. <em>arXiv preprint arXiv:2108.05542</em>.</p>
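<p>A minimal sketch of the vocabulary step with gensim's Word2Vec, where <code>sg=1</code> selects the skip-gram objective. The toy "domain corpus" is invented for illustration; a real one would consist of tokenized in-domain documents.</p><pre><code class="python">
from gensim.models import Word2Vec

# Toy in-domain corpus (invented); each sentence is a list of tokens.
corpus = [
    ["cell", "handover", "rsrp", "degraded"],
    ["cell", "throughput", "rsrp", "interference"],
    ["handover", "failure", "interference"],
]

# sg=1 selects skip-gram (predict context words from the center word).
model = Word2Vec(corpus, vector_size=32, window=2, min_count=1, sg=1, epochs=50)
print(model.wv.most_similar("rsrp", topn=2))
</code></pre>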
<h2 id="3-Related-work-after-2020"><a href="#3-Related-work-after-2020" class="headerlink" title="3. Related work, after 2020"></a>3. Related work, after 2020</h2><p>Listed here are training algorithms for big models, solutions for handling very large parameter scales, and training methods for distributed data and environments. Papers annotated with a venue name are from top conferences or the leading conferences of their fields.</p><ul><li><p>[ACL2022] BMInf: An Efficient Toolkit for Big Model Inference and Tuning</p><p>Xu Han; Guoyang Zeng; Weilin Zhao; Zhiyuan Liu; Zhengyan Zhang; Jie Zhou; Jun Zhang; Jia Chao; Maosong Sun</p></li><li><p>[KDD2022] Beyond Traditional Characterizations in the Age of Data: Big Models, Scalable Algorithms, and Meaningful Solutions</p><p>Shang-Hua Teng</p></li><li><p>[NIPS2022] Contrastive Adapters for Foundation Model Group Robustness</p><p>Michael Zhang; Christopher Re</p></li><li><p>[NIPS2022] Decentralized Training of Foundation Models in Heterogeneous Environments</p><p>Binhang Yuan; Yongjun He; Jared Quincy Davis; Tianyi Zhang; Tri Dao; Beidi Chen; Percy Liang; Christopher Re; Ce Zhang</p></li><li><p>[ICML2021] PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models</p><p>Chaoyang He; Shen Li; Mahdi Soltanolkotabi; Salman Avestimehr</p></li><li><p>[IJCAI2022] Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning</p><p>Yae Jee Cho; Andre Manoel; Gauri Joshi; Robert Sim; Dimitrios Dimitriadis</p></li><li><p>[MLSYS2021] Pipelined Backpropagation at Scale: Training Large Models without Batches</p><p>Atli Kosson; Vitaliy Chiley; Abhinav Venigalla; Joel Hestness; Urs Koster</p></li></ul><h3 id="3-1-大模型在垂直领域(定义在文本领域)的构建"><a href="#3-1-大模型在垂直领域(定义在文本领域)的构建" class="headerlink" title="3.1 Building large models in vertical domains (defined over text)"></a>3.1 Building large models in vertical domains (defined over text)</h3><ul><li><p>[RECSYS2021] Large-Scale Modeling of Mobile User Click Behaviors Using Deep Learning</p><p>Xin Zhou; Yang Li</p></li><li><p>Lewis, P., Ott, M., Du, J., & Stoyanov, V. (2020, November). Pretrained language models for biomedical and clinical tasks: Understanding and extending the state-of-the-art. In <em>Proceedings of the 3rd Clinical Natural Language Processing Workshop</em> (pp. 146-157).</p></li><li><p>Xiao, C., Hu, X., Liu, Z., Tu, C., & Sun, M. (2021). Lawformer: A pre-trained language model for chinese legal long documents. <em>AI Open</em>, <em>2</em>, 79-84. [Long Chinese legal documents; tasks include legal judgment prediction, similar-case retrieval, legal reading comprehension, and legal question answering]</p></li><li><p>Beltagy, I., Lo, K., & Cohan, A. (2019). SciBERT: A pretrained language model for scientific text. <em>arXiv preprint arXiv:1903.10676</em>. [A language model for scientific text]</p></li><li><p>Kierszbaum, S., Klein, T., & Lapasset, L. (2022). ASRS-CMFS vs. RoBERTa: Comparing Two Pre-Trained Language Models to Predict Anomalies in Aviation Occurrence Reports with a Low Volume of In-Domain Data Available. <em>Aerospace</em>, <em>9</em>(10), 591. [Aviation occurrence reports; the downstream task is multi-class classification of anomaly types]</p></li><li><p>Shen, J. T., Yamashita, M., Prihar, E., Heffernan, N., Wu, X., Graff, B., & Lee, D. (2021). MathBERT: A pre-trained language model for general nlp tasks in mathematics education. <em>arXiv preprint arXiv:2106.07340</em>. [A language model for text in mathematics education]</p></li></ul><h3 id="3-2-大模型在垂直领域(定义在物理世界)的构建"><a href="#3-2-大模型在垂直领域(定义在物理世界)的构建" class="headerlink" title="3.2 Building large models in vertical domains (defined over the physical world)"></a>3.2 Building large models in vertical domains (defined over the physical world)</h3><ul><li>Zheng, Z., Lu, X. Z., Chen, K. Y., Zhou, Y. C., & Lin, J. R. (2022). Pretrained domain-specific language model for natural language processing tasks in the AEC domain. <em>Computers in Industry</em>, <em>142</em>, 103733. [A language model for the building-construction standards domain]</li><li>Zhou, Y. C., Zheng, Z., Lin, J. R., & Lu, X. Z. (2022). Integrating NLP and context-free grammar for complex rule interpretation towards automated compliance checking. <em>Computers in Industry</em>, <em>142</em>, 103746. [A continuation of the previous paper: rules are extracted from complex compliance standards for compliance checking]</li><li>Webersinke, N., Kraus, M., Bingler, J. A., & Leippold, M. (2021). ClimateBERT: A pretrained language model for climate-related text. <em>arXiv preprint arXiv:2110.12010</em>. [A language model for climate data; an interesting example is <em>Fact-Checking</em>: given a piece of evidence, the model judges which claim the evidence supports]</li><li>Berquand, A., Darm, P., & Riccardi, A. (2021). SpaceTransformers: language modeling for space systems. <em>IEEE Access</em>, <em>9</em>, 133111-133122. [Language models for space systems, built from space standards, with concept recognition as the final evaluation task; that task should be regarded as the basic task for the norms/standards category]</li></ul><p>There is further work on large models for multimodal data and on security issues of large models (backdoor attacks, etc.); it is not listed here.</p>]]></content>
<tags>
<tag> survey </tag>
<tag> foundation models </tag>
<tag> big models </tag>
<tag> NLP </tag>
</tags>
</entry>
<entry>
<title>Markov Switching Model &amp; Markov Chain Monte Carlo</title>
<link href="/uncategorized/notes/mcmc/"/>
<url>/uncategorized/notes/mcmc/</url>
<content type="html"><![CDATA[<p>撰写本篇笔记的动机来源于一个真实场景中的项目需求——<strong>用户行为模式研究</strong>。我们总希望能够使用数学的语言来描述用户行为,以便能够推断用户的未来动态,用于优化系统。因此本文首先将给出上述的motivation scenario,然后介绍用于建模场景中用户行为的数学模型——Markov switching Model,以及原因。最后介绍该建模方法背后的数学工具——Markov Chain Monte Carlo。</p><span id="more"></span><h2 id="目录"><a href="#目录" class="headerlink" title="目录"></a>目录</h2><ul><li>Motivation</li><li>Regime Switching Model & Markov Switching Model</li><li>Markov Chain Monte Carlo (MCMC)</li><li>案例分析与实现<ul><li>开源代码调研</li><li>案例分析</li></ul></li></ul><h2 id="1-Motivation"><a href="#1-Motivation" class="headerlink" title="1. Motivation"></a>1. Motivation</h2><p>本文的示例场景来源于“复杂移动接入网中,面向用户的体验优化与网络问题发现”。</p><h3 id="1-1-示例场景"><a href="#1-1-示例场景" class="headerlink" title="1.1 示例场景"></a>1.1 示例场景</h3><p>设有用户A,该用户在移动接入网中的行为模式可以概括为以下几点:</p><ol><li>用户A可能会随节假日/工作日的规划,接入到不同区域的网络中,而这些区域的基站由于所处位置不同,所能提供的网络性能也有差异。例如:大型居民区/城市热点区域的基站一般具有较大的容量(重服务),而郊区/或非城市主干区域则相反(重覆盖)。因此用户在上述区域内体验到的网络服务质量也有所差异。</li><li>用户A在节假日或工作日内,将随机接入用户所在地附近的小区,这些小区将综合自身的服务能力、用户的业务质量要求、用户数据量,提供差异化的网络服务。用户在此过程中可感知到的网络服务水平指标通常由“速率(DLUserThrpAvgwithoutLastTTI(Mbps))”刻画。</li></ol><h3 id="1-2-目标"><a href="#1-2-目标" class="headerlink" title="1.2 目标"></a>1.2 目标</h3><p>分析用户的行为特征,目标在于借助单个用户的体验反馈,找到(频繁)影响用户网络体验的<strong>网络问题</strong>。这里,网络问题主要指“接入网”中可能存在的故障、告警、不恰当的组网策略等。</p><p>网络问题常常表征在<strong>用户级KPI的变化</strong>上,因此,分析用户级KPI上的<strong>异常链路</strong>,是帮助我们发现网络问题的必经途径。这里,异常链路指一系列时间、空间、因果上具有相关关系的指标异常。例如:用户TA异常——用户RSRP异常——MAC层指标异常——用户体验异常。在分析过程中,我们不仅<strong>关注用户的体验异常</strong>,也关注<strong>数传指标的异常</strong>,因为这两者都代表了潜在的网络问题。</p><p>下图展示了在同一小区内,用户级指标之间的机理,其中左框中为资源型指标,右框中为数传指标。实线表示指标之间的相互影响关系,虚线则表示“数传指标差时,小区将调用更多的资源来完成用户的请求”。待发现的异常链路,即为下图中的KPI节点所连接起来的一条路径。</p><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20220527195755138.png" alt="image-20220527195628244" width="70%" /></div><h3 id="1-3-建模思路"><a href="#1-3-建模思路" class="headerlink" title="1.3 建模思路"></a>1.3 建模思路</h3><p>如<a href="#%E7%A4%BA%E4%BE%8B%E5%9C%BA%E6%99%AF">2.1节</a>所述,用户的行为模式依照时间的粒度可以分为两种模式:</p><ul><li>粗粒度模式:用户的行为模式可能随时在”节假日模式“与”工作模式“中切换。在每种模式在,用户的业务需求、业务类型、数据类型将有较大的差异。</li><li>细粒度模式:用户在每种粗粒度模式下享受到的网络服务,也将受小区状态、小区能力、昼夜的影响,有较大的差异。</li></ul><p>因此,本研究使用马尔可夫机制转换模型(Markov regime switching model)与马尔可夫链模型(Markov Chain)来分别建模用户的粗粒度模式与细粒度模式。其中,用户的模式切换,以及所享受到的网络服务,都被视为随机变量,粗粒度模式被视为一种regime,其转换受Markov Chain所驱动。用户体验到的网络服务则被视为由参数不同的另一个Markov Chain驱动的随机变量。</p><h2 id="2-Regime-Switching-Model-amp-Markov-Switching-Model"><a href="#2-Regime-Switching-Model-amp-Markov-Switching-Model" class="headerlink" title="2. Regime Switching Model & Markov Switching Model"></a>2. 
<h2 id="2-Regime-Switching-Model-amp-Markov-Switching-Model"><a href="#2-Regime-Switching-Model-amp-Markov-Switching-Model" class="headerlink" title="2. Regime Switching Model & Markov Switching Model"></a>2. Regime Switching Model & Markov Switching Model</h2><h3 id="2-1-Why-regime-switching-model"><a href="#2-1-Why-regime-switching-model" class="headerlink" title="2.1 Why regime switching model"></a>2.1 Why regime switching model</h3><p>Regime switching models are a common class of models for non-stationary, non-linear time series. The regimes are hard to observe and are driven by a stochastic process, while the time series within each regime can be viewed as an independent stochastic process with its own parameters. Regime switching models are often used to model the effects of cyclical changes in economics.</p><p>When a single stochastic process (take a Markov process as an example) is used to describe how the data evolve over a period of time, its transition probability matrix is usually held fixed. This assumption is not always valid for the problems we meet in real-world data, because:</p><ul><li>real time-series data may exhibit different characteristics, such as mean and variance, in different periods;</li><li>for time series with period-dependent characteristics, the model parameters (state space, transition matrix, etc.) may differ across those periods, as with a stock market in calm versus oscillating phases (wide or narrow oscillation).</li></ul><p>The <strong>regime switching model</strong> can therefore be regarded as <strong>the theoretical model closest to the actual problem</strong>: an approximation of the real system, and a theoretical model that can guide the design of solutions to practical problems.</p><h3 id="2-2-Application-of-regime-switching-model-in-wireless-access-network"><a href="#2-2-Application-of-regime-switching-model-in-wireless-access-network" class="headerlink" title="2.2 Application of regime switching model in wireless access network"></a>2.2 Application of regime switching model in wireless access network</h3><p>In wireless networks, system KPIs change in different, <strong>non-stationary</strong> patterns because the environment and base-station configuration differ across cells; in addition, user <strong>mobility</strong> causes cell handovers, and cells themselves <strong>switch among states</strong>. Compared with traditional models, the <strong>regime switching model</strong> captures non-stationary time series and the real behavior of real-world data better through the following assumptions:</p><ul><li>the data are described as belonging to distinct, recurring states (regimes);</li><li>the characteristics of the time series (mean, variance, model parameters, etc.) are allowed to change across regimes;</li><li>in any given period the series may be in any regime and may transition to a different one.</li></ul><h3 id="2-3-Markov-Switching-Model"><a href="#2-3-Markov-Switching-Model" class="headerlink" title="2.3 Markov Switching Model"></a>2.3 Markov Switching Model</h3><p>From the previous subsection, a regime switching model is likewise driven by a stochastic process; when that process is a Markov chain, we call the model a Markov switching model.</p><p>In summary, in this case (user behavior modeling) we use Markov chains at two levels to model user behavior from user-level CHR data:</p><ul><li>Markov Chain 1: drives the random switching of the regime.</li><li>Markov Chain 2: drives the changes of user experience and data-transmission KPIs within a regime.</li></ul>
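<p>To make the two-level construction concrete, here is a minimal simulation sketch. The regime labels, transition matrices, service states, and throughput means are all invented for illustration; nothing here is estimated from real CHR data.</p><pre><code class="python">
import numpy as np

# Two-level model from Section 2.3: Markov Chain 1 switches the regime
# (workday/holiday); Markov Chain 2, whose parameters depend on the
# regime, drives the user's service state; throughput is emitted from
# the (regime, service state) pair. All numbers are made up.
rng = np.random.default_rng(1)

P_regime = np.array([[0.95, 0.05],      # workday -> workday/holiday
                     [0.10, 0.90]])     # holiday -> workday/holiday
P_service = np.array([[[0.90, 0.10],    # workday: good/degraded service
                       [0.30, 0.70]],
                      [[0.70, 0.30],    # holiday: good/degraded service
                       [0.50, 0.50]]])
mean_thrpt = np.array([[80.0, 20.0],    # Mbps per (regime, service state)
                       [50.0, 10.0]])

regime, service, obs = 0, 0, []
for t in range(1000):
    regime = rng.choice(2, p=P_regime[regime])
    service = rng.choice(2, p=P_service[regime, service])
    obs.append(rng.normal(mean_thrpt[regime, service], 5.0))
print("simulated mean throughput (Mbps):", round(float(np.mean(obs)), 1))
</code></pre>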
<h2 id="3-Markov-Chain-Monte-Carlo-MCMC"><a href="#3-Markov-Chain-Monte-Carlo-MCMC" class="headerlink" title="3. Markov Chain Monte Carlo (MCMC)"></a>3. Markov Chain Monte Carlo (MCMC)</h2><p>MCMC is composed of two MCs: the Monte Carlo method (Monte Carlo simulation, MC) and the Markov chain (also abbreviated MC). The purpose of an MCMC algorithm is to <strong>estimate a posterior distribution in probability space through random sampling</strong>.</p><p>Three concepts arising in practical problems need to be clarified first: the prior distribution, the likelihood distribution, and the posterior distribution.</p><ul><li>Prior: the prior distribution represents the assumption people make about the data distribution before knowing the true data. It is also called the belief distribution, because it states what one believes the data distribution to be in the absence of the true data.</li><li>Likelihood: the likelihood distribution gives, for known observed data, the probability of each observation appearing; it summarizes the statistics of the data observed so far. The related maximum likelihood estimation answers the question: which parameter values are most likely to have produced the data we have already observed? This question is posed without reference to any prior distribution.</li><li>Posterior: the posterior distribution is the end goal of Bayesian analysis. It combines the prior (prior belief) with the likelihood (actual observation) to infer the true distribution of the data.</li></ul><p>Reference <a href="#ref1">[1]</a> gives a concise example of the relationship among the three.</p><blockquote><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20220527204614539.png" alt="image-20220527204614539" width="60%" /></div><p>In the figure above, the red curve is the posterior distribution. You can think of it as a kind of average of the prior and the likelihood. Because the prior is short and spread out, it reflects that this person is rather unsure about the average human height; the likelihood, in contrast, summarizes the data within a fairly narrow range, so it is more certain about the true parameter value.</p><p>When the prior and the likelihood are combined, the data (represented by the likelihood) dominate the weak prior belief of the individual who grew up among giants. Although that individual still believes the average human height is slightly greater than what the data tell him, he mostly believes the data.</p></blockquote><h3 id="3-1-目标"><a href="#3-1-目标" class="headerlink" title="3.1 Goal"></a>3.1 Goal</h3><p>With these concepts reviewed, we can easily state how the MCMC algorithm serves the modeling of user behavior patterns.</p><p>MCMC solves the posterior-estimation problem along the following line of reasoning:</p><ol><li>For complex problems we adopt the <strong>Monte Carlo method</strong>: a random-sampling strategy to estimate the posterior distribution $p(x)$ of the samples <a href="#ref_liu_montecarlo">[2]</a>.</li><li>For a complex distribution $p(x)$ we often cannot sample directly, so we introduce <strong>accept-reject sampling</strong>: choose an easy-to-sample proposal distribution $q(x)$ (typically Gaussian) and reject some samples according to a rule, so that the accepted samples approximate the distribution $p(x)$ <a href="#ref_liu_montecarlo">[2]</a>.</li><li>For high-dimensional distributions we can generally obtain only conditional distributions rather than the joint distribution, in which case accept-reject sampling is not applicable; moreover, finding a suitable $q(x)$ in high dimensions is itself hard <a href="#ref_liu_montecarlo">[2]</a>.</li><li>If, however, the posterior to be modeled is a stochastic process and that process is a <strong>Markov chain</strong>, the properties of the transition matrix $\mathbf{P}$ allow us to <strong>start from an arbitrary initial distribution and, after $n$ state transitions, arrive at the converged stationary posterior distribution</strong> <a href="#ref_liu_markov">[3]</a>.</li><li>Yet given a stationary distribution, the transition matrix $\mathbf{P}$ of the Markov chain is still unknown; <strong>MCMC sampling</strong> makes the target matrix $\mathbf{P}(i,j)$ obtainable from an arbitrary Markov transition matrix $\mathbf{Q}$ multiplied by $\alpha(i,j)$, i.e., $\mathbf{P}(i,j)=\mathbf{Q}(i,j)\alpha(i,j)$. $\alpha(i,j)$ is called the acceptance rate; it takes values in $[0,1]$ and can be understood as a probability <a href="#ref_liu_mcmc">[4]</a>.</li><li>Subsequently, <strong>Metropolis-Hastings sampling</strong> and <strong>Gibbs sampling</strong> are algorithmic improvements over plain MCMC sampling regarding computational cost and scalability to high-dimensional distributions. Note that Gibbs sampling has a clear advantage with high-dimensional features, so MCMC sampling in the usual sense uses Gibbs sampling. Gibbs sampling evolved from M-H sampling and requires the data to have at least two dimensions; one-dimensional distributions cannot be sampled with Gibbs, and there M-H sampling still holds <a href="#ref_liu_mcmc">[4]</a><a href="#ref_liu_gibbis">[5]</a> (a minimal M-H sketch is given at the end of this section).</li></ol><p>The goal of modeling user behavior patterns is to predict the probability that a user is in some state at an arbitrary time. If the user's pattern can be represented by a Markov chain, the user's state $s(t)$ at time $t$ depends only on the transition matrix $\mathbf{P}$ and the state $s(t-1)$ at the previous time step. Inferring the transition matrix $\mathbf{P}$ is therefore the key foundation of user behavior modeling. Looking back at the MCMC outline above, MCMC sampling solves exactly the problem of estimating the transition matrix $\mathbf{P}$; hence the MCMC algorithm can be decomposed and applied to our actual user-behavior-modeling problem.</p><p>[Temporary notes]</p><p>From which angles can regimes be constructed?</p><ol><li>From the user-experience angle: by identifying the regime that the user-experience KPIs belong to, we can analyze at which times a user had a poor network experience; after identifying the cells that delivered the poor experience, we may be able to construct cross-cell anomaly chains within those cells.</li><li>From the cell-serving-capability angle: using the cells' historical performance records, cells with similar performance are grouped into one regime.<ol><li>First, anomaly detection can be run within cells of the same regime, which avoids normal KPIs of small-capacity cells being flagged as anomalies, and avoids anomalies of large-capacity cells being missed.</li><li>This approach filters out poor experience caused merely by the designed cell capacity and focuses on network-problem discovery: anomalies detected within the same regime are more indicative of genuine network problems.</li></ol></li></ol><h3 id="3-2-Monte-Carlo法"><a href="#3-2-Monte-Carlo法" class="headerlink" title="3.2 The Monte Carlo method"></a>3.2 The Monte Carlo method</h3><p>[To be continued]</p><h3 id="3-3-Markov-Chain-amp-状态转移矩阵的性质"><a href="#3-3-Markov-Chain-amp-状态转移矩阵的性质" class="headerlink" title="3.3 Markov Chain & properties of the transition matrix"></a>3.3 Markov Chain & properties of the transition matrix</h3><p>[To be continued]</p><h3 id="3-4-MCMC采样"><a href="#3-4-MCMC采样" class="headerlink" title="3.4 MCMC sampling"></a>3.4 MCMC sampling</h3><p>[To be continued]</p>
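<p>Since the three subsections above are still placeholders, here is a stop-gap illustration of the sampling loop summarized in Section 3.1: a minimal Metropolis-Hastings sketch. The unnormalized target density and the proposal scale are arbitrary demonstration choices.</p><pre><code class="python">
import numpy as np

# Minimal Metropolis-Hastings sketch: sample an unnormalized 1-D target
# p(x) proportional to exp(-x**4 + x**2), using a symmetric Gaussian
# random-walk proposal q (its terms cancel in the acceptance rate).
rng = np.random.default_rng(0)

def log_p(x):                   # log of the unnormalized target density
    return -x**4 + x**2

x, chain = 0.0, []
for _ in range(50_000):
    x_prop = x + rng.normal(scale=1.0)     # draw from q(x_prop | x)
    # Accept with probability alpha = min(1, p(x_prop) / p(x)).
    if log_p(x_prop) - log_p(x) > np.log(rng.uniform()):
        x = x_prop
    chain.append(x)

chain = np.array(chain[5_000:])  # drop burn-in taken before convergence
print("sample mean:", chain.mean(), "sample std:", chain.std())
</code></pre>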
id="4-案例分析与实现"><a href="#4-案例分析与实现" class="headerlink" title="4. 案例分析与实现"></a>4. 案例分析与实现</h2><h3 id="4-1-开源代码调研"><a href="#4-1-开源代码调研" class="headerlink" title="4.1 开源代码调研"></a>4.1 开源代码调研</h3><ol><li><strong>statsmodels.tsa</strong>(时间序列分析包,github 7k stars)<a href="#ref_tsa">[6]</a></li></ol><div align="left"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20220530095613227.png" alt="image-20220530095613227" width="80%" /></div><p>关键参数说明:</p><ul><li><code>k_regimes</code>: 设定的regime个数,作为超参数;</li></ul><ol start="2"><li><strong>Python版 regime-switching model</strong>(github 9 stars)<a href="#ref_py_regime">[7]</a></li></ol><div align="left"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20220530095819553.png" alt="image-20220530095819553" width="40%" /></div><p>关键参数说明:</p><ul><li><code>n_components</code>: 设定的regime个数,作为超参数;</li></ul><ol start="3"><li><strong>Matlab版 regime-switching model(</strong>github 36 stars)</li></ol><div align="left"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20220530100045914.png" alt="image-20220530100045914" width="80%" /></div><p>特性说明:</p><ul><li>支持单维和高维时间序列的建模;</li><li>可选择模型中的哪些参数随时间切换状态;</li><li>支持任意数量的regime设置与可解释变量;</li></ul><p>本节鸣谢Tongji DNA Lab的Chengbo Qiu同学。</p><h3 id="4-2-案例分析"><a href="#4-2-案例分析" class="headerlink" title="4.2 案例分析"></a>4.2 案例分析</h3><p>[To be continued]</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><p><a id="ref1">[1]</a> <a href="https://zhuanlan.zhihu.com/p/32982140">https://zhuanlan.zhihu.com/p/32982140</a></p><p><a id="ref_liu_montecarlo">[2]</a> <a href="https://www.cnblogs.com/pinard/p/6625739.html">https://www.cnblogs.com/pinard/p/6625739.html</a></p><p><a id="ref_liu_markov">[3]</a> <a href="https://www.cnblogs.com/pinard/p/6632399.html">https://www.cnblogs.com/pinard/p/6632399.html</a></p><p><a id="ref_liu_mcmc">[4]</a> <a href="https://www.cnblogs.com/pinard/p/6638955.html">https://www.cnblogs.com/pinard/p/6638955.html</a></p><p><a id="ref_liu_gibbis">[5]</a> <a href="https://www.cnblogs.com/pinard/p/6645766.html">https://www.cnblogs.com/pinard/p/6645766.html</a></p><p><a id="ref_tsa">[6]</a> Kim, Chang-Jin, and Charles R. Nelson. 1999. “State-Space Models with Regime Switching: Classical and Gibbs-Sampling Approaches with Applications”. MIT Press Books. The MIT Press.</p><p><a id="ref_py_regime">[7]</a> Ma, Ying, Leonard MacLean, Kuan Xu, and Yonggan Zhao. “A portfolio optimization model with regime-switching risk factors for sector exchange traded funds.” Pac J Optim 7, no. 2 (2011): 281-296.</p>]]></content>
<tags>
<tag> note </tag>
<tag> Bayesian </tag>
<tag> Monte Carlo </tag>
<tag> Markov </tag>
</tags>
</entry>
<entry>
<title>Probabilistic Graphical Model</title>
<link href="/uncategorized/notes/graph_model_probability/"/>
<url>/uncategorized/notes/graph_model_probability/</url>
<content type="html"><![CDATA[<p>本文主要梳理建立在概率图模型上的多种概念,及其相关的算法。</p><span id="more"></span><h2 id="目录"><a href="#目录" class="headerlink" title="目录"></a>目录</h2><p>目前包含的概念/算法如下:</p><p><strong>概念</strong>:</p><ul><li>概率图模型<ul><li>贝叶斯网络 (Bayes Model) </li><li>马尔可夫随机场 (Markov Random Field) </li><li>因子图 (Factor Graph) </li></ul></li></ul><p><strong>算法</strong>:</p><ul><li>消息传递算法 (message passing algorithm) <ul><li>置信传播 (Belief propagation) </li></ul></li></ul><h2 id="1-概率图模型"><a href="#1-概率图模型" class="headerlink" title="1. 概率图模型"></a>1. 概率图模型</h2><p>概率图模型 (Probabilistic Graphical Model) 是一种表示随机变量之间条件依赖的结构。使用概率图模型的动机在于——利用基于图的表示对多维空间上的概率分布进行编码。对于特定分布中的一组独立性 (independency) ,图结构是这种独立性的<strong>紧凑</strong>表示、或<strong>分解</strong>表示<a href="#graphical_model">[1]</a>。</p><p>即:</p><ul><li>对于一个多元的分布,可以通过不同的诱导方式将其表征为图模型。</li><li>一个多元分布可以根据已有的概率图模型上的连接,做关系分解。</li></ul><p>概率图模型的<strong>分类</strong>如下 (尚不完整) :</p><ul><li>有向图模型<ul><li>有向无环图 (贝叶斯网络,Bayesian Network) </li><li>循环有向图模型</li></ul></li><li>无向图模型 (随机场模型) <ul><li>马尔可夫随机场 (Markov Random Field) </li><li>条件随机场 (Conditional Random fields) </li></ul></li><li>因子图 (Factor Graph) </li></ul><h3 id="1-1-贝叶斯网络-Bayesian-Model"><a href="#1-1-贝叶斯网络-Bayesian-Model" class="headerlink" title="1.1 贝叶斯网络 (Bayesian Model)"></a>1.1 贝叶斯网络 (Bayesian Model)</h3><p><strong>又名</strong>:Bayes network, Bayes net, belief network, or decision network</p><p><strong>贝叶斯网络</strong>是一种概率图形模型,它通过<strong>有向无环图</strong> (Directed Acyclic Graph) 表示一组变量及其条件依赖关系。</p><p>贝叶斯网络非常适合用于获取已发生的事件并预测几种可能的已知原因中的任何一种是促成因素的可能性。例如,贝叶斯网络可以表示疾病和症状之间的概率关系。给定症状,该网络可用于计算各种疾病存在的概率<a href="#bayesian_model">[2]</a>。</p><p>在形式上,贝叶斯网络是有向无环图,其节点与边的定义如下:</p><p><strong>节点</strong>:代表概率意义上的变量,每个节点都与一个概率函数相关联 (以父节点变量为条件/输入,该节点为输出) 。变量类型包括</p><ul><li>可观察的变量</li><li>潜在变量</li><li>未知的参数或者假设</li></ul><p><strong>边</strong>:代表条件依赖,未被连接的节点表示<strong>条件独立</strong>的变量 (两节点之间不存在任意路径) </p><h3 id="1-2-马尔可夫随机场-Markov-Random-Field"><a href="#1-2-马尔可夫随机场-Markov-Random-Field" class="headerlink" title="1.2 马尔可夫随机场 (Markov Random Field)"></a>1.2 马尔可夫随机场 (Markov Random Field)</h3><p><strong>又名</strong>:Markov Network</p><p><strong>马尔可夫随机场</strong>是一组具有由无向图描述的,具有<strong>马尔可夫性质</strong>的随机变量。换句话说,如果随机场满足马尔可夫性质,则称其为马尔可夫随机场。</p><p>马尔可夫网络或 MRF 在其依赖关系的表示方面类似于贝叶斯网络;区别在于贝叶斯网络是有向无环的,而马尔可夫网络是<strong>无向</strong>的并且<strong>可能是循环的</strong>。因此,马尔可夫网络可以表示贝叶斯网络不能表示的某些依赖关系 (例如循环依赖关系,cyclic dependencies) ;另一方面,它不能表示贝叶斯网络可以表示的某些依赖关系 (例如诱导依赖关系,induced dependencies) 。马尔可夫随机场的基础图可能是<strong>有限的</strong>或<strong>无限的</strong>。</p><h4 id="1-2-1-定义"><a href="#1-2-1-定义" class="headerlink" title="1.2.1 定义"></a>1.2.1 定义</h4><ol><li><p>马尔可夫性质</p><p> 给定一个无向图$G=(V,E)$,一系列随机变量$X = (X_v)_{v \in V}$,当随机变量满足以下性质时,我们称他们相对于$G$形成了一个马尔可夫随机场。</p><ol><li><strong>成对马尔可夫性质</strong> (Pairwise Markov property) :给定所有的其他变量,任何两个在图上不相邻的变量都是条件独立的。即:$X_{u} \perp X_{v} \mid X_{V \backslash{u, v}}$</li><li><strong>局部马尔可夫性质</strong> (Local Markov property) :一个变量在给定其邻居的情况下,条件独立于所有其他变量。即:$X_{v} \perp X_{V \backslash \mathrm{N}[v]} \mid X_{\mathrm{N}(v)}$,$\mathrm{N}[v]$是$v$的邻居集合,$\mathrm{N}[v]=v \cup \mathrm{N}(v)$是节点$v$的closed neighborhood。</li><li><strong>全局马尔可夫性质</strong> (Global Markov property) :给定一个分离子集,任何两个变量子集都是条件独立的。即:$X_{A} \perp X_{B} \mid X_{S}$,其任意从$A$中的节点到$B$中的节点的路径,都经过子集$S$。</li></ol><p> <strong>属性强弱</strong>:Global Markov property > Local Markov property > Pairwise Markov property。上述三个马尔可夫性质对于正分布 (那些只为相关变量分配非零概率的分布) 是等价的。</p></li><li><p>随机场:当给图中的每一个变量,都按照某种分布随机赋予相空间 (phase space) 的一个值之后,其构成的全体就叫做随机场。</p></li></ol><h3 id="1-3-因子图-Factor-Graph"><a href="#1-3-因子图-Factor-Graph" 
class="headerlink" title="1.3 因子图 (Factor Graph)"></a>1.3 因子图 (Factor Graph)</h3><p>因子图 (Factor Graph) 是表示函数分解的二分图。在概率论及其应用中,因子图用于表示<strong>概率分布函数的因式分解</strong>,从而实现高效计算,例如通过sum-product 算法计算边缘分布。</p><h4 id="1-3-1-定义"><a href="#1-3-1-定义" class="headerlink" title="1.3.1 定义"></a>1.3.1 定义</h4><p>因子图是一种二部图,用于表示联合概率分布的函数分解,以如下的联合概率为例:</p><p>$$g\left(X_{1}, X_{2}, \ldots, X_{n}\right)=\prod_{j=1}^{m} f_{j}\left(S_{j}\right)$$</p>其中$g\left(X_{1}, X_{2}, \ldots, X_{n}\right)$是联合概率分布,默认为实值函数。$S_j\subseteq\left\{X_{1}, X_{2}, \ldots, X_{n}\right\}$,即$S_j$是变量节点的一个子集。<p>上式对应的因子图表示为$G = (X, F, E)$,其中$X={X_1, X_2, \cdots, X_n}$,表示变量节点构成的集合;$F={f_1, f_2, \cdots f_m}$表示因子节点构成的集合。因子图上的边$E$定义在因子节点与变量节点之间,即因子节点$f_j$与变量节点$X_k, X_k \in S_j$之间存在一条无向的边。</p><p><strong>举例</strong>:</p><div width="100%" align="center"> <svg width="35%" viewBox="0 0 396 388" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"> <g id="页面-1" stroke="none" stroke-width="1" fill="none" fill-rule="evenodd"> <g id="A4" transform="translate(-104.000000, -240.000000)"> <line x1="166.5" y1="596.5" x2="265.5" y2="596.5" id="直线" stroke="#979797" stroke-width="2" stroke-linecap="square"></line> <line x1="333.5" y1="595.5" x2="435.5" y2="595.5" id="直线" stroke="#979797" stroke-width="2" stroke-linecap="square"></line> <line x1="466.5" y1="473.5" x2="466.5" y2="565.5" id="直线" stroke="#979797" stroke-width="2" stroke-linecap="square"></line> <line x1="466.5" y1="303.5" x2="466.5" y2="407.5" id="直线" stroke="#979797" stroke-width="2" stroke-linecap="square"></line> <line x1="333.5" y1="272.5" x2="435.5" y2="272.5" id="直线" stroke="#979797" stroke-width="2" stroke-linecap="square"></line> <line x1="320.5" y1="416.5" x2="435.5" y2="272.5" id="直线" stroke="#979797" stroke-width="2" stroke-linecap="square"></line> <line x1="320.5" y1="465.5" x2="435.5" y2="595.5" id="直线" stroke="#979797" stroke-width="2" stroke-linecap="square"></line> <g id="f1" opacity="0.597795759" transform="translate(265.000000, 240.000000)" stroke="#000000"> <ellipse id="椭圆形" stroke-width="2" cx="34" cy="32.5" rx="33" ry="31.5"></ellipse> <g id="编组" transform="translate(34.000000, 33.000000) scale(-1, 1) rotate(-180.000000) translate(-34.000000, -33.000000) translate(21.000000, 17.000000)" fill="#000000" fill-rule="nonzero"> <path d="M2.22285853,1.51208791 C2.26990316,1.51208791 2.34047009,1.48864469 2.43455934,1.44175824 C2.52864859,1.39487179 2.65802131,1.35970696 2.8226775,1.33626374 C2.98733369,1.31282051 3.12846756,1.3010989 3.24607912,1.3010989 C3.55186919,1.3010989 3.83413694,1.45347985 4.09288237,1.75824176 C4.35162781,2.06300366 4.53980631,2.4029304 4.65741787,2.77802198 C4.892641,3.41098901 5.29252031,5.2043956 5.85705581,8.15824176 C6.42159131,11.1120879 6.9626045,13.9252747 7.48009537,16.5978022 C7.99758625,19.2703297 8.25633169,20.618315 8.25633169,20.6417582 L8.25633169,20.7472527 L6.59800866,20.7472527 C5.49245997,20.7472527 4.90440216,20.770696 4.83383522,20.8175824 C4.73974597,20.8879121 4.69270134,21.0051282 4.69270134,21.1692308 L4.93968562,22.1538462 C4.98673025,22.2710623 5.0808195,22.3296703 5.22195337,22.3296703 C5.36308725,22.3296703 5.91586159,22.3413919 6.88027641,22.3648352 C8.00934741,22.3648352 8.57388291,22.3765568 8.57388291,22.4 C8.57388291,22.4468864 8.67973331,23.032967 8.89143412,24.1582418 C9.10313494,25.2835165 9.24426881,25.96337 9.31483575,26.1978022 C10.1851613,30.0659341 11.9022901,32 14.4662222,32 C15.3365477,31.9531136 16.0539783,31.6952381 16.6185138,31.2263736 C17.1830493,30.7575092 
17.465317,30.1362637 17.465317,29.3626374 C17.465317,28.5186813 17.2183327,27.9091575 16.7243642,27.5340659 C16.2303956,27.1589744 15.7364271,26.959707 15.2424585,26.9362637 C14.2074767,26.9362637 13.6899859,27.4285714 13.6899859,28.4131868 C13.6899859,28.8586081 13.8193586,29.2454212 14.078104,29.5736264 C14.3368495,29.9018315 14.6191172,30.1479853 14.9249073,30.3120879 L15.2424585,30.4879121 C14.8661015,30.6285714 14.5132668,30.6989011 14.1839544,30.6989011 C13.9016867,30.6989011 13.6311801,30.581685 13.3724347,30.3472527 C13.1136892,30.1128205 12.9372719,29.7846154 12.8431826,29.3626374 C12.6785264,28.6827839 12.4903479,27.7684982 12.2786471,26.6197802 C12.0669463,25.4710623 11.8787678,24.4981685 11.7141116,23.7010989 C11.5494554,22.9040293 11.4671273,22.4820513 11.4671273,22.4351648 C11.4671273,22.3882784 12.1139909,22.3648352 13.4077181,22.3648352 C14.4662222,22.3648352 15.0778023,22.3531136 15.2424585,22.3296703 C15.4071147,22.3062271 15.5364874,22.2358974 15.6305767,22.1186813 C15.654099,22.0249084 15.6305767,21.825641 15.5600097,21.5208791 C15.4894428,21.2161172 15.430637,21.0285714 15.3835924,20.9582418 C15.3365477,20.8410256 15.2306973,20.7824176 15.0660412,20.7824176 C14.901385,20.7824176 14.2662825,20.770696 13.1607338,20.7472527 L11.1848596,20.7472527 L10.3733398,16.4571429 C9.24426881,10.6432234 8.45627134,7.00952381 8.00934741,5.55604396 C7.37424497,3.56336996 6.55096403,2.08644689 5.53950459,1.12527473 C4.66917903,0.375091575 3.81061462,0 2.96381137,0 C2.21109737,0 1.52895031,0.222710623 0.917370187,0.668131868 C0.305790062,1.11355311 -4.54133108e-15,1.74652015 -4.54133108e-15,2.56703297 C-4.54133108e-15,3.43443223 0.246984281,4.06739927 0.740952844,4.46593407 C1.23492141,4.86446886 1.72888997,5.06373626 2.22285853,5.06373626 C3.25784028,5.06373626 3.77533116,4.57142857 3.77533116,3.58681319 C3.77533116,3.14139194 3.64595844,2.75457875 3.387213,2.42637363 C3.12846756,2.0981685 2.84619981,1.85201465 2.54040975,1.68791209 L2.22285853,1.51208791 Z" id="路径"></path> <g transform="translate(17.418778, 1.934066)" id="路径" stroke-width="0.707"> <path d="M3.24290361,14.3699692 L2.91861325,14.2456615 C2.6857894,14.1627897 2.3531839,14.0799179 1.92079676,13.9970462 C1.48840961,13.9141744 1.00613163,13.8561641 0.473962836,13.8230154 L1.59951051e-14,13.8230154 L1.59951051e-14,14.9666462 L0.473962836,14.9666462 C1.25558576,14.9997949 1.97900272,15.1241026 2.64421371,15.3395692 C3.30942471,15.5550359 3.77507241,15.7539282 4.04115681,15.9362462 C4.30724121,16.1185641 4.54006506,16.3008821 4.73962836,16.4832 C4.77288891,16.5329231 4.87267056,16.5577846 5.03897331,16.5577846 C5.18864578,16.5577846 5.33000312,16.5080615 5.46304532,16.4086154 L5.46304532,8.97501538 L5.48799073,1.51655385 C5.60440265,1.40053333 5.7041843,1.32594872 5.78733568,1.2928 C5.87048705,1.25965128 6.07005035,1.22650256 6.38602558,1.19335385 C6.7020008,1.16020513 7.21753932,1.14363077 7.93264114,1.14363077 L8.58122187,1.14363077 L8.58122187,0 L8.30682233,0 C7.95758656,0.0497230769 6.6354797,0.0745846154 4.34050176,0.0745846154 C2.07878437,0.0745846154 0.773307784,0.0497230769 0.424072011,0 L0.124727062,0 L0.124727062,1.14363077 L0.773307784,1.14363077 C1.13917383,1.14363077 1.45514906,1.14363077 1.72123346,1.14363077 C1.98731785,1.14363077 2.19519629,1.15191795 2.34486877,1.16849231 C2.49454124,1.18506667 2.62758344,1.20992821 2.74399536,1.24307692 C2.86040729,1.27622564 2.92692839,1.2928 2.94355866,1.2928 C2.96018894,1.2928 3.01007976,1.3342359 3.09323114,1.41710769 C3.17638251,1.49997949 3.22627334,1.53312821 
3.24290361,1.51655385 L3.24290361,14.3699692 Z"></path> </g> </g> </g> <g id="f2" opacity="0.597795759" transform="translate(265.000000, 408.000000)" stroke="#000000"> <ellipse id="椭圆形" stroke-width="2" cx="34" cy="32.5" rx="33" ry="31.5"></ellipse> <g id="编组" transform="translate(34.500000, 32.000000) scale(-1, 1) rotate(-180.000000) translate(-34.500000, -32.000000) translate(21.000000, 16.000000)" fill="#000000" fill-rule="nonzero"> <path d="M2.26063635,1.51208791 C2.30848051,1.51208791 2.38024674,1.48864469 2.47593505,1.44175824 C2.57162337,1.39487179 2.70319479,1.35970696 2.87064934,1.33626374 C3.03810388,1.31282051 3.18163635,1.3010989 3.30124674,1.3010989 C3.61223375,1.3010989 3.89929868,1.45347985 4.16244154,1.75824176 C4.4255844,2.06300366 4.61696102,2.4029304 4.73657141,2.77802198 C4.97579219,3.41098901 5.38246751,5.2043956 5.95659738,8.15824176 C6.53072724,11.1120879 7.08093503,13.9252747 7.60722075,16.5978022 C8.13350646,19.2703297 8.39664931,20.618315 8.39664931,20.6417582 L8.39664931,20.7472527 L6.71014283,20.7472527 C5.58580517,20.7472527 4.98775323,20.770696 4.91598699,20.8175824 C4.82029868,20.8879121 4.77245452,21.0051282 4.77245452,21.1692308 L5.02363634,22.1538462 C5.0714805,22.2710623 5.16716881,22.3296703 5.31070128,22.3296703 C5.45423374,22.3296703 6.01640257,22.3413919 6.99720776,22.3648352 C8.1454675,22.3648352 8.71959736,22.3765568 8.71959736,22.4 C8.71959736,22.4468864 8.82724672,23.032967 9.04254542,24.1582418 C9.25784412,25.2835165 9.40137658,25.96337 9.47314282,26.1978022 C10.3582597,30.0659341 12.1045714,32 14.7120779,32 C15.5971947,31.9531136 16.3268181,31.6952381 16.900948,31.2263736 C17.4750778,30.7575092 17.7621428,30.1362637 17.7621428,29.3626374 C17.7621428,28.5186813 17.510961,27.9091575 17.0085973,27.5340659 C16.5062337,27.1589744 16.0038701,26.959707 15.5015064,26.9362637 C14.448935,26.9362637 13.9226493,27.4285714 13.9226493,28.4131868 C13.9226493,28.8586081 14.0542207,29.2454212 14.3173636,29.5736264 C14.5805064,29.9018315 14.8675714,30.1479853 15.1785584,30.3120879 L15.5015064,30.4879121 C15.1187532,30.6285714 14.759922,30.6989011 14.4250129,30.6989011 C14.137948,30.6989011 13.8628441,30.581685 13.5997012,30.3472527 C13.3365584,30.1128205 13.1571428,29.7846154 13.0614545,29.3626374 C12.8939999,28.6827839 12.7026233,27.7684982 12.4873246,26.6197802 C12.2720259,25.4710623 12.0806493,24.4981685 11.9131948,23.7010989 C11.7457402,22.9040293 11.6620129,22.4820513 11.6620129,22.4351648 C11.6620129,22.3882784 12.3198701,22.3648352 13.6355844,22.3648352 C14.7120779,22.3648352 15.3340519,22.3531136 15.5015064,22.3296703 C15.668961,22.3062271 15.8005324,22.2358974 15.8962207,22.1186813 C15.9201428,22.0249084 15.8962207,21.825641 15.8244545,21.5208791 C15.7526882,21.2161172 15.692883,21.0285714 15.6450389,20.9582418 C15.5971947,20.8410256 15.4895454,20.7824176 15.3220908,20.7824176 C15.1546363,20.7824176 14.5087402,20.770696 13.3844025,20.7472527 L11.374948,20.7472527 L10.5496363,16.4571429 C9.40137658,10.6432234 8.59998698,7.00952381 8.1454675,5.55604396 C7.4995714,3.56336996 6.66229867,2.08644689 5.63364933,1.12527473 C4.74853245,0.375091575 3.87537661,0 3.01418181,0 C2.24867531,0 1.55493506,0.222710623 0.932961035,0.668131868 C0.310987012,1.11355311 -4.37171275e-15,1.74652015 -4.37171275e-15,2.56703297 C-4.37171275e-15,3.43443223 0.251181817,4.06739927 0.753545451,4.46593407 C1.25590909,4.86446886 1.75827272,5.06373626 2.26063635,5.06373626 C3.31320778,5.06373626 3.83949349,4.57142857 3.83949349,3.58681319 C3.83949349,3.14139194 3.70792206,2.75457875 
3.44477921,2.42637363 C3.18163635,2.0981685 2.89457142,1.85201465 2.5835844,1.68791209 L2.26063635,1.51208791 Z" id="路径"></path> <g transform="translate(16.877624, 1.934066)" id="路径" stroke-width="0.707"> <path d="M1.49679245,10.6656 C1.0401439,10.6656 0.676516361,10.8147692 0.405909816,11.1131077 C0.135303272,11.4114462 2.54964879e-16,11.7760821 2.54964879e-16,12.2070154 C2.54964879e-16,13.3837949 0.448192089,14.4031179 1.34457627,15.2649846 C2.24096044,16.1268513 3.35721244,16.5577846 4.69333225,16.5577846 C6.23240697,16.5577846 7.51778806,16.0937026 8.54947551,15.1655385 C9.58116296,14.2373744 10.1054631,13.0357333 10.122376,11.5606154 C10.122376,10.8479179 9.95324696,10.1683692 9.61498878,9.52196923 C9.2767306,8.87556923 8.87082078,8.31204103 8.39725933,7.83138462 C7.92369787,7.35072821 7.24718151,6.73747692 6.36771025,5.99163077 C5.75884552,5.47782564 4.91320007,4.71540513 3.83077389,3.70436923 L2.33398144,2.31212308 L4.26205307,2.28726154 C6.91737979,2.28726154 8.32960769,2.32869744 8.49873678,2.41156923 C8.61712714,2.44471795 8.82008205,3.18227692 9.10760151,4.62424615 L9.10760151,4.69883077 L10.122376,4.69883077 L10.122376,4.62424615 C10.1054631,4.57452308 9.99552923,3.82038974 9.79257432,2.36184615 C9.58961941,0.903302564 9.4627726,0.140882051 9.41203387,0.0745846154 L9.41203387,0 L2.54964879e-16,0 L2.54964879e-16,0.472369231 L2.54964879e-16,0.770707692 C2.54964879e-16,0.886728205 0.0507387271,1.0110359 0.152216181,1.14363077 C0.253693635,1.27622564 0.507387271,1.56627692 0.913297087,2.01378462 C1.40377145,2.5441641 1.82659417,3.00824615 2.18176526,3.40603077 C2.33398144,3.57177436 2.6215009,3.8784 3.04432362,4.32590769 C3.46714635,4.77341538 3.7546658,5.08004103 3.90688198,5.24578462 C4.05909816,5.41152821 4.30433535,5.68500513 4.64259353,6.06621538 C4.98085171,6.44742564 5.21763243,6.72918974 5.3529357,6.91150769 C5.48823898,7.09382564 5.68273743,7.34244103 5.93643107,7.65735385 C6.1901247,7.97226667 6.36771025,8.23745641 6.4691877,8.45292308 C6.57066515,8.66838974 6.69751197,8.90871795 6.84972815,9.17390769 C7.00194433,9.43909744 7.11187824,9.70428718 7.17952988,9.96947692 C7.24718151,10.2346667 7.3063767,10.4832821 7.35711542,10.7153231 C7.40785415,10.9473641 7.43322351,11.220841 7.43322351,11.5357538 C7.43322351,12.5799385 7.14570406,13.483241 6.57066515,14.2456615 C5.99562625,15.0080821 5.17535016,15.3892923 4.10983689,15.3892923 C3.55171089,15.3892923 3.06123653,15.2484103 2.63841381,14.9666462 C2.21559108,14.6848821 1.91961517,14.4114051 1.75048608,14.1462154 C1.58135699,13.8810256 1.49679245,13.7235692 1.49679245,13.6738462 C1.49679245,13.6572718 1.53907472,13.6489846 1.62363927,13.6489846 C1.92807163,13.6489846 2.24096044,13.5329641 2.56230572,13.3009231 C2.88365099,13.0688821 3.04432362,12.6876718 3.04432362,12.1572923 C3.04432362,11.7429333 2.90902035,11.3948718 2.63841381,11.1131077 C2.36780726,10.8313436 1.98726681,10.6821744 1.49679245,10.6656 Z"></path> </g> </g> </g> <g id="f3" opacity="0.597795759" transform="translate(432.000000, 408.000000)" stroke="#000000"> <ellipse id="椭圆形" stroke-width="2" cx="34" cy="32.5" rx="33" ry="31.5"></ellipse> <g id="编组" transform="translate(34.500000, 32.000000) scale(-1, 1) rotate(-180.000000) translate(-34.500000, -32.000000) translate(21.000000, 16.000000)" fill="#000000" fill-rule="nonzero"> <path d="M2.24377027,1.51208791 C2.29125747,1.51208791 2.36248828,1.48864469 2.45746268,1.44175824 C2.55243708,1.39487179 2.68302689,1.35970696 2.84923209,1.33626374 C3.0154373,1.31282051 3.1578989,1.3010989 3.27661691,1.3010989 
C3.58528372,1.3010989 3.87020693,1.45347985 4.13138653,1.75824176 C4.39256614,2.06300366 4.58251495,2.4029304 4.70123295,2.77802198 C4.93866896,3.41098901 5.34231017,5.2043956 5.91215659,8.15824176 C6.48200301,11.1120879 7.02810583,13.9252747 7.55046504,16.5978022 C8.07282426,19.2703297 8.33400387,20.618315 8.33400387,20.6417582 L8.33400387,20.7472527 L6.66008002,20.7472527 C5.54413078,20.7472527 4.95054076,20.770696 4.87930996,20.8175824 C4.78433556,20.8879121 4.73684835,21.0051282 4.73684835,21.1692308 L4.98615616,22.1538462 C5.03364336,22.2710623 5.12861777,22.3296703 5.27107937,22.3296703 C5.41354098,22.3296703 5.97151559,22.3413919 6.94500323,22.3648352 C8.08469606,22.3648352 8.65454248,22.3765568 8.65454248,22.4 C8.65454248,22.4468864 8.76138868,23.032967 8.97508109,24.1582418 C9.1887735,25.2835165 9.3312351,25.96337 9.4024659,26.1978022 C10.2809791,30.0659341 12.014262,32 14.6023145,32 C15.4808277,31.9531136 16.2050075,31.6952381 16.7748539,31.2263736 C17.3447004,30.7575092 17.6296236,30.1362637 17.6296236,29.3626374 C17.6296236,28.5186813 17.3803158,27.9091575 16.8817001,27.5340659 C16.3830845,27.1589744 15.8844689,26.959707 15.3858533,26.9362637 C14.3411349,26.9362637 13.8187756,27.4285714 13.8187756,28.4131868 C13.8187756,28.8586081 13.9493655,29.2454212 14.2105451,29.5736264 C14.4717247,29.9018315 14.7566479,30.1479853 15.0653147,30.3120879 L15.3858533,30.4879121 C15.0059557,30.6285714 14.6498017,30.6989011 14.3173913,30.6989011 C14.0324681,30.6989011 13.7594166,30.581685 13.498237,30.3472527 C13.2370574,30.1128205 13.0589804,29.7846154 12.964006,29.3626374 C12.7978008,28.6827839 12.607852,27.7684982 12.3941596,26.6197802 C12.1804672,25.4710623 11.9905184,24.4981685 11.8243132,23.7010989 C11.658108,22.9040293 11.5750054,22.4820513 11.5750054,22.4351648 C11.5750054,22.3882784 12.2279544,22.3648352 13.5338524,22.3648352 C14.6023145,22.3648352 15.2196481,22.3531136 15.3858533,22.3296703 C15.5520585,22.3062271 15.6826483,22.2358974 15.7776227,22.1186813 C15.8013663,22.0249084 15.7776227,21.825641 15.7063919,21.5208791 C15.6351611,21.2161172 15.5758021,21.0285714 15.5283149,20.9582418 C15.4808277,20.8410256 15.3739815,20.7824176 15.2077763,20.7824176 C15.0415711,20.7824176 14.4004939,20.770696 13.2845446,20.7472527 L11.2900822,20.7472527 L10.4709279,16.4571429 C9.3312351,10.6432234 8.53582448,7.00952381 8.08469606,5.55604396 C7.44361884,3.56336996 6.61259281,2.08644689 5.59161798,1.12527473 C4.71310475,0.375091575 3.84646332,0 2.9916937,0 C2.23189847,0 1.54333405,0.222710623 0.92600043,0.668131868 C0.30866681,1.11355311 -1.45051597e-17,1.74652015 -1.45051597e-17,2.56703297 C-1.45051597e-17,3.43443223 0.249307808,4.06739927 0.747923424,4.46593407 C1.24653904,4.86446886 1.74515466,5.06373626 2.24377027,5.06373626 C3.28848871,5.06373626 3.81084792,4.57142857 3.81084792,3.58681319 C3.81084792,3.14139194 3.68025812,2.75457875 3.41907851,2.42637363 C3.1578989,2.0981685 2.87297569,1.85201465 2.56430888,1.68791209 L2.24377027,1.51208791 Z" id="路径"></path> <g transform="translate(16.550263, 1.387112)" id="路径" stroke-width="0.707"> <path d="M2.14030753,12.0578462 C1.68706594,12.0578462 1.3345447,12.1987282 1.08274381,12.4804923 C0.830942924,12.7622564 0.696649118,13.1268923 0.679862393,13.5744 C0.679862393,14.4859897 1.08274381,15.2981333 1.88850665,16.0108308 C2.69426948,16.7235282 3.66789958,17.0798769 4.80939693,17.0798769 C5.39693233,17.0798769 5.76624029,17.0715897 5.91732082,17.0550154 C7.15953853,16.8561231 8.10798853,16.4169026 8.76267084,15.7373538 C9.41735314,15.0578051 
9.75308766,14.3202462 9.76987438,13.5246769 C9.76987438,12.6628103 9.48450005,11.8258051 8.91375137,11.0136615 C8.34300269,10.2015179 7.55402658,9.61312821 6.54682304,9.24849231 L6.47128277,9.19876923 C6.47128277,9.18219487 6.54682304,9.15733333 6.69790357,9.12418462 C6.8489841,9.0910359 7.09239163,9.01645128 7.42812614,8.90043077 C7.76386066,8.78441026 8.08280845,8.61037949 8.38496951,8.37833846 C9.76148102,7.49989744 10.4497368,6.33969231 10.4497368,4.89772308 C10.4497368,3.58834872 9.92934828,2.44471795 8.88857128,1.46683077 C7.84779429,0.48894359 6.53003631,1.24930591e-16 4.93529737,1.24930591e-16 C3.59235931,1.24930591e-16 2.43407523,0.356348718 1.46044514,1.06904615 C0.486815047,1.78174359 3.79593969e-16,2.68504615 3.79593969e-16,3.77895385 C3.79593969e-16,4.2430359 0.151080532,4.61595897 0.453241595,4.89772308 C0.755402658,5.17948718 1.13310399,5.32865641 1.58634558,5.34523077 C2.0563739,5.34523077 2.4424686,5.19606154 2.74462966,4.89772308 C3.04679072,4.59938462 3.19787125,4.22646154 3.19787125,3.77895385 C3.19787125,3.5966359 3.17269117,3.43089231 3.12233099,3.28172308 C3.07197081,3.13255385 3.01321727,2.99995897 2.94607037,2.88393846 C2.87892347,2.76791795 2.78659647,2.66847179 2.66908939,2.5856 C2.55158231,2.50272821 2.45086196,2.43643077 2.36692833,2.38670769 C2.2829947,2.33698462 2.19906107,2.3038359 2.11512744,2.28726154 C2.03119382,2.27068718 1.96404691,2.24582564 1.91368673,2.21267692 L1.81296638,2.18781538 C2.66908939,1.44196923 3.70986639,1.06904615 4.93529737,1.06904615 C5.85856728,1.06904615 6.5552164,1.50826667 7.02524472,2.38670769 C7.31061906,2.93366154 7.45330623,3.77066667 7.45330623,4.89772308 L7.45330623,5.39495385 C7.45330623,6.96951795 6.91613101,8.02198974 5.84178056,8.55236923 C5.58997967,8.65181538 5.07798454,8.70982564 4.30579515,8.7264 L3.24823143,8.75126154 L3.17269117,8.80098462 C3.13911771,8.85070769 3.12233099,8.98330256 3.12233099,9.19876923 C3.12233099,9.49710769 3.18947789,9.64627692 3.3237717,9.64627692 C3.79380002,9.64627692 4.28061506,9.68771282 4.78421684,9.77058462 C5.35496551,9.85345641 5.87535401,10.2015179 6.34538233,10.8147692 C6.81541065,11.4280205 7.05042481,12.3561846 7.05042481,13.5992615 L7.05042481,13.7981538 C7.05042481,14.7428923 6.75665711,15.3975795 6.16912171,15.7622154 C5.79981374,15.9942564 5.40532569,16.1102769 4.98565755,16.1102769 C4.44848232,16.1102769 3.95327391,16.0191179 3.50003232,15.8368 C3.04679072,15.6544821 2.72784293,15.4638769 2.54318895,15.2649846 C2.35853497,15.0660923 2.26620798,14.9666462 2.26620798,14.9666462 L2.34174824,14.9666462 C2.39210842,14.9500718 2.45925532,14.9334974 2.54318895,14.9169231 C2.62712258,14.9003487 2.71105621,14.8589128 2.79498984,14.7926154 C2.87892347,14.7263179 2.97964382,14.6683077 3.0971509,14.6185846 C3.21465798,14.5688615 3.29859161,14.4777026 3.34895179,14.3451077 C3.39931196,14.2125128 3.46645887,14.0882051 3.55039249,13.9721846 C3.63432612,13.8561641 3.65950621,13.6904205 3.62593276,13.4749538 C3.62593276,13.1103179 3.50842568,12.7871179 3.27341152,12.5053538 C3.03839736,12.2235897 2.66069603,12.0744205 2.14030753,12.0578462 Z"></path> </g> </g> </g> <g id="f4" opacity="0.597795759" transform="translate(265.000000, 563.000000)" stroke="#000000"> <ellipse id="椭圆形" stroke-width="2" cx="34" cy="32.5" rx="33" ry="31.5"></ellipse> <g id="编组" transform="translate(33.500000, 33.000000) scale(-1, 1) rotate(-180.000000) translate(-33.500000, -33.000000) translate(20.000000, 17.000000)" fill="#000000" fill-rule="nonzero"> <path d="M2.2148524,1.51208791 C2.26172758,1.51208791 
2.33204036,1.48864469 2.42579073,1.44175824 C2.51954109,1.39487179 2.64844785,1.35970696 2.81251099,1.33626374 C2.97657413,1.31282051 3.11719968,1.3010989 3.23438763,1.3010989 C3.53907632,1.3010989 3.82032742,1.45347985 4.07814093,1.75824176 C4.33595444,2.06300366 4.52345517,2.4029304 4.64064313,2.77802198 C4.87501904,3.41098901 5.2734581,5.2043956 5.8359603,8.15824176 C6.39846249,11.1120879 6.9375271,13.9252747 7.45315411,16.5978022 C7.96878113,19.2703297 8.22659464,20.618315 8.22659464,20.6417582 L8.22659464,20.7472527 L6.57424443,20.7472527 C5.47267763,20.7472527 4.88673784,20.770696 4.81642506,20.8175824 C4.7226747,20.8879121 4.67579951,21.0051282 4.67579951,21.1692308 L4.92189423,22.1538462 C4.96876941,22.2710623 5.06251978,22.3296703 5.20314532,22.3296703 C5.34377087,22.3296703 5.89455428,22.3413919 6.85549553,22.3648352 C7.98049992,22.3648352 8.54300212,22.3765568 8.54300212,22.4 C8.54300212,22.4468864 8.64847128,23.032967 8.85940961,24.1582418 C9.07034793,25.2835165 9.21097348,25.96337 9.28128626,26.1978022 C10.1484771,30.0659341 11.8594213,32 14.4141188,32 C15.2813097,31.9531136 15.9961562,31.6952381 16.5586584,31.2263736 C17.1211606,30.7575092 17.4024117,30.1362637 17.4024117,29.3626374 C17.4024117,28.5186813 17.156317,27.9091575 16.6641276,27.5340659 C16.1719382,27.1589744 15.6797487,26.959707 15.1875593,26.9362637 C14.1563053,26.9362637 13.6406783,27.4285714 13.6406783,28.4131868 C13.6406783,28.8586081 13.769585,29.2454212 14.0273985,29.5736264 C14.2852121,29.9018315 14.5664632,30.1479853 14.8711518,30.3120879 L15.1875593,30.4879121 C14.8125579,30.6285714 14.460994,30.6989011 14.1328677,30.6989011 C13.8516166,30.6989011 13.5820843,30.581685 13.3242708,30.3472527 C13.0664573,30.1128205 12.8906754,29.7846154 12.796925,29.3626374 C12.6328618,28.6827839 12.4453611,27.7684982 12.2344228,26.6197802 C12.0234845,25.4710623 11.8359837,24.4981685 11.6719206,23.7010989 C11.5078575,22.9040293 11.4258259,22.4820513 11.4258259,22.4351648 C11.4258259,22.3882784 12.0703596,22.3648352 13.3594272,22.3648352 C14.4141188,22.3648352 15.0234962,22.3531136 15.1875593,22.3296703 C15.3516225,22.3062271 15.4805292,22.2358974 15.5742796,22.1186813 C15.5977172,22.0249084 15.5742796,21.825641 15.5039668,21.5208791 C15.433654,21.2161172 15.3750601,21.0285714 15.3281849,20.9582418 C15.2813097,20.8410256 15.1758405,20.7824176 15.0117774,20.7824176 C14.8477142,20.7824176 14.2148993,20.770696 13.1133325,20.7472527 L11.1445748,20.7472527 L10.3359779,16.4571429 C9.21097348,10.6432234 8.42581416,7.00952381 7.98049992,5.55604396 C7.34768495,3.56336996 6.52736925,2.08644689 5.51955281,1.12527473 C4.65236192,0.375091575 3.79688983,0 2.95313654,0 C2.20313361,0 1.52344345,0.222710623 0.914066071,0.668131868 C0.30468869,1.11355311 -7.41304549e-15,1.74652015 -7.41304549e-15,2.56703297 C-7.41304549e-15,3.43443223 0.246094711,4.06739927 0.738284134,4.46593407 C1.23047356,4.86446886 1.72266298,5.06373626 2.2148524,5.06373626 C3.24610643,5.06373626 3.76173344,4.57142857 3.76173344,3.58681319 C3.76173344,3.14139194 3.63282669,2.75457875 3.37501318,2.42637363 C3.11719968,2.0981685 2.83594858,1.85201465 2.53125989,1.68791209 L2.2148524,1.51208791 Z" id="路径"></path> <g transform="translate(15.988984, 1.934066)" id="形状" stroke-width="0.707"> <path d="M10.7873156,0 C10.4890488,0.0497230769 9.42025945,0.0745846154 7.58094758,0.0745846154 C5.65878382,0.0745846154 4.54856855,0.0497230769 4.25030176,0 L4.02660167,0 L4.02660167,1.14363077 L4.79712421,1.14363077 C5.12853175,1.14363077 5.35223184,1.14363077 5.46822449,1.14363077 
C5.58421713,1.14363077 5.72506533,1.16020513 5.8907691,1.19335385 C6.05647288,1.22650256 6.17246552,1.26793846 6.23874703,1.31766154 C6.30502854,1.36738462 6.36302486,1.43368205 6.41273599,1.51655385 C6.42930636,1.54970256 6.43759155,1.99721026 6.43759155,2.85907692 L6.43759155,4.10215385 L-7.17664205e-15,4.10215385 L-7.17664205e-15,5.24578462 L3.75319044,10.9888 C6.28845816,14.8340513 7.57266239,16.7649641 7.60580315,16.7815385 C7.6389439,16.8146872 7.81293286,16.8312615 8.12777003,16.8312615 L8.57517022,16.8312615 L8.72430361,16.6820923 L8.72430361,5.24578462 L11.0110157,5.24578462 L11.0110157,4.10215385 L8.72430361,4.10215385 L8.72430361,2.83421538 C8.72430361,2.15466667 8.72430361,1.75688205 8.72430361,1.64086154 C8.72430361,1.52484103 8.77401474,1.42539487 8.87343701,1.34252308 C9.00600002,1.22650256 9.47825577,1.16020513 10.2902043,1.14363077 L11.0110157,1.14363077 L11.0110157,0 L10.7873156,0 Z M6.58672495,5.24578462 L6.58672495,13.5495385 L1.14335603,5.27064615 L3.85261271,5.24578462 L6.58672495,5.24578462 Z"></path> </g> </g> </g> <g id="x1" opacity="0.597795759" transform="translate(435.000000, 241.000000)" stroke="#000000"> <rect id="矩形" stroke-width="2" x="1" y="1" width="60" height="60"></rect> <g id="编组" transform="translate(30.500000, 30.000000) scale(-1, 1) rotate(-180.000000) translate(-30.500000, -30.000000) translate(9.000000, 14.000000)" fill="#000000" fill-rule="nonzero"> <path d="M0.62325107,5.76230492 L0.545344686,5.76230492 C0.181781562,5.76230492 0,5.90316126 0,6.18487395 C0,6.28731493 0.0389531919,6.49219688 0.116859576,6.79951981 C0.220734754,7.15806323 0.311625535,7.36294518 0.389531919,7.41416567 C0.467438302,7.46538615 0.714141851,7.50380152 1.12964256,7.52941176 C3.3629589,7.6062425 5.11585253,8.27210884 6.38832346,9.5270108 C6.64801141,9.78311325 8.15420149,11.3965586 10.9068937,14.3673469 C13.6595859,17.3381353 15.0229477,18.8491397 14.9969789,18.9003601 C12.1663803,26.020008 10.686159,29.6438575 10.556315,29.7719088 C10.3225958,30.0536214 9.50457881,30.2072829 8.10226391,30.2328932 L7.01157453,30.2328932 C6.85576177,30.3865546 6.77785538,30.4889956 6.77785538,30.5402161 C6.77785538,30.5914366 6.80382418,30.8347339 6.85576177,31.270108 C6.95963694,31.6030412 7.08948092,31.8463385 7.24529368,32 L7.79063837,32 C8.67357739,31.9487795 10.2446895,31.9231693 12.5039746,31.9231693 C13.3609448,31.9231693 14.152993,31.9231693 14.8801193,31.9231693 C15.6072455,31.9231693 16.1915434,31.9359744 16.6330129,31.9615846 C17.0744824,31.9871949 17.3341704,31.9871949 17.4120768,31.9615846 C17.8535463,31.9615846 18.074281,31.8207283 18.074281,31.5390156 C18.074281,31.5134054 18.0483122,31.3469388 17.9963746,31.0396158 C17.8924995,30.6554622 17.8016087,30.42497 17.7237023,30.3481393 C17.6457959,30.2713085 17.4380456,30.2328932 17.1004512,30.2328932 C16.2954186,30.1560624 15.5942611,29.9383754 14.9969789,29.5798319 L17.9963746,22.1272509 L20.0608938,24.3937575 C22.7876172,27.2877151 24.1509789,28.8883553 24.1509789,29.1956783 C24.1509789,29.6566627 23.8393534,29.9767907 23.2161023,30.1560624 C23.0862584,30.1560624 22.9174612,30.1816727 22.7097108,30.2328932 C22.3201789,30.2328932 22.125413,30.3737495 22.125413,30.6554622 C22.125413,30.7066827 22.1513818,30.9115646 22.2033194,31.270108 C22.3071945,31.6030412 22.4370385,31.8463385 22.5928513,32 L23.0602896,32 C23.0862584,32 23.4108683,32 24.0341194,32 C24.6573704,32 25.3585279,31.9743898 26.1375917,31.9231693 C26.9166556,31.8719488 27.4490159,31.8591437 27.7346726,31.8847539 C30.0718641,31.8847539 31.3573194,31.9231693 
31.5910386,32 L31.9026641,32 C32.0844457,31.8207283 32.1753365,31.6798719 32.1753365,31.577431 C32.1233989,30.9371749 31.9675861,30.4889956 31.7078982,30.2328932 L31.0846471,30.2328932 C30.2017081,30.2072829 29.4096598,30.0920368 28.7085024,29.8871549 C28.0073449,29.6822729 27.4749847,29.4645858 27.1114215,29.2340936 C26.7478584,29.0036014 26.4362329,28.7731092 26.1765449,28.542617 L25.7480598,28.1968788 C25.7480598,28.222489 24.5145421,26.8907563 22.0475066,24.2016807 L18.6585789,20.5138055 C18.6585789,20.4881953 19.0610952,19.4637855 19.8661278,17.4405762 C20.6711605,15.4173669 21.5151463,13.3429372 22.3980853,11.2172869 C23.2810243,9.09163665 23.761447,7.99039616 23.8393534,7.91356543 C24.0990414,7.68307323 24.8910896,7.55502201 26.2154981,7.52941176 C27.1244059,7.52941176 27.5788598,7.41416567 27.5788598,7.18367347 C27.5788598,7.13245298 27.552891,6.95318127 27.5009534,6.64585834 C27.3970783,6.23609444 27.3061875,5.99279712 27.2282811,5.91596639 C27.1503747,5.83913565 26.9685932,5.80072029 26.6829364,5.80072029 C26.60503,5.80072029 26.1765449,5.80072029 25.3974811,5.80072029 C24.6184173,5.80072029 23.4628059,5.81352541 21.930647,5.83913565 C20.3465505,5.83913565 19.1390016,5.83913565 18.3080002,5.83913565 C17.4769987,5.83913565 17.0225448,5.82633053 16.9446385,5.80072029 C16.5031689,5.80072029 16.2824342,5.91596639 16.2824342,6.14645858 C16.2824342,6.17206883 16.308403,6.36414566 16.3603406,6.72268908 C16.4122782,6.97879152 16.4642158,7.15806323 16.5161533,7.2605042 C16.5680909,7.36294518 16.6330129,7.42697079 16.7109193,7.45258103 C16.7888257,7.47819128 16.9056853,7.50380152 17.061498,7.52941176 C17.2173108,7.55502201 17.4250612,7.56782713 17.6847491,7.56782713 C17.944437,7.56782713 18.2820314,7.65746299 18.6975321,7.83673469 C19.1390016,8.04161665 19.3597364,8.16966787 19.3597364,8.22088836 C19.3337676,8.22088836 18.7235009,9.71908764 17.5289363,12.7154862 L15.6981363,17.2484994 C10.8160029,12.0240096 8.31001426,9.27090836 8.18017029,8.98919568 C8.07629511,8.78431373 8.02435752,8.63065226 8.02435752,8.52821128 C8.02435752,8.04161665 8.40090504,7.7214886 9.15400009,7.56782713 C9.17996888,7.56782713 9.25787526,7.56782713 9.38771924,7.56782713 C9.51756321,7.56782713 9.59546959,7.55502201 9.62143839,7.52941176 C9.69934477,7.52941176 9.75128236,7.52941176 9.77725115,7.52941176 C9.80321995,7.52941176 9.84217314,7.51660664 9.89411073,7.4909964 C9.94604832,7.46538615 9.98500151,7.42697079 10.0109703,7.3757503 C10.0369391,7.32452981 10.0499235,7.23489396 10.0499235,7.10684274 C10.0499235,6.87635054 10.0239547,6.67146859 9.97201711,6.49219688 C9.89411073,6.15926371 9.81620435,5.96718687 9.73829796,5.91596639 C9.66039158,5.8647459 9.46562562,5.82633053 9.15400009,5.80072029 C9.0760937,5.80072029 8.84237455,5.80072029 8.45284263,5.80072029 C8.06331071,5.80072029 7.51796603,5.81352541 6.81680857,5.83913565 C6.11565112,5.8647459 5.34957168,5.8647459 4.51857025,5.83913565 C2.49300428,5.83913565 1.19456455,5.81352541 0.62325107,5.76230492 Z" id="路径"></path> <g transform="translate(33.526272, 0.000000)" id="路径" stroke-width="0.707"> <path d="M3.58018786,15.6982857 L3.22216908,15.5624874 C2.96512995,15.4719552 2.59793119,15.381423 2.12057281,15.2908908 C1.64321443,15.2003585 1.11077623,15.136986 0.523258226,15.1007731 L-1.1071163e-15,15.1007731 L-1.1071163e-15,16.3501176 L0.523258226,16.3501176 C1.3861753,16.3863305 2.18483259,16.5221289 2.9192301,16.7575126 C3.65362761,16.9928964 4.16770587,17.2101737 4.46146488,17.4093445 C4.75522388,17.6085154 5.01226301,17.8076863 5.23258226,18.0068571 
C5.26930214,18.0611765 5.37946176,18.0883361 5.56306114,18.0883361 C5.72830058,18.0883361 5.88436005,18.0340168 6.03123955,17.9253782 L6.03123955,9.80463866 L6.05877946,1.6567395 C6.18729902,1.5299944 6.29745865,1.44851541 6.38925834,1.41230252 C6.48105803,1.37608964 6.70137728,1.33987675 7.0502161,1.30366387 C7.39905492,1.26745098 7.96821299,1.24934454 8.75769031,1.24934454 L9.47372788,1.24934454 L9.47372788,0 L9.17078891,0 C8.78523022,0.0543193277 7.32561517,0.0814789916 4.79194376,0.0814789916 C2.29499222,0.0814789916 0.853737106,0.0543193277 0.468178413,0 L0.137699533,0 L0.137699533,1.24934454 L0.853737106,1.24934454 C1.25765574,1.24934454 1.60649455,1.24934454 1.90025356,1.24934454 C2.19401256,1.24934454 2.42351178,1.25839776 2.58875122,1.2765042 C2.75399066,1.29461064 2.90087017,1.32177031 3.02938973,1.35798319 C3.15790929,1.39419608 3.23134905,1.41230252 3.24970898,1.41230252 C3.26806892,1.41230252 3.32314873,1.45756863 3.41494842,1.54810084 C3.50674811,1.63863305 3.56182793,1.67484594 3.58018786,1.6567395 L3.58018786,15.6982857 Z"></path> </g> </g> </g> <g id="x2" opacity="0.597795759" transform="translate(435.000000, 565.000000)" stroke="#000000"> <rect id="矩形" stroke-width="2" x="1" y="1" width="60" height="60"></rect> <g id="编组" transform="translate(31.500000, 32.000000) scale(-1, 1) rotate(-180.000000) translate(-31.500000, -32.000000) translate(10.000000, 16.000000)" fill="#000000" fill-rule="nonzero"> <path d="M0.614591364,5.76230492 L0.537767443,5.76230492 C0.179255814,5.76230492 0,5.90316126 0,6.18487395 C0,6.28731493 0.0384119602,6.49219688 0.115235881,6.79951981 C0.217667775,7.15806323 0.307295682,7.36294518 0.384119602,7.41416567 C0.460943523,7.46538615 0.704219271,7.50380152 1.11394685,7.52941176 C3.31623257,7.6062425 5.04477078,8.27210884 6.29956148,9.5270108 C6.55564121,9.78311325 8.04090368,11.3965586 10.7553489,14.3673469 C13.4697941,17.3381353 14.8142127,18.8491397 14.7886047,18.9003601 C11.9973356,26.020008 10.5376811,29.6438575 10.4096412,29.7719088 C10.1791695,30.0536214 9.3725183,30.2072829 7.98968773,30.2328932 L6.91415284,30.2328932 C6.760505,30.3865546 6.68368108,30.4889956 6.68368108,30.5402161 C6.68368108,30.5914366 6.70928905,30.8347339 6.760505,31.270108 C6.8629369,31.6030412 6.99097676,31.8463385 7.1446246,32 L7.68239205,32 C8.55306315,31.9487795 10.1023455,31.9231693 12.3302392,31.9231693 C13.1753024,31.9231693 13.9563456,31.9231693 14.6733688,31.9231693 C15.3903921,31.9231693 15.9665715,31.9359744 16.401907,31.9615846 C16.8372426,31.9871949 17.0933223,31.9871949 17.1701462,31.9615846 C17.6054818,31.9615846 17.8231495,31.8207283 17.8231495,31.5390156 C17.8231495,31.5134054 17.7975416,31.3469388 17.7463256,31.0396158 C17.6438937,30.6554622 17.5542658,30.42497 17.4774419,30.3481393 C17.400618,30.2713085 17.1957542,30.2328932 16.8628505,30.2328932 C16.0690034,30.1560624 15.3775881,29.9383754 14.7886047,29.5798319 L17.7463256,22.1272509 L19.7821595,24.3937575 C22.4709967,27.2877151 23.8154153,28.8883553 23.8154153,29.1956783 C23.8154153,29.6566627 23.5081197,29.9767907 22.8935283,30.1560624 C22.7654884,30.1560624 22.5990366,30.1816727 22.3941728,30.2328932 C22.0100532,30.2328932 21.8179934,30.3737495 21.8179934,30.6554622 C21.8179934,30.7066827 21.8436014,30.9115646 21.8948173,31.270108 C21.9972492,31.6030412 22.1252891,31.8463385 22.2789369,32 L22.7398805,32 C22.7654884,32 23.0855881,32 23.7001795,32 C24.3147708,32 25.0061861,31.9743898 25.7744253,31.9231693 C26.5426645,31.8719488 27.067628,31.8591437 27.3493157,31.8847539 C29.6540333,31.8847539 
30.921628,31.9231693 31.1520997,32 L31.4593954,32 C31.6386512,31.8207283 31.7282792,31.6798719 31.7282792,31.577431 C31.6770632,30.9371749 31.5234154,30.4889956 31.2673356,30.2328932 L30.6527443,30.2328932 C29.7820732,30.2072829 29.00103,30.0920368 28.3096147,29.8871549 C27.6181994,29.6822729 27.093236,29.4645858 26.7347243,29.2340936 C26.3762127,29.0036014 26.068917,28.7731092 25.8128373,28.542617 L25.3903057,28.1968788 C25.3903057,28.222489 24.173927,26.8907563 21.7411695,24.2016807 L18.399329,20.5138055 C18.399329,20.4881953 18.7962525,19.4637855 19.5900997,17.4405762 C20.3839469,15.4173669 21.216206,13.3429372 22.0868771,11.2172869 C22.9575482,9.09163665 23.4312957,7.99039616 23.5081197,7.91356543 C23.7641994,7.68307323 24.5452426,7.55502201 25.8512492,7.52941176 C26.7475283,7.52941176 27.1956678,7.41416567 27.1956678,7.18367347 C27.1956678,7.13245298 27.1700599,6.95318127 27.1188439,6.64585834 C27.016412,6.23609444 26.9267841,5.99279712 26.8499602,5.91596639 C26.7731363,5.83913565 26.5938805,5.80072029 26.3121928,5.80072029 C26.2353688,5.80072029 25.8128373,5.80072029 25.0445981,5.80072029 C24.2763589,5.80072029 23.136804,5.81352541 21.6259336,5.83913565 C20.0638472,5.83913565 18.8730765,5.83913565 18.0536213,5.83913565 C17.2341662,5.83913565 16.7860266,5.82633053 16.7092027,5.80072029 C16.2738672,5.80072029 16.0561994,5.91596639 16.0561994,6.14645858 C16.0561994,6.17206883 16.0818074,6.36414566 16.1330233,6.72268908 C16.1842392,6.97879152 16.2354552,7.15806323 16.2866711,7.2605042 C16.3378871,7.36294518 16.401907,7.42697079 16.4787309,7.45258103 C16.5555549,7.47819128 16.6707907,7.50380152 16.8244386,7.52941176 C16.9780864,7.55502201 17.1829502,7.56782713 17.4390299,7.56782713 C17.6951097,7.56782713 18.0280133,7.65746299 18.4377409,7.83673469 C18.8730765,8.04161665 19.0907442,8.16966787 19.0907442,8.22088836 C19.0651363,8.22088836 18.4633489,9.71908764 17.2853821,12.7154862 L15.48002,17.2484994 C10.665721,12.0240096 8.19455152,9.27090836 8.06651165,8.98919568 C7.96407976,8.78431373 7.91286381,8.63065226 7.91286381,8.52821128 C7.91286381,8.04161665 8.28417942,7.7214886 9.02681065,7.56782713 C9.05241863,7.56782713 9.12924255,7.56782713 9.25728242,7.56782713 C9.38532228,7.56782713 9.4621462,7.55502201 9.48775418,7.52941176 C9.5645781,7.52941176 9.61579405,7.52941176 9.64140202,7.52941176 C9.66700999,7.52941176 9.70542195,7.51660664 9.7566379,7.4909964 C9.80785385,7.46538615 9.84626581,7.42697079 9.87187378,7.3757503 C9.89748175,7.32452981 9.91028574,7.23489396 9.91028574,7.10684274 C9.91028574,6.87635054 9.88467777,6.67146859 9.83346182,6.49219688 C9.7566379,6.15926371 9.67981398,5.96718687 9.60299006,5.91596639 C9.52616614,5.8647459 9.33410634,5.82633053 9.02681065,5.80072029 C8.94998673,5.80072029 8.71951497,5.80072029 8.33539537,5.80072029 C7.95127577,5.80072029 7.41350833,5.81352541 6.72209304,5.83913565 C6.03067776,5.8647459 5.27524254,5.8647459 4.45578739,5.83913565 C2.45836545,5.83913565 1.17796678,5.81352541 0.614591364,5.76230492 Z" id="路径"></path> <g transform="translate(32.164255, 0.000000)" id="路径" stroke-width="0.707"> <path d="M1.6022781,11.6514958 C1.11344749,11.6514958 0.72419349,11.8144538 0.434516094,12.1403697 C0.144838698,12.4662857 2.72933393e-16,12.8646275 2.72933393e-16,13.335395 C2.72933393e-16,14.6209524 0.479778187,15.7344986 1.43933456,16.6760336 C2.39889094,17.6175686 3.5938102,18.0883361 5.02409234,18.0883361 C6.67163253,18.0883361 8.04760016,17.5813557 9.15199523,16.567395 C10.2563903,15.5534342 10.8176403,14.2407171 10.8357451,12.6292437 
C10.8357451,11.8506667 10.6546967,11.1083025 10.2926,10.4021513 C9.93050324,9.696 9.49598714,9.08038095 8.9890517,8.55529412 C8.48211625,8.03020728 7.75792276,7.36026891 6.81647123,6.54547899 C6.16469709,5.98417927 5.25945522,5.15128291 4.10074564,4.04678992 L2.49846754,2.52584874 L4.56241899,2.49868908 C7.40487844,2.49868908 8.91663235,2.54395518 9.09768072,2.63448739 C9.22441458,2.67070028 9.44167263,3.47643697 9.74945486,5.05169748 L9.74945486,5.13317647 L10.8357451,5.13317647 L10.8357451,5.05169748 C10.8176403,4.99737815 10.6999588,4.17353501 10.4827008,2.58016807 C10.2654427,0.98680112 10.1296564,0.153904762 10.0753419,0.0814789916 L10.0753419,0 L2.72933393e-16,0 L2.72933393e-16,0.516033613 L2.72933393e-16,0.84194958 C2.72933393e-16,0.968694678 0.0543145118,1.104493 0.162943535,1.24934454 C0.271572559,1.39419608 0.543145118,1.71105882 0.977661212,2.19993277 C1.50270149,2.77933894 1.95532242,3.28631933 2.33552401,3.72087395 C2.49846754,3.90193838 2.80624977,4.23690756 3.25887071,4.72578151 C3.71149164,5.21465546 4.01927387,5.54962465 4.18221741,5.73068908 C4.34516094,5.9117535 4.60768108,6.2105098 4.96977783,6.62695798 C5.33187457,7.04340616 5.58534229,7.35121569 5.73018099,7.55038655 C5.87501969,7.74955742 6.08322532,8.02115406 6.35479788,8.36517647 C6.62637044,8.70919888 6.81647123,8.99890196 6.92510025,9.23428571 C7.03372927,9.46966947 7.16951555,9.73221289 7.33245909,10.021916 C7.49540262,10.311619 7.61308407,10.6013221 7.68550342,10.8910252 C7.75792276,11.1807283 7.82128969,11.4523249 7.87560421,11.7058151 C7.92991872,11.9593053 7.95707597,12.2580616 7.95707597,12.602084 C7.95707597,13.7427899 7.64929374,14.729591 7.03372927,15.5624874 C6.41816481,16.3953838 5.5400802,16.8118319 4.39947545,16.8118319 C3.80201582,16.8118319 3.27697554,16.6579272 2.82435461,16.3501176 C2.37173368,16.0423081 2.05489903,15.7435518 1.87385066,15.4538487 C1.69280228,15.1641457 1.6022781,14.9921345 1.6022781,14.9378151 C1.6022781,14.9197087 1.64754019,14.9106555 1.73806438,14.9106555 C2.06395145,14.9106555 2.39889094,14.7839104 2.74288284,14.5304202 C3.08687475,14.27693 3.25887071,13.8604818 3.25887071,13.2810756 C3.25887071,12.8284146 3.11403201,12.4481793 2.82435461,12.1403697 C2.53467722,11.8325602 2.12731838,11.6696022 1.6022781,11.6514958 Z"></path> </g> </g> </g> <g id="x3" opacity="0.597795759" transform="translate(104.000000, 565.000000)" stroke="#000000"> <rect id="矩形" stroke-width="2" x="1" y="1" width="60" height="60"></rect> <g id="编组" transform="translate(30.500000, 31.000000) scale(-1, 1) rotate(-180.000000) translate(-30.500000, -31.000000) translate(9.000000, 15.000000)" fill="#000000" fill-rule="nonzero"> <path d="M0.611501743,6.24324203 L0.535064025,6.24324203 C0.178354675,6.24324203 0,6.38151648 0,6.65806537 C0,6.7586286 0.038218859,6.95975507 0.114656577,7.26144476 C0.216573534,7.61341608 0.305750872,7.81454254 0.38218859,7.86482416 C0.458626308,7.91510578 0.700679081,7.95281699 1.10834691,7.9779578 C3.29956149,8.05338022 5.01941014,8.70704123 6.26789287,9.93894083 C6.52268526,10.1903489 8.00048114,11.7742198 10.7012805,14.6905536 C13.4020799,17.6068873 14.7397399,19.090195 14.7142607,19.1404766 C11.9370236,26.1296213 10.484707,29.6870456 10.3573108,29.8127497 C10.1279976,30.0892986 9.32540159,30.2401434 7.94952266,30.2652842 L6.87939461,30.2652842 C6.72651918,30.4161291 6.65008146,30.5166923 6.65008146,30.5669739 C6.65008146,30.6172556 6.6755607,30.8560932 6.72651918,31.283487 C6.82843613,31.6103175 6.95583233,31.8491552 7.10870777,32 L7.64377179,32 C8.51006593,31.9497184 
10.0515599,31.9245776 12.2682537,31.9245776 C13.1090686,31.9245776 13.8861854,31.9245776 14.5996041,31.9245776 C15.3130228,31.9245776 15.8863057,31.937148 16.3194528,31.9622888 C16.7525998,31.9874296 17.0073922,31.9874296 17.08383,31.9622888 C17.516977,31.9622888 17.7335506,31.8240143 17.7335506,31.5474655 C17.7335506,31.5223246 17.7080713,31.3589094 17.6571128,31.0572197 C17.5551959,30.6801076 17.4660185,30.4538403 17.3895808,30.3784179 C17.3131431,30.3029954 17.1093092,30.2652842 16.7780791,30.2652842 C15.9882227,30.1898618 15.3002832,29.9761649 14.7142607,29.6241936 L17.6571128,22.3082185 L19.6827124,24.53318 C22.3580325,27.3740913 23.6956926,28.9453918 23.6956926,29.2470815 C23.6956926,29.6996161 23.3899417,30.0138762 22.7784399,30.1898618 C22.6510437,30.1898618 22.4854287,30.2150026 22.2815948,30.2652842 C21.8994062,30.2652842 21.7083119,30.4035587 21.7083119,30.6801076 C21.7083119,30.7303892 21.7337911,30.9315157 21.7847496,31.283487 C21.8866666,31.6103175 22.0140628,31.8491552 22.1669382,32 L22.6255645,32 C22.6510437,32 22.9695342,32 23.581036,32 C24.1925377,32 24.8804772,31.9748592 25.6448544,31.9245776 C26.4092315,31.874296 26.9315559,31.8617256 27.2118276,31.8868664 C29.5049591,31.8868664 30.7661815,31.9245776 30.9954946,32 L31.3012455,32 C31.4796002,31.8240143 31.5687775,31.6857399 31.5687775,31.5851767 C31.517819,30.9566565 31.3649436,30.5166923 31.1101512,30.2652842 L30.4986495,30.2652842 C29.6323553,30.2401434 28.8552385,30.1270098 28.1672991,29.9258833 C27.4793596,29.7247569 26.9570352,29.51106 26.6003258,29.2847927 C26.2436165,29.0585254 25.9378656,28.8322582 25.6830732,28.6059909 L25.2626658,28.26659 C25.2626658,28.2917308 24.0524019,26.9844088 21.6318742,24.3446239 L18.3068334,20.7243475 C18.3068334,20.6992067 18.7017617,19.6935744 19.4916181,17.7074506 C20.2814745,15.7213267 21.1095498,13.6849213 21.9758439,11.5982342 C22.842138,9.51154709 23.313504,8.43049234 23.3899417,8.35506992 C23.6447341,8.12880265 24.4218509,8.0030986 25.7212921,7.9779578 C26.6130655,7.9779578 27.0589521,7.86482416 27.0589521,7.63855689 C27.0589521,7.58827527 27.0334729,7.41228961 26.9825144,7.11059991 C26.8805975,6.70834698 26.7914201,6.46950931 26.7149824,6.39408688 C26.6385447,6.31866446 26.46019,6.28095325 26.1799184,6.28095325 C26.1034807,6.28095325 25.6830732,6.28095325 24.918696,6.28095325 C24.1543189,6.28095325 23.0204927,6.29352365 21.5172176,6.31866446 C19.962984,6.31866446 18.7781994,6.31866446 17.9628637,6.31866446 C17.1475281,6.31866446 16.7016414,6.30609405 16.6252036,6.28095325 C16.1920566,6.28095325 15.975483,6.39408688 15.975483,6.62035416 C15.975483,6.64549496 16.0009623,6.83405102 16.0519208,7.18602234 C16.1028792,7.43743042 16.1538377,7.61341608 16.2047962,7.71397931 C16.2557547,7.81454254 16.3194528,7.87739456 16.3958905,7.90253537 C16.4723282,7.92767618 16.5869848,7.95281699 16.7398602,7.9779578 C16.8927357,8.0030986 17.0965696,8.01566901 17.351362,8.01566901 C17.6061544,8.01566901 17.9373845,8.10366184 18.3450523,8.27964749 C18.7781994,8.48077396 18.9947729,8.606478 18.9947729,8.65675962 C18.9692937,8.65675962 18.3705315,10.1274969 17.1984865,13.0689715 L15.4022002,17.5188945 C10.6121032,12.3901696 8.15335658,9.68753275 8.02596038,9.41098386 C7.92404342,9.2098574 7.87308495,9.05901255 7.87308495,8.95844931 C7.87308495,8.48077396 8.24253392,8.16651386 8.98143186,8.01566901 C9.0069111,8.01566901 9.08334881,8.01566901 9.21074501,8.01566901 C9.33814121,8.01566901 9.41457892,8.0030986 9.44005816,7.9779578 C9.51649588,7.9779578 9.56745436,7.9779578 9.5929336,7.9779578 
C9.61841284,7.9779578 9.6566317,7.96538739 9.70759018,7.94024658 C9.75854866,7.91510578 9.79676751,7.87739456 9.82224675,7.82711295 C9.84772599,7.77683133 9.86046561,7.6888385 9.86046561,7.56313446 C9.86046561,7.33686719 9.83498637,7.13574072 9.78402789,6.95975507 C9.70759018,6.63292456 9.63115246,6.4443685 9.55471474,6.39408688 C9.47827702,6.34380527 9.28718273,6.30609405 8.98143186,6.28095325 C8.90499414,6.28095325 8.67568098,6.28095325 8.29349239,6.28095325 C7.91130381,6.28095325 7.37623978,6.29352365 6.68830032,6.31866446 C6.00036086,6.34380527 5.2487233,6.34380527 4.43338764,6.31866446 C2.44600697,6.31866446 1.17204501,6.29352365 0.611501743,6.24324203 Z" id="路径"></path> <g transform="translate(31.786396, 0.000000)" id="路径" stroke-width="0.707"> <path d="M2.29676233,12.9309861 C1.81038913,12.9309861 1.43209886,13.0820698 1.16189153,13.3842372 C0.891684198,13.6864046 0.747573621,14.0774447 0.729559799,14.5573576 C0.729559799,15.5349579 1.16189153,16.4059109 2.026555,17.1702166 C2.89121846,17.9345223 3.93602015,18.3166752 5.16096006,18.3166752 C5.79144383,18.3166752 6.18774792,18.3077879 6.34987232,18.2900134 C7.68289516,18.0767187 8.70067612,17.6056931 9.40321518,16.8769365 C10.1057542,16.1481799 10.4660307,15.3572124 10.4840445,14.5040339 C10.4840445,13.5797572 10.1778095,12.6821424 9.56533958,11.8111894 C8.95286963,10.9402364 8.10621999,10.3092398 7.02539065,9.91819967 L6.94432845,9.86487601 C6.94432845,9.84710146 7.02539065,9.82043963 7.18751505,9.78489053 C7.34963945,9.74934143 7.61083988,9.66935595 7.97111632,9.54493409 C8.33139276,9.42051223 8.67365538,9.23387944 8.99790418,8.98503572 C10.4750376,8.0429845 11.2136043,6.7987659 11.2136043,5.25237993 C11.2136043,3.84819037 10.6551758,2.62174633 9.53831885,1.5730478 C8.42146187,0.524349266 7.00737683,1.3397714e-16 5.29606372,1.3397714e-16 C3.85495795,1.3397714e-16 2.61200422,0.382152855 1.56720253,1.14645856 C0.522400844,1.91076427 4.07341989e-16,2.87947732 4.07341989e-16,4.05259771 C4.07341989e-16,4.55028515 0.1621244,4.95021256 0.486373199,5.25237993 C0.810621999,5.5545473 1.215933,5.71451827 1.7023062,5.73229282 C2.20669322,5.73229282 2.62101113,5.57232186 2.94525993,5.25237993 C3.26950873,4.93243801 3.43163313,4.5325106 3.43163313,4.05259771 C3.43163313,3.85707765 3.40461239,3.67933213 3.35057093,3.51936117 C3.29652946,3.35939021 3.23348108,3.2171938 3.16142579,3.09277194 C3.08937051,2.96835008 2.99029448,2.86170277 2.86419773,2.77283001 C2.73810097,2.68395726 2.63001804,2.61285905 2.53994893,2.5595354 C2.44987982,2.50621174 2.35981071,2.47066264 2.2697416,2.45288809 C2.17967249,2.43511354 2.1076172,2.40845171 2.05357573,2.37290261 L1.9454928,2.34624078 C2.86419773,1.54638597 3.9810547,1.14645856 5.29606372,1.14645856 C6.28682394,1.14645856 7.03439757,1.61748418 7.53878459,2.5595354 C7.84501956,3.14609559 7.99813705,4.04371044 7.99813705,5.25237993 L7.99813705,5.78561647 C7.99813705,7.47419885 7.42169474,8.60288287 6.26881012,9.17166851 C5.99860279,9.27831582 5.44918121,9.34052675 4.62054539,9.3583013 L3.48567459,9.38496313 L3.40461239,9.43828678 C3.36858475,9.49161043 3.35057093,9.63380684 3.35057093,9.86487601 C3.35057093,10.1848179 3.42262622,10.3447889 3.56673679,10.3447889 C4.07112382,10.3447889 4.59352466,10.3892253 5.13393932,10.478098 C5.74640928,10.5669708 6.30483777,10.9402364 6.80922479,11.5978948 C7.31361181,12.2555532 7.56580532,13.250928 7.56580532,14.5840194 L7.56580532,14.797314 C7.56580532,15.8104634 7.25056343,16.5125582 6.62007966,16.9035984 C6.22377557,17.1524421 5.80045075,17.2768639 5.35010519,17.2768639 
C4.77366288,17.2768639 4.24225513,17.1791039 3.75588193,16.9835838 C3.26950873,16.7880638 2.92724611,16.5836564 2.72909406,16.3703618 C2.53094202,16.1570672 2.431866,16.0504199 2.431866,16.0504199 L2.5129282,16.0504199 C2.56696966,16.0326453 2.63902495,16.0148708 2.72909406,15.9970962 C2.81916317,15.9793217 2.90923228,15.9348853 2.99930139,15.8637871 C3.08937051,15.7926889 3.19745344,15.730478 3.32355019,15.6771543 C3.44964695,15.6238307 3.53971606,15.5260706 3.59375753,15.3838742 C3.64779899,15.2416778 3.71985428,15.1083687 3.80992339,14.9839468 C3.8999925,14.8595249 3.92701324,14.6817794 3.89098559,14.4507103 C3.89098559,14.0596701 3.76488884,13.7130664 3.51269533,13.410899 C3.26050182,13.1087316 2.85519082,12.9487607 2.29676233,12.9309861 Z"></path> </g> </g> </g> </g> </g></svg></div><p>上图描述了如下概率分布的分解 (该图是一个有环图的特例) :</p><p>$$g\left(X_{1}, X_{2}, X_{3}\right)=f_{1}\left(X_{1}\right) f_{2}\left(X_{1}, X_{2}\right) f_{3}\left(X_{1}, X_{2}\right) f_{4}\left(X_{2}, X_{3}\right)$$</p><h4 id="1-3-2-联系"><a href="#1-3-2-联系" class="headerlink" title="1.3.2 联系"></a>1.3.2 联系</h4><p>无论是<strong>有向概率图模型</strong>还是<strong>无向概率图模型</strong>,他们都能使用因子图来描述。因为因子节点 (factor) 理论上可以描述变量之间的任何定义在实数域上的函数关系,还摆脱了probability function的取值必须为$[0,1]$的限制,因此使用因子图表示的概率图模型可以实现高效、统一的计算。</p><h2 id="2-算法"><a href="#2-算法" class="headerlink" title="2. 算法"></a>2. 算法</h2><h3 id="2-1-消息传递算法-Massage-Passing-Algorithm"><a href="#2-1-消息传递算法-Massage-Passing-Algorithm" class="headerlink" title="2.1 消息传递算法 (Massage Passing Algorithm)"></a>2.1 消息传递算法 (Massage Passing Algorithm)</h3><p><strong>推断</strong> (inference) 是概率图模型上的一个重要任务,简单来讲就是已知一系列条件概率函数,求每个随机变量的边缘概率。</p><p>直接地,解上述问题可以有两种思路:</p><ol><li>采用解析的方式,利用贝叶斯法则求边缘概率,但这种方法的复杂度随变量规模呈指数增加<a href="#belief_propagation">[3]</a>。</li><li>采用模拟的方法,让消息按照概率图模型进行传递,在迭代$N$次后分析消息的传递效果。</li></ol><p>后者由于简单的思路与计算复杂度,具有较高的实用性,因此被广泛使用,这类方法被称为消息传递算法。</p><p>关于消息传递,<a href="https://www.cnblogs.com/ironstark/p/5146818.html">这篇blog</a>做了详细的阐述,可以参考。</p><h4 id="2-1-1-置信传播算法-Belief-propagation"><a href="#2-1-1-置信传播算法-Belief-propagation" class="headerlink" title="2.1.1 置信传播算法 (Belief propagation)"></a>2.1.1 置信传播算法 (Belief propagation)</h4><p><strong>又名</strong>:sum–product message passing</p><p><strong>置信传播算法</strong> (Belief propagation) ,是一种消息传递算法 (message passing algorithm) ,用于对图模型 (例如贝叶斯网络和马尔可夫随机场) 进行推理。它以任何观察到的节点 (或变量) 为条件,计算每个未观察到的节点 (或变量) 的<strong>边缘分布</strong>。</p><p>BP算法在贝叶斯网络与马尔可夫随机场上都有相关的变体,但由于上述两张图模型都可以使用因子图来描述,因此后文直接叙述BP算法在因子图上的计算方法。</p><h5 id="2-1-1-1-算法描述"><a href="#2-1-1-1-算法描述" class="headerlink" title="2.1.1.1 算法描述"></a>2.1.1.1 算法描述</h5><p>给定因子图$G=(V, F, E)$,有联合质量函数$p(\mathbf{x})=\prod_{a \in F} f_{a}\left(\mathbf{x}<em>{a}\right)$,其中$\mathbf{x}</em>{a}$是因子节点$a$的所有邻居变量节点所构成的向量。注:联合质量函数 (Joint mass function) 是变量在每个可能的值上取得该值的概率,不同于概率密度函数,概率质量函数是在离散随机变量上定义的,而概率密度函数是在连续随机变量上定义的。</p><p>BP算法的原理在于,让被称为消息 (Message) 的实值函数沿着概率图中隐藏节点之间的边进行传递。具体而言包含两种情况,其中$\operatorname{Dom}(v)$表示变量$v$的定义域。</p><ul><li>从变量节点$v$到因子节点$a$的传递,记传递的消息为$\mu_{v \to a}$,$\mu_{v \rightarrow a}: \operatorname{Dom}(v) \rightarrow \mathbb{R}$。</li><li>从因子节点$a$到变量节点$v$的传递,记为$\mu_{a \to v}$,$\mu_{a \rightarrow v}: \operatorname{Dom}(v) \rightarrow \mathbb{R}$。</li></ul><p>BP算法中对于上述两种消息的传递,定义不同的计算方式</p><ul><li>对于从变量节点到因子节点的消息传递$\mu_{v \to a}$ <p>$$ \mu_{v \rightarrow a}\left(x_{v}\right)=\prod_{a^{*} \in N(v) \backslash\{a\}} \mu_{a^{*} \rightarrow v}\left(x_{v}\right) $$</p>其中$x_{v} \in \operatorname{Dom}(v)$,$N(v)$是节点$v$的邻居节点,如果$N(v) \backslash\{a\} = \emptyset$,则上式被置为均匀分布。</li><li>对于从因子节点到变量节点的消息传递$\mu_{a \to 
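<p>To make the update rules concrete, below is a minimal sketch (mine, not from the references) of loopy sum–product with a synchronous schedule, run on the example factorization $g(X_1, X_2, X_3)$ shown above; the factor tables are arbitrary illustrative numbers, and all variable and function names are my own.</p><pre class="language-python" data-language="python"><code class="language-python"># Loopy sum-product (belief propagation) on the factor graph above:
# g(x1, x2, x3) = f1(x1) f2(x1, x2) f3(x1, x2) f4(x2, x3).
# The factor tables are arbitrary illustrative numbers.
import itertools
import numpy as np

K = 2  # every variable is binary, Dom(v) = {0, 1}
factors = {  # factor name -> (scope of variable nodes, table over the scope)
    "f1": (("x1",),      np.array([0.6, 0.4])),
    "f2": (("x1", "x2"), np.array([[0.9, 0.1], [0.2, 0.8]])),
    "f3": (("x1", "x2"), np.array([[0.5, 0.5], [0.3, 0.7]])),
    "f4": (("x2", "x3"), np.array([[0.7, 0.3], [0.4, 0.6]])),
}
variables = ("x1", "x2", "x3")

# All messages start as the uniform function Dom(v) -> R.
msg_vf = {(v, f): np.ones(K) / K for f, (scope, _) in factors.items() for v in scope}
msg_fv = {(f, v): np.ones(K) / K for f, (scope, _) in factors.items() for v in scope}

for _ in range(20):  # the graph has a cycle, so iterate a fixed number of times
    new_vf, new_fv = {}, {}
    # variable -> factor: product of messages from all *other* adjacent factors;
    # an empty product leaves the message uniform, as required.
    for (v, f) in msg_vf:
        m = np.ones(K)
        for g_name, (scope, _) in factors.items():
            if g_name != f and v in scope:
                m = m * msg_fv[(g_name, v)]
        new_vf[(v, f)] = m / m.sum()
    # factor -> variable: multiply the factor by the incoming messages and
    # marginalize out every variable except v (the "sum of products").
    for (f, v) in msg_fv:
        scope, table = factors[f]
        m = np.zeros(K)
        for assignment in itertools.product(range(K), repeat=len(scope)):
            p = table[assignment]
            for u, x_u in zip(scope, assignment):
                if u != v:
                    p = p * msg_vf[(u, f)][x_u]
            m[assignment[scope.index(v)]] += p
        new_fv[(f, v)] = m / m.sum()
    msg_vf, msg_fv = new_vf, new_fv  # synchronous ("flooding") update

# Estimated marginal of each node: product of all incoming factor messages.
for v in variables:
    b = np.ones(K)
    for f, (scope, _) in factors.items():
        if v in scope:
            b = b * msg_fv[(f, v)]
    print(v, b / b.sum())</code></pre><p>Because the example graph is cyclic, the sketch runs a fixed number of flooding iterations instead of the single pass that suffices on trees, and it normalizes every message for numerical stability (messages are only defined up to scale).</p>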
<h5 id="2-1-1-2-收敛性"><a href="#2-1-1-2-收敛性" class="headerlink" title="2.1.1.2 Convergence"></a>2.1.1.2 Convergence</h5><p>Upon convergence of BP, the <strong>estimated marginal distribution</strong> of each node is proportional to the product of all messages arriving from its adjacent factor nodes (the normalizing constant is omitted below):</p><p>$$p_{X_{v}}\left(x_{v}\right) \propto \prod_{a \in N(v)} \mu_{a \rightarrow v}\left(x_{v}\right)$$</p><p>Likewise, the <strong>estimated joint marginal distribution</strong> of the set of variable nodes belonging to one factor node is proportional to the product of the factor and the variable messages:</p><p>$$p_{X_{a}}\left(\mathbf{x}_{a}\right) \propto f_{a}\left(\mathbf{x}_{a}\right) \prod_{v \in N(a)} \mu_{v \rightarrow a}\left(x_{v}\right)$$</p><p>When the factor graph is acyclic (a tree or a forest), these estimated marginals converge to the true marginals in a finite number of iterations, which can be shown by <a href="https://en.wikipedia.org/wiki/Mathematical_induction">mathematical induction</a>.</p><h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><p><a id="graphical_model">[1]</a> <a href="https://en.wikipedia.org/wiki/Graphical_model">https://en.wikipedia.org/wiki/Graphical_model</a></p><p><a id="bayesian_model">[2]</a> <a href="https://en.wikipedia.org/wiki/Bayesian_network">https://en.wikipedia.org/wiki/Bayesian_network</a></p><p><a id="belief_propagation">[3]</a> <a href="https://en.wikipedia.org/wiki/Belief_propagation#Motivation">https://en.wikipedia.org/wiki/Belief_propagation#Motivation</a></p>]]></content>
<tags>
<tag> Bayesian </tag>
<tag> Belief Propagation </tag>
<tag> Markov Random Field </tag>
</tags>
</entry>
<entry>
<title>Root Cause Analysis and Diagnosis in Cellular Network</title>
<link href="/uncategorized/surveys/rca_in_cellular/"/>
<url>/uncategorized/surveys/rca_in_cellular/</url>
<content type="html"><![CDATA[<p>本文主要调研了,LTE场景下的异常根因分析(RCA)方法,用于支持在4G LTE/5G场景下的用户投诉分析。</p><p>2022-04-18: 以“root cause analysis, cellular network”为关键词,调研了google scholar中top 2 pages的相关工作,TODO: 完善某些论文的可借鉴点总结。</p><span id="more"></span><h2 id="任务目标"><a href="#任务目标" class="headerlink" title="任务目标"></a>任务目标</h2><ul><li><p><strong>总体目标</strong>:发现用户的投诉根因,<strong>解释</strong>:网络发生了什么问题,这些问题如何一步一步影响用户(在网络机理方面构成完整链条),最终导致用户投诉。</p></li><li><p><strong>主要发现几类问题</strong>:</p><ul><li><p>网络侧问题(重点):将用户投诉视为网络出现问题的指示剂,发现网络中出现了什么问题</p></li><li><p>用户侧问题:上一个问题假设用户的投诉都是公平的,还有一种可能,用户所在的位置有不可避免的环境因素,导致服务体验差</p></li></ul></li><li><p><strong>下一步工作方向</strong>:将发掘到的<u>异常根因</u>从“<u>单维KPI异常</u>”拓展到“<u>带网络机理链条的多维KPI异常pattern</u>”上</p></li></ul><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/FM%26PM%20logic%20related%20graph.png" alt="网络机理图-KPI指标关联关系"></p><h2 id="相关工作调研"><a href="#相关工作调研" class="headerlink" title="相关工作调研"></a>相关工作调研</h2><ol><li><p><strong>Root Cause Analysis in 5G/6G Networks</strong></p><p> Dinis Canastro; Ricardo Rocha; Mário Antunes; Diogo Gomes; Rui L. Aguiar</p><p> 2021 8th International Conference on Future Internet of Things and Cloud (FiCloud)</p><p> 文章比较新,会议不知名,但可以参考related work部分,以及其中提到的其他工作。</p></li><li><p>🌟<strong>Automatic root cause analysis based on traces for LTE self-organizing networks</strong></p><p> Ana Gomez-Andrades; Raquel Barco; Immaculada Serrano; Patricia Delgado; Patricia Caro-Oliver; Pablo Munoz</p><p> IEEE Wireless Communications (Volume: 23, Issue: 3, June 2016)</p><p> 本文的研究重点在于,基于用户级别的数据,分析用户连接被释放的原因,本文包含大量有关蜂窝网络的<u>背景知识</u>,很值得借鉴</p><ul><li>用户连接被释放的原因可分为:Normal Release, Access Failure, Dropped Connection</li><li>RF层面上实现自动诊断所需要关注的指标信息(Indicators and Measurements)</li><li>RF层面上的大体根因包括:Coverage Hole (CH), Lack of Dominant Cells (LD), Cell Edge (CE), Mobility Problems (MP), Interference (I)<ul><li>注:其他层面的异常原因还包括:excessive antenna downtilt (EAD), too small antenna downtilt (TSAD), coverage hole (CH), too late handover (TLH), inter-system interference (ISI), excessive reduction of transmit power (ERTP), and normal cell (Normal) <a href="https://link.springer.com/article/10.1007/s11036-020-01589-1">link</a></li></ul></li><li>本文还包含一些在蜂窝网络中做RCA的必要背景知识(LTE Traces and Events),以及一个user case</li></ul><p> 本文所提出的框架大体为:“首先,它根据释放的类型对连接进行分类,随后,根据异常释放的连接的事件信息,确定具体的故障原因。”</p></li><li><p>🌟<strong>Root Cause Analysis Based on Temporal Analysis ofMetrics Toward Self-Organizing 5G Networks</strong></p><p> Pablo Muñoz, Isabel de la Bandera, Emil J. 
<ol start="4"><li><p><strong>Root Cause Analysis of Reduced Accessibility in 4G Networks</strong></p><p> Diogo Ferreira, Carlos Senna, Paulo Salvador, Luís Cortesão, Cristina Pires, Rui Pedro & Susana Sargento </p><p> International Conference on Machine Learning for Networking, MLN 2019: Machine Learning for Networking pp 117–133</p><p> Essentially a case study of the possible root causes of reduced accessibility in a 4G network; what can be borrowed are its <u>conclusions</u> and its <u>generic data (pre)processing recipe for LTE data</u>.</p><ul><li><p>The paper analyzes the possible root causes of reduced 4G network accessibility, taking into account the important key performance indicators (KPIs) and their evolution over preceding time frames.</p></li><li><p>The results show that the main network-wide causes of reduced accessibility relate to the number of failed handovers, the number of calls and SMS in the network, the overall download volume, and cell availability; per cell, however, reduced accessibility relates more to the number of users in each cell and the download volume they generate.</p></li></ul></li><li><p>🌟<strong>Automatic Root Cause Analysis for LTE Networks Based on Unsupervised Techniques</strong></p><p> Ana Gómez-Andrades; Pablo Muñoz; Inmaculada Serrano; Raquel Barco</p><p> IEEE Transactions on Vehicular Technology (Volume: 65, Issue: 4, April 2016)</p><p> [Details to be added]</p></li></ol>]]></content>
<tags>
<tag> survey </tag>
<tag> root cause analysis </tag>
<tag> diagnosis </tag>
</tags>
</entry>
<entry>
<title>Site Maintenance Manual</title>
<link href="/uncategorized/notes/hexo_notes/"/>
<url>/uncategorized/notes/hexo_notes/</url>
<content type="html"><![CDATA[<p>本文档记录了建立基于hexo的网页系统的步骤,并对维护所需要的操作一一列出,日后的本站更新将依据该文档进行。</p><p>Note: 随时会更新维护策略</p><span id="more"></span><h2 id="🌟-目前正在维护的论文主题列表"><a href="#🌟-目前正在维护的论文主题列表" class="headerlink" title="🌟 目前正在维护的论文主题列表"></a>🌟 目前正在维护的论文主题列表</h2><h3 id="按任务分类"><a href="#按任务分类" class="headerlink" title="按任务分类"></a>按任务分类</h3><ul><li>Anomaly detection / Outlier / Out-of-distribution</li><li>Interpretable / Explainable</li><li>Causal discovery</li><li>Data augmentation </li></ul><h3 id="按数据分类"><a href="#按数据分类" class="headerlink" title="按数据分类"></a>按数据分类</h3><ul><li>Time series</li><li>Missing value / Irregular sampled / Imputation</li><li>Sequence</li><li>Heterogeneous</li></ul><h3 id="按深度学习架构分类"><a href="#按深度学习架构分类" class="headerlink" title="按深度学习架构分类"></a>按深度学习架构分类</h3><ul><li>Recurrent neural network / RNN / LSTM / GRU </li><li>Autoencoder</li></ul><h3 id="按应用分类"><a href="#按应用分类" class="headerlink" title="按应用分类"></a>按应用分类</h3><ul><li>Cloud native</li><li>Micro-service</li></ul><h2 id="🌟-目前正在维护的顶会列表"><a href="#🌟-目前正在维护的顶会列表" class="headerlink" title="🌟 目前正在维护的顶会列表"></a>🌟 目前正在维护的顶会列表</h2><ul><li>NeurIPS</li><li>SIGKDD</li><li>IJCAI</li><li>ICML</li><li>SIGIR</li><li>CVPR</li><li>WWW</li><li>ICLR</li><li>AAAI</li><li>SIGMOD</li><li>NDSS</li><li>ESEC/FSE</li><li>ICSE</li><li>ASE</li><li>ISSRE</li></ul><h2 id="🌟-制作新页面的规范"><a href="#🌟-制作新页面的规范" class="headerlink" title="🌟 制作新页面的规范"></a>🌟 制作新页面的规范</h2><p>⚠️⚠️⚠️ 由于配置原因,最好将文章中的<strong>全角符号</strong>替换为<strong>半角符号+空格</strong>,以获得最佳的字体效果</p><h3 id="页面配置"><a href="#页面配置" class="headerlink" title="页面配置"></a>页面配置</h3><ol><li>对于新的标签页</li></ol><ul><li>可以直接在<code><hexo_path></code>使用命令<code>hexo new page <new_page_name></code></li><li>可以在<code><hexo_path>/source</code>下新建相应的文件夹,并手动在其中新建<code>index.md</code>文件</li></ul><pre class="line-numbers language-console" data-language="console"><code class="language-console">$ cd <hexo_path>/source$ mkdir <new_folder>$ cd <new_folder>$ vim index.md<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span></span></code></pre><p>随后在<code><hexo_path>/_config.mashiro.yml</code>中的<code>menu</code>字段进行配置:<code><display_name>: /<new_page_name></code></p><ol start="2"><li><code>index.md</code>文件的书写规范</li></ol><p>⚠️ 仅有在<code><hexo_path>/source/_post</code>中的文档,其中的<code>tag</code>字段可以被正确索引,其他目录中若包含<code>tag</code>字段则会导致部署失败。</p><ol start="3"><li>普通markdown文件的书写规范</li></ol><p>文件头示例如下,后续的文章标题从<strong>二级目录</strong>开始书写。</p><pre class="line-numbers language-yaml" data-language="yaml"><code class="language-yaml"><span class="token punctuation">---</span><span class="token key atrule">title</span><span class="token punctuation">:</span> Some Notes<span class="token key atrule">updated</span><span class="token punctuation">:</span> <span class="token datetime number">2022-04-12 20:00:00</span><span class="token key atrule">date</span><span class="token punctuation">:</span> <span class="token datetime number">2022-04-12 16:07:32</span><span class="token key atrule">tag</span><span class="token punctuation">:</span> <span class="token punctuation">-</span> note<span class="token punctuation">-</span> 5G<span class="token punctuation">-</span> 4G<span class="token punctuation">-</span> 5G NR<span class="token punctuation">---</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="4"><li><p>在适当的位置添加<code><--! 
more --></code>以控制文章在首页显示的摘要内容。</p></li><li><p>文档内部引用</p></li></ol><p>⚠️ 需要参考<code><hexo_path>/public</code>中生成的文档结构。一般的,<code><hexo_path>/source/_post</code>中<code><dir_name>/<file_name>.md</code>文档将存储在<code><hexo_path>/public/uncategorized/<dir_name>/<file_name>/index.html</code></p><p>⚠️⚠️ hexo现已配置为无须<code>/index.html</code>后缀,文章内引用请使用<code>/uncategorized/<dir_name>/<file_name></code></p><h3 id="页面部署"><a href="#页面部署" class="headerlink" title="页面部署"></a>页面部署</h3><ol><li>页面本地测试</li></ol><p>访问<a href="https://localhost:4000以测试">https://localhost:4000以测试</a></p><pre class="line-numbers language-console" data-language="console"><code class="language-console">$ hexo clean && hexo g && hexo s<span aria-hidden="true" class="line-numbers-rows"><span></span></span></code></pre><ol start="2"><li>页面部署</li></ol><p>官方方法:代码如下。注意<code>font-spider</code>那一行,您的<code>public</code>文件夹中最深的html文件嵌套了几层,就应当在后面写几层的通配符,可以将上面这些语句保存为一个脚本文件,部署时运行一下就行了。</p><pre class="line-numbers language-console" data-language="console"><code class="language-console">$ hexo clean$ hexo g$ font-spider public/*.html public/*/*.html public/*/*/*.html public/*/*/*/*.html public/*/*/*/*/*.html$ hexo deploy<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span></span></code></pre><p>⚠️⚠️ 上述代码已写入<code><hexo_path>/mydeploy.sh</code>,运行<code>sudo ./mydeploy.sh</code></p><h2 id="基于模版的小修改"><a href="#基于模版的小修改" class="headerlink" title="基于模版的小修改"></a>基于模版的小修改</h2><h3 id="添加文章更新时间-并按更新时间倒序排列文章"><a href="#添加文章更新时间-并按更新时间倒序排列文章" class="headerlink" title="添加文章更新时间, 并按更新时间倒序排列文章"></a>添加文章更新时间, 并按更新时间倒序排列文章</h3><p>本站模版中没有内置更新时间的显示,需要按需对模版做以下修改</p><h4 id="更改Hexo策略"><a href="#更改Hexo策略" class="headerlink" title="更改Hexo策略"></a>更改Hexo策略</h4><ol><li>调整Hexo中post的默认生成格式:在<code>{hexo_path}/scaffolds/post.md</code>中指定默认模版样式</li></ol><pre class="line-numbers language-yaml" data-language="yaml"><code class="language-yaml"><span class="token punctuation">---</span><span class="token key atrule">title</span><span class="token punctuation">:</span> <span class="token punctuation">{</span><span class="token punctuation">{</span> title <span class="token punctuation">}</span><span class="token punctuation">}</span><span class="token key atrule">date</span><span class="token punctuation">:</span> <span class="token punctuation">{</span><span class="token punctuation">{</span> date <span class="token punctuation">}</span><span class="token punctuation">}</span><span class="token key atrule">updated</span><span class="token punctuation">:</span> <span class="token punctuation">{</span><span class="token punctuation">{</span> date <span class="token punctuation">}</span><span class="token punctuation">}</span><span class="token punctuation">---</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="2"><li>开启<code>updated</code>排序: 在<code>.md</code>文件的头部加入<code>updated字段</code>, 并完善该字段</li><li>调整 Hexo 主配置文件: 在<code><hexo_path>/_config.yml</code>中更新文章排序为<u>按照更新时间排序</u></li></ol><pre class="line-numbers language-yaml" data-language="yaml"><code class="language-yaml"><span class="token comment"># Home page setting</span><span class="token comment"># path: Root path for your blogs index page. (default = '')</span><span class="token comment"># per_page: Posts displayed per page. (0 = disable pagination)</span><span class="token comment"># order_by: Posts order. 
(Order by date descending by default)</span><span class="token key atrule">index_generator</span><span class="token punctuation">:</span> <span class="token key atrule">path</span><span class="token punctuation">:</span> <span class="token string">''</span> <span class="token key atrule">per_page</span><span class="token punctuation">:</span> <span class="token number">10</span> <span class="token key atrule">order_by</span><span class="token punctuation">:</span> <span class="token punctuation">-</span>updated<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><h4 id="更改模版样式"><a href="#更改模版样式" class="headerlink" title="更改模版样式"></a>更改模版样式</h4><p>原主题默认没有更新时间显示的,所以需要自己新增更新时间显示。记主题文件夹为<code>{theme_path}</code>, 一般位于<code>{hexo_path}/themes/{theme_name}</code></p><ol><li>确定修改范围</li></ol><p>查看了<code>{theme_path}/layout</code>文件夹下的<code>post.ejs</code>文件,里面引入的是 <code>_partial/article</code>的内容。找到<code>{theme_path}/layout/_partial</code>文件夹下的 <code>article.ejs</code>。其中关于时间的内容如下,说明日期部分引用了<code>post/date</code>,样式文件为<code>article-date</code></p><pre class="line-numbers language-javascript" data-language="javascript"><code class="language-javascript"><span class="token operator"><</span>div <span class="token keyword">class</span><span class="token operator">=</span><span class="token string">"article-meta"</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">%</span><span class="token operator">-</span> <span class="token function">partial</span><span class="token punctuation">(</span><span class="token string">'post/date'</span><span class="token punctuation">,</span> <span class="token punctuation">{</span><span class="token literal-property property">class_name</span><span class="token operator">:</span> <span class="token string">'article-date'</span><span class="token punctuation">,</span> <span class="token literal-property property">date_format</span><span class="token operator">:</span> <span class="token keyword">null</span><span class="token punctuation">}</span><span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">%</span><span class="token operator">-</span> <span class="token function">partial</span><span class="token punctuation">(</span><span class="token string">'post/category'</span><span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">/</span>div<span class="token operator">></span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span></span></code></pre><p>修改范围即为:</p><ul><li><code>{theme_path}/layout/_partial/article.ejs</code></li><li><code>{theme_path}/layout/_partial/post/date.ejs</code></li><li><code>{theme_path}/source/css/_partial/article.styl</code></li></ul><ol start="2"><li>新建<code>updated.ejs</code></li></ol><p>考虑兼容性,模仿<code>{theme_path}/layout/_partial/post/date.ejs</code>新建一个<code>updated.ejs</code>文件。</p><p>修改<code>date.ejs</code>如下,其中仅添加了<code><%= __('published') %></code>字段,用于定义语言。</p><pre class="line-numbers language-javascript" data-language="javascript"><code class="language-javascript"><span class="token operator"><</span>a href<span class="token operator">=</span><span class="token 
string">"<%- url_for(post.path) %>"</span> <span class="token keyword">class</span><span class="token operator">=</span><span class="token string">"<%= class_name %>"</span><span class="token operator">></span> <span class="token operator"><</span><span class="token operator">%=</span> <span class="token function">__</span><span class="token punctuation">(</span><span class="token string">'published'</span><span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span> <span class="token operator"><</span>time <span class="token keyword">class</span><span class="token operator">=</span><span class="token string">"dt-published"</span> datetime<span class="token operator">=</span><span class="token string">"<%= date_xml(post.date) %>"</span> itemprop<span class="token operator">=</span><span class="token string">"datePublished"</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">%=</span> <span class="token function">date</span><span class="token punctuation">(</span>post<span class="token punctuation">.</span>date<span class="token punctuation">,</span> date_format<span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">/</span>time<span class="token operator">></span><span class="token operator"><</span><span class="token operator">/</span>a<span class="token operator">></span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span></span></code></pre><p>新建相似的<code>updated.ejs</code>如下,其中定义了<code><%= __('updated') %></code>字段,用于定义语言。</p><pre class="line-numbers language-javascript" data-language="javascript"><code class="language-javascript"><span class="token operator"><</span>a href<span class="token operator">=</span><span class="token string">"<%- url_for(post.path) %>"</span> <span class="token keyword">class</span><span class="token operator">=</span><span class="token string">"<%= class_name %>"</span><span class="token operator">></span> <span class="token operator"><</span><span class="token operator">%=</span> <span class="token function">__</span><span class="token punctuation">(</span><span class="token string">'updated'</span><span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span> <span class="token operator"><</span>time <span class="token keyword">class</span><span class="token operator">=</span><span class="token string">"dt-published"</span> datetime<span class="token operator">=</span><span class="token string">"<%= date_xml(post.updated) %>"</span> itemprop<span class="token operator">=</span><span class="token string">"dateUpdated"</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">%=</span> <span class="token function">date</span><span class="token punctuation">(</span>post<span class="token punctuation">.</span>updated<span class="token punctuation">,</span> date_format<span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">/</span>time<span class="token operator">></span><span class="token operator"><</span><span class="token operator">/</span>a<span class="token operator">></span><span aria-hidden="true" 
class="line-numbers-rows"><span></span><span></span><span></span><span></span></span></code></pre><ol start="3"><li>增加字段语言定义</li></ol><p>由于<code>date.ejs</code>和<code>updated.ejs</code>中分别引入了<code>published</code>和<code>updated</code>字段,因此需要在语言文件中新增对应的字段。<br>语言文件在 <code>{theme_path}/languages</code>中,按照主题的语言设置,修改对应的语言文件,没指定就修改<code>default.yml</code>文件。</p><p>在<code>{theme_path}/languages/zh-CN.yml</code>中增加下述内容,格式为<code>字段名: 字段值</code></p><pre class="line-numbers language-yaml" data-language="yaml"><code class="language-yaml"><span class="token key atrule">published</span><span class="token punctuation">:</span> 发布于<span class="token key atrule">updated</span><span class="token punctuation">:</span> 更新于<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span></span></code></pre><ol start="4"><li>修改<code>article.ejs</code></li></ol><p>该文件中记录了文章日期列的显示内容,在已有的发布时间后增加更新时间。</p><pre class="line-numbers language-javascript" data-language="javascript"><code class="language-javascript"><span class="token operator"><</span>div <span class="token keyword">class</span><span class="token operator">=</span><span class="token string">"article-meta"</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">%</span><span class="token operator">-</span> <span class="token function">partial</span><span class="token punctuation">(</span><span class="token string">'post/date'</span><span class="token punctuation">,</span> <span class="token punctuation">{</span><span class="token literal-property property">class_name</span><span class="token operator">:</span> <span class="token string">'article-date'</span><span class="token punctuation">,</span> <span class="token literal-property property">date_format</span><span class="token operator">:</span> <span class="token keyword">null</span><span class="token punctuation">}</span><span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">%</span><span class="token operator">-</span> <span class="token function">partial</span><span class="token punctuation">(</span><span class="token string">'post/updated'</span><span class="token punctuation">,</span> <span class="token punctuation">{</span><span class="token literal-property property">class_name</span><span class="token operator">:</span> <span class="token string">'article-date'</span><span class="token punctuation">,</span> <span class="token literal-property property">date_format</span><span class="token operator">:</span> <span class="token keyword">null</span><span class="token punctuation">}</span><span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">%</span><span class="token operator">-</span> <span class="token function">partial</span><span class="token punctuation">(</span><span class="token string">'post/category'</span><span class="token punctuation">)</span> <span class="token operator">%</span><span class="token operator">></span><span class="token operator"><</span><span class="token operator">/</span>div<span class="token operator">></span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol 
start="5"><li>对日期的样式进行微调</li></ol><p>由于发布时间于更新时间的间隔太小,影响美观,因此在<code>{theme_path}/source/css/_partial/article.styl</code>中找到样式<code>article-date</code>,进行样式微调。</p><p>原样式:</p><pre class="line-numbers language-css" data-language="css"><code class="language-css">.article-date @extend $block-caption <span class="token property">float</span><span class="token punctuation">:</span> left<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span></span></code></pre><p>在样式后增加一个空格,更新为</p><pre class="line-numbers language-css" data-language="css"><code class="language-css">.article-date @extend $block-caption <span class="token property">float</span><span class="token punctuation">:</span> left &<span class="token punctuation">:</span>after <span class="token property">content</span><span class="token punctuation">:</span> <span class="token string">"\00a0"</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span></span></code></pre><p>©️ 本节内容部分参考 <a href="https://blog.vanxnf.top/2018/09/03/Hexo-%E5%8D%9A%E5%AE%A2-Hiker-%E4%B8%BB%E9%A2%98%E5%A2%9E%E5%8A%A0%E6%96%87%E7%AB%A0%E6%9C%80%E5%90%8E%E7%BC%96%E8%BE%91%E6%97%B6%E9%97%B4%EF%BC%8C%E5%B9%B6%E6%8C%89%E7%85%A7%E6%9C%80%E5%90%8E%E7%BC%96%E8%BE%91%E6%97%B6%E9%97%B4%E6%8E%92%E5%BA%8F/">link</a></p><h2 id="参考"><a href="#参考" class="headerlink" title="参考"></a>参考</h2><ol><li>本站使用的模版:<a href="https://github.com/bill-xia/hexo-theme-mashiro">hexo-theme-mashiro</a></li><li>TODO:站内搜索 <a href="https://liam.page/2017/09/21/local-search-engine-in-Hexo-site/">link</a></li></ol>]]></content>
<tags>
<tag> note </tag>
<tag> hexo </tag>
</tags>
</entry>
<entry>
<title>Hotpaper in 2021-2022 (Root Cause, Correlation)</title>
<link href="/uncategorized/paperlistfile/Hotpaper%20in%202021-2022%20(Root%20Cause,%20Correlation)/"/>
<url>/uncategorized/paperlistfile/Hotpaper%20in%202021-2022%20(Root%20Cause,%20Correlation)/</url>
<content type="html"><![CDATA[<p>本文整理2021-2022的关注的各大主题的热门论文。记得读后及时做笔记!</p><span id="more"></span><h2 id="Root-Cause"><a href="#Root-Cause" class="headerlink" title="Root Cause"></a>Root Cause</h2><table><thead><tr><th>title</th><th>authors</th><th>citation</th><th>conf</th><th>year</th><th>is read</th><th>note</th><th>the point</th></tr></thead><tbody><tr><td>Diagnosing Root Causes of Intermittent Slow Queries in Large-Scale Cloud Databases</td><td>Minghua Ma,Zheng Yin,Shenglin Zhang,Sheng Wang,Christopher Zheng,Xinhao Jiang,Hanwen Hu,Cheng Luo,Yilin Li,Nengjun Qiu,Feifei Li,Changcheng Chen,Dan Pei</td><td>42</td><td>VLDB</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Causal structure-based root cause analysis of outliers</td><td>Kailash Budhathoki,Lenon Minorics,Patrick Blöbaum,Dominik Janzing</td><td>10</td><td>ICML</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>CloudRCA: A Root Cause Analysis Framework for Cloud Computing Platforms</td><td>Yingying Zhang,Zhengxiong Guan,Huajie Qian,Leili Xu,Hengbo Liu,Qingsong Wen,Liang Sun,Junwei Jiang,Lunting Fan,Min Ke</td><td>8</td><td>CIKM</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>System Deterioration Detection and Root Cause Learning on Time Series Graphs</td><td>Hao Huang,Shinjae Yoo,Yunwen Xu</td><td>2</td><td>CIKM</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>ICASSP-SPGC 2022: Root Cause Analysis for Wireless Network Fault Localization</td><td>Tianjian Zhang,Qian Chen,Yi Jiang,Dandan Miao,Feng Yin,Tao Quan,Qingjiang Shi,Zhi-Quan Luo</td><td>1</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Revelio: ML-Generated Debugging Queries for Finding Root Causes in Distributed Systems</td><td>Pradeep Dogga,Karthik Narasimhan,Anirudh Sivaraman,Shiv Saini,George Varghese,Ravi Netravali</td><td>1</td><td>MLSYS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Causal Alignment Based Fault Root Causes Localization for Wireless Network</td><td>Yuequn Liu,Wenhui Zhu,Jie Qiao,Zhiyi Huang,Yu Xiang,Xuanzhi Chen,Wei Chen,Ruichu Cai</td><td>0</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Three-Stage Root Cause Analysis for Logistics Time Efficiency via Explainable Machine Learning</td><td>Shiqi Hao,Yang Liu,Yu Wang,Yuan Wang,Wenming Zhe</td><td>0</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>CMMD: Cross-Metric Multi-Dimensional Root Cause Analysis</td><td>Shifu Yan,Caihua Shan,Wenyi Yang,Bixiong Xu,Dongsheng Li,Lili Qiu,Jie Tong,Qi Zhang</td><td>0</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Root Cause Analysis of Failures in Microservices through Causal Discovery</td><td>Muhammad Azam Ikram,Sarthak Chakraborty,Subrata Mitra,Shiv Saini,Saurabh Bagchi,Murat Kocaoglu</td><td>0</td><td>NIPS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Causal Inference-Based Root Cause Analysis for Online Service Systems with Intervention Recognition</td><td>Mingjie Li,Zeyan Li,Kanglin Yin,Xiaohui Nie,Wenchi Zhang,Kaixin Sui,Dan Pei</td><td>-1</td><td>KDD</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>ExplainIt! 
- A Declarative Root-cause Analysis Engine for Time Series Data</td><td>Vimalkumar Jeyakumar,Omid Madani,Ali Parandeh,Ashutosh Kulshreshtha,Weifei Zeng,Navindra Yadav</td><td>-1</td><td>SIGMOD</td><td>2019</td><td></td><td></td><td></td></tr></tbody></table><h2 id="Correlation-analysis"><a href="#Correlation-analysis" class="headerlink" title="Correlation analysis"></a>Correlation analysis</h2><table><thead><tr><th>title</th><th>authors</th><th>citation</th><th>conf</th><th>year</th><th>is read</th><th>note</th><th>the point</th></tr></thead><tbody><tr><td>A Survey on Canonical Correlation Analysis</td><td>Xinghao Yang,Weifeng Liu,Wei Liu,Dacheng Tao</td><td>76</td><td>TKDE</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>From Canonical Correlation Analysis to Self-supervised Graph Neural Networks</td><td>Hengrui Zhang,Qitian Wu,Junchi Yan,David Wipf,Philip S Yu</td><td>36</td><td>NIPS</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>Large-Scale Sparse Kernel Canonical Correlation Analysis</td><td>Viivi Uurtio,Sahely Bhadra,Juho Rousu</td><td>28</td><td>ICML</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Cross-Modal Subspace Clustering via Deep Canonical Correlation Analysis</td><td>Quanxue Gao,Huanhuan Lian,Qianqian Wang,Gan Sun</td><td>25</td><td>AAAI</td><td>2020</td><td></td><td></td><td></td></tr><tr><td>Stochastic Canonical Correlation Analysis</td><td>Chao Gao,Dan Garber,Nathan Srebro,Jialei Wang,Weiran Wang</td><td>23</td><td>JMLR</td><td>2020</td><td></td><td></td><td></td></tr><tr><td>Multi-Modal Sentiment Analysis Using Deep Canonical Correlation Analysis</td><td>Zhongkai Sun,Prathusha Kameswara Sarma,William A. Sethares,Erik P. Bucy</td><td>21</td><td>INTERSPEECH</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion</td><td>Yiqi Zhong,Cho-Ying Wu,Suya You,Ulrich Neumann</td><td>19</td><td>NIPS</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>A Comparative Study of Deep Neural Network-Aided Canonical Correlation Analysis-Based Process Monitoring and Fault Detection Methods</td><td>Zhiwen Chen,Ketian Liang,Steven X. Ding,Chao Yang,Tao Peng,Xiaofeng Yuan</td><td>12</td><td>TNNLS</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Tensor Canonical Correlation Analysis Networks for Multi-View Remote Sensing Scene Recognition</td><td>Xinghao Yang,Weifeng Liu,Wei Liu</td><td>9</td><td>TKDE</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Multiview Canonical Correlation Analysis over Graphs</td><td>Jia Chen,Gang Wang,Georgios B. 
Giannakis</td><td>6</td><td>ICASSP</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>An Online Riemannian PCA for Stochastic Canonical Correlation Analysis</td><td>Zihang Meng,Rudrasis Chakraborty,Vikas Singh</td><td>6</td><td>NIPS</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>Estimating Viewed Image Categories from Human Brain Activity via Semi-supervised Fuzzy Discriminative Canonical Correlation Analysis</td><td>Yusuke Akamatsu,Ryosuke Harakawa,Takahiro Ogawa,Miki Haseyama</td><td>4</td><td>ICASSP</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Grayscale-thermal Tracking via Canonical Correlation Analysis Based Inverse Sparse Representation</td><td>Wan Ding,Bin Kang,Quan Zhou,Min Lin,Suofei Zhang</td><td>4</td><td>ICASSP</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Multiview Variational Graph Autoencoders for Canonical Correlation Analysis</td><td>Yacouba Kaloga,Pierre Borgnat,Sundeep Prabhakar Chepuri,Patrice Abry,Amaury Habrard</td><td>4</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>Deep Probabilistic Canonical Correlation Analysis</td><td>Mahdi Karami,Dale Schuurmans</td><td>3</td><td>AAAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>Deep Multiway Canonical Correlation Analysis For Multi-Subject Eeg Normalization</td><td>Jaswanth Reddy Katthi,Sriram Ganapathy</td><td>3</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>Longitudinal Correlation Analysis for Decoding Multi-modal Brain Development</td><td>Qingyu Zhao,Ehsan Adeli,Kilian M. Pohl</td><td>3</td><td>MICCAI</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>Graph Wasserstein Correlation Analysis for Movie Retrieval</td><td>Xueya Zhang,Tong Zhang,Xiaobin Hong,Zhen Cui,Jian Yang</td><td>2</td><td>ECCV</td><td>2020</td><td></td><td></td><td></td></tr><tr><td>Towards Cross-modality Topic Modelling via Deep Topical Correlation Analysis</td><td>Jun Peng,Yiyi Zhou,Liujuan Cao,Xiaoshuai Sun,Jinsong Su,Rongrong Ji</td><td>2</td><td>ICASSP</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Supervised Canonical Correlation Analysis of Data on Symmetric Positive Definite Manifolds by Riemannian Dimensionality Reduction</td><td>Faezeh Fallah,Bin Yang</td><td>2</td><td>ICASSP</td><td>2020</td><td></td><td></td><td></td></tr><tr><td>Generalized Autocorrelation Analysis for Multi-Target Detection</td><td>Ye’Ela Shalit,Ran Weber,Asaf Abas,Shay Kreymer,Tamir Bendory</td><td>2</td><td>ICASSP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Distributed Differentially-private Canonical Correlation Analysis</td><td>Hafiz Imtiaz,Anand D. Sarwate</td><td>1</td><td>ICASSP</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Discriminative Feature Selection Guided Deep Canonical Correlation Analysis</td><td>Nour El-Din El-Madany,Yifeng He,Ling Guan</td><td>1</td><td>ICASSP</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Blind Carbon Copy on Dirty Paper: Seamless Spectrum Underlay via Canonical Correlation Analysis</td><td>Mohamed Salah Ibrahim,Nicholas D. Sidiropoulos</td><td>1</td><td>ICASSP</td><td>2021</td><td></td><td></td><td></td></tr><tr><td>Multisubject Task-Related fMRI Data Processing via a Two-Stage Generalized Canonical Correlation Analysis</td><td>Paris A. Karakasis,Athanasios P. Liavas,Nicholas D. Sidiropoulos,Panagiotis G. 
Simos,Efrosini Papadaki</td><td>1</td><td>TIP</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>The Magnitude and Phase based Speech Representation Learning using Autoencoder for Classifying Speech Emotions using Deep Canonical Correlation Analysis</td><td>Ashishkumar Prabhakar Gudmalwar,Biplove Basel,Anirban Dutta,Ch V. Rama Rao</td><td>0</td><td>INTERSPEECH</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>Combining Multiple Behavioral Measures and Multiple Connectomes via Multipath Canonical Correlation Analysis</td><td>Siyuan Gao,Xilin Shen,R. Todd Constable,Dustin Scheinost</td><td>0</td><td>MICCAI</td><td>2019</td><td></td><td></td><td></td></tr><tr><td>Time and Memory Efficient Large-Scale Canonical Correlation Analysis in Fourier Domain</td><td>Xiang-Jun Shen,Zhaorui Xu,Liangjun Wang,Zechao Li</td><td>0</td><td>MM</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>A Self-Consistent-Field Iteration for Orthogonal Canonical Correlation Analysis</td><td>Lei-Hong Zhang,Li Wang,Zhaojun Bai,Ren-Cang Li</td><td>0</td><td>TPAMI</td><td>2022</td><td></td><td></td><td></td></tr><tr><td>L0-Sparse Canonical Correlation Analysis</td><td>Ofir Lindenbaum,Moshe Salhov,Amir Averbuch,Yuval Kluger</td><td>-1</td><td>ICLR</td><td>2022</td><td></td><td></td><td></td></tr></tbody></table>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Related Papers in ICLR 2022 (2022.04.25)</title>
<link href="/uncategorized/paperlistfile/ICLR2022/"/>
<url>/uncategorized/paperlistfile/ICLR2022/</url>
<content type="html"><![CDATA[<p><a href="https://openreview.net/group?id=ICLR.cc/2022/Conference">Accept paper list</a></p><span id="more"></span><h2 id="anomaly-detection-anomaly-outlier-out-of-distribution-one-class"><a href="#anomaly-detection-anomaly-outlier-out-of-distribution-one-class" class="headerlink" title="anomaly detection [anomaly, outlier, out-of-distribution, one-class]"></a>anomaly detection [anomaly, outlier, out-of-distribution, one-class]</h2><ul><li><p>Anomaly Detection for Tabular Data with Internal Contrastive Learning</p><p>Tom Shenkar, Lior Wolf</p><p><strong>摘要</strong>: 我们考虑在表格数据中寻找类外样本的任务,其中几乎不能假设数据的结构。<br>为了捕捉单个训练类样本的结构,我们学习了最大化每个样本与被屏蔽部分之间的互信息的映射。通过使用对比损失来学习映射,该损失一次只考虑一个样本。一旦学习,我们可以通过使用该样本的掩码部分测量学习的映射是否导致小的对比损失来对测试样本进行评分。我们的实验表明,与文献相比,我们的方法存在相当大的准确性差距,并且相同的默认超参数集在基准测试中提供了最先进的结果。</p><p><strong>一句话总结</strong>: 一种基于预测向量中被屏蔽部分的能力的异常检测方法。</p></li><li><p>Igeood: An Information Geometry Approach to Out-of-Distribution Detection </p><p>Eduardo Dadalto Camara Gomes, Florence Alberge, Pierre Duhamel, Pablo Piantanida</p><p><strong>摘要</strong>:可靠的分布外 (OOD) 检测是实现更安全的现代机器学习 (ML) 系统的基础。在本文中,我们介绍了 Igeood,一种检测 OOD 样本的有效方法。Igeood 适用于任何预训练的神经网络,在对 ML 模型的不同程度的访问下工作,不需要 OOD 样本或对 OOD 数据的假设,但也可以从 OOD 样本中受益(如果有的话)。通过建立基础数据分布之间的测地线(Fisher-Rao)距离,我们的鉴别器可以结合来自 logits 输出的置信度分数和深度神经网络的学习特征。根据经验,我们表明 Igeood 在各种网络架构和数据集上优于竞争的最先进方法。</p><p><strong>一句话总结</strong>: 我们通过建立概率分布之间的 Fisher-Rao 距离,提出了一种灵活有效的分布外检测方法。</p></li></ul><ul><li><p>VOS: Learning What You Don’t Know by Virtual Outlier Synthesis </p><p>Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li</p><p><strong>摘要</strong>: 由于其在神经网络的安全部署中的重要性,分布外(OOD)检测最近受到了很多关注。主要挑战之一是模型缺乏来自未知数据的监督信号,因此可能会对 OOD 数据产生过度自信的预测。以前的方法依赖于真正的异常数据集来进行模型正则化,这在实践中可能代价高昂,有时甚至不可行。在本文中,我们提出了 VOS,这是一种新的 OOD 检测框架,通过自适应合成虚拟异常值,可以在训练期间有意义地规范模型的决策边界。具体来说,VOS 从特征空间中估计的类条件分布的低似然区域采样虚拟异常值。此外,我们引入了一个新颖的未知感知训练目标,它对比地塑造了 ID 数据和合成异常值数据之间的不确定性空间。VOS 在物体检测和图像分类模型上都取得了具有竞争力的性能,与之前在物体检测器上的最佳方法相比,FPR95 降低了高达 7.87%。代码可在 <a href="https://github.com/deeplearning-wisc/vos">https://github.com/deeplearning-wisc/vos</a> 获得。</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting </p><p>Shizhan Liu, Hang Yu, Cong Liao, Jianguo Li, Weiyao Lin, Alex X. 
Liu, Schahram Dustdar</p></li><li><p>CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting </p><p>Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi</p></li><li><p>Huber Additive Models for Non-stationary Time Series Analysis </p><p>Yingjie Wang, Xianrui Zhong, Fengxiang He, Hong Chen, Dacheng Tao</p></li><li><p>DEPTS: Deep Expansion Learning for Periodic Time Series Forecasting </p><p>Wei Fan, Shun Zheng, Xiaohan Yi, Wei Cao, Yanjie Fu, Jiang Bian, Tie-Yan Liu</p></li><li><p>Reversible Instance Normalization for Accurate Time-Series Forecasting against Distribution Shift </p><p>Taesung Kim, Jinhee Kim, Yunwon Tae, Cheonbok Park, Jang-Ho Choi, Jaegul Choo</p></li><li><p>Omni-Scale CNNs: a simple and effective kernel size configuration for time series classification </p><p>Wensi Tang, Guodong Long, Lu Liu, Tianyi Zhou, Michael Blumenstein, Jing Jiang</p></li><li><p>T-WaveNet: A Tree-Structured Wavelet Neural Network for Time Series Signal Analysis </p><p>Minhao LIU, Ailing Zeng, Qiuxia LAI, Ruiyuan Gao, Min Li, Jing Qin, Qiang Xu</p></li><li><p>Graph-Guided Network for Irregularly Sampled Multivariate Time Series </p><p>Xiang Zhang, Marko Zeman, Theodoros Tsiligkaridis, Marinka Zitnik</p></li><li><p>Heteroscedastic Temporal Variational Autoencoder For Irregularly Sampled Time Series </p><p>Satya Narayan Shukla, Benjamin Marlin</p></li><li><p>Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks </p><p>Andrea Cini, Ivan Marisca, Cesare Alippi</p></li></ul><ul><li><p>Coherence-based Label Propagation over Time Series for Accelerated Active Learning </p><p>Yooju Shin, Susik Yoon, Sundong Kim, Hwanjun Song, Jae-Gil Lee, Byung Suk Lee</p></li><li><p>PSA-GAN: Progressive Self Attention GANs for Synthetic Time Series </p><p>Paul Jeha, Michael Bohlke-Schneider, Pedro Mercado, Shubham Kapoor, Rajbir Singh Nirwan, Valentin Flunkert, Jan Gasthaus, Tim Januschowski</p></li></ul><h2 id="Sequence-learning"><a href="#Sequence-learning" class="headerlink" title="Sequence learning"></a>Sequence learning</h2><ul><li><p>Efficiently Modeling Long Sequences with Structured State Spaces </p><p>Albert Gu, Karan Goel, Christopher Re</p></li><li><p>Long Expressive Memory for Sequence Modeling </p><p>T. Konstantin Rusch, Siddhartha Mishra, N. Benjamin Erichson, Michael W. Mahoney</p></li><li><p><strong>【需要看看】</strong> On the approximation properties of recurrent encoder-decoder architectures<br>Zhong Li, Haotian Jiang, Qianxiao Li</p><p><strong>摘要</strong>: 编码器-解码器架构最近在序列到序列建模方面获得了普及,在最先进的模型(如转换器)中具有特色。然而,对其工作原理的数学理解仍然有限。在本文中,我们研究了循环编码器-解码器架构的近似特性。先前的工作为线性设置中的 RNN 建立了理论结果,其中近似能力可能与目标时间关系的平滑度和记忆有关。在这里,我们发现编码器和解码器一起形成了一个特定的“时间积结构”,它决定了逼近效率。此外,编码器-解码器架构泛化了具有学习时间非均匀关系的能力的 RNN。</p><p><strong>一句话总结</strong>: 给出了循环编码器-解码器架构的近似属性,其中形成的时间积结构进一步表征了能够有效学习<br>的时间关系。</p></li><li><p>Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification </p><p>Bing Su, Ji-Rong Wen</p></li></ul><h1 id="interpretable-interpretability"><a href="#interpretable-interpretability" class="headerlink" title="interpretable/interpretability"></a>interpretable/interpretability</h1><ul><li><p>Do Users Benefit From Interpretable Vision? A User Study, Baseline, And Dataset </p><p>Leon Sixt, Martin Schuessler, Oana-Iuliana Popescu, Philipp Weiß, Tim Landgraf</p></li><li><p>Toward Faithful Case-based Reasoning through Learning Prototypes in a Nearest Neighbor-friendly Space. 
</p><p>Seyed Omid Davoudi, Majid Komeili</p></li><li><p>Explaining Point Processes by Learning Interpretable Temporal Logic Rules </p><p>Shuang Li, Mingquan Feng, Lu Wang, Abdelmajid Essofi, Yufeng Cao, Junchi Yan, Le Song</p></li><li><p>Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions </p><p>Arda Sahiner, Tolga Ergen, Batu Ozturkler, Burak Bartan, John M. Pauly, Morteza Mardani, Mert Pilanci</p></li><li><p>Model Agnostic Interpretability for Multiple Instance Learning </p><p>Joseph Early, Christine Evers, Sarvapali Ramchurn</p></li><li><p>POETREE: Interpretable Policy Learning with Adaptive Decision Trees </p><p>Alizée Pace, Alex Chan, Mihaela van der Schaar</p></li><li><p>NODE-GAM: Neural Generalized Additive Model for Interpretable Deep Learning </p><p>Chun-Hao Chang, Rich Caruana, Anna Goldenberg</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Related Papers in AAAI 2022 (2022.2.22)</title>
<link href="/uncategorized/paperlistfile/AAAI2022/"/>
<url>/uncategorized/paperlistfile/AAAI2022/</url>
<content type="html"><![CDATA[<p><a href="https://aaai.org/Conferences/AAAI-22/wp-content/uploads/2021/12/AAAI-22_Accepted_Paper_List_Main_Technical_Track.pdf">accept paper list</a></p><p><a href="https://mp.weixin.qq.com/s/N2tQOBnYOuC9ZwHG7PIx4w">时序相关论文一览</a></p><span id="more"></span><h2 id="Anomaly-detection-anomaly-outlier-out-of-distribution-one-class"><a href="#Anomaly-detection-anomaly-outlier-out-of-distribution-one-class" class="headerlink" title="Anomaly detection [anomaly, outlier, out-of-distribution, one-class]"></a>Anomaly detection [anomaly, outlier, out-of-distribution, one-class]</h2><ul><li><p>A Causal Inference Look at Unsupervised Video Anomaly Detection </p><p>Xiangru Lin, Yuyang Chen, Guanbin Li, Yizhou Yu</p></li><li><p>Comprehensive Regularization in a Bi-Directional Predictive Network for Video Anomaly Detection</p><p>Chengwei Chen, Yuan Xie, Shaohui Lin, Angela Yao, Guannan Jiang, Wei Zhang, Yanyun Qu, Ruizhi Qiao, Bo Ren, Lizhuang Ma</p></li></ul><ul><li><p>Towards a Rigorous Evaluation of Time-Series Anomaly Detection</p><p>Siwon Kim, Kukjin Choi, Hyun-Soo Choi, Byunghan Lee, Sungroh Yoon</p></li><li><p>Unsupervised Anomaly Detection by Robust Density Estimation</p><p>Boyang Liu, Pang-Ning Tan, Jiayu Zhou</p></li></ul><ul><li><p>Self-Training Multi-Sequence Learning with Transformer for Weakly Supervised Video Anomaly Detection</p><p>Shuo Li, Fang Liu, Licheng Jiao</p></li><li><p>Transferring the Contamination Factor between Anomaly Detection Domains by Shape Similarity</p><p>Lorenzo Perini, Vincent Vercruyssen, Jesse Davis</p></li></ul><h2 id="Heterogeneous-data"><a href="#Heterogeneous-data" class="headerlink" title="Heterogeneous data"></a>Heterogeneous data</h2><ul><li><p>Heterogeneous Facility Location with Limited Resources</p><p>Argyrios Deligkas, Aris Filos Ratsikas, Alexandros A. Voudouris</p></li><li><p>H^2-MIL: Exploring Hierarchical Representation with Heterogeneous Multiple Instance Learning for Whole Slide Image Analysis</p><p>Wentai Hou, Lequan Yu, Chengxuan Lin, Helong Huang, Rongshan Yu, Jing Qin, Liansheng Wang</p></li><li><p>FedProto: Federated Prototype Learning Across Heterogeneous Clients</p><p>Yue Tan, Guodong Long, Lu Liu, Tianyi Zhou, Qinghua Lu, Jing Jiang, Chengqi Zhang</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Conditional Loss and Deep Euler Scheme for Time Series Generation</p><p>Carl Remlinger, Joseph Mikael, Romuald Elie</p></li></ul><ul><li><p>Training Robust Deep Models for Time-Series Domain: Novel Algorithms and Theoretical Analysis</p><p>Taha Belkhouja, Yan Yan, Janardhan Rao Doppa</p></li></ul><ul><li><p>CATN: Cross Attentive Tree-Aware Network for Multivariate Time Series Forecasting</p><p>Hui He, Qi Zhang, Simeng Bai, Kun Yi, Zhendong Niu</p></li></ul><ul><li>Reinforcement Learning based Dynamic Model Combination for Time Series ForecastingYuwei Fu, Di Wu, Benoit Boulet</li></ul><ul><li><p><strong>【需要看看】</strong> TS2Vec: Towards Universal Representation of Time Series</p><p>Zhihan Yue, Yujing Wang, Juanyong Duan, Tianmeng Yang, Congrui Huang, Yunhai Tong, Bixiong Xu</p></li></ul><ul><li><p><strong>【需要看看】</strong> I-SEA: Importance Sampling and Expected Alignment-Based Deep Distance Metric Learning for Time Series Analysis and Embedding</p><p>Sirisha Rambhatla, Zhengping Che, Yan Liu</p></li></ul><ul><li><p>Clustering Interval-Censored Time-Series for Disease Phenotyping</p><p>Irene Y. Chen, Rahul G. 
Krishnan, David Sontag</p></li><li><p><strong>[To read]</strong> Learning Temporal Point Processes for Efficient Retrieval of Continuous Time Event Sequences</p><p>Vinayak Gupta, Srikanta Bedathur, Abir De</p></li></ul><h2 id="Sequence-learning"><a href="#Sequence-learning" class="headerlink" title="Sequence learning"></a>Sequence learning</h2><ul><li><p>Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models</p><p>Juan Ramirez-Orta, Eduardo Xamena, Ana Maguitman, Evangelos Milios, Axel J. Soto</p></li><li><p>Symbolic Brittleness in Sequence Models: On Systematic Generalization in Symbolic Mathematics</p><p>Sean Welleck, Peter West, Jize Cao, Yejin Choi</p></li></ul><h2 id="interpretable-interpretability"><a href="#interpretable-interpretability" class="headerlink" title="interpretable/interpretability"></a>interpretable/interpretability</h2><ul><li><p>Optimal Local Explainer Aggregation for Interpretable Prediction</p><p>Qiaomei Li, Rachel Cummings, Yonatan Mintz</p></li><li><p>LIMREF: Local Interpretable Model Agnostic Rule-Based Explanations for Forecasting, with an Application to Electricity Smart Meter Data</p><p>Dilini Rajapaksha, Christoph Bergmeir</p></li><li><p>Social Interpretable Tree for Pedestrian Trajectory Prediction</p><p>Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Fang Zheng, Nanning Zheng, Gang Hua</p></li><li><p><strong>[To read]</strong> Adversarial Training for Improving Model Robustness? Look at Both Prediction and Interpretation</p><p>Hanjie Chen, Yangfeng Ji</p></li><li><p>Interpretable Clustering via Multi-Polytope Machines</p><p>Connor Lawless, Jayant Kalagnanam, Lam M. Nguyen, Dzung Phan, Chandra Reddy</p></li></ul><ul><li><p><strong>[To read]</strong> Interpretable Generative Adversarial Networks</p><p>Chao Li, Kelu Yao, Jin Wang, Boyu Diao, Yongjun Xu, Quanshi Zhang</p></li><li><p><strong>[To read]</strong> Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding</p><p>Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan Ö. Arık, Tomas Pfister</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>5G NR (New Radio) Specification</title>
<link href="/uncategorized/notes/5G_NR/"/>
<url>/uncategorized/notes/5G_NR/</url>
<content type="html"><![CDATA[<h2 id="基础概念解析"><a href="#基础概念解析" class="headerlink" title="基础概念解析"></a>基础概念解析</h2><h3 id="LTE与NR"><a href="#LTE与NR" class="headerlink" title="LTE与NR"></a>LTE与NR</h3><ul><li>LTE是4G时代的主要技术,且保留向后兼容的特性</li><li>NR是针对5G的新无线接入技术,全称为New Radio(新空口),NR借用了许多LTE的结构与功能,但作为新的无线接入技术,无需考虑向后兼容的问题</li></ul><span id="more"></span><h3 id="NR标准的制定-以及相关组织概念"><a href="#NR标准的制定-以及相关组织概念" class="headerlink" title="NR标准的制定, 以及相关组织概念"></a>NR标准的制定, 以及相关组织概念</h3><ul><li>ITU-R: 国际电联的无线通信部门</li><li>IMT: 国际移动通信,international mobile telecommunications</li><li>ITU-R WP5D: ITU-R的工作小组,负责IMT系统的无线方面的全部工作;ITU-R WP5D与各个国家与地区的标准化组织合作,对IMT进行定义,维护一系列<strong>IMT建议书</strong>与<strong>报告</strong>、<strong>无线接口规范</strong>(<strong>RSPC</strong>,ratio interface specification)。<ul><li>IMT建议书: 包括每一代IMT无线接口技术(RIT,radio interface technologies)</li><li>RSPC: 对每个RIT作出概述、对详细规范列出引用列表</li></ul></li></ul><h3 id="几个重要的RSPC建议书"><a href="#几个重要的RSPC建议书" class="headerlink" title="几个重要的RSPC建议书"></a>几个重要的RSPC建议书</h3><ul><li>IMT-2000: 包含六个RIT,主要包括WCDMA/HSPA等3G技术</li><li>IMT-Advanced: 包含两个RIT,主要包括4G/LTE技术</li><li>IMT-2020: 新的建议书,在2019-2020制定,主要包含5G技术</li></ul><h3 id="IMT-2020使用场景"><a href="#IMT-2020使用场景" class="headerlink" title="IMT-2020使用场景"></a>IMT-2020使用场景</h3><ul><li>增强的移动宽带通信(enhanced mobile broadband,eMBB): <em>以人为中心的通信场景</em>——延续3G/4G的主要驱动力——无线宽带;新的挑战包括: <strong>热点覆盖</strong>(高速率、高用户密度、高容量需求)、<strong>广域覆盖</strong>(低速率、低用户密度、移动性、无缝用户体验)</li><li>超可靠低时延通信(ultra-reliable and low-latency communications,URLLC): <em>以人、机器为中心的通信场景</em>——特点包括: 低时延、高可靠性、高可用性;典型场景: 3D游戏、触觉互联网</li><li>大规模机器类型通信(massive machine type communications,mMTC): <em>以机器为中心的通信场景</em>——特点包括: 终端规模巨大、数据量小、传输不频繁、延迟不敏感;新的挑战包括: <strong>一个系统中能容纳的总终端数量,以及如何降低终端成本</strong></li></ul><h3 id="IMT-2020能力集"><a href="#IMT-2020能力集" class="headerlink" title="IMT-2020能力集"></a>IMT-2020能力集</h3><p>规范定义了13种能力,其中8种为<strong>关键能力</strong></p><ul><li>关键能力(针对eMBB场景重要的)<ul><li>峰值数据速率(peak data rate): 理论议题,严重依赖频谱资源,$峰值数据速率=系统带宽 * 峰值频谱速率$</li><li>用户体验速率(user experienced data rate): 针对<em>大多数用户(95%)、在大范围内</em>可实现的速率;城区/郊区: 100Mbit/s,室内/热点: 1Gbit/s</li><li>频谱效率(spectrum efficiency): 每单位无线设备的平均数据吞吐量,目标确定为4G的三倍</li><li>区域话务容量(area traffic capacity): 依赖频谱效率、带宽、网络部署密度,$区域话务容量=频谱效率 * 带宽 * TRP密度$</li><li>网络能效(network energy efficiency): 与上代持平</li></ul></li><li>关键能力(其他)<ul><li>时延(latency): 针对URLLC场景重要,时延相比前代减少10倍</li><li>移动性(mobility): 针对URLLC场景重要,目标场景500km/h,同时要求低时延(不要求高用户速率)</li><li>连接密度(connection density): 针对mMTC场景重要,每单位面积可接入的终端总数</li></ul></li><li>其他能力<ul><li>频谱和带宽灵活性(spectrum and bandwidth flexibily): 系统在不同频段上的工作能力</li><li>可靠性(reliability): 服务可用性</li><li><strong>可恢复性(resilience)</strong>: 在自然、人为破坏期间、之后,网络能继续正常运行的能力</li><li>安全与隐私(security and privacy): 数据、信令的加密/完整性保护,拒绝未经授权的跟踪</li><li>运行寿命(operational lifetime): 每单位储存能量的运行时间</li></ul></li></ul><h3 id="IMT-2020性能评估"><a href="#IMT-2020性能评估" class="headerlink" title="IMT-2020性能评估"></a>IMT-2020性能评估</h3><p>典型测试环境</p><ul><li>室内热点(indoor hotspot)-eMBB: 办公室/购物中心的室内隔离环境,针对密度很高的静止人群</li><li>密集市区(dense urban)-eMBB: 高用户密度和流量的城市环境,针对行人/车辆用户</li><li>郊区(Rural)-eMBB: 农村环境,针对大覆盖面积内的行人、车辆、高速车辆</li><li>市区宏站(urban macro)-mMTC: 具有连续覆盖范围的城市宏基站环境,针对大量机器终端</li><li>市区宏站(urban macro)- URLLC: 具有连续覆盖范围的城市宏基站环境,针对超可靠、低时延通信</li></ul><p>每个技术在每个典型测试环境中进行性能评估的三个基本方法</p><ul><li>仿真: 无线接口的系统级、链路级仿真</li><li>分析: 基于无线接口参数的计算,或其他KPI值来评估</li><li>检查: 审核无线接口的功能等</li></ul><h2 id="LTE概述"><a href="#LTE概述" class="headerlink" title="LTE概述"></a>LTE概述</h2><h3 id="LTE的资源配置"><a href="#LTE的资源配置" class="headerlink" 
title="LTE的资源配置"></a>LTE的资源配置</h3><p>LTE在时域上的传输以10ms为一帧(frame),每帧包括10个1ms的子帧(subframe),每个子帧分为两个长度为0.5ms的时隙(slot),<br>每个slot在时域上(在普通CP模式下)分成7个OFDM符号,或(在扩展CP模式下)分成6个OFDM符号,这里的一个OFDM符号是LTE资源调度的最小单元。由此延伸出TTI/RG/RB/RE的概念: </p><ul><li>TTI(transmission time-interval): subframe作为LTE的一个调度时间单位,称为一个TTI<ul><li>时域: 一个subframe(1ms)</li><li>频域: /(这仅仅是个时域定义)</li></ul></li><li>RG(resource grid): 一个slot中的传输信号可以用一个资源格(RG)描述。<ul><li>时域: 一个Slot(0.5ms)</li><li>频域: 全部子载波</li></ul></li><li>RB(Resource block): 一个slot中的每个子载波称为一个资源块(RB)(这可以被视为资源的粗粒度分割方法),<strong>RB是分配资源到UE的最小单位</strong><ul><li>时域: 一个Slot(0.5ms)</li><li>频域: 连续12个子载波</li></ul></li><li>RE(Resource elements): 一个RB中的一个OFDM symbol,其在RG上的位置可由$(k,l)$唯一标注,其中$l$表时域,$k$表频域(这可以被视为资源的细粒度分割方法),<strong>RE时LTE资源调度的最小单位</strong><ul><li>时域: OFDM symbol(slot中的1/7或1/6)</li><li>频域: 一个子载波</li></ul></li><li>RG/RB/RE的关系: 一个RG可在频域上分为多个RB;一个RB可在时域和频域上分为多个RE(如下图所示)。</li></ul><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20220106185157.png" width = "70%" /></div><h3 id="LTE数据传输概述"><a href="#LTE数据传输概述" class="headerlink" title="LTE数据传输概述"></a>LTE数据传输概述</h3><p>由于5G NR在设计时参考了并复用了许多LTE技术构建,此外5G NR与LTE均由3GPP制定,因此在此总结LTE层的相关知识。</p><p>LTE根据不同的帧格式,可以配置帧为FDD或TDD,同时可以对全双工和半双工模式进行配置。在此首先忽略LTE中具体的帧格式,对上下行链路的中的关键信号作出解释。</p><h4 id="物理层上行链路"><a href="#物理层上行链路" class="headerlink" title="物理层上行链路"></a>物理层上行链路</h4><p>用户的上行链路传输包括: </p><ul><li>RS(Reference signals,参考信号): 包括SRS(Sounding Reference Signal,探测参考信号)、DMRS(Demodulation Reference Signal,解调参考信号),参考信号用于信道估计或均衡。<ul><li>DMRS: BS使用UE发送的DMRS来均衡和解调UE的传输</li><li>SRS: 基站了解该UE的上行信道特性。基站可以使用该信息来为UE分配好的上行链路以进行传输</li></ul></li><li>物理信道<ul><li>PUSCH(physical uplink share channel,物理上行共享信道): 该信道传输用户的上行数据,这里的 “共享” 指同一物理信道可由多个用户分时使用,或者说信道具有较短的持续时间。一个UE可以并行存在多条USCH,这些并行的USCH数据可以在物理层进行编码组合。</li><li>PUCCH(physical uplink control channel,物理上行控制信道): 该信道用于承载UCI(uplink control information,上行链路控制信息),PUCCH包括: HARQ ACK/NACK、CQI 信道质量指标、MIMO 反馈 - RI(秩指标),PMI(预编码矩阵指标)、上行链路传输的调度请求、用于 PUCCH 调制的 BPSK 或 QPSK</li><li>PRACH(Physical Random Access Channel,物理随机接入信道): 上行链路用户使用物理随机接入信道(PRACH)来发起与基站的联系。基站广播一些基本的小区信息,包括可以发送随机接入请求的位置。然后,UE 进行 PRACH 传输,请求分配 PUSCH,基站使用下行链路控制信道 (PDCCH) 来回复 UE 可以在何处传输 PUSCH。</li><li>注意: 用户不能在同一个时隙中同时传输PUCCH和PUSCH数据</li></ul></li><li>关于同步: 上行信号没有专用的同步信号。在实际环境中,上行链路信号将使用下行链路信号进行同步。但是,为了在使用 89600 VSA LTE 解调器时分析上行链路与下行链路分离,可以使用 PUCCH DM-RS、PUSCH DM -RS、PRACH 或 SRS同步上行链路帧。</li></ul><h4 id="物理层下行链路"><a href="#物理层下行链路" class="headerlink" title="物理层下行链路"></a>物理层下行链路</h4><p>基站的下行链路传输包括: </p><ul><li>SYN(Synchronization,同步信号): 下行同步信号有两个,主同步信号(P-SS)和辅同步信号(S-SS)</li><li>RS(Reference signals,参考信号),根据不同的帧配置,所发送的参考信号将有所不同: <ul><li>C-RS(Cell specific Reference Signal): BS发送的C-RS被UE用于下行物理信道的信道估计、获取信道状态信息以便信道调度、执行终端测量从而决定UE的初始接入/选择和切换、终端侧频率误差的校正。C-RS在每个下行子帧,整个下行传输带宽内的每个RB上都会发送,无论是否下行链路有数据发送。</li><li>UE-RS(UE specific Reference Signal): BS可以在分配给UE的PDSCH的RB中发送UE-RS。</li><li>P-RS(Positioning Reference Signal,定位参考信号): 用于增强UE地理定位精度</li><li>MBSFN-RS(Multicast/Broadcast Single Frequency Network Reference Signal,组播/广播单频网络参考信号 ): 用于补偿在物理多播信道(下行链路信道的影响PMCH),其中包含的多播/广播数据</li></ul></li><li>物理信道<ul><li>控制信道(Control channels): 控制信道提供管理用户信道上数据传输所需的信息,并促进与基站的连接。这些通道放置在帧中的特定位置。<ul><li>PBCH: 物理广播频道,携带特定于cell的信息。</li><li>PCFICH: 物理控制格式指示通道,包含有关子帧中用于 PDCCH 的 OFDM 符号数量的信息。</li><li>PDCCH: 物理下行控制信道,包含调度信息。</li><li>PHICH: 物理混合ARQ指示通道,携带混合 ARQ ACK / NACK。</li></ul></li><li>共享信道(Shared channel): PDSCH(physical downlink share channel)包含发送给用户的数据。所有资源块都可用于分配,但只有未为控制信道预留的子载波可用于承载数据。</li><li>组播信道(Multicast channel): 物理多播信道 (Physical Multicast Channel,PMCH) 
支持MBMS(Multimedia Broadcast/Multicast Service,多媒体广播/多播服务),并承载供多个用户使用的数据。单个小区(广播)或多个小区(多播)都可以参与传输数据。来自不同小区的信号在UE处汇合,以能够提供更高的信号功率。MBMS 信号在扩展 CP 模式下传输,以减轻由于每个小区到 UE 的距离不同而导致的多径效应。</li></ul></li></ul><p>(后续更新: 5G NR接入流程,竞争机制)</p><hr><h2 id="参考文献"><a href="#参考文献" class="headerlink" title="参考文献"></a>参考文献</h2><p>[1] <a href="https://rfmw.em.keysight.com/wireless/helpfiles/89600B/WebHelp/subsystems/lte/content/lte_overview.htm">https://rfmw.em.keysight.com/wireless/helpfiles/89600B/WebHelp/subsystems/lte/content/lte_overview.htm</a></p>]]></content>
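<p>A quick way to sanity-check the TTI/RG/RB/RE arithmetic described in this note is to count resource elements directly. The short Python sketch below only illustrates the definitions above; the function names are mine, and the 100-RB example is the figure commonly quoted for a 20 MHz LTE carrier:</p><pre><code class="language-python">SYMBOLS_PER_SLOT = {"normal_cp": 7, "extended_cp": 6}  # OFDM symbols per 0.5 ms slot
SUBCARRIERS_PER_RB = 12   # one RB = 12 contiguous subcarriers over one slot
SLOTS_PER_SUBFRAME = 2    # one 1 ms subframe (= one TTI) = 2 slots
SUBFRAMES_PER_FRAME = 10  # one 10 ms frame = 10 subframes

def re_per_rb(cp="normal_cp"):
    """REs in one RB: 12 subcarriers x 7 (normal CP) or 6 (extended CP) symbols."""
    return SUBCARRIERS_PER_RB * SYMBOLS_PER_SLOT[cp]

def re_per_tti(n_rb, cp="normal_cp"):
    """REs available in one TTI when the bandwidth holds n_rb resource blocks."""
    return n_rb * re_per_rb(cp) * SLOTS_PER_SUBFRAME

print(re_per_rb())      # 84 REs per RB under normal CP
print(re_per_tti(100))  # 16800 REs per 1 ms TTI on a 100-RB (20 MHz) carrier
</code></pre>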
<tags>
<tag> note </tag>
<tag> 5G </tag>
<tag> 4G </tag>
<tag> 5G NR </tag>
</tags>
</entry>
<entry>
<title>Transformers - Survey</title>
<link href="/uncategorized/notes/x_formers/"/>
<url>/uncategorized/notes/x_formers/</url>
<content type="html"><![CDATA[<p>A Survey of Transformers</p><p>TIANYANG LIN, YUXIN WANG, XIANGYANG LIU, and XIPENG QIU</p><p>School of Computer Science, Fudan University, China and Shanghai Key Laboratory of Intelligent Information Processing, Fudan<br>University, China</p><span id="more"></span><h2 id="Motivations"><a href="#Motivations" class="headerlink" title="Motivations"></a>Motivations</h2><p>Vanilla Transformer的主要弊端与改进方向</p><ul><li><strong>Model Efficiency</strong>:Transforme处理长序列时效率低下,这主要是由于self-attention的计算和内存复杂性造成的。改进方法包括轻量级attention,例如sparse attention<br>variants、和分治方法(Divide-and-conquer),例如recurrent and hierarchical mechanism。</li><li><strong>Model Generalization</strong>:Transformer的结构从理论上来说是非常灵活的,几乎不对输入数据的结构性偏差进行假设,因此很难对<strong>小规模数据</strong>进行训练。<br>改进方法包括引入结构性偏差(structural bias)或正则化(regularization,)、对大规模未标记数据进行预训练等。</li><li><strong>Model Adaptation</strong>:这类工作旨在使Transformer适应特定的下游任务和应用。</li></ul><p>这篇文章主要根据改进vanilla Transformer的方式来组织相关的工作,即:<strong>架构修改</strong>、<strong>预训练</strong>、<strong>应用</strong>。且本文主要关注架构变体,并简要讨论预训练和面向应用的变体。</p><h2 id="Background"><a href="#Background" class="headerlink" title="Background"></a>Background</h2><h3 id="Vanilla-Transformer"><a href="#Vanilla-Transformer" class="headerlink" title="Vanilla Transformer"></a>Vanilla Transformer</h3><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20220120210749.png" width = "70%" /></div><p>首先,transformer遵循seq2seq结构,其中encoder decoder都由$L$个单独的模块堆叠而成。要点包括:</p><ul><li><strong>encoder</strong>: multi-head self-attention, position-wise feed-forward network (FFN), residual connection, Layer Normalization.</li><li><strong>decoder</strong>: 上述模块 + cross-attention (between the multi-head self-attention modules and the position-wise FFNs), decoder中的attention matrix计算是有位置限制的<br>(考虑到后续时刻输出不能为前序时刻的输出提供参考)</li></ul><h4 id="Multi-head-attention"><a href="#Multi-head-attention" class="headerlink" title="Multi-head attention"></a>Multi-head attention</h4><div align="center"> $\operatorname{Attention}(\mathrm{Q}, \mathrm{K}, \mathrm{V})=\operatorname{softmax}\left(\frac{\mathrm{QK}^{\top}}{\sqrt{D_{k}}}\right) \mathrm{V}=\mathrm{AV}$</div><p>上式是attention的基本原理,其中query $Q \in \mathbb{R}^{N \times D_{k}}$,key $\mathrm{K} \in \mathbb{R}^{M \times D_{k}}$,value $\mathbf{V} \in \mathbb{R}^{M \times D_{v}}$。$N,M$分别为query和key(value)的长度,$D_k, D_v$为key(query)与value的维度。$\mathrm{A}=\operatorname{softmax}\left(\frac{\mathrm{QK}^{\top}}{\sqrt{D_{k}}}\right)$也被称为attention matrix。除以$\sqrt{D_{k}}$是为了缓解梯度消失。</p><p>将数据维度压缩为1,则上述三个对象可以理解为,$\mathbf{v}=[v_1, \cdots, v_N]$代表module在未经过筛选时要输出的值,我们期望的输出是$\mathbf{w} * \mathbf{v}$,其中权重$\mathbf{w}$的计算即为上式中$softmax(\cdot)$的结果。</p><p> <strong>与软寻址之间的联系(如下图)</strong>:令Source $\mathbf{S}=[<k_1, v_1>, \cdots, <k_n, v_n>]$视为存储器中的全部内容,当前有一个查询$q=k_i$,目的是取出source中匹配键值的值$v_i$。<br> 我们记$\mathbf{k} = [k_1, \cdots, k_n]$,我们通过Query $q$和存储器内元素的地址$\mathbf{k}$进行相似性比较来寻址,之所以说是软寻址,指的不像一般寻址只从存储内容里面找出一条内容$k_i$,<br>而是可能从$\mathbf{k}$中的每一项都会取出内容,取出内容的重要性根据$q$和$\mathbf{k}$的相似性来决定,相似性记为$\mathbf{w} = [w_1, \cdots, w_n]$,<br> 之后对存储器中的每一项对应的值进行加权求和,即$v = w_1 v_1 + \cdots + w_n v_n$,得到最终的Value值,也即Attention的结果值。<br> 所以不少研究人员将Attention机制看作软寻址的一种特例,这也是非常有道理的。</p><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20220120203102.png" width = "50%" /></div><div align="center"> \begin{aligned} \text { MultiHeadAttn }(Q, K, V) &=\text { Concat }\left(\text { head }_{1}, \cdots, \text { head }_{H}\right) \mathrm{W}^{O}, \\ \text { where head }_{i} 
&=\operatorname{Attention}\left(Q W_{i}^{Q}, K W_{i}^{K}, V W_{i}^{V}\right) . \end{aligned}</div><p>上式是multi-head attention的基本表达式,其中$Q, K, V$的维度均为$D_m$,他们分别由几个线性映射($W_{i}^{Q}, W_{i}^{K} W_{i}^{V}$)投影到维度为$D_k, D_k, D_v$的空间中,并进行attention计算,最后模型将所有输出连接并将其投影到$D_m$维空间。</p><ul><li><strong>Self-attention</strong>:$Q=K=V=X$,$X$是前一层的输出</li><li><strong>Masked Self-attention</strong>:在Transformer解码器中,self-attention生成的weight受到位置限制,其生成的attention matrix只度量某个位置i和j之间的权重,且$i>=j$。具体地,其实现过程是为attention matrix的某些位置赋予mask。$\hat{A}=\exp \left(\frac{Q K^{\top}}{\sqrt{D_{k}}}\right)$, $\hat{A}_{i j}=-\infty \text { if } i<j$。这种自我注意通常被称为自回归注意或因果注意。</li><li><strong>Cross-attention</strong>:query由上一层decoder的输出投影而来,key/value又encoder的输出投影而来</li></ul><h4 id="Position-wise-FFN"><a href="#Position-wise-FFN" class="headerlink" title="Position-wise FFN"></a>Position-wise FFN</h4><p>基于位置的FFN是一个全连接的前馈网络,它在每一个位置上进行独立运算,注意:前向网络的参数在不同位置上是共享的,因此Position-wise FFN也可以理解为两层kernel size为1的卷积层。</p><div align="center"> $\operatorname{FFN}\left(\mathbf{H}^{\prime}\right)=\operatorname{ReLU}\left(\mathbf{H}^{\prime} \mathbf{W}^{1}+\mathbf{b}^{1}\right) \mathbf{W}^{2}+\mathbf{b}^{2}$</div><p>其中$\mathbf{H}^{\prime}$为上一层的输出,$\mathbf{W}^{1} \in \mathbb{R}^{D_{m} \times D_{f}}, \mathbf{W}^{2} \in \mathbb{R}^{D_{f} \times D_{m}}, \mathbf{b}^{1} \in \mathbb{R}^{D_{f}}, \mathbf{b}^{2} \in \mathbb{R}^{D_{m}}$,一般来讲FFN的维度参数设置为$D_f > D_m$</p><h4 id="Residual-connection-and-normalization"><a href="#Residual-connection-and-normalization" class="headerlink" title="Residual connection and normalization"></a>Residual connection and normalization</h4><div align="center"> \begin{aligned} \mathrm{H}^{\prime} &=\text { LayerNorm }(\text { SelfAttention }(\mathrm{X})+\mathrm{X}) \\ \mathrm{H} &=\text { LayerNorm }\left(\mathrm{FFN}\left(\mathrm{H}^{\prime}\right)+\mathrm{H}^{\prime}\right) \end{aligned}</div><h4 id="Position-Encodings"><a href="#Position-Encodings" class="headerlink" title="Position Encodings"></a>Position Encodings</h4><p>因为Transformer没有引入递归结构或卷积操作,所以对于每个attention来说,它们不知道数据的前后位置信息(特别是对于编码器来说)。因此需要对数据的位置做额外的表征</p><h3 id="模型的拆解用法"><a href="#模型的拆解用法" class="headerlink" title="模型的拆解用法"></a>模型的拆解用法</h3><ul><li><strong>encoder-decoder</strong>:用于seq2seq modeling</li><li><strong>encoder only</strong>:representation learning,用于支持classification,sequence labeling</li><li><strong>decoder only</strong>:(此时encoder-decoder cross-attention module也被移除),sequence generation,用于支持language modeling</li></ul><p>(后续更新:主要transformer的总结,以及亮点结构)</p>]]></content>
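<p>The scaled dot-product attention, the softmax attention matrix, and the decoder-side causal mask summarized above fit in a few lines of NumPy. The sketch below is illustrative only: the names are mine, and it follows the formulas in this note rather than any particular library implementation.</p><pre><code class="language-python">import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V, causal=False):
    """softmax(Q K^T / sqrt(D_k)) V with Q: (N, D_k), K: (M, D_k), V: (M, D_v).

    causal=True reproduces the decoder's masked self-attention: position i
    may only attend to positions at or before i.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (N, M) attention logits
    if causal:
        n, m = scores.shape
        future = np.triu(np.ones((n, m), dtype=bool), k=1)  # True where j is after i
        scores = np.where(future, -np.inf, scores)          # softmax maps -inf to 0
    A = softmax(scores, axis=-1)     # the attention matrix
    return A @ V

# Self-attention: Q = K = V = X, the output of the previous layer
X = np.random.randn(5, 16)
print(attention(X, X, X, causal=True).shape)  # (5, 16)
</code></pre>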
<tags>
<tag> note </tag>
<tag> transformers </tag>
</tags>
</entry>
<entry>
<title>Alarm Analysis in 4G WAN</title>
<link href="/uncategorized/notes/alarm_in_4GWAN/"/>
<url>/uncategorized/notes/alarm_in_4GWAN/</url>
<content type="html"><![CDATA[<h1 id="告警与性能指标之间的关系"><a href="#告警与性能指标之间的关系" class="headerlink" title="告警与性能指标之间的关系"></a>告警与性能指标之间的关系</h1><p>本文主要以case为单位,分析每个告警可能对性能指标造成的影响,回顾专家在判定告警对流量损失时的思路: </p><ol><li>以expert label中标注的<code>Time of Interest (TOI)</code>确定case中的target alarm: target alarm为TOI中发生过的告警、仍未清除的告警</li><li>以alarm为中心,分析与告警相关的上下游知识(如物理层、链路层等)是否发生异常(评价所用的参考指标: 环比KPI - 一天环比或一周环比)</li><li>分析异常链路是否完整,包括: (1)告警所影响的KPI是否发生异常(2)目标KPI(一般为流量)是否发生异常</li></ol><span id="more"></span><h2 id="case-2-RRU级告警-接口异常-告警无损失案例"><a href="#case-2-RRU级告警-接口异常-告警无损失案例" class="headerlink" title="case 2 (RRU级告警 接口异常 告警无损失案例)"></a>case 2 (RRU级告警 接口异常 告警无损失案例)</h2><p><code>alarm: RF Unit CPRI Interface Error</code> (Minor) - 射频单元链路上承载的业务质量可能会略有下降(详细说明见下图)</p><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20220104134155.png"></p><p><strong>专家思路</strong>: </p><ol><li>告警时间内,告警小区的物理层中,上下行RSRP指标与环比参考指标相比没有明显变化,说明告警对小区无影响。</li><li>进一步的比较告警小区与环比流量是否有损失。</li></ol><p>强相关的KPI指标(告警小区): </p><ul><li>L.MeasRpts.RSRP.Index0~2 (RSRP<-110dbm) Ratio (%)(MR测量上报RSRP<-110dbm比例(%))</li><li>UL PUCCH RSRP low than-120dBm ratio(上行PUCCH RSRP低于-120dBm的比例)</li><li>UL PUSCH RSRP low than-120dBm ratio(上行PUSCH RSRP低于-120dBm的比例)</li></ul><h2 id="case-3-全站级告警-时钟类异常-告警无损失案例"><a href="#case-3-全站级告警-时钟类异常-告警无损失案例" class="headerlink" title="case 3 (全站级告警 时钟类异常 告警无损失案例)"></a>case 3 (全站级告警 时钟类异常 告警无损失案例)</h2><p><code>alarm: External Clock Reference Problem</code>(Major)- Minor 影响切换,可能导致断站;Major 导致跨站干扰,影响速率、接入成功率、掉话率等指标,严重时会引发断站(详细说明见下图)</p><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20220104134040.png"></p><p><strong>专家思路</strong>: 告警小区的物理层中的干扰指标、用户层的速率指标均无明显变化,说明告警无影响。</p><p>强相关的KPI指标(告警小区): </p><ul><li>L.UL.Interference.Avg(dBm)(上行平均干扰)</li><li>(上行用户体验速率(Mbps))</li><li>LTE_User UL Average Throughput(Mbps)下行用户体验速率(Mbps)(下行用户体验速率(Mbps))</li></ul><h2 id="case-4-小区级告警-断站异常-告警有损失案例"><a href="#case-4-小区级告警-断站异常-告警有损失案例" class="headerlink" title="case 4 (小区级告警 断站异常 告警有损失案例)"></a>case 4 (小区级告警 断站异常 告警有损失案例)</h2><p><code>alarm: Cell Unavailable</code></p><p>强相关的KPI指标(告警小区): (未列出)</p><p>这个案例直接比较了告警小区的流量环比是否损失(实际情况是损失的),和邻区小区的流量累加环比是否增加(实际情况专家认为是无增加的)</p><h2 id="case-5-6-RRU级告警-断站异常-告警无损失案例"><a href="#case-5-6-RRU级告警-断站异常-告警无损失案例" class="headerlink" title="case 5/6 (RRU级告警 断站异常 告警无损失案例)"></a>case 5/6 (RRU级告警 断站异常 告警无损失案例)</h2><p><code>alarm: RF Unit VSWR Threshold Crossed (Minor, Major)</code> - 射频单元的输出功率降低,小区覆盖缩小(详细说明见下图)</p><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20220104141701.png"></p><p> <strong>专家思路</strong> : </p><ol><li>物理层指标中,RSRP<-110dbm比例无明显变化,说明告警小区的信号强度在告警期间未发生明显变化 </li><li>下行平均MCS无明显变化,说明告警小区的信道质量在告警期间未发生明显变化;下行64QAM的比例无明显变化,(QAM指相正交振幅调制,常用的QAM有16QAM、64QAM、256QAM,越大对信道质量的要求越高)</li><li>同频和异频切换出成功次数无明显变化,说明告警小区的用户未因小区覆盖范围变小而切出告警小区</li></ol><p>强相关的KPI指标(告警小区): </p><ul><li>L.MeasRpts.RSRP.Index0~2 (RSRP<-110dbm) Ratio (%) (MR测量上报RSRP<-110dbm比例(%),物理层指标,RSRP测量提供小区信号强度的测量)</li><li>DL MCS AVG/DL64QAM RATIO (下行平均MCS/下行64QAM的比例,MAC层指标,MCS:调制编码方案,当信道质量好时采用高阶的调制方式和更高的编码效率,MCS越高,码率越大,传输效率越高)</li><li>InterFreq HO Succ Times/IntraFreq HO Succ Times(同频切换出成功次数/异频切换出成功次数,用户层指标)</li></ul><h2 id="case-13-RRU级告警-断站异常-告警有损失案例"><a href="#case-13-RRU级告警-断站异常-告警有损失案例" class="headerlink" title="case 13 (RRU级告警 断站异常 告警有损失案例)"></a>case 13 (RRU级告警 断站异常 告警有损失案例)</h2><p>TOI时间段内223196上发生过的告警,黑体为还未清除的告警: </p><ul><li><strong>RF Unit Optical Module Fault</strong></li><li>External Clock Reference Problem</li><li>Remote Maintenance Link Failure</li><li><strong>RF Unit Maintenance Link 
Failure</strong></li><li>Cell Capability Degraded</li><li>BBU CPRI Interface Error</li><li>BBU CPRI Line Rate Negotiation Abnormal</li><li><strong>Cell Unavailable</strong></li></ul><p>告警解析如下表</p><table><thead><tr><th>告警名</th><th>告警类型</th><th>告警描述</th><th>影响</th></tr></thead><tbody><tr><td>RF Unit Optical Module Fault</td><td>射频类</td><td>射频单元光模块故障</td><td>光模块的收发性能下降 =>射频单元链路上承载的业务质量下降=>射频单元上承载的业务可能会中断</td></tr><tr><td>External Clock Reference Problem</td><td>时钟类</td><td>时钟参考源异常</td><td>可能导致TDD网元系统时钟不可用,可能出现小区接入失败、掉话等业务异常或无法提供业务</td></tr><tr><td>Remote Maintenance Link Failure</td><td>断站类</td><td>Operation and Maintenance CHannel (OMCH)中断超过五分钟导致无法维护</td><td>用户无法维护远端设备。基站只能在现场维护。</td></tr><tr><td>RF Unit Maintenance Link Failure</td><td>射频类</td><td>BBU与射频单元之间的维护链路故障,且故障持续一段时间</td><td>射频单元上承载的业务中断。</td></tr><tr><td>Cell Capability Degrade</td><td>断站类</td><td>小区服务能力下降</td><td>告警小区提供给客户可用的无线空口能力会下降,可能出现用户接入异常</td></tr><tr><td>BBU CPRI Interface Error</td><td>射频类</td><td>BBU CPRI光模块接口异常告警</td><td>BBU与下级射频单元的光模块的连接链路中断,或收发性能轻微恶化,MAC层错帧率超过指定门限=>业务质量变差</td></tr><tr><td>BBU CPRI Line Rate Negotiation Abnormal</td><td>射频类</td><td>BBU与下级射频单元间线速率不一致</td><td>引发BBU与下级射频单元的间的带宽不足,导致小区建立失败,系统容量降低</td></tr></tbody></table><p>这个案例直接比较了告警小区的流量环比是否损失(实际情况是损失的),和邻区小区的流量累加环比是否增加(实际情况专家认为是无增加的)</p>]]></content>
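<p>The common thread in these cases is a baseline comparison: each KPI over the Time of Interest is compared against the same window one day or one week earlier, and "no obvious change" means the alarm is judged not to have impacted that KPI. Below is a minimal sketch of that check; the function name, window positions, and data are hypothetical and not taken from the expert cases:</p><pre><code class="language-python">import numpy as np

def kpi_change_vs_baseline(kpi, t0, t1, period):
    """Relative change of a KPI over the window [t0, t1) versus the same
    window one period earlier (day-over-day: period = points per day;
    week-over-week: period = points per week)."""
    kpi = np.asarray(kpi, dtype=float)
    current = kpi[t0:t1].mean()
    baseline = kpi[t0 - period:t1 - period].mean()
    return (current - baseline) / baseline

# Hypothetical 15-minute traffic KPI: 96 points/day, 8 days of history,
# with the alarm window falling in the last day.
traffic = 100 + 10 * np.random.randn(96 * 8)
change = kpi_change_vs_baseline(traffic, t0=96 * 7 + 40, t1=96 * 7 + 48, period=96)
print(f"day-over-day change: {change:+.1%}")  # a clearly negative value suggests traffic loss
</code></pre>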
<tags>
<tag> note </tag>
<tag> 5G </tag>
<tag> 4G </tag>
</tags>
</entry>
<entry>
<title>Common Linux Server Configurations</title>
<link href="/uncategorized/notes/linux_server_notes/"/>
<url>/uncategorized/notes/linux_server_notes/</url>
<content type="html"><![CDATA[<p>仅记录不常用的命令,可能会不全哦👐🏻</p><span id="more"></span><ol><li>服务器<code>conda</code>切换北外镜像源</li></ol><pre class="line-numbers language-bash" data-language="bash"><code class="language-bash">$ conda config <span class="token parameter variable">--add</span> channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/free/ $ conda config <span class="token parameter variable">--add</span> channels https://mirrors.bfsu.edu.cn/anaconda/pkgs/main/ $ conda config <span class="token parameter variable">--add</span> channels https://mirrors.bfsu.edu.cn/anaconda/cloud/conda-forge $ conda config <span class="token parameter variable">--add</span> channels https://mirrors.bfsu.edu.cn/anaconda/cloud/msys2/$ conda config <span class="token parameter variable">--set</span> show_channel_urls <span class="token function">yes</span> $ conda config <span class="token parameter variable">--add</span> channels https://mirrors.bfsu.edu.cn/anaconda/cloud/pytorch/<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="2"><li><p><code>kill -STOP 1234</code> 将进程暂停,如果要让它恢复到后台,用<code>kill -CONT 1234</code>。<a href="https://www.cnblogs.com/kexinxin/p/9939119.html">ref1</a>, <a href="https://www.jianshu.com/p/d4190447736e">ref2</a> </p></li><li><p>linux解压、安装rar文件</p></li></ol><pre class="line-numbers language-bash" data-language="bash"><code class="language-bash">$ <span class="token function">wget</span> http://rarsoft.com/rar/rarlinux-x64-5.5.0.tar.gz$ <span class="token function">sudo</span> <span class="token function">tar</span> <span class="token parameter variable">-zxvf</span> rarlinux-x64-5.5.0.tar.gz<span class="token comment"># 编译</span>$ <span class="token builtin class-name">cd</span> <span class="token function">rar</span>$ <span class="token function">sudo</span> <span class="token function">make</span><span class="token comment"># 测试</span>$ <span class="token function">rar</span><span class="token comment"># 解压, 按压缩包内的文件结构解压</span>$ <span class="token function">rar</span> x <span class="token operator"><</span>file_path<span class="token operator">></span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="4"><li>查看目录的文件大小使用<code>du -h <dir_path></code>,<code>-h</code>代表human,将输出人类易读的文件单位(MB/GB/…)</li></ol>]]></content>
<tags>
<tag> note </tag>
<tag> linux </tag>
</tags>
</entry>
<entry>
<title>Common Markdown Syntax</title>
<link href="/uncategorized/notes/markdown_notes/"/>
<url>/uncategorized/notes/markdown_notes/</url>
<content type="html"><![CDATA[<p><a href="http://www.w3chtml.com/html/html-basic-grammar.html">Remark: HTML常用语法</a></p><span id="more"></span><ol><li>插入并调整图片大小</li></ol><pre class="line-numbers language-markup" data-language="markup"><code class="language-markup"><span class="token tag"><span class="token tag"><span class="token punctuation"><</span>div</span> <span class="token attr-name">align</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>center<span class="token punctuation">"</span></span><span class="token punctuation">></span></span> <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>img</span> <span class="token attr-name">src</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>fig.png<span class="token punctuation">"</span></span> <span class="token attr-name">width</span> <span class="token attr-value"><span class="token punctuation attr-equals">=</span> <span class="token punctuation">"</span>70%<span class="token punctuation">"</span></span> <span class="token punctuation">/></span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>div</span><span class="token punctuation">></span></span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span></span></code></pre><ol start="2"><li>设置页面支持latex公式,在页面头部插入以下内容</li></ol><pre class="line-numbers language-markup" data-language="markup"><code class="language-markup"><span class="token tag"><span class="token tag"><span class="token punctuation"><</span>head</span><span class="token punctuation">></span></span> <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>script</span> <span class="token attr-name">src</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML<span class="token punctuation">"</span></span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text/javascript<span class="token punctuation">"</span></span><span class="token punctuation">></span></span><span class="token script"></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>script</span><span class="token punctuation">></span></span> <span class="token tag"><span class="token tag"><span class="token punctuation"><</span>script</span> <span class="token attr-name">type</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>text/x-mathjax-config<span class="token punctuation">"</span></span><span class="token punctuation">></span></span><span class="token script"><span class="token language-javascript"> MathJax<span class="token punctuation">.</span>Hub<span class="token punctuation">.</span><span class="token function">Config</span><span class="token punctuation">(</span><span class="token punctuation">{</span> <span class="token literal-property property">tex2jax</span><span class="token operator">:</span> <span class="token punctuation">{</span> <span class="token literal-property property">skipTags</span><span class="token operator">:</span> <span class="token 
punctuation">[</span><span class="token string">'script'</span><span class="token punctuation">,</span> <span class="token string">'noscript'</span><span class="token punctuation">,</span> <span class="token string">'style'</span><span class="token punctuation">,</span> <span class="token string">'textarea'</span><span class="token punctuation">,</span> <span class="token string">'pre'</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token literal-property property">inlineMath</span><span class="token operator">:</span> <span class="token punctuation">[</span> <span class="token punctuation">[</span><span class="token string">'$'</span><span class="token punctuation">,</span><span class="token string">'$'</span><span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token string">"\\("</span><span class="token punctuation">,</span><span class="token string">"\\)"</span><span class="token punctuation">]</span> <span class="token punctuation">]</span><span class="token punctuation">,</span> <span class="token punctuation">}</span> <span class="token punctuation">}</span><span class="token punctuation">)</span><span class="token punctuation">;</span> </span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>script</span><span class="token punctuation">></span></span><span class="token tag"><span class="token tag"><span class="token punctuation"></</span>head</span><span class="token punctuation">></span></span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><ol start="3"><li><p>为文章加入锚点</p><ol><li>markdown的标题行本身自带锚点,直接使用<code>[some_text](#title)</code>即可实现页内跳转,注意,<code>title</code>中不可以包含符号,否则会跳转失败。</li><li>使用<code><a></code>标签,注意,包围在标签中的部分,不适用于markdown语法,以及公式语句。</li></ol></li></ol><pre class="line-numbers language-markup" data-language="markup"><code class="language-markup"><span class="token comment"><!-- HTML方法 --></span><span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">id</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>tag<span class="token punctuation">"</span></span><span class="token punctuation">></span></span> 添加锚点 <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span><span class="token tag"><span class="token tag"><span class="token punctuation"><</span>a</span> <span class="token attr-name">href</span><span class="token attr-value"><span class="token punctuation attr-equals">=</span><span class="token punctuation">"</span>#tag<span class="token punctuation">"</span></span><span class="token punctuation">></span></span> 链接锚点 <span class="token tag"><span class="token tag"><span class="token punctuation"></</span>a</span><span class="token punctuation">></span></span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span></span></code></pre><pre class="line-numbers language-markdown" data-language="markdown"><code class="language-markdown"><span class="token comment"><!-- markdown方法 --></span><span class="token url">[<span class="token content">链接锚点</span>](<span 
class="token url">#tag</span>)</span><span class="token url">[<span class="token content">链接到另一个文章的锚点</span>](<span class="token url">other_file.md#tag</span>)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span></span></code></pre>]]></content>
<tags>
<tag> note </tag>
<tag> markdown </tag>
</tags>
</entry>
<entry>
<title>Related Papers in Other Top Conferences (2022)</title>
<link href="/uncategorized/paperlistfile/otherconf_2022/"/>
<url>/uncategorized/paperlistfile/otherconf_2022/</url>
<content type="html"><![CDATA[<h2 id="WSDM-2022"><a href="#WSDM-2022" class="headerlink" title="WSDM 2022"></a>WSDM 2022</h2><p><a href="https://mp.weixin.qq.com/s/ewdkeq39RODHSsKABNh6cA">时序论文一览</a></p><span id="more"></span>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>A Survey of Correlation and Causal Analysis</title>
<link href="/uncategorized/surveys/correlation/"/>
<url>/uncategorized/surveys/correlation/</url>
<content type="html"><![CDATA[<p>本文旨在调研在智能运维领域中的关联分析与因果分析方法,本文的组织如下:</p><ol><li>首先我们将给出一个源自真实业界需求的案例场景。</li><li>围绕该上述场景,我们将介绍该场景中可用的关联分析、因果分析方法,并给出相关的方法分类</li><li>对于经典方法(基于频繁项挖掘)和与我们的研究相关的方法(基于图嵌入),本文也将列出相关的综述与论文列表。</li><li>在本文的研究范围之外,本文也列出了“事件序列-事件序列”关联分析、“事件序列-时间序列”关联分析的相关方法。</li></ol><span id="more"></span><h2 id="案例场景与问题转化"><a href="#案例场景与问题转化" class="headerlink" title="案例场景与问题转化"></a>案例场景与问题转化</h2><p>该场景源于移动通信网络中的网络优化任务。在现网网络优化的过程中,往往需要进行<strong>网络参数</strong>的调整,参数调整后将收集网络中的<strong>性能指标数据</strong>(KPI)用于验证参数调整的效果。实际上,网络KPI受多种配置参数的联合影响,且针对某个小区的参数调整也会影响该小区的邻区,进而造成邻区网络性能的变化。因此,我们希望了解:</p><ol><li>给定需要优化的网络性能指标,确定有哪些参数(组)将影响该指标;</li><li>给定要调整的网络参数,分析该参数将影响哪些指标。</li></ol><p>综上所述,本场景的本质是要建立“参数”到“网络性能指标”之间的关联关系(较低阶)或因果关系(较高阶)。其中KPI数据本质上是<strong>时间序列</strong>,而参数调整可以视为<strong>时间序列</strong>或<strong>事件序列</strong>。因此本任务对应的科学问题被简化为</p><blockquote><p>给定一对时间序列(或一个时间序列与一个事件序列),如何建模二者之间的关联关系(或因果关系)</p></blockquote><p><strong>在后面的章节中,我们主要调研“时间序列-时间序列”之间的关联分析方法。</strong></p><p>一些背景知识:</p><ul><li>移动网络的建设过程分为:规划、建设、维护、优化,四个步骤,我们的研究属于优化环节。</li><li>网络配置参数分为:非协同类参数(参数调整后只影响本小区的网络性能)、协同类参数(参数调整后会影响邻区网络性能)</li><li>调整协同类参数的具体场景包括:切换类与负载均衡类的网络性能优化,RF参数优化</li></ul><h2 id="分类"><a href="#分类" class="headerlink" title="分类"></a>分类</h2><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20220216193335.png" width = "90%" /></div><p><strong>注:</strong></p><ol><li>上述分类是针对本文的研究目标进行的分类,事实上“基于相似度”、“基于相关性”、“基于图模型 - 基于回归模型”的方法属于属于关联分析,“基于图模型”分类下的“基于条件约束”、“基于得分”、“基于函数式模型”的方法属于因果分析。</li><li>因果分析的分类依据、以及介绍可参照<a href="#refer12">[12]</a><a href="#refer13">[13]</a><a href="#refer13">[14]</a></li></ol><h3 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h3><blockquote><div></div>[1] Su, Y., Zhao, Y., Xia, W., Liu, R., Bu, J., Zhu, J., ... & Pei, D. (2019, June). CoFlux: robustly correlating KPIs by fluctuations for service troubleshooting. In Proceedings of the International Symposium on Quality of Service (pp. 1-10).<div></div>[2] Niennattrakul, V., & Ratanamahatana, C. A. (2007, April). On clustering multimedia time series data using k-means and dynamic time warping. In 2007 International Conference on Multimedia and Ubiquitous Engineering (MUE'07) (pp. 733-738). IEEE.<div></div>[3] 贾海涛. (2018). 基于数据挖掘的动环监控系统告警相关性研究 (Doctoral dissertation, 北京: 北京交通大学).<div></div>[4] Luo, C., Lou, J. G., Lin, Q., Fu, Q., Ding, R., Zhang, D., & Wang, Z. (2014, August). Correlating events with time series for incident diagnosis. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1583-1592).<div></div>[5] Jiang, G., Chen, H., & Yoshihira, K. (2006, June). Discovering likely invariants of distributed transaction systems for autonomic system management. In 2006 IEEE International Conference on Autonomic Computing (pp. 199-208). IEEE.<div></div>[6] Gerhardus, A., & Runge, J. (2020). High-recall causal discovery for autocorrelated time series with latent confounders. Advances in Neural Information Processing Systems, 33, 12615-12625.<div></div>[7] Nauta, M., Bucur, D., & Seifert, C. (2019). Causal discovery with attention-based convolutional neural networks. Machine Learning and Knowledge Extraction, 1(1), 312-340.<div></div>[8] Chu, Y., Wang, X., Ma, J., Jia, K., Zhou, J., & Yang, H. (2020, November). Inductive Granger Causal Modeling for Multivariate Time Series. In 2020 IEEE International Conference on Data Mining (ICDM) (pp. 972-977). 
IEEE.<div></div>[9] Brouillard, P., Lachapelle, S., Lacoste, A., Lacoste-Julien, S., & Drouin, A. (2020). Differentiable causal discovery from interventional data. Advances in Neural Information Processing Systems, 33, 21865-21877.<div></div>[10] Ogarrio, J. M., Spirtes, P., & Ramsey, J. (2016, August). A hybrid causal search algorithm for latent variable models. In Conference on probabilistic graphical models (pp. 368-379). PMLR.<div></div>[11] Ding, C., Gong, M., Zhang, K., & Tao, D. (2019). Likelihood-free overcomplete ICA and applications in causal discovery. Advances in Neural Information Processing Systems, 32.<div id="refer12">[12] Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in genetics, 10, 524.</div> <div id="refer13">[13] Yao, L., Chu, Z., Li, S., Li, Y., Gao, J., & Zhang, A. (2021). A survey on causal inference. ACM Transactions on Knowledge Discovery from Data (TKDD), 15(5), 1-46.</div> <div id="refer14">[14] Malinsky, D., & Danks, D. (2018). Causal discovery algorithms: A practical guide. Philosophy Compass, 13(1), e12470.</div> </blockquote><h2 id="基于频繁项挖掘的工作调研"><a href="#基于频繁项挖掘的工作调研" class="headerlink" title="基于频繁项挖掘的工作调研"></a>基于频繁项挖掘的工作调研</h2><h3 id="Apriori"><a href="#Apriori" class="headerlink" title="Apriori"></a>Apriori</h3><p><strong>Apriori</strong>最早由 Agrawal 提出,通过多次迭代建立候选集查找频繁项。在<a href="#refer15">[15]</a>中,作者对大规模操作系统中事件序列间的相关性进行了研究,为了方便对事件序列的相关性进行研究,作者首先将冗长的事件转化为不同的事件类型,并根据事件类型序列数据定义了“episode”。记在时间窗口 TW 内发生的所有事件类型即为一个 episode,并记为$E_{eA}, T_{eA}$,其中$E_{eA}$为与事件𝑒A相邻的所有事件的集合,$T_{eA}$为对应的时间窗口。然后,文中用频繁项查找算法 Apriori 对 episode 序列中频繁项集进行搜索,而这种频繁项集的形式就被认定为事件之间的相关性。文中利用不同 h-置信度与 Apriori 中修剪压缩比的变化曲线关系,自动求出最适宜的最小支持度,从而对频繁项集进行修剪。</p><p>HJ Lu<a href="#refer16">[16]</a>认为经典的关联规则挖掘忽略了事物发生的语境,如时间、地点等。作者认为项目关联有两种:<br>1)事物内的频繁项关联(如,同一天内,两个股票一起涨);<br>2)不同事物间的频繁项关联(如 A 股票第一天涨了后,B 股票在第四天有较大概率也涨)。<br>而传统的频繁项关联算法只局限于查找关联1。因此文中对Apriori算法进行了扩展,以提出一种Extension-Apriori算法,将关联规则挖掘的范围从传统的单维度关联扩展到多维的事物间关联,并研究了多维事件关联规则中支持度与置信度的计算方式。同时算法利用哈希技术过滤掉不必要的候选二项集,将所有可能的二项集散列到一个哈希表中,减少了数据库扫描的复杂度。</p><p>但由于 Apriori 算法每一次增加频繁项集大小时都需要重新扫描整个数据集,所以当数据集很大时,算法效率较低,因此有许多研究是针对如何将Apriori 算法进行优化提速。</p><p>如<a href="#refer17">[17]</a>研究了频繁项挖掘算法在 MapReduce 框架中的实现。文献<a href="#refer17">[17]</a>将串行的Apriori算法转化为分布式的MapReduce版本,在每一次查找频繁项集时,使用map生成候选支持,并用reduce收集全局的候选支持。并且算法可以根据候选对象的数量与前一个MapReduce阶段的执行时间,动态的收集可变长度的候选对象,极大地缩短了Apriori生成频繁项集的时间。 </p><h3 id="FP-growth"><a href="#FP-growth" class="headerlink" title="FP-growth"></a>FP-growth</h3><p>频繁模式生长算法(<strong>FP-growth</strong>)是最早由韩家炜等人提出的利用频繁模式树进行频繁项挖掘的算法。相比Apriori,FP-growth只用遍历两遍数据,且不需要产生候选序列,极大提高了挖掘效率。因此也有许多研究人员通过 FP-growth 来挖掘序列中的频繁项。</p><p>文献<a href="#refer18">[18]</a>中,作者提出了一种并行化FP-Growth算法的MapReduce方法,将大规模的频繁项挖掘任务自动分割成独立的计算任务,并将其映射到MapReduce框架中。作者用提出的方法对包含网页与标签的事件序列数据库进行网页间相关度的计算,从而实现有效的相关网页推荐。</p><p>而传统的Apriori和FP-growth 算法都是基于最小支持度的频繁项搜索算法,因此存在以下两个问题:<br>1)当最小支持度设置较大时,包含稀有项的频繁项会被过滤掉;<br>2)当最小支持度设置较小时,则会生成过多的频繁项导致计算量爆炸。<br>对于 FP-Growth 算法,由于每个项目都有最低支持度,因此用户很难一次为所有的项目设置适当的阈值,所以用户通常需要多次优化算法的参数,直到达到满意的结果。</p><p>基于上述问题,文献<a href="#refer19">[19]</a>提出一种MIS-Tree结构和名为CFP-Growth的算法,与FP-growth不同,CFP-Growth中所有的输入项目以最小支持度进行降序排列,然后将所有的项目输入到树结构中,构建一个类似FP-tree的树状结构称为MIS-Tree,同时测量树中各项目的支持度。然后对树中所有支持度小于全部项目中最小支持度的项进行修剪,并对含有相同父节点的项进行合并,以保证树结构的紧凑性,得到的即为最小支持度项目树(MIS-Tree)。<br>最后只需要将MIS-Tree中每个项目作为后缀项并进行扫描就可以发现完整的频繁模式集。文中作者分别在合成数据集和真实数据集上进行测试,结果表明该算法更加有效与快速。</p><h3 id="Reference-1"><a href="#Reference-1" class="headerlink" title="Reference"></a>Reference</h3><blockquote><div id="refer15">[15] Gupta, C. 
(2012, September). Event correlation for operations management of largescale it systems. In Proceedings of the 9th international conference on Autonomic computing (pp. 91-96).</div><div id="refer16">[16] Lu, H., Feng, L., & Han, J. (2000). Beyond intratransaction association analysis: mining multidimensional intertransaction association rules. ACM Transactions on Information Systems (TOIS), 18(4), 423-454.</div><div id="refer17">[17] Lin, M. Y., Lee, P. Y., & Hsueh, S. C. (2012, February). Apriori-based frequent itemset mining algorithms on MapReduce. In Proceedings of the 6th international conference on ubiquitous information management and communication (pp. 1-8).</div><div id="refer18">[18] Li, H., Wang, Y., Zhang, D., Zhang, M., & Chang, E. Y. (2008, October). Pfp: parallel fp-growth for query recommendation. In Proceedings of the 2008 ACM conference on Recommender systems (pp. 107-114).</div><div id="refer19">[19] Hu, Y. H., & Chen, Y. L. (2006). Mining association rules with multiple minimum supports: a new mining algorithm and a support tuning mechanism. Decision support systems, 42(1), 1-24.</div></blockquote><h2 id="基于图的方法"><a href="#基于图的方法" class="headerlink" title="基于图的方法"></a>基于图的方法</h2><p><a href="#refer20">[20]</a>描述了在一个分布式系统中构建的异常检测系统及其原因分析平台,在图结构部分,本文构建图的方式是值得借鉴的。<br>首先,本文的系统图结构是已知的<strong>系统架构图</strong>,针对探测到的异常,本文提出方法用于在图上分析异常所造成的影响,并定位根因,图的构造方式:</p><ol><li>发生异常时的图anomaly graph使用$G=(V, E)$来表示,其中$C \cup A$。</li><li>$C$是系统的组件构成的集合,包含系统中的逻辑组件、物理组件等,$A$是底层异常检测结果报告的异常(包括由规则定义的和由实时监控系统得到的),$E$是anomaly graph中的边集合,</li><li>图中存在两种边:1)连接组件的边,代表组件之间的从属关系;2)连接组件与异常的边,代表某个组件产生了某个异常</li><li>本文还为alarm edge设置了分数,代表了某个组件发生某个异常的严重程度,这个分数由:1)某个时间段内该异常每次发生的严重程度;2)该异常的发生频率,共同决定。</li></ol><p><em>本文在异常发生的时候,建立一个异常时的状态图,然后在图上针对异常及其连接的边计算异常严重程度分数,分数高的边所连接的组件可能就是异常。</em></p><p><a href="#refer21">[21]</a>从云服务设施提供商的角度,建立了一种异常定位的方法,其中使用了有向无环图$G=(V,E)$来建模异常的传播,其中:</p><ol><li>节点表示虚拟机VM</li><li>边表示节点之间的两种关系<ol><li>由service call引起的业务关联</li><li>由于在同一个物理主机上而可能产生资源竞争的</li></ol></li></ol><p>该文章假设对于某个异常$a$,它发生的时候(一般指这个异常是某个服务的等待时间过长),某个VM上的一组相关指标一定也是繁忙的(如CPU等,意思是这个指标的繁忙导致了异常的发生)。</p><p>因此,为了找到这样的一组指标作为异常的原因,作者提出在$G$上进行随机游走,游走的过程中来计算每个vm的指标与异常服务的响应时间之间的关联关系(利用similarity,其中similarity的计算为:提高某个VM的物理指标占用,如cpu,然后测量service的响应时间,计算二者在这段时间内的相关系数),游走规则如下:</p><ol><li>walker总是更倾向于往具有更高similarity的节点去游走</li><li>walker游走到低similarity的节点的时候,可以选择返回</li><li>walker的领域均为低similarity的节点时,可以选择待着不动</li></ol><p>综上所述,在游走的过程中,被访问最多的节点将被视为是高可能性的异常根因。<em>本文在异常发生的时候,建立一个包含了物理关系与服务调用关系的网络拓扑图,然后在图上以随机游走的方式计算节点的异常程度分数,游走完毕后,即可识别出异常的根因。</em></p><p><a href="#refer22">[22]</a>利用<strong>多维时序指标</strong>来动态生成服务关联,诊断根因。针对多维时间序列,该工作分析时序之间的异常关联,推断<strong>异常行为图</strong>来描述不同服务之间的相关性。根据行为图,该工作使用前向、自向和后向随机游走算法设计启发式模型,用以识别服务故障的根本原因。解析可以看:<a href="https://blog.csdn.net/weixin_53741275/article/details/111973738">link</a></p><p><a href="#refer23">[23]</a>这项工作提出了一个框架,MicroCause,可以准确地定位导致微服务故障的根本原因的<strong>监控指标(时间序列)</strong>。MicroCause结合了简单而有效的路径条件时间序列(PCTS)算法以准确地捕获时间序列之间的<strong>因果关系</strong>,以及面向时间因果的新型随机游走方法(TCORW)</p><p><a href="#refer24">[24]</a>中提到,<strong>不变网络</strong>已被证明是表征复杂系统行为的有效方法。在不变网络中,节点表示系统组件,边缘表示两个组件之间的稳定,重要的交互作用。不变性网络的结构和演化,尤其是<strong>消失的相关性</strong>,可以为定位因果异常和执行诊断提供重要的启示。然而,现有的利用不变网络检测因果异常的方法通常使用消失的相关性百分比来对可能的偶然分量进行排名,这有几个局限性:1)网络中的故障传播被忽略;2)根源偶然异常可能并不总是那些消失率很高的节点;3)消失的相关性的时间模式未用于鲁棒检测。为了解决这些局限性,在本文中,我们提出了一个<strong>基于网络扩散的框架</strong>,以识别重大的因果异常并对它们进行排名。我们的方法可以有效地对整个不变网络上的故障传播建模,并且可以对结构和随时间变化的破碎不变模式进行联合推断。</p><h3 id="Reference-2"><a href="#Reference-2" class="headerlink" title="Reference"></a>Reference</h3><blockquote><div id="refer20">[20] 
Wang, H., Nguyen, P., Li, J., Kopru, S., Zhang, G., Katariya, S., & Ben-Romdhane, S. (2019). GRANO: Interactive graph-based root cause analysis for cloud-native distributed data platform. Proceedings of the VLDB Endowment, 12(12), 1942-1945.</div><div id="refer21">[21] Weng, J., Wang, J. H., Yang, J., & Yang, Y. (2018). Root cause analysis of anomalies of multitier services in public clouds. IEEE/ACM Transactions on Networking, 26(4), 1646-1659.</div><div id="refer22">[22] Ma, M., Xu, J., Wang, Y., Chen, P., Zhang, Z., & Wang, P. (2020, April). Automap: Diagnose your microservice-based web applications automatically. In Proceedings of The Web Conference 2020(pp. 246-258).</div><div id="refer23">[23] Meng, Y., Zhang, S., Sun, Y., Zhang, R., Hu, Z., Zhang, Y., ... & Pei, D. (2020, June). Localizing failure root causes in a microservice through causality inference. In 2020 IEEE/ACM 28th International Symposium on Quality of Service (IWQoS) (pp. 1-10). IEEE.</div><div id="refer24">[24] Cheng, W., Zhang, K., Chen, H., Jiang, G., Chen, Z., & Wang, W. (2016, August). Ranking causal anomalies via temporal and dynamical analysis on vanishing correlations. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 805-814).</div></blockquote><h2 id="“事件序列-时间序列”关联分析论文列表"><a href="#“事件序列-时间序列”关联分析论文列表" class="headerlink" title="“事件序列-时间序列”关联分析论文列表"></a>“事件序列-时间序列”关联分析论文列表</h2><ul><li><p>Luo, C., Lou, J. G., Lin, Q., Fu, Q., Ding, R., Zhang, D., & Wang, Z. (2014, August). Correlating events with time series for incident diagnosis. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1583-1592).</p></li><li><p>Chi, L., Sathe, S., Han, B., & Wang, Y. (2016, December). A Novel Method for Assessing Event Impacts on Event-Driven Time Series. In 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) (pp. 519-526). IEEE.</p><p> <strong>梗概</strong>:本文提出使用自适应窗口的$t_score$来动态的度量事件对时间序列的影响,并提出了$A_score$,用于定量评估事件对时间序列的影响程度。</p></li><li><p>Wu, C., Zhao, N., Wang, L., Yang, X., Li, S., Zhang, M., … & Pei, D. Identifying Root-Cause Metrics for Incident Diagnosis in Online Service Systems. In The 32nd International Symposium on Software Reliability Engineering (ISSRE 2021). IEEE.</p><p> <strong>梗概</strong>:本文贡献在于三个方面,首先根据假设检验,识别发生异常时有哪些指标被影响。然后对异常进行分类,去除那些运维人员不会考虑的异常模式。最后根据上述两个输出,输出一个经过排序的可疑指标列表,以辅助工程师识别异常根因。</p></li><li><p>Xun, P., Zhu, P. D., Li, C. L., & Zhu, H. Y. (2016, December). Discovering multi-type correlated events with time series for exception detection of complex systems. In 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) (pp. 21-28). IEEE.</p></li><li><p>Krosman, K., & Sosnowski, J. (2021). Correlating Time Series Signals and Event Logs in Embedded Systems. Sensors, 21(21), 7128.</p></li><li><p>Minaei-Bidgoli, B., & Lajevardi, S. B. (2008, September). Correlation mining between time series stream and event stream. In 2008 Fourth International Conference on Networked Computing and Advanced Information Management (Vol. 2, pp. 333-338). IEEE.</p></li></ul><h3 id="不具备很强参考意义但很有趣的论文"><a href="#不具备很强参考意义但很有趣的论文" class="headerlink" title="不具备很强参考意义但很有趣的论文"></a>不具备很强参考意义但很有趣的论文</h3><ul><li><p>Chi, L., Han, B., & Wang, Y. (2016). Open Problem: Accurately Measuring Event Impacts on Time Series. In KDD MiLeTS Workshop.</p><p> KDD workshop中提出的open problem,有指导性意见</p></li><li><p>Van Dortmont, M. A. M. M., van den Elzen, S., & van Wijk, J. J. (2019, June). 
ChronoCorrelator: Enriching events with time series. In Computer Graphics Forum (Vol. 38, No. 3, pp. 387-399).</p><p> 一种可视化的方法,用于将事件关联到时间序列的变化上,也是采用了类似two sample test的方法</p></li><li><p>Xiao, S., Yan, J., Farajtabar, M., Song, L., Yang, X., & Zha, H. (2019). Learning time series associated event sequences with recurrent point process networks. IEEE transactions on neural networks and learning systems, 30(10), 3124-3136.</p><p> 这篇揭示了多种事件类型之间的相关关系(基于attention),但没有说明事件对时间序列的影响程度,本文的主要目标是事件预测</p></li></ul><h2 id="“事件序列-事件序列”关联分析论文列表(待筛选)"><a href="#“事件序列-事件序列”关联分析论文列表(待筛选)" class="headerlink" title="“事件序列-事件序列”关联分析论文列表(待筛选)"></a>“事件序列-事件序列”关联分析论文列表(待筛选)</h2><ul><li><p>Noel, S., Robertson, E., & Jajodia, S. (2004, December). Correlating intrusion events and building attack scenarios through attack graph distances. In 20th Annual Computer Security Applications Conference (pp. 350-359). IEEE.</p></li><li><p>Bayomie, D., Awad, A., & Ezat, E. (2016, June). Correlating unlabeled events from cyclic business processes execution. In International Conference on Advanced Information Systems Engineering (pp. 274-289). Springer, Cham.</p></li><li><p>Dindar, N., Fischer, P. M., Soner, M., & Tatbul, N. (2011, July). Efficiently correlating complex events over live and archived data streams. In Proceedings of the 5th ACM international conference on Distributed event-based system (pp. 243-254).</p></li><li><p>Steiger, E., Resch, B., de Albuquerque, J. P., & Zipf, A. (2016). Mining and correlating traffic events from human sensor observations with official transport data using self-organizing-maps. Transportation Research Part C: Emerging Technologies, 73, 91-104.</p></li><li><p>Vlachos, M., Wu, K. L., Chen, S. K., & Philip, S. Y. (2008). Correlating burst events on streaming stock market data. Data Mining and Knowledge Discovery, 16(1), 109-133.</p></li><li><p>Koch, G. G., Koldehofe, B., & Rothermel, K. (2010, July). Cordies: Expressive event correlation in distributed systems. In Proceedings of the Fourth ACM International Conference on Distributed Event-Based Systems (pp. 26-37).</p></li><li><p>Cheng, L., Van Dongen, B. F., & Van Der Aalst, W. M. (2017, May). Efficient event correlation over distributed systems. In 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) (pp. 1-10). IEEE.</p></li><li><p>Gruschke, B. (1998, October). Integrated event management: Event correlation using dependency graphs. In Proceedings of the 9th IFIP/IEEE International Workshop on Distributed Systems: Operations & Management (DSOM 98) (pp. 130-141).</p></li><li><p>Jiang, G., & Cybenko, G. (2004, June). Temporal and spatial distributed event correlation for network security. In Proceedings of the 2004 American Control Conference (Vol. 2, pp. 996-1001). IEEE.</p></li><li><p>Liu, G., Mok, A. K., & Yang, E. J. (1999, May). Composite events for network event correlation. In Integrated Network Management VI. Distributed Management for the Networked Millennium. Proceedings of the Sixth IFIP/IEEE International Symposium on Integrated Network Management.(Cat. No. 99EX302) (pp. 247-260). IEEE.</p></li><li><p>Rozsnyai, S., Slominski, A., & Lakshmanan, G. T. (2011, July). Discovering event correlation rules for semi-structured business processes. In Proceedings of the 5th ACM international conference on Distributed event-based system (pp. 75-86).</p></li><li><p>Wu, G., Zhang, H., Qiu, M., Ming, Z., Li, J., & Qin, X. (2013). 
A decentralized approach for mining event correlations in distributed system monitoring. Journal of parallel and Distributed Computing, 73(3), 330-340.</p></li><li><p>Kotenko, I., Fedorchenko, A., Saenko, I., & Kushnerevich, A. (2018, March). Parallelization of security event correlation based on accounting of event type links. In 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) (pp. 462-469). IEEE.</p></li><li><p>Xuewei, F., Dongxia, W., Jiemei, Z., Guoqing, M., & Jin, L. (2010, June). Analyzing and correlating security events using state machine. In 2010 10th IEEE International Conference on Computer and Information Technology (pp. 2849-2854). IEEE.</p></li><li><p>Hasan, M., Sugla, B., & Viswanathan, R. (1999, May). A conceptual framework for network management event correlation and filtering systems. In Integrated Network Management VI. Distributed Management for the Networked Millennium. Proceedings of the Sixth IFIP/IEEE International Symposium on Integrated Network Management.(Cat. No. 99EX302) (pp. 233-246). IEEE.</p></li><li><p>Skopik, F., & Fiedler, R. (2013). Intrusion detection in distributed systems using fingerprinting and massive event correlation. INFORMATIK 2013–Informatik angepasst an Mensch, Organisation und Umwelt.</p></li><li><p>Flammini, F., Mazzocca, N., Pappalardo, A., Pragliola, C., & Vittorini, V. (2011, August). Augmenting surveillance system capabilities by exploiting event correlation and distributed attack detection. In International Conference on Availability, Reliability, and Security (pp. 191-204). Springer, Berlin, Heidelberg.</p></li><li><p>Martin-Flatin, J. P. (2004, June). Distributed event correlation and self-managed systems. In Proceedings of the 1st International Workshop on Self-* Properties in Complex Information Systems (Self-Star 2004) (pp. 61-64).</p></li><li><p>Griffith, R., Hellerstein, J., Kaiser, G., & Diao, Y. (2006, June). Dynamic adaptation of temporal event correlation for qos management in distributed systems. In 200614th IEEE International Workshop on Quality of Service (pp. 290-294). IEEE.</p></li><li><p>Steinert, R., Gestrelius, S., & Gillblad, D. (2011, December). A distributed spatio-temporal event correlation protocol for multi-layer virtual networks. In 2011 IEEE Global Telecommunications Conference-GLOBECOM 2011 (pp. 1-5). IEEE.</p></li><li><p>Griffith, R., Hellerstein, J. L., Diao, Y., & Kaiser, G. E. (2005). Dynamic Adaptation of Rules for Temporal Event Correlation in Distributed Systems.</p></li><li><p>Fu, X., Ren, R., Zhan, J., Zhou, W., Jia, Z., & Lu, G. (2012, October). Logmaster: Mining event correlations in logs of large-scale cluster systems. In 2012 IEEE 31st Symposium on Reliable Distributed Systems (pp. 71-80). IEEE.</p></li><li><p>Myers, J., Grimaila, M., & Mills, R. (2010, April). Insider threat detection using distributed event correlation of web server logs. In International Conference on Cyber Warfare and Security (p. 251). Academic Conferences International Limited.</p></li><li><p>Myers, J., Grimaila, M., & Mills, R. (2010, April). Insider threat detection using distributed event correlation of web server logs. In International Conference on Cyber Warfare and Security (p. 251). Academic Conferences International Limited.</p></li><li><p>Myers, J., Grimaila, M. R., & Mills, R. F. (2011, January). Log-based distributed security event detection using simple event correlator. In 2011 44th Hawaii International Conference on System Sciences (pp. 1-7). 
IEEE.</p></li><li><p>Parekh, J. J. (2005). Privacy-Preserving Distributed Event Correlation.</p></li><li><p>Kato, N., Ohta, K., Ika, T., Mansfield, G., & Nemoto, Y. (1999). A proposal of event correlation for distributed network fault management and its evaluation. IEICE Transactions on Communications, 82(6), 859-867.</p></li><li><p>Cerullo, G., Coppolino, L., D’Antonio, S., Formicola, V., Papale, G., & Ragucci, B. (2016). Enabling convergence of physical and logical security through intelligent event correlation. In Intelligent Distributed Computing IX (pp. 427-437). Springer, Cham.</p></li><li><p>Alves, P. G. (2014). A Distributed Security Event Correlation Platform for SCADA (Doctoral dissertation, Universidade de Coimbra).</p></li><li><p>Zhang, B., & Al-Shaer, E. (2007, October). Self-organizing monitoring agents for hierarchical event correlation. In International Workshop on Distributed Systems: Operations and Management (pp. 13-24). Springer, Berlin, Heidelberg.</p></li><li><p>Katker, S. (1996, March). A modeling framework for integrated distributed systems fault management. In Proceedings of IFIP/IEEE International Conference on Distributed Platforms (pp. 186-198). IEEE.</p></li><li><p>Teufl, P., Payer, U., & Fellner, R. (2010, February). Event correlation on the basis of activation patterns. In 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing (pp. 631-640). IEEE.</p></li><li><p>Yoneki, E. (2010). Time/space aware event correlation. In Principles and Applications of Distributed Event-Based Systems (pp. 43-74). IGI Global.</p></li><li><p>Guo, N., Gao, T., Zhang, B., & Zhao, H. (2007, October). Distributed and scalable event correlation based on causality graph. In Asia-Pacific Network Operations and Management Symposium (pp. 567-570). Springer, Berlin, Heidelberg.</p></li><li><p>Tai, W., O'Sullivan, D., & Keeney, J. (2008, April). Distributed fault correlation scheme using a semantic publish/subscribe system. In NOMS 2008-2008 IEEE Network Operations and Management Symposium (pp. 835-838). IEEE.</p></li><li><p>Marwede, N., Rohr, M., van Hoorn, A., & Hasselbring, W. (2009, March). Automatic failure diagnosis support in distributed large-scale software systems based on timing behavior anomaly correlation. In 2009 13th European Conference on Software Maintenance and Reengineering (pp. 47-58). IEEE.</p></li><li><p>Fu, S., & Xu, C. Z. (2007, October). Quantifying temporal and spatial correlation of failure events for proactive management. In 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007) (pp. 175-184). IEEE.</p></li></ul><hr><h2 id="其他工作"><a href="#其他工作" class="headerlink" title="Other work"></a>Other Work</h2><p>This part records other work on correlation analysis encountered during the survey, especially in the IT operations domain</p><ul><li><p>Liu, P., Chen, Y., Nie, X., Zhu, J., Zhang, S., Sui, K., … & Pei, D. (2019, October). FluxRank: A Widely-Deployable Framework to Automatically Localizing Root Cause Machines for Software Service Failure Mitigation. In 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE) (pp. 35-46). IEEE.</p><p> Used after an anomaly has already been confirmed, to quickly localize its root cause, i.e., to pinpoint the servers responsible for the anomaly (rather than the offending code segment or a more specific cause)</p><p> The design rationale of FluxRank: when a system failure occurs, one usually first confirms the failure and then, in as little time as possible, migrates the affected services to servers unaffected by the anomaly so that the tasks can resume quickly; only afterwards is a long time spent analyzing the underlying cause, such as buggy code. This paper localizes anomalies down to the server level and focuses on fast mitigation of the anomaly rather than on finding the root cause.</p></li><li><p>Jayathilaka, H., Krintz, C., & Wolski, R. (2017, April). Performance monitoring and root cause analysis for cloud-hosted web applications. In Proceedings of the 26th International Conference on World Wide Web (pp. 
469-478).</p><p> This work is unrelated to correlation analysis; it mainly builds a monitoring system for PaaS platforms for fast anomaly localization and root cause analysis</p></li><li><p>Arzani, B., Ciraci, S., Loo, B. T., Schuster, A., & Outhred, G. (2016, August). Taking the blame game out of data centers operations with netpoirot. In Proceedings of the 2016 ACM SIGCOMM Conference (pp. 440-453).</p><p> This work involves correlation analysis, but has nothing to do with sequence data</p></li><li><p>Gao, J., Yaseen, N., MacDavid, R., Frujeri, F. V., Liu, V., Bianchini, R., … & Arzani, B. (2020, July). Scouts: Improving the Diagnosis Process Through Domain-customized Incident Routing. In Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication (pp. 253-269).</p><p> Entirely unrelated: it mainly does incident localization and does not involve sequence data</p></li></ul>]]></content>
<tags>
<tag> causal discovery </tag>
<tag> paper list </tag>
<tag> survey </tag>
<tag> correlation analysis </tag>
</tags>
</entry>
<entry>
<title>Related Papers in NeurIPS 2021 (2021.12.06 - 2021.12.14)</title>
<link href="/uncategorized/paperlistfile/NeuralIPS2021/"/>
<url>/uncategorized/paperlistfile/NeuralIPS2021/</url>
<content type="html"><![CDATA[<!--tags:按任务分类### Anomaly detection / Outlier / Out-of-distribution### Interpretable / Explainable### Causal discovery### Data augmentation 按数据分类### Time series### Missing value / Irregular sampled / Imputation### Sequence### Heterogeneous按深度学习架构分类### Recurrent neural network / RNN / LSTM / GRU ### Autoencoder--><p><a href="https://neurips.cc/Conferences/2021/Schedule?type=Poster">Accept paper lists</a></p><p><a href="https://nips.cc/Conferences/2021/DatasetsBenchmarks/AcceptedPapers">Benchmarks papers</a></p><span id="more"></span><h2 id="Main-Track"><a href="#Main-Track" class="headerlink" title="Main Track"></a>Main Track</h2><h3 id="Anomaly-detection-Outlier-Out-of-distribution"><a href="#Anomaly-detection-Outlier-Out-of-distribution" class="headerlink" title="Anomaly detection / Outlier / Out-of-distribution"></a>Anomaly detection / Outlier / Out-of-distribution</h3><ul><li><p>Online false discovery rate control for anomaly detection in time series </p><p> Quentin Rebjock, Baris Kurt, Tim Januschowski, Laurent Callot</p></li><li><p>Detecting Anomalous Event Sequences with Temporal Point Processes </p><p> Oleksandr Shchur, Ali Caner Turkmen, Tim Januschowski, Jan Gasthaus, Stephan Günnemann</p></li><li><p>Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection </p><p> HanQin Cai, Jialin Liu, Wotao Yin</p></li><li><p>Task-Agnostic Undesirable Feature Deactivation Using Out-of-Distribution Data</p><p> Dongmin Park · Hwanjun Song · Minseok Kim · Jae-Gil Lee</p></li><li><p>ReAct: Out-of-distribution Detection With Rectified Activations </p><p> Yiyou Sun, Chuan Guo, Yixuan Li</p></li><li><p>STEP: Out-of-Distribution Detection in the Presence of Limited In-Distribution Labeled Data </p><p> Zhi Zhou, Lan-Zhe Guo, Zhanzhan Cheng, Yu-Feng Li, Shiliang Pu</p></li><li><p>Locally Most Powerful Bayesian Test for Out-of-Distribution Detection using Deep Generative Models </p><p> Keunseo Kim, JunCheol Shin, Heeyoung Kim</p></li></ul><ul><li><p>Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection </p><p> Koby Bibas, Meir Feder, Tal Hassner</p></li></ul><h3 id="Interpretable-Explainable"><a href="#Interpretable-Explainable" class="headerlink" title="Interpretable / Explainable"></a>Interpretable / Explainable</h3><ul><li><p>Self-Interpretable Model with Transformation Equivariant Interpretation </p><p> Yipei Wang, Xiaoqian Wang</p></li><li><p>Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling </p><p> Naoya Takeishi, Alexandros Kalousis</p></li><li><p>Scalable Rule-Based Representation Learning for Interpretable Classification </p><p> Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang</p></li><li><p>Dynamic Inference with Neural Interpreters </p><p> Nasim Rahaman, Muhammad Waleed Gondal, Shruti Joshi, Peter Vincent Gehler, Yoshua Bengio, Francesco Locatello, Bernhard Schölkopf</p></li><li><p>Understanding Instance-based Interpretability of Variational Auto-Encoders </p><p> Zhifeng Kong, Kamalika Chaudhuri</p></li><li><p>Reliable Post hoc Explanations: Modeling Uncertainty in Explainability </p><p> Dylan Z Slack, Sophie Hilgard, Sameer Singh, Himabindu Lakkaraju</p></li><li><p>Explaining Latent Representations with a Corpus of Examples </p><p> Jonathan Crabbé, Zhaozhi Qian, Fergus Imrie, Mihaela van der Schaar</p></li></ul><h3 id="Causal-discovery"><a href="#Causal-discovery" class="headerlink" title="Causal discovery"></a>Causal discovery</h3><p>有46篇文章,暂时不整理</p><h3 
id="Data-augmentation"><a href="#Data-augmentation" class="headerlink" title="Data augmentation"></a>Data augmentation</h3><ul><li><p>Learning Debiased Representation via Disentangled Feature Augmentation </p><p> Jungsoo Lee, Eungyeup Kim, Juyoung Lee, Jihyeon Lee, Jaegul Choo</p></li><li><p>Data Augmentation Can Improve Robustness </p><p> Sylvestre-Alvise Rebuffi, Sven Gowal, Dan Andrei Calian, Florian Stimberg, Olivia Wiles, Timothy Mann</p></li><li><p>Predify: Augmenting deep neural networks with brain-inspired predictive coding dynamics </p><p> Bhavin Choksi, Milad Mozafari, Callum Biggs O’May, B. ADOR, Andrea Alamia, Rufin VanRullen</p></li><li><p>How Data Augmentation affects Optimization for Linear Regression </p><p> Boris Hanin, Yi Sun</p></li><li><p>AugMax: Adversarial Composition of Random Augmentations for Robust Training </p><p> Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Anima Anandkumar, Zhangyang Wang</p></li><li><p>Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data </p><p> Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy</p></li><li><p>Self-Supervised GANs with Label Augmentation </p><p> Liang Hou, Huawei Shen, Qi Cao, Xueqi Cheng</p></li><li><p>Explanation-based Data Augmentation for Image Classification </p><p> Sandareka Wickramanayake, Wynne Hsu, Mong-Li Lee</p></li></ul><hr><h3 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h3><ul><li><p>SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data </p><p> Alicia Curth, Changhee Lee, Mihaela van der Schaar</p></li><li><p>CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation </p><p> YUSUKE TASHIRO, Jiaming Song, Yang Song, Stefano Ermon</p></li><li><p>Coresets for Time Series Clustering </p><p> Lingxiao Huang, K. Sudhir, Nisheeth K Vishnoi</p></li><li><p>MixSeq: Connecting Macroscopic Time Series Forecasting with Microscopic Time Series Data </p><p> Zhibo Zhu, Ziqi Liu, Ge Jin, Zhiqiang Zhang, Lei Chen, JUN ZHOU, Jianyong Zhou</p></li><li><p>Deep Explicit Duration Switching Models for Time Series </p><p> Abdul Fatir Ansari, Konstantinos Benidis, Richard Kurle, Ali Caner Turkmen, Harold Soh, Alex Smola, Bernie Wang, Tim Januschowski</p></li><li><p>Online false discovery rate control for anomaly detection in time series </p><p> Quentin Rebjock, Baris Kurt, Tim Januschowski, Laurent Callot</p></li><li><p>Topological Attention for Time Series Forecasting </p><p> Sebastian Zeng, Florian Graf, Christoph Hofer, Roland Kwitt</p></li><li><p>Adjusting for Autocorrelated Errors in Neural Networks for Time Series </p><p> Fan-Keng Sun, Chris Lang, Duane S Boning</p></li><li><p>Probabilistic Transformer For Time Series Analysis </p><p> Binh Tang, David S. 
Matteson</p></li><li><p>Dynamical Wasserstein Barycenters for Time-series Modeling </p><p> Kevin C Cheng, Shuchin Aeron, Michael C Hughes, Eric Miller</p></li><li><p>Conformal Time-series Forecasting </p><p> Kamilė Stankevičiūtė, Ahmed Alaa, Mihaela van der Schaar</p></li><li><p>Time-series Generation by Contrastive Imitation </p><p> Daniel Jarrett, Ioana Bica, Mihaela van der Schaar</p></li></ul><hr><h3 id="Missing-value-Irregular-sampled-Imputation"><a href="#Missing-value-Irregular-sampled-Imputation" class="headerlink" title="Missing value / Irregular sampled / Imputation"></a>Missing value / Irregular sampled / Imputation</h3><ul><li><p>Identifiable Generative models for Missing Not at Random Data Imputation </p><p> Chao Ma, Cheng Zhang</p></li><li><p>What’s a good imputation to predict with missing values? </p><p> Marine Le Morvan, Julie Josse, Erwan Scornet, Gael Varoquaux</p></li><li><p>MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms </p><p> Trent Kyono, Yao Zhang, Alexis Bellot, Mihaela van der Schaar</p></li><li><p>Assessing Fairness in the Presence of Missing Data </p><p> Yiliang Zhang, Qi Long</p></li><li><p>Coresets for Clustering with Missing Values </p><p> Vladimir Braverman, Shaofeng H.-C. Jiang, Robert Krauthgamer, Xuan Wu</p></li><li><p>Time-series Generation by Contrastive Imitation </p><p> Daniel Jarrett, Ioana Bica, Mihaela van der Schaar</p></li><li><p>CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation </p><p> YUSUKE TASHIRO, Jiaming Song, Yang Song, Stefano Ermon</p></li></ul><hr><h3 id="Sequence"><a href="#Sequence" class="headerlink" title="Sequence"></a>Sequence</h3><ul><li><p>Duplex Sequence-to-Sequence Learning for Reversible Machine Translation </p><p> Zaixiang Zheng, Hao Zhou, Shujian Huang, Jiajun Chen, Jingjing Xu, Lei Li</p></li><li><p>Sequence-to-Sequence Learning with Latent Neural Grammars </p><p> Yoon Kim</p></li><li><p>A Constant Approximation Algorithm for Sequential Random-Order No-Substitution k-Median Clustering </p><p> Tom Hess, Michal Moshkovitz, Sivan Sabato</p></li><li><p>Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling </p><p> Hongyu Gong, Yun Tang, Juan Pino, Xian Li</p></li><li><p>Contrastively Disentangled Sequential Variational Autoencoder </p><p> Junwen Bai, Weiran Wang, Carla P Gomes</p></li><li><p>Detecting Anomalous Event Sequences with Temporal Point Processes </p><p> Oleksandr Shchur, Ali Caner Turkmen, Tim Januschowski, Jan Gasthaus, Stephan Günnemann</p></li></ul><hr><h3 id="Heterogeneous"><a href="#Heterogeneous" class="headerlink" title="Heterogeneous"></a>Heterogeneous</h3><ul><li><p>RelaySum for Decentralized Deep Learning on Heterogeneous Data </p><p> Thijs Vogels, Lie He, Anastasia Koloskova, Sai Praneeth Karimireddy, Tao Lin, Sebastian U Stich, Martin Jaggi</p></li><li><p>Distilling Meta Knowledge on Heterogeneous Graph for Illicit Drug Trafficker Detection on Social Media </p><p> Yiyue Qian, Yiming Zhang, Yanfang Ye, Chuxu Zhang</p></li><li><p>Distributed Machine Learning with Sparse Heterogeneous Data </p><p> Dominic Richards, Sahand Negahban, Patrick Rebeschini</p></li><li><p>FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout </p><p> Samuel Horváth, Stefanos Laskaridis, Mario Almeida, Ilias Leontiadis, Stylianos Venieris, Nicholas Donald Lane</p></li></ul><hr><h3 id="Recurrent-neural-network-RNN-LSTM-GRU"><a href="#Recurrent-neural-network-RNN-LSTM-GRU" 
class="headerlink" title="Recurrent neural network / RNN / LSTM / GRU"></a>Recurrent neural network / RNN / LSTM / GRU</h3><ul><li><p>Charting and Navigating the Space of Solutions for Recurrent Neural Networks </p><p> Elia Turner, Kabir Vinay Dabholkar, Omri Barak</p></li><li><p>Self-Instantiated Recurrent Units with Dynamic Soft Recursion </p><p> Aston Zhang, Yi Tay, Yikang Shen, Alvin Chan, Shuai Zhang</p></li><li><p>SBO-RNN: Reformulating Recurrent Neural Networks via Stochastic Bilevel Optimization </p><p> Ziming Zhang, Yun Yue, Guojun Wu, Yanhua Li, Haichong Zhang</p></li><li><p><strong>(需要看看)</strong> On the Provable Generalization of Recurrent Neural Networks </p><p> Lifu Wang, Bo Shen, Bo Hu, Xing Cao</p></li><li><p>Recurrence along Depth: Deep Convolutional Neural Networks with Recurrent Layer Aggregation </p><p> Jingyu Zhao, Yanwen Fang, Guodong Li</p></li><li><p>Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems </p><p> Jimmy T.H. Smith, Scott Linderman, David Sussillo</p></li><li><p>Noisy Recurrent Neural Networks </p><p> Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney</p></li><li><p>Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers </p><p> Albert Gu, Isys Johnson, Karan Goel, Khaled Kamal Saab, Tri Dao, Atri Rudra, Christopher Re</p></li><li><p>Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks </p><p> Avi Schwarzschild, Eitan Borgnia, Arjun Gupta, Furong Huang, Uzi Vishkin, Micah Goldblum, Tom Goldstein</p></li><li><p><strong>(需要看看)</strong> Learning and Generalization in RNNs </p><p> Abhishek Panigrahi, Navin Goyal</p></li><li><p>Framing RNN as a kernel method: A neural ODE approach </p><p> Adeline Fermanian, Pierre Marion, Jean-Philippe Vert, Gérard Biau</p></li><li><p>Structured in Space, Randomized in Time: Leveraging Dropout in RNNs for Efficient Training </p><p> Anup Sarma, Sonali Singh, Huaipan Jiang, Rui Zhang, Mahmut Kandemir, Chita Das</p></li></ul><hr><h3 id="Autoencoder"><a href="#Autoencoder" class="headerlink" title="Autoencoder"></a>Autoencoder</h3><ul><li><p>Statistical Regeneration Guarantees of the Wasserstein Autoencoder with Latent Space Consistency </p><p> Anish Chakrabarty, Swagatam Das</p></li><li><p>Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling </p><p> Naoya Takeishi, Alexandros Kalousis</p></li><li><p>On the Value of Infinite Gradients in Variational Autoencoder Models </p><p> Bin Dai, Li Kevin Wenliang, David Wipf</p></li><li><p><strong>(需要看看)</strong> Shape your Space: A Gaussian Mixture Regularization Approach to Deterministic Autoencoders </p><p> Amrutha Saseendran, Kathrin Skubch, Stefan Falkner, Margret Keuper</p></li><li><p>Neighborhood Reconstructing Autoencoders </p><p> Yonghyeon LEE, Hyeokjun Kwon, Frank C. 
Park</p></li><li><p>Model Selection for Bayesian Autoencoders </p><p> Ba-Hien Tran, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin V Bonilla, Maurizio Filippone</p></li><li><p>Clockwork Variational Autoencoders </p><p> Vaibhav Saxena, Jimmy Ba, Danijar Hafner</p></li><li><p>A Contrastive Learning Approach for Training Variational Autoencoder Priors </p><p> Jyoti Aneja, Alex Schwing, Jan Kautz, Arash Vahdat</p></li><li><p>Contrastively Disentangled Sequential Variational Autoencoder </p><p> Junwen Bai, Weiran Wang, Carla P Gomes</p></li><li><p>Permutation-Invariant Variational Autoencoder for Graph-Level Representation Learning </p><p> Robin Winter, Frank Noe, Djork-Arne Clevert</p></li></ul><hr><h3 id="Others"><a href="#Others" class="headerlink" title="Others"></a>Others</h3><ul><li><p>Robust and Fully-Dynamic Coreset for Continuous-and-Bounded Learning (With Outliers) Problems </p><p> Zixiu Wang, Yiwen Guo, Hu Ding</p></li><li><p>Automatic Unsupervised Outlier Model Selection </p><p> Yue Zhao, Ryan Rossi, Leman Akoglu</p></li><li><p>Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration </p><p> Yu Wang, Jingyang Lin, Jingjing Zou, Yingwei Pan, Ting Yao, Tao Mei</p></li><li><p>Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers </p><p> Nikita Dvornik, Isma Hadji, Konstantinos G. Derpanis, Animesh Garg, Allan Douglas Jepson</p></li><li><p>Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers </p><p> Tommaso d’Orsi, Chih-Hung Liu, Rajai Nasser, Gleb Novikov, David Steurer, Stefan Tiegel</p></li><li><p>Approximate optimization of convex functions with outlier noise </p><p> Anindya De, Sanjeev Khanna, Huan Li, Hesam Nikpey</p></li></ul><hr><h2 id="NeurIPS-2021-Datasets-and-Benchmarks-Accepted-Papers"><a href="#NeurIPS-2021-Datasets-and-Benchmarks-Accepted-Papers" class="headerlink" title="NeurIPS 2021 Datasets and Benchmarks Accepted Papers"></a>NeurIPS 2021 Datasets and Benchmarks Accepted Papers</h2><h3 id="Anomaly-detection-Outlier-Out-of-distribution-1"><a href="#Anomaly-detection-Outlier-Out-of-distribution-1" class="headerlink" title="Anomaly detection / Outlier / Out-of-distribution"></a>Anomaly detection / Outlier / Out-of-distribution</h3><ul><li><p>SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation</p><p> Robin Chan · Krzysztof Lis · Svenja Uhlemeyer · Hermann Blum · Sina Honari · Roland Siegwart · Pascal Fua · Mathieu Salzmann · Matthias Rottmann</p></li><li><p><strong>(worth a look)</strong> Revisiting Time Series Outlier Detection: Definitions and Benchmarks</p><p> Henry Lai · Daochen Zha · Junjie Xu · Yue Zhao · Guanchu Wang · Xia Hu</p></li></ul><h3 id="Interpretable-Explainable-1"><a href="#Interpretable-Explainable-1" class="headerlink" title="Interpretable / Explainable"></a>Interpretable / Explainable</h3><ul><li><p>Chaos as an interpretable benchmark for forecasting and data-driven modelling</p><p> William Gilpin</p></li><li><p>Synthetic Benchmarks for Scientific Research in Explainable Machine Learning</p><p> Yang Liu · Sujay Khandagale · Colin White · Willie Neiswanger</p></li><li><p>FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark </p><p> Mingjie Li · Wenjia Cai · Rui Liu · Yuetian Weng · Xiaoyun Zhao · Cong Wang · Xin Chen · Zhong Liu · Caineng Pan · Mengke Li · yingfeng zheng · Yizhi Liu · Flora Salim · Karin Verspoor · Xiaodan Liang · Xiaojun Chang</p></li><li><p>Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing</p><p> 
Sarah Wiegreffe · Ana Marasovic</p></li></ul><h3 id="causal-discovery"><a href="#causal-discovery" class="headerlink" title="Causal discovery"></a>Causal discovery</h3><ul><li><p>Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning</p><p> Nan Rosemary Ke · Aniket Didolkar · Sarthak Mittal · Anirudh Goyal · Guillaume Lajoie · Stefan Bauer · Danilo Jimenez Rezende · Michael Mozer · Yoshua Bengio · Chris Pal</p></li></ul><h3 id="Time-series-1"><a href="#Time-series-1" class="headerlink" title="Time series"></a>Time series</h3><ul><li><p>Revisiting Time Series Outlier Detection: Definitions and Benchmarks</p><p> Henry Lai · Daochen Zha · Junjie Xu · Yue Zhao · Guanchu Wang · Xia Hu</p></li><li><p>Monash Time Series Forecasting Archive</p><p> Rakshitha Godahewa · Christoph Bergmeir · Geoffrey Webb · Rob Hyndman · Pablo Montero-Manso</p></li><li><p>Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions</p><p> Chenyu Yi · SIYUAN YANG · Haoliang Li · Yap-peng Tan · Alex Kot</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Causal Discovery</title>
<link href="/uncategorized/surveys/causal_discovery/"/>
<url>/uncategorized/surveys/causal_discovery/</url>
<content type="html"><![CDATA[<ul><li>调研近年因果发现的最新研究</li><li>将”因果发现”与”根因分析”以及”基于因果图的因果发现”等概念理清关系</li><li>从优化的角度重新看待三类因果发现算法, 这可以帮助我们借鉴已有方法到自己的工作中。</li><li>总结用于因果发现的各种包, 以及Github上star较多的Repositories</li></ul><span id="more"></span><h2 id="1-因果三阶梯"><a href="#1-因果三阶梯" class="headerlink" title="1. 因果三阶梯"></a>1. 因果三阶梯</h2><p><strong>Causal discovery(CD)</strong> 包含三个层次 <a href="#ref1">[1]</a>: </p><ul><li>关联 <em>Association</em>: A、B相关</li><li>干预 <em>Intervention</em>: 改变A的时候, B如何改变</li><li>反事实 <em>Counterfactuals</em>: 要想改变B, 应该如何改变A</li></ul><p>事实上, 许多CD的方法都仅能做到<strong>前两个阶段</strong>。”反事实”通常是不可实现的, 因为”反事实”要求在与<u>历史完全相同的环境因素</u>下<strong>探究决策对个体的影响</strong>。</p><h2 id="2-因果的本质、数学表达-与-因果结构学习"><a href="#2-因果的本质、数学表达-与-因果结构学习" class="headerlink" title="2. 因果的本质、数学表达 与 因果结构学习"></a>2. 因果的本质、数学表达 与 因果结构学习</h2><p>这一节的分析源自下面对三类causal structure learning方法挖掘的因果本质的思考。后文中已经分析得出,三类causal structure learning方法发现的因果都只停留在由“条件依赖”或者“回归”所刻画的“<strong>关联关系</strong>”,而非真正的因果关系,因此本节的主题是</p><ul><li>调研casual discovery和causal structure learning之间的区别(这个问题和哲学上的因果定义有关)</li><li>除了DAG表示的因果图外,还有哪些表征因果的模型范式(DAG因果图就是节点代表变量,边代表直接原因的图模型)</li></ul><p>要分析causal discovery和causal structure learning之间的区别,有以下思路:</p><ul><li><p>已知causal structure learning中,已经用不同的模型对因果进行了定义,这种定义或是定性的、或是定量的,因此这类方法本质上就是在根据每种方法的定义,从数据中拟合所谓的“因果”。因此causal structure learning可以视为“学习以模型定义的因果”的一类方法</p></li><li><p>那除了用模型定义的因果,还有哪些定义呢。</p></li></ul><h3 id="2-1-因果的本质"><a href="#2-1-因果的本质" class="headerlink" title="2.1 因果的本质"></a>2.1 因果的本质</h3><p>文献<a href="#ref14">[14]</a>从哲学、社会学等层面探究了因果在AI领域的定义,在Section 6.5中提到了一个关键问题:“因果是<strong>确定性</strong>的还是<strong>概率性</strong>的?”,以下是一些结论:自古以来,因果这个概念都<strong>不具备一个统一的定义</strong>。自古以来,因果总是与决定论、物理必然性和逻辑必然性联系在一起(即:<strong>确定性</strong>)。但在现代科学所研究的因果理论几乎都是<strong>概率性</strong>的,即以概率论和统计学论为基础的。但也有部分确定性的表述,这种表述大多源自于工程学,是一种不证自明方法(这里我理解,在工程学里总有“改变A,基于设计原理,B就一定会改变,并且B只受A控制,因此AB是因果关系”的表述,这种工程学上的因果显然是确定性的)。</p><p>其实,因果在统计学上的定义也并未统一,文献<a href="#ref16">[16]</a>中列出了三种主要的观点,下文说明一种较常见的观点,来自于Good<a href="#ref17">[17]</a>:</p><blockquote><p>事件 $F$ 是事件 $E$ 的原因需要满足:</p><ol><li><p>事件 $E$ 和 $F$ 同时成立(对有时间约束的事件则是$F_t \leq E_t$)</p></li><li><p>$P(E|H)<1, P(F|H)<1$ ,其中$H$包含两部分:</p><ul><li><p>$H1$ : 所有的自然法则</p></li><li><p>$H2$ : 所有被认为是理所当然的真实背景条件</p></li></ul></li><li><p>$P(E|FH) > P(E|\bar{F}H)$</p></li></ol></blockquote><p>Good在后续的研究<a href="#ref18">[18]</a><a href="#ref19">[19]</a>中试图给出因果关系的一种定量描述。$F$ 对$E$ 的潜在因果趋势可以由下式度量</p><p>$$\log \frac{P(\bar{E}|\bar{F}H)}{P(\bar{E}|FH)}$$</p><blockquote><p>where $H$ consists of all laws of nature and the background conditions before $F$ started. Thus for $F$ to be a potential cause of $E$, the <strong>two must be probabilistically dependent conditional on $H$</strong>. The (actual causal) degree to which $F$ caused $E$ is the <strong>limit</strong>, as the sizes of the events tend uniformly to zero, of the strength of the network of causal connections between $E$ and $F$. 
<p>Following this line of thought, we can conclude:</p><ul><li>Philosophical definitions of causality are generally deterministic, whereas modern science usually defines causality probabilistically. It is defined this way partly because some earlier work also defined it so, and partly because classical probability suffices to model almost every aspect of human reasoning <a href="#ref15">[15]</a>. In this sense, <strong>probabilistic causality is a broader description than deterministic causality</strong>.</li><li>From the probabilistic standpoint, causal discovery can be described as the task of discovering probabilistic structure in data (i.e., whether the distribution of factor B changes when the distribution of factor A changes); in essence it fits a probabilistic model. In this sense, <strong>probabilistic models</strong> can also be used for causal discovery.</li></ul><h3 id="2-2-因果结构学习"><a href="#2-2-因果结构学习" class="headerlink" title="2.2 Causal structure learning"></a>2.2 Causal Structure Learning</h3><p>Over the past three decades, work on causal learning has generally focused on "causal structure learning". The resulting <strong>structural causal model (SCM)</strong> gives causality a probabilistic, statistical definition and describes the causal structure with an appropriate model. An SCM consists of two parts: </p><ul><li>Graphical models: causal relations represented by a graph, where nodes represent random variables and directed edges indicate causal direction</li><li>Structural equations: functional expressions of the causal effects along the directed edges of the graphical model</li></ul><p>The graphical model is thus one form of structural causal model, and the most widely studied one.</p><p><a href="#ref1">[1]</a> describes SCMs as follows:</p><blockquote><p><strong>“structural causal models”</strong> (SCM), which consists of three parts: <em>graphical models</em>, <em>structural equations,</em> and <em>counterfactual and interventional logic</em>. </p><p>Graphical models serve as a language for representing what agents know about the world. Counterfactuals help them articulate what they wish to know. And structural equations serve to tie the two together in a solid semantics.</p></blockquote><p><a href="#ref3">[3]</a>, in turn, strongly advocates graphical models as the representation of causal models:</p><blockquote><p>Methods for extracting causal conclusions from observational studies are on the <strong>middle</strong> rung of Pearl’s Ladder of Causation, and they can be expressed in a mathematical language that extends classical statistics and <strong>emphasizes graphical models</strong>.</p><p>Various options exist for causal models: causal diagrams, structural equations, logical statements, and so forth. I am strongly sold on causal diagrams for nearly all applications, primarily due to their transparency but also due to the explicit answers they provide to many of the questions we wish to ask.</p><p>…… </p><p>Pearl defines a causal model to be <strong>a directed acyclic graph</strong> that can be paired with data to produce quantitative causal estimates. The graph embodies the structural relationships that a researcher assumes are driving empirical results. The structure of the graphical model, including the identification of vertices as mediators, confounders, or colliders, can guide experimental design through the identification of minimal sets of control variables. Modern expositions on graphical cause and effect models are Pearl (2009) and Spirtes et al. (2000).</p></blockquote>
<p>In summary, the relations among causal discovery / probabilistic models / structural causal models / causal graphs are as follows:</p><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/20221024103453.png" width = "65%" /></div><p><strong>Remark</strong>: (1) Granger causality should count as a structural causal model, but not necessarily as a graphical model; (2) graphical models include not only DAGs but also other types of graphs.</p><h3 id="2-3-因果与统计模型的关联与区别"><a href="#2-3-因果与统计模型的关联与区别" class="headerlink" title="2.3 Relations and differences between causality and statistical models"></a>2.3 Relations and Differences between Causality and Statistical Models</h3><p>As mentioned above, broadly speaking the essence of causal learning is to learn a probabilistic model (and a probabilistic model can be viewed as a statistical model). However, <strong>probabilistic (or statistical) models and causality are two entirely different concepts</strong>.</p><p>First, one relation should be made explicit: the ladder of causation consists of the levels "association", "intervention", and "counterfactuals"; association is the common basis of the two concepts above, while intervention and counterfactuals are the levels unique to causality. <a href="#ref20">[20]</a> draws a clear dividing line between association and causation.</p><ul><li>Association: can be defined purely by the distribution function; examples: correlation, regression, dependence, conditional independence, etc.</li><li>Causation: cannot be defined by the distribution function alone; examples: influence, confounding, interference, etc. <strong>Causation cannot be derived from association</strong>, and indeed <strong>cannot even be defined purely in terms of statistical association</strong>.</li></ul><p>From a mathematical point of view, any mathematical method for causal analysis <strong>must acquire new notation</strong> to express causal relations. Probabilistic notation is not sufficient for representing causality.</p><ul><li>For example, the mathematical language of probability does not allow us to express the simple fact that "symptoms do not cause diseases", let alone draw mathematical conclusions from such facts. We can only say that two events are mutually dependent, meaning that if we find one we can expect to encounter the other; but we cannot <strong>distinguish statistical dependence</strong> (symptoms given the disease, as quantified by the conditional probability P) from <strong>causal dependence</strong>, for which standard probability calculus has no expression <a href="#ref20">[20]</a>.</li></ul><p>This is our motivation for building a new mathematical language for causal relations.</p><h3 id="2-4-因果分析中的数学表示"><a href="#2-4-因果分析中的数学表示" class="headerlink" title="2.4 Mathematical representation in causal analysis"></a>2.4 Mathematical Representation in Causal Analysis</h3><p><strong>Structural Causal Models</strong> </p><ul><li><p>Structural equations: the variable-subscript form / the form with the $do$ operator:</p> <p>$$P(Y_x = y) \Longleftrightarrow P(Y=y|\mathrm{set}(X=x)) \Longleftrightarrow P(Y=y|do(X=x))$$</p><ul><li><p>$P(Y_x = y)$: $Y$ is a random variable; this is the probability that $Y$ reaches the value $y$ in the population when the random variable $X=x$.</p></li><li><p>$P(Y=y|do(X=x))$: the probability of $Y=y$ when $X=x$ is uniformly imposed on every individual; commonly used for controlled experiments.</p></li></ul></li><li><p>Graphical models: causal assumptions are encoded in the <strong>missing links</strong>. An edge between two variables only expresses the <strong>possibility</strong> of a causal relation, whereas the absence of an edge asserts that there is no causation. The graph can be supplemented with equations to express the quantitative relations between causes and effects.</p></li></ul><p><strong>Expressing the magnitude of causal effects</strong></p><ul><li>$Cov(X, Y)$: once the causal assumptions are encoded by the models above, covariance can express the magnitude of a causal effect and assist in estimating causal parameters from observed, non-experimental data.</li></ul><p><strong>Intervention: mathematical expression</strong></p><p>Intervention is grounded in the structural equation model (SEM) and is carried out through the $do$ operator (see the simulation sketch at the end of this subsection).</p><ul><li>$do(x)$: simulates a physical interaction with the world. In a given SEM, <strong>delete</strong> the function that determines the variable $X$ in the original model and artificially <strong>replace</strong> the variable with a constant $X=x$, as shown in the figure below.<ul><li>$P(z, y | do(x))$: the post-intervention distribution of the variables $Y$ and $Z$, i.e., their joint distribution after $X=x$ has been imposed on all individuals</li><li>$P(z, y, x)$: the pre-intervention distribution of the variables</li></ul></li><li>Denote the pre-intervention causal model by $M$ and the model after the intervention $X=x$ by $M_x$<ul><li>$P_M (y|do(x)) \triangleq P_{M_x}(y)$: the distribution of $Y$ after applying the intervention $do(x)$ in the pre-intervention model $M$ is, by definition, the distribution of $Y$ in the post-intervention model $M_x$.</li></ul></li></ul><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221117170455716.png" alt="image-20221117170455716" style="zoom:50%;" /><p>Representing the effect of an intervention</p><ul><li>Causal effect: $P(Y=y|do(x))= \sum_z P(z, y|do(x))$</li><li>Measurements<ul><li>Average difference: $E(Y|do(x_0')) - E(Y|do(x_0))$</li><li>Experimental risk ratio: $\frac{E(Y|do(x_0'))}{E(Y|do(x_0))}$</li></ul></li></ul><p><strong>Counterfactuals: mathematical expression</strong></p><ul><li>$Y_x (u)$: the value of the variable $Y$ obtained by individual $u$ had the variable been set (has been set) to $X=x$</li><li>$Y_x (u) \triangleq Y_{M_x} (u)$: the unit-level counterfactual, defined as the value of $Y$ obtained by individual $u$ in the post-intervention model $M_x$.<ul><li>A counterfactual can be viewed as an intervention under specific contextual factors (identical to all factors at some past time).</li><li>Counterfactuals usually study the outcome of an intervention for an individual under specific contextual factors, whereas interventions usually study the <strong>average</strong> effect of a decision</li></ul></li></ul>
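<p>A small simulation helps separate level 1 from level 2. The sketch below assumes a hypothetical binary SCM with a confounder (the structure and all probabilities are made up for illustration): conditioning on $X$ gives the associational quantity, while the standard back-door adjustment for this graph recovers $P(Y=1|do(X=1))$.</p><pre><code class="python">import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical binary SCM (all names and probabilities are assumed):
# Z causes X, Z causes Y, X causes Y, so Z confounds the effect of X on Y.
z = (rng.random(n) > 0.5).astype(int)                      # P(Z=1) = 0.5
x = (rng.random(n) > 0.8 - 0.6 * z).astype(int)            # P(X=1|Z) = 0.2 + 0.6Z
y = (rng.random(n) > 0.9 - 0.4 * x - 0.3 * z).astype(int)  # P(Y=1|X,Z) = 0.1 + 0.4X + 0.3Z

# Level 1 (association): P(Y=1 | X=1), biased by the confounder Z.
naive = y[x == 1].mean()

# Level 2 (intervention): back-door adjustment for this graph,
# sum over z of P(Y=1|X=1,Z=z) P(Z=z), identifies P(Y=1 | do(X=1)).
adjusted = 0.0
for v in (0, 1):
    mask = np.logical_and(x == 1, z == v)
    adjusted += y[mask].mean() * (z == v).mean()

print(naive, adjusted)  # about 0.74 vs 0.65; the true P(Y=1|do(X=1)) is 0.65
</code></pre>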
<h3 id="2-5-反事实推理"><a href="#2-5-反事实推理" class="headerlink" title="2.5 Counterfactual reasoning"></a>2.5 Counterfactual Reasoning</h3><p>As the preceding discussion shows, causality goes beyond ordinary statistical models, and intervention and counterfactuals play the decisive role in that difference. This section first distinguishes intervention from counterfactuals, and then lays out the basic steps of counterfactual reasoning.</p><p>In terms of goals:</p><ul><li>Intervention answers: "If we intervene on variable X <strong>now</strong>, what will change for Z?"</li><li>Counterfactuals answer: "If variable X had been intervened on <strong>at some past time</strong>, what would have changed for Z?"</li></ul><p>The difference is that</p><ul><li><p>An "intervention" changes, but does not contradict, the observed world, because the pre- and post-intervention worlds exist at different times</p></li><li><p>A "counterfactual" conflicts with the known facts</p></li></ul><p>Note: the two are not strictly separated, and some studies do not distinguish them; this is a philosophical/cultural difference.</p><p><strong>Basic steps of counterfactual reasoning</strong> (a toy walkthrough follows at the end of this subsection)</p><p><strong>Background</strong>: an SCM and its parameters</p><ol><li><p>Step 1: Abduction. For the observed variables, posit a set $U$ of variables that may have been <strong>unobserved in the past</strong>, and learn $U$ from the historical observations</p></li><li><p>Step 2: Action. Build a new intervened model for the counterfactual inference, in which the intervention replaces one (or more) variables of the SCM</p></li><li><p>Step 3: Prediction. Substitute the historically unobserved variables $U$ and the <strong>intervention</strong> into the modified SCM and carry out the counterfactual inference</p></li></ol><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221117171058267.png" alt="image-20221117171058267" style="zoom:67%;" /><p><strong>Counterfactual reasoning, illustrated with a deterministic model</strong></p><p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221117171259748.png" alt="image-20221117171259748"></p><p>For probabilistic models, all configurations of the variables must be considered, and "prediction" turns from computing probabilities into computing expectations.</p><p>The following metrics can be used to <strong>evaluate</strong> whether an action has an effect on the outcome <a href="#ref20">[20]</a>; there are many other such metrics, e.g.</p><p>$$PN(x, y)=P\left(Y_{x'}=y' \mid X=x, Y=y\right) \\ ETT=P\left(Y_x=y \mid X=x'\right)$$</p>
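<p>The three steps can be replayed on a toy deterministic linear SCM; the model and the observed values below are assumptions for illustration only.</p><pre><code class="python"># Toy deterministic linear SCM (assumed for illustration): X = U_x, Y = 2X + U_y.
beta = 2.0
x_obs, y_obs = 1.0, 3.0   # a single observed individual u

# Step 1, abduction: recover the exogenous term consistent with the observation.
u_y = y_obs - beta * x_obs          # U_y = 1

# Step 2, action: replace the equation for X by the constant X = 2 (do-operator).
x_cf = 2.0

# Step 3, prediction: push the abducted U_y through the modified model.
y_cf = beta * x_cf + u_y
print(y_cf)  # Y_{X=2}(u) = 5, while the factual Y was 3
</code></pre>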
<h3 id="2-6-支撑材料"><a href="#2-6-支撑材料" class="headerlink" title="2.6 Supporting material"></a>2.6 Supporting Material</h3><p>"Structure learning is essentially a model selection problem: one selects the model that best describes the data dependencies in a given dataset. Causal structure learning is a special case of structure learning in which a causal graph is learned." This view is widely accepted.</p><p><strong>Remark</strong>: the three approaches to structure learning (constraint-based, score-based, function-based) can all, in essence, be seen as identifying causal structure through combinatorial/search algorithms <a href="#ref13">[13]</a></p><blockquote><p><strong>Structure learning is a model selection problem</strong> in which one estimates or learns a graph that best describes the dependence structure in a given data set (Drton & Maathuis 2017). <strong>Causal structure learning is the special case</strong> in which one tries to learn the <strong>causal graph</strong> or certain aspects of it, and this is what we focus on in this article.</p><p>—— Heinze-Deml, C., Maathuis, M. H., & Meinshausen, N. (2018). Causal structure learning. <em>Annual Review of Statistics and Its Application</em>, <em>5</em>, 371-391.</p></blockquote><p><strong>Q1: Besides DAGs, what other causal graphs are there?</strong> </p><p>Excerpted from <a href="#ref13">[13]</a></p><blockquote><p>Other types of graph used to represent causal structure include Partially Oriented Induced Path Graphs (POIPGs)[190, 228], SingleWorld Intervention Graphs (SWIGs) [24, 201, 202], σ-connection graphs [56], undirected graphs[11], interaction and component graphs for dynamic systems [40], Maximal Almost Ancestral Graphs (MAAGs)[231], psi-ECs [110], Patterns [274], and arid, bow-free, and ancestral ADMGs [19]. </p><p>There are also other types of assumptions relating to the functional form of the structural relationships (e.g., linear or non-linear) as well as the parametric form of the marginals and the errors (e.g., Gaussian or non-Gaussian).</p></blockquote><p><strong>Q2: What does big data contribute to causal discovery?</strong></p><p>Causality as defined by the three-rung ladder is usually hard to infer, because in most situations the experimenter can hardly "intervene", let alone evaluate "counterfactuals". Moreover, there may be "unobserved latent factors", and "causal knowledge is usually not available a priori" (my understanding: the lack of priors is what makes counterfactuals hard); these, too, are among the obstacles to causal discovery and estimation <a href="#ref13">[13]</a>.</p><blockquote><p>Unfortunately, in many cases, it may not be possible to undertake such experiments due to prohibitive cost, ethical concerns, or impracticality. For example, to understand the impact of smoking, it would be necessary to force diferent individuals to smoke or not-smoke. <strong>Researchers are therefore often left with non-experimental, observational data.</strong> In the absence of intervention and manipulation, observational data leave researchers facing a number of challenges: Firstly, observational datasets may not contain all relevant variables - <strong>there may exist unobserved/hidden/latent factors</strong> (this is sometimes referred to as the third variable problem). Secondly, observational data may <strong>exhibit selection bias</strong> - for example, younger patients may in general prefer to opt for surgery, whereas older patients may prefer medication. Thirdly, the causal <strong>relationships underlying these data may not be known a priori</strong> - for example, are genetic factors independent causes of a particular outcome, or do they mediate or moderate an outcome? These three challenges afect the discovery and estimation of causal relationships</p></blockquote><p>The role that big data and machine learning algorithms play in causal discovery can therefore be described as follows:</p><ul><li>The sheer volume of big data compensates for the "unobserved or missed variables" and the "selection bias" caused by insufficient observation; that is, big data lets us observe more variables, and the more complete the observation, the higher the probability of inferring the true causality.</li><li>Big data can also mitigate selection bias through sample size (my understanding: although younger people may be more inclined to choose surgery, as the sample grows we can also find older people inclined to undergo surgery).</li><li>Big data may even help us perform counterfactuals. For a periodic system, for instance, with enough observed cycles we may find a time point at which only one cause variable changes while all other variables match the historical state; such a scenario is favorable for counterfactual causal inference.</li></ul><p>In addition, comparing experimental data (i.e., data with an "intervention-effect" structure) with observational data (i.e., data containing only passive observation and no active intervention), observational data can provide better statistical power and generalizability <a href="#ref13">[13]</a>.</p><h2 id="3-Casual-Structure-Learning"><a href="#3-Casual-Structure-Learning" class="headerlink" title="3. Causal Structure Learning"></a>3. Causal Structure Learning</h2><h3 id="3-1-Casual-structure-learning的三类方法"><a href="#3-1-Casual-structure-learning的三类方法" class="headerlink" title="3.1 The three classes of causal structure learning methods"></a>3.1 The Three Classes of Causal Structure Learning Methods</h3><p>Causal structure learning is classically divided into three <strong>main categories</strong>: constraint-based methods, score-based methods, and functional causal models <a href="#ref2">[2]</a>; there are also some hybrid methods, which are not listed here.</p><p><strong>Remark</strong>: <a href="#ref13">[13]</a> also proposes the taxonomy "constraint-based, score-based, those exploiting structural asymmetries, and those exploiting various forms of intervention", which is more recent and may fit recent (2022) work better.</p><ul><li><p><strong>Constraint-based methods</strong>: these methods rely on <strong>conditional independence (CI) tests</strong> between random variables to explore the causal structure among the variables</p><ul><li><p>In the classical PC algorithm, two graph constructs defined on top of CI, namely the v-structure and d-separation, are used to make the derivation of causal conclusions tractable; they help derive both the structure and the direction of causal relations. Concretely, PC first constructs a complete graph, then deletes some undirected edges via pairwise independence tests, and finally, based on CI tests together with v-structures / d-separation, orients the remaining edges or deletes some of them.</p></li><li><p>Drawbacks: </p><ol><li>No unobserved confounders may exist; this condition is hard to satisfy in the big-data setting, although algorithms such as FCI relax the restriction</li><li>By the causal faithfulness assumption, causal relations can only be judged from conditional independence, so very large amounts of high-quality data are required; with little data, the conditional independence tests may contradict one another</li><li>For chain and fork structures, this class of algorithms cannot tell the graphs of a <strong>Markov equivalence class</strong> apart from conditional independence alone, so its ability to identify local causal relations is limited</li></ol><ul><li><strong>Markov equivalence class</strong>: causal graphs that share the same d-separation structure, and hence the same conditional independence relations, form a Markov equivalence class; the causal directions within it cannot be distinguished on the basis of conditional independence.</li></ul></li></ul></li><li><p><strong>Score-based methods</strong>: these methods first specify the functional relation from causal parent nodes to child nodes, and then optimize some score, such as AIC / BIC, to obtain the <strong>graph structure</strong> and the associated parameters.</p><ul><li>For example, NOTEARS assumes the functional relation $x=\sum w_x f(P_a(x))$, where $w_x$ is the weight of variable $x$ and $P_a(x)$ denotes its causal parents</li><li>Drawbacks: <ol><li>These methods can also end up with a Markov equivalence class.</li><li>Finding the optimal score requires searching over all graphs, an NP-hard problem of extremely high complexity that easily gets stuck in local optima.</li></ol></li></ul></li><li><p><strong>Functional causal models</strong>: these methods typically investigate the causal direction between two variables that are already known to be associated. They first make assumptions about the data and about the causal function, and then test whether the two variables satisfy the assumed functional relation in order to decide the direction between them.</p><ul><li><p>For example, LiNGAM assumes a <strong>linear relation</strong> between cause and effect, with <strong>non-Gaussian noise</strong> in the data</p></li><li><p>Because this class of methods makes strict assumptions and generally identifies a unique causal direction from the goodness of fit of the function, Markov equivalence classes generally do not arise</p></li><li><p>Note that FCMs are generally methods for the causal relation <strong>between two variables</strong>, as stated in <a href="#ref4">[4]</a>. The drawback of this class is its strong assumptions about the data characteristics and about causality.</p><blockquote><p>Determining causal relationships <strong>between two variables</strong> is a fundamental and challenging causal discovery task (Janzing et al., 2012). <strong>Conventional constraint-based and score-based causal discovery methods identify causal structures only up to Markov equivalent classes (Spirtes et al., 2001), in which some causal relationships are undetermined.</strong> To address this challenge, properly constrained functional causal models (<strong>FCMs</strong>) have been proposed. FCMs represent the effect as a function of its cause and independent noise and can help identify the causal direction between two variables by imposing substantial structural constraints on model classes, such as additive noise models (ANMs)</p></blockquote></li></ul></li></ul><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221018113112863.png" width = "75%" /></div>
href="#ref4">[4]</a>所述。该类方法的缺点是对数据特性以及因果有较强的假设。</p><blockquote><p>Determining causal relationships <strong>between two variables</strong> is a fundamental and challenging causal discovery task (Janzing et al., 2012). <strong>Conventional constraint-based and score-based causal discovery methods identify causal structures only up to Markov equivalent classes (Spirtes et al., 2001), in which some causal relationships are undetermined.</strong> To address this challenge, properly constrained functional causal models (<strong>FCMs</strong>) have been proposed. FCMs represent the effect as a function of its cause and independent noise and can help identify the causal direction between two variables by imposing substantial structural constraints on model classes, such as additive noise models (ANMs)</p></blockquote></li></ul></li></ul><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221018113112863.png" width = "75%" /></div><p><strong>需要指出的是: 图模型只是一种因果关系的表示方式, 但该表示方式不是必要的。</strong></p><p>上述三类方法, 由于其假设不同、检验方法不同, 因此挖掘出的”因果”具有不同含义(即因果图中的有向边具有不同的含义), 分属不同的因果阶梯: </p><ul><li>Constraint-based: 该类方法通过一系列变量之间的CI, 学习变量之间的因果关系。 该类方法挖掘出的因果本质是”<strong>条件依赖</strong>“, 这类依赖属于”<strong>关联</strong>“层面。</li><li>Score-based / Functional causal model: 该类方法首先定义了因果变量之间满足的<strong>函数关系</strong>, 前者优化全局得分函数来确定变量间的因果, 后者采用穷举优化算法搜索变量间的因果。这两类方法挖掘出的因果本质是由”回归关系”表示的”关联”。</li></ul><p><strong>Q3:CD与RCA的关系</strong></p><p>此外, 强调一下Root Cause Analysis(RCA)与CD的关系, 即: 一般的RCA更关注Causal discovery的前两个等级, 即探究”什么和异常相关?””什么导致了异常?”, 这就是为何许多RCA的方法都<strong>只考虑了关联、推理</strong>, 因此Causal discovery和RCA的关系如下: </p><div align="center"> <img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img/image-20221018113441012.png" width = "33%" /></div><h3 id="3-2-从优化的角度分析三类方法"><a href="#3-2-从优化的角度分析三类方法" class="headerlink" title="3.2 从优化的角度分析三类方法"></a>3.2 从优化的角度分析三类方法</h3><p>为了理解三类Casual structure learning的方法, 这里从优化的角度对三类方法进行阐述。</p><p>首先用$\mathcal{G}$表示$d$个节点所构成的有向图空间。对 $\forall G(\boldsymbol{M}) \in \mathcal{G}$ , 用 $\boldsymbol{M}$ 表示对应的邻接矩阵, 反过来, 用 $G(\boldsymbol{M})$ 表示以 $\boldsymbol{M}$ 为邻接矩阵的有向图。元素 $M_{i,j}=1$ 表示存在因果关系 $\boldsymbol{x}_i \rightarrow \boldsymbol{x}<em>j$ ,$M</em>{i,j}=0$ 表示不存在因果关系。 $\boldsymbol{X}=[\boldsymbol{x}_1, \cdots, \boldsymbol{x}_d] \in \mathbb{R}^{n \times d}$ 表示 $d$ 维的观测数据, 其中每个数据观测 $n$ 次。$\mathbb{D}$ 表示由 $d$ 个节点组成的有向无环图集合。</p><h3 id="3-2-1-Constraint-based-Method"><a href="#3-2-1-Constraint-based-Method" class="headerlink" title="3.2.1 Constraint-based Method"></a>3.2.1 Constraint-based Method</h3><p>基于约束的算法利用 从一系列统计测试中获得的一组条件独立性结果 来恢复因果图。</p><p>当从数据 $\boldsymbol{X}$ 中已经学习到每一对变量间条件独立检验的最小测试统计量,如p-value,因此可以构造出测试统计量矩阵 $\boldsymbol{P}$ ,其中对角线上的元素均为0,元素 $P_{i,j}$ 表示 $\boldsymbol{x}_i , \boldsymbol{x}_j$ 间的条件独立性检验的测试统计量,假设检验的显著性水平为 $\alpha$ 。 $f$ 表示条件独立检验统计量的函数, $Q$ 表示用于评价得到的统计量矩阵与图 $G$ 的拟合程度的函数。</p><p>该类方法的优化问题表述为: </p><p>$$\begin{array}{}\min : & Q(\boldsymbol{M}, f(\boldsymbol{P}, \alpha)) \\s.t. 
<h3 id="3-2-2-Score-based-Method"><a href="#3-2-2-Score-based-Method" class="headerlink" title="3.2.2 Score-based Method"></a>3.2.2 Score-based Method</h3><p>Score-based algorithms construct the causal structure by maximizing the fitness between a graph $G$ and the observed data $\boldsymbol{X}$. The optimization problem of this class of methods reads: </p><p>$$\begin{array}{ll}\max & S\left( \boldsymbol{M} ,\boldsymbol{X} \right) \\ \text{s.t.} & G\left( \boldsymbol{M} \right) \in \mathbb{D} \\ \text{var} & \boldsymbol{M} \in \left\{ 0,1\right\}^{d\times d} \end{array}$$</p><p>The DAG constraint was rewritten in <a href="#ref6">[6]</a> as $\mathrm{tr}\left(e^{\boldsymbol{M} \circ \boldsymbol{M}}\right) - d = 0$, which makes the objective amenable to continuous optimization. $S(\cdot)$ is the fitness score between the graph and the observed data; usable score functions include BIC (GES <a href="#ref7">[7]</a>), BDe <a href="#ref8">[8]</a>, and BGe <a href="#ref9">[9]</a>. Different methods adopt different search algorithms to optimize this objective over the graph space, e.g., greedy search <a href="#ref8">[8]</a>, order search <a href="#ref10">[10]</a>, and coordinate descent <a href="#ref5">[5]</a>.</p><p><strong>Example</strong>: the score function in NOTEARS <a href="#ref11">[11]</a> is</p><p>$$\mathcal{S}(\boldsymbol{M}, \boldsymbol{X})=\frac{1}{2 n} \sum_{t=1}^n\left\|\boldsymbol{x}_{t,:}-\boldsymbol{f}\left(\boldsymbol{M}, \boldsymbol{x}_{t,:}\right)\right\|_F^2$$</p><p>where $\boldsymbol{x}_{t,:} \in \mathbb{R}^{1 \times d}$ denotes the $t$-th observed sample and $\boldsymbol{f}$ is the generative model.</p>
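<p>The continuous DAG constraint of [6] is easy to evaluate directly; a minimal sketch (the weight matrices are toy examples):</p><pre><code class="python">import numpy as np
from scipy.linalg import expm

def notears_h(W):
    # Smooth acyclicity function from [6]: h(W) = tr(exp(W o W)) - d, where
    # "o" is the elementwise (Hadamard) product; h(W) = 0 exactly when the
    # weighted adjacency matrix W describes a DAG.
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

dag = np.array([[0.0, 1.5, 0.0],
                [0.0, 0.0, 2.0],
                [0.0, 0.0, 0.0]])  # strictly upper triangular: acyclic
cyc = dag.copy()
cyc[2, 0] = 0.7                    # closes a directed cycle through all nodes

print(notears_h(dag))  # 0.0 up to numerical error
print(notears_h(cyc))  # strictly positive, so the constraint is violated
</code></pre>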
<h3 id="3-2-3-Functional-Causal-Model"><a href="#3-2-3-Functional-Causal-Model" class="headerlink" title="3.2.3 Functional Causal Model"></a>3.2.3 Functional Causal Model</h3><p>FCM-based algorithms assume that the causal relation between variables satisfies the function $\boldsymbol{x}_j=f(\boldsymbol{x}_i,\boldsymbol{e}_j;\boldsymbol{\theta}_{i,j})$, where $\boldsymbol{x}_i$ and $\boldsymbol{x}_j$ are the direct-cause variable and the effect variable respectively, $\boldsymbol{e}_j \in \mathbb{R}^{n}$ represents unmeasurable factors or noise, $\boldsymbol{\epsilon}=[\boldsymbol{e}_1, \cdots, \boldsymbol{e}_d]$, and $\boldsymbol{\theta}$ are the model parameters. Below, $L(\cdot)$ denotes a function measuring the fit between the predictions of the model with parameters $\boldsymbol{\theta}$ and the actually observed data $\boldsymbol{x}_j$. The optimization problem of this class of methods reads: </p><p>$$\begin{array}{ll}\min: & \sum_{i, j=1:d} M_{i, j} \cdot\left(L\left(\boldsymbol{x}_j, f\left(\boldsymbol{x}_i, \theta_{i, j}\right)\right)+Q\left(\boldsymbol{x}_i, \boldsymbol{x}_j-f\left(\boldsymbol{x}_i, \theta_{i, j}\right)\right)\right) \\ \text{s.t.} & G(\boldsymbol{M}) \in \mathbb{D} \\ \text{var:} & \boldsymbol{\theta}, \boldsymbol{M} \in\{0,1\}^{d \times d}\end{array}$$</p><p>With the hypothesis $H_1: \boldsymbol{x}_i \perp (\boldsymbol{x}_j - f(\boldsymbol{x}_i,\boldsymbol{\theta}_{i,j}))$, the term $Q\left(\boldsymbol{x}_i, \boldsymbol{x}_j-f\left(\boldsymbol{x}_i, \theta_{i, j}\right)\right)$ is the test statistic of the independence test between $\boldsymbol{x}_i$ and $\boldsymbol{x}_j-f\left(\boldsymbol{x}_i, \theta_{i, j}\right)$; $H_1$ is accepted when the statistic is below the significance level $\alpha$ (this corresponds to the family of regression-based independence tests).</p>
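<p>A toy sketch of this recipe for a single variable pair, assuming a linear mechanism with non-Gaussian (uniform) noise in the spirit of LiNGAM. A crude nonlinear-correlation score stands in for a proper regression-based independence test such as HSIC, and the data-generating numbers are assumptions.</p><pre><code class="python">import numpy as np

def dep_score(a, b):
    # Crude dependence proxy: correlations between nonlinear transforms, which
    # vanish when a and b are independent (a stand-in for a real test, e.g. HSIC).
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return abs(np.mean(np.tanh(a) * b)) + abs(np.mean(a * np.tanh(b)))

def anm_direction(x, y):
    # Fit a linear regression in both directions and measure the dependence
    # between the regressor and the residual; the causal direction should
    # leave a residual that is independent of the cause.
    bxy = np.cov(x, y)[0, 1] / np.var(x)
    byx = np.cov(x, y)[0, 1] / np.var(y)
    s_xy = dep_score(x, y - bxy * x)   # hypothesis: x causes y
    s_yx = dep_score(y, x - byx * y)   # hypothesis: y causes x
    return "x -> y" if s_yx > s_xy else "y -> x"

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 20000)             # non-Gaussian cause
y = 2.0 * x + rng.uniform(-1.0, 1.0, 20000)   # linear mechanism, uniform noise
print(anm_direction(x, y))  # "x -> y": only this direction leaves an independent residual
</code></pre>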
<h2 id="Reference"><a href="#Reference" class="headerlink" title="Reference"></a>Reference</h2><p><a name="ref1">[1]</a> Pearl, J. (2019). The seven tools of causal inference, with reflections on machine learning. <em>Communications of the ACM</em>, <em>62</em>(3), 54-60.</p><p><a name="ref2">[2]</a> Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. <em>Frontiers in Genetics</em>, <em>10</em>, 524.</p><p><a name="ref3">[3]</a> Goldberg, L. R. (2019). The Book of Why: The New Science of Cause and Effect: by Judea Pearl and Dana Mackenzie, <em>Basic Books</em> (2018). ISBN: 978-0465097609.</p><p><a name="ref4">[4]</a> Tu, R., Zhang, K., Kjellström, H., & Zhang, C. (2022). Optimal transport for causal discovery. In <em>ICLR 2022 - The Tenth International Conference on Learning Representations (Virtual), Apr 25th-29th, 2022</em>.</p><p><a name="ref5">[5]</a> Kalisch, M., & Bühlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. <em>Journal of Machine Learning Research</em>, <em>8</em>(3).</p><p><a name="ref6">[6]</a> Zheng, X., Aragam, B., Ravikumar, P., & Xing, E. P. (2018). DAGs with NO TEARS: continuous optimization for structure learning. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS'18). Curran Associates Inc., Red Hook, NY, USA, 9492–9503.</p><p><a name="ref7">[7]</a> Chickering, D. M., & Heckerman, D. (1997). Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables. <em>Machine Learning</em>, <em>29</em>(2-3), 181–212.</p><p><a name="ref8">[8]</a> Heckerman, D., Geiger, D., & Chickering, D. M. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. <em>Machine Learning</em>, <em>20</em>(3), 197–243.</p><p><a name="ref9">[9]</a> Kuipers, J., Moffa, G., & Heckerman, D. (2014). Addendum on the scoring of Gaussian directed acyclic graphical models. <em>The Annals of Statistics</em>, 1689–1691.</p><p><a name="ref10">[10]</a> Fu, F., & Zhou, Q. (2013). Learning sparse causal Gaussian networks with experimental intervention: Regularization and coordinate descent. <em>Journal of the American Statistical Association</em>, <em>108</em>(501), 288–300.</p><p><a name="ref11">[11]</a> Zheng, X., Aragam, B., Ravikumar, P. K., & Xing, E. P. (2018). Dags with no tears: Continuous optimization for structure learning. <em>Advances in Neural Information Processing Systems</em>, <em>31</em>.</p><p><a name="ref12">[12]</a> Shimizu, S., Hoyer, P. O., Hyvärinen, A., Kerminen, A., & Jordan, M. (2006). A linear non-Gaussian acyclic model for causal discovery. <em>Journal of Machine Learning Research</em>, <em>7</em>(10).</p><p><a name="ref13">[13]</a> Vowels, M. J., Camgoz, N. C., & Bowden, R. (2022). D'ya Like DAGs? A Survey on Structure Learning and Causal Discovery. <em>ACM Computing Surveys</em>. Just Accepted (March 2022). <a href="https://doi.org/10.1145/3527154">https://doi.org/10.1145/3527154</a></p><p><a name="ref14">[14]</a> Starmans, R. (2020). Prometheus unbound or Paradise regained: the concept of Causality in the contemporary AI-Data Science debate. <em>Journal de la Société Française de Statistique</em>, <em>161</em>(1), 4-41.</p><p><a name="ref15">[15]</a> Cheeseman, P. (1985). In defense of probability. In Proceedings of the Ninth International Joint Conference on AI (IJCAI, 1985).</p><p><a name="ref16">[16]</a> Williamson, J. (2009). Probabilistic theories of causality. <em>The Oxford Handbook of Causation</em>, 185-212.</p><p><a name="ref17">[17]</a> Good, I. J. (1959). A theory of causality. <em>The British Journal for the Philosophy of Science</em>, <em>9</em>(36), 307-310.</p><p><a name="ref18">[18]</a> Good, I. J. (1961). A causal calculus (I). <em>The British Journal for the Philosophy of Science</em>, <em>11</em>(44), 305-318.</p><p><a name="ref19">[19]</a> Good, I. J. (1961). A causal calculus (II). <em>The British Journal for the Philosophy of Science</em>, <em>12</em>(45), 43-51.</p><p><a name="ref20">[20]</a> Pearl, J. (2009). Causal inference in statistics: An overview. <em>Statistics Surveys</em>, <em>3</em>, 96-146.</p><h2 id="2020-2022最新论文列表"><a href="#2020-2022最新论文列表" class="headerlink" title="Latest papers 2020-2022"></a>Latest Papers 2020-2022</h2><ol><li>Jalaldoust, A., Hlaváčková-Schindler, K., & Plant, C. (2022, June). Causal Discovery in Hawkes Processes by Minimum Description Length. In <em>Proceedings of the AAAI Conference on Artificial Intelligence</em> (Vol. 36, No. 6, pp. 6978-6987).【Granger causal graph learning in high-dimensional Hawkes processes】</li><li>Zhang, H., Zhang, K., Zhou, S., Guan, J., & Zhang, J. (2021, May). Testing independence between linear combinations for causal discovery. In <em>Proceedings of the AAAI Conference on Artificial Intelligence</em> (Vol. 35, No. 7, pp. 6538-6546).【Independence between two linear combinations under the linear non-Gaussian structural equation model, a special problem in conditional independence testing】</li><li>Lu, N. Y., Zhang, K., & Yuan, C. (2021, May). Improving causal discovery by optimal bayesian network learning. In <em>Proceedings of the AAAI Conference on Artificial Intelligence</em> (Vol. 35, No. 10, pp. 8741-8748).【Proposes a new exhaustive optimization method within the score-based family】</li><li>Hyttinen, A., Eberhardt, F., & Järvisalo, M. (2014, July). Constraint-based Causal Discovery: Conflict Resolution with Answer Set Programming. In <em>UAI</em> (pp. 340-349).【Treats causal graph search as a constrained optimization problem】</li><li>Dhir, A., & Lee, C. M. (2020, April). Integrating overlapping datasets using bivariate causal discovery. In <em>Proceedings of the AAAI Conference on Artificial Intelligence</em> (Vol. 34, No. 04, pp. 3781-3790).【The problem of learning a consistent causal structure from multiple datasets】</li><li>Huang, B., Zhang, K., Gong, M., & Glymour, C. (2020, April). Causal discovery from multiple data sets with non-identical variable sets. In <em>Proceedings of the AAAI Conference on Artificial Intelligence</em> (Vol. 34, No. 06, pp. 10153-10161).【Causal discovery from multiple datasets with non-identical variable sets】</li><li>Maeda, T. N., & Shimizu, S. (2020, June). RCD: Repetitive causal discovery of linear non-Gaussian acyclic models with latent confounders. In <em>International Conference on Artificial Intelligence and Statistics</em> (pp. 735-745). PMLR.【Uses functional causal models (rather than the usual constraint-based approach) to discover causality in data affected by latent confounders】</li><li>Tu, R., Zhang, C., Ackermann, P., Mohan, K., Kjellström, H., & Zhang, K. (2019, April). Causal discovery in the presence of missing data. In <em>The 22nd International Conference on Artificial Intelligence and Statistics</em> (pp. 1762-1770). PMLR.【<strong>Causal discovery with missing data</strong>】</li><li>Feng, G., Yu, K., Wang, Y., Yuan, Y., & Djurić, P. M. (2020, May). Improving convergent cross mapping for causal discovery with Gaussian processes. In <em>ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em> (pp. 3692-3696). IEEE.【Causal discovery between coupled time series】</li><li>Lippe, P., Cohen, T., & Gavves, E. (2021). Efficient neural causal discovery without acyclicity constraints. <em>arXiv preprint arXiv:2107.10483</em>.【A neural-network-based, score-based causal discovery method】</li><li>Tu, R., Zhang, K., Kjellström, H., & Zhang, C. (2022). Optimal transport for causal discovery. In <em>ICLR 2022 - The Tenth International Conference on Learning Representations (Virtual), Apr 25th-29th, 2022</em>.【Rewrites the FCM approach with <strong>optimal transport</strong> theory and performs causal discovery in optimization form】</li><li>Zhu, S., Ng, I., & Chen, Z. (2019, September). Causal Discovery with Reinforcement Learning. In <em>International Conference on Learning Representations</em>.【This paper contains a good survey of <strong>the various optimization methods</strong>】</li><li>Huang, B., Zhang, K., Gong, M., & Glymour, C. (2019, May). Causal discovery and forecasting in nonstationary environments with state-space models. In <em>International conference on machine learning</em> (pp. 2901-2910). PMLR.【Causal discovery in nonstationary time series】</li><li>Empirical Bayesian Approaches for Robust Constraint-based Causal Discovery under Insufficient Data【Small data, nonstationary】</li><li>Brouillard, P., Lachapelle, S., Lacoste, A., Lacoste-Julien, S., & Drouin, A. (2020). Differentiable causal discovery from interventional data. <em>Advances in Neural Information Processing Systems</em>, <em>33</em>, 21865-21877.【Causal discovery under a weak faithfulness assumption】</li><li>Mokhtarian, E., Akbari, S., Ghassami, A., & Kiyavash, N. (2021, August). A recursive markov boundary-based approach to causal structure learning. In The KDD'21 Workshop on Causal Discovery (pp. 26-54). PMLR.【A constraint-based method with a recursive optimization procedure】</li></ol>
<h2 id="值得关注的最新工作"><a href="#值得关注的最新工作" class="headerlink" title="Noteworthy recent work"></a>Noteworthy Recent Work</h2><ol><li>Bhattacharya, R., Nagarajan, T., Malinsky, D., & Shpitser, I. (2021, March). Differentiable causal discovery under unmeasured confounding. In <em>International Conference on Artificial Intelligence and Statistics</em> (pp. 2314-2322). PMLR.【Causal graphical model discovery in confounded systems, where the nodes may be defined differently】</li><li>Brouillard, P., Lachapelle, S., Lacoste, A., Lacoste-Julien, S., & Drouin, A. (2020). Differentiable causal discovery from interventional data. <em>Advances in Neural Information Processing Systems</em>, <em>33</em>, 21865-21877.【Somewhat similar to the one above】</li><li>S. Ren, H. Yin, M. Sun and P. Li, "Causal Discovery with Flow-based Conditional Density Estimation," <em>2021 IEEE International Conference on Data Mining (ICDM)</em>, 2021, pp. 1300-1305, doi: 10.1109/ICDM51629.2021.00161.【Uses a flow model to estimate the joint density of the variables and infers a score for each potential causal direction from the variance of the conditional density estimates; this causal discovery method relaxes the restrictive assumptions of traditional methods and better captures the complex causal relations, of arbitrary form, among data in various problem domains】</li><li>Zhang, H., Zhou, S., Zhang, K., & Guan, J. (2022, June). Residual Similarity Based Conditional Independence Test and Its Application in Causal Discovery. In <em>Proceedings of the AAAI Conference on Artificial Intelligence</em> (Vol. 36, No. 5, pp. 5942-5949).【Turns CI testing into an optimization problem】</li></ol><hr><h2 id="ToolBox"><a href="#ToolBox" class="headerlink" title="ToolBox"></a>ToolBox</h2><ul><li>gCastle <a href="https://github.com/huawei-noah/trustworthyAI/tree/master/gcastle">URL</a></li></ul><blockquote><p>gCastle is a causal structure learning toolchain developed in-house by Huawei's Noah's Ark Lab. Its main features and vision include: </p><ol><li>Data generation and processing: various simulated-data generators, data loaders, and data processing operators (such as prior injection, variable selection, CRAM).</li><li>Causal graph construction: a Python library of causal structure learning algorithms, covering the mainstream causal learning algorithms as well as the recently emerging gradient-based causal structure learning algorithms.</li><li>Causal evaluation: common performance metrics for causal structure learning, including F1, SHD, FDR, TPR, NNZ, etc.</li></ol></blockquote><ul><li>Causal Discovery Toolbox <a href="https://github.com/FenTechSolutions/CausalDiscoveryToolbox">URL</a></li></ul><blockquote><p>The Causal Discovery Toolbox is a package for causal inference in graphs and in the pairwise settings for Python>=3.5.<br>Tools for graph structure recovery and dependencies are included. The package is based on Numpy, Scikit-learn, Pytorch and R.</p></blockquote><ul><li>Tigramite <a href="https://github.com/jakobrunge/tigramite">URL</a></li></ul><blockquote><p>Tigramite is a causal time series analysis Python package. It allows causal graphs to be reconstructed efficiently from high-dimensional time series datasets and models the obtained causal dependencies for causal mediation and prediction analyses. Causal discovery is based on linear as well as non-parametric conditional independence tests applicable to discrete or continuously-valued time series.</p><ul><li>Included causal discovery methods: PCMCI, PCMCIplus, LPCMCI</li><li>Included independence tests: ParCorr, GPDC / GPDCtorch, CMIknn, CMIsymb</li></ul></blockquote>
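<p>A minimal usage sketch for Tigramite's PCMCI on synthetic data. The import layout below matches older Tigramite releases (newer versions moved ParCorr into a submodule such as tigramite.independence_tests.parcorr), and the coupling coefficients and series are assumptions for illustration.</p><pre><code class="python">import numpy as np
from tigramite import data_processing as pp
from tigramite.pcmci import PCMCI
from tigramite.independence_tests import ParCorr  # newer releases use a submodule path

# Three coupled autoregressive series (synthetic, for illustration only).
rng = np.random.default_rng(0)
T = 500
data = rng.standard_normal((T, 3))
for t in range(1, T):
    data[t, 1] += 0.6 * data[t - 1, 0]   # X0 drives X1 at lag 1
    data[t, 2] += 0.6 * data[t - 1, 1]   # X1 drives X2 at lag 1

dataframe = pp.DataFrame(data, var_names=["X0", "X1", "X2"])
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr())
results = pcmci.run_pcmci(tau_max=2, pc_alpha=0.05)
print(results["p_matrix"].shape)  # (3, 3, tau_max + 1)
</code></pre>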
<ul><li>causalDisco: an R package with tools for causal discovery on observational data <a href="https://github.com/annennenne/causalDisco">URL</a></li></ul><blockquote><p>causalDisco includes an implementation of temporal PC</p></blockquote><ul><li>Causal Discovery Tools for Time Series Applications - A Collection of Tutorials <a href="https://github.com/savinims/DATAS_Causal_Discovery">URL</a></li></ul><blockquote><p>Written as part of the Data Analysis Tools for Atmospheric Scientists (DATAS) gateway, the tutorials focus on atmospheric science applications. Data: <a href="https://datasgateway.colostate.edu/">https://datasgateway.colostate.edu/</a></p><p>The methods explained in this repository focus on observational studies, where controlled experiments (e.g., targeted modeling studies in climate science) are not carried out to determine causes and effects.<br>These methods let you identify "potential" relationships that then need to be further validated using existing domain knowledge of the specific application.</p><p>Methods included: bivariate Granger causality tests, and a time series extension of the PC-stable algorithm</p></blockquote><h2 id="Related-work-with-code"><a href="#Related-work-with-code" class="headerlink" title="Related work with code"></a>Related work with code</h2><p>[1] TCDF: Causal Discovery with Attention-Based Convolutional Neural Networks <a href="https://github.com/M-Nauta/TCDF">URL</a></p><blockquote><p>The Temporal Causal Discovery Framework (TCDF) is a deep learning framework implemented in PyTorch. Given multiple time series as input, TCDF discovers the causal relationships among these time series and outputs a causal graph.<br>It can also predict one time series from the others. TCDF uses attention-based convolutional neural networks combined with a causal validation step. By interpreting the internal parameters of the convolutional networks, TCDF can also discover the time delay between a cause and its effect.</p></blockquote><p>[2] Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data <a href="https://github.com/loeweX/AmortizedCausalDiscovery">URL</a></p><blockquote><p>With Amortized Causal Discovery, we learn to infer causal relations from samples that have different underlying causal graphs but share the same dynamics. This lets us generalize across samples and thereby improve performance by increasing the amount of training data.</p></blockquote><p>[3] Causal Discovery from Nonstationary/Heterogeneous Data: Skeleton Estimation and Orientation Determination. IJCAI 2017. <a href="https://github.com/Biwei-Huang/Causal-Discovery-from-Nonstationary-Heterogeneous-Data">URL</a></p><p>[4] Causal Discovery in Heavy-Tailed Models <a href="https://github.com/nicolagnecco/causalXtreme">URL</a></p><p>[5] Differentiable Causal Discovery from Interventional Data <a href="https://github.com/slachapelle/dcdi">URL</a></p><p>[6] Generalized Score Functions for Causal Discovery. KDD, 2018 <a href="https://github.com/Biwei-Huang/Generalized-Score-Functions-for-Causal-Discovery">URL</a></p><blockquote><p>Causal structure learning with greedy equivalence search using generalized score functions (applicable to mixed continuous and discrete data, data with Gaussian or non-Gaussian distributions, linear or non-linear causal mechanisms, and variables with multiple dimensions.)</p></blockquote><p>[7] Learning the Causal Structure of Copula Models with Latent Variables. UAI. 2018 <a href="https://github.com/cuiruifei/CopulaFactorModel">URL</a></p><p>[8] Data Generating Process to Evaluate Causal Discovery Techniques for Time Series Data, at the Causal Discovery & Causality-Inspired Machine Learning Workshop at NeurIPS 2020. <a href="https://github.com/causalens/cdml-neurips2020">URL</a></p><p>[9] Process Mining Meets Causal Machine Learning: Discovering Causal Rules from Event Logs <a href="https://github.com/zahradbozorgi/CausalRulesDiscovery">URL</a></p>]]></content>
<tags>
<tag> note </tag>
<tag> causal discovery </tag>
<tag> paper list </tag>
<tag> survey </tag>
<tag> structural causal model </tag>
<tag> toolbox </tag>
</tags>
</entry>
<entry>
<title>Related Papers in IJCAI 2021 (2021.08.19-2021.08.26)</title>
<link href="/uncategorized/paperlistfile/IJCAI2021/"/>
<url>/uncategorized/paperlistfile/IJCAI2021/</url>
<content type="html"><![CDATA[<p>Accept papers: <a href="https://ijcai-21.org/program-main-track/">link</a></p><span id="more"></span><h2 id="anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-Fraud-Detection-Fake-News-Detection"><a href="#anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-Fraud-Detection-Fake-News-Detection" class="headerlink" title="anomaly detection (anomaly, outlier, out-of-distribution, one-class, Malware detection, Fraud Detection, Fake News Detection)"></a>anomaly detection (anomaly, outlier, out-of-distribution, one-class, Malware detection, Fraud Detection, Fake News Detection)</h2><ul><li><p>Masked Contrastive Learning for Anomaly Detection</p><p>Hyunsoo Cho (Seoul National University)<br>Jinseok Seol (Seoul National University)<br>Sang-goo Lee (Seoul National University)</p></li><li><p>Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video</p><p>Jie Wu (Sun Yat-sen University ByteDance Inc.)<br>Wei Zhang (Baidu Inc.)<br>Guanbin Li (Sun Yat-sen University)<br>Wenhao Wu (Baidu Inc.)<br>Xiao Tan (Baidu Inc.)<br>Yingying Li (Baidu Inc.)<br>Errui Ding (Baidu Inc.)<br>Liang Lin (Sun Yat-sen University)</p></li><li><p>RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection</p><p>Boyang Liu (Michigan State University)<br>Ding Wang (Michigan State University)<br>Kaixiang Lin (Michigan State University)<br>Pang-Ning Tan (Michigan State University)<br>Jiayu Zhou (Michigan State University)</p></li><li><p>Understanding the Effect of Bias in Deep Anomaly Detection</p><p>Ziyu Ye (University of Chicago)<br>Yuxin Chen (University of Chicago)<br>Haitao Zheng (University of Chicago)</p></li><li><p>Likelihood-free Out-of-Distribution Detection with Invertible Generative Models</p><p>Amirhossein Ahmadian (Division of Statistics and Machine Learning, Department of Computer and Information Science, Linköping University)<br>Fredrik Lindsten (Division of Statistics and Machine Learning, Department of Computer and Information Science, Linköping University)</p></li><li><p>MG-DVD: A Real-time Framework for Malware Variant Detection Based on Dynamic Heterogeneous Graph Learning</p><p>Chen Liu (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Bo Li (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Jun Zhao (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Ming Su (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Xu-Dong Liu (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)</p></li><li><p>Online Credit Payment Fraud Detection via Structure-Aware Hierarchical Recurrent Neural Network</p><p>Wangli Lin (Alibaba Group, Hangzhou, China)<br>Li Sun (Alibaba Group, Hangzhou, China)<br>Qiwei Zhong (Alibaba Group, Hangzhou, China)<br>Can Liu (Alibaba Group, Hangzhou, China)<br>Jinghua Feng (Alibaba Group, Hangzhou, China)<br>Xiang Ao (Institute of Computing Technology, Chinese 
Academy of Sciences, Beijing, China)<br>Hao Yang (Alibaba Group, Hangzhou, China)</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Time-Series Representation Learning via Temporal and Contextual Contrasting</p><p>Emadeldeen Eldele (School of Computer Science and Engineering, Nanyang Technological University, Singapore)<br>Mohamed Ragab (School of Computer Science and Engineering, Nanyang Technological University, Singapore)<br>Zhenghua Chen (Institute for Infocomm Research, A*STAR, Singapore)<br>Min Wu (Institute for Infocomm Research, A*STAR, Singapore)<br>Chee Keong Kwoh (School of Computer Science and Engineering, Nanyang Technological University, Singapore)<br>Xiaoli Li (Institute for Infocomm Research, A*STAR, Singapore)<br>Cuntai Guan (School of Computer Science and Engineering, Nanyang Technological University, Singapore)</p></li></ul><ul><li><p>TE-ESN: Time Encoding Echo State Network for Prediction Based on Irregularly Sampled Time Series Data</p><p>Chenxi Sun (Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China. School of Electronics Engineering and Computer Science, Peking University, Beijing, China.)<br>Shenda Hong (National Institute of Health Data Science, Peking University, Beijing, China. Institute of Medical Technology, Health Science Center of Peking University, Beijing, China.)<br>Moxian Song (Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China. School of Electronics Engineering and Computer Science, Peking University, Beijing, China.)<br>Yen-Hsiu Chou (Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China. School of Electronics Engineering and Computer Science, Peking University, Beijing, China.)<br>Yongyue Sun (Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China. School of Electronics Engineering and Computer Science, Peking University, Beijing, China.)<br>Derun Cai (Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China. School of Electronics Engineering and Computer Science, Peking University, Beijing, China.)<br>Hongyan Li (Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing, China.
School of Electronics Engineering and Computer Science, Peking University, Beijing, China.)</p></li><li><p>Time-Aware Multi-Scale RNNs for Time Series Modeling</p><p>Zipeng Chen (School of Computer Science and Engineering, South China University of Technology, Guangzhou, China)<br>Qianli Ma (School of Computer Science and Engineering, South China University of Technology, Guangzhou, China Key Laboratory of Big Data and Intelligent Robot (South China University of Technology), Ministry of Education)<br>Zhenxi Lin (School of Computer Science and Engineering, South China University of Technology, Guangzhou, China)</p></li><li><p>Adversarial Spectral Kernel Matching for Unsupervised Time Series Domain Adaptation</p><p>Qiao Liu (School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China MOE Key Laboratory of Computer Network and Information Integration (Southeast University))<br>Hui Xue (School of Computer Science and Engineering, Southeast University, Nanjing, 210096, China MOE Key Laboratory of Computer Network and Information Integration (Southeast University))</p></li><li><p>Two Birds with One Stone: Series Saliency for Accurate and Interpretable Multivariate Time Series Forecasting</p><p>Qingyi Pan (High Performance Computing Group, Dept. of Comp. Sci. and Tech., BNRist Center, Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University, Beijing, 100084 China)<br>Wenbo Hu (RealAI)<br>Ning Chen (High Performance Computing Group, Dept. of Comp. Sci. and Tech., BNRist Center, Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, Tsinghua University, Beijing, 100084 China)</p></li></ul><h2 id="heterogeneous-multi-source"><a href="#heterogeneous-multi-source" class="headerlink" title="heterogeneous (multi-source)"></a>heterogeneous (multi-source)</h2><ul><li><p>Adapting Meta Knowledge with Heterogeneous Information Network for COVID-19 Themed Malicious Repository Detection</p><p>Yiyue Qian (Department of Computer and Data Sciences, Case Western Reserve University, USA)<br>Yiming Zhang (Department of Computer and Data Sciences, Case Western Reserve University, USA)<br>Yanfang Ye (Department of Computer and Data Sciences, Case Western Reserve University, USA)<br>Chuxu Zhang (Department of Computer Science, Brandeis University, USA)</p></li><li><p>Temporal Heterogeneous Information Network Embedding</p><p>Hong Huang (National Engineering Research Center for Big Data Technology and System Service Computing Technology and Systems Laboratory Huazhong University of Science and Technology, China)<br>Ruize Shi (National Engineering Research Center for Big Data Technology and System Service Computing Technology and Systems Laboratory Huazhong University of Science and Technology, China)<br>Wei Zhou (Huazhong University of Science and Technology, China)<br>Xiao Wang (Beijing University of Posts and Telecommunications, China)<br>Hai Jin (National Engineering Research Center for Big Data Technology and System Service Computing Technology and Systems Laboratory Huazhong University of Science and Technology, China)<br>Xiaoming Fu (University of Goettingen, Germany)</p></li></ul><h2 id="Graph-Representation-Learning"><a href="#Graph-Representation-Learning" class="headerlink" title="Graph Representation Learning"></a>Graph Representation Learning</h2><ul><li><p>MG-DVD: A Real-time Framework for Malware Variant Detection Based on Dynamic Heterogeneous Graph Learning</p><p>Chen Liu (School of Computer Science and Engineering, Beihang University, Beijing, 
China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Bo Li (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Jun Zhao (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Ming Su (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)<br>Xu-Dong Liu (School of Computer Science and Engineering, Beihang University, Beijing, China Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China)</p></li><li><p>Heterogeneous Graph Information Bottleneck</p><p>Liang Yang (Hebei University of Technology, Tianjin, China Institute of Information Engineering, CAS, Beijing, China)<br>Fan Wu (Hebei University of Technology, Tianjin, China)<br>Zichen Zheng (Hebei University of Technology, Tianjin, China)<br>Bingxin Niu (Hebei University of Technology, Tianjin, China)<br>Junhua Gu (Hebei University of Technology, Tianjin, China)<br>Chuan Wang (Institute of Information Engineering, CAS, Beijing, China)<br>Xiaochun Cao (Institute of Information Engineering, CAS, Beijing, China)<br>Yuanfang Guo (Beihang University, Beijing, China)</p></li><li><p>Learning Attributed Graph Representation with Communicative Message Passing Transformer</p><p>Jianwen Chen (School of Computer Science and Engineering, Sun Yat-sen University)<br>Shuangjia Zheng (School of Computer Science and Engineering, Sun Yat-sen University Galixir Technologies Ltd, Beijing)<br>Ying Song (School of System Science and Engineering, Sun Yat-sen University)<br>Jiahua Rao (School of Computer Science and Engineering, Sun Yat-sen University Galixir Technologies Ltd, Beijing)<br>Yuedong Yang (School of Computer Science and Engineering, Sun Yat-sen University Key Laboratory of Machine Intelligence and Advanced Computing, Sun Yat-sen University)</p></li><li><p>CuCo: Graph Representation with Curriculum Contrastive Learning</p><p>Guanyi Chu (Beijing University of Posts and Telecommunications)<br>Xiao Wang (Beijing University of Posts and Telecommunications)<br>Chuan Shi (Beijing University of Posts and Telecommunications)<br>Xunqiang Jiang (Beijing University of Posts and Telecommunications)</p></li><li><p>Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning</p><p>Ming Jin (Monash University)<br>Yizhen Zheng (Monash University)<br>Yuan-Fang Li (Monash University)<br>Chen Gong (Nanjing University of Science and Technology)<br>Chuan Zhou (Chinese Academy of Sciences)<br>Shirui Pan (Monash University)</p></li></ul><h2 id="sequence"><a href="#sequence" class="headerlink" title="sequence"></a>sequence</h2><ul><li><p>k-Nearest Neighbors by Means of Sequence to Sequence Deep Neural Networks and Memory Networks</p><p>Yiming Xu (Northwestern University)<br>Diego Klabjan (Northwestern University)</p></li><li><p>A Novel Sequence-to-Subgraph Framework for Diagnosis Classification</p><p>Jun Chen (Baidu Inc, Beijing 100193, China)<br>Quan Yuan (Baidu Inc, Beijing 100193, China)<br>Chao Lu (Baidu Inc, Beijing 100193, China)<br>Haifeng Huang (Baidu Inc, Beijing 100193, China)</p></li><li><p>Multi-series Time-aware Sequence Partitioning for Disease Progression Modeling</p><p>Xi Yang 
(Department of Computer Science, North Carolina State University)<br>Yuan Zhang (Department of Computer Science, North Carolina State University)<br>Min Chi (Department of Computer Science, North Carolina State University)</p></li></ul><h2 id="Autoencoder"><a href="#Autoencoder" class="headerlink" title="Autoencoder"></a>Autoencoder</h2><ul><li><p>Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness</p><p>Dazhong Shen (School of Computer Science and Technology, University of Science and Technology of China Baidu Talent Intelligence Center)<br>Chuan Qin (Baidu Talent Intelligence Center)<br>Chao Wang (School of Computer Science and Technology, University of Science and Technology of China Baidu Talent Intelligence Center)<br>Hengshu Zhu (Baidu Talent Intelligence Center)<br>Enhong Chen (School of Computer Science and Technology, University of Science and Technology of China)<br>Hui Xiong (Rutgers, The State University of New Jersey)</p></li></ul><h2 id="Recurrent-Neural-Network"><a href="#Recurrent-Neural-Network" class="headerlink" title="Recurrent Neural Network"></a>Recurrent Neural Network</h2><ul><li><p>Change Matters: Medication Change Prediction with Recurrent Residual Networks</p><p>Chaoqi Yang (University of Illinois at Urbana-Champaign)<br>Cao Xiao (IQVIA)<br>Lucas Glass (IQVIA)<br>Jimeng Sun (University of Illinois at Urbana-Champaign)</p></li><li><p>State-Based Recurrent SPMNs for Decision-Theoretic Planning under Partial Observability</p><p>Layton Hayes (Institute for AI, University of Georgia, Athens GA 30602)<br>Prashant Doshi (Institute for AI, University of Georgia, Athens GA 30602 Department of Computer Science, University of Georgia, Athens GA 30602)<br>Swaraj Pawar (Dept. of Computer Science, University of Georgia, Athens GA 30602)<br>Hari Teja Tatavarti (Institute for AI, University of Georgia, Athens GA 30602)</p></li><li><p>Online Credit Payment Fraud Detection via Structure-Aware Hierarchical Recurrent Neural Network</p><p>Wangli Lin (Alibaba Group, Hangzhou, China)<br>Li Sun (Alibaba Group, Hangzhou, China)<br>Qiwei Zhong (Alibaba Group, Hangzhou, China)<br>Can Liu (Alibaba Group, Hangzhou, China)<br>Jinghua Feng (Alibaba Group, Hangzhou, China)<br>Xiang Ao (Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China)<br>Hao Yang (Alibaba Group, Hangzhou, China)</p></li></ul><h2 id="causal-analysis"><a href="#causal-analysis" class="headerlink" title="causal analysis"></a>causal analysis</h2><ul><li><p>Causal Discovery with Multi-Domain LiNGAM for Latent Factors</p><p>Yan Zeng (Guangdong University of Technology RIKEN)<br>Shohei Shimizu (Shiga University RIKEN)<br>Ruichu Cai (Guangdong University of Technology)<br>Feng Xie (Peking University)<br>Michio Yamamoto (Okayama University RIKEN)<br>Zhifeng Hao (Guangdong University of Technology Foshan University)</p></li><li><p>Inferring Time-delayed Causal Relations in POMDPs from the Principle of Independence of Cause and Mechanism</p><p>Junchi Liang (Department of Computer Science, Rutgers University, New Jersey, USA)<br>Abdeslam Boularias (Department of Computer Science, Rutgers University, New Jersey, USA)</p></li><li><p>User Retention: A Causal Approach with Triple Task Modeling</p><p>Yang Zhang (Ant Group Beihang University)<br>Dong Wang (Ant Group)<br>Qiang Li (Ant Group)<br>Yue Shen (Ant Group)<br>Ziqi Liu (Ant Group)<br>Xiaodong Zeng (Ant Group)<br>Zhiqiang Zhang (Ant Group)<br>Jinjie Gu (Ant Group)<br>Derek F. 
Wong (University of Macau)</p></li><li><p>Ordering-Based Causal Discovery with Reinforcement Learning</p><p>Xiaoqiang Wang (State Key Laboratory for Manufacturing Systems Engineering, School of Automation Science and Engineering, Xi’an Jiaotong University)<br>Yali Du (University College London)<br>Shengyu Zhu (Huawei Noah’s Ark Lab)<br>Liangjun Ke (State Key Laboratory for Manufacturing Systems Engineering, School of Automation Science and Engineering, Xi’an Jiaotong University)<br>Zhitang Chen (Huawei Noah’s Ark Lab)<br>Jianye Hao (Huawei Noah’s Ark Lab College of Intelligence and Computing, Tianjin University)<br>Jun Wang (University College London)</p></li><li><p>Dependent Multi-Task Learning with Causal Intervention for Image Captioning</p><p>Wenqing Chen (MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University State Key Lab of Advanced Optical Communication System and Network, Shanghai Jiao Tong University)<br>Jidong Tian (MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University State Key Lab of Advanced Optical Communication System and Network, Shanghai Jiao Tong University)<br>Caoyun Fan (MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University State Key Lab of Advanced Optical Communication System and Network, Shanghai Jiao Tong University)<br>Hao He (MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University State Key Lab of Advanced Optical Communication System and Network, Shanghai Jiao Tong University)<br>Yaohui Jin (MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University State Key Lab of Advanced Optical Communication System and Network, Shanghai Jiao Tong University)</p></li><li><p>Provable Guarantees on the Robustness of Decision Rules to Causal Interventions</p><p>Benjie Wang (University of Oxford)<br>Clare Lyle (University of Oxford)<br>Marta Kwiatkowska (University of Oxford)</p></li><li><p>A Ladder of Causal Distances</p><p>Maxime Peyrard (EPFL)<br>Robert West (EPFL)</p></li></ul><h2 id="correlation-analysis-association-analysis"><a href="#correlation-analysis-association-analysis" class="headerlink" title="correlation analysis (association analysis)"></a>correlation analysis (association analysis)</h2><ul><li><p>Differentially Private Correlation Alignment for Domain Adaptation</p><p>Kaizhong Jin (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China)<br>Xiang Cheng (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China)<br>Jiaxi Yang (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China)<br>Kaiyuan Shen (State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China)</p></li><li><p>Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction</p><p>Guofeng Lv (SenseTime Research)<br>Zhiqiang Hu (SenseTime Research)<br>Yanguang Bi (SenseTime Research)<br>Shaoting Zhang (SenseTime Research)</p></li><li><p>Correlation-Guided Representation for Multi-Label Text Classification</p><p>Qian-Wen Zhang (Tencent Cloud Xiaowei, Beijing 100080, China)<br>Ximing Zhang (Beijing University of Posts and Telecommunications, Beijing 100876, China)<br>Zhao Yan (Tencent Cloud Xiaowei, Beijing 100080, China)<br>Ruifang
Liu (Beijing University of Posts and Telecommunications, Beijing 100876, China)<br>Yunbo Cao (Tencent Cloud Xiaowei, Beijing 100080, China)<br>Min-Ling Zhang (School of Computer Science and Engineering, Southeast University, Nanjing 210096, China Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China)</p></li><li><p>Location Predicts You: Location Prediction via Bi-direction Speculation and Dual-level Association</p><p>Xixi Li (National Engineering Research Center for Multimedia Software (NERCMS), School of Computer Science, Wuhan University)<br>Ruimin Hu (National Engineering Research Center for Multimedia Software (NERCMS), School of Computer Science, Wuhan University)<br>Zheng Wang (Research Institute for an Inclusive Society through Engineering (RIISE), The University of Tokyo Department of Information and Communication Engineering, The University of Tokyo)<br>Toshihiko Yamasaki (Research Institute for an Inclusive Society through Engineering (RIISE), The University of Tokyo Department of Information and Communication Engineering, The University of Tokyo)</p></li></ul><h2 id="clustering"><a href="#clustering" class="headerlink" title="clustering"></a>clustering</h2><h2 id="About-distribution"><a href="#About-distribution" class="headerlink" title="About distribution"></a>About distribution</h2><h2 id="missing-value-amp-irregularly-sampled-time-series-Incomplete-imputation-…"><a href="#missing-value-amp-irregularly-sampled-time-series-Incomplete-imputation-…" class="headerlink" title="missing value & irregularly sampled time series [Incomplete, imputation, …]"></a>missing value & irregularly sampled time series [Incomplete, imputation, …]</h2><h2 id="interpretable-Understanding-explanation-Attribution-…"><a href="#interpretable-Understanding-explanation-Attribution-…" class="headerlink" title="interpretable [Understanding, explanation, Attribution …]"></a>interpretable [Understanding, explanation, Attribution …]</h2>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Related Papers in SIGKDD 2021 (2021.08.14-2021.08.18)</title>
<link href="/uncategorized/paperlistfile/KDD2021/"/>
<url>/uncategorized/paperlistfile/KDD2021/</url>
<content type="html"><![CDATA[<p>Accept papers: <a href="https://kdd.org/kdd2021/accepted-papers/index">link</a></p><span id="more"></span><h2 id="anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-Fraud-Detection-Fake-News-Detection…"><a href="#anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-Fraud-Detection-Fake-News-Detection…" class="headerlink" title="anomaly detection (anomaly, outlier, out-of-distribution, one-class, Malware detection, Fraud Detection, Fake News Detection…)"></a>anomaly detection (anomaly, outlier, out-of-distribution, one-class, Malware detection, Fraud Detection, Fake News Detection…)</h2><ul><li><p>ELITE : Robust Deep Anomaly Detection with Meta Gradient</p><p>Authors: Huayi Zhang (WPI); Lei Cao (MIT)$^{\star}$; Peter VanNostrand (WPI); Samuel Madden (MIT); Elke A Rundensteiner (WPI)</p></li><li><p>Joint Optimization of Known and Unknown Anomaly Detection</p><p>Authors: Guansong Pang (University of Adelaide)$^{\star}$; Anton van den Hengel (University of Adelaide); Chunhua Shen (University of Adelaide); Longbing Cao (University of Technology Sydney)</p></li><li><p>Multi-Scale One-Class Recurrent Neural Networks for Discrete Event Sequence Anomaly Detection</p><p>Authors: Zhiwei Wang (Michigan State University)$^{\star}$; Zhengzhang Chen (NEC Laboratories America, Inc.); Jingchao Ni ( NEC Laboratories America); Hui Liu (Michigan State University); Haifeng Chen (NEC Labs); Jiliang Tang (Michigan State University)</p></li><li><p>Multivariate Time Series Anomaly Detection and Interpretation using Hierarchical Inter-Metric and Temporal Embedding</p><p>Authors: Zhihan Li (Tsinghua University)$^{\star}$; Youjian Zhao (Tsinghua University); Jiaqi Han (Tsinghua University); Ya Su (Tsinghua University); Rui Jiao (Tsinghua University); Xidao Wen (Tsinghua University); Dan Pei (Tsinghua University)</p></li><li><p>Practical Approach to Asynchronous Multivariate Time Series Anomaly Detection and Localization</p><p>Authors: Ahmed Abdulaal (eBay)$^{\star}$; Zhuanghua Liu (eBay); Tomer Lancewicki (EBay)</p></li><li><p>Time Series Anomaly Detection for Cyber-physical Systems via Neural System Identification and Bayesian Filtering</p><p>Authors: Cheng Feng (Siemens)$^{\star}$; Pengwei Tian (Siemens)</p></li><li><p>Deep Clustering-based Fair Outlier Detection</p><p>Authors: Hanyu Song (Brandeis University)$^{\star}$; Peizhao Li (Brandeis University); Hongfu Liu (Brandeis University)</p></li><li><p>Fast One-class Classification using Class Boundary-preserving Random Projections</p><p>Authors: Arindam Bhattacharya (IIT DELHI)$^{\star}$; Sumanth Varambally (IIT Delhi); Amitabha Bagchi (IIT Delhi); Srikanta Bedathur (IIT Delhi)</p></li><li><p>Heterogeneous Temporal Graph Transformer: An Intelligent System for Evolving Android Malware Detection</p><p>Authors: Yujie Fan (Case Western Reserve University); Mingxuan Ju (Case Western Reserve University); Shifu Hou (Case Western Reserve University); Yanfang Ye (Case Western Reserve University)$^{\star}$; Wenqiang Wan (Tencent Security Lab); Kui Wang (Tencent Security Lab); Yinming Mei (Tencent Security Lab); Qi Xiong (Tencent Security Lab)</p></li><li><p>Live-Streaming Fraud Detection: A Heterogeneous Graph Neural Network Approach</p><p>Authors: Zhao Li (Alibaba Group); Haishuai Wang (Fairfield University,Department of Computer Science and Engineering); Peng Zhang (Guangzhou University)$^{\star}$; Pengrui Hui ( Alibaba Group); Jiaming Huang (Alibaba Group); Jian Liao (Alibaba 
Group); Ji Zhang (The University of Southern Queensland); Jiajun Bu (Zhejiang University)</p></li><li><p>Intention-aware Heterogeneous Graph Attention Networks for Fraud Transactions Detection</p><p>Authors: Can Liu (Alibaba Group)$^{\star}$; Li Sun (Alibaba Group); Xiang Ao (Institute of Computing Technology, CAS); Jinghua Feng (Alibaba Group); Qing He (Institute of Computing Technology, Chinese Academy of Sciences); Hao Yang (Alibaba Group)</p></li><li><p>Multi-modal Emergent Fake News Detection via Meta Neural Process Networks</p><p>Authors: Yaqing Wang (Purdue University)$^{\star}$; Fenglong Ma (Pennsylvania State University); Haoyu Wang (SUNY Buffalo); Kishlay Jha (University of Virginia); Jing Gao (University at Buffalo)</p></li><li><p>Automated Testing of Graphics Units by Deep-Learning Detection of Visual Anomalies</p><p>Authors: Lev Faivishevsky (Intel)$^{\star}$; Adi Szeskin (Intel); Ashwin k Muppalla (Intel); Ravid Ziv (Intel); Ronen Laperdon (Intel); Benjamin Melloul (Intel); Tahi Hollander (Intel); Tom Hope (Intel); Amitai Armon (Intel)</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Apriori Convolutions for Highly Efficient and Accurate Time Series Classification</p><p>Authors: Angus Dempster (Monash University)$^{\star}$; Daniel F Schmidt (Monash University); Geoffrey I Webb (Monash)</p></li><li><p>Fast and Accurate Partial Fourier Transform for Time Series Data</p><p>Authors: Yong-chan Park (Seoul National University)$^{\star}$; Jun-gi Jang (Seoul National University); U Kang (Seoul National University)</p></li><li><p>Representation Learning of Multivariate Time Series using a Transformer Framework</p><p>Authors: George Zerveas (Brown University)$^{\star}$; Srideepika Jayaraman (IBM); Dhaval Patel (IBM TJ Watson Research Center); Anuradha Bhamidipaty (IBM Watson Research Center); Carsten Eickhoff (Brown University)</p></li><li><p>ST-Norm: Spatial and Temporal Normalization for Multi-variate Time Series Forecasting</p><p>Authors: Jinliang Deng (University of Technology Sydney); Xiusi Chen (University of California, Los Angeles); Renhe Jiang (The University of Tokyo); Xuan Song (Southern University of Science and Technology); Ivor Tsang (University of Technology Sydney)$^{\star}$</p></li><li><p>Statistical models coupling allows for complex local multivariate time series analysis</p><p>Authors: Veronica Tozzo (Massachusetts General Hospital - Harvard Medical School)$^{\star}$; Federico Ciech (University of Genoa); Davide Garbarino (University of Genoa); Alessandro Verri (University of Genova, Italy)</p></li><li><p>Causal and Interpretable Rules for Time Series Analysis</p><p>Authors: Amin Dhaou (Total)$^{\star}$; Josselin Garnier (École Polytechnique); Antoine Bertoncello (Total); Erwann LE PENNEC (Polytechnique)</p></li></ul><h2 id="Graph-Representation-Learning"><a href="#Graph-Representation-Learning" class="headerlink" title="Graph Representation Learning"></a>Graph Representation Learning</h2><ul><li><p>Are we really making much progress?
Revisiting, benchmarking and refining the Heterogeneous Graph Neural Networks</p><p>Authors: Qingsong Lv (Tsinghua University); Ming Ding (Tsinghua University); Qiang Liu (Institute of Information Engineering, Chinese Academy of Sciences); Yuxiang Chen (Tsinghua University); Wenzheng Feng (Tsinghua University); Siming He (University of Pennsylvania); Chang Zhou (Alibaba Group); Jian-guo Jiang (Institute of Information Engineering , Chinese Academy of Sciences); Yuxiao Dong (Facebook AI); Jie Tang (Tsinghua University)$^{\star}$</p></li><li><p>Attentive Heterogeneous Graph Embedding for Job Mobility Prediction</p><p>Authors: Le Zhang (University of Science and Technology of China)$^{\star}$; Ding Zhou (University of Science and Technology of China); Hengshu Zhu (Baidu Talent Intelligence Center, Baidu Inc.); Tong Xu (University of Science and Technology of China); Rui Zha ( University of Science and Technology of China); Enhong Chen (University of Science and Technology of China); Hui Xiong (Rutgers University)</p></li><li><p>DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neural Networks</p><p>Authors: Yuhui Ding (The Hong Kong University of Science and Technology)$^{\star}$; Quanming Yao (4Paradigm); Huan Zhao (4Paradigm Inc.); Tong Zhang (Hong Kong University of Science and Technology)</p></li><li><p>HGK-GNN: Heterogeneous Graph Kernel based Graph Neural Networks</p><p>Authors: Qingqing Long (Peking University)$^{\star}$; Lingjun Xu (Peking University); Zheng Fang (pku); Guojie Song (Peking University)</p></li><li><p>Pre-training on Large-Scale Heterogeneous Graph</p><p>Authors: Xunqiang Jiang (Beijing University of Posts and Telecommunications)$^{\star}$; Tianrui Jia (Beijing University of Posts and Telecommunications); Chuan Shi (Beijing University of Posts and Telecommunications); Yuan Fang (Singapore Management University); Zhe Lin (Peng Cheng Laboratory); Hui Wang (Peng Cheng Laboratory)</p></li><li><p>Scalable Heterogeneous Graph Neural Networks for Predicting High-potential Early-stage Startups</p><p>Authors: SHENGMING ZHANG (Rutgers University)$^{\star}$; Hao Zhong (ESCP Business School); Zixuan Yuan (Rutgers University); Hui Xiong (the State University of New Jersey)</p></li><li><p>Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning</p><p>Authors: Xiao Wang (Beijing University of Posts and Telecommunications); Nian Liu (Beijing University of Posts and Telecommunications)$^{\star}$; Hui Han (Beijing University of Posts and Telecommunications); Chuan Shi (Beijing University of Posts and Telecommunications)</p></li><li><p>Heterogeneous Temporal Graph Transformer: An Intelligent System for Evolving Android Malware Detection</p><p>Authors: Yujie Fan (Case Western Reserve University); Mingxuan Ju (Case Western Reserve University); Shifu Hou (Case Western Reserve University); Yanfang Ye (Case Western Reserve University)$^{\star}$; Wenqiang Wan (Tencent Security Lab); Kui Wang (Tencent Security Lab); Yinming Mei (Tencent Security Lab); Qi Xiong (Tencent Security Lab)</p></li><li><p>HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps</p><p>Authors: Jizhou Huang (Baidu)$^{\star}$; Haifeng Wang (Baidu); Yibo Sun (Baidu); Miao Fan (Baidu); Zhengjie Huang (Baidu); Chunyuan Yuan (Baidu); Yawen Li (BUPT)</p></li></ul><h2 id="sequence"><a href="#sequence" class="headerlink" title="sequence"></a>sequence</h2><ul><li>PETGEN: Personalized Text Generation Attack on Deep User Sequence Classification 
Models<br>Authors: Bing He (Georgia Institute of Technology)$^{\star}$; Dr. Mustaque Ahamad (Georgia Institute of Technology); Srijan Kumar (Georgia Institute of Technology)</li><li>TimeSHAP: Explaining Recurrent Models through Sequence Perturbations<br>Authors: Joao Bento (Feedzai); Pedro Saleiro (Feedzai)$^{\star}$; André F. Cruz (Feedzai); Mario Figueiredo (University of Lisbon); Pedro Bizarro (Feedzai)</li></ul><h2 id="causal-analysis"><a href="#causal-analysis" class="headerlink" title="causal analysis"></a>causal analysis</h2><ul><li><p>Causal models for Real Time Bidding with repeated user interactions</p><p>Authors: Martin Bompaire (Criteo)$^{\star}$; Benjamin Heymann (Criteo); Alexandre Gilotte (Criteo)</p></li><li><p>DARING: Differentiable Causal Discovery with Residual Independence</p><p>Authors: Yue He (Tsinghua University)$^{\star}$; Peng Cui (Tsinghua University); Zheyan Shen (Tsinghua University); Renzhe Xu (Tsinghua University); Furui Liu (Huawei Noah’s Ark Lab); Yong Jiang (Tsinghua University)</p></li><li><p>MPCSL - A Modular Pipeline for Causal Structure Learning</p><p>Authors: Johannes Huegle (Hasso Plattner Institute)$^{\star}$; Christopher Hagedorn (Hasso Plattner Institute); Michael Perscheid (Hasso Plattner Institute); Hasso Plattner (Hasso Plattner Institute)</p></li></ul><hr><h2 id="clustering"><a href="#clustering" class="headerlink" title="clustering"></a>clustering</h2><h2 id="About-distribution"><a href="#About-distribution" class="headerlink" title="About distribution"></a>About distribution</h2><h2 id="interpretable-Understanding-explanation-Attribution-…"><a href="#interpretable-Understanding-explanation-Attribution-…" class="headerlink" title="interpretable [Understanding, explanation, Attribution …]"></a>interpretable [Understanding, explanation, Attribution …]</h2><h2 id="missing-value-amp-irregularly-sampled-time-series-Incomplete-imputation-…"><a href="#missing-value-amp-irregularly-sampled-time-series-Incomplete-imputation-…" class="headerlink" title="missing value & irregularly sampled time series [Incomplete, imputation, …]"></a>missing value & irregularly sampled time series [Incomplete, imputation, …]</h2>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>AIOps 2021 Challenge Defense Notes</title>
<link href="/uncategorized/notes/AIOps2021%E6%8C%91%E6%88%98%E8%B5%9B/"/>
<url>/uncategorized/notes/AIOps2021%E6%8C%91%E6%88%98%E8%B5%9B/</url>
<content type="html"><![CDATA[<h1 id="1-时空数据多指标预测"><a href="#1-时空数据多指标预测" class="headerlink" title="1. 时空数据多指标预测"></a>1. 时空数据多指标预测</h1><p>时空数据哪有几种</p><ul><li>图片+时间轴</li><li>时间序列+空域关系</li></ul><span id="more"></span><h3 id="Related-Work"><a href="#Related-Work" class="headerlink" title="Related Work"></a>Related Work</h3><p><img src="https://user-images.githubusercontent.com/16149619/118065808-2e857600-b3d0-11eb-9d1b-90659d92c37f.png" alt="da83bb35b2ee1661e1c9a674721a092"><br><img src="https://user-images.githubusercontent.com/16149619/118066874-11ea3d80-b3d2-11eb-8fc6-d193e9e90d1b.png" alt="2afb0700c4e39369712fbb8ea34e0aa"><br><img src="https://user-images.githubusercontent.com/16149619/118066884-157dc480-b3d2-11eb-868b-6971c9d0cc3d.png" alt="092efde5fcc1ead4c548ea92808792d"><br><img src="https://user-images.githubusercontent.com/16149619/118066889-17478800-b3d2-11eb-9cc5-158e6496540c.png" alt="482d0beee98602dae12b360f6031d18"><br><img src="https://user-images.githubusercontent.com/16149619/118066898-1a427880-b3d2-11eb-96a2-1bd2a807d5c7.png" alt="3e27d036423732fd15c7a62c1e1128c"><br><img src="https://user-images.githubusercontent.com/16149619/118066914-20385980-b3d2-11eb-8019-9a7e6e2cb35f.png" alt="58898959d166daba42e0ebb387b2f79"><br><img src="https://user-images.githubusercontent.com/16149619/118066921-23cbe080-b3d2-11eb-8f48-6659212f580f.png" alt="dd47545c4bd922d7b84c960ee7d4833"><br><img src="https://user-images.githubusercontent.com/16149619/118066931-262e3a80-b3d2-11eb-9cbc-996f792bb8c6.png" alt="88c7451c8cc1e57e4d2eeb486c469df"><br><img src="https://user-images.githubusercontent.com/16149619/118066951-2cbcb200-b3d2-11eb-923e-83d27bd4d36e.png" alt="30d0135c1f07ed84151f625d3d57a2e"><br><img src="https://user-images.githubusercontent.com/16149619/118066956-2fb7a280-b3d2-11eb-83b6-b15bbc07d8d8.png" alt="6930d622e79c8636b9190138728649b"></p><h3 id="数据集"><a href="#数据集" class="headerlink" title="数据集"></a>数据集</h3><p><img src="https://user-images.githubusercontent.com/16149619/118065786-2594a480-b3d0-11eb-9c39-f4690cc4673a.png" alt="2b64a3575b3878a9420d6f72c5b7f68"></p><h3 id="挑战"><a href="#挑战" class="headerlink" title="挑战"></a>挑战</h3><ol><li>node edge均有异质性:</li></ol><ul><li>node:基站业务不同,数据分布不同</li><li>edge:node间有不同的关系(举例:来向与去向的高铁节点、小区间的节点)</li></ul><p><img src="https://user-images.githubusercontent.com/16149619/118066346-24b04280-b3d1-11eb-920d-04e16840a234.png" alt="87bc8f9f29b296c3f76e60b2802642e"></p><ol start="2"><li><p>缺乏连续性的假设:用户接入不受物理空间约束,可以瞬间从小区A接入到小区B</p></li><li><p>突发性</p></li></ol><p><img src="https://user-images.githubusercontent.com/16149619/118066497-6b05a180-b3d1-11eb-9629-c2f33b294cfa.png" alt="2e14a03c01af3b1caf0a662ee8a7146"></p><ol start="4"><li>网络结构复杂性(领区关系):不同区域,基站之间的连接密集程度有很大不同</li></ol><p><img src="https://user-images.githubusercontent.com/16149619/118066700-c041b300-b3d1-11eb-832e-d075536509e7.png" alt="175124e6e9fd50b498402605498cb46"></p><h3 id="自己的工作"><a href="#自己的工作" class="headerlink" title="自己的工作"></a>自己的工作</h3><p>强调自适应感受野,强调异质性</p><p><img src="https://user-images.githubusercontent.com/16149619/118066996-43630900-b3d2-11eb-985d-f46dcef6ef33.png" alt="70fcdc172c8cedf6935d33ac839464a"><br><img src="https://user-images.githubusercontent.com/16149619/118066999-452ccc80-b3d2-11eb-8d9a-12dc0f79a6d1.png" alt="37a65aed9e33e8fa09b12b8be7683c6"><br><img src="https://user-images.githubusercontent.com/16149619/118067004-46f69000-b3d2-11eb-84cb-9d979621749f.png" alt="02bb2f207040d2afe6de2757b8c1e72"><br><img 
src="https://user-images.githubusercontent.com/16149619/118067008-4958ea00-b3d2-11eb-9f61-2ccde47c3644.png" alt="a0a14ff67f978b4ad952b64ff14c559"><br><img src="https://user-images.githubusercontent.com/16149619/118067012-4bbb4400-b3d2-11eb-9e42-491317cf8528.png" alt="c16290875c6b0455c241f091633377a"></p><hr><h1 id="2-人机物融合智能运维:感知、诊断、交互"><a href="#2-人机物融合智能运维:感知、诊断、交互" class="headerlink" title="2. 人机物融合智能运维:感知、诊断、交互"></a>2. 人机物融合智能运维:感知、诊断、交互</h1><p>重点:需要融合人类知识到已有的时间序列、日志分析中</p><p><img src="https://user-images.githubusercontent.com/16149619/118067350-ef0c5900-b3d2-11eb-9899-4b577997d00c.png" alt="17bde0bb13da6f7b437d75aefb74218"></p><p>主要分析日志数据,数据特点:半结构化、多言且复杂、不同组件、第三方组件日志的异质性。</p><h3 id="难点"><a href="#难点" class="headerlink" title="难点"></a>难点</h3><p><img src="https://user-images.githubusercontent.com/16149619/118067374-00edfc00-b3d3-11eb-8e6d-938d3f9183cf.png" alt="83e9edce68a0e8f224e3abf3e4536c4"><br><img src="https://user-images.githubusercontent.com/16149619/118067397-0ba89100-b3d3-11eb-8e92-cc80bb72edc6.png" alt="ba672144e6d23c3ab8da3696835ddc1"></p><h3 id="分布式系统的特点"><a href="#分布式系统的特点" class="headerlink" title="分布式系统的特点"></a>分布式系统的特点</h3><p>分布式系统,统一请求日志在数据中交替出现,而不唯一</p><p>…</p><p><img src="https://user-images.githubusercontent.com/16149619/118067606-7b1e8080-b3d3-11eb-85db-c757c38269b5.png" alt="29fb6e68dc5cf6736622b00546c8985"></p><h3 id="他们的工作"><a href="#他们的工作" class="headerlink" title="他们的工作"></a>他们的工作</h3><p>概述</p><p><img src="https://user-images.githubusercontent.com/16149619/118067701-a7d29800-b3d3-11eb-9f81-b0095b575551.png" alt="e09e7e6521aabee6c58e6aa477b81cb"></p><p>重点1:模型适应性问题,自学习、自更新的故障预测模型,human-in-loop</p><p><img src="https://user-images.githubusercontent.com/16149619/118067887-0435b780-b3d4-11eb-8890-a73e3d0bfbd9.png" alt="5f23f647ca6efef99e4caa69ee2a015"></p><p>重点2:机器学习结果如何提高人类的运维水平,指导程序员设置meaningful的日志打印点(打印点越多,日志包含的信息越无用),实际效果大幅减少打印点、且提升了故障诊断准确度</p><p><img src="https://user-images.githubusercontent.com/16149619/118068263-a6559f80-b3d4-11eb-9f93-9b2df53c2550.png" alt="f2b06778d60907994e463c2649ea753"></p><p>重点3:CMDB相关</p><p><img src="https://user-images.githubusercontent.com/16149619/118068466-fcc2de00-b3d4-11eb-80f5-4841c766391b.png" alt="43bd7aa2f12acc7183ce089ab7d53f4"><br><img src="https://user-images.githubusercontent.com/16149619/118068476-ffbdce80-b3d4-11eb-9c3d-6226fc2896b7.png" alt="24f3046d112f0daa16561614fad677a"><br><img src="https://user-images.githubusercontent.com/16149619/118068633-47445a80-b3d5-11eb-8892-8ab0c02406f6.png" alt="91c0926c0db9d0e85064f0f944e80e7"></p><p>重点4:知识图图谱</p><p><img src="https://user-images.githubusercontent.com/16149619/118069018-fa14b880-b3d5-11eb-87c1-6b59a87d22fc.png" alt="2b060c89fc58c019e5d86588c8b2410"><br><img src="https://user-images.githubusercontent.com/16149619/118069021-fb45e580-b3d5-11eb-9d98-8ffae06434f9.png" alt="cbef862f03ea4a8d74de16e561aa09f"></p><p>重点5:人机智能问答</p><p><img src="https://user-images.githubusercontent.com/16149619/118069086-1b75a480-b3d6-11eb-8f15-74ab2a01f95b.png" alt="f9b1bd79a7e34528ad30ea964f2116b"><br><img src="https://user-images.githubusercontent.com/16149619/118069207-57106e80-b3d6-11eb-9748-0aba1fa8a83a.png" alt="ACL2020_8dbe96fbf771796f37977008d26fc1e"><br><img src="https://user-images.githubusercontent.com/16149619/118069295-82935900-b3d6-11eb-9319-4233679d332c.png" alt="IJCAI2020_668cd8dcc538e660acd07ea56ffa7c2"></p><hr><h1 id="3-落地经验"><a href="#3-落地经验" class="headerlink" title="3. 落地经验"></a>3. 
Deployment experience</h1><p><img src="https://user-images.githubusercontent.com/16149619/118069658-1f55f680-b3d7-11eb-9743-cbf86997616e.png" alt="b307dd866403412e2fe07928660b09e"></p>]]></content>
<tags>
<tag> note </tag>
<tag> AIOps </tag>
</tags>
</entry>
<entry>
<title>Related Papers in ICML 2021 (2021.07.18 - 2021.07.24)</title>
<link href="/uncategorized/paperlistfile/ICML2021/"/>
<url>/uncategorized/paperlistfile/ICML2021/</url>
<content type="html"><![CDATA[<p>Accept papers: <a href="https://icml.cc/Conferences/2021/AcceptedPapersInitial">link</a></p><p><a href="https://mp.weixin.qq.com/s/VRWHgES7NK6j-c1pYjZ3Yg">时序论文一览</a></p><span id="more"></span><h2 id="Anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-…"><a href="#Anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-…" class="headerlink" title="Anomaly detection (anomaly, outlier, out-of-distribution, one-class, Malware detection, …)"></a>Anomaly detection (anomaly, outlier, out-of-distribution, one-class, Malware detection, …)</h2><ul><li><p>Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise</p><p>Vivek Farias (MIT) · Andrew Li (Carnegie Mellon University) · Tianyi Peng (MIT)</p></li><li><p>Transfer-Based Semantic Anomaly Detection</p><p>Lucas Deecke (University of Edinburgh) · Lukas Ruff (Aignostics) · Robert Vandermeulen (TU Berlin) · Hakan Bilen (University of Edinburgh)</p></li><li><p>Neural Transformation Learning for Deep Anomaly Detection Beyond Images</p><p>Chen Qiu (TU Kaiserslautern/Bosch Center for Artificial Intelligence) · Timo Pfrommer (Bosch Center for Artificial Intelligence) · Marius Kloft (TU Kaiserslautern) · Stephan Mandt (University of California, Irivine) · Maja Rudolph (BCAI)</p></li></ul><ul><li><p>Event Outlier Detection in Continuous Time</p><p>Siqi Liu (University of Pittsburgh) · Milos Hauskrecht (University of Pittsburgh)</p></li><li><p>Understanding Failures in Out-of-Distribution Detection with Deep Generative Models</p><p>Lily Zhang (New York University) · Mark Goldstein (New York University) · Rajesh Ranganath (New York University)</p></li></ul><ul><li><p>Outlier-Robust Optimal Transport</p><p>Debarghya Mukherjee (University of Michigan) · Aritra Guha (Duke University) · Justin Solomon (MIT) · Yuekai Sun (University of Michigan) · Mikhail Yurochkin (IBM Research AI)</p></li><li><p>DORO: Distributional and Outlier Robust Optimization</p><p>Runtian Zhai (Carnegie Mellon University) · Chen Dan (Carnegie Mellon University) · Zico Kolter (Carnegie Mellon University / Bosch Center for AI) · Pradeep Ravikumar (Carnegie Mellon University)</p></li><li><p>Consistent regression when oblivious outliers overwhelm</p><p>Tommaso d’Orsi (ETH Zurich) · Gleb Novikov (ETH Zurich) · David Steurer (ETH Zurich)</p></li><li><p>Fixed-Parameter and Approximation Algorithms for PCA with Outliers</p><p>Yogesh Dahiya (The Institute of Mathematical Sciences (HBNI), Chennai, India) · Fedor Fomin (University of Bergen) · Fahad Panolan (Indian Institute of Technology Hyderabad) · Kirill Simonov (University of Bergen)</p></li><li><p>Generalization Bounds in the Presence of Outliers: a Median-of-Means Study</p><p>Pierre Laforgue (University of Milan) · Guillaume Staerman (Télécom Paris) · Stephan Clémençon (Télécom Paris)</p></li><li><p>Can Subnetwork Structure Be the Key to Out-of-Distribution Generalization?</p><p>Dinghuai Zhang (Mila) · Kartik Ahuja (Mila) · Yilun Xu (MIT) · Yisen Wang (Peking University) · Aaron Courville (Université de Montréal)</p></li><li><p>Out-of-Distribution Generalization via Risk Extrapolation (REx)</p><p>David Krueger (MILA (University of Montreal)) · Ethan Caballero (Mila) · Joern-Henrik Jacobsen (Apple Inc.) 
· Amy Zhang (FAIR / UC Berkeley) · Jonathan Binas (Mila, Montreal) · Dinghuai Zhang (Mila) · Remi Le Priol (Mila, Université de Montréal) · Aaron Courville (Université de Montréal)</p></li><li><p>Graph Convolution for Semi-Supervised Classification: Improved Linear Separability and Out-of-Distribution Generalization</p><p>Aseem Baranwal (University of Waterloo) · Kimon Fountoulakis (University of Waterloo) · Aukosh Jagannath (University of Waterloo)</p></li><li><p>Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization</p><p>John Miller (University of California, Berkeley) · Rohan Taori (Stanford University) · Aditi Raghunathan (Stanford) · Shiori Sagawa (Stanford University) · Pang Wei Koh (Stanford University) · Vaishaal Shankar (UC Berkeley) · Percy Liang (Stanford University) · Yair Carmon (Tel Aviv University) · Ludwig Schmidt (Toyota Research Institute)</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Conformal prediction interval for dynamic time-series</p><p>Chen Xu (Georgia Institute of Technology) · Yao Xie (Georgia Institute of Technology)</p></li><li><p>Voice2Series: Reprogramming Acoustic Models for Time Series Classification</p><p>Huck Yang (Georgia Tech) · Yun-Yun Tsai (Columbia University) · Pin-Yu Chen (IBM Research AI)</p></li></ul><ul><li><p>Explaining Time Series Predictions with Dynamic Masks</p><p>Jonathan Crabbé (University of Cambridge) · Mihaela van der Schaar (University of Cambridge and UCLA)</p></li><li><p>Autoregressive Denoising Diffusion Models for Multivariate Probabilistic Time Series Forecasting</p><p>Kashif Rasul (Zalando Research) · Calvin Seward (Zalando Research) · Ingmar Schuster (Zalando Research) · Roland Vollgraf (Zalando Research)</p></li></ul><ul><li><p>Necessary and sufficient conditions for causal feature selection in time series with latent common causes</p><p>Atalanti Mastakouri (Amazon Research Tuebingen) · Bernhard Schölkopf (MPI for Intelligent Systems Tübingen, Germany) · Dominik Janzing (Amazon)</p></li><li><p>Approximation Theory of Convolutional Architectures for Time Series Modelling</p><p>Haotian Jiang (National University of Singapore) · Zhong Li (Peking University) · Qianxiao Li (National University of Singapore; IHPC, Singapore)</p></li><li><p>Whittle Networks: A Deep Likelihood Model for Time Series</p><p>Zhongjie Yu (TU Darmstadt) · Fabrizio Ventola (TU Darmstadt) · Kristian Kersting (TU Darmstadt)</p></li></ul><ul><li><p>Neural Rough Differential Equations for Long Time Series</p><p>James Morrill (University of Oxford) · Cristopher Salvi (University of Oxford) · Patrick Kidger (University of Oxford) · James Foster (University of Oxford)</p></li><li><p>End-to-End Learning of Coherent Probabilistic Forecasts for Hierarchical Time Series</p><p>Syama Sundar Yadav Rangapuram (Amazon) · Lucien D Werner (California Institute of Technology) · Konstantinos Benidis (Amazon Research) · Pedro Mercado (Amazon Research) · Jan Gasthaus (Amazon Research) · Tim Januschowski (Amazon Research)</p></li><li><p>Z-GCNETs: Time Zigzags at Graph Convolutional Networks for Time Series Forecasting</p><p>Yuzhou Chen (Southern Methodist University) · Ignacio Segovia (University of Texas at Dallas) · Yulia R Gel (University of Texas at Dallas)</p></li></ul><h2 id="Heterogeneous-multi-source"><a href="#Heterogeneous-multi-source" class="headerlink" title="Heterogeneous (multi-source)"></a>Heterogeneous
(multi-source)</h2><ul><li><p>Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data</p><p>Tao Lin (EPFL) · Sai Praneeth Reddy Karimireddy (EPFL) · Sebastian Stich (EPFL) · Martin Jaggi (EPFL)</p></li><li><p>Budgeted Heterogeneous Treatment Effect Estimation</p><p>Tian Qin (Nanjing University) · Tian-Zuo Wang (Nanjing University) · Zhi-Hua Zhou (Nanjing University)</p></li><li><p>Data-Free Knowledge Distillation for Heterogeneous Federated Learning</p><p>Zhuangdi Zhu (Michigan State University) · Junyuan Hong (Michigan State University) · Jiayu Zhou (Michigan State University)</p></li><li><p>Heterogeneous Risk Minimization</p><p>Jiashuo Liu (Tsinghua University) · Zheyuan Hu (Tsinghua University) · Peng Cui (Tsinghua University) · Bo Li (Tsinghua University) · Zheyan Shen (Tsinghua University)</p></li><li><p>Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning</p><p>Tomoya Murata (NTT DATA Mathematical Systems Inc.) · Taiji Suzuki (The University of Tokyo / RIKEN)</p></li><li><p>Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data</p><p>Deepesh Data (UCLA) · Suhas Diggavi (UCLA)</p></li><li><p>KD3A: Unsupervised Multi-Source Decentralized Domain Adaptation via Knowledge Distillation</p><p>Haozhe Feng (State Key Lab of CAD&CG, Zhejiang University) · Zhaoyang You (Zhejiang University) · Minghao Chen (Zhejiang University) · Tianye Zhang (Zhejiang University) · Minfeng Zhu (State Key Lab of CAD&CG, Zhejiang University) · Fei Wu (Zhejiang University, China) · Chao Wu (Zhejiang University) · Wei Chen (State Key Lab of CAD&CG, Zhejiang University)</p></li></ul><h2 id="Graph-Representation-Learning"><a href="#Graph-Representation-Learning" class="headerlink" title="Graph Representation Learning"></a>Graph Representation Learning</h2><ul><li><p>Explainable Automated Graph Representation Learning with Hyperparameter Importance</p><p>Xin Wang (Tsinghua University) · Shuyi Fan (Tsinghua University) · Kun Kuang (Zhejiang University) · wenwu zhu (Tsinghua University)</p></li><li><p>Size-Invariant Graph Representations for Graph Classification Extrapolations</p><p>Beatrice Bevilacqua (Purdue University) · Yangze Zhou (Purdue University) · Bruno Ribeiro (Purdue University)</p></li><li><p>Generative Causal Explanations for Graph Neural Networks</p><p>Wanyu Lin (Department of Computing, The Hong Kong Polytechnic University) · Hao Lan (University of Toronto) · Baochun Li (University of Toronto)</p></li></ul><h2 id="Sequence"><a href="#Sequence" class="headerlink" title="Sequence"></a>Sequence</h2><ul><li><p>Near-Optimal Confidence Sequences for Bounded Random Variables</p><p>Arun Kuchibhotla (Carnegie Mellon University) · Qinqing Zheng (Facebook AI Research)</p></li><li><p>Off-Policy Confidence Sequences</p><p>Nikos Karampatziakis (Microsoft) · Paul Mineiro (Microsoft) · Aaditya Ramdas (Carnegie Mellon University)</p></li><li><p>Learning to Rehearse in Long Sequence Memorization</p><p>Zhu Zhang (DAMO Academy, Alibaba Group,) · Chang Zhou (Alibaba Group) · Jianxin Ma (Alibaba Group) · Zhijie Lin (Zhejiang University) · Jingren Zhou (Alibaba Group) · Hongxia Yang (Alibaba Group) · Zhou Zhao (Zhejiang University)</p></li><li><p>Order Matters: Probabilistic Modeling of Node Sequence for Graph Generation</p><p>Xiaohui Chen (Tufts University) · Xu Han (Tufts University) · Jiajing Hu (Tufts University) · Francisco R Ruiz (DeepMind) · Liping Liu (Tufts University)</p></li><li><p>A Structured Observation Distribution for Generative 
Biological Sequence Prediction and Forecasting</p><p>Eli N. Weinstein (Harvard) · Debora Marks (Harvard Medical School)</p></li><li><p>Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design</p><p>yue cao (Texas A&M University) · Payel Das (IBM Research AI) · Vijil Chenthamarakshan (IBM Research) · Pin-Yu Chen (IBM Research AI) · Igor Melnyk (IBM) · Yang Shen (Texas A&M University)</p></li><li><p>Temporally Correlated Task Scheduling for Sequence Learning</p><p>Xueqing Wu (University of Science and Technology of China) · Lewen Wang (Microsoft Research Asia) · Yingce Xia (Microsoft Research Asia) · Weiqing Liu (Microsoft Research) · Lijun Wu (Microsoft Research) · Shufang Xie (Microsoft Research Asia) · Tao Qin (Microsoft Research Asia) · Tie-Yan Liu (Microsoft Research Asia)</p></li></ul><h2 id="Autoencoder"><a href="#Autoencoder" class="headerlink" title="Autoencoder"></a>Autoencoder</h2><ul><li><p>Unified Robust Semi-Supervised Variational Autoencoder</p><p>Xu Chen (SAS Inc)</p></li><li><p>MorphVAE: Generating Neural Morphologies from 3D-Walks using a Variational Autoencoder with Spherical Latent Space</p><p>Sophie C Laturnus (University of Tübingen) · Philipp Berens (University of Tübingen)</p></li><li><p>Spectral Smoothing Unveils Phase Transitions in Hierarchical Variational Autoencoders</p><p>Adeel Pervez (University of Amsterdam) · Efstratios Gavves (University of Amsterdam )</p></li><li><p>Autoencoder Image Interpolation by Shaping the Latent Space</p><p>Alon Oring (IDC) · Zohar Yakhini (Herzliya Interdisciplinary Center) · Yacov Hel-Or (The Interdisciplinary Center, Herzliya)</p></li><li><p>BasisDeVAE: Interpretable Simultaneous Dimensionality Reduction and Feature-Level Clustering with Derivative-Based Variational Autoencoders</p><p>Dominic Danks (Alan Turing Institute) · Christopher Yau (University of Manchester)</p></li><li><p>Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization</p><p>Sang Michael Xie (Stanford University) · Tengyu Ma (Stanford University) · Percy Liang (Stanford University)</p></li><li><p>Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech</p><p>Jaehyeon Kim (Kakao Enterprise) · Jungil Kong (Kakao Enterprise) · Juhee Son (Kakao Enterprise)</p></li></ul><h2 id="Recurrent-Neural-Network"><a href="#Recurrent-Neural-Network" class="headerlink" title="Recurrent Neural Network"></a>Recurrent Neural Network</h2><ul><li><p>Training Recurrent Neural Networks via Forward Propagation Through Time</p><p>Anil Kag (Boston University) · Venkatesh Saligrama (Boston University)</p></li><li><p>Re-understanding Finite-State Representations of Recurrent Policy Networks</p><p>Mohamad H Danesh (Oregon State University) · Anurag Koul (Oregon State University) · Alan Fern (Oregon State University) · Saeed Khorram (Oregon State University) </p></li><li><p>UnICORNN: A recurrent model for learning very long time dependencies</p><p>T. 
Konstantin Rusch (ETH Zurich) · Siddhartha Mishra (ETH Zurich)</p></li></ul><h2 id="Correlation-analysis-association-analysis"><a href="#Correlation-analysis-association-analysis" class="headerlink" title="Correlation analysis (association analysis)"></a>Correlation analysis (association analysis)</h2><ul><li><p>Inferring serial correlation with dynamic backgrounds</p><p>Song Wei (Georgia Tech) · Yao Xie (Georgia Institute of Technology) · Dobromir Rahnev (Georgia Tech)</p></li><li><p>Connecting Optimal Ex-Ante Collusion in Teams to Extensive-Form Correlation: Faster Algorithms and Positive Complexity Results</p><p>Gabriele Farina (Carnegie Mellon University) · Andrea Celli (Facebook CDS) · Nicola Gatti (Politecnico di Milano) · Tuomas Sandholm (Carnegie Mellon University)</p></li><li><p>Local Correlation Clustering with Asymmetric Classification Errors</p><p>Jafar Jafarov (University of Chicago) · Sanchit Kalhan (Northwestern University) · Konstantin Makarychev (Northwestern University) · Yury Makarychev (Toyota Technological Institute at Chicago)</p></li><li><p>Differentially Private Correlation Clustering</p><p>Mark Bun (Boston University) · Marek Elias (CWI) · Janardhan Kulkarni (Microsoft Research)</p></li><li><p>A theory of high dimensional regression with arbitrary correlations between input features and target functions: sample complexity, multiple descent curves and a hierarchy of phase transitions</p><p>Gabriel Mel (Stanford University) · Surya Ganguli (Stanford)</p></li><li><p>Correlation Clustering in Constant Many Parallel Rounds</p><p>Vincent Cohen-Addad (Google) · Silvio Lattanzi (Google) · Slobodan Mitrović (MIT) · Ashkan Norouzi-Fard (Google) · Nikos Parotsidis (Google) · Jakub Tarnawski (Microsoft Research)</p></li><li><p>Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?</p><p>Ning Liu (Midea Group) · Geng Yuan (Northeastern University) · Zhengping Che (Didi Chuxing) · Xuan Shen (Northeastern University) · Xiaolong Ma (Northeastern University) · Qing Jin (Northeastern University) · Jian Ren (Snap Inc.) 
· Jian Tang (AI Innovation Center, Midea Group) · Sijia Liu (Michigan State University) · Yanzhi Wang (Northeastern University)</p></li><li><p>Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization</p><p>John Miller (University of California, Berkeley) · Rohan Taori (Stanford University) · Aditi Raghunathan (Stanford) · Shiori Sagawa (Stanford University) · Pang Wei Koh (Stanford University) · Vaishaal Shankar (UC Berkeley) · Percy Liang (Stanford University) · Yair Carmon (Tel Aviv University) · Ludwig Schmidt (Toyota Research Institute)</p></li></ul><h2 id="Causal-analysis"><a href="#Causal-analysis" class="headerlink" title="Causal analysis"></a>Causal analysis</h2><ul><li><p>Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning</p><p>Sumedh Sontakke (University of Southern California) · Arash Mehrjou (Max Planck Institute for Intelligent Systems) · Laurent Itti (University of Southern California) · Bernhard Schölkopf (MPI for Intelligent Systems Tübingen, Germany)</p></li><li><p>Integer Programming for Causal Structure Learning in the Presence of Latent Variables</p><p>Rui Chen (University of Wisconsin-Madison) · Sanjeeb Dash (IBM Research) · Tian Gao (IBM Research)</p></li><li><p>How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference</p><p>Amanda Gentzel (University of Massachusetts Amherst) · Purva Pruthi (University of Massachusetts Amherst) · David Jensen (University of Massachusetts Amherst)</p></li><li><p>Model-Free and Model-Based Policy Evaluation when Causality is Uncertain</p><p>David Bruns-Smith (UC Berkeley)</p></li><li><p>Domain Generalization using Causal Matching</p><p>Divyat Mahajan (Microsoft Research India) · Shruti Tople (Microsoft Research) · Amit Sharma (Microsoft Research)</p></li><li><p>Estimating Identifiable Causal Effects on Markov Equivalence Class through Double Machine Learning</p><p>Yonghan Jung (Purdue University) · Jin Tian (Iowa State University) · Elias Bareinboim (Columbia)</p></li><li><p>Valid Causal Inference with (Some) Invalid Instruments</p><p>Jason Hartford (University of British Columbia) · Victor Veitch (Google; University of Chicago) · Dhanya Sridhar (Columbia University) · Kevin Leyton-Brown (University of British Columbia)</p></li><li><p>Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding</p><p>Andrew Jesson (University of Oxford) · Sören Mindermann (University of Oxford) · Yarin Gal (University of Oxford) · Uri Shalit (Technion)</p></li><li><p>Regularizing towards Causal Invariance: Linear Models with Proxies</p><p>Michael Oberst (MIT) · Nikolaj Thams (University of Copenhagen) · Jonas Peters (University of Copenhagen) · David Sontag (Massachusetts Institute of Technology)</p></li><li><p>Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction</p><p>Afsaneh Mastouri (University College London) · Yuchen Zhu (University College London) · Limor Gultchin (University of Oxford) · Anna Korba (CREST/ENSAE) · Ricardo Silva (University College London) · Matt J. 
Kusner (University College London) · Arthur Gretton (Gatsby Computational Neuroscience Unit) · Krikamol Muandet (Max Planck Institute for Intelligent Systems)</p></li><li><p>Causality-aware counterfactual confounding adjustment as an alternative to linear residualization in anticausal prediction tasks based on linear learners</p><p>Elias Chaibub Neto (Sage Bionetworks)</p></li></ul><hr><h2 id="Clustering"><a href="#Clustering" class="headerlink" title="Clustering"></a>Clustering</h2><h2 id="About-distribution"><a href="#About-distribution" class="headerlink" title="About distribution"></a>About distribution</h2><h2 id="Interpretable-Understanding-explanation-Attribution-…"><a href="#Interpretable-Understanding-explanation-Attribution-…" class="headerlink" title="Interpretable [Understanding, explanation, Attribution …]"></a>Interpretable [Understanding, explanation, Attribution …]</h2>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Related Papers in SIGIR 2021 (2021.07.11 - 2021.07.15)</title>
<link href="/uncategorized/paperlistfile/SIGIR2021/"/>
<url>/uncategorized/paperlistfile/SIGIR2021/</url>
<content type="html"><![CDATA[<!--tags:按任务分类### Anomaly detection / Outlier / Out-of-distribution### Interpretable / Explainable### Causal discovery### Data augmentation 按数据分类### Time series### Missing value / Irregular sampled / Imputation### Sequence### Heterogeneous按深度学习架构分类### Recurrent neural network / RNN / LSTM / GRU ### Autoencoder--><p><a href="https://sigir.org/sigir2021/accepted-papers/">accept paper list</a></p><span id="more"></span><h2 id="Anomaly-detection-Outlier-Out-of-distribution"><a href="#Anomaly-detection-Outlier-Out-of-distribution" class="headerlink" title="Anomaly detection / Outlier / Out-of-distribution"></a>Anomaly detection / Outlier / Out-of-distribution</h2><ul><li>Decoupling Representation Learning and Classification for GNN-based Anomaly Detection Yanling Wang, Jing Zhang, Shasha Guo, Hongzhi Yin, Cuiping Li and Hong Chen</li></ul><h2 id="Interpretable-Explainable"><a href="#Interpretable-Explainable" class="headerlink" title="Interpretable / Explainable"></a>Interpretable / Explainable</h2><ul><li><p>FedNLP: An interpretable NLP System to Decode Federal Reserve Communications</p><p> Jean Lee, Hoyoul Luis Youn, Nicholas Stevens, Josiah Poon and Soyeon Caren Han</p></li><li><p>Interpretable Document Representations for Fast and Accurate Retrieval of Mathematical Information</p><p> Vít Novotný</p></li><li><p>Towards Trustworthiness in the context of Explainable Search</p><p> Sayantan Polley, Rashmi Koparde, Akshaya Bindu Gowri, Maneendra Perera and Andreas Nuernberger</p></li></ul><h2 id="Causal-discovery"><a href="#Causal-discovery" class="headerlink" title="Causal discovery"></a>Causal discovery</h2><ul><li><p>Should Graph Convolution Trust Neighbors? A Simple Causal Inference Method</p><p> Fuli Feng, Weiran Huang, Xin Xin, Xiangnan He, Tat-Seng Chua and Qifan Wang</p></li><li><p>Deconfounded Video Moment Retrieval with Causal Intervention</p><p> Xun Yang, Fuli Feng, Wei Ji, Meng Wang and Tat-Seng Chua</p></li></ul><h2 id="Data-augmentation"><a href="#Data-augmentation" class="headerlink" title="Data augmentation"></a>Data augmentation</h2><ul><li>Counterfactual Data-Augmented Sequential Recommendation Zhenlei Wang, Jingsen Zhang, Hongteng Xu, Xu Chen, Yongfeng Zhang, Wayne Xin Zhao and Ji-Rong Wen</li></ul><ul><li>AdsGNN: Behavior-Graph Augmented Relevance Modeling in Sponsored Search Xing Xie, Chaozhuo Li, Zheng Liu, Bochen Pang, Tianqi Yang, Yuming Liu, Yanling Cui, Hao Sun, Qi Zhang and Liangjie Zhang</li></ul><ul><li>NIP-GCN: An Augmented Graph Convolutional Network with Node Interaction Patterns Manish Chandra, Debasis Ganguly, Pabitra Mitra, Bithika Pal and James Thomas</li></ul><ul><li><p>Rumor Detection on Social Media with Event Augmentations</p><p> Zhenyu He, Ce Li, Fan Zhou and Yi Yang</p></li></ul><ul><li>Unsupervised Extractive Text Summarization with Distance-Augmented Sentence Graphs Jingzhou Liu, Dominic Hughes and Yiming Yang</li></ul><ul><li>Augmenting Sequential Recommendation with Pseudo-Prior Items via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang and Philip S. Yu</li></ul><ul><li><p>Cheap and Good? 
Simple and Effective Data Augmentation for Low Resource Machine Reading</p><p> Hoang Van, Vikas Yadav and Mihai Surdeanu</p></li></ul><ul><li>ReadsRE: Retrieval-Augmented Distantly Supervised Relation Extraction Yue Zhang, Hongliang Fei and Ping Li</li></ul><ul><li>Temporal Augmented Graph Neural Networks for Session-Based Recommendations Huachi Zhou, Qiaoyu Tan, Xiao Huang, Kaixiong Zhou and Xiaoling Wang</li></ul><ul><li>Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering Mayank Kothyari, Aman Jain, Vishwajeet Kumar, Preethi Jyothi, Soumen Chakrabarti and Ganesh Ramakrishnan</li></ul><h2 id="Time-Series"><a href="#Time-Series" class="headerlink" title="Time Series"></a>Time Series</h2><ul><li><p>Temporal Event Profiling based on Multivariate Time Series Analysis over Long-term Document Archives</p><p> Jiexin Wang, Adam Jatowt and Masatoshi Yoshikawa</p></li><li><p>Time Aware Hyperbolic LSTM for Financial Stream Modeling</p><p> Ramit Sawhney, Shivam Agarwal, Megh Thakkar, Arnav Wadhwa and Rajiv Shah</p></li></ul><h2 id="Sequence"><a href="#Sequence" class="headerlink" title="Sequence"></a>Sequence</h2><ul><li>CINES: Explore Citation Network and Event Sequences for Citation Forecasting Fang He, Wang-Chien Lee, Tao-Yang Fu and Zhen Lei</li></ul><h2 id="Heterogeneous"><a href="#Heterogeneous" class="headerlink" title="Heterogeneous"></a>Heterogeneous</h2><ul><li>Heterogeneous Attention Network for Effective and Efficient Cross-modal Retrieval Tan Yu, Yi Yang, Yi Li, Lin Liu, Hongliang Fei and Ping Li</li></ul><ul><li>Graph Meta Network for Multi-Behavior Recommendation with Interaction Heterogeneity and Diversity Lianghao Xia, Chao Huang, Yong Xu, Peng Dai and Liefeng Bo</li></ul><h2 id="Recurrent-neural-network-RNN-LSTM-GRU"><a href="#Recurrent-neural-network-RNN-LSTM-GRU" class="headerlink" title="Recurrent neural network / RNN / LSTM / GRU"></a>Recurrent neural network / RNN / LSTM / GRU</h2><ul><li>Time Aware Hyperbolic LSTM for Financial Stream Modeling Ramit Sawhney, Shivam Agarwal, Megh Thakkar, Arnav Wadhwa and Rajiv Shah</li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Quasi-periodic time series</title>
<link href="/uncategorized/surveys/QTS/"/>
<url>/uncategorized/surveys/QTS/</url>
<content type="html"><![CDATA[<h2 id="研究问题描述"><a href="#研究问题描述" class="headerlink" title="研究问题描述"></a>研究问题描述</h2><ul><li><p>准周期时间序列的异常检测问题</p><p> 该问题可以拆解为两部分:准周期时间序列建模、异常检测。本调研主要针对准周期时间序列建模进行调研</p></li><li><p>数学描述</p><p> 给定一个准周期时间序列的一组周期样本${\bf{x}}_1, …, {\bf{x}}_N$,其中${\bf{x}}_i$为一个周期内的时间序列片段,如何对$\bf{x}_i$进行建模,<br>即如何从${\bf{x}}_i$中提取出表征向量${\bf{v}}_i$用于后续的任务(如:异常检测、分类、预测,等)</p></li></ul><span id="more"></span><h2 id="领域现状"><a href="#领域现状" class="headerlink" title="领域现状"></a>领域现状</h2><ol><li>基于手工设计的特征的</li></ol><p>这类方法依赖手工设定的特征,提取时间序列的统计特征与形状特征。这类方法往往仅能用于特定种类的数据,如ECG数据,而缺乏可拓展性。具体包括:</p><ol start="2"><li>基于统计特征的</li></ol><p>[1] 设计了7种统计特征用于描述时间序列</p><p>[2] 对ECG数据进行Recurrence Quantification Analysis,以提取出15种特征用于后续的任务</p><ol start="3"><li>无需手工设计特征的</li></ol><p>此类方法无需手工设计特征提取方法,而是自适应的从原始时间序列中学习时间序列的动态。</p><ol start="4"><li>基于频率域特征的:</li></ol><p>[3]: 将ECG数据进行小波变换(DWT)后,联合使用三种降维方法(LDA,ICA,PCA)对DWT结果进行降维,最后送入SVM或PNN(probabilistic neural network)进行判别</p><ol start="5"><li>基于形态学特征的:</li></ol><p>[6]: 使用Triadic Motif Field Images描述准周期序列中所包含的motifs,然后借助VGG-16作为特征提取器对TMF images进行特征提取以支持异常检测。</p><ol start="6"><li>基于深度学习的:</li></ol><p>[8]: 提出一种Hybrid Attentional LSTM-CNN Model,它结合了LSTM与CNN,分别用于提取准周期时间序列中的趋势变化与局部特征变化。</p><hr><h2 id="代表性论文10篇"><a href="#代表性论文10篇" class="headerlink" title="代表性论文10篇"></a>代表性论文10篇</h2><ol><li>基于手工设计的特征</li></ol><p>[1] Ma, J., Sun, L., Wang, H., Zhang, Y., & Aickelin, U. (2016). Supervised anomaly detection in uncertain pseudoperiodic data streams. ACM Transactions on Internet Technology (TOIT), 16(1), 1-20.</p><p>[2] Desai, U., Martis, R. J., Acharya, U. R., Nayak, C. G., Seshikala, G., & SHETTY K, R. A. N. J. A. N. (2016). Diagnosis of multiclass tachycardia beats using recurrence quantification analysis and ensemble classifiers. Journal of Mechanics in Medicine and Biology, 16(01), 1640005.</p><ol start="2"><li>无需手工设计特征的</li></ol><p>[3] Martis, R. J., Acharya, U. R., & Min, L. C. (2013). ECG beat classification using PCA, LDA, ICA and discrete wavelet transform. Biomedical Signal Processing and Control, 8(5), 437-448.</p><p><a href="%E6%8F%90%E5%87%BA%E4%BA%86%E4%B8%80%E7%A7%8D%E5%9F%BA%E4%BA%8E%E5%BA%8F%E5%88%97%E6%95%B0%E6%8D%AEDFT%E7%9A%84%E5%91%A8%E6%9C%9F%E6%80%A7%E7%82%B9%E5%BC%82%E5%B8%B8%E6%A3%80%E6%B5%8B%E7%AE%97%E6%B3%95">4</a> Erkuş, E. C., & Purutçuoğlu, V. (2020). Outlier detection and quasi-periodicity optimization algorithm: Frequency domain based outlier detection (FOD). European Journal of Operational Research.</p><p><a href="%E8%B0%83%E7%A0%94%E4%BA%86%E7%94%A8%E4%BA%8E%E5%87%86%E5%91%A8%E6%9C%9F%E6%97%B6%E9%97%B4%E5%BA%8F%E5%88%97%E5%BC%82%E5%B8%B8%E6%A3%80%E6%B5%8B%E7%9A%8410%E7%A7%8D%E9%A2%91%E7%8E%87%E5%9F%9F%E7%89%B9%E5%BE%81%E6%8F%90%E5%8F%96%E6%96%B9%E6%B3%95%E5%8F%8A%E5%85%B6%E7%89%B9%E6%80%A7">5</a> Iskhakova, A. O., Alekhin, M. D., & Bogomolov, A. V. (2020). Time-frequency transforms in analysis of non-stationary quasi-periodic biomedical signal patterns for acoustic anomaly detection. Информационно-управляющие системы, (1), 15-23.</p><p>[6] Zhang, Y., & Chen, X. (2020). Anomaly Detection in Time Series with Triadic Motif Fields and Application in Atrial Fibrillation ECG Classification. 
<hr><h2 id="代表性论文10篇"><a href="#代表性论文10篇" class="headerlink" title="Ten representative papers"></a>Ten representative papers</h2><ol><li>Based on hand-crafted features</li></ol><p>[1] Ma, J., Sun, L., Wang, H., Zhang, Y., & Aickelin, U. (2016). Supervised anomaly detection in uncertain pseudoperiodic data streams. ACM Transactions on Internet Technology (TOIT), 16(1), 1-20.</p><p>[2] Desai, U., Martis, R. J., Acharya, U. R., Nayak, C. G., Seshikala, G., & Shetty K., R. (2016). Diagnosis of multiclass tachycardia beats using recurrence quantification analysis and ensemble classifiers. Journal of Mechanics in Medicine and Biology, 16(01), 1640005.</p><ol start="2"><li>Without hand-crafted features</li></ol><p>[3] Martis, R. J., Acharya, U. R., & Min, L. C. (2013). ECG beat classification using PCA, LDA, ICA and discrete wavelet transform. Biomedical Signal Processing and Control, 8(5), 437-448.</p><p>[4] Erkuş, E. C., & Purutçuoğlu, V. (2020). Outlier detection and quasi-periodicity optimization algorithm: Frequency domain based outlier detection (FOD). European Journal of Operational Research. (Proposes a periodicity-point anomaly detection algorithm based on the DFT of the sequence data.)</p><p>[5] Iskhakova, A. O., Alekhin, M. D., & Bogomolov, A. V. (2020). Time-frequency transforms in analysis of non-stationary quasi-periodic biomedical signal patterns for acoustic anomaly detection. Информационно-управляющие системы, (1), 15-23. (Surveys ten frequency-domain feature extraction methods for quasi-periodic time-series anomaly detection and their properties.)</p><p>[6] Zhang, Y., & Chen, X. (2020). Anomaly Detection in Time Series with Triadic Motif Fields and Application in Atrial Fibrillation ECG Classification. arXiv preprint arXiv:2012.04936.</p><p>[7] Ngo, D., & Veeravalli, B. (2015, November). Design of a real-time morphology-based anomaly detection method from ECG streams. In 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp. 829-836). IEEE. (For ECG signals, first extracts frequency-domain features with the STFT, then computes the AMDF of each frequency band to resolve long- and short-term dependencies in the ECG; an LSTM then models these dependencies and performs anomaly detection.)</p><p>[8] Liu, F., Zhou, X., Cao, J., Wang, Z., Wang, T., Wang, H., & Zhang, Y. (2020). Anomaly Detection in Quasi-Periodic Time Series based on Automatic Data Segmentation and Attentional LSTM-CNN. IEEE Transactions on Knowledge and Data Engineering.</p><p>[9] Thill, M., Däubener, S., Konen, W., & Bäck, T. (2019). Anomaly Detection in Electrocardiogram Readings with Stacked LSTM Networks. In ITAT (pp. 17-25). (Builds a time-series prediction model with an LSTM network and fits a multivariate Gaussian distribution to its prediction errors; anomalies are detected by checking whether a prediction error belongs to the learned Gaussian. A sketch of this scheme follows below.)</p><p>[10] Malhotra, P., Ramakrishnan, A., Anand, G., Vig, L., Agarwal, P., & Shroff, G. (2016). LSTM-based encoder-decoder for multi-sensor anomaly detection. arXiv preprint arXiv:1607.00148. (Models normal time series with an LSTM-based seq2seq model and uses the reconstruction error as the anomaly score.)</p><p>[11] Dissanayake, T., Fernando, T., Denman, S., Sridharan, S., Ghaemmaghami, H., & Fookes, C. (2020). A robust interpretable deep learning classifier for heart anomaly detection without segmentation. IEEE Journal of Biomedical and Health Informatics. (Extracts and fuses features from ECG data with an LSTM and a CNN, then uses an MLP classifier to detect abnormal heart sounds.)</p>
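<p>The prediction-error scheme of [9] can be sketched briefly. The following is a hedged illustration, not the paper's code: an LSTM forecaster is trained on a synthetic quasi-periodic signal, a Gaussian is fitted to its prediction errors, and the negative log-density serves as the anomaly score. [9] fits a multivariate Gaussian over error vectors spanning several prediction horizons; this sketch uses a one-dimensional error, and all shapes and hyperparameters are assumptions.</p><pre><code>import numpy as np
import torch
import torch.nn as nn
from scipy.stats import norm

torch.manual_seed(0)

class Forecaster(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)
    def forward(self, x):                      # x: (batch, window, 1)
        h, _ = self.lstm(x)
        return self.head(h[:, -1, :]).squeeze(-1)

# Stand-in quasi-periodic signal: noisy sine cycles.
t = np.linspace(0, 40 * np.pi, 4000)
s = np.sin(t) + 0.05 * np.random.default_rng(0).normal(size=t.size)

w = 50
X = torch.tensor(np.stack([s[i:i + w] for i in range(len(s) - w)]),
                 dtype=torch.float32).unsqueeze(-1)
y = torch.tensor(s[w:], dtype=torch.float32)

model = Forecaster()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(30):                            # brief training loop
    opt.zero_grad()
    nn.functional.mse_loss(model(X), y).backward()
    opt.step()

# Fit a Gaussian to the prediction errors; low likelihood flags anomalies.
with torch.no_grad():
    err = (model(X) - y).numpy()
scores = -norm(err.mean(), err.std() + 1e-8).logpdf(err)  # anomaly score per window
</code></pre>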
<h2 id="经典论文or强相关论文"><a href="#经典论文or强相关论文" class="headerlink" title="Classic or strongly related papers"></a>Classic or strongly related papers</h2><p>[8] Liu, F., Zhou, X., Cao, J., Wang, Z., Wang, T., Wang, H., & Zhang, Y. (2020). Anomaly Detection in Quasi-Periodic Time Series based on Automatic Data Segmentation and Attentional LSTM-CNN. IEEE Transactions on Knowledge and Data Engineering.</p><p>[9] Thill, M., Däubener, S., Konen, W., & Bäck, T. (2019). Anomaly Detection in Electrocardiogram Readings with Stacked LSTM Networks. In ITAT (pp. 17-25). (Builds a time-series prediction model with an LSTM network and fits a multivariate Gaussian distribution to its prediction errors; anomalies are detected by checking whether a prediction error belongs to the learned Gaussian.)</p><p>[10] Malhotra, P., Ramakrishnan, A., Anand, G., Vig, L., Agarwal, P., & Shroff, G. (2016). LSTM-based encoder-decoder for multi-sensor anomaly detection. arXiv preprint arXiv:1607.00148. (Models normal time series with an LSTM-based seq2seq model and uses the reconstruction error as the anomaly score.)</p><p>[11] Dissanayake, T., Fernando, T., Denman, S., Sridharan, S., Ghaemmaghami, H., & Fookes, C. (2020). A robust interpretable deep learning classifier for heart anomaly detection without segmentation. IEEE Journal of Biomedical and Health Informatics. (Extracts and fuses features from ECG data with an LSTM and a CNN, then uses an MLP classifier to detect abnormal heart sounds.)</p><h3 id="异同点"><a href="#异同点" class="headerlink" title="Similarities and differences"></a>Similarities and differences</h3><p>The method of [8] is supervised, so it does not adapt well to the realistic setting where labels are scarce or imprecise. It also lacks interpretability and cannot localize where an anomaly occurs.</p><p>The methods proposed in [9, 10] lack interpretability.</p><p>The explanation method used in [11] only shows how much the data contributes to the model output; it cannot point out the anomalous segment.</p>]]></content>
<tags>
<tag> paper list </tag>
<tag> survey </tag>
<tag> Quasi-periodic </tag>
<tag> time series </tag>
</tags>
</entry>
<entry>
<title>Related Papers in ICLR 2021 (May 04 2021)</title>
<link href="/uncategorized/paperlistfile/ICLR2021/"/>
<url>/uncategorized/paperlistfile/ICLR2021/</url>
<content type="html"><![CDATA[<p><a href="https://openreview.net/group?id=ICLR.cc/2021/Conference">Accept paper list</a></p><span id="more"></span><h2 id="Anomaly-detection-anomaly-outlier-out-of-distribution-one-class"><a href="#Anomaly-detection-anomaly-outlier-out-of-distribution-one-class" class="headerlink" title="Anomaly detection [anomaly, outlier, out-of-distribution, one-class]"></a>Anomaly detection [anomaly, outlier, out-of-distribution, one-class]</h2><ul><li><p><strong>已读</strong>SSD: A Unified Framework for Self-Supervised Outlier Detection </p><p> Vikash Sehwag, Mung Chiang, Prateek Mittal</p><p> <strong>Reviewers say:</strong><br> 这项工作调查了一个经典的无监督离群值检测问题,其中我们没有任何标签信息,需要从那些未标记的数据中学习检测模型,以将任何不一致的数据点识别为离群值。这里的关键方法是应用现有的自我监督对比特征学习方法来提取特征表示,然后应用基于聚类的方法来计算离群值。它还提供了两种利用标记的异常数据(如果可用)的方法,包括改进的马哈拉诺比斯距离方法和最近提出的监督式对比学习方法的应用。使用四个数据集评估了这些方法,包括使用一些标记的异常数据的无监督和半监督方法。</p><p> <strong>解决的问题</strong>:分布外数据检测问题(Out-of-distribution detection)</p><p> <strong>难点</strong></p><ul><li>现有的无监督方法在复杂数据形态(如图像数据)上性能不好</li><li>现有的在图像数据上性能较好的方法大多假设有很多in-distribution数据的标可以获得</li></ul><p> <strong>创新点</strong></p><ul><li>本文提出一个无监督的OOD框架,可以不使用任何分布内数据的标签。</li><li>本文将few-shot设定拓展到所提出的框架中,即已知少量的OOD样本,少量有标签的OOD数据有助于提高OOD性能。</li><li>本文将所提出的SSD也拓展到了已知分部内数据标签的情况,使得改进后的方法比原始方案有更好的性能。</li></ul><p> <strong>为何选择/如何应用</strong></p><ul><li>本文提出了一种完全无需标签的方法,这在大多数异常检测问题中都是可借鉴的</li><li>本文的关键方法在于应用现有的self-supervised contrastive learning方法对数据进行了无监督的特征提取,这为后续工作中的特征提取提供了思路。 </li><li>本文进一步评估了改进后的mahalanobis distance在聚类时能够显著提高性能</li></ul></li><li><p>Learning and Evaluating Representations for Deep One-Class Classification </p><p> Kihyuk Sohn, Chun-Liang Li, Jinsung Yoon, Minho Jin, Tomas Pfister</p><p> <strong>One-sentence Summary:</strong> We present a two-stage framework for deep one-class classification, composed of state-of-the-art self-supervised representation learning followed by generative or discriminative one-class classifiers.</p><p> <strong>Reviewers say:</strong> 本文研究了一类分类问题,提出了一种学习自我监督表示和分布增强的对比学习方法。全面的结果和分析表明,该方法是有效的,并就其起作用的潜在机制支持了他们的主张。总体而言,尽管发现新颖性不高,但审稿人认为该论文写得好,动机强/论证力强,并提供了详尽的相关工作比较和实验。几位审稿人在证明表示的统一性以及建议其他数据集方面提出了一些可能的弱点。通过有趣的讨论,作者在Mvtec数据集上提供了其他可视化效果和结果。这进一步支持了本文的论点。</p></li></ul><ul><li><p>Explainable Deep One-Class Classification </p><p> Philipp Liznerski, Lukas Ruff, Robert A. Vandermeulen, Billy Joe Franks, Marius Kloft, Klaus Robert Muller</p><p> <strong>One-sentence Summary:</strong> We introduce an approach to explainable deep anomaly detection based on fully convolutional neural networks. 
<ul><li><p>Explainable Deep One-Class Classification </p><p> Philipp Liznerski, Lukas Ruff, Robert A. Vandermeulen, Billy Joe Franks, Marius Kloft, Klaus Robert Muller</p><p> <strong>One-sentence Summary:</strong> We introduce an approach to explainable deep anomaly detection based on fully convolutional neural networks.</p><p> <strong>Reviewers say:</strong> The paper addresses explainable anomaly detection by modifying the hypersphere classifier into a Fully Convolutional Data Description (FCDD). As two reviewers note, this is a direct application of fully convolutional networks within a hypersphere classifier; however, the paper also shows how to upsample the receptive field with strided transposed convolutions using a fixed Gaussian kernel. Both aspects matter, as does tackling explainable anomaly detection at all. Moreover, the empirical evaluation is thorough and shows benefits over the state of the art. So yes, incremental, but incremental on a very interesting and important case.</p></li><li><p>Multiscale Score Matching for Out-of-Distribution Detection </p><p> Ahsan Mahmood, Junier Oliva, Martin Andreas Styner</p><p> <strong>One-sentence Summary:</strong> Using score estimates at multiple noise scales outperforms state-of-the-art in out-of-distribution detection.</p><p> <strong>Reviewers say:</strong></p><ol><li>The authors repurpose the Noise Conditional Score Network (NCSN), originally introduced by Song & Ermon (2019) for generative modeling, to detect out-of-distribution (OOD) images. They present the intuition and rationale behind score matching, derive the NCSN as a score estimator via its equivalence to denoising autoencoders (DAEs), and provide analysis demonstrating the value of multiscale score analysis. In experiments on the SVHN and CIFAR datasets, they show that their method (MSMA) outperforms findings previously reported in the literature with OOD models (ODIN, JEM, likelihood ratios).</li><li>Summary: a new OOD detection method, MSMA, which uses a generative model [NCSN] plus a second stage that fits a simple density model on the vector of likelihoods across scales. Empirically strong results on standard OOD image benchmarks (CIFAR-10 vs. OOD, SVHN vs. OOD, etc.): near-perfect separation in most settings, and a large improvement on CIFAR-10 vs. SVHN over previous unsupervised methods. An interesting application to OOD detection in medical images treats scans from ages 9-11 as in-distribution and scans from under 9 as OOD.</li><li>The paper applies multiscale score estimation to out-of-distribution detection, demonstrates the usefulness of multiscale estimates, and uses an auxiliary model to identify anomalous data. The proposed method is evaluated in two different settings and is very effective for OOD detection.</li></ol></li></ul><ul><li><p>In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness </p><p> Sang Michael Xie, Ananya Kumar, Robbie Jones, Fereshte Khani, Tengyu Ma, Percy Liang</p><p> <strong>One-sentence Summary:</strong> Using auxiliary information as inputs hurts OOD, but using auxiliary information by pretraining and self-training improves in-distribution and OOD accuracies on real-world datasets, with theoretical guarantees in a linear multi-task setting.</p><p> <strong>Reviewers say:</strong> The paper addresses improving generalization with little annotated data by exploiting available auxiliary information. The authors weigh the respective merits of two options: using auxiliary information as additional inputs, or as additional outputs in a multi-task or transfer setting. For linear regression they show theoretically that the former can help in-distribution error but may hurt OOD error, while the latter may help improve OOD error. They propose a framework combining the two options and show empirically that it works on three different datasets.</p></li></ul><h2 id="Heterogeneous"><a href="#Heterogeneous" class="headerlink" title="Heterogeneous"></a>Heterogeneous</h2><ul><li><p>Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks </p><p> Timothy Castiglia, Anirban Das, Stacy Patterson</p></li><li><p>HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients </p><p> Enmao Diao, Jie Ding, Vahid Tarokh</p></li></ul>
<h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies </p><p> T. Konstantin Rusch, Siddhartha Mishra</p><p> <strong>One-sentence Summary:</strong> A biologically motivated and discretized ODE based RNN for learning long-term dependencies, with rigorous bounds mitigating the exploding and vanishing gradient problem.</p></li><li><p>Multi-Time Attention Networks for Irregularly Sampled Time Series </p><p> Satya Narayan Shukla, Benjamin Marlin</p><p> <strong>One-sentence Summary:</strong> This paper presents a new deep learning architecture for learning with sparse and irregularly sampled multivariate time series.</p><p> <strong>Reviewers say:</strong> The paper addresses the analysis of irregularly sampled time series. The approach is essentially interpolation-based, so the authors can study both supervised and unsupervised problems. The architecture consists of a sinusoidal attention layer, a VAE layer that forms fixed-size landmarks in latent space, and an RNN decoder; for supervised tasks the authors add a classification loss (a sketch of the attention-based interpolation idea follows after this list).</p><ul><li>Impressive results on interpolation tasks and interesting results on classification tasks.</li><li>For the interpolation problem, robust baselines such as linear interpolation or AR-like models would give meaningful MSE comparisons against other approaches, even granting that the authors already compare against many recent models from the literature.</li><li>The results are impressive, but it is unclear which part of the architecture is responsible for the strong performance.</li></ul></li><li><p>Generative Time-series Modeling with Fourier Flows </p><p> Ahmed Alaa, Alex James Chan, Mihaela van der Schaar</p><p> <strong>Reviewers say:</strong></p><ol><li>The authors propose a flow-based generative model for time series in the Fourier domain. Time-series data are first transformed to the Fourier domain, and instead of the affine coupling layers proposed in prior work, the authors design a frequency-domain version of the affine coupling layer.</li><li>The paper introduces a new convolutional flow architecture that uses the DFT to move generated time series into the frequency domain. Convolution is carried out as multiplication in the frequency domain through a spectral affine layer, which transforms the even or odd part of the signal with data-dependent filters. The resulting time-domain convolution has input-dependent weights, an interesting and original approach clearly different from other convolutional flows (e.g., [1]).<ul><li>Relevance: generative time-series modeling has broad, critical applications from medicine to finance. The new method shows very promising performance and could become the state of the art for time-series generation.</li><li>Originality: the use of the DFT and a spectral affine layer is original in the context of normalizing flows. Importantly, the DFT suits flows well because it is an isometry and therefore has a trivial Jacobian. The use of input-dependent convolutions is interesting even within regular ConvNet architectures.</li></ul></li></ol></li><li><p>Clairvoyance: A Pipeline Toolkit for Medical Time Series </p><p> Daniel Jarrett, Jinsung Yoon, Ioana Bica, Zhaozhi Qian, Ari Ercole, Mihaela van der Schaar</p><p> <strong>One-sentence Summary:</strong> We develop and present Clairvoyance: a pipeline toolkit for medical time series.</p><p> <strong>Reviewers say:</strong> The manuscript presents and illustrates an end-to-end software pipeline for medical machine learning on time-series data, called Clairvoyance. The authors deserve congratulations for designing and developing an excellent resource that can accelerate the adoption of these computational techniques in clinical practice, in a way that supports human judgment and decision-making. The manuscript excels at describing its contributions and relating them to prior work, and it includes a convincing set of experiments on datasets from three complementary medical settings.</p></li><li><p>Discrete Graph Structure Learning for Forecasting Multiple Time Series </p><p> Chao Shang, Jie Chen, Jinbo Bi</p><p> <strong>One-sentence Summary:</strong> We propose a graph neural network approach that learns a graph structure to enhance the forecasting of multiple multivariate time series.</p><p> <strong>Reviewers say:</strong> The paper forecasts multivariate time series by estimating correlations between the individual dimensions through a learned graph structure: dimensions are treated as nodes in a graph, and the problem is mapped to learning a discrete graph structure that can help the downstream forecasting task. The paper shows that graph neural networks (GNNs) can improve forecasting performance even when no explicit structure is known, learning the graph structure and the forecasting architecture end to end. The proposed method is computationally more efficient than bilevel optimization approaches that learn discrete graph structures in a meta-learning framework, and it can incorporate prior knowledge of the graph structure through a proposed regularizer that keeps the learned structure close to a known one. It improves forecasting performance over several strong baselines on three real datasets. Overall, the paper is well written and easy to read.</p></li><li><p>Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding </p><p> Sana Tonekaboni, Danny Eytan, Anna Goldenberg</p><p> <strong>One-sentence Summary:</strong> An unsupervised representation learning framework for high-dimensional non-stationary time series </p><p> <strong>Reviewers say:</strong> The paper proposes an unsupervised representation (embedding) learning method for time series. While unsupervised representation learning has been widely studied and performs well in areas such as NLP and vision, it is relatively new to the time-series community. Compared with recent work (CPC and triplet loss), the differences are:</p><ol><li>It uses statistical tests of stationarity/non-stationarity to estimate a stationary time window.</li><li>It learns embeddings with the contrastive objectives of CPC and triplet loss, but also accounts for the fact that naive negative sampling may include false negatives, which hurts embedding learning on strongly seasonal time series; instead it adopts a positive-unlabeled learning framework to address this.</li></ol></li></ul>
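<p>A minimal sketch of the attention-based interpolation idea behind Multi-Time Attention Networks: sinusoidal embeddings of time stamps serve as queries and keys, and observed values are interpolated onto a fixed reference grid by attention. The frequency grid, dimensions, and data are illustrative assumptions, not the paper's architecture (which learns the time embeddings and uses multiple attention heads):</p><pre><code>import torch

def time_embedding(t, dim=16):
    # Sinusoidal embedding of (possibly irregular) time stamps in [0, 1].
    freqs = 10.0 ** torch.linspace(0, 3, dim // 2)          # assumed frequency grid
    ang = t.unsqueeze(-1) * freqs
    return torch.cat([torch.sin(ang), torch.cos(ang)], -1)  # (..., dim)

def attend_interpolate(t_obs, x_obs, t_ref, dim=16):
    # Attention from reference times to observed times; weights sum to 1 per query.
    q = time_embedding(t_ref, dim)                          # (m, dim)
    k = time_embedding(t_obs, dim)                          # (n, dim)
    attn = torch.softmax(q @ k.T / dim ** 0.5, dim=-1)      # (m, n)
    return attn @ x_obs                                     # (m,) interpolated values

t_obs = torch.tensor([0.05, 0.30, 0.32, 0.90])              # irregular time stamps
x_obs = torch.tensor([0.0, 1.0, 1.1, -0.5])
t_ref = torch.linspace(0, 1, 11)                            # fixed reference grid
print(attend_interpolate(t_obs, x_obs, t_ref))
</code></pre>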
<h2 id="About-deep-learning"><a href="#About-deep-learning" class="headerlink" title="About deep learning"></a>About deep learning</h2><ul><li><p>Understanding Over-parameterization in Generative Adversarial Networks </p><p> Yogesh Balaji, Mohammadmahdi Sajedi, Neha Mukund Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi</p><p> <strong>One-sentence Summary:</strong> We present an analysis of over-paramterization in GANs both theoretically and empirically.</p><p> <strong>Reviewers say:</strong> The paper studies, theoretically and empirically, the effect of increasing the number of parameters ("over-parameterization") in GAN training. Similar to what happens for neural networks in supervised learning, over-parameterization indeed helps stabilize the training dynamics (and empirically improves performance). The paper gives an explicit threshold on the width of a one-hidden-layer ReLU generator such that gradient ascent training with a linear discriminator yields linear convergence to a global saddle point (which corresponds to matching the data mean). The authors also provide a more general theorem extending this result to deeper networks.</p></li></ul><ul><li><p>Understanding the role of importance weighting for deep learning </p><p> Da Xu, Yuting Ye, Chuanwei Ruan</p><p> <strong>One-sentence Summary:</strong> We study the theoretical properties of importance weighting for deep learning.</p><p> <strong>Reviewers say:</strong> The paper studies how importance-weighting schemes affect the implicit bias of gradient descent in deep learning models. It provides theoretical results with important insights into how weighting schemes affect the convergence limit as well as the convergence rate, with results given for linear separators and for deep models. The covariate-shift setting is also studied. The theory is supported by empirical evidence, yields useful insights into which weighting schemes are more useful, and explains some previously observed empirical phenomena.</p></li></ul><h2 id="Sequence"><a href="#Sequence" class="headerlink" title="Sequence"></a>Sequence</h2><ul><li><p>On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines </p><p> Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow</p><p> <strong>One-sentence Summary:</strong> We provide an analysis of the fine-tuning instability of BERT-based models and present a simple method to fix it</p><p> <strong>Reviewers say:</strong> The paper identifies the cause of a major known problem in deep learning for NLP: fine-tuning on small datasets after self-supervised pre-training can be highly unstable, in some cases requiring dozens of restarts to reach acceptable performance. It then presents a simple proposed fix.</p></li><li><p>Multi-timescale Representation Learning in LSTM Language Models </p><p> Shivangi Mahto, Vy Ai Vo, Javier S. Turek, Alexander Huth</p><p> <strong>One-sentence Summary:</strong> This work presents a theoretically-motivated analysis of memory and timescale in LSTM language models.</p><p> <strong>Reviewers say:</strong> The paper observes that relations between words in natural language typically follow a power law, whereas gated recurrent networks such as LSTMs, although excellent at language modeling, have a forgetting mechanism governed by exponential decay. This work engineers the LSTM's forgetting mechanism to mimic the power-law relations exhibited by natural language. By applying this technique, the modified LSTM better models rare tokens, which often span long ranges, and hence achieves lower perplexity on infrequent words. The paper's main contribution is a derivation showing that, after the first input token, an LSTM's forget gate undergoes exponential decay under zero input.</p><p> Experiments show that drawing the timescale T from an inverse-gamma distribution is a natural fit to language. The authors then propose a multi-scale LSTM that exploits this property: each timescale T drawn from the inverse gamma effectively becomes a forget-gate bias term that is fixed during training, and multiple draws of T together simulate the power law (see the sketch after this list). The multi-scale LSTM captures the right inductive bias to perform better when modeling infrequent words, which may help retain them in memory for longer. The paper is well written, the method is clearly motivated and explained, the experiments are properly designed, and the results support the main claims well.</p></li><li><p>Representation Learning for Sequence Data with Deep Autoencoding Predictive Components </p><p> Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong</p><p> <strong>Reviewers say:</strong> The paper proposes Deep Autoencoding Predictive Components (DAPC), a self-supervised representation learning method for sequence data in which the model learns to maximize predictive information, i.e., the mutual information between past and future time windows. To avoid degenerate solutions, the proposed method relies on a second loss that optimizes masked reconstruction.</p></li><li><p>Differentiable Segmentation of Sequences </p><p> Erik Scharwächter, Jonathan Lennartz, Emmanuel Müller</p><p> <strong>One-sentence Summary:</strong> We propose an architecture for effective gradient-based learning of segmented models for sequential data.</p></li><li><p>PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences </p><p> Hehe Fan, Xin Yu, Yuhang Ding, Yi Yang, Mohan Kankanhalli</p><p> <strong>One-sentence Summary:</strong> This paper proposes a point spatio-temporal (PST) convolution to learn representations of raw point cloud sequences by disentangling space and time.</p><p> <strong>Reviewers say:</strong> The paper introduces a new convolution for directly processing raw spatio-temporal (ST) point cloud data. The proposed point spatio-temporal (PST) convolution operates on "point tubes" and decouples space and time via a shared spatial convolution at each time step followed by a temporal convolution. It also introduces a transposed PST to enable point-wise prediction in an encoder-decoder framework (PSTNet). The experiments demonstrate the effectiveness of these convolutions by using PSTNet for action recognition and semantic segmentation on point cloud sequences, showing improvements over related state-of-the-art work.</p></li>
<li><p>Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning </p><p> Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu</p><p> <strong>Reviewers say:</strong> The paper introduces fine-grained layer attention to evaluate the role of individual encoder layers and to study how encoder layer fusion works, where decoder layers can access information from all encoder layers rather than only the final encoder layer as in the standard Transformer.</p><p> Based on the observations that the encoder embedding layer is critical to the success of encoder layer fusion and that the topmost decoder layers attend most to the encoder embedding layer, the paper proposes SurfaceFusion, which connects only the encoder embedding layer to the decoder's softmax layer, leading to decent gains in metrics such as BLEU.</p></li><li><p>Seq2Tens: An Efficient Representation of Sequences by Low-Rank Tensor Projections </p><p> Csaba Toth, Patric Bonnier, Harald Oberhauser</p><p> <strong>One-sentence Summary:</strong> An Efficient Representation of Sequences by Low-Rank Tensor Projections</p><p> <strong>Reviewers say:</strong></p><ol><li>The paper introduces the free algebra, a classical mathematical construction, as a universal tool for representing sequential data of arbitrary length. The proposed approach has appealing theoretical properties, such as preserving the universality of static feature maps and convergence in the continuous setting. The authors further propose stacked rank-1 projections of the free algebra as a computationally feasible neural network layer approximating the sequence representation. They illustrate the flexibility and effectiveness of the method by combining the NN implementation with an FCN on multivariate time-series classification benchmarks, and with a GP-VAE model on sequential data imputation, showing improved results over the previous state of the art.<ul><li>Significance: the paper provides the community with an extension of universal approximation theorems for NNs to sequence data, and a generic way to turn static feature maps into sequence features. Experiments show the method is flexible and effective on both discriminative and generative problems.</li></ul></li><li>The paper proposes an interesting low-rank tensor representation learning model for sequence data, called Seq2Tens. The proposed model can be inserted as a Seq2Tens layer into existing state-of-the-art neural network models to improve performance, as demonstrated on several benchmark datasets in the paper.</li></ol></li></ul>
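<p>The multi-timescale recipe described above is easy to sketch. Assuming the derivation that under zero input an LSTM cell decays as c_t ≈ sigmoid(b_f) · c_{t-1}, a target timescale T corresponds to sigmoid(b_f) = exp(-1/T). The inverse-gamma parameters below are stand-in assumptions, not the paper's fitted values:</p><pre><code>import numpy as np

rng = np.random.default_rng(0)
n_units, alpha, beta = 256, 2.0, 1.0     # assumed inverse-gamma parameters

# Timescales T ~ InvGamma(alpha, beta): reciprocals of Gamma(alpha, rate=beta) samples.
T = 1.0 / rng.gamma(shape=alpha, scale=1.0 / beta, size=n_units)

# With zero input, the cell state decays per step by a factor sigmoid(b_f),
# so a target timescale T corresponds to sigmoid(b_f) = exp(-1/T).
f = np.exp(-1.0 / T)
b_f = np.log(f / (1.0 - f))              # fixed (untrained) forget-gate biases
print(b_f[:5])
</code></pre>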
<h2 id="Interpretable"><a href="#Interpretable" class="headerlink" title="Interpretable"></a>Interpretable</h2><ul><li><p>Understanding the failure modes of out-of-distribution generalization </p><p> Vaishnavh Nagarajan, Anders Andreassen, Behnam Neyshabur</p><p> <strong>One-sentence Summary:</strong> In this theoretical study, we explain why machine learning models rely on spuriously correlated features in the dataset and fail at out-of-distribution generalization.</p></li><li><p>Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability </p><p> Suraj Srinivas, Francois Fleuret</p><p> <strong>One-sentence Summary:</strong> Input-gradients in discriminative neural net models capture information regarding an implicit density model, rather than that of the underlying discriminative model which it is intended to explain.</p><p> <strong>Reviewers say:</strong> Starting from recent observations on energy-based generative models, the paper studies the gradient-based attribution methods of the interpretability literature from a theoretical perspective. First, the authors point out a general weakness of gradient-based attribution, stemming from the fact that input gradients provide no unambiguous explanation: the shift invariance of the softmax output makes them arbitrary. The authors then propose that the success of gradient-based attribution can be explained by the fact that a discriminative model "implicitly contains" a class-conditional density model (the recent finding about energy-based generative models). They elaborate on this idea, showing how aligning the implicit class-conditional generative model with the "true" generative model of the data would help make input gradients informative for attribution, how score matching can effectively promote this alignment via a novel implementation, and how this mechanism can be realized in practice as a regularization cost. An empirical study convincingly confirms the predictions of the theoretical argument. First, using the proposed "GAN-test" methodology, the authors show that samples generated from discriminative models trained with score matching and the proposed gradient-norm regularizer are better, both in being less noisy and in terms of discriminative accuracy. Finally, they show that gradient-based explanations are of higher quality under a discriminative version of the pixel-perturbation test, a method for evaluating gradient explanations by perturbing pixels ranked in increasing order of relevance. In sum, the paper establishes a very interesting fundamental theoretical connection between discriminative models, energy-based generative models, and gradient-based explanations; uses this theoretical framework to explain how gradient-based explanations can overcome the softmax shift-invariance problem (noted in the paper); and introduces a practical training procedure that exploits the theoretical insight to produce better explanations, also empirically validated in simulations.</p></li></ul><ul><li><p>Getting a CLUE: A Method for Explaining Uncertainty Estimates </p><p> Javier Antoran, Umang Bhatt, Tameem Adel, Adrian Weller, José Miguel Hernández-Lobato</p><p> <strong>One-sentence Summary:</strong> We introduce a method to help explain uncertainties of any differentiable probabilistic model by perturbing input features.</p></li><li><p>Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking </p><p> Michael Sejr Schlichtkrull, Nicola De Cao, Ivan Titov</p><p> <strong>One-sentence Summary:</strong> We present a novel post-hoc interpretation method for graph neural networks, and apply it to analyse two models from the NLP literature.</p></li><li><p>Interpretable Models for Granger Causality Using Self-explaining Neural Networks </p><p> Ričards Marcinkevičs, Julia E Vogt</p><p> <strong>One-sentence Summary:</strong> We propose an interpretable framework for inferring Granger causality based on self-explaining neural networks.</p><p> <strong>Reviewers say:</strong> The paper concerns learning Granger causality in multivariate time series in a nonlinear dynamical setting. The core method uses vector autoregressive modeling with sparsity-inducing regularizers (an elastic net and a smoothness-based fused lasso), together with the recently proposed self-explaining neural networks (for interpretability). The authors also augment the framework by learning Granger-causal structures that are stable on the original and time-reversed data. A thorough empirical analysis is carried out against recent GC benchmarks.</p></li><li><p>Interpreting Knowledge Graph Relation Representation from Word Embeddings </p><p> Carl Allen, Ivana Balazevic, Timothy Hospedales</p><p> <strong>One-sentence Summary:</strong> Interpreting the structure of knowledge graph relation representation using insight from word embeddings.</p><p> <strong>Reviewers say:</strong> The authors investigate latent semantic properties of word representation models by classifying the relations between entities. The goal is to show that word embeddings and knowledge graph representations learn a common latent structure even though the two types of model have different learning objectives. The main contributions are a mapping from relations between objects to target word embeddings, a categorization of such relations, and an evaluation of state-of-the-art knowledge graph representations. The study shows that knowledge representation models follow the derived relation conditions.</p></li><li><p>Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning </p><p> Alihan Hüyük, Daniel Jarrett, Cem Tekin, Mihaela van der Schaar</p><p> <strong>One-sentence Summary:</strong> We present a method for learning interpretable representations of behavior to enable auditing, quantifying, and understanding human decision-making processes.</p><p> <strong>Reviewers say:</strong> This work proposes a method for understanding and explaining decision-making behavior. The authors aim for a method that is 1) transparent, 2) able to handle partial observability, and 3) able to work with offline data. To this end they develop INTERPOLE, which uses Bayesian techniques to estimate the decision dynamics together with the decision boundaries. Results on simulated and real domains show that their method explains the decisions in behavioral data while preserving behavioral accuracy, focusing on explaining the decision dynamics rather than the "true" dynamics of the world.</p></li><li><p>BERTology Meets Biology: Interpreting Attention in Protein Language Models </p><p> Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Rajani</p><p> <strong>One-sentence Summary:</strong> We analyze the internal representations of protein language models, and show that attention targets structural and functional properties of protein sequences.</p></li><li><p>Representation learning for improved interpretability and classification accuracy of clinical factors from EEG </p><p> Garrett Honke, Irina Higgins, Nina Thigpen, Vladimir Miskovic, Katie Link, Sunny Duan, Pramod Gupta, Julia Klawohn, Greg Hajcak</p><p> <strong>One-sentence Summary:</strong> We use disentangled representations of EEG signals to improve performance on clinical classification tasks, provide interpretable recommendations for post-hoc analysis and allow for extraction of ERPs from novel single EEG trajectories.</p><p> <strong>Reviewers say:</strong> The authors focus on classifying EEG signals to predict age, sex, depression, and Axis-I disorders from EEG. After standard preprocessing and optional averaging to obtain evoked responses, the samples are fed into a β-VAE, after which a standard classification algorithm or the SCAN method predicts the labels. The authors report better results than the conventional approach based on the late positive potential. They also show that their method can be trained on non-averaged EEG data and tested on ERPs, and vice versa. Finally, the authors inspect the learned representations.</p></li><li><p>Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels </p><p> Xingchen Wan, Binxin Ru, Xiaowen Dong, Michael Osborne</p><p> <strong>One-sentence Summary:</strong> We propose a NAS method that is sample-efficient, highly performant and interpretable.</p><p> <strong>Reviewers say:</strong> The authors propose a new neural architecture search algorithm that combines Bayesian optimization with the expressive and popular Weisfeiler-Lehman (WL) graph kernel. One advantage of using WL is interpretable results, which stem from the nature of the kernel computation, i.e., the propagation scheme over the graph. By incorporating derivatives of Eq. 3.2, the subgraphs directly responsible for improved performance can be extracted. In various experiments the authors show not only improved performance of the discovered architectures, but also that the discovered subgraphs can be found by other algorithms as well.<br> Even though my expertise does not lie in NAS, I still find this work appealing: it is an innovative application of graph kernels, and scalability, which is usually a concern with graph kernels, is barely an issue in this setting. I find the novelty, interpretability, and quantitative results convincing enough to recommend acceptance. Moreover, the work is well structured and well written, with clear, legible figures. On whether the comparison with other SOTA NAS algorithms is of good quality and fair, the opinion of reviewers with a NAS background would be valuable.</p></li>
<li><p>Explainable Deep One-Class Classification </p><p> Philipp Liznerski, Lukas Ruff, Robert A. Vandermeulen, Billy Joe Franks, Marius Kloft, Klaus Robert Muller</p><p> <strong>One-sentence Summary:</strong> We introduce an approach to explainable deep anomaly detection based on fully convolutional neural networks.</p></li><li><p>A Learning Theoretic Perspective on Local Explainability </p><p> Jeffrey Li, Vaishnavh Nagarajan, Gregory Plumb, Ameet Talwalkar</p><p> In this paper, the authors explore connections between interpretable machine learning and learning theory through the lens of local approximation explanations. First, they address the traditional problem of performance generalization, bounding a model's test-time accuracy using a notion of local explainability. Second, they explore the novel problem of explanation generalization, an important concern for the growing class of local approximation explanations based on finite samples. Finally, they validate the theoretical results empirically and show that they reflect what can be seen in practice.</p><p> <strong>Reviewers say:</strong> The paper attempts to establish a novel connection between local interpretability and learning theory, proposing two theorems for bounds related to performance generalization and explanation generalization. Two sets of empirical results are provided to illustrate the usefulness of the bounds.</p><p> Overall, the idea of examining local explanations of black-box machine learning models from a learning-theoretic perspective is interesting. The paper proposes mirrored neighborhood fidelity (MNF) as a new measure of local interpretability and a core component of the arguments and conclusions.</p><p> The paper claims that MNF naturally complements the commonly used neighborhood fidelity (NF) and has distinct advantages over NF when evaluating local explanations on "realistic" high-dimensional distributed data, which often exhibits significant feature dependence. However, apart from one toy example in the appendix, there is no solid or convincing empirical evidence supporting this claim.</p></li><li><p>Shapley explainability on the data manifold </p><p> Christopher Frye, Damien de Mijolla, Tom Begley, Laurence Cowton, Megan Stanley, Ilya Feige</p><p> <strong>One-sentence Summary:</strong> We present drawbacks of model explanations that do not respect the data manifold, and introduce two methods for on-manifold explainability.</p><p> <strong>Reviewers say:</strong> The paper focuses on the off-manifold problem of Shapley values, created by sampling out-of-distribution data, with the goal of developing effective on-manifold methods. Two main algorithms are proposed: a generative model that approximates the conditional distribution, and a supervised model trained for direct approximation. Experiments show advantages over the original off-manifold Shapley values; the impression experiments are particularly interesting. The general idea of using generative models to address the off-manifold data problem in explainability methods (not only SHAP) is a good direction. Note that the problem is more pronounced for SHAP, since the method is based on model behavior over all subsets of input features, and the paper illustrates well the need to solve the off-manifold data problem for Shapley-based methods. The experimental results are also comprehensive and provide sufficient evidence of usefulness. There seem to be some novelty concerns: "an efficient method for estimating on-manifold Shapley values on general data has so far been lacking, and is the focus of this work" — yet the off-manifold data problem in explainability is well studied, and the details of a SHAP variant were discussed before in <a href="https://proceedings.icml.cc/static/paper_files/icml/2020/334-Paper.pdf">this paper</a>, which the authors cite only very briefly without actually stating how they differ. Unless the main contribution of this work over the existing literature is made evident, I cannot change my score. If the contribution is "... experimental evidence supporting on-manifold methods", that contribution is insufficient for this venue.</p></li><li><p>Information-theoretic Probing Explains Reliance on Spurious Features </p><p> Charles Lovering, Rohan Jha, Tal Linzen, Ellie Pavlick</p><p> <strong>One-sentence Summary:</strong> We find that feature extractability, measured by probing classifiers, can be viewed as an inductive bias: the more extractable a feature is after pre-training, the less statistical evidence needed during fine-tuning for the model to use the feature.</p><p> <strong>Reviewers say:</strong> The paper studies the relation between how extractable a feature is from a pre-trained representation and the degree to which a fine-tuned model uses that feature. Extractability is measured by the minimum description length of a probing classifier trained to detect the feature from the pre-trained representation (using the online-code variant of Voita and Titov). The degree to which a fine-tuned model uses a feature is measured by the amount of evidence the model needs to separate spurious features from non-spurious ("target") features; evidence here means examples where the spurious feature occurs but the non-spurious one does not. When many such examples exist (a high spurious-only rate), the model more easily rejects the spurious feature and learns to rely on the target feature.</p><p> The paper runs two kinds of experiments, on synthetic data and on more natural data. The synthetic data are symbol sequences where the task is to recognize simple properties such as the occurrence or repetition of symbols. Training provides different ratios of spurious-only examples, giving increasing evidence against the spurious feature (the presence of symbol 2) and in favor of the target feature. The target feature coincides with the label: 1 when the example matches the label, 0 otherwise. The paper reports the extractability of the spurious and target features via the probing classifier's MDL; the metric of interest is relative MDL, where higher means more easily extractable. When a feature is more extractable, the model needs less evidence to reject the spurious feature; when it is less extractable, more evidence is needed.</p><p> The natural-language examples are built from acceptability judgments on grammatically generated examples of three linguistic phenomena (subject-verb agreement, negative polarity items, and filler-gap dependencies). The setup is again similar, with an adjusted computation of extractability. The main result here is a high (negative) correlation between extractability and the evidence needed to reject spurious features.</p></li></ul>
<ul><li><p>Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization </p><p> Judy Borowski, Roland Simon Zimmermann, Judith Schepers, Robert Geirhos, Thomas S. A. Wallis, Matthias Bethge, Wieland Brendel</p><p> <strong>One-sentence Summary:</strong> Using human psychophysical experiments, we show that natural images can be significantly more informative for interpreting neural network activations than synthetic feature visualizations.</p><p> <strong>Reviewers say:</strong> </p><ol><li>The paper focuses on feature visualization, which generates maximally activating images for a given hidden node to understand the inner workings of CNNs. The authors compare how informative these images are against natural images that also strongly activate the specified hidden node, and find that natural images better help humans answer which other natural test images are also maximally activating.</li><li>The main idea is to study how extremely activating images help humans predict CNN activations. The authors do this by comparing extremely activating images with exemplary natural images that also strongly activate a particular feature map (using psychophysics tests to see which kind of image helps humans more).</li><li>The authors correctly note that many visualization methods combine response maximization with human-defined regularizers that are essentially artistic choices intended to reduce image noise; these regularizers impose their own biases on the resulting images, which may make them less informative. It is also well recognized that a unit may be highly activated by more than one semantic concept, or be active jointly with other units (which may convey more information than selectively maximizing the activation of a single neuron).</li></ol></li></ul><ul><li><p>Scaling Symbolic Methods using Gradients for Neural Model Explanation </p><p> Subham Sekhar Sahoo, Subhashini Venugopalan, Li Li, Rishabh Singh, Patrick Riley</p><p> <strong>Reviewers say:</strong> The paper proposes encoding the minimal-input-feature discovery problem (finding a minimal set of input features needed for the prediction) in a form amenable to satisfiability-modulo-theories (SMT) solvers. Specifically, the authors first score how influential the first-layer neurons are using the integrated gradients method; they then generate and solve an SMT problem that finds a minimal mask altering these influential neurons. They demonstrate their approach on several problems.</p></li><li><p><strong>[Priority read]</strong> Evaluation of Similarity-based Explanations </p><p> Kazuaki Hanawa, Sho Yokoi, Satoshi Hara, Kentaro Inui</p><p> <strong>One-sentence Summary:</strong> We investigated empirically which of the relevance metrics (e.g. similarity of hidden layer, influence function, etc.) are appropriate for similarity-based explanation.</p><p> <strong>Reviewers say:</strong> This work provides an empirical evaluation of the similarity metrics used in example-based explanation methods, whose purpose is to retrieve decision-supporting training examples for a black-box model's predictions. The paper evaluates popular gradient-based metrics from the literature, such as influence functions and Fisher kernels, as well as simple methods based on the l2 distance, cosine distance, and dot products over different embedding spaces. The authors introduce two new tasks for assessing the reliability of the different methods: the identical-class test and the identical-subclass test.</p></li><li><p>Learning explanations that are hard to vary </p><p> Giambattista Parascandolo, Alexander Neitz, ANTONIO ORVIETO, Luigi Gresele, Bernhard Schölkopf</p><p> <strong>Reviewers say:</strong> This work assumes that invariant mechanisms exist in the dataset. Machine learning algorithms trained with gradient descent typically average gradients across examples, and the paper's point is that averaging loses information. The method proposes replacing the arithmetic mean of gradients in gradient descent with a geometric (Karcher) mean to preserve information about the invariant mechanisms while ignoring confounders. Since directly applying the geometric mean runs into difficulties, a simple heuristic is developed: mask gradient components according to whether their signs agree across a batch of examples (or reach some agreement threshold); see the sketch after this list. The algorithm is tested on synthetic datasets, a semi-synthetic task on CIFAR-10, and an RL coinbase environment.</p></li></ul>
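<p>A minimal sketch of the sign-agreement gradient masking described above, in the spirit of the paper's heuristic; the full-agreement threshold and the arithmetic mean over surviving components are assumptions of this sketch, not the authors' exact algorithm:</p><pre><code>import torch

def and_mask_gradients(env_grads, threshold=1.0):
    # Keep a gradient component only where its sign agrees across environments.
    g = torch.stack(env_grads)                        # (n_envs, n_params)
    agree = g.sign().mean(dim=0).abs() >= threshold   # full agreement if threshold=1
    return g.mean(dim=0) * agree                      # masked arithmetic mean

g1 = torch.tensor([0.5, -0.2, 0.3])
g2 = torch.tensor([0.4, 0.1, 0.2])
print(and_mask_gradients([g1, g2]))                   # the disagreeing component is zeroed
</code></pre>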
<ul><li><p>Debiasing Concept-based Explanations with Causal Analysis </p><p> Mohammad Taha Bahadori, David Heckerman</p><p> <strong>One-sentence Summary:</strong> We use a technique from instrumental variables literature and remove the impact of noise and latent confounding from concept-based explanations.</p><p> <strong>Reviewers say:</strong> This work focuses on model interpretability via concept-based explanations. The authors argue that the concepts are correlated with confounding information in the features. They propose a causal graph representing the system and use an instrumental-variable approach to remove the effect of unobserved confounders. The method is evaluated on synthetic and real data.</p></li><li><p>Evaluations and Methods for Explanation through Robustness Analysis </p><p> Cheng-Yu Hsieh, Chih-Kuan Yeh, Xuanqing Liu, Pradeep Kumar Ravikumar, Seungyeon Kim, Sanjiv Kumar, Cho-Jui Hsieh</p><p> <strong>One-sentence Summary:</strong> We propose a suite of objective measurements for evaluating feature based explanations by the notion of robustness analysis; we further derive new explanation that captures different characteristics of explanation comparing to existing methods.</p><p> <strong>Reviewers say:</strong> Many interpretability techniques focus on identifying a subset of "most relevant features". This work proposes defining that set as the set of features most vulnerable to adversarial attack. First, the paper is a bit hard to read because of the distressing amount of vertical-space LaTeX hacking, to the point that the spacing between sections and paragraphs is even smaller than the normal spacing between sentences; this is not a good way to compress everything into eight pages.</p><p> That aside, the paper is very thorough in its experimental evaluation, with an appropriate set of baselines, sanity checks, and human studies. I see it as an interesting addition to the current suite of feature-attribution techniques. Conceptually it is quite similar to related techniques that try to "remove features" by adding noise to them, setting them to a baseline value, or blurring them; here the authors instead consider perturbing them adversarially, and they propose an improved greedy strategy that seems to work well. It remains somewhat unclear to me why considering adversarial perturbations is more compelling than, say, blurring or adding noise to the selected features, but they do behave slightly differently, and empirically the method gains over existing techniques under insertion and deletion metrics.</p></li><li><p>Shapley Explanation Networks </p><p> Rui Wang, Xiaoqian Wang, David I. Inouye</p><p> <strong>One-sentence Summary:</strong> To enable new capabilities, we propose to use Shapley values as inter-layer representations in deep neural networks rather than as post-hoc explanations.</p></li></ul><ul><li><p>Shape or Texture: Understanding Discriminative Features in CNNs </p><p> Md Amirul Islam, Matthew Kowal, Patrick Esser, Sen Jia, Björn Ommer, Konstantinos G. Derpanis, Neil Bruce</p><p> <strong>One-sentence Summary:</strong> Exploring and quantifying shape information encoded in CNNs.</p><p> <strong>Reviewers say:</strong></p><ol><li>The questions the authors try to answer are interesting and relevant to object recognition, texture/shape bias, and learned representations in deep neural networks.</li><li>The authors provide a nice set of controlled experiments demonstrating some of these effects, with scientific grounding in their approach (if not perfect), which is highly relevant to the field. This is not "another paper trying to overcome texture bias without any intuition, chasing better numbers (as unfortunately happens in computer vision these days)"; rather, it is about understanding the basic mechanisms of texture/shape bias that extend to the final stages of computation in the visual hierarchy, which is why I lean toward acceptance. Almost all figures in the paper are clear and help convey what the authors are trying to express (with a few points needing clarification).</li></ol></li></ul><h2 id="Autoencoder"><a href="#Autoencoder" class="headerlink" title="Autoencoder"></a>Autoencoder</h2><ul><li><p>VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models<br> Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat</p></li><li><p>Disentangled Recurrent Wasserstein Autoencoder<br> Jun Han, Martin Renqiang Min, Ligong Han, Xuan Zhang, Li Erran Li</p></li><li><p>Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders<br> Mangal Prakash, Alexander Krull, Florian Jug</p></li><li><p>Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data<br> Francesco Tonolini, Andreas Damianou, Pablo Garcia Moreno, Roderick Murray-Smith</p></li><li><p>Unsupervised Audiovisual Synthesis via Exemplar Autoencoders<br> Kangle Deng, Aayush Bansal, Deva Ramanan</p></li><li><p>Property Controllable Variational Autoencoder via Invertible Mutual Dependence<br> Xiaojie Guo, Yuanqi Du, Liang Zhao</p></li><li><p>Improving relational regularized autoencoders with spherical sliced fused Gromov Wasserstein<br> Khai Nguyen, Son Nguyen, Nhat Ho, Tung Pham, Hung Bui</p></li><li><p>Learning a Latent Search Space for Routing Problems using Variational Autoencoders<br> André Hottung, Bhanu Bhandari, Kevin Tierney</p></li><li><p>Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation<br> Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah Smith</p></li>
<li><p>Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning </p><p> Xuebo Liu, Longyue Wang, Derek F. Wong, Liang Ding, Lidia S. Chao, Zhaopeng Tu</p><p> <strong>Reviewers say:</strong> The paper introduces fine-grained layer attention to evaluate the role of individual encoder layers and to study how encoder layer fusion works, where decoder layers can access information from all encoder layers rather than only the final encoder layer as in the standard Transformer.</p><p> Based on the observations that the encoder embedding layer is critical to the success of encoder layer fusion and that the topmost decoder layers attend most to the encoder embedding layer, the paper proposes SurfaceFusion, which connects only the encoder embedding layer to the decoder's softmax layer, leading to decent gains in metrics such as BLEU.</p></li></ul><h2 id="Missing-value-amp-irregularly-sampled-time-series"><a href="#Missing-value-amp-irregularly-sampled-time-series" class="headerlink" title="Missing value & irregularly sampled time series"></a>Missing value & irregularly sampled time series</h2><ul><li><p>Multi-Time Attention Networks for Irregularly Sampled Time Series </p><p> Satya Narayan Shukla, Benjamin Marlin</p><p> <strong>One-sentence Summary:</strong> This paper presents a new deep learning architecture for learning with sparse and irregularly sampled multivariate time series.</p><p> <strong>Reviewers say:</strong> The paper addresses the analysis of irregularly sampled time series. The approach is essentially interpolation-based, so the authors can study both supervised and unsupervised problems. The architecture consists of a sinusoidal attention layer, a VAE layer that forms fixed-size landmarks in latent space, and an RNN decoder; for supervised tasks the authors add a classification loss.</p><ul><li>Impressive results on interpolation tasks and interesting results on classification tasks.</li><li>For the interpolation problem, robust baselines such as linear interpolation or AR-like models would give meaningful MSE comparisons against other approaches, even granting that the authors already compare against many recent models from the literature.</li><li>The results are impressive, but it is unclear which part of the architecture is responsible for the strong performance.</li></ul></li><li><p>not-MIWAE: Deep Generative Modelling with Missing not at Random Data </p><p> Niels Bruun Ipsen, Pierre-Alexandre Mattei, Jes Frellsen</p><p> <strong>One-sentence Summary:</strong> We present an approach for building and fitting deep latent variable models (DLVMs) in cases where the missing process is dependent on the missing data.</p><p> <strong>Reviewers say:</strong> The paper proposes a way to train deep latent variable models on data that are missing not at random, learning the model parameters with importance-weighted variational inference. Experiments on various datasets show the method is effective because it explicitly models the missing-not-at-random mechanism.</p></li></ul><h2 id="Recurrent-Neural-Network"><a href="#Recurrent-Neural-Network" class="headerlink" title="Recurrent Neural Network"></a>Recurrent Neural Network</h2>
<ul><li><p>Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies </p><p> T. Konstantin Rusch, Siddhartha Mishra</p><p> <strong>One-sentence Summary:</strong> A biologically motivated and discretized ODE based RNN for learning long-term dependencies, with rigorous bounds mitigating the exploding and vanishing gradient problem.</p></li></ul>
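<p>A minimal sketch of a coupled-oscillator recurrence in the spirit of coRNN above, assuming the discretization y_n = y_{n-1} + Δt·z_n and z_n = z_{n-1} + Δt·[tanh(W y_{n-1} + W' z_{n-1} + V u_n + b) - γ y_{n-1} - ε z_{n-1}]; all parameter values are placeholders, and this is not the authors' implementation:</p><pre><code>import numpy as np

def cornn_step(y, z, u, Wy, Wz, V, b, dt=0.01, gamma=1.0, eps=1.0):
    # One step of the assumed coupled-oscillator discretization.
    z_next = z + dt * (np.tanh(Wy @ y + Wz @ z + V @ u + b) - gamma * y - eps * z)
    y_next = y + dt * z_next
    return y_next, z_next

d, m = 8, 3
rng = np.random.default_rng(0)
Wy, Wz, V = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d, m))
b, y, z = np.zeros(d), np.zeros(d), np.zeros(d)
for u in rng.normal(size=(5, m)):   # run a few steps on random inputs
    y, z = cornn_step(y, z, u, Wy, Wz, V, b)
print(y)
</code></pre>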
<ul><li><p><strong>[Priority read]</strong> Recurrent Independent Mechanisms (Spotlight)</p><p> Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf</p><p> <strong>One-sentence Summary:</strong> Learning recurrent mechanisms which operate independently, and sparingly interact can lead to better generalization to out of distribution samples.</p><p> <strong>Reviewers say:</strong> The paper proposes a novel recurrent network, called RIM, to improve generalization and robustness to local changes. The network consists of largely independent recurrent modules that are activated sparsely and interact through soft attention. Experiments on a range of different tasks show that RIMs generalize better than LSTMs.</p></li><li><p>Disentangled Recurrent Wasserstein Autoencoder (Spotlight)</p><p> Jun Han, Martin Renqiang Min, Ligong Han, Xuan Zhang, Li Erran Li</p><p> <strong>One-sentence Summary:</strong> We propose the first recurrent Wasserstein Autoencoder for learning disentangled representations of sequential data with theoretical analysis.</p><p> <strong>Reviewers say:</strong> The paper proposes a disentangling method that learns static and dynamic latent variables of sequential data. For the learning objective, it extends the Wasserstein autoencoder to sequential data, an approach that is novel and well motivated. The aggregated posterior of the static variable arises naturally and plays an important regularizing role (which appears new for sequence data). The authors also study how to model additional categorical variables for weakly supervised learning in realistic scenarios. Graphical models clearly illustrate the main steps (generation and inference), supported by rigorous exposition. Experimental results demonstrate advantages of the method in disentanglement performance and generation quality.</p></li><li><p>The geometry of integration in text classification RNNs </p><p> Kyle Aitken, Vinay Venkatesh Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan</p><p> <strong>One-sentence Summary:</strong> We study text classification RNNs using tools from dynamical systems analysis, finding and explaining the geometry of low-dimensional attractor manifolds.</p><p> <strong>Reviewers say:</strong> Following recent studies such as Maheswaranathan et al. (2019) and Maheswaranathan & Sussillo (2020), the paper joins a line of research on the mechanisms by which recurrent networks solve supervised sequence classification. It hypothesizes and confirms that the internal hidden states of recurrent networks (whether GRU or LSTM) evolve on planar (approximate) attractors while reading inputs, which amounts to integrating evidence while processing the input sequence, and it demonstrates the existence of these attractors and the integration dynamics for three types of problems (classification, ordered classification, and multi-label classification).</p></li><li><p>Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs </p><p> Cheng Wang, Carolin Lawrence, Mathias Niepert</p><p> <strong>One-sentence Summary:</strong> A method to estimate and calibrate uncertainty in recurrent state transitions.</p><p> <strong>Reviewers say:</strong> The paper proposes a method to quantify uncertainty in RNNs, an important problem in various applications. It provides results across several domains showing that the proposed method outperforms the baselines. Beyond the baselines considered (e.g., covariance propagation, prior networks, and orthogonal certificates), however, the experiments would benefit greatly from comparisons with task-specific SOTA methods. The paper could also be improved by adding a theoretical explanation of how the Gumbel-softmax function captures the underlying data and model uncertainty.</p></li><li><p>RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs </p><p> Meng Qu, Junkun Chen, Louis-Pascal Xhonneux, Yoshua Bengio, Jian Tang</p><p> <strong>One-sentence Summary:</strong> Learn Logic Rules for Reasoning on Knowledge Graphs.</p><p> <strong>Reviewers say:</strong> In this work the authors exemplify a method for learning logic rules from a knowledge graph. Learning logic rules is more interesting than merely performing link prediction, because rules are human-readable and hence interpretable. The approach seems interesting, and the problem addressed is likely to be of broad interest. It does not appear particularly novel, but it seems valid, and the paper is well written and self-contained. Moreover, the experimental results show the method is competitive with other systems (even with systems that do not learn rules and only perform link prediction).</p></li><li><p>SkipW: Resource adaptable RNN with strict upper computational limit </p><p> Tsiry Mayet, Anne Lambert, Pascal Leguyadec, Francoise Le Bolzer, François Schnitzler</p><p> <strong>One-sentence Summary:</strong> Skip-Window is a method to allow recurrent neural networks (RNNs) to trade off accuracy for computational cost during the analysis of a sequence while keeping a strict upper computational limit.</p><p> <strong>Reviewers say:</strong> The submission provides an extension of SkipRNN (Skip-Window) that splits the input sequence into windows of length L, from which only K samples may be used; this guarantees the computational budget is never exceeded. Skip-Window realizes this inductive bias by predicting L update probabilities in parallel at the start of each window. L must be set before training, while K can be modified at test time. The model is evaluated on two tasks, a synthetic adding task and human activity recognition. The authors report latency and energy consumption on small platforms, showing the impact of this research direction on practical applications.</p></li><li><p>Multi-timescale Representation Learning in LSTM Language Models </p><p> Shivangi Mahto, Vy Ai Vo, Javier S. Turek, Alexander Huth</p><p> <strong>One-sentence Summary:</strong> This work presents a theoretically-motivated analysis of memory and timescale in LSTM language models.</p><p> <strong>Reviewers say:</strong> The paper observes that relations between words in natural language typically follow a power law, whereas gated recurrent networks such as LSTMs have a forgetting mechanism governed by exponential decay. This work engineers the LSTM's forgetting mechanism to mimic the power-law relations exhibited by natural language; the modified LSTM better models rare tokens, which often span long ranges, and hence achieves lower perplexity on infrequent words. The main contribution is a derivation showing that, after the first input token, an LSTM's forget gate undergoes exponential decay under zero input.</p><p> Experiments show that drawing the timescale T from an inverse-gamma distribution is a natural fit to language. The authors then propose a multi-scale LSTM that exploits this property: each timescale T drawn from the inverse gamma effectively becomes a forget-gate bias term that is fixed during training, and multiple draws of T together simulate the power law. The multi-scale LSTM captures the right inductive bias to perform better when modeling infrequent words, which may help retain them in memory for longer. The paper is well written, the method is clearly motivated and explained, the experiments are properly designed, and the results support the main claims well.</p></li></ul><h2 id="Clustering"><a href="#Clustering" class="headerlink" title="Clustering"></a>Clustering</h2><ul><li><p>Sparse Quantized Spectral Clustering (Spotlight)</p><p> Zhenyu Liao, Romain Couillet, Michael W. Mahoney</p><p> <strong>Reviewers say:</strong> </p><ol><li>The paper gives a nice analysis of the spectrum of matrices obtained by applying nonlinear functions to random matrices.</li><li>This is a good paper showing that one can perturb a kernel matrix (or pass it through a nonlinear transform) without significantly modifying the underlying eigenspectrum, and hence without hurting the performance of spectral clustering applied to the matrix. The most important application I can immediately see is sparsifying the kernel matrix so that it can be used efficiently in computation, or similarly, as the authors note, applying quantization and binarization.</li></ol></li></ul><ul><li><p>A Critique of Self-Expressive Deep Subspace Clustering </p><p> Benjamin David Haeffele, Chong You, Rene Vidal</p><p> <strong>One-sentence Summary:</strong> Here we show theoretically and experimentally that there are a number of flaws with many existing self-expressive deep subspace clustering models.</p><p> <strong>Reviewers say:</strong> Summary: the paper questions the significance of previous results on self-expressive deep subspace clustering (SEDSC) models, which are touted as successful extensions of linear subspace clustering (using the self-expressive property) to nonlinear data structures. The authors present a set of theoretical results showing that the standard SEDSC formulation is often ill-posed, and that even with added regularization the formulation can readily yield trivial geometries unfavorable to successful subspace clustering.</p></li></ul><ul><li><p>Intraclass clustering: an implicit learning ability that regularizes DNNs </p><p> Simon Carbonnelle, Christophe De Vleeschouwer</p><p> <strong>One-sentence Summary:</strong> This paper provides empirical evidence that deep neural networks are implicitly regularized through their ability to extract meaningful clusters among the samples of a class.</p><p> <strong>Reviewers say:</strong></p><ol><li>The paper studies the intraclass clustering ability of neural networks trained with supervision, finding that networks exhibit this ability even though the labels never explicitly enforce it, and that measures based on these criteria correlate well with generalization performance.</li><li>The authors note that classification tasks often contain intraclass groups of similar images that are not explicitly encoded in the coarse class labels, which they call intraclass clustering. They hypothesize that a DNN's ability to identify these intraclass clusters without being explicitly told may be linked to generalization. They then verify this across a range of networks, architectures, and a large number of hyperparameter configurations, establishing causal links where possible. They further show that intraclass clustering can be detected with a simple variance-based measure and that it emerges early in training.</li></ol></li><li><p>Clustering-friendly Representation Learning via Instance Discrimination and Feature Decorrelation </p><p> Yaling Tao, Kentaro Takagi, Kouta Nakata</p><p> <strong>One-sentence Summary:</strong> We present a clustering-friendly representation learning method using instance discrimination and feature decorrelation, which achieves accuracy of 81.5% and 95.4% on CIFAR-10 and ImageNet-10, respectively, far above state-of-the-art values.</p><p> <strong>Reviewers say:</strong> The authors propose an improved deep-learning-based representation learning method that yields more effective features for cluster analysis. (1) According to comparative experiments on several widely used datasets, integrating an orthogonality constraint formulated through a softmax provides more stable latent feature representations. (2) Widely used deep clustering methods alternate between optimizing the feature representation model parameters and updating the anchors provided by a clustering method such as k-means; I wonder whether the proposed method can integrate these two steps in a truly end-to-end manner. (3) I am impressed that the evaluation metrics far exceed the state of the art. Although the authors provide some distribution plots of latent features on the CIFAR-10 dataset, what about visualizations on ImageNet-10? Moreover, adding some "real" visualizations that live in the original image space rather than the latent space would help show whether the proposed method can mine visually meaningful concepts.</p></li>
<li><p>MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering </p><p> Tsung Wei Tsai, Chongxuan Li, Jun Zhu</p><p> <strong>One-sentence Summary:</strong> A principled probabilistic clustering method that exploits the discriminative representations learned by contrastive learning and the semantic structures captured by a latent mixture model in a unified framework.</p><p> <strong>Reviewers say:</strong> Summary: the authors propose a mixture-of-experts-type method for clustering in unsupervised learning. The method, called Mixture of Contrastive Experts (MiCE), uses contrastive learning as a basic module and combines it with a latent mixture model. The authors develop a scalable algorithm for MiCE and empirically evaluate the proposed image clustering method.</p></li><li><p>Deep Learning meets Projective Clustering </p><p> Alaa Maalouf, Harry Lang, Daniela Rus, Dan Feldman</p><p> <strong>One-sentence Summary:</strong> We suggest a novel technique for compressing a fully connected layer (or an embedding layer).</p><p> <strong>Reviewers say:</strong> This work proposes a new projective-clustering-based approach for compressing the embedding layer of DNNs for natural language modeling tasks. The authors show that considering a set of k subspaces rather than a single subspace improves the trade-off between compression and model accuracy. Compressing DNNs is an active research area, and the paper presents a promising approach with interesting results.</p><p> Rating: the paper offers interesting ideas for compressing embedding layers; however, since it is an empirical paper, I would expect a more comprehensive set of empirical results and better comparisons with other related methods.</p></li><li><p>Isotropy in the Contextual Embedding Space: Clusters and Manifolds </p><p> Xingyu Cai, Jiaji Huang, Yuchen Bian, Kenneth Church</p><p> <strong>One-sentence Summary:</strong> This paper reveals isotropy in the clustered contextual embedding space, and found low-dimensional manifolds in there.</p><p> <strong>Reviewers say:</strong> The authors study the token embedding spaces of various contextual embedding models for natural language. Using techniques based on nearest neighbors, clustering, and PCA, they report various results on local dimensionality, anisotropy, clustering, and manifold structure in these embedding models, results of broad interest to scientists and practitioners seeking to understand such models. These include finding (local) isotropy in the embeddings after appropriate clustering and shifting, and the emergence of clear manifold structure in GPT models.</p></li><li><p>Deep Repulsive Clustering of Ordered Data Based on Order-Identity Decomposition </p><p> Seon-Ho Lee, Chang-Su Kim</p><p> <strong>One-sentence Summary:</strong> A deep clustering algorithm for ordered data is proposed based on the order-identity decomposition.</p><p> <strong>Reviewers say:</strong> The authors describe an intuitive and effective method for prediction on ordered data, based on a clustering approach that groups the data into subsets within which items are easy to order. The paper is clearly written and clearly explains the method. It shows several examples of the method's predicted outputs and reports results on two tasks (age estimation and aesthetic score regression), achieving state-of-the-art results on age estimation and competitive results on the other task; the authors show further results on age transformation.</p></li></ul><h2 id="data-augmentation"><a href="#data-augmentation" class="headerlink" title="data augmentation"></a>data augmentation</h2><ul><li><p>Removing Undesirable Feature Contributions Using Out-of-Distribution Data </p><p> Saehyung Lee, Changhwa Park, Hyungyu Lee, Jihun Yi, Jonghyun Lee, Sungroh Yoon</p><p> <strong>One-sentence Summary:</strong> We propose a simple method, Out-of-distribution data Augmented Training (OAT), to leverage OOD data for adversarial and standard learning.</p><p> <strong>Reviewers say:</strong> The paper studies the effect of using unlabeled out-of-distribution (OOD) data during training to improve robust (and standard) accuracy. The main algorithmic contribution is a data-augmentation-based robust training loss, carefully designed to benefit from additional OOD data. Interestingly, the OOD data are given random labels for the training procedure; as the theoretical results show, feeding in OOD data this way helps remove the dependence on non-robust features and thereby improves robustness.</p><p> As all reviewers note (and I agree), the idea of using unlabeled OOD data in training is novel and interesting, and the paper also shows how to do it algorithmically. The numerical results likewise confirm the effectiveness of the proposed method.</p></li>
say:**本文研究了扩大负面实例(不仅仅是正面实例)如何改善各种表征学习任务。本文研究了许多不同的增强,并将它们应用于GAN和带有图像和视频的对比学习。</p><ul><li>优点:本文的一个主要优点是它的简单性。该方法很容易实现为几种方法,并且在本文评估的每种方法上都可获得很强的结果。这些方法基于GAN和图像和视频的对比学习进行了评估。</li><li><pre><code>尽管该方法的新颖性有限,但在建立一些理论结果以直观说明该方法为何起作用方面,本文做得很好。与缺乏直觉了解其工作原理的机器学习进步相比,本文在为这种方法提供一些解释和动机方面做得很好。</code></pre></li><li>尽管本文着重于图像和视频,但相同的思想也可以扩展到其他形式,例如文本或音频。</li><li>实验令人信服,表明了这种想法的普遍性。实验是在几个不同的数据集上进行的。实验得到理论结果的支持,从而直观地说明了该方法为何有效。引言在确定与其他数据增强方法的差异方面做得很好,尤其是通过使用负面示例。</li></ul></li><li><p>CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding </p><p> Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Weizhu Chen, Jiawei Han</p><p> <strong>Reviewers say:</strong> NLP样本的增加是一项重要任务,没有明确的“适用于所有人”机制。这与计算机视觉形成鲜明对比,在计算机视觉中,存在诸如旋转,色调修改,饱和度以及多种其他技术的技术。这项工作试图通过提出一种将多种先前已知的方法仔细合并以生成各种标签保存示例的技术来解决该问题。RoBERTa上的实验结果强调了这种数据增强方法在文本分类(GLUE)下游任务中的适用性和重要性。</p></li><li><p>MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space </p><p> Tsz-Him Cheung, Dit-Yan Yeung</p><p> <strong>Reviewers say:</strong> 本文提出了一种使用潜在嵌入空间的统一数据扩充方法-学习连续的潜在转换空间,并找到在该空间中遍历的有效方向以进行数据增强。所提出的方法结合了现有的数据增强方法,例如对抗训练,三重丢失和联合训练。本文还确定了模型性能低下的输入示例,并创建了更难的示例来帮助模型改进其性能。在与文本,表格,时间序列和图像模态相对应的多个值上进行评估,除图像数据外,其性能均优于SOTA。</p><p> 本文回应了审阅者的反馈,以提供更详细的实验和更强的基准,并进行了消融研究以显示该方法不同组成部分的有效性。通过与其他SOTA方法进行彻底的经验比较,以及使用其他损失函数(例如中心损失,大边际损失和其他对比损失)作为本文中提出的三重损失的替代方法,可以进一步改善结果。</p></li><li><p>Training GANs with Stronger Augmentations via Contrastive Discriminator </p><p> Jongheon Jeong, Jinwoo Shin</p><p> <strong>One-sentence Summary:</strong> We propose a novel discriminator of GAN showing that contrastive representation learning, e.g., SimCLR, and GAN can benefit each other when they are jointly trained. </p><p> <strong>Reviewers say:</strong> 本文旨在通过将对比性学习的原理纳入GAN鉴别器的训练中来改进生成对抗网络(GAN)的训练。与试图将GAN损失直接最小化的普通GAN不同,拟议的带有对比鉴别器(ContraD)的GAN变体使用鉴别器网络首先从给定的数据扩充和实际/生成的示例集中学习对比表示,然后训练基于学习到的对比表示的鉴别器。注意到这种混合的副作用是由于GAN训练而在对比学习中的改进。结果表明,带有对比鉴别器的GAN模型优于使用数据增强的其他技术。</p></li><li><p>Model Patching: Closing the Subgroup Performance Gap with Data Augmentation </p><p> Karan Goel, Albert Gu, Yixuan Li, Christopher Re</p><p> <strong>One-sentence Summary:</strong> We describe how to fix classifiers that fail on subgroups of a class using a combination of learned data augmentation & consistency training to achieve subgroup invariance.</p><p> <strong>Reviewers say:</strong> 本文提出了一种在分类器依赖于特定子组特征的情况下减轻图像中子组性能差距的方法。作者提出了一种数据增强方法,其中(由GANs生成的)合成示例充当所有可能子组中真实样本的实例。通过匹配原始示例和扩展示例的预测,预测模型被迫忽略鼓励不变性的亚组差异。所提出的“受控数据增强”方法(如R4所精确调用的)是相关且动机良好的,理论依据支持主要主张,并且实验结果多种多样,并证明了所提出方法的优点。正如R3正确指出的那样,“附录也非常详尽,代码组织得很好”。</p></li><li><p>SaliencyMix: A Saliency Guided Data Augmentation Strategy for Better Regularization </p><p> A F M Shahab Uddin, Mst. 
Sirazam Monira, Wheemyung Shin, TaeChoong Chung, Sung-Ho Bae</p><p> <strong>One-sentence Summary:</strong> The proposed method carefully selects a representative image patch with the help of a saliency map and mixes this indicative patch with the target image that leads the model to learn more appropriate feature representation</p><p> <strong>Reviewers say:</strong> </p><ol><li>本文提出了对数据混合的cutmix策略的一种改进,其中源补丁不是随机选择而是基于显着性选择。结果表明,在Imagenet,CIFAR 10/100上,混合和其他相关策略得到了改进,并且还可以转移到对象检测中</li><li>总结和贡献: 本文提出了一种新的数据增强策略来训练图像分类器和目标检测器。关键见解是使用图像显着性信号来指导混合图像时在何处裁剪和粘贴图像。本文包括对这种方法的设计空间的探索,以及多个实验结果,表明与现有的数据增强策略相比,该方法的经验优越性。</li></ol><p> 启示: 这篇论文很有趣,因为它提供了一个新的技巧,既易于理解,可以辩驳,又(现在)具有良好的经验支持(用于分类,检测和对抗攻击的鲁棒性)。</p><p> 创意:有限。尽管以前的工作都没有提供此处介绍的实验结果,但结果是可以预期的。这项工作是良好的A + B增量工作。</p></li><li><p>On Graph Neural Networks versus Graph-Augmented MLPs </p><p> Zhengdao Chen, Lei Chen, Joan Bruna</p><p> <strong>One-sentence Summary:</strong> We establish a separation in representation power between GNNs and Graph-Augmented MLPs.</p><p> <strong>Reviewers say:</strong> 本文研究了图神经网络(GNN)的一种变体,即图增强MLP(GA-MLP)。与在GNN中,节点将消息发送到邻居并通过非线性MLP聚合接收到的消息不同,GA-MLP依赖于一次计算的单个增强嵌入,然后将MLP应用于新的嵌入。可以通过对输入表示应用形式为A,A ^ 2,…,A ^ k的线性变换来获得增强嵌入,从而捕获更大的邻域。本文的主要目的是证明与GNN相比,使用GA-MLP解决图形问题时的根本缺陷。沿着这些思路,主要结果可以描述如下:1)本文确定了识别非同构图的特定实例,可以通过GNN而不是GA-MLP框架来解决。2)本文对GNN与图增强MLP的表示能力进行了实验和实验评估,并根据根图上的节点级函数显示了两者之间的表达能力分离。具体来说,他们表明,可以用一定深度的GNN表示的一组函数在k中呈指数增长,而在考虑类似的GA-MLP体系结构时,函数类仅呈指数增长。他们还根据经验评估了两种模型在社区检测和步行问题计数方面的性能差异。</p></li><li><p>Explaining the Efficacy of Counterfactually Augmented Data </p><p> Divyansh Kaushik, Amrith Setlur, Eduard H Hovy, Zachary Chase Lipton</p><p> <strong>One-sentence Summary:</strong> We present a framework for thinking about counterfactually augmented data and make strides towards understanding its benefits in out-of-domain generalization.</p><p> Reviewers says: 本文研究了反事实扩充的数据对域外泛化的影响。本文从具有高斯线性模型的结构因果模型的玩具示例开始,其中将噪声添加到因果或非因果特征上。休闲特征上的噪声增加会影响最小二乘估计,而非因果特征上的噪声却不会影响最小二乘估计。本文在这种情况和反事实文本编辑之间作了一个类比,其中假定跨度(合理值)被认为是因果关系。提出了一个假设,即在基本原理(因果关系特征)上添加噪声会导致模型依赖非理性因素(非因果关系特征)并导致较差的样本外性能,而在非理性因素中添加噪声则会导致更糟的结果。样本内性能,但更好的样本外性能。对情绪和自然语言推理(NLI)数据集的实验大多证实了这一假设,但有一些例外需要讨论。实验包括三种识别原理的方法(人工编辑,人工识别的跨度和通过自我注意识别的跨度)。在有理或无理的情况下对模型进行有无噪声训练(用随机令牌代替真实令牌)。</p></li><li><p>Reweighting Augmented Samples by Minimizing the Maximal Expected Loss </p><p> Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma</p><p> <strong>One-sentence Summary:</strong> a new reweighting strategy on augmented samples</p><p> <strong>Reviewers say:</strong> 本文提出了一种新颖的数据扩充方法。特别地,它提出了重新加权损失函数,该函数允许找到扩增样本的最佳加权。该方法在标准图像和语言任务上进行了测试,并与多种替代方法进行了比较。</p></li><li><p>Simple Augmentation Goes a Long Way: ADRL for DNN Quantization </p><p> Lin Ning, Guoyang Chen, Weifeng Zhang, Xipeng Shen</p><p> <strong>One-sentence Summary:</strong> Augments the neural networks in Deep Reinforcement Learning(DRL) with a complementary scheme to boost the performance of learning and solve the common low convergence problem in the early stage of DRL</p></li><li><p>On Data-Augmentation and Consistency-Based Semi-Supervised Learning </p><p> Atin Ghosh, Alexandre H. 
Thiery</p><p> <strong>One-sentence Summary:</strong> We propose a simple and natural framework leveraging the Hidden Manifold Model to study modern SSL methods.</p><p> <strong>Reviewers say:</strong> 本文提供了一些有关在基于一致性正则化的半监督学习中使用数据增强的理论观点。本文使用的框架认为,高质量的数据增强应该沿着数据流形移动。这种通用视图允许将论文的思想应用于数据集(与现有技术的半监督学习算法中使用的图像特定数据增强相反)。我不知道有任何其他工作可以提出这些观点,并且事实上,本文的意义在于,它为最有效的半监督学习方法提供了新的且可能有用的观点。审稿人认为该文件清晰实用。主要关注的是,该论文仅包括玩具环境中的实验。确实,</p></li></ul><ul><li><p>Tradeoffs in Data Augmentation: An Empirical Study </p><p> Raphael Gontijo-Lopes, Sylvia Smullin, Ekin Dogus Cubuk, Ethan Dyer</p><p> <strong>One-sentence Summary:</strong> We quantify mechanisms of how data augmentation works with two metrics we introduce: Affinity and Diversity.</p><p> <strong>Reviewers say:</strong> 本文研究了数据扩充问题,即通过修改现有的数据来获得新的训练示例。数据扩充在机器学习和人工智能中很流行,因为它增加了训练示例的数量。但是,其对模型性能的影响在实践中仍然未知。增强运算符(例如图像旋转)可能是有帮助的,也可能是有害的。本文介绍了两个新的指标,称为亲和力和多样性,以量化任何给定的扩增算子的效果。作者发现,具有高亲和力分数和高多样性分数的运算符可带来最佳性能改进。</p><p> 长处</p><pre><code> - 引入的措施同时考虑了数据和模型,这在现代深度学习中是不可分割的。 - 这些措施可以直观地解释为什么某些数据增强方法有效而另一些无效的原因。 - 亲和力很容易计算。为了获得亲和力,需要一个在干净数据上训练的模型,并使用具有给定扩充的验证集。可以重用训练后的模型来衡量其他增强。 - 实验是广泛的。</code></pre></li><li><p>Combining Ensembles and Data Augmentation Can Harm Your Calibration </p><p> Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran</p><p> <strong>One-sentence Summary:</strong> We found that combining ensembles and data augmentation worsens calibration than applying them individually, and we proposed a simple fix to it.</p><p> <strong>Reviewers say:</strong> 这项工作分析了数据增强策略(例如MixUp)与模型集成之间在校准性能方面的交互作用。作者指出,将单个模型组合在一起时,诸如混合和标签平滑之类的策略如何减少单个模型的过分自信,从而导致校准性能下降。具体来说,所有技术都是单独采取的,通过减少过度自信来改善校准。但是,结合起来,它们会导致模型置信度不足,因此校准效果会更差。基于此分析,作者提供了一种简单的技术,可在CIFAR-10,CIFAR-10-C,CIFAR-100和CIFAR-100-C和ImageNet上产生SOTA校准性能。作者建议根据模型在特定类别上是否过度自信来动态启用和禁用MixUp,</p><p> 我认为这项工作提供了有用的见解以及简单有效的解决方案。另外,它写得很清楚,读起来非常容易和愉快。</p></li><li><p>GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing </p><p> Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, bailin wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, richard socher, Caiming Xiong</p><p> <strong>One-sentence Summary:</strong> Language model pre-training for table semantic parsing.</p><p> <strong>Reviewers say:</strong> 本文提出了一种用于语义解析的预训练技术,重点是语义解析以及使其实际工作所需的技术细节。总体而言,所有审阅者都认为结果非常好,并且您看到跨多个文本到SQL数据集的良好改进。对于(a)创建用于生成综合数据的SCFG的难度提出了一些保留,但是作者似乎已经适当地解决了这一问题,并且需要付出合理的努力。(b)预训练任务是如何针对特定任务(文本到SQL)和数据集(蜘蛛)量身定制的。总体而言,我倾向于同意以下事实:由于语法是从蜘蛛衍生而来的,因此对蜘蛛的改进并不那么引人注目,但是作者正确地声称,即使收益略微减少,其他数据集的持续改进也是显而易见的。可以希望这个想法也可以推广到其他可以生成合成数据的设置,并且应该将合成数据与实际数据结合起来的详细信息很有用。</p></li></ul><h2 id="About-distribution"><a href="#About-distribution" class="headerlink" title="About distribution"></a>About distribution</h2><ul><li><p>Free Lunch for Few-shot Learning: Distribution Calibration<br> Shuo Yang, Lu Liu, Min Xu</p></li><li><p>Improved Autoregressive Modeling with Distribution Smoothing<br> Chenlin Meng, Jiaming Song, Yang Song, Shengjia Zhao, Stefano Ermon</p></li><li><p>Long-tailed Recognition by Routing Diverse Distribution-Aware Experts<br> Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu, Stella Yu</p></li></ul><ul><li><p>Meta-Learning of Structured Task Distributions in Humans and Machines<br> Sreejan Kumar, Ishita Dasgupta, Jonathan Cohen, Nathaniel Daw, Thomas Griffiths</p></li><li><p>Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization<br> Chin-Wei Huang, Ricky T. Q. Chen, Christos Tsirigotis, Aaron Courville</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Related Papers in WWW 2021 (2021.04.19)</title>
<link href="/uncategorized/paperlistfile/WWW2021/"/>
<url>/uncategorized/paperlistfile/WWW2021/</url>
<content type="html"><![CDATA[<p><a href="https://www2021.thewebconf.org/program/papers/">paper list</a></p><span id="more"></span><h2 id="Anomaly-detection-Failure-detection"><a href="#Anomaly-detection-Failure-detection" class="headerlink" title="Anomaly detection / Failure detection"></a>Anomaly detection / Failure detection</h2><ul><li><p>Few-shot Network Anomaly Detection via Cross-network Meta-learning</p><p><strong>Authors:</strong> Kaize Ding (ASU), Qinghai Zhou (UIUC), Hanghang Tong (UIUC) and Huan Liu (ASU).</p><p><strong>Abstract:</strong> Network anomaly detection aims to find network elements (e.g., nodes, edges, subgraphs) with significantly different behaviors from the vast majority. It has a profound impact in a variety of applications ranging from finance, healthcare to social network analysis. Due to the unbearable labeling cost, existing methods are predominately developed in an unsupervised manner. Nonetheless, the anomalies they identify may turn out to be data noises or uninteresting data instances due to the lack of prior knowledge on the anomalies of interest. Hence, it is critical to investigate and develop few-shot learning for network anomaly detection. In real-world scenarios, few labeled anomalies are also easy to be accessed on similar networks from the same domain as the target network, while most of the existing works omit to leverage them and merely focus on a single network. Taking advantage of this potential, in this work, we tackle the problem of few-shot network anomaly detection by (1) proposing a new family of graph neural networks – Graph Deviation Networks (GDN) that can leverage a small number of labeled anomalies for enforcing statistically significant deviations between abnormal and normal nodes on a network; (2) equipping the proposed GDN with a new cross- network meta-learning algorithm to realize few-shot network anomaly detection by transferring meta-knowledge from multiple auxiliary networks. Extensive experimental evaluations demonstrate the<br>efficacy of the proposed approach on few-shot or even one-shot network anomaly detection.</p></li><li><p>MSTREAM: Fast Anomaly Detection in Multi-Aspect Streams</p><p><strong>Authors:</strong> Siddharth Bhatia (National University of Singapore), Arjit Jain (IIT Bombay), Pan Li (Purdue University), Ritesh Kumar (IIT Kanpur) and Bryan Hooi (National University of Singapore).</p><p><strong>Abstract:</strong> Given a stream of entries in a multi-aspect data setting i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner? For example, in the intrusion detection setting, existing work seeks to detect anomalous events or edges in dynamic graph streams, but this does not allow us to take into account additional attributes of each entry. Our work aims to define a streaming multi-aspect data anomaly detection framework, termed MSTREAM which can detect unusual group anomalies as they occur, in a dynamic manner. MSTREAM has the following properties: (a) it detects anomalies in multi-aspect data including both categorical and numeric attributes; (b) it is online, thus processing each record in constant time and constant memory; (c) it can capture the correlation between multiple aspects of the data. 
MSTREAM is evaluated over the KDDCUP99, CICIDS-DoS, UNSW-NB 15 and CICIDS-DDoS datasets, and outperforms state-of-the-art baselines.</p></li><li><p>SDFVAE: Static and Dynamic Factorized VAE for Anomaly Detection of Multivariate CDN KPIs</p><p><strong>Authors:</strong> Liang Dai (Institute of Information Engineering, Chinese Academy of Sciences), Tao Lin (Communication University of China),<br>Chang Liu (Institute of Information Engineering, Chinese Academy of Science & University of Chinese Academy of Sciences),<br>Bo Jiang (School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University),<br>Yanwei Liu (Institute of Information Engineering, Chinese Academy of Sciences), Zhen Xu (INSTITUTE OF INFORMATION ENGINEERING,CAS) and<br>Zhi-Li Zhang (University of Minnesota).</p><p><strong>Abstract:</strong> Content Delivery Networks (CDNs) are critical for providing good user experience of cloud services. CDN providers typically collect various multivariate Key Performance Indicators (KPIs) time series to monitor and diagnose system performance. State-of-the-art anomaly detection methods mostly use deep learning to extract the normal patterns of data, due to its superior performance. However, KPI data usually exhibit non-additive Gaussian noise, which makes it difficult for deep learning models to learn the normal patterns, resulting in degraded performance in anomaly detection. In this paper, we propose a robust and noise-resilient anomaly detection mechanism using multivariate KPIs. Our key insight is that different KPIs are constrained by certain time-invariant characteristics of the underlying system, and that explicitly modelling such invariance may help resist noise in the data. We thus propose a novel anomaly detection method called SDFVAE, short for Static and Dynamic Factorized VAE, that learns the representations of KPIs by explicitly factorizing the latent variables into dynamic and static parts. Extensive experiments using real-world data show that SDFVAE achieves a F1-score ranging from 0.92 to 0.99 on both regular and noisy dataset, outperforming state-of-the-art methods by a large margin.</p></li><li><p>NTAM: Neighborhood-Temporal Attention Model for Disk Failure Prediction in Cloud Platforms</p><p><strong>Authors:</strong> Chuan Luo (Microsoft Research), Pu Zhao (Microsoft Research), Bo Qiao (Microsoft Research), Youjiang Wu (Microsoft Azure),<br>Hongyu Zhang (The University of Newcastle), Wei Wu (Leibniz University Hannover), Weihai Lu (Microsoft Research), Yingnong Dang (Microsoft Azure),<br>Saravanakumar Rajmohan (Microsoft Office), Qingwei Lin (Microsoft Research) and Dongmei Zhang (Microsoft Research).</p><p><strong>Abstract:</strong> With the rapid deployment of cloud platforms, high service reliability is of critical importance. An industrial cloud platform contains a huge number of disks and disk failure is a common cause of service unreliability. In recent years, many machine learning based disk failure prediction approaches have been proposed, which can predict disk failures based on disk status data before the failures actually happen. In this way, proactive actions can be taken in advance to improve service reliability. However, existing approaches treat each disk individually and do not explore the influence of the neighboring disks. In this paper, we propose Neighborhood-Temporal Attention Model (NTAM), a novel deep learning based approach to disk failure prediction. 
When predicting whether or not a disk will fail in near future, NTAM is a novel approach that not only utilizes a disk’s own status data, but also considers its neighbors’ status data. Moreover, NTAM includes a novel attention-based temporal component to capture the temporal nature of the disk status data. Besides, we propose a data enhancement method, called Temporal Progressive Sampling (TPS), to handle the extreme data imbalance issue. We evaluate NTAM on a public dataset as well as two industrial datasets collected from around 9 million of disks in an industrial public cloud platform. Our experimental results show that NTAM significantly outperform state- of-the-art competitors. Also, our empirical evaluations indicate the effectiveness of the neighborhood-ware component and the temporal component underlying NTAM and the effectiveness of TPS. More encouragingly, we have successfully applied NTAM and TPS to Company M’s public cloud platform and obtained practical benefits in industrial practice.</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Network of Tensor Time Series</p><p><strong>Authors:</strong> Baoyu Jing (University of Illinois at Urbana-Champaign), Hanghang Tong (University of Illinois at Urbana-Champaign) and Yada Zhu (IBM Thomas J. Watson Research Center).</p><p><strong>Abstract:</strong> Co-evolving time series appears in a multitude of applications such as environmental monitoring, financial analysis, and smart transportation. This paper aims to address the following three challenges, including (C1) how to effectively model its multi-mode tensor structure at each time step; (C2) how to incorporate explicit relationship networks of the time series; (C3) how to model the implicit relationship of the temporal dynamics. We propose a novel model called Network of Tensor Time Series, which is comprised of two modules, including Tensor Graph Convolutional Network (TGCN) and Tensor Recurrent Neural Network (TRNN). TGCN tackles the first two challenges by generalizing Graph Convolutional Network (GCN) for flat graphs to tensor graphs, which captures the synergy between multiple graphs associated with the tensors. TRNN leverages tensor decomposition to balance the trade-off between the commonality and specificity of the co-evolving time series. The experimental results on five real-world datasets demonstrate the efficacy of the proposed method.</p></li><li><p>Radflow: A Recurrent, Aggregated, and Decomposable Model for Networks of Time Series</p><p><strong>Authors:</strong> Alasdair Tran (Australian National University), Alexander Mathews (Australian National University), Cheng Soon Ong (CSIRO) and Lexing Xie (Australian National University).</p><p><strong>Abstract:</strong> We propose a new model for networks of time series that influence each other. Graph structures among time series is found in diverse domains, such as web traffic influenced by hyperlinks, product sales influenced by recommendation, or urban transport volume influenced by the road network and weather. There has been recent progress in modeling graphs and time series, respectively, but an expressive and scalable approach for a network of series does not yet exist. We introduce Radflow, a novel model that embodies three main ideas: the recurrent structure of LSTM to obtain time- dependent node embeddings, aggregation of the flow of influence from other nodes with multi-head attention, and multi-layer decomposition of time series. 
Radflow naturally takes into account dynamic networks where nodes and edges appear over time, and it can be used for prediction and data imputation tasks. On four real-world datasets ranging from a few hundred to a few hundred thousand nodes, we observe Radflow variants being the best performing model across all tasks. We also report that the recurrent component in Radflow consistently outperforms N-BEATS, the state-of-the-art time series model. We show that Radflow can learn different trends and seasonal patterns, that it is robust to missing nodes and edges, and that correlated temporal patterns among network neighbors reflect influence strength. We curate WikiTraffic, the largest dynamic network of time series with 360K nodes and 22M time-dependent links spanning five years—this dataset provides an open benchmark for developing models in this area, and prototyping applications for problems such as estimating web resources and optimizing collaborative infrastructures. More broadly, Radflow can be used to improve the forecasts in correlated time series networks such as the stock market, or impute missing measurements of natural phenomena such as geographically dispersed networks of waterbodies.</p></li></ul><h2 id="Micro-service-cloud-native"><a href="#Micro-service-cloud-native" class="headerlink" title="Micro-service / cloud native"></a>Micro-service / cloud native</h2><ul><li><p>MicroRank: End-to-End Latency Issue Localization with Extended Spectrum Analysis in Microservice Environments</p><p><strong>Authors:</strong> Guangba Yu (Sun Yat-Sen University), Pengfei Chen (Sun Yat-sen University), Hongyang Chen (Sun Yat-sen University), Zijie Guan (Tencent),<br>Zicheng Huang (Sun Yat-sen University), Linxiao Jing (Sun Yat-sen University), Tianjun Weng (Sun Yat-Sen University), Xinmeng Sun (Sun Yat-Sen University)<br>and Xiaoyun Li (Sun Yat-sen University).</p><p><strong>Abstract:</strong> With the advantages of strong scalability and fast delivery, microser-vice has become a popular software architecture in the modern ITindustry. Most of microservice faults manifest themselves in terms of service latency increase and impact user experience. The explosion in the number of service instances and complex dependencies among them make the application diagnosis extremely challenging.To help understand and troubleshoot a microservice system,the end-to-end tracing technology has been widely applied to capturethe execution path of each request. However, the tracing data are not fully leveraged by cloud and application providers when con-ducting latency issue localization in the microservice environment.This paper proposes a novel system ,named MicroRank, which analyzes clues provided by normal and abnormal traces to locateroot causes of latency issues. Once a latency issue is detected by the Anomaly Detector in MicroRank, the cause localization procedure is triggered. MicroRank first distinguishs which traces are abnormal. Then, MicroRank’s PageRank Scorer module uses the abnormal and normal trace information as its input and differentials the importance of different traces to extended spectrum techniques . Finally, the spectrum techniques can calculate the ranking list based on the weighted spectrum information from PageRank Scorer to locate root causes more effectively. The experimental evaluationson a widely-used open-source system and a production system show that MicroRank achieves excellent results not only in one root cause situation but also in two issues that happen at the same time. 
Moreover,MicroRank makes 6% to 22% improvement in recall in localizing root causes compared to current state-of-the- art methods.</p></li></ul><h2 id="Augmentation"><a href="#Augmentation" class="headerlink" title="Augmentation"></a>Augmentation</h2><ul><li><p>Graph Contrastive Learning with Adaptive Augmentation</p><p><strong>Authors:</strong> Yanqiao Zhu (Institute of Automation, Chinese Academy of Sciences), Yichen Xu (Beijing University of Posts and Telecommunications),<br>Feng Yu (Alibaba), Qiang Liu (RealAI and Tsinghua University), Shu Wu (Institute of Automation, Chinese Academy of Sciences) and<br>Liang Wang (Institute of Automation, Chinese Academy of Sciences).</p><p><strong>Abstract:</strong> Recently, contrastive learning (CL) has emerged as a successful method for unsupervised graph representation learning. Most graph CL methods first perform stochastic augmentation on the input graph to obtain two graph views and maximize the agreement of representations in the two views. Despite the prosperous development of graph CL methods, the design of graph augmentation schemes—a crucial component in CL—remains rarely explored. We argue that the data augmentation schemes should preserve intrinsic structural and attribute information of graphs, which will force the model to learn representations that are insensitive to perturbation on unimportant nodes and edges. However, most existing methods adopt uniform data augmentation schemes, like uniformly dropping edges and uniformly shuffling features, leading to suboptimal performance. In this paper, we propose a novel graph contrastive representation learning method with adaptive augmentation that incorporates various priors for topological and semantic aspects of the graph. Specifically, on the topology level, we design augmentation schemes based on node centrality measures to highlight important connective structures. On the node attribute level, we corrupt node features by adding more noise to unimportant node features, to enforce the model to recognize underlying semantic information. We perform extensive experiments of node classification on a variety of real-world datasets. Experimental results demonstrate that our proposed method consistently outperforms existing state-of-the-art methods and even surpasses some supervised counterparts, which validates the effectiveness of the proposed contrastive framework with adaptive augmentation.</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>微服务框架下的多源异构数据异常检测 - 调研</title>
<link href="/uncategorized/surveys/aiops_micro-service/"/>
<url>/uncategorized/surveys/aiops_micro-service/</url>
<content type="html"><![CDATA[<p>2021-04-13: 整理了微服务框架下,做异常检测的大体思路,分别调研“多源异构数据融合”,以及分别基于“时间序列”“日志”“调用链”数据的异常检测方法。</p><p>2022-04-18: 补充了“异常根因分析”相关的工作,这些工作大多是基于图的,可以为4G LTE种的根因分析提供思路。</p><span id="more"></span><p><a href="https://netman.aiops.org/publications/">裴丹团队的主页</a></p><p><a href="https://aiopsworkshop.github.io/accepted_papers/index.html">2020 AIOPS workshop</a></p><p><a href="https://github.com/FudanSELab/train-ticket/">Train Ticket – Test Bed</a></p><p><a href="https://microservices-demo.github.io/">Sock shop – Test bed</a></p><h2 id="一些笔记"><a href="#一些笔记" class="headerlink" title="一些笔记"></a>一些笔记</h2><p>在相关工作<a href="#lundetecting2021">[1]</a>中:</p><ol><li><p>所要检测的异常如何定义,是一个关键的问题。本文提到”error messages”, “performance degradations”与”trace structure and response time anomalies”本身就是几种不同的异常类型,融合多种数据将有助于我们探测更多种类的异常。</p></li><li><p>本文所建立的调用树中包含了两种调用关系——本地调用(两个微服务位于同一个主机,彼此之间进行调用)、远程调用(两个微服务不在同一个主机,远程调用),但好像在分析的时候没什么区别。</p></li></ol><p>在相关工作<a href="#multimodalsasho2019">[2]</a>中:</p><ol><li><p>trace数据具有多模态,即meta action之间的结构(因果关系模态,或sequential nature模态)、服务响应时间(实值序列)。</p></li><li><p>trace数据的一个有趣的特性在于,它是一种能够良好的反映服务层状态的数据,同时还包含有大量的底层信息。</p></li><li><p>tree结构的调用在实际调用的时候,存在高并发的特性,即:两个被同一组件同时调用的服务,在响应的时候可能出现先后的差异,这在使用序列作为trace表征方式中是非常常见的,但这种差异不代表异常。</p></li></ol><h2 id="相关论文"><a href="#相关论文" class="headerlink" title="相关论文"></a>相关论文</h2><h4 id="根因分析"><a href="#根因分析" class="headerlink" title="根因分析"></a>根因分析</h4><h5 id="基于图的方法"><a href="#基于图的方法" class="headerlink" title="基于图的方法"></a>基于图的方法</h5><ol><li><p><strong>Graph-based root cause analysis for service-oriented and microservice architectures</strong></p><p> ÁlvaroBrandó, Marc Solé, Alberto Huélamo, David Solans, María S.Pérez, VictorMuntés-Mulero</p><p> Journal of Systems and Software, Volume 159, January 2020, 110432</p></li><li><p><strong>GRANO: interactive graph-based root cause analysis for cloud-native distributed data platform</strong></p><p> Hanzhang Wang, Phuong Nguyen, Jun Li, Selcuk Kopru, Gene Zhang, Sanjeev Katariya, Sami Ben-Romdhane</p><p> Proceedings of the VLDB EndowmentVolume 12Issue 12August 2019</p></li></ol><p>我们通过提供系统组件拓扑、警报和应用程序事件的整体视图,展示了 Grano,这是一个用于云原生分布式数据平台的端到端异常检测和根本原因分析(或简称 RCA)系统。Grano 提供: 一个<em>检测层</em>,用于处理大量时间序列监控数据,以检测逻辑和物理系统组件的异常情况;<em>Anomaly Graph Layer</em>具有新颖的图形建模和算法,用于利用系统拓扑数据和检测结果来识别系统组件级别的根本原因相关性;和<em>应用层</em>自动通知待命人员,并通过交互式图形界面提供实时和按需 RCA 支持。该系统使用 eBay 的生产数据进行部署和评估,以帮助值班人员将根本原因的识别时间从几小时缩短到几分钟。</p><ol start="3"><li><p><strong>A Causality Mining and Knowledge Graph Based Method of Root Cause Diagnosis for Performance Anomaly in Cloud Application</strong>s</p><p> Juan Qiu, Qingfeng Du, Kanglin Yin, Shuang-Li Zhang, and Chongshu Qian</p><p> Applied Sciences, Volume 10, Issue 6</p></li></ol><p>随着云计算技术的发展,微服务架构(MSA)已经成为云原生应用中流行的应用架构。很多面向用户的服务由很多微服务支持,服务之间的依赖比传统的单体架构应用更加复杂。在这种情况下,如果一个微服务的性能指标发生异常变化,就会导致其他相关服务降级甚至出现故障,这可能会给依赖的业务带来很大的损失。因此,在云应用的运维工作中,挖掘问题的因果关系,尽快找到根源至关重要。在本文中,我们提出了一种挖掘因果关系和诊断根本原因的方法,该方法使用知识图谱技术和因果搜索算法。我们在一个经典的云原生应用上验证了所提出的方法,发现该方法是有效的。将我们的方法应用于云原生应用程序的大部分服务后,准确率和召回率均超过 80%。</p><ol start="4"><li><p><strong>Groot: An Event-graph-based Approach for Root Cause Analysis in Industrial Settings</strong></p><p> Hanzhang Wang; Zhengkai Wu; Huai Jiang; Yichao Huang; Jiamu Wang; Selcuk Kopru; Tao Xie</p><p> 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)</p></li></ol><p>对于大型分布式系统,有效诊断事件的根本原因以保持系统的高可用性至关重要。微服务架构的最新发展为工业环境中的根本原因分析 (RCA) 带来了三大挑战(即操作的复杂性、系统规模和监控)。为了应对这些挑战,在本文中,我们提出了 Groot,一种基于事件图的 RCA 方法。 Groot 基于事件构建实时因果关系图,这些事件汇总了被分析系统中的各种类型的指标、日志和活动。此外,为了整合来自站点可靠性工程 (SRE) 
工程师的领域知识,Groot 可以使用用户定义的事件和特定领域的规则进行定制。目前,Groot 在 5,000 项实际生产服务中支持 RCA,并被 eBay 的 SRE 团队积极使用,eBay 是一个全球电子商务系统,每年为超过 1.59 亿活跃买家提供服务。在 15 个月内,我们收集了一个数据集,其中包含 952 个实际生产事件的标记根本原因以进行评估。评估结果表明,Groot 能够达到 95% 的 top-3 准确率和 78% 的 top-1 准确率。为了分享我们在工业环境中部署和采用 RCA 的经验,我们进行了一项调查以表明 Groot 的用户发现它有用且易于使用。我们还分享了部署和采用 Groot 解决生产环境中的 RCA 问题的经验教训。</p><ol start="5"><li><p><strong>Graph Based Root Cause Analysis in Cloud Data Center</strong></p><p> Divyaansh Dandona; Mevlut Demir; John J. Prevost</p><p> 2020 IEEE 15th International Conference of System of Systems Engineering (SoSE)</p></li></ol><p>低成本计算的吸引力和云技术的按需扩展已导致许多软件应用程序迁移到云。这种对云的依赖增加转化为对云数据中心的直接依赖,这些云数据中心形成了现代云。这些数据中心是由许多系统组成的复杂建筑物,这些系统相互交互以托管最终应用程序。在这个系统系统中检测异常事件,然后及时确定其根本原因是一项艰巨的任务。在本文中,我们提出了一个图形模型来封装系统的云数据中心系统,并分享一种减少根本原因分析的搜索空间的方法。</p><h3 id="关于多源数据融合的文章"><a href="#关于多源数据融合的文章" class="headerlink" title="关于多源数据融合的文章"></a>关于多源数据融合的文章</h3><ol><li><strong>Multi-Source Anomaly Detection in Distributed IT Systems</strong></li></ol><p>Jasmin Bogatinovski, Sasho Nedelkoski</p><p>知乎解析:<a href="https://zhuanlan.zhihu.com/p/347051870">link</a></p><p>2020 AIOPS workshop</p><ul><li>本工作联合考虑了日志数据与trace数据,开发了一种span2vec的方法,用于将trace数据像log数据一样表示为一系列的模板数据,进一步的便于与日志数据进行融合。</li><li>异常检测任务在本文中被转化为了一个“下一步模板预测”的<strong>有监督任务</strong>,可以分别对下一步可能出现的日志与模板进行预测,则偏离预测的trace/log模板即为异常。</li><li>在本工作中,log与trace数据提供了不同角度的系统状态信息——日志数据可以在服务级别上有更丰富的描述,可以被视为服务运行的的指纹。Trace数据中则没有太多上述系统级的信息,但包含执行一次用户清请求的总流程图。但是从结果上来看,日志数据的加入,提升了预测trace异常的recall,但反之,trace数据的的加入并没有显著提升log数据的异常检测性能。可能的原因在于log的数据粒度比较大。</li></ul><blockquote><p>One explanation of this behaviour is that the granularity of the information from the logs is truncated on the level of the data source with a lower frequency of generation – the trace is harder for the information in the trace to be transferred to the logs. The information that the multimodal method is receiving from the logs when it is aiming to predict the next relevant span complements the information as obtained just from the sequence of spans individually.</p></blockquote><p><br>2. 
<strong>Multi-source Distributed System Data for AI-Powered Analytics</strong></p><pre><code>Sasho Nedelkoski, Jasmin Bogatinovski, Ajay Kumar Mandapati, Soeren Becker, Jorge Cardoso, Odej KaoEuropean Conference on Service-Oriented and Cloud Computing, ESOCC 2020: Service-Oriented and Cloud Computing开发了一个用于捕捉多源(三种)数据的一个分析平台,对这个test bed做了描述</code></pre><ol start="3"><li><p><strong>An Intelligent Anomaly Detection Scheme for Micro-Services Architectures With Temporal and Spatial Data Analysis</strong></p><p> Yuan Zuo; Yulei Wu; Geyong Min; Chengqiang Huang; Ke Pei</p><p> IEEE Transactions on Cognitive Communications and Networking ( Volume: 6, Issue: 2, June 2020)</p><p> 联合使用日志(时间数据)和调用链数据query trace(空间数据)做特征提取与融合,并构建one-class classifier</p></li></ol><h3 id="微服务架构下的异常检测"><a href="#微服务架构下的异常检测" class="headerlink" title="微服务架构下的异常检测"></a>微服务架构下的异常检测</h3><h4 id="基于调用链数据"><a href="#基于调用链数据" class="headerlink" title="基于调用链数据"></a>基于调用链数据</h4><ol><li><p><strong>Observing and Controlling Performance in Microservices</strong></p><p> 学术毕业论文:André Pascoal Bento</p><p> 讲使用trace数据,建模系统中微服务之间的依赖关系,并建立为图模型,然后计算出每个依赖关系之间的应答响应时间</p></li><li><p><strong>Unsupervised Detection of Microservice Trace Anomalies through Service-Level Deep Bayesian Networks</strong></p><p> Ping Liu; Haowen Xu; Qianyu Ouyang; Rui Jiao; Zhekang Chen; Shenglin Zhang; …</p><p> 2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE)</p><p> <strong>基于微服务trace数据,检测意外的调用关系或者意外的调用响应时间</strong></p></li></ol><ul><li><p>本文与其他工作有显著的不同,别的工作的异常检测对象一般都是针对单个的调用meta action,而本工作是针对一整条tree状的trace。</p></li><li><p>本文使用精心设计的手工特征来表征trace数据,向量化的trace数据用一个容量很大的概率模型来学习他们的正常模式。其中之所以要用到容量大的模型,是因为要学习的模式是一整个application的trace,而非一个trace。</p></li><li><p>手工设计的trace数据使得异常检测后的结果可以快速定位异常类型与异常根因,自带可解释性。</p></li></ul><ol start="3"><li> <a name="multimodalsasho2019"><sup>[2]</sup></a> <strong>Anomaly Detection from System Tracing Data Using Multimodal Deep Learning</strong></li></ol><p>Sasho Nedelkoski; Jorge Cardoso; Odej Kao</p><p>2019 IEEE 12th International Conference on Cloud Computing (CLOUD)</p><p>本文将trace数据视为了<strong>多模态数据</strong>,第一种模态为由事件序列组成的类似于NLP的序列数据(其实与日志模板序列是相似的),第二种模态是每个事件对应的响应时间。</p><ul><li><p>第一步将trace数据异常检测视为了单模态数据的异常检测。其本质是,将上述两种“文本序列”“实值序列”送进LSTM网络中,进行预测,当真实值不在TopK label(文本序列预测)或不在95%置信区间(实值序列预测)时,即为异常。</p></li><li><p>本文提到的多模态融合,其具体方案是,两种数据均通过单层的LSTM网络,然后在第二个隐层中进行融合(concat)。</p></li></ul><p>本文的最大缺陷在于:</p><ul><li>虽然理论上event间的调用因果关系被视为了文本序列中的语法,交由LSTM学习,但是这种拓扑关系本身是已知的,而没有被利用,LSTM学习到的关系是什么,也是不可解释的</li><li>本文实验中的异常全部都是人工生成的,而且有对照网络输出构造异常样本的嫌疑,实验结果存疑。</li><li>本文在还提到,本文通过所学习的预测模型,对调用链中的并发与依赖事件进行了重建与识别,但是识别这些事件对异常检测的作用在哪,并发与否在trace数据的JSON文件中不应该是可以直接被解析的吗 </li></ul><ol start="4"><li><strong>Self-Supervised Anomaly Detection from Distributed Traces</strong></li></ol><p>Jasmin Bogatinovski; Sasho Nedelkoski; Jorge Cardoso; Odej Kao</p><p>2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC)</p><ul><li><p>这篇论文仍然将trace数据建模为<strong>模板序列</strong>,将异常检测问题视为下一模板预测问题。</p></li><li><p>不同之处在于引入了self-supervised learning,具体的,在训练的时候,遮蔽trace中的随机一个span,训练网络预测被遮蔽位置的span是什么。<br>具体的,网络学习到的是:trace中每一个位置的预测可能性列表</p></li><li><p>在异常检测的时候,统计待测样本中,span不在预测出饿top-k list中的比例,记为anomaly score</p></li></ul><ol start="5"><li><strong>Anomaly Detection and Classification using Distributed Tracing and Deep Learning</strong></li></ol><p>Sasho Nedelkoski; Jorge Cardoso; Odej Kao</p><p>2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing 
(CCGRID)</p><ul><li><p>本文本质上是对于时间序列提出了一种异常检测的方法,只是根据本文的检测目标——响应时间,将特定一组微服务的响应时间序列作为了研究对象。然而所提出的方法其实可以在广义的时间序列上进行评估。</p></li><li><p>本文着重说明了在正常序列具有多种模式的时候,我们该如何去建模正常,这里提出使用VAE的多维高斯分布来建模这种复杂分布,再利用重建误差作为异常得分。</p></li><li><p>后续,他还通过训练了一个基于1D-CNN的网路,做了异常分类,首先由用户预定义几种异常,然后训练一个有监督的分类器,以帮助将那些检测到的异常归类到已知的异常类型中去。</p></li><li><p>本文比较有用的点在于:在处理运维时间序列上,其实存在很多难点,需要我们使用一些预处理方法,如去噪、平滑、去离群值等方法来处理,以保证模型性能,且具备可解释性。</p></li></ul><blockquote><p>However, the large amount of events in the time series and the fact that proper training of neural networks <strong>requires normalization</strong>, leads to obligation of having an outlier removal technique. <strong>The presence of a strong outlier, will lead to values clamped to zero after the normalization</strong>. Therefore, events having response time greater than three standard deviations from the mean are removed from the training batch.</p><p>Next, we normalize the values by using min-max scaling (0, 1) to ensure that we stay in the positive range of values for the response time. In contrast, <strong>using standardization might produce negative values that do not have natural meaning when we deal with response time (no negative time)</strong>. Normalization is required and makes the optimization function of the neural network well-conditioned, which is key for convergence.</p><p>we apply <strong>smoothing for noise removal and robustness to small deviations</strong>. The time series is convolved with Hamming smoothing filter defined with its optimal parameters [35] and size M as…</p></blockquote><ol start="6"><li><strong>MicroRAS: Automatic Recovery in the Absence of Historical Failure Data for Microservice Systems</strong></li></ol><p>Li Wu; Johan Tordsson; Alexander Acker; Odej Kao</p><p>2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC)</p><ul><li><p>本文致力于提出一个故障自动恢复方法,指的是当系统已经异常后,我们如何评估某些操作所带来的正面/负面影响,并在恢复效果与恢复时间之间做权衡,选择有较高收益的动作。</p></li><li><p>本文将trace数据建模为一个属性图,属性图用于构建系统状态模型,分析动作的传播影响等。</p></li></ul><ol start="7"><li><strong>MicroRCA: Root Cause Localization of Performance Issues in Microservices</strong></li></ol><p>Li Wu; Johan Tordsson; Erik Elmroth; Odej Kao</p><p>NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium</p><ul><li>我们提出了一种新的与应用无关的系统MicroRCA,用于定位基于容器的微服务中性能异常的根本原因。该方法构造了一个属性图模型,将服务异常性能症状与相应的资源利用率相关联,从而推断出异常微服务。</li></ul><ol start="8"><li><a name="lundetecting2021"><sup>[1]</sup></a> <strong>Detecting anomalies in microservices with execution trace comparison</strong></li></ol><p>Lun Meng, Feng Ji, Yao Sun, TaoWang</p><p>Future Generation Computer Systems, Volume 116, March 2021, Pages 291-301</p><p>本文提出了一种基于调用链数据的异常检测方法,拟检测的异常分为两种:1) 调用关系异常:调用关系本身并不是一成不变的,调用关系会因为调用时的参数而发生动态变化,但是某些异常会导致微服务调用不常见的异常的调用总是与已有的调用关系不同,因此可以被检测出来;2) 调用响应异常:这种异常不会破坏调用关系,但是会直接影响服务迟延,因此也是异常。</p><p>为解决调用关系异常,本文<strong>首先</strong>收集软件测试期间的调用链数据用于合成应用运行时的tarce tree,注意由于trace tree会因为调用携带的参数不同而不同,因此在软件测试阶段收集(几乎)所有情况的调用关系数据是有必要的。<strong>然后</strong>将实时的调用链数据与刚才的baseline之间计算最小编辑距离(baseline中有多种调用baseline,因此需要与每一个baseline计算他们之间的距离,取距离最小的baseline作为正常模板,并计算anomaly score),以作为anomaly score。</p><blockquote><p> To construct a baseline execution trace set S, for every arrival execution trace T_i, if we<br> cannot match T_i with an execution trace C_i in S, we add T_i in S.</p></blockquote><p>调用时间在物理资源充分的情况下,一般是不会有大的波动的,因此,调用时间的激增就可以被视为一个调用时间异常。为了识别_激增_,本文使用coefficient of variation(CV)来表示一次请求的响应时间异常程度。此时的trace数据被用一个$M \times N$的metrix表示,其中第m行第n列表示,在第i次用户请求时,第j个组件(微服务)的响应情况。然后借助PCA对矩阵进行分解,用来识别导致异常的微服务。这里没太看懂。</p><p><img 
src="https://user-images.githubusercontent.com/16149619/115550685-b9e17f00-a2dc-11eb-9094-980b220a86f3.png" alt="image"><br><img src="https://user-images.githubusercontent.com/16149619/115550693-bea63300-a2dc-11eb-8f08-d44732ca1b41.png" alt="image"></p><p><br>9. <strong>Midiag: A Sequential Trace-Based Fault Diagnosis Framework for Microservices</strong></p><pre><code>Lun Meng, Yao Sun, Shudong ZhangInternational Conference on Services Computing, SCC 2020: Services Computing – SCC 2020 pp 137-144</code></pre><ul><li>本文还是将异常检测问题转化为下一system call预测问题,这里的system call也是用模板来代替,与之前的不同的是,这里的模板是通过K-means+最长公共子序列搜索来得到的,这与日志相关的工作十分相似。</li></ul><hr><h4 id="基于日志数据"><a href="#基于日志数据" class="headerlink" title="基于日志数据"></a>基于日志数据</h4><ul><li><p>基于日志解析的大规模微服务架构软件系统异常检测</p><p> Anomaly Detection of Large Scale Microservice Architecture Software System Based on Log Parsing</p><p> 邰丽媛, 田春岐 :同济大学计算机科学与技术系,上海;<br> 王 伟 :华东师范大学数据科学学院,上海;</p></li><li><p>Root-Cause Metric Location for Microservice Systems via Log Anomaly Detection</p><p> Lingzhi Wang; Nengwen Zhao; Junjie Chen; Pinnong Li; Wenchi Zhang; Kaixin Sui</p><p> 2020 IEEE International Conference on Web Services (ICWS) </p></li><li><p>Anomaly Detection of Large Scale Microservice Architecture Software System Based on Log Parsing</p><p> Liyuan Tai, Chun-qi Tian, W. Wang</p></li></ul><hr><h4 id="基于时间序列数据"><a href="#基于时间序列数据" class="headerlink" title="基于时间序列数据"></a>基于时间序列数据</h4><ul><li>Localizing Failure Root Causes in a Microservice through Causality Inference Yuan Meng; Shenglin Zhang; Yongqian Sun; Ruru Zhang; Zhilong Hu; Yiyin Zhang, … 2020 IEEE/ACM 28th International Symposium on Quality of Service (IWQoS) <strong>基于微服务KPI数据的关联推断方法</strong> 我们设计了一种新的PCTS(路径条件时间序列)算法,在充分利用传播延迟的情况下学习监控指标的依赖图。在PCTS中,我们首先采用改进的PC[10]学习时间序列中每个点的因果图。然后生成两个时间序列之间的边,生成失效因果图。 我们提出了一种新的基于时间原因的随机漫步(TCORW)方法。在TCORW中,我们成功地整合了三种类型的信息:(1)监测指标的因果关系;(2)度量的异常信息,包括发生时间和异常程度;(3)基于领域知识获得的度量优先级 结合PCTS和TCORW,我们提出了一个新的框架——微原因,来推断微服务失败的前N个根本原因。据我们所知,这是在微服务中定位故障根源的第一个工作。</li></ul><ul><li>Performance Diagnosis in Cloud Microservices using Deep Learning Li Wu, Jasmin Bogatinovski, Sasho Nedelkoski, Johan Tordsson and Odej Kao 2020 AIOPS workshop <strong>多源时间序列的异常检测与根因定位</strong>——我们从多个数据源收集数据,包括应用程序、操作系统和网络,以提供由不同根源(如软件bug、硬件问题、资源争用等)引起的性能问题的罪魁祸首。我们的系统被设计成<strong>与应用程序无关的</strong>,不需要应用程序使用仪器来获取数据。相反,我们收集应用程序和运行时系统本身报告的指标。 本文提出了一种应用不可知系统,以细粒度定位微服务性能下降的罪魁祸首,不仅包括产生性能问题的异常服务,还包括与服务异常相关的罪魁祸首指标。我们的方法首先通过构建服务依赖图来发现潜在的罪魁祸首服务,然后应用自动编码器根据重构错误的排序列表来识别异常服务指标。 我们采用两阶段方法进行异常检测和根本原因分析(系统概述在第3节中描述)。在第一阶段,我们根据基于图的方法[16]对导致故障的服务进行建模。这使我们能够通过识别导致错误服务性能下降的根本原因(异常度量)来查明引发性能下降的潜在错误服务。第二阶段,对潜在故障的推断,是基于以下假设:故障行为的最重要症状与正常运行时的值存在显著偏差。在任何时间点测量每个症状的个体贡献,从而导致观察到的行为与正常行为之间的差异,从而可以定位最可能反映故障的症状。有了这个假设,我们的目标是在正常系统行为下模拟症状值</li></ul><ul><li><p>TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services</p><p> Dominik Scheinert and Alexander Acker</p><p> 2020 AIOPS workshop</p><p> <strong>多维时间序列的异常分类任务</strong>,即不仅只识别正异常,还是被异常的种类</p><p> 我们提出了一种通过训练分类模型来识别再次出现的异常的方法。利用系统度量数据,如CPU利用率、已分配内存或磁盘I/O统计数据,并将这些数据建模为多元时间序列,我们的模型能够识别异常类型特定的模式,并为它们分配各自的异常标签。</p><p> 我们提出的模型架构TELESTO利用一种新颖的图神经网络架构,在空间和时间维度上利用多变量时间序列建模为图。它不受维度变化的影响,优于其他两种常用的图神经网络方法。</p></li></ul><hr><h2 id="Dataset"><a href="#Dataset" class="headerlink" title="Dataset"></a>Dataset</h2><ol><li>ToN IoT-The role of heterogeneity and the need for standardization of features and attack types in IoT network intrusion datasets</li></ol><ol start="2"><li><p><a 
href="https://zenodo.org/record/3549604#.YEGfWI5LiUk">https://zenodo.org/record/3549604#.YEGfWI5LiUk</a></p><p> 一个公开数据集,里面包括AIOPS常见的三种数据</p></li><li><p>test bed搭建:</p><p> framework:OpenStack, Kolla-Ansible(dockerized environment)/k8s<br> For the <strong>metrics</strong> collection across the physical nodes in the infrastructure, we utilize <a href="20">Glances</a>,</p><p> OpenStack introduces a small but powerful library called <a href="21"><em>osprofiler</em></a> that is used by all OpenStack projects and their Python clients to generate <strong>traces</strong>.</p><p> The <strong>log</strong> files are distributed over the infrastructure and they are grouped in directories by the OpenStack projects (e.g., nova, neutron, glance, etc.) at the wally nodes.</p><p> anomaly injection: (ref) Multi-source Distributed System Data for AI-Powered Analytics</p><p> To generate workloads and inject faults into the infrastructure we used <a href="25">Rally</a> </p></li></ol><h3 id="Trace-data"><a href="#Trace-data" class="headerlink" title="Trace data"></a>Trace data</h3><ol><li><p>Azure Public dataset: composes of two datasets representing two representative traces of the virtual machine of Microsoft Azure</p><p> <a href="5">link</a></p></li><li><p>Alibaba’s cluster data is a collection of two datasets from real-world production</p><p> [link](2, 14, 28)</p></li><li><p>Google’s collection of two tracing datasets originates from parts of Google cluster management software and systems</p><p> <a href="10">link</a></p></li></ol><h3 id="metric-data"><a href="#metric-data" class="headerlink" title="metric data"></a>metric data</h3><ol><li><p>A plethora of available collections of datasets containing metric data can be found in Stonybrook</p><p> <a href="31">link</a></p></li><li><p><a href="1">Numenta</a> predominantly contains datasets<br>from streaming and real-time applications, while <a href="9">Harvard</a>, <a href="8">ELKI</a>, <a href="15">LMU</a> store network intrusion data.</p></li></ol><h3 id="log-data"><a href="#log-data" class="headerlink" title="log data"></a>log data</h3><ol><li><p>The <a href="3">CFDR resource</a> stores links or 19 log datasets grouped in 11 data collections.</p></li><li><p>The second resource is the <a href="35">loghub data resource</a>.</p></li></ol><hr><h2 id="没啥用的文章"><a href="#没啥用的文章" class="headerlink" title="没啥用的文章"></a>没啥用的文章</h2><ul><li><p><strong>MicroMon: A Monitoring Framework for Tackling Distributed Heterogeneity</strong></p><p> Babar Khalid, Nolan Rudolph, Ramakrishnan Durairajan, Sudarsun Kannan</p><p> 12th {USENIX} Workshop on Hot Topics in Storage and File Systems (HotStorage 20).</p><p> 没啥用,描述了一种在微服务上运行的监视系统,主要应对的是微服务系统的软件/硬件异质性问题,提高监视系统的吞吐量</p></li><li><p><strong>Advancing Monitoring in Microservices Systems</strong></p><p> Marcello Cinque; Raffaele Della Corte; Antonio Pecchia</p><p> 2019 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)</p><p> 没啥用,描述了一种在微服务上运行的监视系统</p></li><li><p><strong>学位论文:ANOMALY DETECTION IN CLOUD-NATIVE SYSTEMS</strong></p><p> Surace, Pino’ (2019)</p><p> 没啥用,讲了一些云原生应用中的组件综述,以及如何用简单的ML方法利用微服务中的时间序列做异常检测</p></li><li><p>Artificial Intelligence for IT Operations (AIOPS) Workshop White Paper</p><p> Jasmin Bogatinovski, Sasho Nedelkoski, Alexander Acker, Florian Schmidt, Thorsten Wittkopp, Soeren Becker, Jorge Cardoso, Odej Kao</p><p> white paper for the AIOPS 2020 workshop at ICSOC 2020</p></li></ul>]]></content>
<tags>
<tag> AIOps </tag>
<tag> survey </tag>
<tag> micro-service </tag>
<tag> cloud native </tag>
<tag> dataset </tag>
</tags>
</entry>
<entry>
<title>Some Notes</title>
<link href="/uncategorized/notes/notes/"/>
<url>/uncategorized/notes/notes/</url>
<content type="html"><![CDATA[<ol><li><strong>posterior collapse</strong>: where the latents are ignored when they are paired with a powerful autoregressive decoder — typically observed in the VAE framework, i.e., the latents are ignored as the decoder is powerful enough to model x perfectly.个人理解是某些VAE的decoder的重建能力过于好导致重建误差很小,最后模型不能很好的最小化prior相关的loss item。</li></ol><span id="more"></span><ol start="2"><li>关于self-supervised learning中的附加任务:<ol><li>一种常用且简单的任务:对原始数据$x$做一定的transformation($t$),即$t(x)$,然后训练一个附加任务,即区分这些transformation的种类。<br> <strong>目的</strong>:训练神经网络辨别图片是否为自然的。(To predict such transformations, a model should distinguish between what is semantically natural or not, and consequently, it<br> learns high-level semantic representations of inputs.)</li><li>当仅使用了transformation后的数据$t(x)$而没有将对应的self-supervised learning task加入到loss中,则该方法退化为<strong>数据增强</strong>。 <strong>目的</strong>:提高网络的通用性,即对多种变化后的样本,都能够提取出差不多的语义信息<blockquote><p>This conventional data augmentation aims to improve upon<br>the generalization ability of the target neural network f by<br>leveraging certain transformations that can preserve their semantics,<br>e.g., cropping, contrast enhancement, and flipping.</p></blockquote> <strong>缺点</strong>:另一方面,如果transformation修改了数据中的语义,则转换的不变属性可能会干扰语义表示学习(请参阅第3.2节中的表1)。<blockquote><p>On the other hands, if a transformation modifies the semantics,<br>the invariant property with respect to the transformation<br>could interfere with semantic representation learning</p></blockquote></li></ol></li><li><strong>mode collapse</strong>: meaning that GANs have the tendency to only generate a subset of the original dataset.</li></ol><ol start="4"><li><p>Tensorflow1中关于TensorArray的用法</p><pre class="line-numbers language-python" data-language="python"><code class="language-python"><span class="token comment"># 改动TensorFlow源码的时候,使用以下语句</span>i_list <span class="token operator">=</span> tensor_array_ops<span class="token punctuation">.</span>TensorArray<span class="token punctuation">(</span>size<span class="token operator">=</span><span class="token number">0</span><span class="token punctuation">,</span> dtype<span class="token operator">=</span>dtype<span class="token punctuation">,</span> dynamic_size<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span> name<span class="token operator">=</span><span class="token string">'i_list'</span><span class="token punctuation">,</span> clear_after_read<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span><span class="token comment"># 当在其他环境中时,tensor_array_ops -> tf</span><span class="token comment"># 使用</span>self<span class="token punctuation">.</span>j_list<span class="token punctuation">.</span>write<span class="token punctuation">(</span>_step<span class="token punctuation">,</span> value<span class="token punctuation">)</span><span class="token punctuation">.</span>mark_used<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token triple-quoted-string string">'''def write(self, index, value, name=None): """Write `value` into index `index` of the TensorArray. Args: index: 0-D. int32 scalar with the index to write to. value: N-D. Tensor of type `dtype`. The Tensor to write to this index. name: A name for the operation (optional). Returns: A new TensorArray object with flow that ensures the write occurs. 
Use this object all for subsequent operations.'''</span><span class="token comment"># 上述语句的返回需要被使用,当没有被使用的时候会在log中生成一个warning,解决方案是在write()后使用方法:mark_used()</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre></li><li><p>在对LSTM/GRU的更新过程做数学上的推导的时候,遇到问题,$W * [h_{t-1}, x_t] + b$中的$concat$操作如何在数学上分析?其中:</p><ul><li>shape of W: [hidden_dim, hidden_dim + data_dim](10, 11)</li></ul></li></ol><ul><li>shape of h: [hidden_dim, 1](10, 1)<ul><li>shape of x: [data_dim, 1](1,1)</li><li>shape of b: [hidden_dim, 1](10, 1)</li></ul>在代码中,向量为行向量,因此上述公式在代码中的实现为:$[h_{t-1}, x_t] * W + b$, 其中:<ul><li>shape of W: [hidden_dim + data_dim, hidden_dim](11, 10)</li><li>shape of h: [1, hidden_dim](1, 10)</li><li>shape of x: [1, data_dim](1,1)</li><li>shape of b: [1, hidden_dim](10, 1)</li></ul>上述的式子$W * [h_{t-1}, x_t] + b$由于存在$concat$操作,而难以使用数学分析,此时可以使用$W’ * h_{t-1} + u * x_t + b$代替,其中W’: [hidden_dim, hidden_dim], u: [hidden_dim, 1], W = [W’, u]</li></ul> <pre class="line-numbers language-python" data-language="python"><code class="language-python">h <span class="token operator">=</span> np<span class="token punctuation">.</span>arange<span class="token punctuation">(</span><span class="token number">10</span><span class="token punctuation">)</span><span class="token punctuation">.</span>reshape<span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span><span class="token number">10</span><span class="token punctuation">)</span> <span class="token comment"># [1, 10]</span><span class="token builtin">input</span> <span class="token operator">=</span> np<span class="token punctuation">.</span>arange<span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">.</span>reshape<span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span><span class="token number">1</span><span class="token punctuation">)</span> <span class="token comment"># [1, 1]</span><span class="token keyword">print</span><span class="token punctuation">(</span><span class="token string">'h shape {}, input shape: {}, concat shape: {}'</span><span class="token punctuation">.</span><span class="token builtin">format</span><span class="token punctuation">(</span>h<span class="token punctuation">.</span>shape<span class="token punctuation">,</span> <span class="token builtin">input</span><span class="token punctuation">.</span>shape<span class="token punctuation">,</span> np<span class="token punctuation">.</span>concatenate<span class="token punctuation">(</span><span class="token punctuation">[</span>h<span class="token punctuation">,</span> <span class="token builtin">input</span><span class="token punctuation">]</span><span class="token punctuation">,</span> axis<span class="token operator">=</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">.</span>shape<span class="token punctuation">)</span><span class="token punctuation">)</span>res <span class="token operator">=</span> np<span class="token punctuation">.</span>matmul<span class="token punctuation">(</span>np<span class="token punctuation">.</span>concatenate<span 
class="token punctuation">(</span><span class="token punctuation">[</span>h<span class="token punctuation">,</span> <span class="token builtin">input</span><span class="token punctuation">]</span><span class="token punctuation">,</span> axis<span class="token operator">=</span><span class="token number">1</span><span class="token punctuation">)</span><span class="token punctuation">,</span> w_i<span class="token punctuation">)</span> <span class="token operator">+</span> b_ires1 <span class="token operator">=</span> np<span class="token punctuation">.</span>matmul<span class="token punctuation">(</span>h<span class="token punctuation">,</span> w_i<span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">:</span><span class="token number">10</span><span class="token punctuation">,</span> <span class="token punctuation">:</span><span class="token punctuation">]</span><span class="token punctuation">)</span> <span class="token operator">+</span> np<span class="token punctuation">.</span>matmul<span class="token punctuation">(</span><span class="token builtin">input</span><span class="token punctuation">,</span> w_i<span class="token punctuation">[</span><span class="token operator">-</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token punctuation">:</span><span class="token punctuation">]</span><span class="token punctuation">.</span>reshape<span class="token punctuation">(</span><span class="token number">1</span><span class="token punctuation">,</span> <span class="token number">10</span><span class="token punctuation">)</span><span class="token punctuation">)</span> <span class="token operator">+</span> b_ipprint<span class="token punctuation">(</span>res<span class="token punctuation">)</span>pprint<span class="token punctuation">(</span>res1<span class="token punctuation">)</span>pprint<span class="token punctuation">(</span>res <span class="token operator">==</span> res1<span class="token punctuation">)</span><span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre><p> output<br> <pre class="line-numbers language-console" data-language="console"><code class="language-console">h shape (1, 10), input shape: (1, 1), concat shape: (1, 11)array([[ 1.61824865, -4.07500279, 1.56455836, 1.2925876 , -2.16892738, -2.38041244, -2.28631814, 2.84973208, -4.34229152, -1.44608608]])array([[ 1.61824865, -4.07500279, 1.56455836, 1.2925876 , -2.16892738, -2.38041244, -2.28631814, 2.84973208, -4.34229152, -1.44608608]])array([[ True, True, True, True, True, True, True, True, True, True]])<span aria-hidden="true" class="line-numbers-rows"><span></span><span></span><span></span><span></span><span></span><span></span><span></span></span></code></pre></p>]]></content>
<tags>
<tag> note </tag>
</tags>
</entry>
<entry>
<title>Related Papers in AAAI 2021 (Feb 02-09 2021)</title>
<link href="/uncategorized/paperlistfile/AAAI2021/"/>
<url>/uncategorized/paperlistfile/AAAI2021/</url>
<content type="html"><![CDATA[<p><a href="https://aaai.org/Conferences/AAAI-21/wp-content/uploads/2020/12/AAAI-21_Accepted-Paper-List.Main_.Technical.Track_.pdf">Accept paper list</a></p><span id="more"></span><h2 id="anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-…"><a href="#anomaly-detection-anomaly-outlier-out-of-distribution-one-class-Malware-detection-…" class="headerlink" title="anomaly detection [anomaly, outlier, out-of-distribution, one-class, Malware detection, …]"></a>anomaly detection [anomaly, outlier, out-of-distribution, one-class, Malware detection, …]</h2><ul><li><p>LREN: Low-Rank Embedded Network for Sample-Free Hyperspectral Anomaly Detection</p><p> Kai Jiang, Weiying Xie, Jie Lei, Tao Jiang, Yunsong Li</p></li><li><p>GAN Ensemble for Anomaly Detection</p><p> Xiaohui Chen, Xu Han, Liping Liu</p></li><li><p>Anomaly Attribution with Likelihood Compensation</p><p> Tsuyoshi Ide, Amit Dhurandhar, Jiri Navratil, Moninder Singh, Naoki Abe</p></li><li><p>Regularizing Attention Networks for Anomaly Detection in Visual Question Answering</p><p> Doyup Lee, Yeongjae Cheon, Wook-Shin Han</p></li><li><p>Appearance-Motion Memory Consistency Network for Video Anomaly Detection</p><p> Ruichu Cai, Hao Zhang, Wen Liu, Shenghua Gao, Zhifeng Hao</p></li><li><p><strong>【看一下】</strong> Learning Semantic Context from Normal Samples for Unsupervised Anomaly Detection</p><p> Xudong Yan, Huaidong Zhang, Xuemiao Xu, Xiaowei Hu, Pheng-Ann Heng</p></li><li><p>Graph Neural Network-Based Anomaly Detection in Multivariate Time Series</p><p> Ailin Deng, Bryan Hooi</p></li><li><p><strong>【重点阅读】</strong> Time Series Anomaly Detection with Multiresolution Ensemble Decoding</p><p> Lifeng Shen, Zhongzhong Yu, Qianli Ma, James Tin-Yau Kwok</p></li><li><p><strong>【看一下】</strong> Outlier Impact Characterization for Time Series Data</p><p> Jianbo Li, Lecheng Zheng, Yada Zhu, Jingrui He</p></li><li><p>Graph Neural Network to Dilute Outliers for Refactoring Monolith Application</p><p> Utkarsh Desai, Sambaran Bandyopadhyay, Srikanth Tamilselvam</p></li><li><p>Accelerated Combinatorial Search for Outlier Detection with Provable Bound on Sub-<br>Optimality</p><p> Guihong Wan, Haim Schweitzer</p></li><li><p><strong>【看一下】</strong> Neighborhood Consensus Networks for Unsupervised Multi-View Outlier Detection</p><p> Li Cheng, Yijie Wang, Xinwang Liu</p></li><li><p>DecAug: Out-of-Distribution Generalization via Decomposed Feature Representation and<br>Semantic Augmentation</p><p> Haoyue Bai, Rui Sun, Lanqing Hong, Fengwei Zhou, Nanyang Ye, Han-Jia Ye, Gary Chan, Zhenguo Li</p></li><li><p>Few-Shot One-Class Classification via Meta-Learning</p><p> Ahmed Frikha, Denis Krompass, Hans-Georg Koepken, Volker Tresp</p></li><li><p>Classifying Sequences of Extreme Length with Constant Memory Applied to Malware<br>Detection</p><p> Edward Raff, William Fleshman, Richard J Zak, Hyrum Anderson, Bobby Filar, Mark McLean</p></li><li><p>Disentangled Representation Learning in Heterogeneous Information Network for Large-<br>Scale Android Malware Detection in the COVID-19 Era and Beyond</p><p> Shifu Hou, Yujie Fan, Mingxuan Ju, Yanfang Ye, Wenqiang Wan, Kui Wang, Yinming Mei, Qi Xiong,<br>Fudong Shao</p></li></ul><h2 id="heterogeneous"><a href="#heterogeneous" class="headerlink" title="heterogeneous"></a>heterogeneous</h2><ul><li><p>Embedding Heterogeneous Networks into Hyperbolic Space without Meta-‐Path</p><p> Lili Wang, Chongyang Gao, Chenghan Huang, Ruibo Liu, Weicheng Ma, Soroush 
Vosoughi</p></li><li><p>Synergetic Learning of Heterogeneous Temporal Sequences for Multi-‐Horizon Probabilistic Forecasting</p><p> Longyuan Li, Jihai Zhang, Junchi Yan, Yaohui Jin, Yunhao Zhang, Yanjie Duan, Guangjian Tian</p></li><li><p>Multi-‐Modal Multi-‐Label Emotion Recognition with Heterogeneous Hierarchical Message Passing</p><p> Dong Zhang, Xincheng Ju, Wei Zhang, Junhui Li, Shoushan Li, Zhu Qiaoming, Zhou Guodong</p></li><li><p>Heterogeneous Graph Structure Learning for Graph Neural Networks</p><p> Jianan Zhao, Xiao Wang, Chuan Shi, Binbin Hu, Guojie Song, Yanfang Ye</p></li><li><p>Disentangled Representation Learning in Heterogeneous Information Network for Large-‐<br>Scale Android Malware Detection in the COVID-‐19 Era and Beyond</p><p> Shifu Hou, Yujie Fan, Mingxuan Ju, Yanfang Ye, Wenqiang Wan, Kui Wang, Yinming Mei, Qi Xiong, Fudong Shao </p></li><li><p>MERL: Multimodal Event Representation Learning in Heterogeneous Embedding Spaces</p><p> Linhai Zhang, Deyu Zhou, Yulan He, Zeng Yang</p></li><li><p>Modeling Heterogeneous Relations across Multiple Modes for Potential Crowd Flow Prediction</p><p> Qiang Zhou, Jingjing Gu, Xinjiang Lu, Fuzhen Zhuang, Yanchao Zhao, Qiuhong Wang, Xiao Zhang</p></li><li><p><strong>【重要】</strong> Infusing Multi-‐Source Knowledge with Heterogeneous Graph Neural Network for Emotional Conversation Generation</p><p> Yunlong Liang, Fandong Meng, Ying Zhang, Yufeng Chen, Jinan Xu, Jie Zhou</p></li><li><p>HARGAN: Heterogeneous Argument Attention Network for Persuasiveness Prediction</p><p> Kuo-‐Yu Huang, Hen-‐Hsen Huang, Hsin-‐Hsi Chen</p></li><li><p>Deep Innovation Protection: Confronting the Credit Assignment Problem in Training Heterogeneous Neural Architectures</p><p> Sebastian Risi, Kenneth O Stanley</p></li><li><p>Real-‐Time Tropical Cyclone Intensity Estimation by Handling Temporally Heterogeneous Satellite Data</p><p> Boyo Chen, Buo-‐Fu Chen, Yun-‐Nung Chen</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Deep Switching Auto-Regressive Factorization: Application to Time Series Forecasting</p><p> Amirreza Farnoosh, Bahar Azari, Sarah Ostadabbas</p></li><li><p><strong>【重点阅读】</strong> Dynamic Gaussian Mixture Based Deep Generative Model for Robust Forecasting on Sparse<br>Multivariate Time Series</p><p> Yinjun Wu, Jingchao Ni, Wei Cheng, Bo Zong, Dongjin Song, Zhengzhang Chen, Yanchi Liu, Xuchao<br>Zhang, Haifeng Chen, Susan B Davidson</p></li><li><p>Second Order Techniques for Learning Time-Series with Structural Breaks</p><p> Takayuki Osogami</p></li><li><p>Correlative Channel-Aware Fusion for Multi-View Time Series Classification</p><p> Yue Bai, Lichen Wang, Zhiqiang Tao, Sheng Li, Yun Fu</p></li></ul><ul><li><p><strong>【看一下】</strong> Learnable Dynamic Temporal Pooling for Time Series Classification</p><p> Dongha Lee, Seonghyeon Lee, Hwanjo Yu</p></li><li><p>Time Series Domain Adaptation via Sparse Associative Structure Alignment</p><p> Ruichu Cai, Jiawei Chen, Zijian Li, Wei Chen, Keli Zhang, Junjian Ye, Zhuozhang Li, Xiaoyan Yang,<br>Zhenjie Zhang</p></li><li><p><strong>【看一下】</strong> Learning Representations for Incomplete Time Series Clustering</p><p> Qianli Ma, Chuxin Chen, Sen Li, Garrison Cottrell</p></li><li><p>Temporal Latent Autoencoder: A Method for Probabilistic Multivariate Time Series<br>Forecasting</p><p> Nam Nguyen, Brian Quanz</p></li></ul><ul><li><p>ShapeNet: A Shapelet-Neural Network Approach for Multivariate Time Series Classification</p><p> Guozhong 
Li, Byron Choi, Jianliang Xu, Sourav S Bhowmick, Kwok-Pan Chun, Grace Lai-Hung Wong</p></li><li><p>Joint-Label Learning by Dual Augmentation for Time Series Classification</p><p> Qianli Ma, Zhenjing Zheng, Jiawei Zheng, Sen Li, Wanqing Zhuang, Garrison Cottrell</p></li><li><p><strong>【Best paper award】</strong> Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting</p><p> Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang</p></li><li><p>Meta-Learning Framework with Applications to Zero-Shot Time-Series Forecasting</p><p> Boris N. Oreshkin, Dmitri Carpov, Chapados Nicolas, Yoshua Bengio</p></li></ul><h2 id="about-deep-learning"><a href="#about-deep-learning" class="headerlink" title="about deep learning"></a>about deep learning</h2><ul><li><p>Deep Frequency Principle Towards Understanding Why Deeper Learning Is Faster</p><p> Zhiqin John Xu, Hanxu Zhou</p></li><li><p>Understanding Decoupled and Early Weight Decay</p><p> Johan Björck, Kilian Weinberger, Carla P Gomes</p></li></ul><h2 id="sequence"><a href="#sequence" class="headerlink" title="sequence"></a>sequence</h2><ul><li><p>Copy That! Editing Sequences by Copying Spans</p><p> Sheena L Panthaplackel, Miltiadis Allamanis, Marc Brockschmidt</p></li><li><p>Semi-Supervised Knowledge Amalgamation for Sequence Classification</p><p> Jidapa Thadajarassiri, Thomas Hartvigsen, Xiangnan Kong, Elke Rundensteiner</p></li><li><p>Neural Sequence-to-Grid Module for Learning Symbolic Rules</p><p> Segwang Kim, Hyoungwook Nam, Joonyoung Kim, Kyomin Jung</p></li></ul><ul><li><p>Synergetic Learning of Heterogeneous Temporal Sequences for Multi-Horizon Probabilistic<br>Forecasting</p><p> Longyuan Li, Jihai Zhang, Junchi Yan, Yaohui Jin, Yunhao Zhang, Yanjie Duan, Guangjian Tian</p></li></ul><ul><li><p>Semi-Supervised Sequence Classification through Change Point Detection</p><p> Nauman Ahad, Mark Davenport</p></li><li><p>Bridging Towers of Multi-Task Learning with a Gating Mechanism for Aspect-Based<br>Sentiment Analysis and Sequential Metaphor Identification</p><p> Rui Mao, Xiao Li</p></li><li><p>Deterministic Mini-Batch Sequencing for Training Deep Neural Networks</p><p> Subhankar Banerjee, Shayok Chakraborty</p></li><li><p><strong>【看一下】</strong> SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning</p><p> Ting Yao, Yiheng Zhang, Zhaofan Qiu, Yingwei Pan, Tao Mei</p></li><li><p>Answering Complex Queries in Knowledge Graphs with Bidirectional Sequence Encoders</p><p> Bhushan Kotnis, Carolin Lawrence, Mathias Niepert</p></li><li><p>Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences</p><p> Andis Draguns, Emīls Ozoliņš, Agris Šostaks, Matīss Apinis, Karlis Freivalds</p></li><li><p>Entity Guided Question Generation with Contextual Structure and Sequence Information<br>Capturing</p><p> Qingbao Huang, Mingyi Fu, Linzhang Mo, Yi Cai, Jingyun Xu, Pijian Li, Qing Li, Ho-fung Leung</p></li><li><p>Learning from History: Modeling Temporal Knowledge Graphs with Sequential Copy-<br>Generation Networks</p><p> Cunchao Zhu, Muhao Chen, Changjun Fan, Guangquan Cheng, Yan Zhang</p></li></ul><ul><li><p><strong>【看一下】</strong> Continuous-Time Attention for Sequential Learning</p><p> Yi-Hsiang Chen, Jen-Tzung Chien</p></li><li><p>Interpretable Sequence Classification via Discrete Optimization</p><p> Maayan Shvo, Andrew C Li, Rodrigo A Toro Icarte, Sheila A. 
McIlraith</p></li></ul><h2 id="interpretable-Understanding-explanation-Attribution-…"><a href="#interpretable-Understanding-explanation-Attribution-…" class="headerlink" title="interpretable [Understanding, explanation, Attribution …]"></a>interpretable [Understanding, explanation, Attribution …]</h2><ul><li><p>Building Interpretable Interaction Trees for Deep NLP Models</p><p> Die Zhang, HuiLin Zhou, Xiaoyi Bao, Da Huo, Ruizhao Chen, Hao Zhang, Xu Cheng, Mengyue Wu,<br>Quanshi Zhang</p></li><li><p>Interpretable Embedding Procedure Knowledge Transfer via Stacked Principal Component<br>Analysis and Graph Neural Network</p><p> Seunghyun Lee, Byung Cheol Song</p></li><li><p>Interpreting Neural Networks as Quantitative Argumentation Frameworks</p><p> Nico Potyka</p></li><li><p>Interpretable Clustering on Dynamic Graphs with Recurrent Graph Neural Networks</p><p> Yuhang Yao, Carlee Joe-Wong</p></li><li><p>Interpreting Deep Neural Networks with Relative Sectional Propagation by Analyzing<br>Comparative Gradients and Hostile Activations</p><p> Woo Jeoung Nam, Jaesik Choi, Seong-Whan Lee</p></li><li><p>Human-Level Interpretable Learning for Aspect-Based Sentiment Analysis</p><p> Rohan K Yadav, Lei Jiao, Ole-Christoffer Granmo, Morten Goodwin</p></li><li><p>Learning Accurate and Interpretable Decision Rule Sets from Neural Networks</p><p> Litao Qiao, Weijia Wang, Bill Lin</p></li><li><p>Learning Interpretable Models for Couple Networks under Domain Constraints</p><p> Hongyuan You, Sikun Lin, Ambuj Singh</p></li><li><p>Explanation Consistency Training: Facilitating Consistency-Based Semi-Supervised Learning<br>with Interpretability</p><p> Tao Han, Wei-Wei Tu, Yu-Feng Li</p></li><li><p>i-Algebra: Towards Interactive Interpretability of Deep Neural Networks</p><p> Xinyang Zhang, Pang Ren, Shouling Ji, Fenglong Ma, Ting Wang</p></li><li><p><strong>【看一下】</strong> Explainable Models with Consistent Interpretations</p><p> Vipin Pillai, Hamed Pirsiavash</p></li><li><p>Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods</p><p> Nicholay Topin, Stephanie Milani, Fei Fang, Manuela Veloso</p></li><li><p>HyDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks</p><p> Yuanyuan Chen, Boyang Li, Han Yu, Pengcheng Wu, Chunyan Miao</p></li><li><p>Interpreting Multivariate Shapley Interactions in DNNs</p><p> Hao Zhang, Yichen Xie, Longjie Zheng, Die Zhang, Quanshi Zhang</p></li><li><p><strong>【看一下】</strong> Self-Attention Attribution: Interpreting Information Interactions Inside Transformer</p><p> Yaru Hao, Li Dong, Furu Wei, Ke Xu</p></li><li><p>Interpretable Sequence Classification via Discrete Optimization</p><p> Maayan Shvo, Andrew C Li, Rodrigo A Toro Icarte, Sheila A. McIlraith</p></li><li><p><strong>【看一下】</strong> The Heads Hypothesis: A Unifying Statistical Approach towards Understanding Multi-Headed<br>Attention in BERT</p><p> Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. 
Khapra</p></li><li><p>Ordered Counterfactual Explanation by Mixed-Integer Linear Optimization</p><p> Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, Yuichi Ike, Kento Uemura, Hiroki Arimura</p></li><li><p>Strong Explanations in Abstract Argumentation</p><p> Markus Ulbricht, Johannes Peter Wallner</p></li><li><p>On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning</p><p> Eoin Kenny, Mark Keane</p></li><li><p>The Tractability of SHAP-Score-Based Explanations for Classification over Deterministic and<br>Decomposable Boolean Circuits</p><p> Marcelo Arenas, Pablo Barceló, Leopoldo Bertossi, Mikaël Monet</p></li><li><p>On the Tractability of SHAP Explanations</p><p> Guy Van den Broeck, Anton Lykov, Maximilian Schleich, Dan Suciu</p></li><li><p>Responsibility Attribution in Parameterized Markovian Models</p><p> Christel Baier, Florian Funke, Rupak Majumdar</p></li><li><p>A Unified Taylor Framework for Revisiting Attribution Methods</p><p> Huiqi Deng, Na Zou, Mengnan Du, Weifu Chen, Guocan Feng, Xia Hu</p></li><li><p><strong>【看一下】</strong> Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and<br>Block-Wise Feature Aggregation</p><p> Sam Sattarzadeh, Mahesh Sudhakar, Anthony Lem, Shervin Mehryar, Konstantinos N Plataniotis,<br>Jongseong Jang, Hyunwoo Kim, Yeonjeong Jeong, SangMin Lee, Kyunghoon Bae</p></li><li><p><strong>【看一下】</strong> Visualization of Supervised and Self-Supervised Neural Networks via Attribution Guided<br>Factorization</p><p> Shir Gur, Ameen Ali, Lior Wolf</p></li><li><p>Enhanced Regularizers for Attributional Robustness</p><p> Anindya Sarkar, Anirban Sarkar, Vineeth N Balasubramanian</p></li><li><p><strong>【看一下】</strong> Explaining a Black-Box by Using a Deep Variational information Bottleneck Approach</p><p> Seojin Bang, Pengtao Xie, Heewook Lee, Wei Wu, Eric Xing</p></li></ul><ul><li><p>Explaining Neural Matrix Factorization with Gradient Rollback</p><p> Carolin Lawrence, Timo Sztyler, Mathias Niepert</p></li></ul><h2 id="Autoencoder"><a href="#Autoencoder" class="headerlink" title="Autoencoder"></a>Autoencoder</h2><ul><li><p>Content Learning with Structure-Aware Writing: A Graph-Infused Dual Conditional<br>Variational Autoencoder for Automatic Storytelling</p><p> Meng Hsuan Yu, Juntao Li , Zhangming Chan, Dongyan Zhao, Rui Yan</p></li><li><p><strong>【看一下】</strong> HOT-VAE: Learning High-Order Label Correlation for Multi-LabelClassification via Attention-<br>Based Variational Autoencoders</p><p> Wenting Zhao, Shufeng Kong, Junwen Bai, Daniel Fink, Carla P Gomes</p></li><li><p>Fractal Autoencoders for Feature Selection</p><p> Xinxing Wu, Qiang Cheng</p></li><li><p>Temporal Latent Autoencoder: A Method for Probabilistic Multivariate Time Series<br>Forecasting</p><p> Nam Nguyen, Brian Quanz</p></li><li><p>Open-Set Recognition with Gaussian Mixture Variational Autoencoders</p><p> Alexander Cao, Yuan Luo, Diego Klabjan</p></li><li><p>Unsupervised Learning of Discourse Structures Using a Tree Autoencoder</p><p> Patrick Huber, Giuseppe Carenini</p></li></ul><h2 id="missing-value-amp-irregularly-sampled-time-series-Incomplete-imputation-…"><a href="#missing-value-amp-irregularly-sampled-time-series-Incomplete-imputation-…" class="headerlink" title="missing value & irregularly sampled time series [Incomplete, imputation, …]"></a>missing value & irregularly sampled time series [Incomplete, imputation, …]</h2><ul><li><p>Generative Semi-Supervised Learning for Multivariate Time Series Imputation</p><p> Xiaoye Miao, Yangyang 
Wu, Jun Wang, Yunjun Gao, Xudong Mao, Jianwei Yin</p></li><li><p>Tripartite Collaborative Filtering with Observability and Selection for Debiasing Rating<br>Estimation on Missing-Not-at-Random Data</p><p> Qi Zhang, Longbing Cao, Chongyang Shi, Liang Hu</p></li><li><p>Unified Tensor Framework for Incomplete Multi-View Clustering and Missing-View Inferring</p><p> Jie Wen, Zheng Zhang, Zhao Zhang, Lei Zhu, Lunke Fei, Bob Zhang, Yong Xu</p></li><li><p>Quantification of Resource Production Incompleteness</p><p> Yakoub Salhi</p></li><li><p><strong>【看一下】</strong> Learning Representations for Incomplete Time Series Clustering</p><p> Qianli Ma, Chuxin Chen, Sen Li, Garrison Cottrell</p></li><li><p>The Parameterized Complexity of Clustering Incomplete Data</p><p> Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, Stefan Szeider</p></li><li><p>Restricted Domains of Dichotomous Preferences with Possibly Incomplete Information</p><p> Zoi Terzopoulou, Alexander Karpov, Svetlana Obraztsova</p></li><li><p>Estimating the Number of Induced Subgraphs from Incomplete Data and Neighborhood<br>Queries</p><p> Dimitris Fotakis, Thanasis Pittas, Stratis Skoulakis</p></li></ul><h2 id="Recurrent-Neural-Network"><a href="#Recurrent-Neural-Network" class="headerlink" title="Recurrent Neural Network"></a>Recurrent Neural Network</h2><p>这部分都可以看一下</p><ul><li><p>Shuffling Recurrent Neural Networks</p><p> Michael Rotman, Lior Wolf</p></li><li><p>Memory-Gated Recurrent Networks</p><p> Yaquan Zhang, Qi Wu, Nanbo Peng, Min Dai, Jing Zhang, Hu Wang</p></li><li><p>On the Softmax Bottleneck of Recurrent Language Models</p><p> Dwarak Govind Parthiban, Yongyi Mao, Diana Inkpen</p></li><li><p>Forecasting Reservoir Inflow via Recurrent Neural ODEs</p><p> Fan Zhou, Liang Li</p></li></ul><h2 id="clustering"><a href="#clustering" class="headerlink" title="clustering"></a>clustering</h2><ul><li><p>Hierarchical Multiple Kernel Clustering</p><p> Jiyuan Liu, Xinwang Liu, Siwei Wang, Sihang Zhou, Yuexiang Yang</p></li><li><p>Interpretable Clustering on Dynamic Graphs with Recurrent Graph Neural Networks</p><p> Yuhang Yao, Carlee Joe-Wong</p></li><li><p>Clustering Ensemble Meets Low-Rank Tensor Approximation</p><p> Yuheng Jia, Hui Liu, Junhui Hou, Qingfu Zhang</p></li><li><p>Contrastive Clustering</p><p> Yunfan Li, Peng Hu, Zitao Liu, Dezhong Peng, Joey Tianyi Zhou, Xi Peng</p></li><li><p>GoT: a Growing Tree Model for Clustering Ensemble</p><p> Feijiang Li, Yuhua Qian, Jieting Wang</p></li><li><p>Unified Tensor Framework for Incomplete Multi-View Clustering and Missing-View Inferring</p><p> Jie Wen, Zheng Zhang, Zhao Zhang, Lei Zhu, Lunke Fei, Bob Zhang, Yong Xu</p></li><li><p>LRSC: Learning Representations for Subspace Clustering</p><p> Changsheng Li, Chen Yang, Bo Liu, Ye Yuan, Guoren Wang</p></li><li><p>Automated Clustering of High-Dimensional Data with a Feature Weighted Mean-Shift<br>Algorithm</p><p> Saptarshi Chakraborty, Debolina Paul, Swagatam Das</p></li><li><p>Learning Representations for Incomplete Time Series Clustering</p><p> Qianli Ma, Chuxin Chen, Sen Li, Garrison Cottrell</p></li><li><p>Multiple Kernel Clustering with Kernel k-Means Coupled Graph Tensor Learning</p><p> Zhenwen Ren, Quansen Sun, Dong Wei</p></li><li><p>Tri-Level Robust Clustering Ensemble with Multiple Graph Learning</p><p> Peng Zhou, Liang Du, Yi-Dong Shen, Xuejun Li</p></li><li><p>Deep Mutual Information Maximin for Cross-Modal Clustering</p><p> Yiqiao Mao, Xiaoqiang Yan, Qiang Guo, Yangdong Ye</p></li><li><p>Fairness, Semi-Supervised Learning, and 
More: A General Framework for Clustering with<br>Stochastic Pairwise Constraints</p><p> Brian Brubach, Darshan Chakrabarti, John P Dickerson, Aravind Srinivasan, Leonidas Tsepenekas</p></li><li><p>Deep Fusion Clustering Network</p><p> Wenxuan Tu, Sihang Zhou, Xinwang Liu, Xifeng Guo, Zhiping Cai, En Zhu, Jieren Cheng</p></li><li><p>The Parameterized Complexity of Clustering Incomplete Data</p><p> Eduard Eiben, Robert Ganian, Iyad Kanj, Sebastian Ordyniak, Stefan Szeider</p></li><li><p>Objective-Based Hierarchical Clustering of Deep Embedding Vectors</p><p> Dmitrii Avdiukhin, Stanislav Naumov, Grigory Yaroslavtsev</p></li><li><p>Variational Fair Clustering</p><p> Imtiaz Masud Ziko, Jing Yuan, Eric Granger, Ismail Ben Ayed</p></li><li><p>Extreme k-Center Clustering</p><p> MohammadHossein Bateni, Hossein Esfandiari, Manuela Fischer, Vahab Mirrokni</p></li><li><p>Differentially Private Clustering via Maximum Coverage</p><p> Matthew Jones, Huy Nguyen, Thy D Nguyen</p></li></ul><h2 id="data-augmentation"><a href="#data-augmentation" class="headerlink" title="data augmentation"></a>data augmentation</h2><ul><li><p>AttaNet: Attention-Augmented Network for Fast and Accurate Scene Parsing</p><p> Qi Song, Kangfu Mei, Rui Huang</p></li><li><p>How Does Data Augmentation Affect Privacy in Machine Learning?</p><p> Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, Tie-Yan Liu</p></li><li><p>SnapMix: Semantically Proportional Mixing for Augmenting Fine-Grained Data</p><p> Shaoli Huang, Xinchao Wang, Dacheng Tao</p></li><li><p>Inferring Emotion from Large-Scale Internet Voice Data: A Semi-Supervised Curriculum<br>Augmentation Based Deep Learning Approach</p><p> Suping Zhou, Jia Jia, Zhiyong Wu, Zhihan Yang, Yanfeng Wang, Wei Chen, Fanbo Meng, Shuo<br>Huang, Jialie Shen, Xiaochuan Wang</p></li><li><p>Kernel-Convoluted Deep Neural Networks with Data Augmentation</p><p> Minjin Kim, Young-geun Kim, Dongha Kim, Yongdai Kim, Myunghee Cho Paik</p></li><li><p>Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation</p><p> Ignacio Iacobacci, Ieva Staliūnaitė, Philip John Gorinski</p></li><li><p>Self-Supervised Multi-View Stereo via Effective Co-Segmentation and Data-Augmentation</p><p> Hongbin Xu, Zhipeng Zhou, Yu Qiao, Wenxiong Kang, Qiuxia Wu</p></li><li><p>Joint-Label Learning by Dual Augmentation for Time Series Classification</p><p> Qianli Ma, Zhenjing Zheng, Jiawei Zheng, Sen Li, Wanqing Zhuang, Garrison Cottrell</p></li><li><p>Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-<br>Training</p><p> Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira<br>dos Santos, Bing Xiang</p></li><li><p>Two-Stream Convolution Augmented Transformer for Human Activity Recognition</p><p> Bing Li, Wei Cui, Wei Wang, Le Zhang, Zhenghua Chen, Min Wu</p></li><li><p>Data Augmentation for Graph Neural Networks</p><p> Tong Zhao, Yozen Liu, Leonardo Neves, Oliver J Woodford, Meng Jiang, Neil Shah</p></li></ul><h2 id="About-distribution"><a href="#About-distribution" class="headerlink" title="About distribution"></a>About distribution</h2><ul><li><p>Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease<br>Identification</p><p> Yi Zhou, Lei Huang, Tianfei Zhou, Ling Shao</p></li><li><p>Robust Lightweight Facial Expression Recognition Network with Label Distribution Training</p><p> Zengqun Zhao, Qingshan Liu, Feng Zhou</p></li><li><p>Wasserstein Distributionally Robust Inverse Multiobjective 
Optimization</p><p> Chaosheng Dong, Bo Zeng</p></li><li><p>The Gap on Gap: Tackling the Problem of Differing Data Distributions in Bias-Measuring<br>Datasets</p><p> Vid Kocijan, Oana-Maria Camburu, Thomas Lukasiewicz</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Related Papers in Other Top Conferences (2021)</title>
<link href="/uncategorized/paperlistfile/otherconf_2021/"/>
<url>/uncategorized/paperlistfile/otherconf_2021/</url>
<content type="html"><![CDATA[<h2 id="SIGMOD-2021"><a href="#SIGMOD-2021" class="headerlink" title="SIGMOD 2021"></a>SIGMOD 2021</h2><p><a href="http://www.2021.sigmod.org/sigmod_research_list.shtml">link</a></p><span id="more"></span><h3 id="anomaly-detection"><a href="#anomaly-detection" class="headerlink" title="anomaly detection"></a>anomaly detection</h3><ul><li><p>Multiple Dynamic Outlier-Detection from a Data Stream by Exploiting Duality of Data and Queries</p><p>Susik Yoon (KAIST); Yooju Shin (KAIST); Jae-Gil Lee (KAIST)$^{\star}$; Byung Suk Lee (University of Vermont)</p></li><li><p>GPU-Accelerated Graph Label Propagation for Real-Time Fraud Detection</p><p>chang ye (Singapore Management University)$^{\star}$; Yuchen Li (Singapore Management University); Bingsheng He (National University of Singapore); Zhao Li (Alibaba Group); Jianling Sun (Zhejiang University)</p></li><li><p>Fast and Exact Outlier Detection in Metric Spaces: A Proximity Graph-based Approach</p><p>Daichi Amagata (Osaka University)$^{\star}$; Makoto Onizuka (Osaka University); Takahiro Hara (Osaka University, Japan)</p></li><li><p>RobustPeriod: Robust Time-Frequency Mining for Multiple Periodicity Detection</p><p>Qingsong Wen (Alibaba DAMO Academy)$^{\star}$; Kai He (Alibaba DAMO Academy); Liang Sun (Alibaba Group); Yingying Zhang (Alibaba Group); Min Ke (Alibaba Group); Huan Xu (Alibaba Group)</p></li></ul><ul><li>On Saving Outliers for Better Clustering over Noisy DataShaoxu Song (Tsinghua University)$^{\star}$; Fei Gao (Tsinghua University); Ruihong Huang (Tsinghua University); Yihan Wang (Tsinghua University)</li></ul><h3 id="causal-analysis"><a href="#causal-analysis" class="headerlink" title="causal analysis"></a>causal analysis</h3><ul><li>Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows<br>Pedro Silvestre (TU Delft); Marios Fragkoulis (TU Delft)$^{\star}$; Diomidis Spinellis (TU Delft); Asterios Katsifodimos (TU Delft)</li></ul><h3 id="heterogeneous"><a href="#heterogeneous" class="headerlink" title="heterogeneous"></a>heterogeneous</h3><ul><li><p>EquiTensors: Learning Fair Integrations of Heterogeneous Urban Data</p><p>An Yan (University of Washington)$^{\star}$; Bill G Howe (University of Washington)</p></li><li><p>Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce</p><p>Xupeng Miao (Peking University)$^{\star}$; Xiaonan Nie (Peking University); Yingxia Shao (BUPT); Zhi Yang (Peking University); Jiawei Jiang (ETH Zurich); Lingxiao Ma (Peking University); Bin Cui (Peking University)</p></li></ul><h2 id="NDSS-2021"><a href="#NDSS-2021" class="headerlink" title="NDSS 2021"></a>NDSS 2021</h2><p>accept papers: <a href="https://www.ndss-symposium.org/ndss2021/">link</a></p><ul><li><p>Evading Voltage-Based Intrusion Detection on Automotive CAN</p><p>Rohit Bhatia (Purdue University); Vireshwar Kumar (Indian Institute of Technology Delhi); Khaled Serag and Z. Berkay Celik (Purdue University); Mathias Payer (EPFL); Dongyan Xu (Purdue University)</p></li><li><p>Differential Training: A Generic Framework to Reduce Label Noises for Android Malware Detection</p><p>Jiayun Xu (School of Information Systems, Singapore Management University, Singapore); Yingjiu Li (University of Oregon); Robert H. 
Deng (School of Information Systems, Singapore Management University, Singapore)</p></li></ul><h2 id="ESEC-FSE-2021-CCF-A"><a href="#ESEC-FSE-2021-CCF-A" class="headerlink" title="ESEC/FSE 2021 (CCF-A)"></a>ESEC/FSE 2021 (CCF-A)</h2><ul><li><p>Detecting and Localizing Keyboard Accessibility Failures in Web Applications</p><p>Paul T. Chiou, Ali S. Alotaibi, William G.J. Halfond</p></li></ul><ul><li><p>Explaining Mispredictions of ML Models</p><p>Jürgen Cito, Isil Dillig, Vijayaraghavan Murali, Seohyun Kim, Satish Chandra</p></li><li><p>Feature Trace Recording</p><p>Paul Maximilian Bittner, Alexander Schultheiß, Thomas Thüm, Timo Kehrer, Jeffrey M. Young, Lukas Linsbauer</p></li><li><p>Identifying Bad Software Changes via Multimodal Anomaly Detection for Online Service Systems</p><p>Nengwen Zhao, Junjie Chen, Zhaoyang Yu, Honglin Wang, Jiesong Li, Bin Qiu, Hongyu Xu, Wenchi Zhang, Kaixin Sui, Dan Pei</p></li></ul><h2 id="ICSE-2021-CCF-A"><a href="#ICSE-2021-CCF-A" class="headerlink" title="ICSE 2021 (CCF-A)"></a>ICSE 2021 (CCF-A)</h2><ul><li><p>AUTOTRAINER: An Automatic DNN Training Problem Detection and Repair SystemTechnical Track</p><p>Xiaoyu Zhang, Juan Zhai, Shiqing Ma, Chao Shen</p></li><li><p>Interpretation-enabled Software Reuse Detection Based on a Multi-Level Birthmark ModelTechnical Track</p><p>Xi Xu, Qinghua Zheng, Zheng Yan, Ming Fan, Ang Jia, Ting Liu</p></li><li><p>Semi-supervised Log-based Anomaly Detection via Probabilistic Label EstimationArtifact ReusableTechnical TrackArtifact Available</p><p>Lin Yang, Junjie Chen, Zan Wang, Weijing Wang, Jiajun Jiang, Xuyuan Dong, Wenbin Zhang</p></li></ul><hr><h2 id="ASE-2021-CCF-A"><a href="#ASE-2021-CCF-A" class="headerlink" title="ASE 2021 (CCF-A)"></a>ASE 2021 (CCF-A)</h2><h2 id="ISSRE-2021"><a href="#ISSRE-2021" class="headerlink" title="ISSRE 2021"></a>ISSRE 2021</h2>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Related Papers in NDSS 2020 (a top security conference)</title>
<link href="/uncategorized/paperlistfile/NDSS2020/"/>
<url>/uncategorized/paperlistfile/NDSS2020/</url>
<content type="html"><![CDATA[<p><a href="https://www.ndss-symposium.org/ndss2020/accepted-papers/">Link</a></p><span id="more"></span><h2 id="Attack"><a href="#Attack" class="headerlink" title="Attack"></a>Attack</h2><ul><li><p>Practical Traffic Analysis Attacks on Secure Messaging Applications</p><p>Alireza Bahramali, Amir Houmansadr, Ramin Soltani, Dennis Goeckel, and Don Towsley (University of Massachusetts Amherst)</p></li></ul><h2 id="Traffic"><a href="#Traffic" class="headerlink" title="Traffic"></a>Traffic</h2><ul><li><p>Encrypted DNS – Privacy? A Traffic Analysis Perspective</p><p>Sandra Siby (EPFL); Marc Juarez (University of Southern California); Claudia Diaz (imec-COSIC KU Leuven); Narseo Vallina-Rodriguez (IMDEA Networks Institute); Carmela Troncoso (EPFL)</p></li><li><p>FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic</p><p>Thijs van Ede (University of Twente); Riccardo Bortolameotti (Bitdefender); Andrea Continella (UC Santa Barbara); Jingjing Ren and Daniel J. Dubois (Northeastern University); Martina Lindorfer (TU Wien); David Choffnes (Northeastern University); Maarten van Steen and Andreas Peter (University of Twente)</p></li><li><p>Practical Traffic Analysis Attacks on Secure Messaging Applications</p><p>Alireza Bahramali, Amir Houmansadr, Ramin Soltani, Dennis Goeckel, and Don Towsley (University of Massachusetts Amherst)</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Self-explaining DNN</title>
<link href="/uncategorized/surveys/interpretable_DNN/"/>
<url>/uncategorized/surveys/interpretable_DNN/</url>
<content type="html"><![CDATA[<p>key words: Self-explaining models</p><span id="more"></span><h2 id="DNN"><a href="#DNN" class="headerlink" title="DNN"></a>DNN</h2><ol><li><blockquote><p><a href="#refer-anchor-1"><sup>1</sup></a> the recently-proposed layer-wise relevance propagation (LRP) algorithm from Wojciech Samek’s group [28], [29] uses the fact that the individual neural network units are differentiable to decompose the network output in terms of its input variables. It is a principled method that has a close relationship to Taylor decomposition and is applicable to arbitrary deep neural network architectures [30]</p><blockquote><ul><li><p>[28] A. Binder, G. Montavon, S. Bach, K. Müller and W. Samek, “Layer-wise relevance propagation for neural networks with local renormalization layers”, CoRR, vol. abs/1604. 00825, 2016</p></li><li><p>[29] A. Binder, S. Bach, G. Montavon, K.-R. Müller and W. Samek, Layer-Wise Relevance Propagation for Deep Neural Network Architectures, Springer, 2016.</p></li><li><p>[30] G. Montavon, S. Lapuschkin, A. Binder, W. Samek and K.-R. Müller, “Explaining nonlinear classification decisions with deep taylor decomposition”, Pattern Recognition, vol. 65, pp. 211-222, 2017.</p></li></ul></blockquote></blockquote></li><li><p>Alvarez-Melis, D., & Jaakkola, T. S. (2018, December). Towards robust interpretability with self-explaining neural networks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (pp. 7786-7795).</p></li></ol><blockquote><p>We achieve this with a regularization scheme that ensures our model not only looks like a linear model, but (locally) behaves like one.<br>we start with a simple linear regression model and successively generalize<br>it towards the class of self-explaining models.<br>We progressively generalize the model in the following subsections and<br>discuss how this mechanism of interpretation is preserved.</p></blockquote><ol start="3"><li><blockquote><p><a href="#refer-anchor-2"><sup>2</sup></a> Examples of self-explainable models include simple linear<br>models, or specific nonlinear models, e.g. neural networks with<br>an explicit top-level sum-pooling structure [143], [111], [28],<br>[206], [25]. In all of these models, each summand is linked<br>only to one of a few input variables, which makes attribution<br>of their prediction on the input variables straightforward.</p><blockquote><ul><li>[25] W. Brendel and M. Bethge, “Approximating CNNs with bag-of-local-features models works surprisingly well on imagenet”, Proc. 7th Int. Conf. Learn. Represent., 2019.</li><li>[28] R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm and N. Elhadad, “Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission”, Proc. 21th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, pp. 1721-1730, 2015.</li><li>[111] M. Lin, Q. Chen and S. Yan, “Network in network”, Proc. Int. Conf. Learn. Represent. (ICLR), 2014.</li><li>[143] B. Poulin et al., “Visual explanation of evidence with additive classifiers”, Proc. 21st Nat. Conf. Artif. Intell. 18th Innov. Appl. Artif. Intell. Conf., pp. 1822-1829, 2006.</li><li>[206] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva and A. Torralba, “Learning deep features for discriminative localization”, Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 2921-2929, Jun. 2016.</li></ul></blockquote></blockquote></li><li><p>Agarwal, R., Frosst, N., Zhang, X., Caruana, R., & Hinton, G. E. (2020). 
<h2 id="RNN"><a href="#RNN" class="headerlink" title="RNN"></a>RNN</h2><ol start="5"><li><blockquote><p><a href="#refer-anchor-2"><sup>2</sup></a> extensions of LRP have been proposed to deal with the special LSTM blocks in recurrent neural networks [11], [9]</p><blockquote><ul><li>[9] L. Arras et al., “Explaining and interpreting LSTMs” in Explainable AI: Interpreting Explaining and Visualizing Deep Learning, Cham, Switzerland:Springer, vol. 11700, pp. 211-238, 2019.</li><li>[11] L. Arras, G. Montavon, K.-R. Müller and W. Samek, “Explaining recurrent neural network predictions in sentiment analysis”, Proc. 8th Workshop Comput. Approaches to Subjectivity Sentiment Social Media Anal., pp. 159-168, 2017.</li></ul></blockquote></blockquote></li></ol><ol start="6"><li>Marcinkevičs, R., & Vogt, J. E. (2021). Interpretable Models for Granger Causality Using Self-explaining Neural Networks. ICLR 2021.</li></ol><blockquote><p>We extend self-explaining neural network models (Alvarez-Melis & Jaakkola, 2018) to time series analysis. The resulting autoregressive model, named generalised vector autoregression (GVAR), is interpretable and allows exploring GC relations between variables, signs of Granger-causal effects, and their variability through time.</p></blockquote><h1 id="Reference-List-of-Surveys"><a href="#Reference-List-of-Surveys" class="headerlink" title="Reference: List of Surveys"></a>Reference: List of Surveys</h1><div id="refer-anchor-1"></div>[1] Chakraborty, S., Tomsett, R., Raghavendra, R., Harborne, D., Alzantot, M., Cerutti, F., ... & Gurram, P. (2017, August). Interpretability of deep learning models: a survey of results. In 2017 IEEE smartworld, ubiquitous intelligence & computing, advanced & trusted computed, scalable computing & communications, cloud & big data computing, Internet of people and smart city innovation (smartworld/SCALCOM/UIC/ATC/CBDcom/IOP/SCI) (pp. 1-6). IEEE.<div id="refer-anchor-2"></div>[2] Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J., & Müller, K. R. (2021). Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications. Proceedings of the IEEE, 109(3), 247-278.]]></content>
<tags>
<tag> paper list </tag>
<tag> survey </tag>
<tag> interpretability </tag>
</tags>
</entry>
<entry>
<title>Post-hoc interpretability</title>
<link href="/uncategorized/surveys/interpretability/"/>
<url>/uncategorized/surveys/interpretability/</url>
<content type="html"><![CDATA[<h2 id="研究问题描述"><a href="#研究问题描述" class="headerlink" title="研究问题描述"></a>研究问题描述</h2><p>深度学习模型的事后(post-hoc)可解释方法:给定一个已训练的深度神经网络模型,如何对其输出进行解释。</p><span id="more"></span><h2 id="领域现状"><a href="#领域现状" class="headerlink" title="领域现状"></a>领域现状</h2><h3 id="基于叠加的方法-Superimposition-based-explanation"><a href="#基于叠加的方法-Superimposition-based-explanation" class="headerlink" title="基于叠加的方法 (Superimposition-based explanation)"></a>基于叠加的方法 (Superimposition-based explanation)</h3><p> 这类方法将网络的输出归因到网络的输入上,显式的指出网络输入中的每个维度对网络输出的贡献程度,</p><ul><li><p>优点:直观</p></li><li><p>缺点:有时会生成令人误解的解释,如当网络将已知的无关输入视为重要判断依据时</p></li></ul><p> [1]LIME:通过对输入施加轻微的扰动,以探测黑盒模型的输出变化,优化一个可解释模型(线性模型或tree-based model)局部近似黑盒模型的预测。</p><p> [2]SHAP:基于博弈理论(shapley value, a game-theory based method),计算网络输入的每一维对网络输出的贡献(也是将黑盒模型做局部近拟,使其具有可解释性)</p><p> [3]saliency map:训练一个遮罩模型(masking model)以识别最影响分类器决断的输入特征。</p><p> [4]Integrated Gradients:以输入特征在某个路径上的梯度积分( Integrated Gradients )作为该特征的重要性得分。</p><h3 id="基于例子的方法-example-based-explanation"><a href="#基于例子的方法-example-based-explanation" class="headerlink" title="基于例子的方法 (example-based explanation)"></a>基于例子的方法 (example-based explanation)</h3><p> 这类方法针对某个待解释的案例,依照某种策略生成一组案例(a set of example),这组案例一般是支持网络对待解释案例做出判断的主要依据。通过人类直观的对比这组案例与待解释的案例,总结出的差异与共同点将被视为一种直接的解释。</p><ul><li><p>优点:易于理解</p></li><li><p>缺点:依赖于训练集的质量与数量</p><p>[10]: 通过在训练样本上施加一个微小的扰动,观察该扰动对所训练出的模型权重的影响,以此来确定网络在对给定样本进行预测时,是哪些训练样本起到了决定性作用。</p><p>[11]: 搜索给定样本在深度学习模型的每一层特征空间中的最近邻,其邻居构成的合集即为用于解释给定样本的训练集(给定样本与该集合中的样本共性即为他们获得相同标签的原因)。</p></li></ul><hr><h2 id="代表性论文10篇"><a href="#代表性论文10篇" class="headerlink" title="代表性论文10篇"></a>代表性论文10篇</h2><h3 id="基于叠加的方法"><a href="#基于叠加的方法" class="headerlink" title="基于叠加的方法"></a>基于叠加的方法</h3><p> <strong>经典方法</strong></p><p> [1]: Marco Túlio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In SIGKDD. ACM, 1135–1144. (LIME)</p><p> [2]: Scott M. Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In NeurIPS. 4765–4774. (SHAP)</p><p> [3]: Piotr Dabkowski and Yarin Gal. 2017. Real Time Image Saliency for Black Box Classifiers. In NeurIPS. 6967–6976. (saliency map)</p><p> [4]: Sundararajan, M., Taly, A., & Yan, Q. (2017, August). Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70 (pp. 3319-3328).</p><p> <strong>其他工作</strong></p><p> [5]: Ismail, A., Gunady, M., Bravo, H., & Feizi, S. (2020). Benchmarking Deep Learning Interpretability in Time Series Predictions. Advances in Neural Information Processing Systems Foundation (NeurIPS).</p><p> [6]: Giurgiu, I., & Schumann, A. (2019). Explainable failure predictions with rnn classifiers based on time series data. arXiv preprint arXiv:1901.08554.</p><p> [7]: Mujkanovic, F., Doskoč, V., Schirneck, M., Schäfer, P., & Friedrich, T. (2020). timeXplain–A Framework for Explaining the Predictions of Time Series Classifiers. arXiv preprint arXiv:2007.07606.</p><p> [8]: Nguyen, T. T., Le Nguyen, T., & Ifrim, G. (2020, September). A Model-Agnostic Approach to Quantifying the Informativeness of Explanation Methods for Time Series Classification. In International Workshop on Advanced Analytics and Learning on Temporal Data (pp. 77-94). Springer, Cham.</p><p> [9]: Shankaranarayana, S. M., & Runje, D. (2019, November). ALIME: Autoencoder based approach for local interpretability. In International Conference on Intelligent Data Engineering and Automated Learning (pp. 454-463). 
<h3 id="基于例子的方法-example-based-explanation"><a href="#基于例子的方法-example-based-explanation" class="headerlink" title="Example-based explanation"></a>Example-based methods (example-based explanation)</h3><p> For a case to be explained, these methods select or generate a set of examples according to some strategy; this set is typically the main evidence supporting the network’s decision on that case. A human then compares these examples with the case at hand, and the observed differences and commonalities serve as a direct explanation.</p><ul><li><p>Pros: easy to understand.</p></li><li><p>Cons: depends on the quality and size of the training set.</p><p>[10]: applies a tiny perturbation to a training sample and observes how it affects the learned model weights, thereby identifying which training samples were decisive for the network’s prediction on a given sample.</p><p>[11]: retrieves the given sample’s nearest neighbors in the feature space of every layer of the deep model; the union of these neighbors forms the training subset used to explain the sample (what the sample shares with this subset is taken as the reason they receive the same label).</p></li></ul><hr><h2 id="代表性论文10篇"><a href="#代表性论文10篇" class="headerlink" title="10 representative papers"></a>10 representative papers</h2><h3 id="基于叠加的方法"><a href="#基于叠加的方法" class="headerlink" title="Superimposition-based methods"></a>Superimposition-based methods</h3><p> <strong>Classic methods</strong></p><p> [1]: Marco Túlio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In SIGKDD. ACM, 1135–1144. (LIME)</p><p> [2]: Scott M. Lundberg and Su-In Lee. 2017. A Unified Approach to Interpreting Model Predictions. In NeurIPS. 4765–4774. (SHAP)</p><p> [3]: Piotr Dabkowski and Yarin Gal. 2017. Real Time Image Saliency for Black Box Classifiers. In NeurIPS. 6967–6976. (saliency map)</p><p> [4]: Sundararajan, M., Taly, A., & Yan, Q. (2017, August). Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70 (pp. 3319-3328).</p><p> <strong>Other work</strong></p><p> [5]: Ismail, A., Gunady, M., Bravo, H., & Feizi, S. (2020). Benchmarking Deep Learning Interpretability in Time Series Predictions. Advances in Neural Information Processing Systems Foundation (NeurIPS).</p><p> [6]: Giurgiu, I., & Schumann, A. (2019). Explainable failure predictions with rnn classifiers based on time series data. arXiv preprint arXiv:1901.08554.</p><p> [7]: Mujkanovic, F., Doskoč, V., Schirneck, M., Schäfer, P., & Friedrich, T. (2020). timeXplain–A Framework for Explaining the Predictions of Time Series Classifiers. arXiv preprint arXiv:2007.07606.</p><p> [8]: Nguyen, T. T., Le Nguyen, T., & Ifrim, G. (2020, September). A Model-Agnostic Approach to Quantifying the Informativeness of Explanation Methods for Time Series Classification. In International Workshop on Advanced Analytics and Learning on Temporal Data (pp. 77-94). Springer, Cham.</p><p> [9]: Shankaranarayana, S. M., & Runje, D. (2019, November). ALIME: Autoencoder based approach for local interpretability. In International Conference on Intelligent Data Engineering and Automated Learning (pp. 454-463). Springer, Cham.</p><h3 id="基于例子的方法"><a href="#基于例子的方法" class="headerlink" title="Example-based methods"></a>Example-based methods</h3><p> <strong>Classic methods</strong></p><p> [10]: Koh, P. W., & Liang, P. (2017, July). Understanding Black-box Predictions via Influence Functions. In International Conference on Machine Learning (pp. 1885-1894).</p><p> [11]: Nicolas Papernot and Patrick McDaniel. Deep k-nearest neighbors: Towards confident, interpretable and robust deep learning. arXiv preprint arXiv:1803.04765, 2018.</p><p> <strong>Other work</strong></p><p> [12]: Kim, B., Rudin, C., & Shah, J. A. (2014). The bayesian case model: A generative approach for case-based reasoning and prototype classification. Advances in neural information processing systems, 27, 1952-1960.</p><p> [13]: Jeyakumar, J. V., Noor, J., Cheng, Y. H., Garcia, L., & Srivastava, M. (2020). How Can I Explain This to You? An Empirical Study of Deep Neural Network Explanation Methods. Advances in Neural Information Processing Systems, 33.</p><p> [14]: Papernot, N., & McDaniel, P. (2018). Deep k-nearest neighbors: Towards confident, interpretable and robust deep learning. arXiv preprint arXiv:1803.04765.</p><p> [15]: Ming, Y., Xu, P., Qu, H., & Ren, L. (2019, July). Interpretable and steerable sequence learning via prototypes. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 903-913).</p><p> [16]: Ma, D., Wang, Z., Xie, J., Guo, B., & Yu, Z. (2020, November). Interpretable Multivariate Time Series Classification Based on Prototype Learning. In International Conference on Green, Pervasive, and Cloud Computing (pp. 205-216). Springer, Cham.</p><p> [17]: Keane, M. T., & Kenny, E. M. (2019, September). How case-based reasoning explains neural networks: A theoretical analysis of XAI using post-hoc explanation-by-example from a survey of ANN-CBR twin-systems. In International Conference on Case-Based Reasoning (pp. 155-171). Springer, Cham.</p><h2 id="经典论文or强相关论文"><a href="#经典论文or强相关论文" class="headerlink" title="Classic or closely related papers"></a>Classic or closely related papers</h2><p> [11]: Nicolas Papernot and Patrick McDaniel. Deep k-nearest neighbors: Towards confident, interpretable and robust deep learning. arXiv preprint arXiv:1803.04765, 2018.</p><p> [13]: Jeyakumar, J. V., Noor, J., Cheng, Y. H., Garcia, L., & Srivastava, M. (2020). How Can I Explain This to You? An Empirical Study of Deep Neural Network Explanation Methods. Advances in Neural Information Processing Systems, 33.</p><p> <strong>Similarities and differences</strong></p><p> [11, 13]: both study case-based interpretability for supervised classifiers, but they can only explain why a sample is assigned to a class; they cannot give counterfactual explanations, i.e., why a sample is not of some class, and no semantic information usable for classification can be extracted from them.</p><p> [11, 13]: like our work, they start from the latent space and analyze the neighbors of the sample under test; we go further and analyze the differences between the sample, its neighbors, and cluster-center samples, which helps us summarize semantic information for distinguishing normal from anomalous cases.</p>
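<p>To make the latent-space neighbor analysis above concrete, here is a minimal sketch of explanation-by-nearest-neighbors in a model’s latent space, in the spirit of [11]. It is a hedged illustration only: <code>f</code> is assumed to map arrays of inputs to latent vectors (e.g., penultimate-layer activations), and all names are my own:</p><pre><code>import numpy as np

def knn_explanation(f, x_query, X_train, y_train, k=5):
    """Explain a prediction by the k nearest training examples in the
    model's latent space (illustrative sketch, cf. [11])."""
    z = f(x_query[None])                  # (1, d) latent code of the query
    Z = f(X_train)                        # (n, d) latent codes of the training set
    dist = np.linalg.norm(Z - z, axis=1)  # distance from the query to each point
    idx = np.argsort(dist)[:k]            # indices of the k closest neighbors
    return X_train[idx], y_train[idx], dist[idx]
</code></pre><p>Contrasting the returned neighbors (and, in our setting, cluster-center samples) with the query is what yields the human-readable explanation.</p>]]></content>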
<tags>
<tag> paper list </tag>
<tag> survey </tag>
<tag> interpretability </tag>
</tags>
</entry>
<entry>
<title>Related Papers in NeurIPS 2020 (2020.12.08)</title>
<link href="/uncategorized/paperlistfile/NIPS2020/"/>
<url>/uncategorized/paperlistfile/NIPS2020/</url>
<content type="html"><![CDATA[<p>Accept paper list: <a href="https://neurips.cc/Conferences/2020/AcceptedPapersInitial">link</a></p><span id="more"></span><h2 id="anomaly-detection"><a href="#anomaly-detection" class="headerlink" title="anomaly detection"></a>anomaly detection</h2><ul><li><p>Energy-based Out-of-distribution Detection</p><p> Weitang Liu (UC San Diego) · Xiaoyun Wang (University of California, Davis) · John Owens (University of California, Davis) · Sharon Yixuan Li (Stanford University)</p></li></ul><ul><li><p>Provable Worst Case Guarantees for the Detection of Out-of-distribution Data</p><p> Julian Bitterwolf (University of Tübingen) · Alexander Meinke (University of Tübingen) · Matthias Hein (University of Tübingen)</p></li></ul><ul><li><p>Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder</p><p> Zhisheng Xiao (The University of Chicago) · Qing Yan (University of Chicago) · Yali Amit (University of Chicago)</p></li></ul><ul><li><p>Why Normalizing Flows Fail to Detect Out-of-Distribution Data</p><p> Polina Kirichenko (New York University) · Pavel Izmailov (New York University) · Andrew Gordon Wilson (New York University)</p></li></ul><ul><li><p><strong>【看看5】</strong> Towards Maximizing the Representation Gap between In-Domain & Out-of-Distribution Examples</p><p> Jay Nandy (National University of Singapore) · Wynne Hsu (National University of Singapore) · Mong Li Lee (National University of Singapore)</p></li></ul><ul><li><p>On the Value of Out-of-Distribution Testing: An Example of Goodhart’s Law</p><p> Damien Teney (University of Adelaide) · Ehsan Abbasnejad (University of Adelaide) · Kushal Kafle (Adobe Research) · Robik Shrestha (Rochester Institute of Technology) · Christopher Kanan (PAIGE.AI / RIT / CornellTech) · Anton van den Hengel (University of Adelaide)</p></li></ul><ul><li><p>Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and Features</p><p> Robin T Schirrmeister (University Medical Center Freiburg) · Yuxuan Zhou (Stuttgart University) · Tonio Ball (Albert-Ludwigs-University) · Dan Zhang (Bosch Center for Artificial Intelligence)</p></li></ul><ul><li><p><strong>【看看1】</strong> Timeseries Anomaly Detection using Temporal Hierarchical One-Class Network</p><p> Lifeng Shen (The Hong Kong University of Science and Technology) · Zhuocong Li (Tencent) · James Kwok (Hong Kong University of Science and Technology)</p></li></ul><ul><li><p>CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances</p><p> Jihoon Tack (KAIST) · Sangwoo Mo (KAIST) · Jongheon Jeong (KAIST) · Jinwoo Shin (KAIST)</p></li><li><p><strong>【看看4】</strong> One Ring to Rule Them All: Certifiably Robust Geometric Perception with Outliers</p><p> Heng Yang · Luca Carlone</p></li><li><p><strong>【看看3】</strong> Outlier Robust Mean Estimation with Subgaussian Rates via Stability</p><p> Ilias Diakonikolas · Daniel M. 
Kane · Ankit Pensia</p></li><li><p><strong>【看看2】</strong> Further Analysis of Outlier Detection with Deep Generative Models</p><p> Ziyu Wang · Bin Dai · David P Wipf · Jun Zhu</p></li></ul><h2 id="Time-series"><a href="#Time-series" class="headerlink" title="Time series"></a>Time series</h2><ul><li><p>Probabilistic Time Series Forecasting with Shape and Temporal Diversity</p><p> Vincent LE GUEN (CNAM, Paris, France) · Nicolas THOME (Cnam (Conservatoire national des arts et métiers))</p></li><li><p><strong>【看看ts-5】</strong> Deep reconstruction of strange attractors from time series</p><p> William Gilpin (Harvard University)</p></li></ul><ul><li><p>Deep Energy-based Modeling of Discrete-Time Physics</p><p> Takashi Matsubara (Osaka University) · Ai Ishikawa (Kobe University) · Takaharu Yaguchi (Kobe University)</p></li><li><p><strong>【看看ts-3】</strong> Neural Controlled Differential Equations for Irregular Time Series</p><p> Patrick Kidger (University of Oxford) · James Morrill (University of Oxford) · James Foster (University of Oxford) · Terry Lyons (University of Oxford)</p></li></ul><ul><li><p><strong>【看看ts-4】</strong> Adversarial Sparse Transformer for Time Series Forecasting</p><p> Sifan Wu (Tsinghua University) · Xi Xiao (Tsinghua University) · Qianggang Ding (Tsinghua University) · Peilin Zhao (Tencent AI Lab) · Ying Wei (Tencent AI Lab) · Junzhou Huang (University of Texas at Arlington / Tencent AI Lab)</p></li><li><p><strong>【看看ts-2】</strong> Learning Long-Term Dependencies in Irregularly-Sampled Time Series</p><p> Mathias Lechner (IST Austria) · Ramin Hasani (MIT)</p></li><li><p><strong>【看看ts-1】</strong> Benchmarking Deep Learning Interpretability in Time Series Predictions</p><p> Aya Abdelsalam Ismail (University of Maryland) · Mohamed Gunady (University of Maryland) · Hector Corrada Bravo (University of Maryland) · Soheil Feizi (University of Maryland)</p></li><li><p>High-recall causal discovery for autocorrelated time series with latent confounders</p><p> Andreas Gerhardus (German Aerospace Center (DLR)) · Jakob Runge (Institute of Data Science, German Aerospace Center (DLR))</p></li></ul><ul><li><p>Deep Rao-Blackwellised Particle Filters for Time Series Forecasting</p><p> Richard Kurle (Volkswagen Group) · Syama Sundar Rangapuram (Amazon Research) · Emmanuel de Bézenac (Sorbonne Université) · Stephan Günnemann (Technical University of Munich) · Jan Gasthaus (Amazon.com)</p></li></ul><ul><li><p>Normalizing Kalman Filters for Multivariate Time Series Analysis</p><p> Emmanuel de Bézenac (Sorbonne Université) · Syama Sundar Rangapuram (Amazon Research) · Konstantinos Benidis (Amazon Research) · Michael Bohlke-Schneider (Amazon) · Lorenzo Stella (Amazon Research) · Hilaf Hasson (Amazon Research) · Richard Kurle (Volkswagen Group) · Tim Januschowski (Amazon Research) · Patrick Gallinari (Sorbonne University & Criteo AI Lab, Paris)</p></li></ul><ul><li><p>User-Dependent Neural Sequence Models for Continuous-Time Event Data</p><p> Alex Boyd (UC Irvine) · Robert Bamler (University of California at Irvine) · Stephan Mandt (University of California, Irivine) · Padhraic Smyth (University of California, Irvine)</p></li></ul><h2 id="About-distribution"><a href="#About-distribution" class="headerlink" title="About distribution"></a>About distribution</h2><ul><li><p>Distribution-free binary classification: prediction sets, confidence intervals and calibration</p><p> Chirag Gupta (Carnegie Mellon University) · Aleksandr Podkopaev (Carnegie Mellon University) · Aaditya Ramdas 
(CMU)</p></li></ul><ul><li><p>Deep Diffusion-Invariant Wasserstein Distributional Classification</p><p> Sung Woo Park+ (Chung-Ang University) · Dong Wook Shu (Chung-Ang Univ., Korea) · Junseok Kwon (Chung-Ang Univ., Korea)</p></li></ul><ul><li><p>OOD-MAML: Meta-Learning for Few-Shot Out-of-Distribution Detection and Classification</p><p> Taewon Jeong (KAIST) · Heeyoung Kim (KAIST)</p></li></ul><ul><li><p>Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and Features</p><p> Robin T Schirrmeister (University Medical Center Freiburg) · Yuxuan Zhou (Stuttgart University) · Tonio Ball (Albert-Ludwigs-University) · Dan Zhang (Bosch Center for Artificial Intelligence)</p></li></ul><ul><li><p>Measuring Robustness to Natural Distribution Shifts in Image Classification</p><p> Rohan Taori (University of California, Berkeley) · Achal Dave (Carnegie Mellon University) · Vaishaal Shankar (UC Berkeley) · Nicholas Carlini (Google) · Benjamin Recht (UC Berkeley) · Ludwig Schmidt (UC Berkeley)</p></li></ul><ul><li><p>Fast Epigraphical Projection-based Incremental Algorithms for Wasserstein Distributionally Robust Support Vector Machine</p><p> Jiajin Li (The Chinese University of Hong Kong) · Caihua Chen (Nanjing University) · Anthony Man-Cho So (CUHK)</p></li></ul><ul><li><p>Adversarial Distributional Training for Robust Deep Learning</p><p> Yinpeng Dong (Tsinghua University) · Zhijie Deng (Tsinghua University) · Tianyu Pang (Tsinghua University) · Hang Su (Tsinghua Univiersity) · Jun Zhu (Tsinghua University)</p></li></ul><ul><li><p>Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions</p><p> Matthew Faw (University of Texas at Austin) · Rajat Sen (Amazon) · Karthikeyan Shanmugam (IBM Research, NY) · Constantine Caramanis (UT Austin) · Sanjay Shakkottai (University of Texas at Austin)</p></li></ul><ul><li><p>Distributionally Robust Parametric Maximum Likelihood Estimation</p><p> Viet Anh Nguyen (Stanford University) · Xuhui Zhang (Stanford University) · Jose Blanchet (Stanford University) · Angelos Georghiou (University of Cyprus)</p></li></ul><ul><li><p>Distributionally Robust Local Non-parametric Conditional Estimation</p><p> Viet Anh Nguyen (Stanford University) · Fan Zhang (Stanford University) · Jose Blanchet (Stanford University) · Erick Delage (HEC Montréal) · Yinyu Ye (Standord)</p></li></ul><ul><li><p>Large-Scale Methods for Distributionally Robust Optimization</p><p> Daniel Levy (Stanford University) · Yair Carmon (Stanford University) · John Duchi (Stanford) · Aaron Sidford (Stanford)</p></li></ul><ul><li><p>Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning</p><p> Arnab Bhattacharyya (National University of Singapore) · Sutanu Gayen (National University of SIngapore) · Kuldeep S Meel (National University of Singapore) · N. V. 
Vinodchandran (University of Nebraska)</p></li></ul><ul><li><p>Analytical Probability Distributions and EM-Learning for Deep Generative Networks</p><p> Randall Balestriero (Rice University) · Sebastien PARIS (University of Toulon) · Richard Baraniuk (Rice University)</p></li></ul><ul><li><p><strong>【看看dis-1】</strong> Learning Structured Distributions From Untrusted Batches: Faster and Simpler</p><p> Sitan Chen (MIT) · Jerry Li (Microsoft) · Ankur Moitra (MIT)</p></li></ul><ul><li><p>Linear-Sample Learning of Low-Rank Distributions</p><p> Ayush Jain (UC San Diego) · Alon Orlitsky (University of California, San Diego)</p></li></ul><ul><li><p>Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Distributions</p><p> Yi Hao (University of California, San Diego) · Alon Orlitsky (University of California, San Diego)</p></li></ul><ul><li><p><strong>【看看dis-2】</strong> SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm</p><p> Yi Hao (University of California, San Diego) · Ayush Jain (UC San Diego) · Alon Orlitsky (University of California, San Diego) · Vaishakh Ravindrakumar (UC San Diego)</p></li></ul><ul><li><p>Learning discrete distributions with infinite support</p><p> Doron Cohen (Ben-Gurion University of the Negev) · Aryeh Kontorovich (Ben Gurion University) · Geoffrey Wolfer (Ben-Gurion University of the Negev)</p></li></ul><ul><li><p>Optimal Private Median Estimation under Minimal Distributional Assumptions</p><p> Christos Tzamos (UW-Madison) · Emmanouil-Vasileios Vlatakis-Gkaragkounis (Columbia University) · Ilias Zadik (NYU)</p></li></ul><h2 id="missing-value-amp-irregularly-sampled-time-series"><a href="#missing-value-amp-irregularly-sampled-time-series" class="headerlink" title="missing value & irregularly sampled time series"></a>missing value & irregularly sampled time series</h2><ul><li><p>Estimation and Imputation in Probabilistic Principal Component Analysis with Missing Not At Random Data</p><p> Aude Sportisse (Sorbonne University, Ecole Polytechnique) · Claire Boyer (LPSM, Sorbonne Université) · Julie Josses (CMAP / CNRS)</p></li></ul><ul><li><p><strong>【看看missing-2】</strong> Learning Disentangled Representations of Videos with Missing Data</p><p> Armand Comas (Northeastern University) · Chi Zhang (Northeastern University) · Zlatan Feric (Northeastern University) · Octavia Camps (Northeastern University) · Rose Yu (University of California, San Diego)</p></li></ul><ul><li><p><strong>【看看missing-3】</strong> Debiasing Averaged Stochastic Gradient Descent to handle missing values</p><p> Aude Sportisse (Sorbonne University, Ecole Polytechnique) · Claire Boyer (LPSM, Sorbonne Université) · Aymeric Dieuleveut (Ecole Polytechnique, IPParis) · Julie Josses (CMAP / CNRS)</p></li></ul><ul><li><p>Handling Missing Data with Graph Representation Learning</p><p> Jiaxuan You (Stanford University) · Xiaobai Ma (Stanford University) · Yi Ding (Stanford University) · Mykel J Kochenderfer (Stanford University) · Jure Leskovec (Stanford University and Pinterest)</p></li></ul><ul><li><p>A Functional EM Algorithm for Panel Count Data with Missing Counts</p><p> Alexander Moreno (Georgia Institute of Technology) · Zhenke Wu (University of Michigan) · Jamie Roslyn Yap (University of Michigan) · Cho Lam (University of Utah) · David Wetter (University of Utah) · Inbal Nahum-Shani (University of Michigan) · Walter Dempsey (University of Michigan) · James M Rehg (Georgia Tech)</p></li></ul><ul><li><p>NeuMiss networks: differentiable programming for supervised 
learning with missing values.</p><p> Marine Le Morvan (INRIA) · Julie Josses (CMAP / CNRS) · Thomas Moreau (Inria) · Erwan Scornet (Ecole Polytechnique) · Gael Varoquaux (Parietal Team, INRIA)</p></li></ul><ul><li><p><strong>【看看missing-1】</strong> Learning Continuous System Dynamics from Irregularly-Sampled Partial Observations</p><p> Zijie Huang (University of California, Los Angeles) · Yizhou Sun (UCLA) · Wei Wang (UCLA)</p></li></ul><h2 id="Recurrent-Neural-Network"><a href="#Recurrent-Neural-Network" class="headerlink" title="Recurrent Neural Network"></a>Recurrent Neural Network</h2><ul><li><p>Convolutional Tensor-Train LSTM for Spatio-Temporal Learning</p><p> Jiahao Su (University of Maryland) · Wonmin Byeon (NVIDIA Research) · Jean Kossaifi (NVIDIA) · Furong Huang (University of Maryland) · Jan Kautz (NVIDIA) · Anima Anandkumar (NVIDIA / Caltech)</p></li></ul><ul><li><p>RNNPool: Efficient Non-linear Pooling for RAM Constrained Inference</p><p> Oindrila Saha (Microsoft Research) · Aditya Kusupati (University of Washington) · Harsha Vardhan Simhadri (Microsoft Research) · Manik Varma (Microsoft Research India) · Prateek Jain (Microsoft Research)</p></li></ul><ul><li><p><strong>【看看other-3】</strong> The interplay between randomness and structure during learning in RNNs</p><p> Friedrich Schuessler (Technion) · Francesca Mastrogiuseppe (UCL) · Alexis Dubreuil (ENS) · Srdjan Ostojic (Ecole Normale Superieure) · Omri Barak (Technion - Israeli institute of technology)</p></li></ul><ul><li><p><strong>【看看other-4】</strong> HiPPO: Recurrent Memory with Optimal Polynomial Projections</p><p> Albert Gu (Stanford) · Tri Dao (Stanford University) · Stefano Ermon (Stanford) · Atri Rudra (University at Buffalo, SUNY) · Christopher Ré (Stanford)</p></li></ul><ul><li><p>RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning</p><p> Riccardo Del Chiaro (University of Florence) · Bartłomiej Twardowski (Computer Vision Center, UAB) · Andrew D Bagdanov (University of Florence) · Joost van de Weijer (Computer Vision Center Barcelona)</p></li></ul><ul><li><p>MomentumRNN: Integrating Momentum into Recurrent Neural Networks</p><p> Tan Nguyen (Rice University/UCLA) · Richard Baraniuk (Rice University) · Andrea Bertozzi (UCLA) · Stanley Osher (UCLA) · Bao Wang (UCLA)</p></li></ul><ul><li><p>Recurrent Random Networks as Optimized Kernel Machines</p><p> Sandra Nestler (Juelich Research Centre) · Christian Keup (Juelich Research Centre) · David Dahmen (Jülich Research Centre) · Matthieu Gilson (Juelich Forschungszentrum) · Holger Rauhut (RWTH Aachen University) · Moritz Helias (Juelich Research Centre)</p></li></ul><ul><li><p>Adaptive Graph Convolutional Recurrent Network for Traffic Forecasting</p><p> LEI BAI (UNSW, Sydney) · Lina Yao (University of New South Wales) · Can Li (University of New South Wales) · Xianzhi Wang (University of Technology Sydney) · Can Wang (Griffith University)</p></li></ul><ul><li><p>Using noise to probe recurrent neural network structure and prune synapses</p><p> Eli Moore (University of California, Davis) · Rishidev Chaudhuri (University of California, Davis)</p></li></ul><ul><li><p>Regularizing Towards Permutation Invariance In Recurrent Models</p><p> Edo Cohen-Karlik (Tel Aviv University) · Avichai Ben David (Tel Aviv University) · Amir Globerson (Tel Aviv University, Google)</p></li></ul><ul><li><p>STLnet: Signal Temporal Logic Enforced Multivariate Recurrent Neural Networks</p><p> Meiyi Ma (University of Virginia) · Ji Gao (University of Virginia) · Lu Feng 
(University of Virginia) · John A Stankovic (University of Virginia)</p></li></ul><ul><li><p>Organizing recurrent network dynamics by task-computation to enable continual learning</p><p> Lea Duncker (Gatsby Unit, UCL) · Laura N Driscoll (Stanford) · Krishna V Shenoy (Stanford University) · Maneesh Sahani (Gatsby Unit, UCL) · David Sussillo (Stanford University)</p></li></ul><ul><li><p>Recurrent Quantum Neural Networks</p><p> Johannes Bausch (University of Cambridge)</p></li></ul><ul><li><p>Reverse-engineering recurrent neural network solutions to a hierarchical inference task for mice</p><p> Rylan Schaeffer (Harvard University) · Mikail C Khona (MIT) · Leenoy Meshulam (Massachusetts Institute of Technology MIT) · Brain Laboratory International (International Brain Laboratory) · Ila Fiete (Massachusetts Institute of Technology)</p></li></ul><ul><li><p>Recurrent Switching Dynamical Systems Models for Multiple Interacting Neural Populations</p><p> Joshua Glaser (Columbia) · Matthew Whiteway (Columbia University) · John Cunningham (University of Columbia) · Liam Paninski (Columbia University) · Scott Linderman (Stanford University)</p></li></ul><h2 id="sequence"><a href="#sequence" class="headerlink" title="sequence"></a>sequence</h2><ul><li><p><strong>【看看other-1】</strong> Big Bird: Bert for Longer Sequences</p><p> Manzil Zaheer (Google Research) · Guru Guruganesh (Google Research) · Kumar Avinava Dubey (Carnegie Mellon University) · Joshua Ainslie (Google) · Chris Alberti (Google) · Santiago Ontanon (Google LLC) · Philip Pham (Google) · Anirudh Ravula (Google) · Qifan Wang (Google Research) · Li Yang (Google) · Amr Ahmed (Google Research)</p></li></ul><h2 id="interpretable"><a href="#interpretable" class="headerlink" title="interpretable"></a>interpretable</h2><ul><li><p>Explaining Naive Bayes and Other Linear Classifiers with Polynomial Time and Delay</p><p> Joao Marques-Silva (ANITI, Federal University of Toulouse Midi-Pyrénées) · Thomas Gerspacher (ANITI) · Martin Cooper (University of Toulouse 3) · Alexey Ignatiev (Monash University) · Nina Narodytska (VMmare Research)</p></li></ul><ul><li><p><strong>【看看interpre-3】</strong> Interpretable Sequence Learning for Covid-19 Forecasting</p><p> Sercan Arik (Google) · Chun-Liang Li (Google) · Martin Nikoltchev (Google) · Rajarishi Sinha (Google) · Arkady Epshteyn (Google) · Jinsung Yoon (Google) · Long Le (Google) · Vikas Menon (Google) · Shashank Singh (Google) · Yash Sonthalia (Google) · Hootan Nakhost (Google) · Leyou Zhang (Google) · Elli Kanal (Google) · Tomas Pfister (Google)</p></li></ul><ul><li><p>ICAM: Interpretable Classification via Disentangled Representations and Feature Attribution Mapping</p><p> Cher Bass (King’s College London) · Mariana da Silva (King’s College London) · Carole Sudre (King’s College London) · Petru-Daniel Tudosiu (King’s College London) · Stephen Smith (FMRIB Centre - University of Oxford) · Emma Robinson (King’s College)</p></li></ul><ul><li><p>How does this interaction affect me? 
Interpretable attribution for feature interactions</p><p> Michael Tsang (University of Southern California) · Sirisha Rambhatla (University of Southern California) · Yan Liu (University of Southern California)</p></li></ul><ul><li><p><strong>【看看interpre-1】</strong> Learning outside the Black-Box: The pursuit of interpretable models</p><p> Jonathan Crabbe (University of Cambridge) · Yao Zhang (University of Cambridge) · William Zame (UCLA) · Mihaela van der Schaar (University of Cambridge)</p></li></ul><ul><li><p>GANSpace: Discovering Interpretable GAN Controls</p><p> Erik Härkönen (Aalto University) · Aaron Hertzmann (Adobe) · Jaakko Lehtinen (Aalto University & NVIDIA) · Sylvain Paris (Adobe)</p></li></ul><ul><li><p>Interpretable multi-timescale models for predicting fMRI responses to continuous natural speech</p><p> Shailee Jain (The University of Texas at Austin) · Vy Vo (Intel Corporation) · Shivangi Mahto (The University of Texas at Austin) · Amanda LeBel (The University of Texas at Austin) · Javier Turek (Intel Labs) · Alexander Huth (The University of Texas at Austin)</p></li></ul><ul><li><p>Learning identifiable and interpretable latent models of high-dimensional neural activity using pi-VAE</p><p> Ding Zhou (Columbia University) · Xue-Xin Wei (University of Pennsylvania)</p></li></ul><ul><li><p>Towards Interpretable Natural Language Understanding with Explanations as Latent Variables</p><p> Wangchunshu Zhou (Beihang University) · Jinyi Hu (Tsinghua University) · Hanlin Zhang (South China University of Technology) · Xiaodan Liang (Sun Yat-sen University) · Maosong Sun (Tsinghua University) · Chenyan Xiong (Microsoft Research AI) · Jian Tang (Mila)</p></li></ul><ul><li><p>Interpretable and Personalized Apprenticeship Scheduling: Learning Interpretable Scheduling Policies from Heterogeneous User Demonstrations</p><p> Rohan Paleja (Georgia Institute of Technology) · Andrew Silva (Georgia Institute of Technology) · Letian Chen (Georgia Institute of Technology) · Matthew Gombolay (Georgia Institute of Technology)</p></li></ul><ul><li><p>Incorporating Interpretable Output Constraints in Bayesian Neural Networks</p><p> Wanqian Yang (Harvard University) · Lars Lorch (Harvard) · Moritz Graule (Harvard University) · Himabindu Lakkaraju (Harvard) · Finale Doshi-Velez (Harvard)</p></li></ul><ul><li><p>Implicit Regularization in Deep Learning May Not Be Explainable by Norms</p><p> Noam Razin (Tel Aviv University) · Nadav Cohen (Tel Aviv University)</p></li></ul><ul><li><p>Parameterized Explainer for Graph Neural Network</p><p> Dongsheng Luo (The Pennsylvania State University) · Wei Cheng (NEC Labs America) · Dongkuan Xu (The Pennsylvania State University) · Wenchao Yu (UCLA) · Bo Zong (NEC Labs) · Haifeng Chen (NEC Labs America) · Xiang Zhang (The Pennsylvania State University)</p></li></ul><ul><li><p>PGM-Explainer: Probabilistic Graphical Model Explanations for Graph Neural Networks</p><p> Minh N Vu (University of Florida) · My T. Thai (University of Florida)</p></li></ul><ul><li><p>Can Implicit Bias Explain Generalization? 
Stochastic Convex Optimization as a Case Study</p><p> Assaf Dauber (Tel-Aviv University) · Meir Feder (Tel-Aviv University) · Tomer Koren (Tel Aviv University & Google) · Roi Livni (Tel Aviv University)</p></li></ul><ul><li><p>Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability</p><p> Christopher Frye (Faculty) · Colin Rowat (University of Birmingham) · Ilya Feige (Faculty)</p></li></ul><ul><li><p><strong>【看看interpre-2】</strong> How Can I Explain This to You? An Empirical Study of Deep Neural Network Explanation Methods</p><p> Jeya Vikranth Jeyakumar (University of California, Los Angeles) · Joseph Noor (University of California, Los Angeles) · Yu-Hsi Cheng (UCLA) · Luis Garcia (University of California, Los Angeles) · Mani Srivastava (UCLA)</p></li></ul><ul><li><p>Explainable Voting</p><p> Dominik Peters (Carnegie Mellon University) · Ariel Procaccia (Harvard University) · Alexandros Psomas (Purdue University) · Zixin Zhou (Peking University)</p></li></ul><ul><li><p>What Did You Think Would Happen? Explaining Agent Behaviour through Intended Outcomes</p><p> Herman Ho-Man Yau (University of Surrey) · Chris Russell (The Alan Turing Institute/ The University of Surrey) · Simon Hadfield (University of Surrey)</p></li></ul><ul><li><p>Margins are Insufficient for Explaining Gradient Boosting</p><p> Allan Grønlund (Aarhus University, MADALGO) · Lior Kamma (Aarhus University) · Kasper Green Larsen (Aarhus University)</p></li></ul><ul><li><p>Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models</p><p> Tom Heskes (Radboud University Nijmegen) · Evi Sijben (Radboud University) · Ioan Gabriel Bucur (Radboud University Nijmegen) · Tom Claassen (Radboud University Nijmegen)</p></li></ul><h2 id="Autoencoder"><a href="#Autoencoder" class="headerlink" title="Autoencoder"></a>Autoencoder</h2><ul><li><p>Implicit Rank-Minimizing Autoencoder</p><p> Li Jing (Facebook AI Research) · Jure Zbontar (Facebook) · yann lecun (Facebook)</p></li></ul><ul><li><p>Swapping Autoencoder for Deep Image Manipulation</p><p> Taesung Park (UC Berkeley) · Jun-Yan Zhu (Adobe, CMU) · Oliver Wang (Adobe Research) · Jingwan Lu (Adobe Research) · Eli Shechtman (Adobe Research, US) · Alexei Efros (UC Berkeley) · Richard Zhang (Adobe)</p></li></ul><ul><li><p><strong>【看看other-5】</strong> Hierarchical Quantized Autoencoders</p><p> Will Williams (Speechmatics) · Sam Ringer (Speechmatics) · Tom Ash (Speechmatics) · David MacLeod (Speechmatics) · Jamie Dougherty (Speechmatics) · John Hughes (Speechmatics)</p></li></ul><ul><li><p>Regularized linear autoencoders recover the principal components, eventually</p><p> Xuchan Bao (University of Toronto) · James Lucas (University of Toronto) · Sushant Sachdeva (University of Toronto) · Roger Grosse (University of Toronto)</p></li></ul><ul><li><p>Dirichlet Graph Variational Autoencoder</p><p> Jia Li (The Chinese University of Hong Kong) · Jianwei Yu (CUHK) · Jiajin Li (The Chinese University of Hong Kong) · Honglei Zhang (Georgia Institute of Technology) · Kangfei Zhao (The Chinese University of Hong Kong) · Yu Rong (Tencent AI Lab) · Hong Cheng (The Chinese University of Hong Kong) · Junzhou Huang (University of Texas at Arlington / Tencent AI Lab)</p></li></ul><ul><li><p>NVAE: A Deep Hierarchical Variational Autoencoder</p><p> Arash Vahdat (NVIDIA) · Jan Kautz (NVIDIA)</p></li></ul><ul><li><p>Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders</p><p> Masha Itkina 
(Stanford University) · Boris Ivanovic (Stanford University) · Ransalu Senanayake (Stanford University) · Mykel J Kochenderfer (Stanford University) · Marco Pavone (Stanford University)</p></li></ul><ul><li><p>Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels</p><p> Yi Zhou (University of Southern California) · Chenglei Wu (Facebook) · Zimo Li (University of Southern California) · Chen Cao (Snap Inc.) · Yuting Ye (Facebook Reality Labs) · Jason Saragih (Facebook) · Hao Li (Pinscreen/University of Southern California/USC ICT) · Yaser Sheikh (Facebook Reality Labs)</p></li></ul><ul><li><p>Recursive Inference for Variational Autoencoders</p><p> Minyoung Kim (Samsung AI Center Cambridge) · Vladimir Pavlovic (Rutgers University)</p></li></ul><ul><li><p><strong>【看看other-2】</strong> The Autoencoding Variational Autoencoder</p><p> Taylan Cemgil (DeepMind) · Sumedh Ghaisas (DeepMind) · Krishnamurthy Dvijotham (DeepMind) · Sven Gowal (DeepMind) · Pushmeet Kohli (DeepMind)</p></li></ul><ul><li><p>Autoencoders that don’t overfit towards the Identity</p><p> Harald Steck (Netflix)</p></li></ul><h2 id="clustering"><a href="#clustering" class="headerlink" title="clustering"></a>clustering</h2><ul><li><p>Deep Subspace Clustering with Data Augmentation</p><p> Mahdi Abavisani (Rutgers, The State University of New Jersey) · Alireza Naghizadeh (Rutgers University) · Dimitris Metaxas (Rutgers University) · Vishal Patel (Johns Hopkins University)</p></li></ul><ul><li><p>Bandit-PAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits</p><p> Mo Tiwari (Stanford University) · Martin Zhang (Harvard University) · James J Mayclin (Stanford University) · Sebastian Thrun (Stanford University) · Chris Piech (Stanford) · Ilan Shomorony (University of Illinois at Urbana Champaign)</p></li></ul><ul><li><p>Self-Supervised Learning by Cross-Modal Audio-Video Clustering</p><p> Humam Alwassel (KAUST) · Dhruv Mahajan (Facebook) · Bruno Korbar (Facebook) · Lorenzo Torresani (Facebook AI) · Bernard Ghanem (KAUST) · Du Tran (Facebook AI)</p></li></ul><ul><li><p>Near-Optimal Comparison Based Clustering</p><p> Michaël Perrot (Max Planck Institute for Intelligent Systems) · Pascal Esser (Technical University of Munich) · Debarghya Ghoshdastidar (Technical University Munich)</p></li></ul><ul><li><p>Graduated Assignment for Joint Multi-Graph Matching and Clustering with Application to Unsupervised Graph Matching Network Learning</p><p> Runzhong Wang (Shanghai Jiao Tong University) · Junchi Yan (Shanghai Jiao Tong University) · Xiaokang Yang (Shanghai Jiao Tong University)</p></li></ul><ul><li><p>Scalable Approximation Algorithm for Fair k−center Clustering</p><p> Elfarouk Harb (Hong Kong University of Science and Technology) · Ho Shan Lam (The Hong Kong University of Science and Technology)</p></li></ul><ul><li><p>Deep Transformation-Invariant Clustering</p><p> Tom Monnier (École des ponts Paristech) · Thibault Groueix (École des ponts ParisTech) · Mathieu Aubry (École des ponts ParisTech)</p></li></ul><ul><li><p>Efficient Clustering for Stretched Mixtures: Landscape and Optimality</p><p> Kaizheng Wang (Columbia University) · Yuling Yan (Princeton University) · Mateo Diaz (Cornell University)</p></li></ul><ul><li><p>Efficient Clustering Based On A Unified View Of K-means And Ratio-cut</p><p> Shenfei Pei (Northwestern Polytechnical University) · Feiping Nie (University of Texas Arlington) · Rong Wang (Northwestern Polytechnical University) · Xuelong Li (Northwestern Polytechnical 
University)</p></li></ul><ul><li><p>Adversarial Learning for Robust Deep Clustering</p><p> Xu Yang (Xidian University) · Cheng Deng (Xidian University) · Kun Wei (Xidian University) · Junchi Yan (Shanghai Jiao Tong University) · Wei Liu (Tencent AI Lab)</p></li></ul><ul><li><p>Sliding Window Algorithms for k-Clustering Problems</p><p> Michele Borassi (Google Switzerland GmbH) · Alessandro Epasto (Google) · Silvio Lattanzi (Google Research) · Sergei Vassilvitskii (Google) · Morteza Zadimoghaddam (Google Research)</p></li></ul><ul><li><p>From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering</p><p> Ines Chami (Stanford University) · Albert Gu (Stanford) · Vaggos Chatziafratis (Stanford University, California) · Christopher Ré (Stanford)</p></li></ul><ul><li><p>Probabilistic Fair Clustering</p><p> Seyed Esmaeili (University of Maryland, College Park) · Brian Brubach (University of Maryland) · Leonidas Tsepenekas (University of Maryland) · John Dickerson (University of Maryland)</p></li></ul><ul><li><p>Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering</p><p> Meng Liu (Purdue University) · David Gleich (Purdue University)</p></li></ul><ul><li><p>Fair Hierarchical Clustering</p><p> Sara Ahmadian (Google Research) · Alessandro Epasto (Google) · Marina Knittel (University of Maryland, College Park) · Ravi Kumar (Google) · Mohammad Mahdian (Google Research) · Benjamin Moseley (Carnegie Mellon University) · Philip Pham (Google) · Sergei Vassilvitskii (Google) · Yuyan Wang (Carnegie Mellon University)</p></li></ul><ul><li><p>Partially View-aligned Clustering</p><p> Zhenyu Huang (Sichuan University) · Peng Hu (Institute for Infocomm Research, A<em>STAR) · Joey Tianyi Zhou (IHPC, A</em>STAR) · Jiancheng Lv (Machine Intelligence Laboratory College of Computer Science, Sichuan University) · Xi Peng (Institute for Infocomm, Research Agency for Science, Technology and Research (A*STAR) Singapore)</p></li></ul><ul><li><p>Differentially Private Clustering: Tight Approximation Ratios</p><p> Badih Ghazi (Google) · Ravi Kumar (Google) · Pasin Manurangsi (Google)</p></li></ul><ul><li><p>On the Power of Louvain for Graph Clustering</p><p> Vincent Cohen-Addad (CNRS & Sorbonne Université) · Adrian Kosowski (NavAlgo) · Frederik Mallmann-Trenn (King’s College London) · David Saulpic (Ecole normale supérieure)</p></li></ul><ul><li><p>SMYRF - Efficient attention using asymmetric clustering</p><p> Giannis Daras (National Technical University of Athens) · Nikita Kitaev (University of California, Berkeley) · Augustus Odena (Google Brain) · Alexandros Dimakis (University of Texas, Austin)</p></li></ul><ul><li><p>Higher-Order Spectral Clustering of Directed Graphs</p><p> Valdimar Steinar Ericsson Laenen (FiveAI) · He Sun (School of Informatics, The University of Edinburgh)</p></li></ul><h1 id="data-augmentation"><a href="#data-augmentation" class="headerlink" title="data augmentation"></a>data augmentation</h1><ul><li><p>Maximum-Entropy Adversarial Data Augmentation for Improved Generalization and Robustness</p><p> Long Zhao (Rutgers University) · Ting Liu (Google) · Xi Peng (University of Delaware) · Dimitris Metaxas (Rutgers University)</p></li></ul><ul><li><p>A Group-Theoretic Framework for Data Augmentation</p><p> Shuxiao Chen (University of Pennsylvania) · Edgar Dobriban (University of Pennsylvania) · Jane Lee (University of Pennsylvania)</p></li></ul><ul><li><p>Post-training Iterative Hierarchical Data Augmentation for Deep Networks</p><p> Adil Khan 
(Innopolis University) · Khadija Fraz (Hazara University)</p></li></ul><ul><li><p>Heavy-tailed Representations, Text Polarity Classification & Data Augmentation</p><p> Hamid JALALZAI (Télécom ParisTech) · Pierre Colombo (Telecom ParisTech) · Chloé Clavel (Telecom-ParisTech, Paris, France) · Eric Gaussier (Université Joseph Fourier, Grenoble) · Giovanna Varni (Telecom ParisTec) · Emmanuel Vignon (IBM) · Anne Sabourin (LTCI, Telecom ParisTech, Université Paris-Saclay)</p></li></ul><ul><li><p>Unsupervised Data Augmentation for Consistency Training</p><p> Qizhe Xie (CMU, Google Brain) · Zihang Dai (Carnegie Mellon University) · Eduard Hovy (CMU) · Thang Luong (Google Brain) · Quoc V Le (Google)</p></li></ul><ul><li><p>Exemplar VAEs for Exemplar based Generation and Data Augmentation</p><p> Sajad Norouzi (University of Toronto / Vector Institute) · David J Fleet (University of Toronto) · Mohammad Norouzi (Google Brain)</p></li></ul><ul><li><p>Practical automated data augmentation with a reduced search space</p><p> Ekin Dogus Cubuk (Google Brain) · Barret Zoph (Google Brain) · Jon Shlens (Google Research) · Quoc V Le (Google)</p></li></ul><ul><li><p>Counterfactual Data Augmentation using Locally Factored Dynamics</p><p> Silviu Pitis (University of Toronto) · Elliot Creager (University of Toronto) · Animesh Garg (Univ. of Toronto, Vector Institute, Nvidia)</p></li></ul><ul><li><p>Deep Subspace Clustering with Data Augmentation</p><p> Mahdi Abavisani (Rutgers, The State University of New Jersey) · Alireza Naghizadeh (Rutgers University) · Dimitris Metaxas (Rutgers University) · Vishal Patel (Johns Hopkins University)</p></li></ul><h2 id="有点意思"><a href="#有点意思" class="headerlink" title="有点意思"></a>有点意思</h2><ul><li><p>Learning Loss for Test-Time Augmentation</p><p> Ildoo Kim (Kakao Brain) · Younghoon Kim (Sungshin Women’s University) · Sungwoong Kim (Kakao Brain)</p></li><li><p>Predicting Training Time Without Training</p><p> Luca Zancato (University of Padova) · Alessandro Achille (Amazon Web Services) · Avinash Ravichandran (AWS) · Rahul Bhotika (Amazon) · Stefano Soatto (UCLA)</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Monitoring Time Series (Especially Monitoring Time-Series Data for IT Operations)</title>
<link href="/uncategorized/surveys/monitoring_ts/"/>
<url>/uncategorized/surveys/monitoring_ts/</url>
<content type="html"><![CDATA[<p>总结几个点:</p><ol><li>数据集名称与链接</li><li>数据集背景:数据集的描述对象是什么</li><li>数据集任务:该数据集采集的时候的原始任务是什么</li><li>数据集格式(可选的)</li></ol><span id="more"></span><h2 id="运维时间序列"><a href="#运维时间序列" class="headerlink" title="运维时间序列"></a>运维时间序列</h2><ol><li><p><a href="https://catalog.data.gov/dataset/health-monitoring-and-prognostics-for-computer-servers">计算机服务器的运行状况监视和预测</a></p><ul><li><p>描述:关键任务系统的诊断解决方案需要一种全面的方法来主动检测和隔离故障,推荐和指导基于状况的维护措施,<br>并实时估算关键组件和相关子系统的剩余使用寿命。一个主要的挑战是将预测的优势扩展到包括计算机服务器和其他电子组件。<br>预测能力的关键推动因素是监视与执行组件和子系统的运行状况有关的时间序列信号。时间序列信号使用模式识别进行实时处理,<br>以进行主动异常检测和剩余使用寿命估算。将提供使用模式识别技术来早期检测已知会导致电子系统故障的多种机制的示例,<br>包括:环境问题;软件老化;传感器降级或故障;硬件组件的退化;机械,电子和光学互连的性能下降。<br>预后模式分类有助于大幅提高组件的可靠性裕度和系统可用性目标,同时减少昂贵的“无故障”事件的来源,<br>这些事件已成为重要的保修成本问题。</p></li><li><p>任务:anomaly detection, time series prediction</p></li><li><p><a href="https://datasetsearch.research.google.com/search?query=monitoring,%20time%20series&docid=cp3L+CfAjd47GckZAAAAAA==">google link</a></p></li></ul></li><li><p><a href="https://www.kaggle.com/boltzmannbrain/nab">Numenta异常基准(NAB)</a></p><ul><li>描述:58个时间序列数据文件的NAB语料库旨在为流异常检测中的研究提供数据。它由真实世界和人为的时间序列数据组成,<br>其中包含标记的异常行为时期。数据是有序的,带有时间戳的单值指标。除非另有说明,否则所有数据文件均包含异常。<br>大多数数据来自各种来源,例如AWS服务器指标,Twitter量,广告点击指标,流量数据等。</li><li>任务:anomaly detection, prediction</li><li><a href="https://datasetsearch.research.google.com/search?query=cpu,%20time%20series&docid=iXJXcx2EBRKHZSc8AAAAAA==">google link</a></li></ul></li><li><p><strong>挺好的数据集</strong>企业级应用程序运维 </p><p> <a href="https://www.kaggle.com/anomalydetectionml/rawdata">原始数据</a>,<br> <a href="https://www.kaggle.com/anomalydetectionml/features">feature data</a></p><ul><li>描述:一个企业级软件的运维时间序列,原始数据链接中是没有正常、异常标签的原始数据,数据更加详细。features中是经过处理和标签后的数据,<br>将原始数据中的同一时刻的数据全都拼接在了一起,形成了非常高维度的数据。</li><li>任务:异常检测</li></ul></li><li><p>Azure 服务器CPU时间序列 <a href="https://www.kaggle.com/amcs1729/azure-data">link</a></p><ul><li>描述:某个Azure服务器的CPU运转时间序列,单维时间序列。</li><li>任务:时间序列预测</li></ul></li></ol><ol start="5"><li><p>应用程序运维 <a href="https://www.kaggle.com/wolfgangb33r/usercount">link</a></p><ul><li>描述:某个应用程序在一段时间内的一些统计信息,</li></ul><p> <strong>数据包括:</strong> 时间戳、距离上次记录时间内的用户访问次数、距离上次记录时间内产生的会话次数、距离上次记录时间新增的用户数、<br> 距离上次记录时间内的应用崩溃次数。</p><ul><li>任务:回归、预测</li></ul></li></ol><ol start="6"><li>服务器运维日志,运维日志 <a href="https://www.kaggle.com/kartikjaspal/server-logs-suspicious">link</a><ul><li>描述:服务器攻击数据集,记录了1. 攻击时间和持续时间,2. 源IP和目的IP,3. 数据包,字节,流和标志,4.类型,ID和标签/类</li><li>任务:分类、异常检测</li></ul></li></ol><ol start="7"><li><p><a href="https://zenodo.org/record/3653909#.X0hlBMhLiUk">电梯设备的运维数据集</a></p><ul><li><p>描述:华为慕尼黑研究中心的公开(匿名)预测性维护数据集。来自各种IoT传感器的数据集,用于电梯行业的预测性维护。<br>该数据可用于电梯门的预测维护,以减少计划外的停机并最大程度地延长设备使用寿命。数据集包含操作数据,<br>该数据以时间序列的形式在建筑物的高峰和夜间电梯使用中以4Hz采样(16:30到23:30之间)。对于电梯轿厢门,<br>我们考虑的系统是:机电传感器(门球轴承传感器),环境(湿度)和物理(振动)。</p></li><li><p>任务:预测(无标签)</p></li><li><p>形式:time series,无标签,维度为3,有相关论文引用</p></li><li><p><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=lvVz38vzKrmGqZ48AAAAAA==">google link</a></p></li></ul></li></ol><ol start="8"><li><p><strong>挺好的数据集</strong> 大数据平台运维数据集 <a href="https://zenodo.org/record/3549604#.X0hpFMhLiUk">link</a></p><ul><li>描述:<a href="https://link.springer.com/chapter/10.1007/978-3-030-44769-4_13">相关论文</a>, OpenStack是一个云操作系统,<br>它控制整个数据中心内的大型计算,存储和网络资源池,所有这些资源均通过具有通用身份验证机制的API进行管理和配置。<br>出于生成数据的目的,它包含一个名为wally-113的控制节点和四个计算节点:wally-122,wally-123,wally-124<br>和wally- 117。它已部署在群集的裸机节点上,每个节点具有16 GB的RAM,3个1TB磁盘和2个1Gbit以太网NIC。<br>将三个硬盘组合到软件RAID 5中以实现数据冗余。</li></ul><p> <strong>工作负载与故障类型:</strong> 1. 
Create and delete server; 2. Create and delete image; 3. Create and delete network;<br> <strong>数据:</strong> 1. Metrics: CPU, MEM and load of the machine (either controller or the compute nodes). 2. Logs: 每个物理主机与上面的项目所生成的日志 3. Traces: It generates one trace per request, that goes through all involved services, and builds a tree of calls which captures a workflow of service invocations, request types: HTTP, RPC, DB API, Driver.</p><ul><li>任务:异常检测</li><li><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=Vac33QBm10kwUw0FAAAAAA==">google link</a></li></ul></li><li><p>智能手机上的恶意软件与恶意活动记录 <a href="http://bigdata.ise.bgu.ac.il/sherlock/">link</a></p><ul><li>描述:该数据集本质上是一个庞大的时间序列数据集,几乎涵盖了可以从Samsung Galaxy S5智能手机进行采样的每种软件<br>和硬件传感器的类型,而无需root特权。数据集包含超过100亿条数据记录中的6,000亿个数据点。<br>我们提供了明确的标签(时间戳和描述),可以准确捕获设备上的恶意软件何时执行其恶意活动。通过这些标签,您可以将数据集用作机器学习算法的基准。</li></ul><p> <strong>数据包括:</strong>呼叫/ SMS日志、位置、WiFi信号强度、网络统计信息、其他…(参见数据集描述页面)</p><ul><li>任务:恶意活动检测、恶意软件检测</li><li><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=5CKm58KTxK6n3a22AAAAAA==">google link</a>, <a href="https://www.impactcybertrust.org/dataset_view?idDataset=1258">other link</a></li></ul></li><li><p>视频应用的服务质量检测数据集 <a href="https://zenodo.org/record/3459164#.X0h2bMhLiUk">link</a></p><ul><li>描述:场景为在用户设备上使用LTE网络访问一个基于GStreamer的MPEG-DASH视频软件时的应用服务质量。<br>用户首先要从分布式DASH服务器上下载MPEG-DASH视频文件,过程中收集下述的信息:</li></ul><ol><li>Date: date when the data is collected;</li><li>Player: type of the player (in this case it is always “GStreamer”);</li><li>Num: identifier of the player;</li><li>URLVid: URL of the MPD file;</li><li>Latency: latency experienced by the player;</li><li>BW: bandwidth experienced by the player;</li><li>Quality: chosen DASH video representation;<br>在实验过程中,其他播放器运行以在类似CDN的服务器上生成逼真的媒体流流量。 这些玩家通过遵循Poisson或Pareto分布开始游戏。</li></ol><ul><li><p>任务:时间序列预测,网络承载能力预测</p></li><li><p><a href="https://zenodo.org/record/3459164#.X0h2bMhLiUk">google link</a> </p></li></ul></li><li><p>加密劫持攻击时间序列数据集 <a href="https://www.kaggle.com/keshanijayasinghe/cryptojacking-attack-timeseries-dataset">link</a></p><ul><li>描述:加密劫持是未经授权使用他人的计算机来开采加密货币。加密挖矿代码在后台运行,<br>因为毫无戒心的受害者会正常使用他们的计算机。他们可能会注意到的唯一迹象是性能降低或执行滞后。<br>最近,由于服务器的强大计算能力和许多配置不当的服务器设置,攻击者已将目标从个人计算机转移到云服务器。<br>尽管已经设计了许多统计方法来进行密码劫持攻击检测,但是设计具有低计算开销的实时检测器仍然是主要问题之一。<br>另一方面,对新检测算法和技术的评估在很大程度上依赖于设计良好的数据集的存在。</li></ul><p><strong>数据包括:</strong>三个csv文件(异常、正常、完整),每个文件分别有关于服务器的多种指标(CPU, MEM, DISK)</p><ul><li><p>任务:异常检测(但是好像没有给出异常的形式、对应每个时间戳的标签)</p></li><li><p><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=Ylcq22ZqJG16NqBSAAAAAA==">google link</a></p></li></ul></li><li><p><strong>不错的数据集</strong> Antarex HPC (High Performance computer) 故障数据集 <a href="https://zenodo.org/record/2553224#.X0iokshLiUk">link</a></p><ul><li>描述:Antarex数据集包含在进行故障注入时从位于苏黎世联邦理工学院的同名实验HPC系统收集的跟踪数据,目的是对HPC系统进行基于机器学习的故障检测研究。<br>为了获取数据,我们执行了基准测试应用程序,并同时通过专用程序在特定时间在系统中注入了错误,从而触发了应用程序行为的异常。我们的数据集中涵盖了广泛的故障,<br>从硬件故障到配置错误,最后是由其他过程的干扰导致的性能异常。这是通过作者开发的FINJ故障注入工具实现的。</li></ul><p><strong>数据集包含两种类型的数据:</strong> 一种类型的数据是指一系列CSV文件,每个文件都包含一组通过LDMS HPC监视框架采样的系统性能指标。<br>另一种类型是指日志文件,详细描述数据集中每个时间点的系统状态(即当前正在运行的基准测试应用程序或已注入的故障程序)。<br>这种结构使研究人员可以对数据集进行广泛的研究。而且,由于我们是通过流式传输连续数据收集数据集的,<br>因此基于该数据集的任何研究都可以轻松地以在线方式在真实的HPC系统上重现。<br><strong>数据集分为两部分:</strong>第一部分仅包括与CPU和内存相关的基准测试应用程序和故障程序,而第二部分则仅与硬盘驱动器相关。</p><ul><li><p>数据描述:注入异常用的脚本也在数据集中可以找到</p></li><li><p>任务:异常检测</p></li><li><p><a 
href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=8mGvdORXdb72rgUVAAAAAA==">google link</a></p></li></ul></li></ol><h2 id="一般的监视数据"><a href="#一般的监视数据" class="headerlink" title="一般的监视数据"></a>一般的监视数据</h2><ol><li><p><a href="https://catalog.data.gov/dataset/scalable-time-series-change-detection-for-biomass-monitoring-using-gaussian-process">生物量监测数据集, TIME SERIES CHANGE DETECTION FOR BIOMASS MONITORING</a></p><ul><li>描述:生物量监测,特别是检测某个地理区域的生物量或植被变化,对于研究系统的碳循环至关重要,并且在理解气候变化及其影响方面具有重要意义。</li><li>任务:time series change point detection</li><li><a href="https://datasetsearch.research.google.com/search?query=monitoring,%20time%20series&docid=tv/EXqiscaRktAxXAAAAAA==">google link</a></li></ul></li><li><p>网络流量<br><a href="https://www.kaggle.com/crawford/computer-network-traffic">https://www.kaggle.com/crawford/computer-network-traffic</a></p></li><li><p>水泵故障<br><a href="https://www.kaggle.com/nphantawee/pump-sensor-data">https://www.kaggle.com/nphantawee/pump-sensor-data</a></p></li><li><p>SLA管理<br><a href="https://www.kaggle.com/imenbenyahia/clearwatervnf-virtual-ip-multimedia-ip-system?select=bono-io.read_kbytes_sec.csv">https://www.kaggle.com/imenbenyahia/clearwatervnf-virtual-ip-multimedia-ip-system?select=bono-io.read_kbytes_sec.csv</a></p></li></ol><ol start="5"><li><p>基于多传感器数据的液压试验台的状态评估 <a href="https://www.kaggle.com/jjacostupa/condition-monitoring-of-hydraulic-systems">link</a></p><ul><li><p>描述:该数据集是通过液压试验台实验获得的。该试验台由一个主要工作装置和一个辅助冷却过滤回路组成,它们通过油箱[1],[2]连接。该系统周期性地重复恒定的负载循环(持续时间为60秒),<br>并在定量改变四个液压组件(冷却器,阀门,泵和蓄能器)的状态的同时,测量压力,体积流量和温度等过程值。</p></li><li><p>任务:异常检测、分类、预测</p></li><li><p><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=pCOC5zkeO5a8CSeGAAAAAA==">google link</a></p></li></ul></li><li><p>工业流水线刀片磨损情况数据集 <a href="https://www.kaggle.com/inIT-OWL/one-year-industrial-component-degradation?select=01-04T184424_001_mode1.csv">link</a></p><ul><li>好像没有明确label,需要联系作者</li></ul></li></ol><h2 id="其他有趣的数据集"><a href="#其他有趣的数据集" class="headerlink" title="其他有趣的数据集"></a>其他有趣的数据集</h2><ol><li>边缘计算数据集<br>边缘服务器的位置与用户的位置<br><a href="https://www.kaggle.com/salmaneunus/edge-computing-edge-servers">https://www.kaggle.com/salmaneunus/edge-computing-edge-servers</a>?</li></ol><ol start="2"><li><p><a href="https://www.kaggle.com/ntnu-testimon/paysim1">交易欺诈检测数据集</a></p><ul><li>描述:PaySim基于从一个非洲国家/地区实施的移动货币服务的一个月财务日志中提取的真实交易样本来模拟移动货币交易。原始日志是由一家跨国公司提供的,该公司是移动金融服务的提供商,该服务目前在全球14个以上的国家/地区中运行。</li><li>任务:交易欺诈检测</li><li>形式:raw data, 包含交易金额、交易客户、初始余额与新余额、标签(是否为欺诈交易)</li><li><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=LUidW2DnXG+T/EA2AAAAAA==">google link</a></li></ul></li></ol><ol start="3"><li><p>Data for: A study on leading machine learning techniques for high order fuzzy time series forecasting <a href="https://data.mendeley.com/datasets/xc6c8xr564/1">link</a></p><ul><li>描述:14个子数据集的合集,具体参见 <a href="https://www.sciencedirect.com/science/article/pii/S095219761930226X#fig5">link</a></li><li>任务:时间序列预测</li><li><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=6GYYktv0MMeQl5jGAAAAAA==">google link</a></li></ul></li><li><p>BPI挑战 <a href="https://www.narcis.nl/dataset/RecordID/uuid%3Ac3e5d162-0cfd-4bb0-bd82-af5268819c35/Language/EN">link</a></p><ul><li>描述: <strong>The challenge is to design a (draft) predictive model, which can be used to implement in a BI environment.</strong> </li></ul><p> <strong>The purpose of this predictive model will be to support Business 
Change Management in implementing software releases</strong><br> <strong>with less impact on the Service Desk and/or IT Operations.</strong> We have prepared several case-files with anonymous<br> information from Rabobank Netherlands Group ICT for this challenge. The files contain record details from an ITIL<br> Service Management tool called HP Service Manager. We provide you with extracts in CSV with the Interaction-,<br> Incident- or Change-number as case ID. Next to these case-files, we provide you with an Activity-log, related to<br> the Incident-cases. There is also a document detailing the data in the CSV file and providing background to the<br> Service Management tool.</p><ul><li><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=ICMC800RzMnUdhDMAAAAAA==">google link</a></li></ul></li></ol><ol start="5"><li><pre><code>主机网络流量时间序列2019/01 [link](https://zenodo.org/record/2669079#.X0i2AMhLiUk)</code></pre></li></ol><pre><code>+ 描述:数据集于2019年 1 月在一个月的时间内收集。收集IP流的观察点位于大学校园网络的边界。校园大学网络可使用16 CIDR IPv4网络范围,并且包含从连接宿舍的部分(通过服务器部分)到包含大学管理人员的工作站的部分在内的各种网络部分。用于创建数据集的原始IP流的大小超过860GB。我们数据集中的主机由其源IPv4地址标识。 + 数据集包含以下变量: - 聚合 -使用均值/最大/最小聚合函数从一小时不相交的窗口中聚合的五分钟总体积中创建 - 流数(FL) -给定源IP的流数 - 数据包数(PKT) -给定源IP的数据包数 - 字节数(BYT) -给定源IP的数据包数 - 流量持续时间(DUR) -平均流量持续时间(以秒为单位) - 不重复计数 -使用均值/最大值/最小值聚合函数,在一小时不相交的窗口中聚合的五分钟窗口中每个变量的不重复值计数 - 对等体数(PEER) -给定源IP的不同通信对等体数 - 端口数(PORTS) -给定源IP的不同目标端口数 - 协议数(PROTO) -给定源IP的不同通信协议数 - AS号数量(AS) -给定源IP的不同目标AS号数量 - 国家数量(CTRY) -给定源IP的不同目标国家/地区的数量 - 标签 - 范围(RNG) -主机所属的网络范围(匿名) - 单位(UNT) -拥有网络范围的管理单位 - 子单位(SUB-UNT) -该单位的子单位 + 但是好像没有规定具体的任务,label都是主机所属的单位,难道是分类</code></pre><ol start="6"><li><p>用于应用程序级监视的不同多核处理器对运行时开销的影响的比较 <a href="https://zenodo.org/record/7619#.X0i4jMhLiUk">link</a></p><ul><li>描述:连续运行的软件系统需要应用程序级监视,以在运行时保持其性能和可用性。软件系统的性能监视需要将时间序列<br>数据存储在监视日志或流中。这样的监视可能会导致被监视系统的大量运行时开销。在本文中,我们评估了多核处理器对<br>Kieker应用程序级监视框架的开销的影响。我们将监控开销分为三部分,并使用微基准对受控实验室进行广泛的实验,<br>以量化在受控和可重复条件下监控开销的这些部分的结果。我们的实验表明,通过异步写入监视日志,可以在多核处理器上进<br>一步减少Kieker框架已经很低的开销。</li><li>提供了某个实验在多个CPU上的记录,包括AMD, Intel, T6360, 任务未知</li></ul></li><li><p>大型RAID磁盘系统的磁盘替换日志文件示例,用于进行预测性维护分析 <a href="https://zenodo.org/record/2580162#.X0i568hLiUk">link</a></p><ul><li>描述:在一个磁盘系统中更换其中的一个磁盘,并记录系统的日志文件。观察磁盘热插拔对系统运行的影响。</li><li>官方提供的一些观察<ol><li>随着时间的流逝,错误会在分组的群集中发生,从而可以进行预测性维护。</li><li>时间和空间上的粒度:热磁盘交换不会干扰RAID系统中的系统可用性。 全机架交换可以。 更糟糕的是,通过在某个特<br>定时间购买完整的,庞大的系统,一家公司将在系统生命周期的最后走向创伤性事件。 在较小的时间间隔内安装较小的系<br>统组件(阅读:机架)不会危及操作的连续性。</li></ol></li></ul></li><li><p>Application Detection through Rich Monitoring Data <a href="https://springernature.figshare.com/articles/Artifact_for_Taxonomist_Application_Detection_through_Rich_Monitoring_Data/6384248">link</a></p><ul><li>描述:The related study develops a technique named ‘Taxonomist’ to identify applications running on supercomputers,<br>using machine learning to classify known applications and detect unknown applications. </li></ul><p> <strong>The technique uses monitoring data such as CPU and memory usage metrics and hardware counters collected from</strong><br> <strong>supercomputers. 
The aims of this technique include providing an alternative to ‘naive’ application detection</strong><br> <strong>methods based on names of processes and scripts, and helping prevent fraud, waste and abuse in supercomputers.</strong></p><ul><li>任务:应该是分类,还有可能是one-class classification</li><li><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=95bGlb7wMSmRtTfiAAAAAA==">google link</a></li></ul></li><li><p>局域网网络稳定性 - 测量无线与基于以太网的局域网的响应时间 <a href="https://www.kaggle.com/garystafford/ping-data">link</a></p><ul><li>描述:目的是在连接到Internet时捕获并可视化LAN网络的时序变化。为此,以10秒为间隔从网络上的一系列IoT收集设备收集ping<br>响应时间。定时是从2.4 GHz无线和100 Mbps以太网收集的。测量了从设备到本地Internet路由器以及Internet上第一跳服务器的时间。<br>时间序列数据中的间隙表示LAN中断或路由器访问Internet的能力中断。<br>ping实用程序使用ICMP协议的强制性ECHO_REQUEST数据报ECHO_RESPONSE从主机或网关获取ICMP 。</li><li>没有给定label,但是应该是一个分类问题</li><li><a href="https://datasetsearch.research.google.com/save?query=coronavirus%20covid-19&docid=J1jGnseIcAjUTiedAAAAAA==">google link</a></li></ul></li></ol><hr><p>SMAP (Soil Moisture Active Passive satellite) and MSL (Mars Science Laboratory rover) are two public datasets from NASA [6]. Kyle Hundman, Valentino Constantinou, Christopher Laporte, Ian Colwell, and Tom Soderstrom. 2018. Detecting Spacecraft Anomalies Using LSTMs and Nonparametric Dynamic Thresholding. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’18). ACM, New York, NY, USA, 387–395.</p>]]></content>
<tags>
<tag> survey </tag>
<tag> time series </tag>
<tag> dataset </tag>
</tags>
</entry>
<entry>
<title>Related Papers in SIGKDD 2020 (2020.08.23)</title>
<link href="/uncategorized/paperlistfile/KDD2020/"/>
<url>/uncategorized/paperlistfile/KDD2020/</url>
<content type="html"><![CDATA[<p>Accepted paper list: <a href="https://www.kdd.org/kdd2020/accepted-papers">Link</a></p><span id="more"></span><h2 id="Time-Series"><a href="#Time-Series" class="headerlink" title="Time Series"></a>Time Series</h2><ul><li><p>A Geometric Approach to Time Series Chains Improves Robustness</p><p>Authors: Makoto Imamura: Tokai University; Takaaki Nakamura: Mitsubishi Electric Corporation; Eamonn Keogh: University of California - Riverside</p></li><li><p>Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks</p><p>Authors: Zonghan Wu: University of Technology Sydney; Shirui Pan: Monash University; Guodong Long: University of Technology Sydney; Jing Jiang: University of Technology Sydney; Xiaojun Chang: Monash University; Chengqi Zhang: University of Technology Sydney</p></li><li><p>Fast R-STL: Efficient and Robust Seasonal-Trend Decompositionfor Time Series with Complex Patterns</p><p>Authors: Qingsong Wen: Alibaba Group U.S.; Zhe Zhang: Alibaba Group U.S.; Yan Li: Alibaba Group U.S.; Liang Sun: Alibaba Group U.S.</p></li><li><p>Fitbit for Chickens? Time Series Data Mining Can Increase the Productivity of Poultry Farms</p><p>Authors: Alireza Abdoli: University of California Riverside; Sara Alaee: University of California Riverside; Shima Imani: University of California Riverside; Amy Murillo: University of California Riverside ; Alec Gerry: UC Riverside; Leslie Hickle: FarmSense Inc; Eamonn Keogh: UC Riverside</p></li><li><p>USAD : UnSupervised Anomaly Detection on multivariate time series</p><p>Authors: Julien Audibert: Orange EURECOM; Pietro Michiardi: EURECOM; Frédéric Guyard: Orange Labs; Sébastien Marti: Orange; Maria A. Zuluaga: EURECOM</p></li><li><p>HiTANet: Hierarchical Time-Aware Attention Networks for Risk Prediction on Electronic Health Records</p><p>Authors: Junyu Luo: The Pennsylvania State University; Muchao Ye: The Pennsylvania State University; Cao Xiao: IQVIA; Fenglong Ma: The Pennsylvania State University</p></li><li><p>Identifying Sepsis Subphenotypes via Time-Aware Multi-Modal Auto-Encoder</p><p>Authors: Changchang Yin: The Ohio State University; Ruoqi Liu: The Ohio State University; Dongdong Zhang: The Ohio State University; Ping Zhang: The Ohio State University</p></li><li><p>Local Motif Clustering on Time-Evolving Graphs</p><p>Authors: Dongqi Fu: University of Illinois at Urbana-Champaign; Dawei Zhou: University of Illinois at Urbana-Champaign; Jingrui He: University of Illinois at Urbana-Champaign</p></li><li><p>Multi-Source Deep Domain Adaptation with Weak Supervision for Time-Series Sensor Data</p><p>Authors: Garrett Wilson: Washington State University; Janardhan Rao Doppa: Washington State University; Diane J. 
Cook: Washington State University</p></li><li><p>Sliding Sketches: A Framework using Time Zones for Data Stream Processing in Sliding Windows</p><p>Authors: Xiangyang Gou: Peking University; Long He: Peking University; Yinda Zhang: Peking University; Ke Wang: Peking University; Xilai Liu: Peking University; Tong Yang: Peking University; Yi Wang: Southern University of Science and Technology; Bin Cui: Peking University</p></li></ul><ul><li><p>Attention based multi-modal new product sales time-series forecasting</p><p>Authors: Vijay Ekambaram: IBM Research; Kushagra Manglik: IBM Research; Sumanta Mukherjee: IBM Research; Surya Shravan Kumar Sajja: IBM Research; Satyam Dwivedi: IBM Research; Vikas Raykar: IBM Research</p></li></ul><ul><li><p>BusTr: predicting bus travel times from real-time traffic</p><p>Authors: Richard Barnes: UC Berkeley; Senaka Buthpitiya: Google Research; James Cook: N ne; Alex Fabrikant: Google Research; Andrew Tomkins: Google Research; Fangzhou Xu: Google Research</p></li></ul><ul><li><p>Calendar Graph Neural Networks for Modeling Time Structures in Spatiotemporal User Behaviors</p><p>Authors: Daheng Wang: University of Notre Dame; Meng Jiang: University of Notre Dame; Munira Syed: University of Notre Dame; Oliver Conway: Conde Nast; Vishal Juneja: Conde Nast; Sriram Subramanian: Conde Nast; Nitesh V. Chawla: University of Notre Dame</p></li></ul><ul><li><p>HetETA: Heterogeneous Information Network Embedding for Estimating Time of Arrival</p><p>Authors: Huiting Hong: AI Labs Didi Chuxing Beijing China ; Yucheng Lin: AI Labs Didi Chuxing Beijing China ; Xiaoqing Yang: AI Labs Didi Chuxing Beijing China ; Zang Li: AI Labs Didi Chuxing Beijing China ; Jieping Ye: AI Labs Didi Chuxing Beijing China ; Kun Fu: AI Labs Didi Chuxing Beijing China ; Zheng Wang: AI Labs Didi Chuxing Beijing China ; Xiaohu Qie: Technology Ecosystem Development Didi Chuxing Beijing China</p></li></ul><ul><li><p>Heidegger: Interpretable Temporal Causal Discovery</p><p>Authors: Mehrdad Mansouri: Simon Fraser University; Ali Arab: Simon Fraser University; Zahra Zohrevand: Simon Fraser University; Martin Eser: Simon Fraser University</p></li></ul><ul><li><p>LogPar: Logistic PARAFAC2 Factorization for Temporal Binary Data with Missing Values</p><p>Authors: Kejing Yin: Hong Kong Baptist University; Ardavan Afshar: Georgia Institute of Technology; Joyce Ho: Emory University; William Cheung: Hong Kong Baptist University; Chao Zhang: Georgia Institute of Technology; Jimeng Sun: University of Illinois Urbana-Champaign</p></li></ul><ul><li><p>Predicting Temporal Sets with Deep Neural Networks</p><p>Authors: Le Yu: Beihang University; Leilei Sun: Beihang University; Bowen Du: Beihang University; Chuanren Liu: University of Tennessee; Hui Xiong: Rutgers University; Weifeng Lv: Beihang University</p></li></ul><h2 id="missing-value-or-irregularly-sampled-time-series"><a href="#missing-value-or-irregularly-sampled-time-series" class="headerlink" title="missing value or irregularly sampled time series"></a>missing value or irregularly sampled time series</h2><ul><li><p>Missing Value Imputation for Mixed Data via Gaussian Copula</p><p>Authors: Yuxuan Zhao: Cornell University; Madeleine Udell: Cornell University</p></li></ul><ul><li><p>LogPar: Logistic PARAFAC2 Factorization for Temporal Binary Data with Missing Values</p><p>Authors: Kejing Yin: Hong Kong Baptist University; Ardavan Afshar: Georgia Institute of Technology; Joyce Ho: Emory University; William Cheung: Hong Kong Baptist University; Chao Zhang: Georgia 
Institute of Technology; Jimeng Sun: University of Illinois Urbana-Champaign</p></li></ul><h2 id="Recurrent-Neural-Network"><a href="#Recurrent-Neural-Network" class="headerlink" title="Recurrent Neural Network"></a>Recurrent Neural Network</h2><ul><li><p>Recurrent Halting Chain for Early Multi-label Classification</p><p>Authors: Thomas Hartvigsen: Worcester Polytechnic Institute; Cansu Sen: Worcester Polytechnic Institute; Xiangnan Kong: Worcester Polytechnic Institute; Elke Rundensteiner: Worcester Polytechnic Institute</p></li><li><p>Recurrent Networks for Guided Multi-Attention Classification</p><p>Authors: Xin Dai: Worcester Polytechnic Institute; Xiangnan Kong: Worcester Polytechnic Institute; Tian Guo: Worcester Polytechnic Institute; John Lee: Worcester Polytechnic Institute; Xinyue Liu: Worcester Polytechnic Institute; Constance Moore: University of Massachusetts Medical School</p></li><li><p>A Self-Evolving Mutually-Operative Recurrent Network-based Model for Online Tool Condition Monitoring in Delay Scenario</p><p>Authors: Monidipa Das: Nayang Technological University NTU Singapore ; Mahardhika Pratama: Nanyang Technological University NTU ; Tegoeh Tjahjowidodo: KU Leuven</p></li><li><p>A Sleeping, Recovering Bandit Algorithm for Optimizing Recurring Notifications</p><p>Authors: Kevin Yancey: Duolingo; Burr Settles: Duolingo</p></li><li><p>Hypergraph Convolutional Recurrent Neural Network</p><p>Authors: Jaehyuk Yi: KAIST; Jinkyoo Park: KAIST</p></li></ul><h2 id="Anomaly-Detection"><a href="#Anomaly-Detection" class="headerlink" title="Anomaly Detection"></a>Anomaly Detection</h2><ul><li><p>Isolation Distributional Kernel: A new tool for kernel based anomaly detection</p><p>Authors: Kai Ming Ting: Nanjing University; Takashi Washio: Osaka University; Bi-Cun Xu: Nanjing University; Zhi-Hua Zhou: Nanjing University</p></li><li><p>USAD : UnSupervised Anomaly Detection on multivariate time series</p><p>Authors: Julien Audibert: Orange EURECOM; Pietro Michiardi: EURECOM; Frédéric Guyard: Orange Labs; Sébastien Marti: Orange; Maria A. 
Zuluaga: EURECOM</p></li><li><p>Generic Outlier Detection in Multi-Armed Bandit</p><p>Authors: Yikun Ban: University of Illinois at Urbana-Champaign; Jingrui He: University of Illinois at Urbana-Champaign</p></li><li><p>Ultrafast Local Outlier Detection from a Data Stream with Stationary Region Skipping</p><p>Authors: Susik Yoon: Korea Advanced Institute of Science and Technology; Jae-Gil Lee: Korea Advanced Institute of Science and Technology; Byung Suk Lee: University of Vermont</p></li><li><p>Interleaved Sequence RNNs for Fraud Detection <em>是一个关于银行卡欺诈检测的,数据是多维非均匀采样序列。这个检测问题被转化为一个监督学习问题。</em></p><p>Authors: Bernardo Branco: Feedzai; Pedro Abreu: QuantumBlack a McKinsey company ; Ana Sofia Gomes: Feedzai; Mariana Almeida: Cleverly; João Tiago Ascensão: Feedzai; Pedro Bizarro: Feedzai</p></li></ul><ul><li><p>Grounding Visual Concepts for Multimedia Event Detection and Multimedia Event Captioning in Zero-shot Setting</p><p>Authors: Zhihui Li: University of New South Wales; Xiaojun Chang: Monash University; Lina Yao: University of New South Wales; Shirui Pan: Monash University; Zongyuan Ge: Monash University; Huaxiang Zhang: Shandong Normal University</p></li><li><p>Multi-class Data Description for Out-of-distribution Detection</p><p>Authors: Dongha Lee: Pohang University of Science and Technology; Sehun Yu: Pohang University of Science and Technology; Hwanjo Yu: Pohang University of Science and Technology</p></li><li><p>CrowdQuake: A Networked System of Low-Cost Sensors for Earthquake Detection via Deep Learning</p><p>Authors: Xin Huang: Florida Institute of Technology; Jangsoo Lee: Kyungpook National University; Young-Woo Kwon: Kyungpook National University; Chul-Ho Lee: Florida Institute of Technology</p></li></ul><ul><li><p>DATE: Dual Attentive Tree-aware Embedding for Customs Fraud Detection <em>是一个关于交易异常检测的,数据集是交易记录,基于一个Tree-aware的结构进行特征提取</em></p><p>Authors: Sundong Kim: Institute for Basic Science; Yu-Che Tsai: National Cheng Kung University; Karandeep Singh: Institute for Basic Science; Yeonsoo Choi: World Customs Organization; Etim Ibok: Nigeria Customs Service; Cheng-Te Li: National Cheng Kung University; Meeyoung Cha: Institute for Basic Science</p></li></ul><h2 id="Change-point-detection"><a href="#Change-point-detection" class="headerlink" title="Change point detection"></a>Change point detection</h2><ul><li><p>A Non-Iterative Quantile Change Detection Method in Mixture Model with Heavy-Tailed Components</p><p>Authors: Yuantong Li: Purdue University; Qi Ma: North Carolina State University; Sujit Ghosh: North Carolina State University</p></li></ul><ul><li><p>Laplacian Change Point Detection for Dynamic Graphs</p><p>Authors: Shenyang Huang: McGill University, Quebec Institute for Artificial Intelligence (Mila); Yasmeen Hitti: McGill University, Quebec Institute for Artificial Intelligence (Mila); Guillaume Rabusseau: University of Montreal, Quebec Institute for Artificial Intelligence (Mila); Reihaneh Rabbany: McGill University, Quebec Institute for Artificial Intelligence (Mila)</p></li></ul><h2 id="Sequence"><a href="#Sequence" class="headerlink" title="Sequence"></a>Sequence</h2><ul><li><p>Interleaved Sequence RNNs for Fraud Detection</p><p>Authors: Bernardo Branco: Feedzai; Pedro Abreu: QuantumBlack a McKinsey company ; Ana Sofia Gomes: Feedzai; Mariana Almeida: Cleverly; João Tiago Ascensão: Feedzai; Pedro Bizarro: Feedzai</p></li></ul><h2 id="Interpretable"><a href="#Interpretable" class="headerlink" title="Interpretable"></a>Interpretable</h2><ul><li><p>Adversarial 
Infidelity Learning for Model Interpretation</p><p>Authors: Jian Liang: Cloud and Smart Industries Group, Tencent, China; Bing Bai: Cloud and Smart Industries Group, Tencent, China; Yuren Cao: Cloud and Smart Industries Group, Tencent, China; Kun Bai: Cloud and Smart Industries Group, Tencent, China; Fei Wang: Cornell University</p></li><li><p>Heidegger: Interpretable Temporal Causal Discovery</p><p>Authors: Mehrdad Mansouri: Simon Fraser University; Ali Arab: Simon Fraser University; Zahra Zohrevand: Simon Fraser University; Martin Eser: Simon Fraser University</p></li><li><p>INPREM: An Interpretable and Trustworthy Predictive Model for Healthcare</p><p>Authors: Xianli Zhang: Xi’an Jiaotong University; Buyue Qian: Xi’an Jiaotong University; Shilei Cao: Tencent Jarvis Lab; Yang Li: Xi’an Jiaotong University; Hang Chen: Xi’an Jiaotong University; Yefeng Zheng: Tencent Jarvis Lab; Ian Davidson: University of California - Davis</p></li><li><p>Interpretability is a Kind of Safety: An Interpreter-based Ensemble for Adversary Defense</p><p>Authors: Jingyuan Wang: Beihang University; Yufan Wu: Beihang University; Mingxuan Li: Beihang University; Xin Lin: Beihang University; Junjie Wu: Beihang University; Chao Li: Beihang University</p></li><li><p>Malicious Attacks against Deep Reinforcement Learning Interpretations</p><p>Authors: Mengdi Huai: University of Virginia; Jianhui Sun: University of Virginia; Renqin Cai: University of Virginia; Liuyi Yao: University of New York at Buffalo; Aidong Zhang: University of Virginia</p></li></ul><ul><li><p>GRACE: Generating Concise and Informative Contrastive Sample to Explain Neural Network Model</p><p>Authors: Thai Le: The Pennsylvania State University; Suhang Wang: The Pennsylvania State University; Dongwon Lee: The Pennsylvania State University</p></li><li><p>xGAIL: Explainable Generative Adversarial Imitation Learning for Explainable Human Decision Analysis</p><p>Authors: Menghai Pan: Worcester Polytechnic Institute; Weixiao Huang: Worcester Polytechnic Institute; Yanhua Li: Worcester Polytechnic Institute (WPI); Xun Zhou: University of Iowa; Jun Luo: Lenovo Group Limited</p></li></ul><ul><li><p>Explainable classification of brain networks via contrast subgraphs</p><p>Authors: Tommaso Lanciano: La Sapienza University of Rome; Francesco Bonchi: Fondazione ISI; Aristides Gionis: KTH Royal Institute of Technology</p></li></ul><h2 id="Autoencoder"><a href="#Autoencoder" class="headerlink" title="Autoencoder"></a>Autoencoder</h2><ul><li><p>High-Dimensional Similarity Search with Quantum-Assisted Variational Autoencoder</p><p>Authors: Nicholas Gao: NASA Ames Research Center; Max Wilson: NASA Ames Research Center; Thomas Vandal: NASA Ames Research Center; Walter Vinci: NASA Ames Research Center; Ramakrishna Nemani: NASA Ames Research Center; Eleanor Rieffel: NASA Ames Research Center</p></li></ul><h2 id="LSTM"><a href="#LSTM" class="headerlink" title="LSTM"></a>LSTM</h2><ul><li><p>Cascade-LSTM: A Tree-Structured Neural Classifier for Detecting Misinformation Cascades</p><p>Authors: Francesco Ducci: ETH Zurich; Mathias Kraus: ETH Zurich; Stefan Feuerriegel: ETH Zurich</p></li></ul><h2 id="Data-augmentation"><a href="#Data-augmentation" class="headerlink" title="Data augmentation"></a>Data augmentation</h2><ul><li><p>NodeAug: Semi-Supervised Node Classification with Data Augmentation</p><p>Authors: Yiwei Wang: National University of Singapore; Wei Wang: National University of Singapore; Yuxuan Liang: National University of Singapore; Yujun Cai: Nanyang 
Technological University; Juncheng Liu: National University of Singapore; Bryan Hooi: National University of Singapore</p></li></ul>]]></content>
<tags>
<tag> paper list </tag>
</tags>
</entry>
<entry>
<title>Self-supervised learning survey</title>
<link href="/uncategorized/notes/self_supervised_learning_survey/"/>
<url>/uncategorized/notes/self_supervised_learning_survey/</url>
<content type="html"><![CDATA[<h1 id="Survey-Self-supervised-learning-Generative-or-Contrastive"><a href="#Survey-Self-supervised-learning-Generative-or-Contrastive" class="headerlink" title="Survey: Self-supervised learning: Generative or Contrastive"></a>Survey: Self-supervised learning: Generative or Contrastive</h1><p>清华唐杰团队的工作,比较完整。</p><p><a href="https://www.aliyundrive.com/s/Hf99EegM1pd">PDF FILE (包含一些笔记)</a></p><span id="more"></span><hr><p>自监督任务的方法共分为3类,分别为生成式(generative)、判别式(contrastive)、对抗式(generative-contrastive (adversarial)),他们的区别如下。</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img_profile/20200730194624.png" alt="compare" style="zoom:67%;" /><h2 id="GENERATIVE-SELF-SUPERVISED-LEARNING"><a href="#GENERATIVE-SELF-SUPERVISED-LEARNING" class="headerlink" title="GENERATIVE SELF-SUPERVISED LEARNING"></a>GENERATIVE SELF-SUPERVISED LEARNING</h2><h3 id="Auto-regressive-AR-Model"><a href="#Auto-regressive-AR-Model" class="headerlink" title="Auto-regressive (AR) Model"></a>Auto-regressive (AR) Model</h3><ul><li><p>pros: The advantage of auto-regressive models is that it can model the context dependency well.</p></li><li><p>cons: However, one shortcoming of the AR model is that the token at each position can only access its context from one direction.</p></li></ul><h3 id="Flow-based-Model"><a href="#Flow-based-Model" class="headerlink" title="Flow-based Model"></a>Flow-based Model</h3><h3 id="Auto-encoding-AE-Model"><a href="#Auto-encoding-AE-Model" class="headerlink" title="Auto-encoding (AE) Model"></a>Auto-encoding (AE) Model</h3><ol><li>Basic AE Model</li><li>Context Prediction Model (CPM): The idea of the Context Prediction Model (CPM) is predicting contextual information based on inputs.</li><li>The idea of denoising autoencoder models is that representation should be robust to the introduction of noise. 
<h3 id="Hybrid-Generative-Models"><a href="#Hybrid-Generative-Models" class="headerlink" title="Hybrid Generative Models"></a>Hybrid Generative Models</h3><ol><li>Combining AR and AE Models.</li><li>Combining AE and Flow-based Models.</li></ol><h2 id="CONTRASTIVE-SELF-SUPERVISED-LEARNING-Contrastive-learning-aims-at-“learn-to-compare”"><a href="#CONTRASTIVE-SELF-SUPERVISED-LEARNING-Contrastive-learning-aims-at-“learn-to-compare”" class="headerlink" title="CONTRASTIVE SELF-SUPERVISED LEARNING: Contrastive learning aims at “learn to compare”."></a>CONTRASTIVE SELF-SUPERVISED LEARNING: Contrastive learning aims at “learn to compare”.</h2><h3 id="context-instance-contrast-即对比某个样本及其它的语境"><a href="#context-instance-contrast-即对比某个样本及其它的语境" class="headerlink" title="context-instance contrast: contrasting an instance with its context"></a>context-instance contrast: contrasting an instance with its context</h3><ol><li>Predict Relative Position (PRP): intuitively, several parts of the original data are scrambled in some way, and an auxiliary task then tries to recover the original arrangement.</li></ol><p>Examples in CV: a sample is a) split into several patches whose order is shuffled, with the auxiliary task of restoring the order; b) rotated, with the auxiliary task of undoing the rotation; c) split into patches to be reassembled as a jigsaw puzzle.</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img_profile/20200730193851.png" style="zoom:67%;" /><ol start="2"><li>Maximize Mutual Information: intuitively, mutual information (MI) is used to characterize whether an instance belongs to a context. Given a context-instance pair with a label indicating whether the two come from the same sample (1: yes, 0: no), the MI estimate is maximized for positive pairs and minimized for negative pairs.</li></ol><p>cons: [132] provides empirical evidence that the success of the models mentioned above is only loosely connected to MI, showing that an upper-bound MI estimator leads to ill-conditioned, lower-performance representations. Instead, more should be attributed to the encoder architecture and a negative sampling strategy related to metric learning.</p><blockquote><p>Some examples:</p><p><strong>(CV)</strong> maximize the MI between a local patch and its global context.</p><p><strong>(speech)</strong> CPC maximizes the association between a segment of audio and its context audio.</p><p><strong>(NLP)</strong> maximize the mutual information between a global representation of a sentence and the n-grams in it.</p><p><strong>(Graph)</strong> Deep Graph InfoMax (DGI) [139] considers a node’s representation as the local feature and the average of randomly sampled 2-hop neighbors as the context.</p></blockquote><h3 id="context-context-contrast-即对比两个独立的样本"><a href="#context-context-contrast-即对比两个独立的样本" class="headerlink" title="context-context contrast: contrasting two independent samples"></a>context-context contrast: contrasting two independent samples</h3><p>For a given sample, several views of it are first generated (through various noising / data-augmentation schemes); the similarity among these views is then maximized, while their similarity to views of other, independent samples is minimized (a minimal loss sketch follows below).<br>Example: in CV, the views are generated by cropping, color transformation, rotation, and similar augmentations.</p><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img_profile/20200730194302.png" style="zoom: 67%;" />
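<p>As a concrete instance of “learn to compare”, here is a minimal InfoNCE-style loss over two augmented views of a batch, in the spirit of CPC/SimCLR-like methods. This is an illustrative sketch assuming PyTorch; the batch size, embedding dimension, and temperature are arbitrary.</p><pre><code># A minimal InfoNCE-style contrastive loss (PyTorch assumed).
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    # z1[i] and z2[i] are embeddings of two augmented views of sample i.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature  # pairwise cosine similarities
    targets = torch.arange(z1.size(0))  # positives lie on the diagonal
    # Cross-entropy pulls each positive pair together and pushes the
    # other in-batch (negative) pairs apart.
    return F.cross_entropy(logits, targets)

z1, z2 = torch.randn(16, 64), torch.randn(16, 64)
print(info_nce(z1, z2).item())</code></pre><p>This family of objectives is commonly interpreted as maximizing a lower bound on the mutual information between the two views, which connects it to the MI discussion above.</p>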
<h2 id="GENERATIVE-CONTRASTIVE-ADVERSARIAL-SELF-SUPERVISED-LEARNING"><a href="#GENERATIVE-CONTRASTIVE-ADVERSARIAL-SELF-SUPERVISED-LEARNING" class="headerlink" title="GENERATIVE-CONTRASTIVE (ADVERSARIAL) SELF-SUPERVISED LEARNING"></a>GENERATIVE-CONTRASTIVE (ADVERSARIAL) SELF-SUPERVISED LEARNING</h2><p>pros: 1) A reason for the generative model’s success in self-supervised learning is its ability to fit the data distribution. 2) GANs are designed to serve human-level understanding. 3) GANs focus on capturing the complete information of the sample.</p><ol><li><p>Generate with Complete Input: the complete sample is fed into the network, compressed, and reconstructed; a discriminator judges the difference between the reconstruction and the original data.</p></li><li><p>Recover with Partial Input: a processed (noised, transformed) sample is fed into the network for reconstruction, and a discriminator judges the difference between the reconstruction and the original, complete data. In this respect the approach closely resembles contrastive methods, but the two learn the data distribution in different ways, and the complexity of the discriminator also differs.</p></li></ol><img src="https://raw.githubusercontent.com/KMdsy/figurebed/master/img_profile/20200730195129.png" style="zoom: 50%;" />]]></content>
<tags>
<tag> note </tag>
<tag> self-supervised learning </tag>
</tags>
</entry>
<entry>
<title>Related Papers in SIGIR 2020 (2020.07.25)</title>
<link href="/uncategorized/paperlistfile/SIGIR2020/"/>
<url>/uncategorized/paperlistfile/SIGIR2020/</url>