%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Master Thesis
% Ralf Krauth
% April 2021
%
% License:
% CC-BY-SA 4.0 -- Creative Commons Attribution-ShareAlike 4.0 International
% https://creativecommons.org/licenses/by-sa/4.0/legalcode
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Results}
In the following two sections,
the results of the \acrshort{dnn}- and \acrshort{cgan}-based approaches for predicting Hi-C contact matrices will be presented.
While the various modifications of the dense neural network described in \cref{sec:improve:DNNapproach}
did not improve on the status quo, the novel conditional generative adversarial network laid out in \cref{sec:improve:Hi-cGAN}
showed interesting properties.
Finally, in \cref{sec:results:comparison}, the results from \acrshort{dnn} and \emph{Hi-cGAN} are compared to two known Hi-C matrix prediction approaches in the field,
namely \emph{HiC-Reg} by Zhang et al. \cite{Zhang2019} and the dense neural network approach by Farr\'e et al. \cite{Farre2018a},
on which the \acrshort{dnn}-approach of this thesis is based.
\subsection{Dense Neural Network approaches} \label{sec:results:DNN}
In the following subsections, the resulting predictions for modifications of the dense neural network originally conceived
by Farr\'e, Heurteau, Cuvier and Emberly \cite{Farre2018a} will be shown.
This includes variations of the convolutional layer(s), custom loss functions intended to reduce blurriness in the predictions,
and tuning of window and bin size, cf. \cref{sec:improve:DNNapproach}.
As a start, however, the results of the initial network without any modifications will be shown for comparison.
Note that this thesis generally uses data from human cell lines, cf.~\cref{sec:methods:input_data},
while the work by Farr\'e et al. uses data from \emph{Drosophila melanogaster} embryonic cells.
For a direct comparison, see \cref{sec:results:comparison}.
\subsubsection{Initial DNN results for comparison} \label{sec:initialDNNresults}
The basic dense neural network was set up and trained as explained in \cref{sec:methods:basicSetup}.
Here, the validation error (\acrshort{mse}) reached its minimum of about \SI{150000}{}
after approximately 500 epochs for bin size \SI{25}{\kilo\bp} and around \SI{24000}{} after 400 epochs for bin size \SI{10}{\kilo\bp}, \cref{fig:results:basicDNN_lossEpochs_25,fig:results:basicDNN_lossEpochs_10}.
Beyond that, the learning curve indicated overfitting, but the resulting test matrices often did not change much with increasing number of epochs,
compare e.\,g. the matrix plots after 500 and 1000 epochs in \cref{fig:results:basic500,fig:results:basic1000}.
\Cref{fig:results:basicDNN_pearson,fig:results:basicDNN_10k_pearson}
show the distance-stratified Pearson correlations (cf. \cref{sec:methods:metrics})
alongside \acrfull{auc} for the five test chromosomes 3, 5, 10, 19 and 21 at bin sizes 25 and \SI{10}{\kilo\bp}, respectively.
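As a brief reminder, and assuming the standard definition (the exact formulation used in this thesis is the one given in \cref{sec:methods:metrics}),
the Pearson correlation at genomic distance $d$ (in bins) between a predicted matrix $P$ and a target matrix $T$ with $n$ bins can be sketched as
\[
r(d) = \frac{\sum_{i=1}^{n-d} \left(P_{i,i+d}-\bar{P}_d\right)\left(T_{i,i+d}-\bar{T}_d\right)}
{\sqrt{\sum_{i=1}^{n-d} \left(P_{i,i+d}-\bar{P}_d\right)^2}\,\sqrt{\sum_{i=1}^{n-d} \left(T_{i,i+d}-\bar{T}_d\right)^2}},
\]
where $\bar{P}_d$ and $\bar{T}_d$ denote the mean values of the $d$-th diagonal of the respective matrix; the symbols $P$, $T$, $n$ and $d$ are introduced here for illustration only.
The \acrshort{auc} then summarizes the curve $r(d)$ over all considered distances in a single number.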
The red curves in each correlation plot show the correlation between the target Hi-C data from K562 (target chromosome)
and the corresponding training Hi-C data from GM12878 (training chromosome).
All predicted test matrices had strictly positive Pearson correlations with the target matrices,
but these correlations were lower than those obtained by simply taking data from the training cell line as prediction for the target cell line.
Visually, the predicted matrices were also of modest quality.
While the \acrshort{dnn} generally produced high interaction counts in regions with many true interactions
and low interaction counts in regions with few true interactions, (\acrshort{tad}-)boundaries between different interacting domains
were mostly not discernible, \cref{fig:results:basic500,fig:results:basic1000}.
This finding is in line with the clearly positive, but medium-valued Pearson correlations.
Exceptions with more distinct boundaries existed in all of the five test chromosomes,
for example chr19:34-\SI{35}{\mega\bp} (\cref{fig:results:basic_r2}), but were rare.
Interestingly, medium-sized interacting structures, for example chr21:31-\SI{32.5}{\mega\bp}
or chr19:31.2-\SI{32.7}{\mega\bp}, often seemed to be missing altogether,
while structures larger than the window size, for example chr3:34-\SI{36.7}{\mega\bp} and chr3:36.7-\SI{39.5}{\mega\bp},
were sometimes at least indicated, \cref{fig:results:basic500}.
Reducing the bin size to $b_\mathit{feat}=b_\mathit{mat}=\SI{10}{\kilo\bp}$ as in the paper by Farr\'e et al. \cite{Farre2018a}
led to somewhat different results.
Compared to \SI{25}{\kilo\bp}, the area under the correlation curves was approximately the same for test chromosomes 3 and 5,
slightly higher for chromosome 10, but lower for chromosomes 19 and 21, cf.~\cref{fig:results:basicDNN_pearson,fig:results:basicDNN_10k_pearson}.
However, the ability to predict larger structures was lost, and thus the matrix plots did not look better than before, \cref{fig:results:basic10k_matrices}.
The comparatively bad result for test chromosome 21 might result from the low chromatin feature coverage of this particular chromosome.
No obvious relationship between prediction quality (comparatively ``good'' or ``bad'') and open or closed states of the chromatin was observed.
However, formally computing such a correlation is challenging, because no adequate objective measure for ``good'' and ``bad'' is known,
especially considering the rather blurry results obtained so far.
Furthermore, even if such correlations existed, exploiting them to improve predictions would be, at best, not straightforward.
\begin{figure}[p]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\centering
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_basic/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress}\label{fig:results:basicDNN_lossEpochs_25}
\end{subfigure}
\caption{Results\,/\,metrics, basic \acrshort{dnn}, \SI{25}{\kilo\bp}, test chromosomes}
\label{fig:results:basicDNN_pearson}
\end{figure}
%10k Pearson and progress
\begin{figure}[p]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic10k/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic10k/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic10k/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic10k/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\centering
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_basic10k/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_basic10k/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress}\label{fig:results:basicDNN_lossEpochs_10}
\end{subfigure}
\caption{Results\,/\,metrics, basic \acrshort{dnn}, \SI{10}{\kilo\bp}, test chromosomes}
\label{fig:results:basicDNN_10k_pearson}
\end{figure}
%25k matrices, after 500 epochs
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic/}{pred00500_chr21_030-040.pdf_tex}
\caption{Example region 1, 500 epochs} \label{fig:results:basic_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic/}{pred00500_chr19_030-040.pdf_tex}
\caption{Example region 2, 500 epochs} \label{fig:results:basic_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic/}{pred00500_chr3_030-040.pdf_tex}
\caption{Example region 3, 500 epochs} \label{fig:results:basic_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, basic \acrshort{dnn}, \SI{25}{\kilo\bp}, 500~epochs} \label{fig:results:basic500}
\end{figure}
%25k matrices, after 1000 epochs
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic/}{pred01000_chr21_030-040.pdf_tex}
\caption{Example region 1, 1000 epochs} \label{fig:results:basic_r1_1000}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic/}{pred01000_chr19_030-040.pdf_tex}
\caption{Example region 2, 1000 epochs} \label{fig:results:basic_r2_1000}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic/}{pred01000_chr3_030-040.pdf_tex}
\caption{Example region 3, 1000 epochs} \label{fig:results:basic_r3_1000}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, basic \acrshort{dnn}, \SI{25}{\kilo\bp}, 1000~epochs} \label{fig:results:basic1000}
\end{figure}
%10k matrices
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic10k/}{pred01000_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:basic10k_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic10k/}{pred01000_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:basic10k_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_basic10k/}{pred01000_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:basic10k_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, basic \acrshort{dnn}, \SI{10}{\kilo\bp}, 1000~epochs}
\label{fig:results:basic10k_matrices}
\end{figure}
\clearpage
\subsubsection{Results for variations of the convolutional part} \label{sec:results:wider-longer-etc}
The predictions from the ``longer'' variant with three convolutional filter layers instead of a single one,
cf. \cref{sec:improve:convolution_extensions,sec:methods:variants},
were better than the initial predictions in terms of Pearson correlations for test chromosomes 10, 19 and 21, but worse for test chromosomes 3 and 5,
\cref{fig:results:longerDNN_pearson}.
Interestingly, correlations for some of the larger distances could not be computed after 250 and 500 epochs,
which generally means that one and the same value was predicted for all bin pairs at these distances, cf. \cref{sec:methods:metrics}.
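To illustrate why (a sketch under the assumption that the correlation is computed per diagonal, with $P_{i,i+d}$ denoting the predicted value for bins $i$ and $i+d$; this notation is introduced here for illustration only):
if the network predicts the same value $c$ for all bin pairs at distance $d$, then the mean $\bar{P}_d$ of that diagonal equals $c$ and
\[
\sum_{i} \left(P_{i,i+d}-\bar{P}_d\right)^2 = \sum_{i} \left(c-c\right)^2 = 0 ,
\]
so the denominator of the Pearson correlation coefficient vanishes and the coefficient is undefined for this distance.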
The reason for this behavior is not fully understood yet, but comparatively few neurons in the outermost layer are responsible for predictions at
longer distances due to the chosen network setup, cf. \cref{sec:methods:sample_gen,fig:methods:prediction}.
Since the longer network variant has a considerably larger number of trainable parameters,
it is assumed that 500 epochs might not have been enough to activate some of the outer neurons.
Slow training can occur when \acrshort{relu} activations are used (as in the given case) and the gradients are close to zero \cite{Maas2013}.
Apart from that, the learning process for the ``longer'' variant generally looked smoother and reached a lower validation error than before, \cref{fig:results:longerDNN_lossEpochs},
but the matrix plots did not show any obvious improvement over the initial ones, \cref{fig:results:longer_matrices}.
The results for the ``wider'' network, which featured a wider convolutional filter in the first network layer,
cf. \cref{sec:improve:convolution_extensions,sec:methods:variants},
were generally similar to the initial results, both in terms of Pearson correlations and in terms of matrix plots,
\cref{fig:results:widerDNN_pearson,fig:results:wider_matrices}.
Given the small increase in the number of trainable parameters and overall similar network topology, this is not surprising.
Overfitting was less obvious than with the initial setup and the training process looked smoother,
but the remaining validation error was slightly higher than for the initial approach, \cref{fig:results:widerDNN_lossEpochs}.
Combining the ``longer'' and ``wider'' variants in the ``wider-longer'' setup with more convolutional layers and wider
filter kernels, cf. \cref{sec:improve:convolution_extensions,sec:methods:variants}, also did not perform as expected.
While improvements in the Pearson correlations could again be seen for 3 of 5 test chromosomes compared to the initial network, \cref{fig:results:wider-longerDNN_pearson},
the observed correlations were worse than the ones from the highly similar ``longer'' variant alone.
As with the ``longer'' approach, predictions at longer distances were partially missing.
Compared to the other variants, the validation error was generally higher and stopped decreasing after very few epochs, \cref{fig:results:wider-longerDNN_lossEpochs},
while the training loss continued decreasing for at least 1000~epochs.
This generally indicates lack of generalization and overfitting to the training data.
In terms of matrix plots, the predictions were, surprisingly, still quite similar to the initial ones, but seemed slightly blurrier, \cref{fig:results:wider-longer_matrices}.
Predictions and metrics from the generalized \acrshort{dnn}-approach with feature bin size \SI{5}{\kilo\bp} and matrix bin size \SI{25}{\kilo\bp}
according to \cref{sec:improve:convolution_extensions,sec:methods:inputBinning,sec:methods:variants}
are shown in \cref{fig:results:25k5DNN_pearson,fig:results:25k5_matrices}.
Unfortunately, the results again did not improve compared to the initial predictions.
While the learning curve was smooth and showed signs of slight overfitting beyond 300 epochs, \cref{fig:results:25k5DNN_lossEpochs},
the matrix plots seemed worse than the initial ones, \cref{fig:results:25k5_matrices}.
For example, the large structure at chr3:34-\SI{36.7}{\mega\bp}, which had been detected by the previous approaches, was completely missing.
\begin{figure}[p] %longer variant pearson
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_longer/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_longer/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_longer/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_longer/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\centering
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_longer/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_longer/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress for longer \acrshort{dnn}} \label{fig:results:longerDNN_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, ``longer'' variant of \acrshort{dnn}, test chromosomes}
\label{fig:results:longerDNN_pearson}
\end{figure}
%longer variant matrices
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_longer/}{pred01000_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:longer_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_longer/}{pred01000_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:longer_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_longer/}{pred01000_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:longer_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, ``longer'' variant of \acrshort{dnn},\\bin size \SI{25}{\kilo\bp}, 1000~epochs} \label{fig:results:longer_matrices}
\end{figure}
\begin{figure}[p] %wider variant Pearson
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\centering
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_wider/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress for wider \acrshort{dnn}} \label{fig:results:widerDNN_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, ``wider'' variant of \acrshort{dnn}, test chromosomes}
\label{fig:results:widerDNN_pearson}
\end{figure}
%wider variant matrices
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_wider/}{pred01000_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:wider_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_wider/}{pred01000_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:wider_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_wider/}{pred01000_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:wider_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, ``wider'' variant of \acrshort{dnn},\\bin size \SI{25}{\kilo\bp}, 1000~epochs}\label{fig:results:wider_matrices}
\end{figure}
\begin{figure}[p]%wider-longer Pearson
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider-longer/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider-longer/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider-longer/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider-longer/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\centering
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_wider-longer/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_wider-longer/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress} \label{fig:results:wider-longerDNN_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, ``wider-longer'' variant of \acrshort{dnn}, test chromosomes}
\label{fig:results:wider-longerDNN_pearson}
\end{figure}
%wider-longer variant matrices
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_wider-longer/}{pred01000_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:wider-longer_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_wider-longer/}{pred01000_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:wider-longer_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_wider-longer/}{pred01000_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:wider-longer_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, ``wider-longer'' variant of \acrshort{dnn}, \SI{25}{\kilo\bp}, 1000~epochs} \label{fig:results:wider-longer_matrices}
\end{figure}
\begin{figure}[p]%25k5 Pearson
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25k5/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25k5/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25k5/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25k5/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\centering
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25k5/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_25k5/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress} \label{fig:results:25k5DNN_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, ``5k -- 25k'' variant of \acrshort{dnn} with $b_\mathit{feat}=\SI{5}{\kilo\bp}$ and $b_\mathit{mat}=\SI{25}{\kilo\bp}$, test chromosomes}
\label{fig:results:25k5DNN_pearson}
\end{figure}
%25k5 matrices
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_25k5/}{pred01000_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:25k5_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_25k5/}{pred01000_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:25k5_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_25k5/}{pred01000_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:25k5_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, ``5k -- 25k'' variant of \acrshort{dnn}, 1000~epochs} \label{fig:results:25k5_matrices}
\end{figure}
\clearpage
\subsubsection{Results for combined loss function} \label{sec:results:loss_functions}
Exchanging the mean squared error for a combined loss function consisting of \acrshort{mse}, \acrshort{tv} loss and perceptual loss,
as described in \cref{sec:improve:combined_loss,sec:methods:combined_loss} (\cref{eq:methods:combined_loss}), did not improve the results in the chosen setting.
The results are shown in \cref{fig:results:combilossDNN_pearson,fig:results:combiloss_matrices}.
For all test chromosomes, the correlations were highly similar to the initial network's, \cref{fig:results:combilossDNN_pearson},
and the matrix plots also looked similar, chromosome 21 probably being the most different, \cref{fig:results:combiloss_matrices}.
The results shown here are the best ones obtained by manual tuning of the multiplicative parameters $\lambda$ in \cref{eq:methods:combined_loss}.
Guided parameter tuning was unfortunately infeasible within the thesis at hand due to the training times required for optimizing the combined loss function.
Other options which were not explored for the same reason include truncating the \emph{VGG-16} network at a different layer, using a loss function based on
more than one of the intermediate \emph{VGG-16} layers \cite{Johnson2016} or using a different loss network.
However, the results obtained thus far did not encourage such investigations either.
\begin{figure}[p] %combiloss pearson and progress
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_combiloss/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_combiloss/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_combiloss/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_combiloss/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\centering
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_combiloss/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_combiloss/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress} \label{fig:results:combilossDNN_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, \acrshort{dnn} with combined loss function (MSE, TV, VGG-16), test chromosomes}
\label{fig:results:combilossDNN_pearson}
\end{figure}
%combiloss matrices
\begin{figure}[p]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_combiloss/}{pred00500_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:combiloss_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_combiloss/}{pred00500_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:combiloss_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_combiloss/}{pred00500_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:combiloss_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, \acrshort{dnn} with combined loss function (MSE, TV, VGG-16), 500~epochs} \label{fig:results:combiloss_matrices}
\end{figure}
While manually searching for better parameters $\lambda$ was not successful,
it was found that the TV loss weight $\lambda_\mathit{TV}$ needed to be much smaller than the two other weights in \cref{eq:methods:combined_loss}.
Otherwise, many true interactions off the matrix diagonals were considered as noise and optimized away early in the training process, cf.
\cref{fig:appendix:failed_tv_loss} (p. \pageref{fig:appendix:failed_tv_loss}).
\subsubsection{Results for score-based loss function} \label{sec:results:scorebased}
Replacing the \acrshort{mse} loss with a combination of score-based and \acrshort{mse} loss
allowed for a smooth learning process and a slightly lower validation error compared to the initial approach.
However, at around \SI{7}{\min} per epoch on a GPU, training was also about seven times slower than the initial approach on CPU.
Unfortunately, the higher effort did not lead to obvious improvements.
The Pearson correlations for a score-based loss function according to \cref{eq:methods:score_loss} with parameters $\lambda_\mathit{MSE}=1.0,\; \lambda_\mathit{score}=100,\; ds=12$
are shown in \cref{fig:results:scoreLossDNN_pearson}.
While a slight improvement was achieved for test chromosome 21, the correlations of the others remained largely unchanged.
The matrix plots also looked fairly similar to the initial ones, \cref{fig:results:scoreloss_matrices}, chromosome 21 again being the
most different compared to the initial predictions.
In \cref{fig:results:scoreloss_matrices}, the true and predicted scores are shown in the second track, replacing the \acrshort{pca} track.
Indeed, the score curve computed from the true matrices showed local minima at putative \acrshort{tad} boundaries, as set forth in \cref{sec:improve:TAD_loss},
so score computation with the chosen diamond size seemed sound.
However, despite the optimization term in the loss function, the predicted score curve related to the true curve
in much the same way as the predicted matrices related to the true ones:
The predicted score was generally high where the true score was high, and low where the true score was low,
but high peaks (local maxima) and steep valleys (local minima) in the plots were usually smoothed out.
Long training times precluded targeted parameter tuning by grid or tree search,
so the results presented in this section should not be interpreted as the optimal ones achievable by a score-based loss function.
\begin{figure}[p]%score loss pearson and progress
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_scoreLoss/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_scoreLoss/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_scoreLoss/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_scoreLoss/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_scoreLoss/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_scoreLoss/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress} \label{fig:results:scoreLossDNN_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, \acrshort{dnn} with score-based loss function, test chromosomes\\ ($\lambda_\mathit{MSE}=1.0,\; \lambda_\mathit{score}=100,\; ds=12$)} \label{fig:results:scoreLossDNN_pearson}
\end{figure}
\begin{figure}[p] %score loss matrices
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_scoreLoss/}{pred00500_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:scoreloss_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_scoreLoss/}{pred00500_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:scoreloss_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_scoreLoss/}{pred00500_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:scoreloss_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, \acrshort{dnn} with score-based loss function, 500~epochs} \label{fig:results:scoreloss_matrices}
\end{figure}
\subsubsection{Results for different bin sizes and window sizes} \label{sec:results:binsize_winsize}
To assess predictions at larger bin sizes, five different approaches were compared, cf. \cref{sec:improve:binsize_winsize}:
\begin{enumerate}
\item ``50k direct'': \\directly training a network at bin size \SI{50}{\kilo\bp} and predicting at that same bin size
\item ``initial 25k coarsened'': \\coarsening the results of the initial network discussed above (\cref{sec:initialDNNresults})
by summing adjacent bins via \texttt{cooler coarsen}
\item ``initial 25k$\rightarrow$50k'': \\using the initial network trained at \SI{25}{\kilo\bp} (cf. \cref{sec:initialDNNresults}) to predict at \SI{50}{\kilo\bp}
\item ``25k+50k$\rightarrow$50k'': \\predicting at \SI{50}{\kilo\bp} from a network \emph{simultaneously} trained with bin sizes 25 and \SI{50}{\kilo\bp}
\item ``25k+50k$\rightarrow$25k'': \\predicting at \SI{25}{\kilo\bp} from a network \emph{simultaneously} trained with bin sizes 25 and \SI{50}{\kilo\bp}
\end{enumerate}
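As a sketch of what coarsening in method b) amounts to, assume that \texttt{cooler coarsen} is applied with a factor of two, so that the counts of each block of $2\times2$ adjacent \SI{25}{\kilo\bp} bins are summed into one \SI{50}{\kilo\bp} bin.
The symbols $M^{(25)}$ and $M^{(50)}$ for the matrices at the two bin sizes and the zero-based indices $i<j$ are introduced here for illustration only; for off-diagonal entries this gives
\[
M^{(50)}_{i,j} \;=\; \sum_{a=0}^{1}\sum_{b=0}^{1} M^{(25)}_{2i+a,\,2j+b}, \qquad i<j.
\]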
The best Pearson correlations at bin size \SI{50}{\kilo\bp} were generally obtained either by coarsening the initial results to \SI{50}{\kilo\bp}
(method b) or by taking the network trained at \SI{25}{\kilo\bp} for predicting at \SI{50}{\kilo\bp} (method c), \cref{fig:results:DNN50k_pearson}.
Compared to coarsening, the latter approach had the advantage of doubling the window size (in base pairs) and it worked better for test chromosome 21.
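The doubling can be made explicit with a short calculation, assuming that the window size of the trained network is fixed in bins.
Writing $w$ for the window size in bins, $b$ for the bin size and $s(b)$ for the genomic span covered by one window (symbols chosen here for illustration), it holds that
\[
s(b) = w \cdot b, \qquad s(\SI{50}{\kilo\bp}) = w \cdot \SI{50}{\kilo\bp} = 2 \cdot s(\SI{25}{\kilo\bp}),
\]
whereas coarsening the \SI{25}{\kilo\bp} predictions leaves the span at $s(\SI{25}{\kilo\bp})$.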
Looking at the corresponding matrix plots, the desired effect of making larger structures more prominent by increasing the bin size was only partially achieved,
\cref{fig:results:50k_from25k_matrices}.
While all larger structures in the example cutout of test chromosome 3 indeed looked more prominent, \cref{fig:results:50k_from25k_r3},
no obvious improvement was observed for the medium-sized structures in the example regions of chromosome 19 and 21,
\cref{fig:results:50k_from25k_r2,fig:results:50k_from25k_r1}.
Direct predictions at bin size \SI{50}{\kilo\bp} (method a) were worse than indirect methods derived from networks
trained at \SI{25}{\kilo\bp}.
Both the Pearson correlations and the matrix plots seemed better for methods b) and c),
\cref{fig:results:DNN50k_pearson,fig:results:50k_matrices,fig:results:50k_from25k_matrices}, but on a generally low level.
It is not known why the direct predictions turned out worse.
Potential reasons include the reduced number of samples (cf. \cref{tab:methods:samples}, p. \pageref{tab:methods:samples})
and the binning process, or a combination of both.
In this regard, initial investigations showed that binning the proteins using the maximum instead of the mean across the \SI{50}{\kilo\bp}-bins, cf. \cref{sec:methods:sample_gen},
did not improve the results.
Notably, the training process for the direct prediction at bin size \SI{50}{\kilo\bp} (method a) diverged after about 420 epochs.
One possible reason for this could be too high a learning rate, which could have been avoided by decreasing the learning rate over time.
However, this was not investigated further, because the divergence occurred only after the onset of overfitting, \cref{fig:results:50k_lossEpochs}, and was thus not considered problematic here.
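For illustration only, one common way of decreasing the learning rate over time is an exponential schedule; it was not used in this work, and the symbols $\eta_0$ (initial learning rate), $\gamma$ (decay factor) and $t$ (epoch index) are assumptions made for this sketch:
\[
\eta_t = \eta_0 \cdot \gamma^{t}, \qquad 0 < \gamma < 1.
\]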
The minimum validation error was reached after about 150 epochs, about 100 epochs earlier than in the initial setup at \SI{25}{\kilo\bp}.
This is not surprising, since there are only about half as many training samples at \SI{50}{\kilo\bp} compared to \SI{25}{\kilo\bp},
cf. \cref{tab:methods:samples} (p. \pageref{tab:methods:samples}).
Simultaneously training a network with matrix- and feature bin sizes of \SI{25}{\kilo\bp} and \SI{50}{\kilo\bp} (methods d, e)
turned out unproblematic with regard to convergence, \cref{fig:results:25plus50_lossEpochs},
but the Pearson correlations when predicting at both \SI{25}{\kilo\bp} and \SI{50}{\kilo\bp} were -- often significantly -- worse
than the initial predictions at the respective bin size, \cref{fig:results:DNN50k_pearson} (``25k+50k$\rightarrow$50k'') and \cref{fig:results:DNN25plus50_pearson} (``25k+50k$\rightarrow$25k'').
Looking into the matrix plots shown in \cref{fig:results:25plus50_matrices},
all predictions also seemed much worse than the results obtained by the other approaches investigated thus far.
It could not be clarified what caused the improvement in Pearson correlations for chromosome 21 compared to the initial predictions at \SI{25}{\kilo\bp},
\cref{fig:results:DNN_25_pearson_21}, but it is interesting that even predictions with such a high degree of blurriness as in \cref{fig:results:25plus50_r1}
can reach an \acrshort{auc} of around 0.65.
\begin{figure}[p]%50k direct AND from 25k, pearson and progress
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_50k/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_50k/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_50k/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_50k/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_50k/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_50k/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress 50k direct} \label{fig:results:50k_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, various \acrshort{dnn}s at \SI{50}{\kilo\bp}} \label{fig:results:DNN50k_pearson}
\end{figure}
\begin{figure}[p] %50k direct, matrices
\begin{subfigure}{\textwidth}
\centering
\resizebox{0.71\textwidth}{!}{
\scriptsize
\import{figures/DNN_50k/}{pred00250_chr21_030-040.pdf_tex}}
\caption{Example region 1} \label{fig:results:50k_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\resizebox{0.71\textwidth}{!}{
\scriptsize
\import{figures/DNN_50k/}{pred00250_chr19_030-040.pdf_tex}}
\caption{Example region 2} \label{fig:results:50k_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\resizebox{0.71\textwidth}{!}{
\scriptsize
\import{figures/DNN_50k/}{pred00250_chr3_030-040.pdf_tex}}
\caption{Example region 3} \label{fig:results:50k_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, \acrshort{dnn} at \SI{50}{\kilo\bp} direct, 250 epochs} \label{fig:results:50k_matrices}
\end{figure}
\begin{figure}[p] %50k from 25k, matrices
\begin{subfigure}{\textwidth}
\centering
\resizebox{0.71\textwidth}{!}{
\scriptsize
\import{figures/DNN_50k/}{pred00500_50k_chr21_030-040.pdf_tex}}
\caption{Example region 1} \label{fig:results:50k_from25k_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\resizebox{0.71\textwidth}{!}{
\scriptsize
\import{figures/DNN_50k/}{pred00500_50k_chr19_030-040.pdf_tex}}
\caption{Example region 2} \label{fig:results:50k_from25k_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\resizebox{0.71\textwidth}{!}{
\scriptsize
\import{figures/DNN_50k/}{pred00500_50k_chr3_030-040.pdf_tex}}
\caption{Example region 3} \label{fig:results:50k_from25k_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, \acrshort{dnn} trained at \SI{25}{\kilo\bp} predicting at \SI{50}{\kilo\bp}, 500~epochs} \label{fig:results:50k_from25k_matrices}
\end{figure}
\begin{figure}[p]%trained at 25k and 50k simultaneously, pearson and progress for 25k
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25plus50/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25plus50/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25plus50/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25plus50/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/DNN_25plus50/}{pearson_chr21.pdf_tex}}
\caption{chr21} \label{fig:results:DNN_25_pearson_21}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\resizebox{\textwidth}{!}{
\scriptsize
\import{figures/DNN_25plus50/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress} \label{fig:results:25plus50_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, \acrshort{dnn} trained at \SI{25}{\kilo\bp} and \SI{50}{\kilo\bp} simultaneously} \label{fig:results:DNN25plus50_pearson}
\end{figure}
\begin{figure}[p] %25plus50, matrices at 25k
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_25plus50/}{pred00500_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:25plus50_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_25plus50/}{pred00500_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:25plus50_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/DNN_25plus50/}{pred00500_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:25plus50_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, \acrshort{dnn} trained at \SI{25}{\kilo\bp} and \SI{50}{\kilo\bp} simultaneously, \SI{25}{\kilo\bp}, 500 epochs}\label{fig:results:25plus50_matrices}
\end{figure}
\clearpage
\subsection{Hi-cGAN approaches} \label{sec:results:cgan}
In the following three subsections, the results from the conditional generative adversarial networks will be presented.
Here, at least one of the three \acrshort{cgan}-variants under investigation showed good performance.
\subsubsection{Hi-cGAN with DNN embedding} \label{sec:results:cgan_dnn}
\emph{Hi-cGAN} with \acrshort{dnn} embedding according to \cref{sec:improve:DNN_embedding,sec:methods:dnn-embedding} showed interesting results.
All in all, the training process was smooth and converged after around 60 epochs, mostly using the parameters suggested
by Isola et al. \cite{Isola2017}, cf. \cref{sec:methods:cGAN_initial,sec:methods:dnn-embedding}.
Although \emph{pix2pix} has shown fast convergence in other applications,
it was surprising that this still held after the changes made to the original network, especially adding the embedding network.
While the Pearson correlations were mostly worse than the ones of the \acrshort{dnn}, compare e.\,g. \cref{fig:results:GAN64-dnn_pearson,fig:results:DNN64_pearson} (p.~\pageref{fig:results:DNN64_pearson}),
the matrices mostly looked visually better, showing slightly more distinct boundaries between interacting and non-interacting regions,
compare \cref{fig:results:cGAN64-dnn_matrices,fig:results:DNN_matrices}.
Using weight transfer from a pre-trained \acrshort{dnn} for the embedding networks further stabilized the training process
and made the discriminator reach a stable value of around 0.693 ($\approx -\log0.5$) on epoch average after only 2 epochs,
while the generator validation loss reached a stable minimum value after about 15 epochs, \cref{fig:results:GAN64_pretrain-dnn_lossEpochs}.
However, the resulting predictions did not improve noticeably.
While the Pearson correlations showed slightly better values for the pre-trained \acrshort{dnn} embedding compared to the non-pre-trained one,
\cref{fig:results:GAN64_pretrain-dnn_pearson},
the matrices were visually clearly worse than without pre-training, and also worse than the results from the \acrshort{dnn} alone,
\cref{fig:results:cGAN64_pretrain-dnn_matrices}.
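The discriminator value of about 0.693 mentioned above corresponds to a discriminator that cannot distinguish real from generated matrices.
As a brief worked check, assume that the discriminator loss is the binary cross-entropy averaged over real and generated samples, as in \emph{pix2pix}; the exact averaging in the implementation is an assumption here.
Writing $D_\mathit{real}$ and $D_\mathit{gen}$ for the discriminator outputs on real and generated matrices (symbols introduced for illustration) and using the natural logarithm, the case $D_\mathit{real}=D_\mathit{gen}=0.5$ yields
\[
L_D = -\frac{1}{2}\left[\log D_\mathit{real} + \log\left(1-D_\mathit{gen}\right)\right]
= -\frac{1}{2}\left[\log 0.5 + \log 0.5\right] = -\log 0.5 = \log 2 \approx 0.693.
\]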
\begin{figure}[p] %cGAN with DNN, no pretraining, windowsize 64, pearson and progress
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_dnn/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_dnn/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_dnn/}{pearson_chr10.pdf_tex}}
\caption{chr10}
\end{subfigure}\hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_dnn/}{pearson_chr19.pdf_tex}}
\caption{chr19}
\end{subfigure}\\[3mm]
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_dnn/}{pearson_chr21.pdf_tex}}
\caption{chr21}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_dnn/}{lossOverEpochs.pdf_tex}}
\caption{Learning progress} \label{fig:results:GAN64-dnn_lossEpochs}
\end{subfigure}
\caption{Results\,/\,metrics, \acrshort{cgan}, \acrshort{dnn} embedding, no pre-training, $w=64$, test chromosomes} \label{fig:results:GAN64-dnn_pearson}
\end{figure}
\begin{figure}[p] %cgan with DNN, no pre-training, winsize 64, matrices
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/GAN_64_dnn/}{pred00060_chr21_030-040.pdf_tex}
\caption{Example region 1} \label{fig:results:cGAN64-dnn_r1}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/GAN_64_dnn/}{pred00060_chr19_030-040.pdf_tex}
\caption{Example region 2} \label{fig:results:cGAN64-dnn_r2}
\end{subfigure}\\[2mm]
\centering{\scriptsize{See \cref{sec:methods:matrix_plots} for track explanations}}\\[3mm]
\begin{subfigure}{\textwidth}
\centering
\scriptsize
\import{figures/GAN_64_dnn/}{pred00060_chr3_030-040.pdf_tex}
\caption{Example region 3} \label{fig:results:cGAN64-dnn_r3}
\end{subfigure}
\caption{Example predictions GM12878 $\rightarrow$ K562, \acrshort{cgan}, \acrshort{dnn} embedding, no pre-training, $w=64$, 60 epochs} \label{fig:results:cGAN64-dnn_matrices}
\end{figure}
\begin{figure}[p] %cGAN with DNN, pretrained, windowsize 64, pearson and progress
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_pretrain-dnn/}{pearson_chr03.pdf_tex}}
\caption{chr3}
\end{subfigure} \hfill
\begin{subfigure}{0.45\textwidth}
\scriptsize
\resizebox{\textwidth}{!}{
\import{figures/GAN_64_pretrain-dnn/}{pearson_chr05.pdf_tex}}
\caption{chr5}
\end{subfigure}\\[5mm]
\begin{subfigure}{0.45\textwidth}