I noticed that, at least for the AEe network based on hidden-layer input features, the AUC starts to drop very early in training, as the network begins to improve at reconstructing bkg jets. This indicates a complexity bias. It can potentially be alleviated by reducing the latent space size; the size still needs to be optimized.
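
A minimal sketch of how this could be monitored, assuming the per-jet reconstruction error is used as the anomaly score and that validation scores for bkg and signal jets are available after each epoch. The helper names `auc_from_scores` and `best_epoch` are hypothetical and are not taken from the actual training code.

```python
import numpy as np


def auc_from_scores(bkg_scores, sig_scores):
    """AUC as P(signal score > bkg score), with ties counted as 1/2.

    Assumption: a higher score (e.g. reconstruction error) means
    "more signal-like". Uses all bkg/signal pairs, so memory scales as
    n_bkg * n_sig; intended for modest validation samples.
    """
    bkg = np.asarray(bkg_scores, dtype=float)
    sig = np.asarray(sig_scores, dtype=float)
    diff = sig[:, None] - bkg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)


def best_epoch(auc_history):
    """Index of the epoch with the highest AUC in a per-epoch list."""
    return int(np.argmax(auc_history))
```

Calling `auc_from_scores` after every epoch and storing the result next to the bkg reconstruction loss would show where the two start to diverge. Repeating that for several latent space sizes and comparing the AUC at `best_epoch` is one possible way to carry out the optimization.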