Q: How do LSBD models compare to traditional disentanglement models with respect to OOD generalisation?

1.3 Thesis Outline and Contributions

In Chapter 2 we provide relevant background and preliminaries for the work in this thesis. The subsequent chapters detail the contributions of this thesis, addressing the research questions outlined above. These contributions can be summarised as follows.

In Chapter 3 we present anomaly detection with probabilistic generative models, in particular Variational Autoencoders (VAEs). We train a VAE on normal data samples, so that it can detect anomalous samples whose assigned probability density is lower than that of normal samples. We apply this method to visual quality control and lung cancer detection. Results show that anomaly detection is possible in certain cases, confirming the validity of the approach. However, in more complicated and realistic settings the models may fail to represent the data well enough for reliable anomaly detection. This suggests that improvements to the VAE framework could also benefit anomaly detection performance.

In Chapter 4 we focus on quantifying and learning Linear Symmetry-Based Disentanglement (LSBD). We propose DLSBD, a well-formalised metric to quantify LSBD, and give a practical implementation of this metric for SO(2), a common group structure that models 2D rotations and other cyclic properties. From this metric, we derive LSBD-VAE, a semi-supervised method to learn LSBD representations. We use the DLSBD metric to compare LSBD with previous notions of disentanglement, and to evaluate models designed to learn LSBD representations, including our own LSBD-VAE.

In Chapter 5 we explore how LSBD helps with out-of-distribution generalisation, and how LSBD models compare to traditional disentanglement models for this task.
We train models on datasets with held-out factor combinations, and test their generalisation on these unseen factor combinations. We observe that models struggle with generalisation in more challenging settings, and that LSBD models show no obvious improvement over traditional disentanglement when measuring generalisation in terms of the likelihood of unseen data. However, we also observe that the encoder of LSBD models may still generalise well by learning a meaningful mapping that reflects the underlying real-world mechanisms.
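The held-out evaluation protocol described above can be illustrated with a minimal sketch. This is not the thesis's actual pipeline; the function and variable names are hypothetical, and it only shows the core idea: samples whose ground-truth factor combination falls in a held-out set are excluded from training and used exclusively to test generalisation.

```python
import itertools
import numpy as np

def split_by_factor_combinations(factors, held_out):
    """Split sample indices into train (seen combinations) and
    test (held-out combinations).

    factors:  (N, 2) integer array, one row of factor values per sample.
    held_out: set of (f1, f2) tuples that must not appear in training.
    """
    is_ood = np.array([tuple(f) in held_out for f in factors])
    train_idx = np.where(~is_ood)[0]
    test_idx = np.where(is_ood)[0]
    return train_idx, test_idx

# Example: two factors with 4 values each (16 combinations);
# hold out one "quadrant" of 4 combinations for OOD testing.
all_combos = np.array(list(itertools.product(range(4), range(4))))
held_out = {(f1, f2) for f1 in (2, 3) for f2 in (2, 3)}
train_idx, test_idx = split_by_factor_combinations(all_combos, held_out)
```

A model trained only on `train_idx` has seen every individual factor value, but never the held-out combinations, so evaluating it on `test_idx` probes exactly the kind of combinatorial generalisation studied in Chapter 5.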