Specifically, we train a Variational Autoencoder (VAE) (Kingma and Welling, 2013; Rezende et al., 2014) in an unsupervised fashion, using only samples from the non-faulty class. The VAE is a generative latent variable model that implicitly learns a probability distribution over the data space by learning a latent space representation. It provides approximate inference of the latent variables given a data point, as well as conditional distributions over the data given a latent variable. This allows us to estimate the likelihood of any given data point. Modelling only the normal data in this way means we do not have to model the scarce defects explicitly, which lowers the risk of overfitting that supervised models typically display.

Once we have trained such a model, we can use it to obtain likelihood scores for both expected and anomalous data. Ideally, these scores are separable by a single threshold, with normal examples having a higher likelihood than anomalous examples. Anomaly classification then becomes a matter of computing the likelihoods of data points and determining an optimal threshold.

In this chapter, we evaluate anomaly detection with VAEs on a dataset of photos of 3D-printed products. We also test this method on the simpler benchmark dataset MNIST. Results show that we can indeed train a model that separates normal from anomalous data to a certain degree, based on their likelihood values.

Furthermore, we evaluate this anomaly detection framework for lung cancer detection in 3D lung nodules, a task that poses challenges similar to the aforementioned visual inspection of 3D-printed products. Early cancer detection and diagnosis of anomalous anatomies by means of Computed Tomography (CT) has been a recurrent research topic, especially in the computer vision domain (Roth et al., 2016; Greenspan et al., 2016).
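The thresholding step described earlier in this section (scoring data points by likelihood and separating normal from anomalous examples with one threshold) can be illustrated with a minimal sketch. It assumes the per-example log-likelihood estimates have already been computed by a trained model; the function names `best_threshold` and `is_anomalous`, the example scores, and the use of balanced accuracy as the selection criterion are all hypothetical choices made for illustration, not the procedure used in this chapter.

```python
import numpy as np


def best_threshold(normal_scores, anomalous_scores):
    """Pick the score threshold that maximises balanced accuracy.

    Scores are assumed to be log-likelihood estimates (higher = more normal).
    Points at or above the threshold are labelled normal, points below it
    are labelled anomalous. Balanced accuracy is an illustrative criterion.
    """
    normal = np.asarray(normal_scores, dtype=float)
    anomalous = np.asarray(anomalous_scores, dtype=float)

    # Candidate thresholds: every distinct observed score, in ascending order.
    candidates = np.unique(np.concatenate([normal, anomalous]))

    best_t, best_acc = None, -1.0
    for t in candidates:
        normal_kept = np.mean(normal >= t)        # normal examples labelled normal
        anomalies_flagged = np.mean(anomalous < t)  # anomalies labelled anomalous
        acc = 0.5 * (normal_kept + anomalies_flagged)
        if acc > best_acc:  # strict '>' keeps the lowest threshold on ties
            best_t, best_acc = float(t), float(acc)
    return best_t, best_acc


def is_anomalous(scores, threshold):
    """Flag every score that falls strictly below the threshold."""
    return np.asarray(scores, dtype=float) < threshold


# Hypothetical log-likelihood scores, chosen so the classes are separable.
normal_ll = [-90.0, -95.0, -100.0]
anomalous_ll = [-150.0, -140.0, -120.0]

threshold, accuracy = best_threshold(normal_ll, anomalous_ll)
flags = is_anomalous([-130.0, -92.0], threshold)
```

When the two score distributions overlap, no threshold reaches a balanced accuracy of 1, and the achieved value indicates how well the likelihood separates the classes.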
For the lung nodule task, we also compare against a previous anomaly detection approach based on Generative Adversarial Networks (GANs) (Goodfellow et al., 2014). Results show that neither GANs nor VAEs perform well on this task, implying that this application is too difficult for these methods to solve to a satisfactory level.

3.2 Related Work

In this chapter we consider probabilistic anomaly detection. Pimentel et al. (2014) provide an overview of current methods for such anomaly detection. For high-dimensional data, deep learning methods have been proposed for the likelihood estimation process in the anomaly detection framework. Similar to