30 Anomaly Detection with Variational Autoencoders

regularisation, ensuring a meaningful latent space according to the chosen prior p(z). The difference between the ELBO and the true marginal log-likelihood equals the KL divergence from the approximate posterior to the (intractable) true posterior:

log pθ(x) − ELBO(ϕ, θ; x) = KL(qϕ(z|x) || pθ(z|x)).    (3.2)

From this we see that maximising the ELBO with respect to ϕ is equivalent to minimising the KL divergence of the approximate posterior from the true posterior. Thus, during VAE training we are in fact performing variational inference to learn qϕ(z|x). Assuming our VAE has been trained well, it then seems reasonable to use the value of the ELBO as an approximation to the true marginal log-likelihood, for use in our anomaly detection framework.

For our experiments we make the common choice of multivariate Gaussians with diagonal covariance, which can equivalently be interpreted as products of independent univariate Gaussians:

q(z|x) = N(z | µenc(x), σenc(x)),    (3.3)
p(x|z) = N(x | µdec(z), σdec · I),    (3.4)
p(z) = N(z | 0, I),    (3.5)

where σdec is a fixed constant, which we set to 1/√2. The parameters µenc(x), σenc(x), and µdec(z) are modelled as neural networks. With these models, the KL divergence in the second term of (3.1) can be computed analytically, while the expectation in the first term can be approximated efficiently with Monte Carlo sampling. During training, a single sample suffices. To evaluate the ELBO for use in our anomaly detection framework, however, we draw 128 samples from the approximate posterior, which yields a more reliable estimate of the likelihood of each data point.

The VAE is trained by stochastic gradient descent (SGD) with the Adam optimiser, using the negative ELBO as the loss function. This requires the reparameterisation trick, in addition to the aforementioned Monte Carlo sampling of expectations, as detailed in Section 2.1.
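For the diagonal-Gaussian choice above, the analytic form of the KL term is the standard identity for the divergence between a diagonal Gaussian and the standard normal prior (with d the latent dimensionality):

```latex
\mathrm{KL}\big(\mathcal{N}(\mu, \operatorname{diag}(\sigma^{2}))\,\big\|\,\mathcal{N}(0, I)\big)
  = \tfrac{1}{2}\sum_{j=1}^{d}\left(\sigma_{j}^{2} + \mu_{j}^{2} - 1 - \log \sigma_{j}^{2}\right)
```

Each summand is non-negative and vanishes exactly when µ_j = 0 and σ_j = 1, i.e. when the approximate posterior matches the prior along that dimension.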
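The pieces described above (analytic KL, reparameterised sampling, a single Monte Carlo sample for training versus many samples for evaluation) can be sketched as follows. This is a minimal illustration with hypothetical toy affine maps standing in for the encoder and decoder networks; the names `encode`, `decode`, and `elbo`, the dimensions, and the weights are assumptions for the sketch, not the actual models used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy affine "networks" standing in for the learned encoder/decoder.
D, Z = 4, 2
W_enc_mu = rng.normal(size=(Z, D))
W_enc_logsig = rng.normal(size=(Z, D)) * 0.1
W_dec_mu = rng.normal(size=(D, Z))
SIGMA_DEC = 1.0 / np.sqrt(2.0)  # fixed decoder standard deviation, as in the text


def encode(x):
    """Approximate posterior parameters mu_enc(x), sigma_enc(x)."""
    mu = W_enc_mu @ x
    sigma = np.exp(W_enc_logsig @ x)  # exp keeps the std positive
    return mu, sigma


def elbo(x, n_samples=1):
    """Monte Carlo ELBO estimate: E_q[log p(x|z)] - KL(q(z|x) || p(z))."""
    mu, sigma = encode(x)
    # Analytic KL between N(mu, diag(sigma^2)) and the prior N(0, I).
    kl = 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))
    # Reparameterisation trick: z = mu + sigma * eps with eps ~ N(0, I),
    # so gradients can flow through mu and sigma.
    eps = rng.normal(size=(n_samples, Z))
    z = mu + sigma * eps
    x_mu = z @ W_dec_mu.T  # decoder mean mu_dec(z), one row per sample
    # log N(x | mu_dec(z), sigma_dec^2 I), averaged over the posterior samples.
    log_px_z = np.mean(
        -0.5 * np.sum((x - x_mu) ** 2, axis=1) / SIGMA_DEC**2
        - 0.5 * D * np.log(2.0 * np.pi * SIGMA_DEC**2)
    )
    return log_px_z - kl


x = rng.normal(size=D)
score_train = elbo(x, n_samples=1)    # single sample suffices during training
score_eval = elbo(x, n_samples=128)   # 128 samples for anomaly scoring
```

With the negative of `elbo` as the loss, an optimiser such as Adam would update the network weights; at evaluation time the 128-sample estimate serves directly as the (approximate) log-likelihood used for anomaly detection.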
3.3.2 Likelihood Estimation with GANs

Another approach to perform anomaly detection with a deep generative model is AnoGAN (Schlegl et al., 2017), which uses a Generative Adversarial Network