Chapter 2

Background

In this chapter, we summarise relevant background for understanding this thesis. We assume the reader is somewhat familiar with machine learning, neural networks, probability theory, and set theory, as the topics in this chapter build on concepts from these fields.

2.1 Variational Autoencoders

In this section, we describe the variational autoencoder (VAE) (Kingma and Welling, 2013; Rezende et al., 2014), a model that is at the core of most other models discussed in this thesis. A VAE introduces neural networks into a simple latent variable model, in which latent variables z ∈ Z predict data observations x ∈ X. There is a prior p(z) over the latent space Z, which is typically fixed, as well as a parametric conditional distribution pθ(x|z), where θ denotes the parameters of this distribution. In a VAE, these parameters are modelled with a neural network, called the decoder, since it decodes latent variables into a distribution over the data space X. Together, these distributions constitute the generative model p(z)pθ(x|z), which, once trained, allows us to sample new data points, making the VAE a generative model.

To train the model, we wish to learn the parameters θ through maximum likelihood estimation, i.e. we wish to maximise the marginal log-likelihood log pθ(x) = log ∫z pθ(x|z) p(z) dz with respect to the parameters θ. However, since pθ(x|z) is parameterised by a neural network, integrating over z to obtain gradients for gradient-based learning is intractable. Moreover, methods such as the
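To make the roles of these components concrete, the following is a minimal sketch (not taken from the thesis) of the generative model p(z)pθ(x|z) in PyTorch. It assumes a standard normal prior, a Bernoulli likelihood over binarised data, and illustrative layer sizes; all class and variable names are hypothetical.

# Minimal sketch of the VAE generative model p(z) p_theta(x|z).
# Assumptions (not from the thesis): standard normal prior, Bernoulli
# likelihood over binarised data, and illustrative dimensions.
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Maps a latent variable z to the parameters of p_theta(x|z)."""
    def __init__(self, latent_dim=2, hidden_dim=128, data_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, data_dim),
        )

    def forward(self, z):
        # The network output parameterises the conditional distribution p_theta(x|z).
        logits = self.net(z)
        return torch.distributions.Bernoulli(logits=logits)

# Sampling from the generative model: z ~ p(z), then x ~ p_theta(x|z).
prior = torch.distributions.Normal(torch.zeros(2), torch.ones(2))
decoder = Decoder()
z = prior.sample()        # draw a latent variable from the (fixed) prior
x = decoder(z).sample()   # decode it into a sample from the data space

The decoder here plays exactly the role described above: it turns a latent variable into a distribution over the data space, and the choice of likelihood (Bernoulli in this sketch) would depend on the type of data being modelled.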