Background

disentangled latent variables. Building on this, cc-VAE (Burgess et al., 2018) gradually increases this bottleneck capacity over time, allowing the encoder to learn one generative factor at a time. Chen et al. (2018) show by rewriting the ELBO that it contains a Total Correlation (TC) term (Watanabe, 1960), a measure of dependence between variables. They claim that a heavier penalty on this specific term should induce a more disentangled representation by encouraging independence between latent variables, and thus propose β-TCVAE, in which a weight parameter β over-penalises the TC term, computed using a tractable but biased Monte Carlo estimator. Similarly, FactorVAE (Kim and Mnih, 2018) also over-penalises an additional TC term, but uses adversarial training instead. DIP-VAE-I and DIP-VAE-II (Kumar et al., 2017) both add an additional term that penalises a divergence between the aggregate posterior q(z) and a factorised prior; since using the KL divergence would make this term intractable, they instead propose a moment-matching solution.

Disentanglement metrics Various metrics have been proposed to quantify disentanglement, each aiming to capture desirable properties that a disentangled representation should have. New metrics are often proposed alongside a new disentanglement method, aiming to address issues of previous metrics. The following metrics all assume that a single generative factor should be modelled in a single latent dimension. The Beta metric (Higgins et al., 2017) measures the accuracy of a linear classifier that tries to predict the index of a generative factor that is kept fixed, aiming to measure both the independence and the interpretability of the learned latent variables.
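The Beta metric's evaluation loop can be sketched as follows. This is a minimal illustration, not the original implementation: the `encode` function is a hypothetical, already-disentangled encoder (latent dimension i mirrors generative factor i plus noise), and the linear classifier used by Higgins et al. is stood in for by a simple argmin rule over the averaged difference vector, since the fixed factor's latent dimension should vary least within a batch.

```python
import random


def encode(factors, noise=0.05):
    # Hypothetical perfectly disentangled encoder for illustration:
    # latent dimension i mirrors generative factor i, plus Gaussian noise.
    return [f + random.gauss(0.0, noise) for f in factors]


def beta_metric_sample(n_factors, fixed_index, batch=64):
    # Sample pairs of factor vectors that agree only on the fixed factor,
    # encode both, and average the absolute per-dimension latent difference.
    diffs = [0.0] * n_factors
    for _ in range(batch):
        fixed_value = random.random()
        f1 = [random.random() for _ in range(n_factors)]
        f2 = [random.random() for _ in range(n_factors)]
        f1[fixed_index] = f2[fixed_index] = fixed_value
        z1, z2 = encode(f1), encode(f2)
        for d in range(n_factors):
            diffs[d] += abs(z1[d] - z2[d]) / batch
    return diffs


def beta_metric_accuracy(n_factors=5, n_samples=200):
    # The original metric trains a linear classifier to predict the fixed
    # factor's index from the difference vector; here an argmin rule stands
    # in: the fixed factor's latent dimension shows the smallest average
    # difference under a disentangled encoder.
    correct = 0
    for _ in range(n_samples):
        k = random.randrange(n_factors)
        diffs = beta_metric_sample(n_factors, k)
        if min(range(n_factors), key=lambda d: diffs[d]) == k:
            correct += 1
    return correct / n_samples
```

Under this idealised encoder the accuracy approaches 1; an entangled encoder, in which several factors mix into each latent dimension, would drive it towards chance level.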
The Factor metric (Kim and Mnih, 2018) addresses several issues with this previous metric by using a majority-vote classifier that predicts the index of the fixed generative factor from the index of the latent dimension with the lowest variance. Chen et al. (2018) argue that the Beta and Factor metrics are neither general nor unbiased, since they rely on certain hyperparameters. Instead, they propose the Mutual Information Gap (MIG), which for each generative factor measures the normalised gap in mutual information between the two latent dimensions that have the highest and second-highest mutual information with that factor. Conversely, Modularity (MOD) (Ridgeway and Mozer, 2018) measures whether each latent dimension depends on at most one generative factor, by computing the average normalised squared difference between the mutual information of the
RkJQdWJsaXNoZXIy MjY0ODMw