
4.7 Results: Evaluating LSBD with D_LSBD

Figure 4.8 shows Arrow images and their reconstructions using Quessard et al. (2020)’s method. Since colour isn’t learned well, this example doesn’t get a good D_LSBD score, even though rotation is properly linearly disentangled.

(a) Input. (b) Reconstructions.
Figure 4.8: Results from Quessard et al. (2020)’s method on the Arrow dataset.

Furthermore, we tested Forward-VAE (Caselles-Dupré et al., 2019), but we could not produce any reasonable results on our datasets. Therefore, we do not include scores for this method. We did manage to reproduce Forward-VAE’s results on the Flatland dataset used in the original paper, for which we computed a mean D_LSBD score of 0.012 with standard deviation 0.001 over 10 runs. This confirms that Forward-VAE indeed learns LSBD representations for Flatland.

4.7.3 LSBD Representations Also Satisfy Previous Disentanglement Notions

Our results also indicate that LSBD captures various desirable properties that are expressed by traditional disentanglement metrics. In Figure 4.9 we compare D_LSBD scores with scores for previous disentanglement metrics. Note that for D_LSBD lower is better, whereas for all other metrics higher is better. As we noted before, good scores on traditional disentanglement metrics don’t necessarily imply good D_LSBD scores. Conversely, however, methods that score well on D_LSBD also score well on many traditional disentanglement metrics, often even outperforming the traditional methods. In particular, from the full results (see Section 4.7.4) we see that LSBD-VAE matches or outperforms the traditional
