
Quantifying and Learning Linear Symmetry-Based Disentanglement (LSBD)

4.6 Experimental Setup

4.6.1 Datasets

We evaluate the disentanglement of several models on three different image datasets (Square, Arrow, and Airplane) with a known group decomposition G = SO(2) × SO(2) describing the underlying transformations. For each subgroup a fixed number of |G_k| = 64 (with k ∈ {1, 2}) transformations is selected, resulting in datasets with |G_1| · |G_2| = 4096 images. All datasets contain 64 × 64 pixel images. The datasets exemplify different group actions of SO(2): periodic translations, in-plane rotations, out-of-plane rotations, and periodic hue-shifts. The top row of Figure 4.4 shows example images of each of these datasets; more details are provided below.

In real settings, not all variability in the data can be modelled by the actions of a group. Therefore, we also evaluate the same models on two datasets, ModelNet40 (Wu et al., 2014) and COIL-100 (Nene et al., 1996), that consist of 64 × 64 pixel images (i.e. 2D observations) of various 3D objects under known out-of-plane rotations. In many settings it is easy to obtain labels for such rotations, e.g. when the camera or object angle is controlled by an agent. For these datasets the group G = SO(2) describes the underlying transformations that each object undergoes. The different objects can be seen as non-symmetric variability in the data. In this particular case, each object has its own base-point x_0 from which data is generated. The metric D_LSBD is then evaluated per object instance for the group G = SO(2), and the resulting values are averaged across all available objects. The bottom row of Figure 4.4 shows example images of each of these datasets; more details are provided below.

Note that we do not evaluate our LSBD-VAE method and D_LSBD metric on traditional disentanglement datasets such as those evaluated by Locatello et al. (2018), since these datasets lack a clear underlying group structure.
However, our results on the ModelNet40 and COIL-100 datasets show that our method can disentangle properties with a group structure from properties without such a structure.

Square

The Square dataset consists of a set of images of a black background with a square of 16 × 16 white pixels. The dataset is generated by applying vertical and horizontal translations to the white square, with periodic boundaries.
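
The construction of the Square dataset described above can be sketched as follows. This is a minimal illustration and not the authors' generation code. It assumes NumPy, one-pixel translation steps (64 shifts across 64 pixels per axis), a base image with the square in the top-left corner, and pixel values of 0 and 1. The function name make_square_dataset and its parameters are hypothetical.

```python
import numpy as np


def make_square_dataset(image_size=64, square_size=16, n_shifts=64):
    """Sketch of a Square-like dataset (hypothetical helper, see assumptions above).

    Returns
    -------
    data : array of shape (n_shifts, n_shifts, image_size, image_size)
        data[i, j] is the base image translated by i steps vertically
        and j steps horizontally, wrapping around at the borders.
    angles : array of shape (n_shifts,)
        Angle in [0, 2*pi) associated with each shift index, i.e. the
        SO(2) element that the periodic translation represents.
    """
    # Assumed base point x_0: white square in the top-left corner.
    base = np.zeros((image_size, image_size), dtype=np.float32)
    base[:square_size, :square_size] = 1.0

    # Assumed evenly spaced shifts; with 64 shifts on 64 pixels the step is 1.
    step = image_size // n_shifts

    data = np.empty((n_shifts, n_shifts, image_size, image_size), dtype=np.float32)
    for i in range(n_shifts):
        for j in range(n_shifts):
            # np.roll wraps pixels around the border, which implements
            # the periodic boundary conditions.
            data[i, j] = np.roll(base, shift=(i * step, j * step), axis=(0, 1))

    angles = 2.0 * np.pi * np.arange(n_shifts) / n_shifts
    return data, angles


if __name__ == "__main__":
    data, angles = make_square_dataset()
    # 64 vertical times 64 horizontal translations gives 4096 images.
    print(data.reshape(-1, 64, 64).shape)
```

Indexing the array by the two shift indices keeps the product structure G = SO(2) × SO(2) explicit: each axis corresponds to one subgroup, and the angles array gives the transformation label for each index.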
