112 Out-of-Distribution Generalisation with LSBD Representations

For the dSprites and 3D Shapes datasets, we follow the experimental setup of Montero et al. (2021), defining three experiments for each dataset: recombination-to-element (RTE), recombination-to-range (RTR), and extrapolation (EXTR). Table 5.1 summarises the factor combinations that are left out as OOD combinations for each of these settings on both datasets. Note that the factors scale and x-position for dSprites, and object scale, orientation, and all hues for 3D Shapes, are given as values from 0 to 1. Hue values above 0.5 correspond to cyan, blue, and purple.

Table 5.1: OOD splits for dSprites and 3D Shapes.

Dataset    Setting  Left-out (OOD) factor combination
dSprites   RTE      shape = ellipse, scale < 0.6, 120° < orientation < 240°, x-position ≥ 0.6
dSprites   RTR      shape = square, x-position ≥ 0.5
dSprites   EXTR     x-position > 0.5
3D Shapes  RTE      floor hue ≥ 0.5, wall hue ≥ 0.5, object hue ≥ 0.5, object shape = cylinder, object scale = 1, orientation = 0
3D Shapes  RTR      object hue ≥ 0.5, object shape = oblong
3D Shapes  EXTR     floor hue ≥ 0.5

5.3.3 LSBD-VAE

To investigate the generalisation of LSBD representations, we train an LSBD-VAE as proposed in Section 4.5. Details of this model are given there, but we repeat some of the key components here for completeness. We choose this model because it can be trained with supervision on the underlying transformations in a batch of data points, so that we can easily and reliably learn LSBD representations for the training data even with challenging OOD splits.

Like a regular Variational Autoencoder (VAE), LSBD-VAE consists of an encoder (or approximate posterior) $q(Z|X)$, a prior $p(Z)$, and a decoder $p(X|Z)$, where $X$ and $Z$ are the data space and latent space, respectively. Given a group decomposition $G = G_1 \times \ldots \times G_K$ that represents the symmetry structure underlying the data, LSBD-VAE defines suitable topologies for the corresponding latent subspaces $Z = Z_1 \times \ldots \times Z_K$ and a matching linearly disentangled group representation $\rho$.
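For reference, the action of a linearly disentangled representation can be written out per subspace. The block below restates the standard definition; the component notation for the group element and the latent point is assumed here for illustration and is not quoted from Section 4.5.

```latex
\rho(g)\, z = \bigl(\rho_1(g_1)\, z_1,\; \ldots,\; \rho_K(g_K)\, z_K\bigr),
\qquad g = (g_1, \ldots, g_K) \in G, \quad z = (z_1, \ldots, z_K) \in Z .
```

Each subgroup thus acts only on its own latent subspace and leaves the others unchanged.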
The factors in our Square and Arrow datasets can be described with subgroups $G_k = SO(2)$, i.e. the special orthogonal group of 2D rotations. As matching latent subspaces we use unit circles (or 1-spheres) $Z_k = S^1 = \{z \in \mathbb{R}^2 : \|z\| = 1\}$. The group representations $\rho_k$ are then 2D rotation matrices.
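The group action on such a latent space can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming K = 2 factors; the function names `rotation_matrix` and `act_on_latent` are hypothetical and are not taken from the LSBD-VAE implementation.

```python
import numpy as np


def rotation_matrix(angle: float) -> np.ndarray:
    """2D rotation matrix, i.e. the representation rho_k(g_k) of g_k in SO(2)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])


def act_on_latent(z: np.ndarray, angles: np.ndarray) -> np.ndarray:
    """Apply g = (g_1, ..., g_K) to z = (z_1, ..., z_K), one rotation per subspace.

    z has shape (K, 2), with one point on the unit circle per row;
    angles has shape (K,), with one rotation angle per subgroup.
    """
    return np.stack([rotation_matrix(a) @ z_k for a, z_k in zip(angles, z)])


# Two factors (K = 2), each encoded as a point on the unit circle S^1.
z = np.array([[1.0, 0.0], [0.0, 1.0]])
# Rotate the first subspace by 90 degrees and leave the second untouched.
z_new = act_on_latent(z, np.array([np.pi / 2, 0.0]))
```

Because every rotation matrix is orthogonal, each transformed latent code stays on its unit circle, and a transformation in one subgroup changes only the matching subspace.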