Latent Space Explorer (LSE) supports the analysis of image datasets via unsupervised machine learning methods. It extracts a compact representation from the data using representation learning models (e.g. autoencoders). The extracted information can then be visualized with the projector, which displays the data interactively in a 2D or 3D space. The system can then run clustering algorithms to detect potentially relevant ways of grouping the data and to support the definition of novel classification schemes.
You can find an overview of the service in the intro video.
To use the tool, please follow the documentation.
If you want to try the projector on some demo experiments, you will find them on your experiment page.
Please consider acknowledging the NEANIAS project if you use the results of this service in any paper or communication.
NEANIAS was funded by the European Union under the Horizon 2020 research and innovation programme via grant agreement No. 863448.
Also, if you make use of the service or the approach in scientific research, please consider citing the paper:
Cecconello, T., Puerari, L., & Vizzari, G. (2021). Unsupervised Data Pattern Discovery on the Cloud. AIxIA 2021 Discussion Papers - co-located with the 20th International Conference of the Italian Association for Artificial Intelligence (AIxIA2021). CEUR Workshop Proceedings Vol. 3078, pages 108-120.
MNIST is a classic dataset for image classification. It consists of 28x28 grayscale images of handwritten digits. Analysing the dataset with the latent space explorer gives a structured overview of its content. Clustering methods such as DBSCAN can help to detect outliers and clean the dataset. The analysis can also help to understand what the neural network learns from the data and to correct hidden biases.
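As an illustration of how density-based clustering can flag outliers, here is a minimal pure-Python sketch of DBSCAN applied to a handful of made-up 2D points standing in for projected latent vectors. The point coordinates, `eps`, and `min_samples` are assumptions chosen for the example, and this is not the implementation used by the latent space explorer.

```python
import math

def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of the current cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels

# Hypothetical 2D projection: two tight groups and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (2.5, 9.0),
]
labels = dbscan(points, eps=0.3, min_samples=3)
outliers = [i for i, label in enumerate(labels) if label == -1]
print(labels)
print(outliers)
```

Points labelled -1 belong to no dense region; in a dataset-cleaning workflow these would be the candidates to inspect by eye before deciding whether to remove them.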
CelebA is a dataset of over 200,000 celebrity images. In this particular experiment we subsampled the dataset to a smaller size of 10,000 images.
The dataset is particularly familiar to all users, so it is a good starting point for understanding the latent space explorer.
CelebA is a challenging dataset to represent, because there are many visual features to capture: accessories, skin tone, hair type, eyes, and so on.
Furthermore, the background and the style of the photos could be learned as features. If the data are organized according to the background, then the neural network has probably not learned what we are looking for.
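One rough way to check whether a grouping follows the background rather than the face is to measure how "pure" each cluster is with respect to a given attribute. The sketch below assumes hypothetical per-image annotations (`backgrounds`, `hair`) and hypothetical cluster ids; none of these values come from the actual experiment.

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_ids, attribute_values):
    """Fraction of items whose attribute matches the majority value of their cluster."""
    groups = defaultdict(list)
    for cluster, value in zip(cluster_ids, attribute_values):
        groups[cluster].append(value)
    majority_total = sum(
        Counter(values).most_common(1)[0][1] for values in groups.values()
    )
    return majority_total / len(cluster_ids)

# Hypothetical cluster assignments and annotations for six images.
clusters = [0, 0, 0, 1, 1, 1]
backgrounds = ["indoor", "indoor", "indoor", "outdoor", "outdoor", "indoor"]
hair = ["dark", "blond", "dark", "blond", "dark", "blond"]

background_purity = cluster_purity(clusters, backgrounds)
hair_purity = cluster_purity(clusters, hair)
print(round(background_purity, 3), round(hair_purity, 3))
```

In this toy case the clusters agree more with the background than with the hair colour, which would be a hint that the representation is organized by a nuisance feature.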
EuroSAT is a collection of images captured by European Space Agency satellites using multiple instruments.
Images have a shape of 64x64 pixels and contain more than the 3 standard RGB channels: 13 spectral bands from 443 nm to 2190 nm form a datacube.
This dataset gets closer to the final use case of the latent space explorer, which was intended to explore astronomical images taken by non-standard instruments.
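To make the size of such a datacube concrete, the following sketch builds a placeholder 64x64x13 cube from nested lists and flattens it into a single vector, as one might before feeding it to a model. The helper names and the zero fill value are assumptions for illustration; real pixel values would come from the dataset files.

```python
HEIGHT, WIDTH, BANDS = 64, 64, 13

def make_datacube(height, width, bands, fill=0.0):
    """Build a height x width x bands cube filled with a placeholder value."""
    return [[[fill] * bands for _ in range(width)] for _ in range(height)]

def shape(cube):
    """Return (height, width, bands) of a nested-list datacube."""
    return (len(cube), len(cube[0]), len(cube[0][0]))

def flatten(cube):
    """Flatten the cube row by row, pixel by pixel, band by band."""
    return [value for row in cube for pixel in row for value in pixel]

cube = make_datacube(HEIGHT, WIDTH, BANDS)
vector = flatten(cube)
rgb_values = HEIGHT * WIDTH * 3  # size of an ordinary RGB image of the same resolution

print(shape(cube))               # (64, 64, 13)
print(len(vector))               # 53248
print(len(vector) / rgb_values)  # about 4.33 times more values than RGB
```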
It consists of 10 classes: Industrial Buildings, Residential Buildings, Annual Crop, Permanent Crop, River, Sea/Lake, Herbaceous Vegetation, Highway, Pasture, and Forest.
Analysing the dataset using the latent space explorer could suggest new classification schemes.