Hi Eunice,
It took me a while to research answers to your questions, because MATLAB Answers is a highly respected platform. Many posters here have strong backgrounds in electrical, computer, and biomedical engineering, data science, RF, and related fields, and they often ask insightful questions about their problems. My goal has always been to stay humble when answering.
Given your background in machine learning and your interest in clustering methods beyond the basics, it's great that you want to explore more advanced techniques such as clustering with a Variational Autoencoder (VAE). The paper you referenced does indeed apply a clustering method to the latent dimensions produced by a VAE, which can be a powerful approach for unsupervised learning tasks. Now, let me address your questions.
Question#1: How can you access the latent dimensions?
To access the latent dimensions of a VAE, you typically pass your data through the encoder part of the model and extract the mean and variance vectors that parameterize the latent space. To keep the demonstration simple, the snippet below skips the VAE itself and uses random 2-D data as a stand-in for latent features, so we can focus on the clustering step. The code first generates 100 random data points with 2 dimensions using randn. It then sets the number of clusters (num_clusters) to 3 and the maximum number of iterations (max_iterations) to 100, and applies the K-means algorithm to the data via kmeans, which returns the cluster indices (idx) and the centroid locations (C). Finally, gscatter plots the clustered data, with axis labels and a legend identifying the clusters.
% Code snippet example
% Generate generic data for visualization
data = randn(100, 2); % Generating 100 data points with 2 dimensions
% Define variables for clustering
num_clusters = 3; % Number of clusters
max_iterations = 100; % Maximum number of iterations for clustering
% Perform clustering on the latent dimensions
[idx, C] = kmeans(data, num_clusters, 'MaxIter', max_iterations);
% Plot the clustered data
figure;
gscatter(data(:,1), data(:,2), idx);
title('Clustering with VAE Latent Dimensions');
xlabel('Dimension 1');
ylabel('Dimension 2');
legend('Cluster 1', 'Cluster 2', 'Cluster 3');
Please see the attached plot along with the code snippet.
Once you swap the synthetic data for the latent vectors produced by your VAE's encoder, these same steps let you cluster and visualize the latent space in MATLAB.
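If you want to go beyond the synthetic stand-in, here is a minimal sketch of pulling the latent means out of a trained encoder and clustering them. It assumes your encoder is a dlnetwork named encoderNet whose output stacks the mean and log-variance vectors, and that XBatch is a dlarray of inputs; those names and that output layout are placeholders, so adjust them to match your own model.
% Sketch: clustering the latent means of a trained VAE encoder
% (encoderNet, XBatch, and the stacked [mu; logVar] output are assumptions)
encoded = predict(encoderNet, XBatch);   % forward pass through the encoder
latentDim = size(encoded, 1) / 2;        % first half of the output is mu
mu = encoded(1:latentDim, :);            % mean vectors of the latent space
Z = extractdata(mu)';                    % observations-by-dimensions for kmeans
[idx, C] = kmeans(Z, 3, 'MaxIter', 100); % cluster in the latent space
The key point is that clustering operates on mu (the latent means) rather than on sampled latent codes, which keeps the cluster assignments deterministic.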
Question#2: Am I going in the wrong direction with this idea? Are there more suited neural architectures for clustering?
Well, it ultimately depends on your specific data and objectives. In my opinion, VAEs are well suited to generative modeling and dimensionality reduction, but other architectures, such as deep autoencoders, self-organizing maps, or deep belief networks, can also be effective for clustering tasks.
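As a quick illustration of the deep-autoencoder route, the sketch below trains a plain autoencoder with the Deep Learning Toolbox functions trainAutoencoder and encode, then clusters the bottleneck features. The data here is random and only a placeholder for your own feature matrix.
% Sketch: clustering on a plain autoencoder's bottleneck features
% (X is synthetic stand-in data; each column is one observation)
X = randn(10, 500);                 % 10 features x 500 observations
autoenc = trainAutoencoder(X, 2);   % autoencoder with a 2-unit bottleneck
features = encode(autoenc, X);      % 2 x 500 matrix of latent features
[idx, C] = kmeans(features', 3);    % cluster in the learned 2-D space
Unlike a VAE, this gives you a single deterministic code per observation, which is often all you need when the goal is clustering rather than generation.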
By experimenting with different methods and staying curious about new developments in the field, you can continue to expand your knowledge and skills in machine learning. Please let me know if you have any further questions.
Good luck with your exploration!