pdist and sparse vectors/graphs

Roberto

08 Feb 2025

This topic has been permanently closed and transferred to MATLAB Answers.

The pdist function allows the user to specify the function coding the similarity between rows of matrices (called DISTFUN in the documentation).

With the increasing diffusion of large datasets, techniques to sparsify graphs are increasingly being explored and used in practical applications. It is easy to code one's own DISTFUN such that it returns a sparse vector. Unfortunately, pdist (and pdist2) will still return a dense vector, which for very large graphs causes an out-of-memory error. The offending code (line 434 onward) is

    % Make the return have whichever numeric type the distance function
    % returns, or logical.
    if islogical(Y)
        Y = false(1,n*(n-1)./2);
    else % isnumeric
        Y = zeros(1,n*(n-1)./2, class(Y));
    end
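To make the problem concrete, here is a minimal sketch of a sparsifying DISTFUN together with a rough estimate of the dense output size. The Gaussian similarity, the parameters sigma and tau, and the handle name sparseSim are illustrative assumptions, not anything pdist prescribes. The subtraction XJ - xi relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical sparsifying DISTFUN with the (xi, XJ) signature pdist expects:
% xi is one observation (1-by-p), XJ holds m other observations (m-by-p),
% and the result is an m-by-1 vector.
sigma = 1.0;   % kernel width (assumed value)
tau   = 0.1;   % similarities below tau are treated as "no edge" (assumed value)

gaussSim  = @(xi, XJ) exp(-sum((XJ - xi).^2, 2) ./ (2*sigma^2));
dropSmall = @(s) sparse(s .* (s >= tau));        % zero out weak links, store sparsely
sparseSim = @(xi, XJ) dropSmall(gaussSim(xi, XJ));

% Calling the handle directly, outside pdist:
s = sparseSim([0 0], [0 1; 3 4; 0.5 0]);         % 3-by-1 sparse, the far point is dropped

% Size of the dense vector pdist allocates for n observations of class double:
n       = 1e5;
nPairs  = n*(n-1)/2;          % number of pairwise entries
denseGB = 8*nPairs/1e9;       % about 40 GB, regardless of how many entries are zero
```

Even if only a small fraction of the pairs survive the threshold, the stock pdist allocates all nPairs entries up front.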

To have pdist return a sparse vector, the only modification that is required is

    if islogical(Y)
        Y = false(1,n*(n-1)./2);
    elseif issparse(Y)
        Y = sparse(1,n*(n-1)./2);
    else % isnumeric
        Y = zeros(1,n*(n-1)./2, class(Y));
    end

It is a bit more work to modify squareform so that it produces a sparse matrix from the sparse vector returned by the modified pdist. The squareform function includes several checks on its inputs, but the core functionality for sparse vectors would be given by something like

    % given a sparse vector d returned by pdist, compute a sparse squareform
    [~,j,v] = find(d);
    [m,n] = pdist_ind2sub(j, nobs);
    W = sparse(m, n, v, nobs, nobs);
    W = W + W';

Here, pdist_ind2sub is a function that, given a set of indices into a vector produced by pdist, returns the corresponding subscripts in a (triangular) matrix. Computing these subscripts requires the number of observations given to pdist, i.e. n in the preceding code (nobs in this snippet). I could not figure out a way to use the adjacency function to accomplish this.
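One possible sketch of such a helper follows. It assumes the documented pdist ordering, in which the output lists the pairs (2,1), (3,1), ..., (n,1), (3,2), ..., (n,n-1), i.e. the strict lower triangle taken column by column. The name pdist_ind2sub is the hypothetical one used above, and discretize requires R2015a or later.

```matlab
function [row, col] = pdist_ind2sub(k, nobs)
% Map linear indices k into a pdist vector to (row, col) subscripts
% of the strict lower triangle of an nobs-by-nobs matrix (row > col).
% Assumes pdist's column-by-column ordering of the lower triangle.
    sz = size(k);
    k  = k(:);
    % starts(c) is the pdist index of the first entry of column c,
    % which holds rows c+1..nobs; there are nobs-1 such columns.
    starts = cumsum([1, (nobs-1):-1:2]).';
    % Column c is the bin with starts(c) <= k < starts(c+1).
    col = discretize(k, [starts; Inf]);
    % The offset within the column gives the row, which begins at col+1.
    row = k - starts(col) + col + 1;
    % Return subscripts in the same shape as the input indices.
    row = reshape(row, sz);
    col = reshape(col, sz);
end
```

For example, with nobs = 4 the indices 1:6 map to rows [2 3 4 3 4 4] and columns [1 1 1 2 2 3]. The lookup table starts has only nobs-1 entries, so its memory cost is negligible next to the pairwise vector itself.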
