Improve performance of my code

4 views (last 30 days)
Suraj Shetiya
Suraj Shetiya on 21 Apr 2017
Answered: Greg Dionne on 24 Apr 2017
I was working on time warping on sign language and it seems to take about an hour to classify around a 1000 objects. I have heard that it takes 5 mins to run for other friends of mine.
Note that this is a assignement, which I have solved but I need to know the source of my performance hit.
The ASL data set and their format is as below.
-------------------------------------------------
object ID: 1121
class label: 8
sign meaning: 4
dominant hand trajectory:
-0.488893 0.470786
-0.488893 0.470786
-0.488893 0.470786
-0.488893 0.470786
-0.488893 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-0.466259 0.470786
-------------------------------------------------
There are multiple such objects and I have read each object into a struct. So, I have a vector of these structs which are created in the format below.
struct('ID', current_id, 'label', current_class_label, 'sign', current_sign, 'data', current_data)
The algorithm for time warping is a dynamic programming solution that calculates the overall cost for the match. The matlab code that corresponds to this algorithm is as below.
function cost = cost_calc(pattern1, pattern2)
new_mat = zeros(size(pattern1,1), size(pattern2,1));
new_mat(1,1) = distance_2d( pattern1(1, :), pattern2(1, :));
for i = 2:size(pattern2,1)
new_mat(1,i) = new_mat(1, i-1) + distance_2d( pattern1(1, :), pattern2(i, :));
end
for i = 2:size(pattern1,1)
new_mat(i,1) = new_mat(i-1, 1) + distance_2d( pattern1(i, :), pattern2(1, :));
for j = 2:size(pattern2,1)
new_mat(i,j) = min([new_mat(i-1,j), new_mat(i-1, j-1), new_mat(i, j-1)]) + distance_2d( pattern1(i, :), pattern2(j, :));
end
end
cost = new_mat(end,end);
end
Below is the code for Euclidean distance, that I am using.
function d = distance_2d(p1, p2)
d = norm(p1 - p2);
end
The call the cost_calc function is happening as below. obj being a test object.
distances = arrayfun(@(x) cost_calc(obj.data, x.data), train);
When I run the profiler on my code, I can find that distance_2d which finds the Euclidean distance, seems to take 2-2.5s for every test object, the min call seems to be taking a little above 2s while the overall DTW method seems to take around 4.5s for every test object.
Please note that I cant post the entire code here as it is against the rules of my university.
Any pointers ?

Answers (2)

Eamonn
Eamonn on 21 Apr 2017
Five minutes!!?
You can do this in under 0.5 seconds.
Use the UCR suite, google UCR Suite DTW
  1 Comment
Suraj Shetiya
Suraj Shetiya on 21 Apr 2017
Thanks for the link I will use that to implement my DTW.
But, can you point me to why my code runs slow so that I get to know what to use and what not for speed ?

Sign in to comment.


Greg Dionne
Greg Dionne on 24 Apr 2017
There is a version of DTW in the Signal Processing Toolbox that can accommodate multivariate data, it also has a version for searching via time-stretching with normalization (FINDSIGNAL).
Since it looks like you're implementing the "classic" unconstrained algorithm, you could first compute a point-wise distance matrix first (call it "D") where D(i,j) = norm(x(:,i)-y(:,j)). Note that I've changed the order in which your data appears to keep elements that you expect to access together in the first index -- MATLAB is column-major in its ordering (not row-major), so you'll see this convention frequently in routines that use MATLAB.
Something like:
function d = getDistanceMatrix(x, y)
d = zeros(size(x,2),size(y,2));
for c=1:size(x,1);
[u,v] = meshgrid(y(c,:),x(c,:));
d = d + conj(u-v).*(u-v);
end
d = sqrt(d);
end
Then you could feed "D" into a routine that computes the cumulative distance matrix, "C". The benefit of splitting out the distance computation from the cumulative computation is now (with a little bit of arithmetic) you can use purely vector operations to do this along each column.
Something like:
function C = cumulativeDistance(D)
m = size(D,1);
n = size(D,2);
C=NaN(size(D));
Dcum = cumsum(D);
C(:,1) = Dcum(:,1);
for j=2:n
prevmin = [C(1,j-1); min(C(1:end-1,j-1),C(2:end,j-1))];
delta = prevmin - [0; Dcum(1:end-1,j)];
C(:,j) = cummin(delta)+Dcum(:,j);
end
I think you can take it from there.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!