Fast way to perform multiple searches on a large array

I have a large time series array (10,000,000 elements):
ts = [2; 1; 3; 4; 6; 7; .......]
I have a corresponding time array (same size as the above):
times = [d1; d2; d3; d4; d5.......]
I have 2 arrays of start times and end times (also large, ~30,000 elements):
st = [dd1 dd2 dd3 ....]
en = [de1 de2 de3 ....]
I need to create a new matrix with many, many finds. The logic is:
results = NaN(300, numel(st));
for i = 1:numel(st)
    temp = ts(find(times > st(i) & times < en(i), 300, 'first'));
    results(:,i) = temp;
end
Is there any way I can do this faster (ideally without a loop)?
  • I have a 64 bit version so I can try a large in-memory solution.
Many thanks in advance, Nigel

8 Comments

I assume "time" should be "times" inside the loop.
you are absolutely right - apologies for the typo.
st is also fairly large - 30000 or so records
Do the intervals [st(i):en(i)] overlap?
Just to confirm times, st and en are all sorted?
Yes, they are sorted, and en(i) - st(i) = 300 seconds.


 Accepted Answer

I think that by dumping the past times you might be able to speed up the find. If st(i+1) > en(i), then you could dump even more elements, but I think the savings would be small. This code relies on times, st, and en being sorted.
results = NaN(300, numel(st));
offset = 0;
for i = 1:numel(st)
    idx = find(times > st(i), 1, 'first');
    offset = offset + idx - 1;     % absolute index into the original ts
    times = times(idx:end);        % drop the times we have passed
    results(:,i) = ts(offset+1 : offset+300);
end

1 Comment

Hi Daniel,
I used a modified version of your solution. Indeed it is a LOT quicker to search over smaller sized arrays.
Thank you all for your help.
N.


More Answers (2)

Never let an array grow in each iteration! Pre-allocate the output:
results = NaN(300, numel(st));
for i = 1:numel(st)  % Not size(st), which is a vector!
    temp = ts(find(times > st(i) & times < en(i), 300, 'first'));
    if length(temp) == 300
        results(:, i) = temp;
    else
        results(1:length(temp), i) = temp;
    end
end
results = results(~isnan(results));
If st and times are sorted, it wastes a lot of time to compare all values. But vectorizing this would need a very large matrix, so I assume it would be slower than the loop.
Can you solve the problem by using HISTC?
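If times and st are both sorted, HISTC can indeed locate every window start in a single call, which also removes the loop entirely. A sketch, assuming every st(i) lies inside the range of times and every window really has a full 300 samples (as confirmed above):
```matlab
% histc with the sample times as bin edges returns, for each st(i), the
% index of the last sample with times <= st(i) (0 if st(i) < times(1)).
[~, lastAtOrBefore] = histc(st, times);
firstAfter = lastAtOrBefore + 1;            % first index with times > st(i)
% Build all 300 sample indices per window at once: 300-by-numel(st)
idxMat = bsxfun(@plus, firstAfter(:).', (0:299).');
results = ts(idxMat);                       % same shape as idxMat, no loop
```
The index matrix holds 300*numel(st) doubles (roughly 72 MB for 30,000 windows), which should be comfortable on a 64-bit machine.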

6 Comments

Unfortunately this method will not improve much in terms of speed because my results are already pre-allocated. I used vertcat to simplify the illustrated example.
The histc will also not work because I have an additional search element in the find that I have not mentioned in the question.
I can guarantee that the length of temp will always be 300 and never less, so the if statement is also not needed.
As you guessed correctly both st and times are sorted.
Well, then this answer wasted your and my time. Please include the pre-allocation and all other relevant details if you post a code snippet and ask for a speed improvement. Otherwise the attempts to optimize your code start from a very unrealistic point.
My apologies Jan. I do really appreciate you taking time to view this.
Since you say you know it will always have 300 entries, then this:
find(times > st(i) & times < en(i), 300, 'first');
may as well be this:
find(times > st(i), 300, 'first');
and since times and st are sorted:
find(times > st(i), 1, 'first') + (0:299)
WOW, by removing the < en(i) the processing time nearly halved!!
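For reference, here is how that simplified search slots back into the original loop — a sketch that assumes times and st are sorted and every window really has 300 samples:
```matlab
results = NaN(300, numel(st));
for i = 1:numel(st)
    % One comparison per element instead of two; en(i) is no longer needed
    first = find(times > st(i), 1, 'first');
    results(:, i) = ts(first : first+299);   % the next 300 samples
end
```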


Certainly, taking away the < en(i) helped. I'm a little hesitant to implement the dumping of past times, though, because I need that data for something a little later on.
Just for my own learning I would really like to know how could I vectorise this operation such that I didn't need to do this in a loop.
Thank you all once again for taking the time to look at and respond to my question.
N.

2 Comments

Well, then at least do the consecutive finds on shortened sections of times (with 'offset' as in Daniel's example):
idx = find(times(offset+1:end) > st(i), 1, 'first');
Then you'd get the benefit of searching increasingly shorter arrays, but without losing the data.
I wonder if this would be faster. I would hope MATLAB is smart enough not to have to reallocate memory for my method. Yours is probably a little safer. I was also thinking that working from the end backwards might ultimately be the fastest.
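Fleshing that comment out, the whole loop might look like this — a sketch assuming sorted inputs and full 300-sample windows; times itself is left intact:
```matlab
results = NaN(300, numel(st));
offset = 0;
for i = 1:numel(st)
    % Search only the tail that can still contain the next window start
    idx = find(times(offset+1:end) > st(i), 1, 'first');
    offset = offset + idx - 1;               % window start index minus 1
    results(:, i) = ts(offset+1 : offset+300);
end
```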

