A very fast way to sort datetime (in "ascend" mode)?
12 views (last 30 days)
Show older comments
A very fast way to sort datetime (in "ascend" mode)?
Here an example:
a = datetime([
'2022-10-27 00:22:50.000'
'2022-10-27 05:29:45.000'
'2022-10-27 05:32:19.400'
'2022-10-27 05:36:44.100'
'2022-10-27 05:39:26.600'
'2022-10-27 05:43:18.200'
'2022-10-27 05:49:17.400'
'2022-10-27 05:55:27.300'
'2022-10-27 05:58:07.000'
'2022-10-27 06:17:13.800'
'2022-10-27 06:41:28.700'
'2022-10-27 07:03:06.000'
'2022-10-27 07:09:49.800'
'2022-10-27 07:17:39.700'
'2022-10-27 07:35:09.000'
'2022-10-27 07:42:33.600'
'2022-10-27 07:46:50.500'
'2022-10-27 08:07:02.700'
'2022-10-27 08:26:29.600'
'2022-10-27 08:45:03.500'
'2022-10-27 08:48:09.700'
'2022-10-27 08:53:57.000'
'2022-10-27 08:59:43.400'
'2022-10-27 09:13:24.100'
'2022-10-27 09:19:25.000'
'2022-10-27 09:26:35.000'
'2022-10-27 09:29:54.000'
'2022-10-27 09:46:45.700'
'2022-10-27 10:10:59.900'
'2022-10-27 10:29:04.600'
'2022-10-27 10:31:50.100'
'2022-10-27 10:37:45.300'
'2022-10-27 10:42:35.800'
'2022-10-27 10:58:42.300'
'2022-10-27 11:03:48.900'
'2022-10-27 11:10:44.700'
'2022-10-27 11:13:11.100'
'2022-10-27 11:31:25.100'
'2022-10-27 11:56:01.200'
'2022-10-27 12:19:25.300'
'2022-10-27 12:25:12.500'
'2022-10-27 12:30:16.900'
'2022-10-27 12:49:19.100'
'2022-10-27 12:55:42.700'
'2022-10-27 12:58:18.100'
'2022-10-27 13:16:06.300'
'2022-10-27 14:00:44.700'
'2022-10-27 14:04:14.200'
'2022-10-27 14:11:54.600'
'2022-10-27 14:17:10.000'
'2022-10-27 14:42:03.300'
'2022-10-27 14:45:22.100'
'2022-10-27 15:02:07.900'
'2022-10-27 15:25:39.600'
'2022-10-27 15:44:58.300'
'2022-10-27 15:48:35.800'
'2022-10-27 15:54:23.100'
'2022-10-27 16:00:17.300'
'2022-10-27 16:19:23.700'
'2022-10-27 16:27:30.800'
'2022-10-27 16:31:30.600'
'2022-10-27 16:52:09.700'
'2022-10-27 17:16:08.800'
'2022-10-27 18:01:39.400'
'2022-10-27 18:08:24.800'
'2022-10-27 18:17:44.500'
'2022-10-27 18:25:02.500'
'2022-10-27 18:27:45.700'
'2022-10-27 18:48:32.000'
'2022-10-27 19:02:01.700'
'2022-10-27 19:24:08.300'
'2022-10-27 19:30:01.400'
'2022-10-27 19:43:02.200'
'2022-10-27 19:48:50.300'
'2022-10-27 19:55:41.900'
'2022-10-27 19:58:23.700'
'2022-10-27 20:17:15.800'
'2022-10-27 20:28:47.600'
'2022-10-27 20:49:36.900'
'2022-10-27 20:53:02.700'
'2022-10-27 21:07:12.100'
'2022-10-27 21:09:47.800'
'2022-10-27 21:50:42.300'
'2022-10-27 22:07:34.100'
'2022-10-27 22:09:18.800'])
% Any way faster than this one ?
tic
[~, idx2] = sort( datenum(a), 1, 'ascend');
toc
0 Comments
Accepted Answer
Stephen23
on 18 Nov 2022
"Any way faster than this one"
Interestingly, sorting seems to be slightly faster with the superfluous call to DATENUM:
a = datetime(['2022-10-27 00:22:50.000';'2022-10-27 05:29:45.000';'2022-10-27 05:32:19.400';'2022-10-27 05:36:44.100';'2022-10-27 05:39:26.600';'2022-10-27 05:43:18.200';'2022-10-27 05:49:17.400';'2022-10-27 05:55:27.300';'2022-10-27 05:58:07.000';'2022-10-27 06:17:13.800';'2022-10-27 06:41:28.700';'2022-10-27 07:03:06.000';'2022-10-27 07:09:49.800';'2022-10-27 07:17:39.700';'2022-10-27 07:35:09.000';'2022-10-27 07:42:33.600';'2022-10-27 07:46:50.500';'2022-10-27 08:07:02.700';'2022-10-27 08:26:29.600';'2022-10-27 08:45:03.500';'2022-10-27 08:48:09.700';'2022-10-27 08:53:57.000';'2022-10-27 08:59:43.400';'2022-10-27 09:13:24.100';'2022-10-27 09:19:25.000';'2022-10-27 09:26:35.000';'2022-10-27 09:29:54.000';'2022-10-27 09:46:45.700';'2022-10-27 10:10:59.900';'2022-10-27 10:29:04.600';'2022-10-27 10:31:50.100';'2022-10-27 10:37:45.300';'2022-10-27 10:42:35.800';'2022-10-27 10:58:42.300';'2022-10-27 11:03:48.900';'2022-10-27 11:10:44.700';'2022-10-27 11:13:11.100';'2022-10-27 11:31:25.100';'2022-10-27 11:56:01.200';'2022-10-27 12:19:25.300';'2022-10-27 12:25:12.500';'2022-10-27 12:30:16.900';'2022-10-27 12:49:19.100';'2022-10-27 12:55:42.700';'2022-10-27 12:58:18.100';'2022-10-27 13:16:06.300';'2022-10-27 14:00:44.700';'2022-10-27 14:04:14.200';'2022-10-27 14:11:54.600';'2022-10-27 14:17:10.000';'2022-10-27 14:42:03.300';'2022-10-27 14:45:22.100';'2022-10-27 15:02:07.900';'2022-10-27 15:25:39.600';'2022-10-27 15:44:58.300';'2022-10-27 15:48:35.800';'2022-10-27 15:54:23.100';'2022-10-27 16:00:17.300';'2022-10-27 16:19:23.700';'2022-10-27 16:27:30.800';'2022-10-27 16:31:30.600';'2022-10-27 16:52:09.700';'2022-10-27 17:16:08.800';'2022-10-27 18:01:39.400';'2022-10-27 18:08:24.800';'2022-10-27 18:17:44.500';'2022-10-27 18:25:02.500';'2022-10-27 18:27:45.700';'2022-10-27 18:48:32.000';'2022-10-27 19:02:01.700';'2022-10-27 19:24:08.300';'2022-10-27 19:30:01.400';'2022-10-27 19:43:02.200';'2022-10-27 19:48:50.300';'2022-10-27 19:55:41.900';'2022-10-27 19:58:23.700';'2022-10-27 20:17:15.800';'2022-10-27 20:28:47.600';'2022-10-27 20:49:36.900';'2022-10-27 20:53:02.700';'2022-10-27 21:07:12.100';'2022-10-27 21:09:47.800';'2022-10-27 21:50:42.300';'2022-10-27 22:07:34.100';'2022-10-27 22:09:18.800'])
timeit(@()sort(a))
timeit(@()sort(datenum(a)))
Lets try some other numeric formats, which might be more accurate over the dates in question:
timeit(@()sort(juliandate(a)))
timeit(@()sort(exceltime(a)))
timeit(@()sort(posixtime(a)))
3 Comments
Bruno Luong
on 18 Nov 2022
Edited: Bruno Luong
on 18 Nov 2022
idx2 should be the same, but I think timeit() results without second output could not be representative as the algorithm might skip some branching.
The relative result also might change with the size of your data.
More Answers (2)
Peter Perkins
on 19 Nov 2022
It seems like the obvious response to
Any way faster than this one ?
[~, idx2] = sort(datenum(a), 1, 'ascend');
is to not pass in parameters that are the defaults. That cuts the time by not quite a factor of two for me. But maybe in your real code you need to pass in non-defaults.
Three things:
1) On my machine, the difference between these two small sorts
[~, idx2] = sort(a, 1, 'ascend');
[~, idx2] = sort(datenum(a), 1, 'ascend');
is literally a few microseconds. The usual response is, "Sure, but I'm doing this in a loop, so a few microseconds adds up". It would help to hear the details of that context.
2) Sure, for small sorts like your example there is a difference. But by the time you have 10000 elements
a = datetime(2022,1,1,0,0,randi(10000,10000,1));
the difference between the two is gone. And the total time is less than .5ms.
3) ALL of those conversions through away precision. In your example, you have resolution of at most .1s, so precision is probably not an issue. But in general, converting datetime to those numeric formats is A BAD IDEA.
2 Comments
Bruno Luong
on 22 Nov 2022
"But in general, converting datetime to those numeric formats is A BAD IDEA."
That's true.
For decades a resolution of 53-bit datenum is the only option, whereas in C 128-bit resolution is used for time enumeration on other languages.
Bruno Luong
on 18 Nov 2022
Edited: Bruno Luong
on 18 Nov 2022
The first peak is due to JIT that is not kicks in.
Here is the result on my PC (code attached)
I have to chase what create the spikes every 33 iterations.
See Also
Categories
Find more on Dates and Time in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!