Get datenum to return missing instead of throwing an error

Hello,
I normally program in R, so apologies if this question doesn't make sense or I missed an easy answer. I am using datenum to read a vector of character dates from a text file, which occasionally contains strings such as "NA" or "NaN". When this happens, datenum quits and throws an error, but I want datenum to return <missing> and keep going. I get that behavior when I try datenum(NaN) in the console, but not when applying datenum to a character vector.
I don't want to pre-process the data by stripping out the "NA" strings, as that seems inelegant. I want Matlab to (possibly with a warning) return a missing value if it can't parse the date, rather than signal an error (this is how most date functions in R behave). How can I do this? I think the answer lies in try-catch, but I am not sure of the specifics of Matlab error handling.
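For reference, the try-catch idea mentioned above could look roughly like the sketch below. It is only an illustration: the sample strings and the 'yyyy-mm-dd' format are assumptions, not taken from the real data. Each element is converted on its own, and anything datenum cannot parse is left as NaN.

```matlab
% Made-up sample input; the real strings and format may differ
raw = {'2022-05-01'; 'NA'; '2022-05-03'};

% Preallocate with NaN so unparseable entries stay missing
dn = nan(size(raw));
for k = 1:numel(raw)
    try
        dn(k) = datenum(raw{k}, 'yyyy-mm-dd');
    catch
        % datenum could not parse this entry; keep the NaN placeholder
    end
end
```

Converting one element at a time is slower than a single vectorized call, so this is mainly useful when only a few entries are expected to be bad.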

3 Comments

There may be a simple solution, but it will depend on what you are doing. Could you share some of your code, including how you load the data and where you use datenum? Please also attach your dataset using the paperclip icon.
I cannot share the data as it is confidential, but I will try to elaborate upon what I'm doing below:
So far, there are two different places in my code where I am working with dates: I have a list of files with dates in the name which I extract using regexp, then convert to dates. The issue here was that I used char() at some point, which could in theory cause datenum to fail if regexp returned a pattern it could not parse, but that didn't happen.
The files themselves contain dates, but the dates are numbers (POSIX time), with some "NA" values. Upon further investigation, it seems that sometimes readtable() correctly converts the fields to numeric with NaN or NaT values instead of 'NA', and other times it does not. I'm looping over the files, but the debugger does not tell me which iteration caused the loop to fail (so if you could point me to a way of figuring that out, that itself would be very helpful).
Here is the first part of my code, which extracts dates from the filenames (this works fine):
%Files and dates to loop over
files = struct2table(dir("dirname"));
filelist = char(files.name); filelist = filelist(3:end, :); %skip the first two rows; length() on a char matrix returns its longest dimension, not the row count
dates = regexp(cellstr(filelist), '\d{4}-\d{2}-\d{2}', 'match');
dates = datenum(vertcat(dates{:}));
Here is the relevant portion of the second part, which fails on a particular iteration (which I am having trouble identifying):
header = "dirname/";
parfor t = 1:size(filelist, 1)
file = filelist(t, :);
data = readtable(strcat(header, file));
data.date = datenum(data.date); %usually works, sometimes results in "cannot parse date na"
...
end
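On the question of finding which iteration fails: one option is to wrap the conversion in try/catch and record the loop index whenever an error occurs. The sketch below is a hedged illustration only. It uses a plain for loop and a made-up cell array (`inputs`) standing in for the date column of each file, so it runs without any files on disk; the 'yyyy-mm-dd' format is likewise an assumption.

```matlab
% Hypothetical stand-ins for the date column read from each file
inputs = {{'2022-05-01'; '2022-05-02'}, {'2022-05-03'; 'NA'}, {'2022-05-04'}};

failed = false(1, numel(inputs));
for t = 1:numel(inputs)
    try
        dn = datenum(inputs{t}, 'yyyy-mm-dd');
    catch err
        % Record the iteration and report the original error message
        failed(t) = true;
        warning('Iteration %d failed: %s', t, err.message);
    end
end

% Indices of the iterations that raised an error
failedIdx = find(failed);
```

In the real loop, printing the file name alongside `t` in the warning would point straight at the problem file.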


Accepted Answer

I would start by looking into the standardizemissing function.

5 Comments

Unfortunately, that still causes datenum to fail: "cannot parse date ." (it does not recognize the empty string as missing). Rather than trying to fix the input, I think it would be preferable to get datenum to return missing if it can't parse the date.
It is actually preferable to not use datenum at all. If you could share your data and code, we could probably be of more help.
I see. Is there some other function I should be using? The input data are in terms of the number of days since 1970-01-01, so I thought I could convert them by using datenum (and then adding the difference in days between 0000-01-01 and 1970-01-01).
You could use the caldays function to add the elapsed days to your base date.
% imported elapsed days
d = 1:5;
% Add to base date
d0 = datetime(1970,1,1);
D = d0 + caldays(d)
D = 1×5 datetime array
02-Jan-1970 03-Jan-1970 04-Jan-1970 05-Jan-1970 06-Jan-1970
Thank you! This works perfectly, and matches the format I need.
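To tie this back to the "NA" entries in the original question, here is a small sketch, assuming the day counts arrive as text (the sample values are made up). str2double turns anything non-numeric into NaN, and adding a NaN duration to a datetime gives NaT, so no error handling is needed. It uses days rather than caldays only because that is the variant I am sure accepts NaN.

```matlab
% Made-up day counts since 1970-01-01, with one missing entry
raw = ["19000"; "NA"; "19002"];

% Non-numeric text such as "NA" becomes NaN
d = str2double(raw);

% NaN day counts propagate to NaT in the datetime result
D = datetime(1970, 1, 1) + days(d);

% Equivalent serial date numbers, if datenum values are still needed
dn = d + datenum(1970, 1, 1);
```

isnat(D) or ismissing(D) then flags the rows that had no usable date.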


More Answers (1)

As @Cris LaPierre said, we recommend using datetime instead of datenum. Let's create some example data.
t = string(datetime('today') + days([1; -2; 5; -17; 3]))
t = 5×1 string array
"12-May-2022" "09-May-2022" "16-May-2022" "24-Apr-2022" "14-May-2022"
Change one of the entries so it's no longer valid date data or its format doesn't match the format of the rest of the data.
t(3, :) = 'invalid date';
t(5, :) = "December 25, 2022"
t = 5×1 string array
"12-May-2022" "09-May-2022" "invalid date" "24-Apr-2022" "December 25, 2022"
When we convert that to a datetime array the invalid data becomes a NaT (for Not-a-Time.)
dt = datetime(t)
dt = 5×1 datetime array
12-May-2022 09-May-2022 NaT 24-Apr-2022 NaT
You can detect which entries were not converted using ismissing or isnat.
wasInvalid = ismissing(dt)
wasInvalid = 5×1 logical array
0 0 1 0 1
You could then potentially try to correct the problem (by converting those entries from the original array using a different InputFormat in a second datetime call, for example, or letting datetime try to deduce a different format.)
dt(5) = datetime(t(5))
dt = 5×1 datetime array
12-May-2022 09-May-2022 NaT 24-Apr-2022 25-Dec-2022
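The "different InputFormat in a second datetime call" approach could be sketched as follows. The sample strings and both format strings are assumptions chosen for illustration. The second call is wrapped in try/catch because, as far as I recall, datetime raises an error rather than returning NaT when none of the strings passed to it match the given format.

```matlab
% Example strings in mixed formats (made up for illustration)
t = ["12-May-2022"; "invalid date"; "December 25, 2022"];

% First pass: entries that do not match the format become NaT
dt = datetime(t, 'InputFormat', 'dd-MMM-yyyy', 'Locale', 'en_US');

% Second pass: retry only the failures with another format
bad = isnat(dt);
try
    dt(bad) = datetime(t(bad), 'InputFormat', 'MMMM d, yyyy', 'Locale', 'en_US');
catch
    % None of the remaining strings matched; leave them as NaT
end

% Entries that no format could parse
stillBad = isnat(dt);
```

Spelling out the formats avoids relying on datetime guessing one format for the whole array, at the cost of listing every format you expect to see.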

Release

R2020a
