Csv file: reading values with comma

43 views (last 30 days)
Fabio Retorta
Fabio Retorta on 18 Sep 2020
Edited: Jeremy Hughes on 21 Sep 2020
Hi dear friends,
I need to work with csv file from this website: "https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv"
Matlab is not able to read data with comma, and so the result is NaN.
I did the following code to import the data from the web, and to fix the semicolon separator:
%% Get the file from the web and save the file in the path
url = 'https://ds.50hertz.com/api/WindPowerActual/DownloadFile?fileName=2020.csv';
filename = 'WP_50hz.csv';
options = weboptions('Timeout',60);
outfilename = websave(filename,url,options);
% Function to fix the file
[Wind_data] = adjustment_file(filename);
%% Adjustment_file
function [T] = adjustment_file(FILE)
% SECTION 1: Clean up and rewrite the file
% Read in the entire file as text
file = FILE;
txt = fileread(file);
% Remove all null characters
txtClean = regexprep(txt,char(0),'');
% Re-write the file with a new name
% * If "file" contains the full path, you'll need to change
% this line below to rename the file.
fid = fopen(['clean_',file],'w');
fprintf(fid, '%s',txtClean);
fclose(fid);
newfile = ['clean_',file]; % Use the full path whenever possible!
opts = delimitedTextImportOptions("NumVariables", 4);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW"];
opts.VariableTypes = ["datetime", "datetime", "datetime", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, "von", "InputFormat", "HH:mm");
opts = setvaropts(opts, "bis", "InputFormat", "HH:mm");
% Import the data
T = readtable(newfile, opts);
end
How can I handle the comma in the data to be able to read the values?
Thank you very much ; )

Answers (1)

Jeremy Hughes
Jeremy Hughes on 18 Sep 2020
Edited: Jeremy Hughes on 21 Sep 2020
I think you need to set the decimal separator and thousands separator. Also, duration may be the better type for two of the variables.
Also, you can just supply UTF-16 I think instead of needing to do the file cleaning.
opts = delimitedTextImportOptions("NumVariables", 6);
% Specify range and delimiter
opts.DataLines = [6, Inf];
opts.Delimiter = ";";
opts.Encoding = "UTF-16";
% Specify column names and types
opts.VariableNames = ["Datum", "von", "bis", "MW", "OnshoreMW", "OffshoreMW"];
opts.VariableTypes = ["datetime", "duration", "duration", "double", "double", "double"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "Datum", "InputFormat", "dd.MM.yyyy");
opts = setvaropts(opts, ["von", "bis"], "InputFormat", "hh:mm");
opts = setvaropts(opts, 4:6, "DecimalSeparator",",","ThousandsSeparator",".");
% Import the data
T = readtable(file, opts);
  4 Comments
Walter Roberson
Walter Roberson on 21 Sep 2020
That is a warning about UTF-16, not an error message. In the release you were using, officially UTF-16 was not supported, but in practice it worked fine.
As of R2020a, MATLAB will automatically detect the UTF-16LE encoding.
Jeremy Hughes
Jeremy Hughes on 21 Sep 2020
Also, the spaces in the variable names won't work in some earlier releases. Editing the steps to make sure those will work.

Sign in to comment.

Categories

Find more on Data Type Conversion in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!