readtable() on a csv is bringing in "missing" values when its a letter but reads fine when its a number

39 views (last 30 days)
So im a bit frustrated with tables. There seems to be many many bugs with this data type.
So im using read table to read in a csv.
The csv has cells that contain either letters or numbers or a mix
When i first open the csv and do the read table everything comes in fine.
numbers show up as numbers, letters show up as letters
however, if i edit the csv file and take a cell that once had numbers in it and replace it with lettes and do the read again i get "missing" data saved into the table
My code continues to break because matlab is not reliably reading the table correctly. I dont know what to do about this.
Does anyone have any ideas?
  6 Comments
Simon Chan
Simon Chan on 24 Jul 2021
I think it can be reproduced with at least two rows of data.
(1) Normal one (all numbers):
2×2 table
A B
___ _
0 2
0.1 4
(2) Change to charater on row #1 and column #2:
2×2 table
A B
___ ___
0 NaN
0.1 4
(3) Change to charater on row#1 and both columns:
1×2 table
A B
___ _
0.1 4
Yongjian Feng
Yongjian Feng on 24 Jul 2021
@Simon Chan: As shown in the answer from Image Analyst, the each column needs to have the same data type. In your example, the first column now contains both number and char.

Sign in to comment.

Answers (2)

Image Analyst
Image Analyst on 24 Jul 2021
csvread() reads an array of all numbers.
readtable() can read numbers and characters/strings but all items in a single column must be of the same type. Different columns can be different types though, like one column of numbers and another with strings.
readcell() is probably what you want. It can put anything into any cell. All rows and columns do not have to match like they do with tables or arrays. tables are not meant to have variables change type within a column on a row-by-row basis -- the whole column must have the same type.
For a good intuitive description of what cell arrays are and how they work, read the FAQ:
Finally, please give a specific example of a true bug in a table, as you have not done that yet.
  2 Comments
Peter Perkins
Peter Perkins on 26 Jul 2021
All due respect to IA, but readcell is probably NOT what you want. You will end up with a cell array that is likely not useful for almost anything. The root problem is a CSV with two different types of data in one column. Maybe with a mix of numbers and text, you want the numbers to come in as text. That's fine, use detectimportoptions and set the var type to do that. Otherwise, what is the text doing in that column? What do you expect it to be once read in? A mix of numbers and text in one variable is probably not useful. With no example to go on, hard to say exactly what you need to do.
This is not anything like a bug in readtable. It is potentially malformed data that readtable is handling as best as can be expected.
Image Analyst
Image Analyst on 26 Jul 2021
Edited: Image Analyst on 26 Jul 2021
I think we've all seen complicated text files with header stuff, a table of data, some more header stuff, and more data, etc. For example there is one in https://www.mathworks.com/matlabcentral/answers/312599-how-do-i-parse-this-complex-text-file-with-textscan#comment_406938
2015 1 22 0 8 58.537 45.97929 -129.98717 1.184 0.0 1.039 3.621 0.036 1
AXCC1 0.843 1.00 P
AXAS2 1.263 1.00 P
AXEC1 0.923 1.00 P
AXEC2 1.103 1.00 P
AXEC3 1.088 1.00 P
AXCC1 1.873 0.25 S
AXAS1 2.728 0.06 S
AXAS2 2.168 0.25 S
AXEC1 1.708 0.33 S
AXEC2 2.043 0.25 S
AXEC3 2.113 0.25 S
EOB
2015 1 22 0 11 49.113 45.93840 -130.01389 0.661 0.0 0.790 0.887 0.040 2
AXAS1 0.247 0.25 P
AXAS2 -0.053 0.75 P
AXEC2 0.547 0.25 P
AXEC3 0.512 1.00 P
AXCC1 0.437 0.03 S
AXAS1 0.942 0.08 S
AXAS2 0.162 0.33 S
AXEC1 1.167 0.17 S
AXEC2 1.122 0.12 S
AXEC3 1.247 0.12 S
EOB
2015 1 22 2 2 41.012 45.94110 -130.02023 1.265 0.0 0.464 0.755 0.037 3
AXCC1 0.549 1.00 P
AXAS1 0.614 1.00 P
AXAS2 0.389 1.00 P
And here is another example from a half hour ago today:
It's not that there is anything wrong or malformed with the data, it's just that, that's what was spit out by whatever other program created the data.
So in cases like that you can either write your own custom reader for it, or you can read it all into a cell array with readcell, and then parse out the stuff you want out of the cell array into more meaningfull, descriptive arrays.

Sign in to comment.


Robert
Robert on 24 Jul 2021
Thanks for all your suggestions

Tags

Products


Release

R2018b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!