How to replace multiple columns with NaN
Hi, I have a large dataset where each column is a subject. I have to exclude some subjects from the analysis, but I have to keep the original subject numbers, so I can't simply delete the columns. My first idea was to fill all the bad columns with NaN, but doing this manually is very time consuming. Is there a fast way to do this? For example, giving MATLAB an array of column numbers and letting it do the job based on those numbers?
8 Comments
Dyuman Joshi
on 25 May 2022
Is there a criteria for deciding which columns are good ones and/or which are the bad ones?
Andrea Carobbi
on 25 May 2022
Edited: Andrea Carobbi
on 25 May 2022
Dyuman Joshi
on 25 May 2022
Visually as in?
dpb
on 25 May 2022
Well, you've got to have the columns defined somehow, whether it's done programmatically or by hand -- it would help to know how the data are stored -- are you using a table or just arrays or what? Given you mention keeping a subject number, I'm guessing maybe it's a table? There are addressing modes by either column index or by the table variable name, whichever is more convenient.
But, yes, given an index array of desired columns it's trivial to write
ixBad=[1, 23, 4]; % the list of columns, however obtained
X(:,ixBad)=missing; % set those columns to missing/bad indicator
Using missing handles the case of a table whose variables may have different data types: it matches the inserted indicator to the type of the data it is replacing. That isn't significant if it is just a double array, where missing is simply stored as NaN.
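As a minimal sketch of the assignment above on a plain double array: the data here are made up (a slice of magic(6)), the bad-column list is arbitrary, and the code assumes a MATLAB release recent enough to provide missing.

```matlab
% Hypothetical data: 5 observations (rows) for 6 subjects (columns)
X = magic(6);
X = X(1:5, :);

ixBad = [1 4 6];          % columns to blank out, however obtained
X(:, ixBad) = missing;    % in a double array, missing is stored as NaN

% Confirm which columns are now entirely NaN
badCols = find(all(isnan(X), 1));
disp(badCols)
```

The column positions are unchanged, so subject 2 is still column 2. Later statistics can skip the blanked columns with the 'omitnan' flag, e.g. mean(X, 1, 'omitnan').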
If you are using a table, see the doc section on addressing tables for how to refer to columns by variable name instead of by index, if that's more convenient.
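A minimal sketch of addressing by variable name, assuming every subject column is numeric. The table contents and the names S01 to S03 are invented for illustration.

```matlab
% Hypothetical table: one numeric variable (column) per subject
T = table([1;2;3], [4;5;6], [7;8;9], ...
    'VariableNames', {'S01', 'S02', 'S03'});

badNames = {'S01', 'S03'};   % subjects to exclude, selected by name

% Brace indexing reaches the underlying numeric data of the named variables
T{:, badNames} = NaN(height(T), numel(badNames));

disp(T)
```

Selecting by name means the exclusion list still points at the right subjects if columns are later reordered.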
Benjamin Thompson
on 25 May 2022
Posting a small subset or example of your data will probably help the Community most in providing a good answer quickly.
Andrea Carobbi
on 26 May 2022
dpb
on 26 May 2022
" i dind't know MATLAB has this function."
Read through the "Getting Started" documentation/examples to get an idea about how MATLAB works. Vector operations are key to using it effectively, and addressing arrays is a key element in doing that.
"...made some plot of the mean progression of the response and by hand wrote down the subject numbers (that is, the column number)"
You could probably write code that does that screening pretty much automagically as well -- simply testing whether the slopes are significantly above zero would not be too difficult an exercise.
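One rough way such a screening could look is sketched below. It fits a straight line to each column with polyfit and flags columns whose slope falls below a fixed cutoff. The data, the time vector t, and the cutoff minSlope are all assumptions, and the plain threshold only stands in for a proper significance test on the slope.

```matlab
% Hypothetical data: rows are time points, columns are subjects
t = (1:10).';
X = [2*t + 1, 0*t + 5, 0.5*t, -t + 20];   % subjects 2 and 4 do not increase

minSlope = 0.1;                    % assumed cutoff, not taken from the thread
slopes = zeros(1, size(X, 2));
for k = 1:size(X, 2)
    p = polyfit(t, X(:, k), 1);    % first-degree fit; p(1) is the slope
    slopes(k) = p(1);
end

ixBad = find(slopes < minSlope);   % columns whose trend is flat or negative
X(:, ixBad) = NaN;                 % blank them out, keeping column positions
disp(ixBad)
```

With noisy real data the cutoff would need tuning, or replacing with a test that accounts for the uncertainty of each fitted slope.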
I would recommend again using the MATLAB table class and keeping the subject name as the column ID -- it comes along "for free" and isn't confused with the data being numeric.
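A sketch of that idea, assuming the data start out as a numeric matrix. The naming scheme S01, S02, ... and the excluded subjects are made up. Because each column carries its subject label, the bad columns can be deleted outright without losing track of who is who.

```matlab
% Hypothetical data: 4 observations for 5 subjects
X = rand(4, 5);

% Label each column with its subject number (naming scheme is an assumption)
names = arrayfun(@(k) sprintf('S%02d', k), 1:size(X, 2), ...
    'UniformOutput', false);
T = array2table(X, 'VariableNames', names);

% Delete the excluded subjects; remaining columns keep their original labels
T(:, {'S02', 'S04'}) = [];
disp(T.Properties.VariableNames)
```

If the column positions themselves must stay fixed, the same table can be filled with NaN in those variables instead of deleting them.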
Andrea Carobbi
on 27 May 2022
Answers (0)