How to replace multiple columns with NaN

Hi, I have a large dataset where each column is a subject. I have to remove some of them from the analysis, but I have to keep the old subject numbers, so I can't simply remove the columns. The first idea I had was to fill all the bad columns with NaN, but doing this manually is very time-consuming. Is there a fast way to do this? Like giving an array of column numbers and then letting MATLAB do the job based on those numbers?

8 Comments

Is there a criterion for deciding which columns are the good ones and/or which are the bad ones?
Unfortunately not, the selection was done visually on some plots.
Well, you've got to have the columns defined somehow, whether it's done programmatically or by hand -- it would help to know how the data are stored -- are you using a table or just arrays or what? Given you mention keeping a subject number, I'm guessing maybe it's a table? There are addressing modes by either column index or by the table variable name, whichever is more convenient.
But, yes, given an index array of desired columns it's trivial to write
ixBad=[1, 23, 4]; % the list of columns, however obtained
X(:,ixBad)=missing; % set those columns to missing/bad indicator
Using missing handles the case of a table whose variables have different data types: it matches the inserted indicator to the type of the data it replaces (NaN for double, NaT for datetime, <missing> for string). For a plain double array it makes no difference; it is equivalent to assigning NaN.
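As a minimal sketch of the array case described above, assuming the data sit in a plain double matrix with one column per subject (the matrix `X`, its size, and the flagged columns here are made up for illustration):

```matlab
% Hypothetical data: 5 stimulus levels (rows) x 6 subjects (columns)
X = magic(6);
X = X(1:5,:);

ixBad = [2 5];                 % subjects rejected by visual inspection (assumed)
X(:,ixBad) = missing;          % for a double array this is the same as = NaN

% Column positions are unchanged, so subject 6 is still column 6.
% The 'omitnan' flag skips NaN entries; an all-NaN column still yields NaN.
colMeans  = mean(X,1,'omitnan');   % one mean per subject
grandMean = mean(X(:),'omitnan');  % mean over the retained subjects only
```

The rejected columns stay in place as NaN, so later statistics need the 'omitnan' flag (or an explicit mask) to ignore them.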
If you are using a table, see the doc section on addressing tables for how to use the variable (column) names instead of indices, if that's more convenient.
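A short sketch of the table variant, assuming every subject column holds doubles; the variable names S1..S6 and the random data are hypothetical stand-ins, not the poster's actual layout:

```matlab
% Hypothetical table: one variable (column) per subject, named S1..S6
data = rand(5,6);
T = array2table(data,'VariableNames',"S" + (1:6));

badNames = ["S2" "S5"];        % subjects to blank out, addressed by name (assumed)
T{:,badNames} = NaN;           % brace indexing assigns into the underlying doubles

% The variable names, and hence the subject IDs, are untouched
disp(T.Properties.VariableNames)
```

Addressing by name means the subject ID travels with the column, so nothing depends on column position.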
Posting a small subset or example of your data will probably help the Community most in providing a good answer quickly.
dpb, thank you. Your code does exactly what I wanted; I didn't know MATLAB has this function.
To answer the question, the data I'm analysing come from some measurements on human eyes. The values in each of the matrix's columns should rise as the stimulus given to the subject rises. I made some plots of the mean progression of the response and by hand wrote down the subject numbers (that is, the column numbers) that did not show any progression. Now I have to remove these subjects and repeat the analysis, keeping the same numbering (subject 40 must remain 40). The easiest way I could think of was replacing all the bad data with NaN.
"I didn't know MATLAB has this function."
Read through the "Getting Started" documentation/examples to get an idea of how MATLAB works. Vector operations are key to using it effectively, and addressing arrays is a key element in doing that.
"...made some plots of the mean progression of the response and by hand wrote down the subject numbers (that is, the column numbers)"
You could probably write code that does that screening pretty much automagically as well -- simply testing whether the slopes are significantly above zero would not be too difficult an exercise.
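One rough way to sketch such a screening, assuming the rows are increasing stimulus levels and that a simple slope threshold is an acceptable stand-in for a proper significance test (the data, `stim`, and `minSlope` below are invented for illustration):

```matlab
% Hypothetical data: rows are increasing stimulus levels, columns are subjects
stim = (1:8).';
X = [2*stim, 0*stim + 3, 0.5*stim + 1, -stim];   % subjects 2 and 4 do not rise

nSubj = size(X,2);
slope = zeros(1,nSubj);
for k = 1:nSubj
    p = polyfit(stim, X(:,k), 1);   % straight-line fit; p(1) is the slope
    slope(k) = p(1);
end

minSlope = 0.1;                     % assumed threshold, to be tuned on real data
ixBad = find(slope < minSlope);     % subjects showing no clear progression
X(:,ixBad) = NaN;                   % blank them out, keeping the numbering
```

This only compares fitted slopes against a fixed cutoff; testing whether a slope is statistically significant would need the standard error of the fit as well.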
I would again recommend using the MATLAB table class and keeping the subject name as the column ID -- it comes along "for free" and isn't confused with the data being numeric.
The main thing I missed, even after reading the Getting Started guide, is that I can use an array to pass multiple indices. Thank you for reminding me.
I honestly didn't think about making the check automatic because, for now, the dataset isn't extremely big, so doing it by hand is still viable, but I will if the dataset becomes too big.

Answers (0)

Release: R2021b

Asked: 25 May 2022

Commented: 27 May 2022
