How do I delete cells in a column based on information from another column?

2 views (last 30 days)
Hi,
I am working on data that I have extracted from an excel sheet. I have extracted the three columns that are necessary but need to do further work until I can use the data. Column three tells me if the data is valid hence I need to write the code that if any cells in column three equals 1 the correspoding cells in column 1 and 2 are equal to nan or 0. I then need that if column 1 equals to zero the corresponding cells in cloumn 2 equals to nan or 0. I then need to sum column 1 and average column 2. Below is the code I have tried. It runs but the output it gives is the sum and average of the columns before the if statements.
X = xlsread('filename.csv');
x = [X(:,2) X(:,3) X(:,9)];
if (x(:,3) == 1), x(:,1)= 0;
end
if (x(:,3) == 1), x(:,2)= 0;
end
if (x(:,1) == 0), x(:,2)= 0;
end
sum(x(:,1))
mean(x(:,2))
  3 Comments
Sonia Lenehan
Sonia Lenehan on 28 Feb 2020
Thank you for your reply.
Apologies for not stating the output
The output returned is ans = 5
ans = 9.1426e+05
But it should be ans = 3 and ans = 589333 as I have calculated it manually.
From the excel sheet I pulled cloumns 2, 3 and 9. 2 and 9 are coded 1 and 0 and 3 is time in mirco seconds.
When I added that line of code I got back
ans =
5
ans =
9.1426e+05
ans =
0 0 0
dpb
dpb on 28 Feb 2020
Edited: dpb on 28 Feb 2020
I guess will have to send the crystal ball back out for repair...it seems to still be murky! :)
That said, I'm pretty confident that you've not uncovered a bug in the ML sum() function so I'll assert the answer in ML is 5 and 9E5.
We'd have to see the input file to see what it is precisely that caused the differences.
I'll try the crystal ball one more time before declaring it broken (yet again)--
Does the input file have a header row, perchance? The difference between 3 and 5 is 2 and two headerlines would be a very common occurrence in a spreadsheet. There being some other value in one of those two lines that is also numeric in the second column is also certainly quite possible.

Sign in to comment.

Accepted Answer

Guillaume
Guillaume on 2 Mar 2020
Considering the format of your spreadsheet you would be better off importing the data as a table. If I understood correctly what you're trying to achieve, this would do it:
testresults = readtable('WM TEST.xlsx'); %import data as table. Matlab automatically detect the column headers and use that to name the table variables
isgoodtest = testresults.Correct & ~testresults.FailedTrial; %a good test is one with correct result and not failed
meangoodsaccade = mean(testresults.SaccadeTimeToChooseCurtain(isgoodtest); %mean of saccade time for these tests that are correct and not failed
countgoodtest = nnz(isgoodtest);
As you can see, there is no need to remove anything, so if you want to compute some other statistics the source data is still there, and it's much easier to understand what the code is doing and the code is simpler as well. Good variable names help with code clarity.

More Answers (1)

Pujitha Narra
Pujitha Narra on 2 Mar 2020
Edited: Pujitha Narra on 2 Mar 2020
Hi,
According to the code above, the condition for the ‘if’ would return a column vector, but the ‘if’ requires the entire vetor to contain logical true values. This is causing the unexpected results.
Instead use the following code instead of the ‘if’ statements:
x(find(x(:,3)==1),1) =0;
x(find(x(:,3)==1),2) =0;
x(find(x(:,3)==0),2) =0;
  5 Comments
Pujitha Narra
Pujitha Narra on 2 Mar 2020
I agree with Guillaume and Stephen. 'if' tries to check if the entire vector has logical true values in this case, not just the first element (as wrongly mentioned previously)
Sonia Lenehan
Sonia Lenehan on 2 Mar 2020
Edited: Sonia Lenehan on 2 Mar 2020
Thank you, I have updated the code to remove the find. I have attached a sample excel sheet of the data I am using. The participants have to pick a side based on information and I am looking at how many times each participant looks at the correct side and how long it takes them on average. Hence the columns I need are correct, saccade time to chosen curtain and failed trials which I have been able to extract. I then need to look at the failed trials and remove them. I then need to look at the correct trials and remove the incorrect ones. I then want to average the times of the correct trials.
I am using R2018a.

Sign in to comment.

Categories

Find more on Large Files and Big Data in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!