Error in splitapply command
I am using the "splitapply" command to find the mean (average) of each group in my data.
edges=1:0.5:10
[N, edges, bin] = histcounts(B, edges);
mean_B=splitapply(@mean, B, bin) %mean
%B is 1000x1 double
But the Command Window shows:
Error using splitapply (line 61)
Group numbers must be a vector of positive integers, and cannot be a sparse vector.
which is curious, because the same code runs for another set of data.
Could you please help me?
Answers (1)
Image Analyst
on 27 Feb 2021
This seems to work fine:
B = 1 + 9 * rand(1, 100000);
edges = 1 : 0.5 : 10
[counts, edges, bin] = histcounts(B, edges);
% bin says what bin the value was placed into.
% Compute the means of the values in each bin.
mean_B = splitapply(@mean, B, bin)
Attach your B so we can see why it's different than mine. If your B exceeds 10, it will say that bin is zero for those values exceeding 10, and that would be a problem since you're passing in bin as the "group number" and the group numbers have to be natural numbers (1,2,3,4,...) and not zero.
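As a rough sketch of the point above, the following uses made-up data (the values in B are hypothetical, not the poster's file) to show that histcounts assigns bin 0 to anything outside the edges, both below the first edge and above the last, and that dropping those entries lets splitapply run. It assumes every remaining bin holds at least one value.

```matlab
% Hypothetical data: one value below 1, one above 10, and one value
% in each of the 18 bins defined by the edges 1:0.5:10.
B = [0.5; (1.25 : 0.5 : 9.75)'; 11];
edges = 1 : 0.5 : 10;
[~, ~, bin] = histcounts(B, edges);

% Values outside the edges are assigned bin 0, which splitapply rejects.
outOfRange = (bin == 0);
fprintf('%d value(s) fall outside the edges.\n', nnz(outOfRange));

% Keep only the values that landed in a real bin, then average per bin.
Bkept = B(~outOfRange);
binKept = bin(~outOfRange);
meanB = splitapply(@mean, Bkept, binKept);
```

If some bins end up empty after filtering, splitapply raises a different error about missing group numbers, so this sketch only covers the bin 0 case.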
19 Comments
Ivan Mich
on 27 Feb 2021
Edited: Ivan Mich
on 27 Feb 2021
OK, my full code is:
edges=1:0.5:10
[N, edges, bin] = histcounts(B, edges);
Result = accumarray(bin, A);
Y = discretize(B,edges)
% Y=rmmissing(Y)
mean_IMM=splitapply(@mean, A, bin) %mean
I am attaching the data.txt file (A is the first column, B is the second column).
I would appreciate it if you could help me.
Walter Roberson
on 27 Feb 2021
2 0.73834617
Your bins start at 1, but you have data less than 1, which would get a bin number of 0.
Ivan Mich
on 27 Feb 2021
I also have another problem. I have changed the edges to edges=0:0.5:10, but the Command Window shows:
Error using splitapply (line 111)
For N groups, every integer between 1 and N must occur at least once in the vector of
group numbers.
How could I fix it?
I am attaching the file.
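This second error typically appears when some bins are empty, so the group numbers skip integers. One possible workaround, sketched here with hypothetical values for A and B, is accumarray with an explicit output size and a fill value, so that empty bins come back as NaN instead of raising an error. It assumes every value of B lies inside the edges, because discretize returns NaN for out-of-range values and accumarray does not accept those as subscripts.

```matlab
% Hypothetical data: A holds the values to average, B the values to bin.
A = [2; 4; 6; 8];
B = [0.2; 0.3; 1.7; 9.6];
edges = 0 : 0.5 : 10;

% Bin number for each element of B (bins 1, 1, 4 and 20 here).
bin = discretize(B, edges);
nBins = numel(edges) - 1;

% Mean of A in every bin; bins with no data are filled with NaN.
meanPerBin = accumarray(bin, A, [nBins 1], @mean, NaN);
```

An alternative is to renumber the occupied bins with findgroups and pass the result to splitapply, which returns means only for bins that contain data.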
Image Analyst
on 27 Feb 2021
That's not the full code. For example there is no code to read in 'data.txt' and derive B from it. And you seem to have two input files: a text file and a workbook. Which is it? In general to get rid of values less than 1 you can do
badNumberMask = data < 1;
B = data(~badNumberMask); % keep only the values that are at least 1
Image Analyst
on 27 Feb 2021
The last line got cut off. We're not seeing how B is computed from A. Come on, make it EASY for us to help you, not hard. And did you try my code to remove numbers less than 1?
Image Analyst
on 27 Feb 2021
Ivan:
Just give me the whole script. Attach it with the paper clip icon. Because the following code, with the data.txt you attached (and I'm attaching again here) doesn't work. It doesn't give me A or B.
fileName1 = 'data.txt';
[d1, tex] = importdata(fileName1)
A = d1.data(:,1);
% B is a result from an equation including A values.
% the equation is
B = 4 * A - 0.3
edges = 1 : 0.5 : 10
[counts, edges, bin] = histcounts(B, edges);
% bin says what bin the value was placed into.
% Compute the means of the values in each bin.
mean_B = splitapply(@mean, B, bin)
I'm very close to giving up on this, but giving you another chance to make it right. I'm sure over the past 3 hours you have new code by now so attach that. I don't have much time today to go back and forth on this just to get to a starting point.
Dot indexing is not supported for variables of this type.
Error in test7 (line 4)
A = d1.data(:,1);
Walter Roberson
on 27 Feb 2021
Your first column has values up to 10. 4*10-0.3 is 39.7 and 39.7 is outside the range 1:0.5:10 so values anywhere near that large would get assigned bin 0.
Your second column has values up to about 7.5. 4*7.5-0.3 is about 29.7 and 29.7 is outside the range 1:0.5:10 so values anywhere near that large would get assigned 0.
Your first column has values from 2 to 10, and that would fit in 1:0.5:10 if you used it directly.
Your second column has values from 0.73834617 to about 7.5, but 0.738 is before 1:0.5:10 so those small values would be assigned bin 0 if you were to bin your second column directly.
Walter Roberson
on 28 Feb 2021
mask = B >= 1; % true for the entries to keep
newA = A(mask);
newB = B(mask);
Image Analyst
on 28 Feb 2021
Keep in mind that you can't delete/remove individual elements from an array with two or more dimensions because it must remain rectangular. It can't have "holes" or "ragged edges" in it. You can only remove individual elements from a 1-D vector (or whole rows or columns from a matrix).
If you have a 2-D matrix and get a map of where it's above or below some threshold, then if you pass that into the 2-D matrix as a logical index, it can't give you a 2-D matrix back with "holes" in it where the non-selected elements are missing. So it returns the elements all concatenated in a 1-D vector pulled from the original matrix in a "column major" manner.
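A small illustration of that behaviour, using a made-up matrix M: logical indexing returns the selected elements as a column vector in column-major order, while removing whole rows keeps the result rectangular.

```matlab
% Hypothetical 3-by-2 matrix.
M = [1 5; 7 2; 3 9];

% Logical mask of the elements above a threshold.
mask = M > 2;

% Indexing with the mask returns a column vector, read down each column.
selected = M(mask);            % 7; 3; 5; 9

% Removing whole rows keeps the matrix rectangular.
rowsToKeep = all(M > 1, 2);    % rows where every element exceeds 1
trimmed = M(rowsToKeep, :);    % [7 2; 3 9]
```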
Ivan Mich
on 26 Jul 2021
I would like to make a joint splitapply command. I mean I would like to:
1) Group my data into bins with edges 1:0.5:10 (min=1, max=10, step 0.5)
2) Compute the means of the values in each bin (let's call it set1).
3) Group my data into bins with edges 2:1:10 (min=2, max=10, step 1)
4) Compute the means of the values in each bin (set2).
5) Merge the two sets of data
Could you please help me do this?
Thank you in advance.
Walter Roberson
on 27 Jul 2021
discretize() to get bin numbers. grpstats() to get the mean (if you have the Statistics toolbox; otherwise you can use splitapply() or accumarray() )
You can do the above for each of the two sets of edges.
However, I am not clear on how you would want to merge the two results.
I guess I am also not clear as to whether the data mentioned in (1) is the same data as is mentioned in (3) or if it is different data.
Are you trying to divide data with (X, Y, value) up across a grid and take the mean for each (2D) grid location?
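A minimal sketch of the discretize plus splitapply route, applied to each set of edges in turn. The data in B are hypothetical, and the choice to report bin centers alongside the means is an assumption made here for illustration. Values outside a set of edges are dropped, and findgroups renumbers the occupied bins so that empty bins do not trigger an error.

```matlab
% Hypothetical data to bin and average.
B = [1.1; 1.3; 2.2; 2.7; 3.4; 9.8];
edgeSets = {1 : 0.5 : 10, 2 : 1 : 10};

means = cell(size(edgeSets));
centers = cell(size(edgeSets));
for k = 1 : numel(edgeSets)
    edges = edgeSets{k}(:);              % column vector of edges
    bin = discretize(B, edges);          % NaN for values outside the edges
    inRange = ~isnan(bin);
    % Renumber the occupied bins as 1..N so splitapply accepts them.
    [G, usedBin] = findgroups(bin(inRange));
    means{k} = splitapply(@mean, B(inRange), G);
    % Midpoint of each occupied bin.
    centers{k} = (edges(usedBin) + edges(usedBin + 1)) / 2;
end
```

With the Statistics and Machine Learning Toolbox, grpstats could replace the findgroups and splitapply pair.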
Ivan Mich
on 27 Jul 2021
The data are slightly different because the classes are different, and therefore so are the averages in each bin. I would like to combine the two so that the averaged results are smoothed out (by making the bins overlap). Is it possible to do this?
Rik
on 27 Jul 2021
@Ivan Mich Why are you ignoring my answer to your question and reposting it here as a comment?
You are also consistently ignoring the main question: what do you mean by merging?
Ivan Mich
on 27 Jul 2021
Look, for example: from the first set I will have 5 bins with 5 mean values. From the second set I will have 4 bins with 4 mean values. Every mean value corresponds to one bin. Here is an example of the output:
set 1
Bin mean
[2-3] 0.5
[3-4] 1.25
[4-5] 1.6
[5-6] 1.9
[6-7] 3.2
set 2
Bin mean
[2.5-3.5] 0.75
[3.5-4.5] 1
[4.5-5.5] 1.7
[5.5-6.5] 2.5
So by "merge" I mean an output that includes all the values.
Like
mean
0.5
0.75
1
1.6
1.7
1.9
2.5
3.2
That's what I mean.
Could you please help me?
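One possible reading of this request, sketched with the numbers from the example above: concatenate the two sets and order them by bin center, so the overlapping bins interleave. Ordering by bin position rather than by mean value is an assumption, and the variable names are made up for illustration.

```matlab
% Values copied from the example: bin midpoints and means for each set.
centers1 = [2.5; 3.5; 4.5; 5.5; 6.5];   % bins [2-3] ... [6-7]
means1   = [0.5; 1.25; 1.6; 1.9; 3.2];
centers2 = [3; 4; 5; 6];                % bins [2.5-3.5] ... [5.5-6.5]
means2   = [0.75; 1; 1.7; 2.5];

% Concatenate both sets and sort them by bin center.
[centersAll, order] = sort([centers1; centers2]);
meansAll = [means1; means2];
meansAll = meansAll(order);

% Collect the interleaved result in a table for display.
merged = table(centersAll, meansAll, 'VariableNames', {'BinCenter', 'Mean'});
disp(merged);
```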
Image Analyst
on 27 Jul 2021
The two sets are using different edges for some reason. That's probably not good and you should specify the edges to be the same for all sets. What do you want the edges to be for the combined set?
But my answer would be that what you asked to do is, in my opinion, not good. You should just histogram your combined original data set and not have two histograms (one from each data set) that have different edges. Just histogram the whole combined set with one set of edges.