How to Generate Randomized Data Set with Given Mean/Median/Standard Deviation
I need to generate a data set that is randomized, but not completely random, and that meets the following criteria:
Mean: 55.4
Median: 54.8
Standard Deviation: 10
I appreciate any help I can get!
5 Comments
If you know the distribution your sample should follow, you can use one of the random number generation functions (such as random).
% Assuming a normal distribution
rng('default') % For reproducibility
r = normrnd(55.4, 10, 1e4, 1);
fprintf('Sample mean: %.2f, and sample SD: %.2f\n', mean(r), std(r))
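One caveat on the snippet above: normrnd is part of the Statistics and Machine Learning Toolbox, and a normal distribution has its median equal to its mean, so the sample median will land near 55.4 rather than the requested 54.8. A minimal sketch using only base MATLAB (randn), printing the sample median as well so the mismatch is visible; the variable names mu and sigma are introduced here for illustration:

```matlab
rng('default')                    % for reproducibility
mu = 55.4; sigma = 10;            % target mean and SD from the question
r = mu + sigma*randn(1e4, 1);     % same distribution as normrnd(mu, sigma, 1e4, 1)
fprintf('mean %.2f, median %.2f, SD %.2f\n', mean(r), median(r), std(r))
```

The mean and SD come out close to the targets, but only up to sampling error, and the median is not controlled at all.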
Jeff Miller
on 3 Dec 2020
It is not completely clear from your question whether you want those three criteria to be met for the true underlying distribution as a whole (so each random sample deviates slightly from those criteria by chance), or if you want the criteria to be met exactly for each individual sample. If you want the former, Bruno's answer gives a solution. If you want the latter, the problem is much harder.
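To make the distinction concrete, here is a small sketch (an illustration, not something posted in the thread) of the "exact for each individual sample" reading, applied to two of the three criteria only. Standardizing a sample and rescaling it forces the sample mean and standard deviation onto their targets up to rounding, but leaves the median wherever it happens to fall, which is why matching all three at once takes more work. The seed and sample size are arbitrary choices:

```matlab
rng(1)                            % arbitrary seed, for reproducibility
n = 100;                          % arbitrary sample size
z = randn(n, 1);                  % raw standard-normal draws
z = (z - mean(z)) / std(z);       % force sample mean 0 and sample SD 1
x = 55.4 + 10*z;                  % shift and scale to the target mean and SD
fprintf('mean %.4f, SD %.4f, median %.4f\n', mean(x), std(x), median(x))
```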
Julianna Sims
on 4 Dec 2020
Bruno Luong
on 4 Dec 2020
Edited: Bruno Luong
on 4 Dec 2020
It's not "much harder".
Sorry, but such a request does not make much practical sense statistically, and it is usually asked by someone who does not really know about random processes and distributions.
Still, here is some code; try to make use of it if you can. Good luck.
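% Targets
meantarget = 55.4;
mediantarget = 54.8;
starget = 10;
n = 100; % sample size
midp = 500/n; % spread (SD) of the n-2 randomly drawn points
% Draw n-2 points and centre them so that their median is 0
xmid = midp*randn(1,n-2);
xmid = xmid-median(xmid);
% Solve for the two remaining points, xlo and xhi (relative to the median)
m = meantarget-mediantarget; % required mean offset
s = sum(xmid);
s2 = sum(xmid.^2);
b = (m*n-s); % xlo + xhi
c = (b^2+s2-n*(m^2+starget^2))/2; % xlo * xhi
delta = b^2-4*c; % discriminant of t^2 - b*t + c = 0
if delta < 0
    error('No real solution for xlo and xhi: reduce midp')
end
xhi = (b+sqrt(delta))/2;
xlo = b-xhi;
x = mediantarget+[xlo xmid xhi];
% Check the result (std(x,1) normalises by n, not n-1)
medianx = median(x)
meanx = mean(x)
stdx = std(x,1)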
Jeff Miller
on 4 Dec 2020
Very clever, Bruno. I didn't think of generating n-2 scores and computing the last 2. I guess I should have said, "much harder for me".
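For readers wondering why two computed points are enough, here is a sketch of the algebra behind the code above, using its names (m, s, s2, b, c, with $\sigma$ standing for starget). The symbols $y_i$ for the centred xmid values are introduced here for illustration only, and the derivation assumes the population form of the standard deviation, as in std(x,1):

```latex
\[
\begin{aligned}
&\text{Let } y_1,\dots,y_{n-2} \text{ be the centred values (median 0)},\quad
s=\sum_i y_i,\quad s_2=\sum_i y_i^2,\quad
m=\text{meantarget}-\text{mediantarget}.\\
&\text{Mean: } \frac{x_{lo}+x_{hi}+s}{n}=m
\;\Longrightarrow\; x_{lo}+x_{hi}=mn-s=b.\\
&\text{SD (normalised by } n\text{): } \frac{x_{lo}^2+x_{hi}^2+s_2}{n}-m^2=\sigma^2
\;\Longrightarrow\; x_{lo}^2+x_{hi}^2=n(m^2+\sigma^2)-s_2.\\
&\text{Since } x_{lo}^2+x_{hi}^2=b^2-2\,x_{lo}x_{hi}:\quad
x_{lo}x_{hi}=\frac{b^2+s_2-n(m^2+\sigma^2)}{2}=c.\\
&\text{So } x_{lo},\,x_{hi} \text{ are the roots of } t^2-bt+c=0:\quad
t=\frac{b\pm\sqrt{b^2-4c}}{2}.
\end{aligned}
\]
```

Real roots require $b^2-4c\ge 0$, which is what the delta check in the code tests. The median stays at mediantarget provided xlo lands at or below the two middle values of xmid and xhi at or above them; the code does not check this explicitly.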
Answers (1)
Bruno Luong
on 3 Dec 2020
Edited: Bruno Luong
on 3 Dec 2020
There are infinitely many random variables with the same mean, median, and standard deviation.
Here is one that meets your criteria (a distribution over three discrete values).
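% Targets
meantarget = 55.4;
mediantarget = 54.8;
starget = 10;
pm = 0.1; % probability of the middle value (the median)
pout = (1-pm)/2; % probability of each outer value
p = [pout pm pout];
dmx = meantarget-mediantarget; % mean offset relative to the median
v = starget^2; % target variance
a = dmx;
b = v+dmx^2; % second moment about the median
% xlo + xhi = a/pout and xlo^2 + xhi^2 = b/pout give a quadratic for xhi
delta = 2*b/pout-a^2/pout^2;
xhi = (a/pout+sqrt(delta))/2;
xlo = a/pout-xhi;
x3 = mediantarget + [xlo 0 xhi]; % the three possible values
% Draw n samples by inverting the cumulative probabilities
c = cumsum(p);
c = c/c(end);
n = 1e6;
x = rand(1,n);
idx = discretize(x, [0 c]); % bin index 1..3 (replaces histc, which is not recommended)
x = x3(idx);
% Check the result
meanx = mean(x)
medianx = median(x)
stdx = std(x)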
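The sample statistics printed above only approximate the targets, because they come from random draws. As a cross-check, the exact moments of the three-point distribution can be computed directly from the probabilities and values, with no sampling. A minimal sketch that repeats the construction so it runs on its own; the names mu, sd, and cdfAtMedian are introduced here for illustration:

```matlab
meantarget = 55.4; mediantarget = 54.8; starget = 10;
pm = 0.1;                              % probability of the middle (median) value
pout = (1 - pm)/2;                     % probability of each outer value
p = [pout pm pout];
a = meantarget - mediantarget;         % mean offset relative to the median
b = starget^2 + a^2;                   % second moment about the median
delta = 2*b/pout - a^2/pout^2;         % discriminant of the quadratic for xhi
xhi = (a/pout + sqrt(delta))/2;
xlo = a/pout - xhi;
x3 = mediantarget + [xlo 0 xhi];       % the three support points

mu = sum(p .* x3);                     % exact mean of the distribution
sd = sqrt(sum(p .* (x3 - mu).^2));     % exact standard deviation
cdfAtMedian = pout + pm;               % P(X <= mediantarget)
fprintf('mean %.4f, SD %.4f, P(X <= median) %.2f\n', mu, sd, cdfAtMedian)
```

Since xlo is negative and xhi is positive, the middle value is the median: the probability of being at or below it is pout + pm, and by symmetry of p the same holds for being at or above it, so both are at least one half.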