![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1480531/image.png)
Mean and standard deviation in six sigma
I wrote code to calculate the mean and standard deviation (sigma) for values like 1 sigma, 2 sigma, and 3 sigma. However, I noticed that the 2 sigma and 3 sigma lines are plotted outside the curve. According to statistical theory, +- 1 sigma, +- 2 sigma, and +- 3 sigma should encompass 68.26%, 95.44%, and 99.73% of the area under the curve, respectively. Therefore, these sigma lines should be inside the curve. The data file is attached.
My code is:
Corrected_UV_D4_spectra_1 = xlsread('corrected_uvd4_spectra_1','Sheet1','A26:B341');
data = Corrected_UV_D4_spectra_1;
x = data(:,1);
y = data(:,2);
figure;scatter(x,y);
mu = mean(x)
xline(mu,'g--')
sd = std(x)
xline(mu + sd,'m--')
xline(mu - sd,'m--')
xline(mu + 2*sd,'b:')
xline(mu - 2*sd,'b:')
xline(mu + 3*sd,'k-.')
xline(mu - 3*sd,'k-.')
The result I am getting is also attached.
Answers (1)
akshatsood
on 13 Sep 2023
Edited: akshatsood
on 13 Sep 2023
Hi SAKSHAM,
I understand that you want to visualize what is known in statistics as the empirical rule. I reviewed the attached code and noticed that the plot does not reach the lines at μ ± 2σ and μ ± 3σ. I believe this is caused by the range of the x data. To make this clearer, consider the following observations:
x = data(:,1);
min(x); % 425
max(x); % 740
Next, look at the range of μ ± 3σ:
mu - 3*sd % 308.4033
mu + 3*sd % 856.5967
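As you can see, min(x) >= mu - 3*sd and max(x) <= mu + 3*sd, so the data never reach the ±3σ lines. The same holds for ±2σ, because μ ± 2σ spans roughly 399.8 to 765.2. To illustrate the empirical rule, the plotted x range has to extend beyond μ ± 3σ. One way to do this is to build a new x vector from the mean and standard deviation of the original data and plot a normal curve over it. Note that this curve is the theoretical normal density for that mean and standard deviation, not the measured y values. Here is a code snippet that demonstrates this: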
Corrected_UV_D4_spectra_1 = xlsread('corrected_uvd4_spectra_1','Sheet1','A26:B341');
data = Corrected_UV_D4_spectra_1;
x = data(:,1);
y = data(:,2); % not used below; only the statistics of x are plotted
mu = mean(x);
sd = std(x);
x0 = (mu-4*sd):0.1:(mu+4*sd); % extended x range covering mu ± 4*sd
% pdf of the normal distribution with mean mu and standard deviation sd
pdf_values = normpdf(x0, mu, sd);
figure;
plot(x0, pdf_values); % plot normal distribution
% draw the reference lines after plot so that plot does not clear them
xline(mu,'g--')
xline(mu + sd,'m--')
xline(mu - sd,'m--')
xline(mu + 2*sd,'b:')
xline(mu - 2*sd,'b:')
xline(mu + 3*sd,'k-.')
xline(mu - 3*sd,'k-.')
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1480531/image.png)
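As a quick check of the percentages quoted in the question, the share of a normal distribution that lies within k standard deviations of the mean can be computed directly. This is only a minimal sketch using the base MATLAB function erf; the variable names k and coverage are illustrative and do not come from the original code.

```matlab
% Share of a normal distribution within k standard deviations of the mean:
% P(|X - mu| <= k*sigma) = erf(k/sqrt(2)), independent of mu and sigma.
k = 1:3;
coverage = erf(k ./ sqrt(2));

% fprintf walks the 2-by-3 matrix column by column: k(1), coverage(1), k(2), ...
fprintf('within %d sigma: %.2f%%\n', [k; 100*coverage]);
```

The printed values are 68.27%, 95.45% and 99.73%, which match the figures in the question to within rounding.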
Have a look at the documentation page for better understanding
I hope this helps.
3 Comments
akshatsood
on 14 Sep 2023
Edited: akshatsood
on 14 Sep 2023
Hi SAKSHAM,
I understand that you want to include the Y data. Since you said that the two datasets share the same X, it would be helpful if you could explain how they differ in terms of mean and standard deviation. Additionally, I would like to highlight a possible reason for the sigma lines not being inside the curve: insufficient data points. The empirical rule states that
for normal distributions, 68.26% of observed data points will lie inside one standard deviation of the mean, 95.44% will fall within two standard deviations, and 99.73% will occur within three standard deviations.
However, it is important to note that the empirical rule assumes a truly normal distribution. If the datasets deviate significantly from normality or if there are outliers present, the empirical rule may not hold true. One possible workaround to replicate the behaviour is by using interpolation and extrapolation. However, it is important to note that this approach may not yield a good fit due to the limited number of data points available.
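To see how far a dataset is from the empirical rule, one can count the share of points within k standard deviations directly. The sketch below is an illustration under stated assumptions, not the method used above: x is assumed to be an evenly spaced grid from 425 to 740 (consistent with the min and max reported earlier), and y is a synthetic bell-shaped stand-in that must be replaced by the real second column. It also shows one possible way to include Y, by using it as a weight when computing the mean and standard deviation.

```matlab
% Stand-in data (assumption): evenly spaced grid matching min(x) = 425 and
% max(x) = 740, plus a synthetic bell-shaped y. Replace both with real columns.
x = (425:740)';
y = exp(-0.5*((x - 600)/40).^2);   % hypothetical intensities, not the real spectrum

% Unweighted statistics describe the sampling grid, not the peak in y.
mu = mean(x);
sd = std(x);

% Intensity-weighted statistics describe where the signal sits.
w    = y / sum(y);                       % normalise weights to sum to 1
mu_w = sum(w .* x);                      % weighted mean
sd_w = sqrt(sum(w .* (x - mu_w).^2));    % weighted standard deviation

frac_grid = zeros(1,3);                  % share of grid points within k*sd
frac_area = zeros(1,3);                  % share of total intensity within k*sd_w
for k = 1:3
    frac_grid(k) = mean(abs(x - mu) <= k*sd);
    frac_area(k) = sum(w(abs(x - mu_w) <= k*sd_w));
    fprintf('%d sigma: %.1f%% of points, %.1f%% of intensity\n', ...
            k, 100*frac_grid(k), 100*frac_area(k));
end
```

For the evenly spaced grid, all points already fall within two standard deviations, which is why the 2 sigma and 3 sigma lines land outside the plotted data. Whether weighting by y is appropriate depends on what the second column represents.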
Have a look at the below references for interpolation and extrapolation