How to sample from a dataset using a normal distribution as a sampling scheme?
Hi,
Suppose we have a dataset of n images, and I would like to sample from this dataset using a normal distribution with known mean and variance. How could this be achieved?
Thank you.
Answers (1)
Walter Roberson
on 25 Feb 2023
0 votes
You cannot. A normal distribution is inherently a continuous distribution with infinite tails in both directions, whereas you are trying to sample from a finite population. Regardless of whether you are sampling from the pool of (all pixels in all images), the pool of (all images), or something in between such as "randomly selected 227 x 227 patches from randomly selected images", that is sampling from a finite discrete population, not an infinite continuous one.
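You could, of course, round() the output of randn() (scaled by the standard deviation and shifted by the mean) and then apply min() and max() to clamp it to the integer index bounds, but the result will not be a normal distribution: it is discrete, and every draw that falls outside the bounds piles up on the first or last index.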
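To make the round-and-clamp idea concrete, here is a minimal sketch. It is written in Python with only the standard library rather than MATLAB, it uses 0-based indices (MATLAB indices would run from 1 to n), and the function name `clamped_normal_indices` and its parameters are hypothetical names chosen for illustration.

```python
import random

def clamped_normal_indices(n, mean, std, k, seed=None):
    """Draw k indices in [0, n-1] by rounding normal draws and clamping.

    Hypothetical helper: the result is NOT normally distributed; it is a
    discretized normal with extra mass on the two boundary indices.
    """
    rng = random.Random(seed)
    indices = []
    for _ in range(k):
        x = rng.gauss(mean, std)       # continuous normal draw
        i = int(round(x))              # snap to the nearest integer
        i = min(max(i, 0), n - 1)      # clamp to the valid index range
        indices.append(i)
    return indices

# Example: pick 5 image indices from a dataset of 100 images,
# centred on index 50 with a standard deviation of 10.
picked = clamped_normal_indices(n=100, mean=50.0, std=10.0, k=5, seed=0)
```

Indices may repeat, so this samples with replacement. If the mean sits close to either end of the range, a large share of the draws will land exactly on index 0 or n-1, which is the distortion described above.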
You may wish to consider something like the Beta-binomial distribution (https://en.wikipedia.org/wiki/Beta-binomial_distribution), which is a discrete distribution on a finite range of integers whose shape is governed by a Beta distribution.
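As a rough illustration of the Beta-binomial suggestion, the sketch below draws a success probability from a Beta distribution and then counts successes over n-1 Bernoulli trials, which gives an index in 0..n-1. It is Python using only the standard library rather than MATLAB; the function name `beta_binomial_indices` and the shape parameters `a` and `b` are assumptions for illustration, not anything specified in the thread.

```python
import random

def beta_binomial_indices(n, a, b, k, seed=None):
    """Draw k indices in [0, n-1] from a Beta-binomial(n-1, a, b).

    Hypothetical helper: each draw samples p ~ Beta(a, b), then counts
    successes in n-1 Bernoulli(p) trials.
    """
    rng = random.Random(seed)
    indices = []
    for _ in range(k):
        p = rng.betavariate(a, b)                        # p ~ Beta(a, b)
        i = sum(rng.random() < p for _ in range(n - 1))  # Binomial(n-1, p)
        indices.append(i)
    return indices

# Example: 5 indices from a dataset of 100 images, symmetric bell-like shape.
picked = beta_binomial_indices(n=100, a=5.0, b=5.0, k=5, seed=0)
```

The mean index is (n-1)*a/(a+b), so equal `a` and `b` centre the draws on the middle of the dataset, and larger values of both concentrate them more tightly. Unlike the clamped approach, every index has a properly defined probability and nothing piles up at the ends.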
1 Comment
Akram Awad
on 25 Feb 2023
Edited: Akram Awad
on 25 Feb 2023