Rejection sampling, also known as the "accept-reject" method, is a way of generating samples from a complex probability distribution such as a gamma distribution. Just as we can generate points from a uniform distribution in R, with this method we can generate points from any defined density function such that the points follow the probabilistic nature of that function. We can then use these points as synthetic data that follows the complex distribution in Monte Carlo simulations, fine-tuning machine learning models and much more.
Statistics is fun!
We will use R to generate data from a gamma distribution restricted to x >= 5 and overlay the probability density function to show that the generated data follows it.
This project will cover:
- How to define and graph a target probability density function.
- How to define a proposal probability density function.
Languages used: R (version 4.5.1)
Environment: RStudio
For this rejection sampling we will start off by defining and graphing our target pdf.
Target probability density function
This is the gamma density we must sample from, restricted to x >= 5.
target_pdf(x) is the name of this function, graphed as:
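The gamma parameters are not stated in this README, so the following is only a minimal sketch of how target_pdf could be defined and graphed. The shape = 10 and rate = 2 values are hypothetical, chosen for illustration, and the density is kept as-is on x >= 5 (not renormalised) and set to 0 elsewhere.

```r
# Hypothetical parameters: the README does not state the gamma shape or rate.
shape <- 10
rate  <- 2

# Gamma density kept only on x >= 5 and set to 0 elsewhere (not renormalised).
target_pdf <- function(x) {
  ifelse(x >= 5, dgamma(x, shape = shape, rate = rate), 0)
}

# Graph the target over a range that covers most of its mass.
curve(target_pdf(x), from = 5, to = 15, n = 501,
      xlab = "x", ylab = "target_pdf(x)", main = "Target pdf (x >= 5)")
```

Using ifelse() keeps the function vectorised, which curve() and later steps rely on.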

Proposal probability density function
This is a known probability distribution that we can easily sample from, e.g. in R. It must satisfy the following criteria:
- Cover the target pdf, i.e. proposal_pdf(x) >= target_pdf(x) for any x
- Be in the general shape of the target
For this proposal I selected an exponential distribution with lambda = 1, shifted right by 5.
proposal_pdf(x) is the function we will use, graphed as:
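As a rough sketch, the shifted exponential proposal and a numerical check of the covering criterion could look like the code below. The target uses the same hypothetical Gamma(shape = 10, rate = 2) parameters assumed for illustration, and proposal_sample, x_grid, covers and max_ratio are names introduced here, not taken from the original script.

```r
# Target as assumed for illustration: Gamma(shape = 10, rate = 2) on x >= 5.
target_pdf <- function(x) ifelse(x >= 5, dgamma(x, shape = 10, rate = 2), 0)

# Proposal: exponential with lambda = 1, shifted right by 5.
proposal_pdf <- function(x) ifelse(x >= 5, dexp(x - 5, rate = 1), 0)

# Drawing from the proposal: shift ordinary exponential draws by 5.
proposal_sample <- function(n) 5 + rexp(n, rate = 1)

# Check the covering criterion numerically on a grid of x values.
x_grid    <- seq(5, 40, by = 0.01)
covers    <- all(proposal_pdf(x_grid) >= target_pdf(x_grid))
max_ratio <- max(target_pdf(x_grid) / proposal_pdf(x_grid))

# Graph both curves: dashed proposal, solid target.
curve(proposal_pdf(x), from = 5, to = 15, n = 501, lty = 2,
      xlab = "x", ylab = "density")
curve(target_pdf(x), from = 5, to = 15, n = 501, add = TRUE)
legend("topright", legend = c("proposal_pdf", "target_pdf"), lty = c(2, 1))
```

A grid check is not a proof, but it is a quick way to catch a proposal that dips below the target somewhere in the range of interest.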

Rejection Sampling
Once we have defined the functions as number generators, we can use the method of rejection sampling to accept or reject each generated value.
sim_gamma() is the function we will use for our accept or reject method. We begin by defining:
- The seed, for general reproducibility.
- The number of samples we want to generate, n.
- sample_y, the vector that will contain the n accepted values.
While loop
This loop allows for data generation so long as a condition is met. In this case the criteria would be:
While the count of values in sample_y is less than 5000:
- Sample a candidate value x from the proposal distribution.
- Sample a value from a uniform distribution from 0 to 1 called u.
- Compare the value u to the value of target_pdf(x)/proposal_pdf(x).
- If u < target_pdf(x)/proposal_pdf(x), add 1 to the count and add x to sample_y.
target_pdf(x)/proposal_pdf(x) acts as the probability of accepting x: if the value of u is greater than or equal to it, x is rejected as a valid sample.
Once we have generated 5000 samples we will return sample_y.
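The steps above can be sketched as follows. This is an illustrative version, not necessarily the script in this repository: the gamma parameters (shape = 10, rate = 2), the default seed value and the count variable are assumptions made so the block runs on its own.

```r
# Hypothetical target and the shifted exponential proposal (assumed parameters).
target_pdf   <- function(x) ifelse(x >= 5, dgamma(x, shape = 10, rate = 2), 0)
proposal_pdf <- function(x) ifelse(x >= 5, dexp(x - 5, rate = 1), 0)

sim_gamma <- function(n = 5000, seed = 1) {
  set.seed(seed)                 # seed for reproducibility
  sample_y <- numeric(n)         # vector that will hold the accepted values
  count <- 0                     # number of accepted values so far

  while (count < n) {
    x <- 5 + rexp(1, rate = 1)   # candidate from the shifted exponential
    u <- runif(1)                # uniform draw between 0 and 1
    if (u < target_pdf(x) / proposal_pdf(x)) {
      count <- count + 1
      sample_y[count] <- x       # accept the candidate
    }                            # otherwise reject and draw again
  }
  sample_y
}

sample_y <- sim_gamma()
length(sample_y)
```

Preallocating sample_y with numeric(n) avoids growing the vector inside the loop, which matters little at n = 5000 but scales better for larger runs.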
Visualisations
We will visualise the output from sim_gamma() with a histogram and overlay the continuous probability density curve on it. Keep in mind that freq = FALSE makes the histogram show the density of each bin, so the total area of all bins together is 1, which matches the total probability of all outcomes under a probability density function.

As we can see, the histogram lines up roughly with the continuous curve of the target probability density function, indicating that our 5000 samples follow the original distribution.
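A plot of this kind could be produced along the lines below. This is a sketch under assumptions: the gamma parameters are hypothetical, a short vectorised accept-reject step stands in for sim_gamma() so the block runs on its own, and the curve is divided by its area over x >= 5 because the assumed target is not renormalised after restricting it.

```r
# Hypothetical target (assumed parameters), restricted to x >= 5.
target_pdf <- function(x) ifelse(x >= 5, dgamma(x, shape = 10, rate = 2), 0)

# Stand-in for sim_gamma(): vectorised accept-reject, keeping 5000 values.
set.seed(1)
candidates <- 5 + rexp(20000, rate = 1)
keep       <- runif(20000) < target_pdf(candidates) / dexp(candidates - 5, rate = 1)
sample_y   <- candidates[keep][1:5000]

# Area under the restricted target, used to rescale the curve to a density.
area <- pgamma(5, shape = 10, rate = 2, lower.tail = FALSE)

# Density histogram (total bin area is 1) with the rescaled target on top.
hist(sample_y, freq = FALSE, breaks = 40,
     xlab = "x", main = "Rejection samples vs target pdf")
curve(target_pdf(x) / area, from = 5, to = max(sample_y), n = 501,
      add = TRUE, lwd = 2)
```

If the target in the original script is already normalised on x >= 5, the division by area is unnecessary and the curve can be overlaid directly.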
|Simulation- Rejection Sampling
|├── Rejection sampling R script
|└──README
The textbook "Probability with applications and R" by Dr. Wagaman and Dr. Dobrow was very helpful in many of my endeavours.
Rejection Sampling: Sampling from ‘difficult’ distributions - a website that lays out the basics of rejection sampling.