Bayesian Data Analysis
https://bookdown.org/marklhc/notes_bookdown/ by Mark Lai
- Incorporate prior knowledge
- Flexibility
- More fitting options than frequentist statistics
- Handles missing data well
- ease of model comparisons
- people are sick of p-hacking
Classical probability: every case among a set of equally possible cases has the same probability.
Frequentist probability: the long-run relative frequency of an outcome.
The problem with the frequentist approach is that some events never repeat (presidential elections, sports competitions, etc.), at least not in the way a coin toss does.
Subjectivist probability: incorporate your belief into the probability. The belief must still follow the rules of probability and be rational.
- A probability must be between 0 and 1 (it can never be less than zero)
- The probabilities of all possible events sum to 1
- The probability that one of two mutually exclusive events occurs is the sum of their probabilities
Conditional probability: the probability of an event given another event
P(A|B) = P(A∩B) / P(B)
- P(A|B) is the probability of A given B
- P(A∩B) is the probability that both A and B occur (pronounced "A cap B")
If P(A|B) = P(A), the events are independent; equivalently, P(A∩B) = P(A)P(B)
P(A) is a marginal probability: because A always occurs together with some outcome of B, P(A) is the sum of the joint probabilities P(A∩B) over all possible outcomes of B
P(B|A) = P(A|B)P(B) / P(A)
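A small numeric check of the conditional probability, marginal probability, independence, and Bayes' rule formulas may help here. The joint probabilities below are made-up illustrative numbers, not values from the book.

```python
from math import isclose

# Hypothetical joint probabilities for two binary events A and B (they sum to 1).
joint = {
    (True, True): 0.2,    # P(A and B)
    (True, False): 0.1,   # P(A and not B)
    (False, True): 0.2,   # P(not A and B)
    (False, False): 0.5,  # P(not A and not B)
}

# Marginal probabilities: sum the joint probabilities over the other event.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)

# Conditional probabilities from the definition P(A|B) = P(A∩B) / P(B).
p_a_given_b = joint[(True, True)] / p_b
p_b_given_a = joint[(True, True)] / p_a

# Bayes' rule recovers P(B|A) from P(A|B), P(B), and P(A).
bayes_p_b_given_a = p_a_given_b * p_b / p_a

# Independence would require P(A∩B) = P(A)P(B); it does not hold for these numbers.
independent = isclose(joint[(True, True)], p_a * p_b)

print(p_a, p_b, p_a_given_b, p_b_given_a, bayes_p_b_given_a, independent)
```

Both routes to P(B|A) agree, which is all Bayes' rule claims: it is the definition of conditional probability applied twice.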
Posterior Probability ∝ Prior Probability × Likelihood
P(θ=t|y)∝P(θ=t)P(y|θ=t)
- Posterior Probability P(θ=t|y)
- likelihood P(y|θ=t)
- prior probability P(θ=t)
The probability that some parameter (θ) equals some value (t) given some data (y) is proportional to (not equal to) the prior probability that the parameter equals that value, P(θ=t), times the probability of the data given that the parameter equals that value, P(y|θ=t)
The frequentist approach is to leave off the prior
"turning the Bayesian crank"
- Assumption of exchangeability: subpopulations don't have different probabilities
- Probability distribution: must match the data (discrete, bimodal, normal, etc.)
- Likelihood: calculate the likelihood of the parameter(s) given the data
- Prior: pick an informed prior distribution for the parameter(s)
- Not just picking a value, but selecting a probability distribution for that value
Use Bayes Rule to calculate the posterior
Grid Approximation
- Pick a grid of candidate values for the parameter
- Evaluate prior × likelihood at each candidate value and normalize so the results sum to 1
- This provides the (approximate) posterior distribution
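The grid steps can be sketched in a few lines. This is a minimal illustration for a binomial proportion θ. The data (6 successes in 9 trials), the flat prior, and the function name `grid_posterior` are assumptions made up for the example.

```python
from math import comb

def grid_posterior(y, n, n_grid=101):
    """Grid-approximate the posterior of a binomial proportion theta."""
    # Step 1: candidate parameter values, evenly spaced on [0, 1].
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    # Flat prior over the grid (an assumption; any prior evaluated on the grid works).
    prior = [1.0] * n_grid
    # Binomial likelihood of the data at each candidate value.
    likelihood = [comb(n, y) * t**y * (1 - t) ** (n - y) for t in grid]
    # Step 2: posterior is proportional to prior times likelihood.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    # Step 3: normalize so the grid probabilities sum to 1.
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior

grid, posterior = grid_posterior(y=6, n=9)
map_theta = grid[posterior.index(max(posterior))]  # grid value with the highest posterior
print(map_theta)
```

A finer grid gives a closer approximation, but the number of grid points grows quickly once there is more than one parameter.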
Conjugate Priors
- With a conjugate prior the posterior stays in the same distribution family as the prior. For a Beta prior with binomial data, the posterior is a Beta distribution whose parameters are the sums of the prior and observed numbers of successes and failures
θ|y ∼ Beta(a + y, b + n − y)
a -> prior number of successes
b -> prior number of failures
y -> observed successes
n − y -> observed failures
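As a quick sketch of the Beta update, the posterior parameters come from simple addition. The prior Beta(2, 2) and the data (6 successes in 9 trials) are illustrative assumptions, and the helper names are hypothetical.

```python
def beta_binomial_update(a, b, y, n):
    """Return the posterior Beta parameters after y successes in n trials."""
    return a + y, b + (n - y)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Hypothetical prior: 2 prior successes, 2 prior failures.
post_a, post_b = beta_binomial_update(a=2, b=2, y=6, n=9)
print(post_a, post_b)             # Beta(8, 5)
print(beta_mean(post_a, post_b))  # posterior mean, 8/13
```

The posterior mean (8/13) falls between the prior mean (1/2) and the sample proportion (6/9), which is how the prior gets weighed against the data.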
how does this work for a nonbinary distribution?
Laplace Approximation with Maximum A Posteriori Estimation
- Find the maximum point in the posterior distribution, called the maximum a posteriori (MAP) estimate
What about the other components of the distribution? Variance?
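On the variance question: the Laplace approximation fits a normal distribution centered at the MAP, and takes its variance from the curvature (negative inverse second derivative) of the log posterior at that point. A minimal sketch follows, assuming a Beta(2, 2) prior and 6 successes in 9 trials. The function names, the ternary search, and the finite-difference step are choices made for this example, not the book's method.

```python
from math import log, sqrt

def log_post(theta, y=6, n=9, a=2.0, b=2.0):
    """Unnormalized log posterior: Beta(a, b) prior times binomial likelihood."""
    return (a + y - 1) * log(theta) + (b + n - y - 1) * log(1 - theta)

def laplace_approx(f, lo=1e-6, hi=1 - 1e-6, h=1e-4):
    """Return (mode, sd) of the normal approximation to exp(f) on (lo, hi)."""
    # Ternary search for the maximum; assumes f has a single peak on the interval.
    for _ in range(200):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    mode = (lo + hi) / 2
    # Second derivative at the mode by central finite differences.
    d2 = (f(mode + h) - 2 * f(mode) + f(mode - h)) / h**2
    # Normal approximation: variance = -1 / (second derivative of the log posterior).
    return mode, sqrt(-1 / d2)

map_est, approx_sd = laplace_approx(log_post)
print(map_est, approx_sd)
```

For this example the exact posterior is Beta(8, 5), whose mode is 7/11 and whose standard deviation is about 0.13. The approximate SD comes out somewhat larger because the true posterior is skewed rather than normal.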
Markov Chain Monte Carlo (MCMC)
- draws samples from the posterior distribution
- These samples are correlated, which requires drawing more samples
Is this the same as drawing samples from the data? Is this similar to bootstrapping?
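A random-walk Metropolis sampler is one of the simplest MCMC algorithms and shows what is being sampled: each draw is a parameter value, not a data point, so the data are never resampled. This is a minimal sketch under assumptions chosen for the example: a Beta(2, 2) prior, 6 successes in 9 trials, and a proposal step size, chain length, and warm-up length picked by hand.

```python
import random
from math import exp, log

def log_post(theta, y=6, n=9, a=2.0, b=2.0):
    """Unnormalized log posterior; minus infinity outside (0, 1)."""
    if not 0 < theta < 1:
        return float("-inf")
    return (a + y - 1) * log(theta) + (b + n - y - 1) * log(1 - theta)

def metropolis(log_target, n_draws=20000, start=0.5, step=0.2, seed=1):
    """Random-walk Metropolis: each draw depends on the previous one."""
    rng = random.Random(seed)
    current, current_lp = start, log_target(start)
    draws = []
    for _ in range(n_draws):
        # Propose a move near the current value.
        proposal = current + rng.gauss(0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        if rng.random() < exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws

draws = metropolis(log_post)[1000:]  # drop the first 1000 draws as warm-up
post_mean = sum(draws) / len(draws)
print(post_mean)
```

Because a rejected proposal repeats the previous value, consecutive draws are correlated, which is why more draws are needed than with independent sampling.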
Mean, Median, & Mode
- Posterior mean -> the point generally used as the estimate
- Posterior median -> worth considering as it is more robust to outliers
- Posterior mode -> the maximum a posteriori (MAP) estimate, the point with the highest posterior probability
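To see that the three point estimates can differ, here is a sketch that computes them on a fine grid for a skewed posterior. The Beta(8, 5) shape is an assumed example posterior, not one taken from the book.

```python
from itertools import accumulate

# Fine grid over (0, 1) and an assumed Beta(8, 5) posterior, up to a constant.
n_grid = 1001
grid = [i / (n_grid - 1) for i in range(n_grid)]
unnormalized = [t**7 * (1 - t) ** 4 for t in grid]
total = sum(unnormalized)
post = [u / total for u in unnormalized]

# Mean: probability-weighted average of the grid values.
post_mean = sum(t * p for t, p in zip(grid, post))
# Median: first grid value where the cumulative probability reaches 0.5.
post_median = next(t for t, c in zip(grid, accumulate(post)) if c >= 0.5)
# Mode (MAP): grid value with the highest posterior probability.
post_mode = grid[post.index(max(post))]

print(post_mean, post_median, post_mode)
```

For this left-skewed posterior the mean is the smallest and the mode the largest, with the median in between. In a symmetric posterior all three would coincide.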
Uncertainty Estimates
- standard deviation of the posterior distribution
- median absolute deviation from the median (MAD) of the posterior distribution (more robust when the distribution is skewed)
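Both uncertainty summaries are easy to compute from posterior draws. The sketch below uses simulated draws from an assumed Beta(8, 5) posterior as a stand-in for real MCMC output, and computes the MAD as the median absolute deviation from the median.

```python
import random
import statistics

rng = random.Random(42)
# Stand-in posterior draws (assumption: the posterior is Beta(8, 5)).
draws = [rng.betavariate(8, 5) for _ in range(10000)]

# Posterior standard deviation.
post_sd = statistics.stdev(draws)

# MAD: median of the absolute deviations from the posterior median.
post_median = statistics.median(draws)
post_mad = statistics.median([abs(d - post_median) for d in draws])

print(post_sd, post_mad)
```

The raw MAD is on a smaller scale than the SD. Some software multiplies it by about 1.4826 so the two are comparable for a normal distribution, and that scaling is not applied here.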
Credible Intervals
- Similar to confidence intervals
- A 90% credible interval is the interval that has a 90% probability of containing the true value of the parameter
- This is different from a confidence interval, where 90% of the intervals constructed under repeated sampling will contain the true parameter
- Credible intervals can be defined however you want: any level (50%, 80%) and any pair of percentiles (e.g. the 1st and 51st percentiles give a 50% interval)
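Given posterior draws, a credible interval is a pair of percentiles. The sketch below uses simulated draws from an assumed Beta(8, 5) posterior and builds a 90% equal-tailed interval plus a 50% interval that runs from the 1st to the 51st percentile.

```python
import random
import statistics

rng = random.Random(42)
# Stand-in posterior draws (assumption: the posterior is Beta(8, 5)).
draws = [rng.betavariate(8, 5) for _ in range(10000)]

# 99 cut points; cuts[i] is the (i + 1)th percentile.
cuts = statistics.quantiles(draws, n=100)

# 90% equal-tailed interval: 5th to 95th percentile.
ci90 = (cuts[4], cuts[94])
# A different 50% interval: 1st to 51st percentile.
ci50_low = (cuts[0], cuts[50])

def coverage(interval):
    """Fraction of posterior draws that fall inside the interval."""
    lo, hi = interval
    return sum(lo <= d <= hi for d in draws) / len(draws)

print(ci90, coverage(ci90))
print(ci50_low, coverage(ci50_low))
```

Both are valid credible intervals at their stated levels. The equal-tailed version is just one common convention.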
Does the model fit the distribution?
Posterior predictive distribution: the predictions of new data under each parameter value, weighted by the corresponding posterior probability of that value.
The posterior predictive distribution is used to check the model: samples simulated from it are compared against the actual data.
Like a residual?
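A posterior predictive check can be sketched by simulation: draw a parameter value from the posterior, simulate a replicated data set from it, and repeat. The Beta(8, 5) posterior, the observed 6 successes in 9 trials, and the helper name `simulate_y_rep` are assumptions for illustration.

```python
import random

rng = random.Random(7)
y_obs, n = 6, 9  # hypothetical observed data: 6 successes in 9 trials

def simulate_y_rep(rng, n, a=8, b=5):
    """One replicated success count: draw theta from the posterior, then simulate data."""
    theta = rng.betavariate(a, b)
    return sum(rng.random() < theta for _ in range(n))

# Replicated data sets, each using a different posterior draw of theta.
y_rep = [simulate_y_rep(rng, n) for _ in range(5000)]

mean_y_rep = sum(y_rep) / len(y_rep)
# Share of replications at least as large as the observed count.
p_at_least_obs = sum(y >= y_obs for y in y_rep) / len(y_rep)

print(mean_y_rep, p_at_least_obs)
```

If the observed count sat far in the tail of the replicated counts (a share near 0 or 1), that would suggest the model does not reproduce the data well. In that loose sense the comparison plays a role similar to examining residuals.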
Mentioned as step six but not discussed in this chapter.