Probability - Distributions

Nov. 19, 2020 Vuong Huynh

Fundamentals of Probability Distributions

Distribution: a collection of all the possible values a variable can take and how frequently they occur in the sample space.

Notations:

Y → The actual outcome of an event

y → One of the possible outcomes

P(Y = y) or P(y) → The probability that the actual outcome Y equals y

Examples:

Y → The number of red marbles we draw out of a bag

y → 5 red marbles

P(Y=5) or P(5)
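
The notation above can be sketched in code. This is a minimal illustration, not part of the original notes: it assumes Y is the sum of two fair six-sided dice (a made-up example, chosen because every outcome can be listed), and the names `distribution` and `prob` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed example: Y is the sum of two fair six-sided dice.
outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]
counts = Counter(outcomes)

# The distribution maps each possible value y to P(Y = y),
# i.e. how frequently y occurs among all 36 equally likely rolls.
distribution = {y: Fraction(n, len(outcomes)) for y, n in sorted(counts.items())}


def prob(y):
    """Return P(Y = y); values outside the sample space get probability 0."""
    return distribution.get(y, Fraction(0))


for y, p in distribution.items():
    print(f"P(Y = {y}) = {p}")
```

Here `prob(7)` plays the role of P(Y = 7), and the probabilities over all possible values add up to 1.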


Probability Frequency Distribution: measures the likelihood of an outcome.

Definitions:

Two characteristics: MEAN → μ and VARIANCE → σ²

Mean: average value

Variance: how spread out the data is
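
As a rough sketch of these two characteristics, the snippet below computes the mean and variance of a discrete distribution directly from its probabilities. The fair six-sided die is an assumed example, not something taken from the notes.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
distribution = {y: Fraction(1, 6) for y in range(1, 7)}

# Mean: the probability-weighted average of the possible values.
mean = sum(y * p for y, p in distribution.items())

# Variance: the probability-weighted average squared distance from the mean.
variance = sum((y - mean) ** 2 * p for y, p in distribution.items())

print(mean)      # 7/2
print(variance)  # 35/12
```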


Population vs Sample

Population data: the whole data → standard deviation σ

Sample data: a part of the whole data → standard deviation s

Sample mean: x̄

Sample variance: s²


Variance is measured in squared units

Standard deviation → square root of the variance: √(σ²) = σ
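
A short sketch of the population/sample distinction using Python's standard `statistics` module. The data values are made up for illustration; the only difference between the two variance formulas is the divisor (N for a population, N - 1 for a sample).

```python
import statistics

# Made-up data, treated first as a whole population, then as a sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)                 # same formula in both cases

population_var = statistics.pvariance(data)  # population variance: divide by N
population_sd = statistics.pstdev(data)      # square root of population_var

sample_var = statistics.variance(data)       # sample variance: divide by N - 1
sample_sd = statistics.stdev(data)           # square root of sample_var

print(mean, population_var, population_sd)
print(sample_var, sample_sd)
```

The sample variance comes out slightly larger than the population variance for the same numbers, because it divides the same sum of squared deviations by a smaller number.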


Mean and Variance relationship:

σ² = E((Y − μ)²) = E(Y²) − μ²
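
The second equality is not obvious at first glance. One way to see it, using only the linearity of expectation and the definition μ = E(Y), is to expand the square:

```latex
\begin{align*}
\sigma^2 &= E\big((Y - \mu)^2\big) \\
         &= E\big(Y^2 - 2\mu Y + \mu^2\big) \\
         &= E(Y^2) - 2\mu\,E(Y) + \mu^2 \\
         &= E(Y^2) - 2\mu^2 + \mu^2 \\
         &= E(Y^2) - \mu^2
\end{align*}
```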


Types of Probability Distributions

Notation:

X ~ N(μ, σ²) → the variable X follows a distribution of type N (here, Normal) with parameters μ and σ²


Discrete Distributions:

Uniform Distribution: pick a card or flip a coin → All outcomes are equally likely → Equiprobable

Bernoulli Distribution: events with only two possible outcomes → True or False

Binomial Distribution: two outcomes per iteration but many iterations (carrying out a similar experiment several times in a row). For example, we flip a coin 3 times and calculate the probability of getting exactly 2 heads.

Poisson Distribution: models how many times an event occurs in a given interval; used to test how unusual an observed event frequency is.
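
The four discrete distributions above can be sketched with their standard probability formulas. The function names and example parameters below are illustrative choices rather than anything from the notes; the fair coin and the 52-card deck are assumptions.

```python
import math
from fractions import Fraction


def uniform_pmf(n):
    """P(Y = y) for each of n equally likely outcomes."""
    return Fraction(1, n)


def bernoulli_pmf(y, p):
    """P(Y = y) for one trial with success probability p (y is 0 or 1)."""
    return p if y == 1 else 1 - p


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k events in an interval that averages lam events)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


half = Fraction(1, 2)
print(uniform_pmf(52))           # one specific card from a 52-card deck: 1/52
print(bernoulli_pmf(1, half))    # heads on a single fair flip: 1/2
print(binomial_pmf(2, 3, half))  # exactly 2 heads in 3 fair flips: 3/8
print(poisson_pmf(0, 2.0))       # no events when 2 are expected on average
```

Note that the Bernoulli case is just the binomial with a single trial: `binomial_pmf(1, 1, p)` gives the same value as `bernoulli_pmf(1, p)`.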


Continuous Distributions:

Normal Distribution: often observed in nature

Chi-Squared Distribution: asymmetric; consists only of non-negative values. Often used in hypothesis testing

Exponential Distribution: models events that change rapidly early on

Logistic Distribution: useful in forecast analysis, or for determining a cut-off point for a successful outcome
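
A small sketch of the continuous distributions above, using only the Python standard library. The helper names and parameter values are hypothetical. The chi-squared samples are simulated from its definition (a sum of k squared independent standard normal values), since the standard library has no chi-squared class.

```python
import math
import random
from statistics import NormalDist


def exponential_pdf(x, rate):
    """Density of the exponential distribution; zero for negative x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def logistic_cdf(x, loc=0.0, scale=1.0):
    """P(X <= x) for a logistic distribution centered at loc."""
    return 1.0 / (1.0 + math.exp(-(x - loc) / scale))


def chi_squared_sample(k, rng):
    """One simulated chi-squared value with k degrees of freedom."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


# Normal: NormalDist takes the standard deviation σ, not the variance σ².
standard_normal = NormalDist(mu=0.0, sigma=1.0)
print(standard_normal.cdf(0.0))  # 0.5: half the probability lies below the mean
print(standard_normal.cdf(1.0) - standard_normal.cdf(-1.0))  # share within one σ

# Exponential: the density is highest at 0 and falls off quickly.
print(exponential_pdf(0.0, 1.5), exponential_pdf(2.0, 1.5))

# Logistic: the CDF is an S-shaped curve crossing 0.5 at its center.
print(logistic_cdf(0.0))  # 0.5

# Chi-squared: every simulated value is non-negative.
rng = random.Random(0)  # fixed seed so repeated runs draw the same values
samples = [chi_squared_sample(3, rng) for _ in range(1000)]
print(min(samples) >= 0.0)
```

Because `NormalDist` is parameterized by σ, a variable written as X ~ N(μ, σ²) would be constructed as `NormalDist(mu, math.sqrt(variance))`.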