# Module 3 Probability and Applications

by Patrick Boily, with contributions from Rafal Kulik

Data analysis is sometimes presented in a “point-and-click” manner, with tutorials bypassing the foundations of probability and statistics to focus on software use and specific datasets. Modern analysts do not always need to fully understand the theory underpinning the methods that they use, but a grasp of the basic concepts pays off in the long term.

In this module, we introduce some of the crucial probabilistic notions that will help analysts get the most out of their data.

### Contents

3.1 Basic Notions
3.1.1 Sample Spaces and Events
3.1.2 Counting Techniques
3.1.3 Ordered Samples
3.1.4 Unordered Samples
3.1.5 Probability of an Event
3.1.6 Conditional Probability & Indep. Events
3.1.7 Bayes’ Theorem

3.2 Discrete Distributions
3.2.1 Random Variables and Distributions
3.2.2 Expectation of a Discrete R.V.
3.2.3 Binomial Distributions
3.2.4 Geometric Distributions
3.2.5 Negative Binomial Distributions
3.2.6 Poisson Distributions
3.2.7 Other Discrete Distributions

3.3 Continuous Distributions
3.3.1 Continuous Random Variables
3.3.2 Expectation of a Continuous R.V.
3.3.3 Normal Distributions
3.3.4 Exponential Distributions
3.3.5 Gamma Distributions
3.3.6 Approximation of the Binomial Distribution
3.3.7 Other Continuous Distributions

3.4 Joint Distributions

3.5 Central Limit Theorem and Sampling Distributions
3.5.1 Sampling Distributions
3.5.2 Central Limit Theorem
3.5.3 Sampling Distributions (Reprise)

3.6 Exercises