## 3.4 Joint Distributions

Let $$X$$, $$Y$$ be two continuous random variables. The joint probability density function (joint p.d.f.) of $$X,Y$$ is a function $$f(x,y)$$ satisfying:

1. $$f(x,y)\geq 0$$, for all $$x$$, $$y$$;

2. $$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x,y)\, dxdy=1$$, and

3. $$P(A)=\iint_Af(x,y)\, dxdy$$, where $$A\subseteq \mathbb{R}^2$$.

For discrete random variables, the properties are the same, except that integrals are replaced by sums, and we add the requirement that $$f(x,y)\leq 1$$ for all $$x,y$$.

Property 3 implies that $$P(A)$$ is the volume of the solid over the region $$A$$ in the $$xy$$ plane bounded by the surface $$z=f(x,y)$$.

Examples:

• Roll a pair of unbiased dice. For each of the $$36$$ possible outcomes, let $$X$$ denote the smaller roll, and $$Y$$ the larger roll (taken from [32]).

1. How many outcomes correspond to the event $$A=\{(X=2,Y=3)\}$$?

Answer: two, as the rolls $$(3,2)$$ and $$(2,3)$$ both give rise to the event $$A$$.

2. What is $$P(A)$$?

Answer: there are $$36$$ possible outcomes, so $$P(A)=\frac{2}{36}\approx 0.0556$$.

3. What is the joint p.m.f. of $$X,Y$$?

Answer: only one outcome, $$(a,a)$$, gives rise to the event $$\{X=Y=a\}$$. For every event $$\{X=a,Y=b\}$$ with $$a<b$$, two outcomes do the trick: $$(a,b)$$ and $$(b,a)$$. The joint p.m.f. is thus $f(x,y)=\begin{cases}1/36, & 1\leq x=y\leq 6 \\ 2/36, & 1\leq x<y\leq 6.\end{cases}$ The first property is automatically satisfied, as is the third (by construction). There are only $$6$$ outcomes for which $$X=Y$$; all the remaining pairs (of which there are $$15$$) have $$X< Y$$.

Thus, $\sum_{x=1}^6\sum_{y=x}^6 f(x,y)=6\cdot\frac{1}{36}+15\cdot \frac{2}{36}=1.$
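This joint p.m.f. is small enough to verify by brute force in R: enumerate the $$36$$ equally likely rolls, tabulate the pair (smaller, larger), and check the total mass.

```r
# Enumerate all 36 equally likely rolls of two dice, and tabulate
# (X, Y) = (smaller roll, larger roll) to recover the joint p.m.f.
rolls <- expand.grid(d1 = 1:6, d2 = 1:6)
X <- pmin(rolls$d1, rolls$d2)
Y <- pmax(rolls$d1, rolls$d2)
pmf <- table(X, Y) / 36
pmf["2", "3"]   # P(X = 2, Y = 3) = 2/36
sum(pmf)        # total mass is 1
```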

4. Compute $$P(X=a)$$ and $$P(Y=b)$$, for $$a,b=1,\ldots, 6$$.

Answer: for every $$a=1,\ldots,6$$, $$\{X=a\}$$ corresponds to the following union of events: \begin{aligned} \{X=a,Y=a\}\cup &\{X=a,Y=a+1\}\cup \cdots \\ &\cdots \cup \{X=a,Y=6\}.\end{aligned} These events are mutually exclusive, so that \begin{aligned} P(X=a)&=\sum_{y=a}^6P(\{X=a,Y=y\})\\&=\frac{1}{36}+\sum_{y=a+1}^6\frac{2}{36} \\ &=\frac{1}{36}+\frac{2(6-a)}{36}, \quad a=1,\ldots, 6.\end{aligned} Similarly, we get $P(Y=b)=\frac{1}{36}+\frac{2(b-1)}{36},\ b=1,\ldots, 6.$ These marginal probabilities can be found in the margins of the p.m.f.
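These closed forms are easy to confirm in R by enumerating the $$36$$ rolls and tabulating each coordinate separately:

```r
# Marginal p.m.f.s of the smaller (X) and larger (Y) of two dice rolls,
# obtained by enumeration, compared against the closed forms.
rolls <- expand.grid(d1 = 1:6, d2 = 1:6)
X <- pmin(rolls$d1, rolls$d2)
Y <- pmax(rolls$d1, rolls$d2)
a <- 1:6
pX <- as.numeric(table(X)) / 36
pY <- as.numeric(table(Y)) / 36
all.equal(pX, (1 + 2 * (6 - a)) / 36)   # TRUE
all.equal(pY, (1 + 2 * (a - 1)) / 36)   # TRUE
```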

5. Compute $$P(X=3\mid Y>3)$$, $$P(Y\le 3 \mid X\geq 4)$$.

Answer: the notation suggests how to compute these conditional probabilities: \begin{aligned} P(X=3\mid Y>3)&=\frac{P(X=3 \cap Y>3)}{P(Y>3)} \\ P(Y\le 3\mid X\geq 4)&=\frac{P(Y\le 3 \cap X\geq 4)}{P(X \geq 4)}.\end{aligned} Counting outcomes, $$P(Y>3)=\frac{27}{36}$$ and $$P(X=3)=\frac{7}{36}$$. The event $$\{X=3\}\cap\{Y>3\}$$ consists of the rolls with smaller value $$3$$ and larger value in $$\{4,5,6\}$$, so that $P(X=3\cap Y>3)=\frac{6}{36}$ and $P(X=3\mid Y>3)=\frac{6/36}{27/36}=\frac{6}{27}\approx 0.2222.$ As $$P(Y\le 3\cap X\ge 4)=0$$, $$P(Y\le 3\mid X\ge 4)=0$$.
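The same enumeration of the $$36$$ rolls confirms both conditional probabilities by direct counting:

```r
# Conditional probabilities by direct counting over the 36 rolls.
rolls <- expand.grid(d1 = 1:6, d2 = 1:6)
X <- pmin(rolls$d1, rolls$d2)
Y <- pmax(rolls$d1, rolls$d2)
mean(X == 3 & Y > 3) / mean(Y > 3)   # 6/27, about 0.2222
mean(Y <= 3 & X >= 4)                # 0: the events are incompatible
```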

6. Are $$X$$ and $$Y$$ independent?

Answer: why didn’t we simply use the multiplicative rule to compute $P(X=3 \cap Y>3)=P(X=3)P(Y>3)?$ It’s because $$X$$ and $$Y$$ are not independent, that is, it is not always the case that $P(X=x,Y=y)=P(X=x)P(Y=y)$ for all allowable $$x,y$$.

As it is, $$P(X=1,Y=1)=\frac{1}{36}$$, but $P(X=1)P(Y=1)=\frac{11}{36}\cdot \frac{1}{36}\neq \frac{1}{36},$ so $$X$$ and $$Y$$ are dependent (this is often the case when the domain of the joint p.d.f./p.m.f. is not rectangular).
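The failure of the product test at $$(1,1)$$ can be checked numerically:

```r
# The product test fails at (x, y) = (1, 1): the joint probability is 1/36,
# while the product of the marginals is (11/36) * (1/36) = 11/1296.
rolls <- expand.grid(d1 = 1:6, d2 = 1:6)
X <- pmin(rolls$d1, rolls$d2)
Y <- pmax(rolls$d1, rolls$d2)
mean(X == 1 & Y == 1)           # 1/36, about 0.0278
mean(X == 1) * mean(Y == 1)     # 11/1296, about 0.0085
```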

• There are $$8$$ similar chips in a bowl: three marked $$(0,0)$$, two marked $$(1,0)$$, two marked $$(0,1)$$ and one marked $$(1,1)$$. A player selects a chip at random and is given the sum of the two coordinates, in dollars (taken from [32]).

1. What is the joint probability mass function of $$X_1$$ and $$X_2$$?

Answer: let $$X_1$$ and $$X_2$$ represent the coordinates; we have $f(x_1,x_2)=\frac{3-x_1-x_2}{8},\quad x_1,x_2=0,1.$

2. What is the expected pay-off for this game?

Answer: the pay-off is simply $$X_1+X_2$$. The expected pay-off is thus \begin{aligned} \text{E}[X_1+X_2]&=\sum_{x_1=0}^1\sum_{x_2=0}^1(x_1+x_2)f(x_1,x_2)\\&=0\cdot \frac{3}{8}+1\cdot\frac{2}{8}+1\cdot \frac{2}{8}+2\cdot \frac{1}{8}\\&=0.75. \end{aligned}
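The expected pay-off can be verified directly from the joint p.m.f.:

```r
# Expected pay-off for the chip game: f(x1, x2) = (3 - x1 - x2)/8 on {0,1}^2.
chips <- expand.grid(x1 = 0:1, x2 = 0:1)
f <- (3 - chips$x1 - chips$x2) / 8
sum(f)                               # 1: f is a valid p.m.f.
sum((chips$x1 + chips$x2) * f)       # 0.75
```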

• Let $$X$$ and $$Y$$ have joint p.d.f. $f(x,y)=2,\quad 0\leq y\leq x\leq 1.$

1. What is the support of $$f(x,y)$$?

Answer: the support is the set $$S=\{(x,y):0\leq y\leq x\leq 1\}$$, a triangle in the $$xy$$ plane bounded by the $$x$$-axis, the line $$x=1$$, and the line $$y=x$$.


2. What is $$P(0\leq X\leq 0.5, 0\leq Y\leq 0.5)$$?

Answer: we need to evaluate the integral over the portion of the support with $$0\leq x\leq 0.5$$: \begin{aligned} P(0\leq &X\leq 0.5,0\leq Y\leq 0.5)\\&=P(0\leq X\leq 0.5,0\leq Y\leq X)\\& =\int_{0}^{0.5}\int_{0}^x2\, dydx\\&=\int_0^{0.5}\left[2y\right]_{y=0}^{y=x}\, dx \\ & =\int_{0}^{0.5}2x\, dx=1/4.\end{aligned}
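A Monte Carlo sketch confirms this value: points uniform on the unit square, kept only when they fall below $$y=x$$, are uniform on the triangle, so the probability is the fraction of kept points landing in the square $$[0,0.5]^2$$.

```r
# Monte Carlo check: sample uniformly on the triangle 0 <= y <= x <= 1 by
# accept-reject from the unit square, then estimate the probability.
set.seed(7)
u <- matrix(runif(2e5), ncol = 2)
pts <- u[u[, 2] <= u[, 1], , drop = FALSE]   # keep points below y = x
mean(pts[, 1] <= 0.5 & pts[, 2] <= 0.5)      # approximately 0.25
```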

3. What are the marginal p.d.f.s $$f_X(x)$$ and $$f_Y(y)$$?

Answer: for $$0\leq x\leq 1$$, we get \begin{aligned} f_X(x)&=\int_{-\infty}^{\infty}f(x,y)\, dy\\ &=\int_{y=0}^{y=x}2\, dy=\left[2y\right]_{y=0}^{y=x}=2x, \end{aligned} and for $$0\leq y\leq 1$$, \begin{aligned} f_Y(y)&=\int_{-\infty}^{\infty}f(x,y)\, dx=\int_{x=y}^{x=1}2\, dx\\ &=\left[2x\right]_{x=y}^{x=1}=2-2y.\end{aligned}
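Both marginals integrate to one over $$[0,1]$$, as a quick quadrature check confirms:

```r
# The marginal densities 2x and 2 - 2y each integrate to 1 over [0, 1].
integrate(function(x) 2 * x, lower = 0, upper = 1)$value       # 1
integrate(function(y) 2 - 2 * y, lower = 0, upper = 1)$value   # 1
```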

4. Compute $$\text{E}[X]$$, $$\text{E}[Y]$$, $$\text{E}[X^2]$$, $$\text{E}[Y^2]$$, and $$\text{E}[XY]$$.

Answer: we have \begin{aligned} \text{E}[X]&=\iint_Sxf(x,y)\, dA =\int_{0}^{1}\int_{0}^x2x\, dydx\\&=\int_0^1\left[2xy\right]_{y=0}^{y=x}\, dx = \int_{0}^1 2x^2\, dx \\&=\left[\frac{2}{3}x^3\right]_{0}^{1}=\frac{2}{3};\\ \text{E}[Y]&=\iint_Syf(x,y)\, dA =\int_{0}^{1}\int_{y}^12y\, dxdy\\&=\int_0^1\left[2xy\right]_{x=y}^{x=1}\, dy = \int_{0}^1 (2y-2y^2)\, dy \\&=\left[y^2-\frac{2}{3}y^3\right]_{0}^{1}=\frac{1}{3}; \\ \text{E}[X^2]&=\iint_Sx^2f(x,y)\, dA =\int_{0}^{1}\int_{0}^x2x^2\, dydx\\&=\int_0^1\left[2x^2y\right]_{y=0}^{y=x}\, dx = \int_{0}^1 2x^3\, dx \\&=\left[\frac{1}{2}x^4\right]_{0}^{1}=\frac{1}{2};\\ \text{E}[Y^2] &= \iint_Sy^2f(x,y)\, dA =\int_{0}^{1}\int_{y}^12y^2\, dxdy\\&=\int_0^1\left[2xy^2\right]_{x=y}^{x=1}\, dy = \int_{0}^1 (2y^2-2y^3)\, dy \\&=\left[\frac{2}{3}y^3-\frac{1}{2}y^4\right]_{0}^{1}=\frac{1}{6};\\ \text{E}[XY]&=\iint_Sxyf(x,y)\, dA=\int_{0}^1\int_{0}^x2xy\, dydx\\&=\int_0^1\left[xy^2\right]_{y=0}^{y=x}\, dx=\int_{0}^1x^3\, dx \\ &=\left[\frac{x^4}{4}\right]_{0}^1=\frac{1}{4}.\end{aligned}
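These moments can be double-checked with nested one-dimensional quadrature over the triangle; a sketch using base R's `integrate()` (the helper `Emoment` is ours, not a library function):

```r
# E[g(X, Y)] = int_0^1 int_0^x g(x, y) * 2 dy dx over the triangular support.
Emoment <- function(g) {
  inner <- function(xs) sapply(xs, function(x)
    integrate(function(y) rep_len(g(x, y) * 2, length(y)),
              lower = 0, upper = x)$value)
  integrate(inner, lower = 0, upper = 1)$value
}
Emoment(function(x, y) x)       # 2/3
Emoment(function(x, y) y)       # 1/3
Emoment(function(x, y) x * y)   # 1/4
```

The `rep_len()` wrapper lets integrands that are constant in $$y$$ (such as $$g(x,y)=x$$) still return a vector of the length `integrate()` expects.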

5. Are $$X$$ and $$Y$$ independent?

Answer: they are not independent as the support of the joint p.d.f. is not rectangular.

The covariance of two random variables $$X$$ and $$Y$$ can give some indication of how they depend on one another: \begin{aligned} \text{Cov}(X,Y)&=\text{E}[(X-\text{E}[X])(Y-\text{E}[Y])]\\&=\text{E}[XY]-\text{E}[X]\text{E}[Y].\end{aligned} When $$X=Y$$, the covariance reduces to the variance.

Example: in the last example, $$\text{Var}[X]=\frac{1}{2}-(\frac{2}{3})^2=\frac{1}{18}$$, $$\text{Var}[Y]=\frac{1}{6}-(\frac{1}{3})^2=\frac{1}{18}$$, and $$\text{Cov}(X,Y)=\frac{1}{4}-\frac{2}{3}\cdot\frac{1}{3}=\frac{1}{36}$$.
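A simulation corroborates these values: on this support the marginal c.d.f. of $$X$$ is $$F_X(x)=x^2$$, so $$X=\sqrt{U}$$ by inversion, and $$Y\mid X=x$$ is uniform on $$[0,x]$$ (its conditional density is $$f(x,y)/f_X(x)=1/x$$). A sketch:

```r
# Monte Carlo check of Var[X] = Var[Y] = 1/18 and Cov(X, Y) = 1/36.
set.seed(1)
n <- 1e6
x <- sqrt(runif(n))   # inverse-c.d.f. sampling: F_X(x) = x^2
y <- x * runif(n)     # Y | X = x is uniform on [0, x]
var(x)       # about 1/18 = 0.0556
var(y)       # about 1/18
cov(x, y)    # about 1/36 = 0.0278
```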

In R, we can sample from a multivariate normal via MASS’s mvrnorm(), whose required parameters are the sample size n, a mean vector mu, and a covariance matrix Sigma.

Let’s start with a standard bivariate normal, with mean $$\mu=(0,0)$$ and covariance matrix $$\Sigma=\begin{pmatrix}1 & 0 \\ 0 & 1\end{pmatrix}$$.

mu = rep(0,2)
Sigma = matrix(c(1,0,0,1),2,2)

We sample 1000 observations from the joint normal $$N(\mu,\Sigma)$$.

library(MASS)
a<-mvrnorm(1000,mu,Sigma)
a<-data.frame(a)
str(a)
'data.frame':   1000 obs. of  2 variables:
 $ X1: num  -0.5267 -0.3153 -0.4686 0.6782 0.0105 ...
 $ X2: num  0.1095 -0.721 -0.0943 0.1005 1.7259 ...

What would you expect to see when we plot the data?

library(ggplot2)
library(hexbin)
qplot(X1, X2, data=a, geom="hex")

The covariance matrix was the identity (diagonal), so we expect the blob to be circular. What happens if we use a non-diagonal covariance matrix?

mu = c(-3,12)
Sigma = matrix(c(110,15,15,3),2,2)
a<-mvrnorm(1000,mu,Sigma)
a<-data.frame(a)
qplot(X1, X2, data=a, geom="hex") + ylim(-40,40) + xlim(-40,40)
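MASS is convenient, but the same sampling can be sketched in base R with the Cholesky factor of $$\Sigma$$: if $$Z$$ is a standard normal vector and $$\Sigma=U^{\top}U$$, then $$\mu+U^{\top}Z$$ has mean $$\mu$$ and covariance $$\Sigma$$.

```r
# Sampling N(mu, Sigma) in base R: chol(Sigma) returns the upper-triangular
# factor U with t(U) %*% U == Sigma, so t(U) %*% z has covariance Sigma.
set.seed(42)
mu    <- c(-3, 12)
Sigma <- matrix(c(110, 15, 15, 3), 2, 2)
U <- chol(Sigma)
z <- matrix(rnorm(2 * 1000), nrow = 2)   # 1000 standard normal pairs
samples <- t(mu + t(U) %*% z)            # 1000 x 2 matrix, rows are draws
round(cov(samples), 1)                   # close to Sigma, up to sampling error
```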

### References

[32]
R. V. Hogg and E. A. Tanis, Probability and Statistical Inference, 7th ed. Pearson/Prentice Hall, 2006.