Chapter 8 Joint Distributions

So far, we have seen joint probabilities in the Bayesian context, such as \(P(A|B) = P(A,B)/P(B)\). Now we extend this concept to discrete and continuous distributions. In the discrete case, \(f(x,y) = P(X = x, Y = y)\) is the joint probability mass function. In the continuous case, \(f(x,y)\) is the joint probability density function; probabilities then come from integrating \(f\) over a region, since \(P(X = x, Y = y) = 0\) at any single point.

The same probability laws apply to joint distributions.

  • \(f(x,y) \ge 0\) for all \((x,y)\).
  • \(\sum_x \sum_y f(x,y) = 1\) (discrete) or \(\int_x \int_y f(x,y) dx dy = 1\) (continuous).

Example (discrete): Suppose there are 10 balls in a box: 3 white, 4 black and 3 red. Two balls are randomly selected without replacement. Let the random variable X be the number of white balls picked and the r.v. Y the number of black balls picked. (a) Find the joint probability function and (b) compute the probability of every possible outcome.

  1. Let’s first enumerate the outcomes. The pair \((x,y)\) can be any of \((0,0),(0,1),(0,2),(1,0),(1,1),(2,0)\). The total number of ways to pick two balls is \(\binom{10}{2} = 45\). For example, the number of ways of getting 1 white and 1 black ball is \(\binom{3}{1}\binom{4}{1}\binom{3}{0} = 12\), so that probability is \(\dfrac{\binom{3}{1}\binom{4}{1}\binom{3}{0}}{\binom{10}{2}} = \dfrac{12}{45}\). We can generalize this to a function.

\[f(x,y) = \dfrac{\binom{3}{x}\binom{4}{y}\binom{3}{2-x-y}}{\binom{10}{2}}\]

Let’s also make it into an R function
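
The original code chunk is not shown in the text, so the following is only a minimal sketch of what such a function could look like. The name `f_xy` and its argument names are assumptions made for illustration. It relies on R's `choose()` returning 0 for a negative second argument, which makes infeasible pairs such as \((2,1)\) evaluate to 0.

```r
# Joint pmf of X (white balls) and Y (black balls) when n_draw balls are
# drawn without replacement. Names are hypothetical, chosen for this sketch.
f_xy <- function(x, y, n_white = 3, n_black = 4, n_red = 3, n_draw = 2) {
  n_red_drawn <- n_draw - x - y  # the remaining draws must be red
  # choose(n, k) is 0 for k < 0, so infeasible (x, y) pairs get probability 0
  choose(n_white, x) * choose(n_black, y) * choose(n_red, n_red_drawn) /
    choose(n_white + n_black + n_red, n_draw)
}

# Probability of exactly 1 white and 1 black ball: 12/45
f_xy(1, 1)
```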

## [1] 0.2666667
  2. Using the above formula, we can calculate the probabilities of all outcomes in the feasible region \(x+y \le 2\).
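
One way to tabulate the joint pmf over the grid is sketched below. The names `f_xy` and `joint_table` are assumptions, and the layout (rows for y, columns for x) is chosen to match the printed table.

```r
# Joint pmf from the formula above (name is hypothetical)
f_xy <- function(x, y) {
  choose(3, x) * choose(4, y) * choose(3, 2 - x - y) / choose(10, 2)
}

# outer() passes its first argument as y (rows) and its second as x (columns)
joint_table <- outer(0:2, 0:2, function(y, x) f_xy(x, y))
dimnames(joint_table) <- list(paste0("y_", 0:2), paste0("x_", 0:2))

round(joint_table, 2)
```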
##      x_0  x_1  x_2
## y_0 0.07 0.20 0.07
## y_1 0.27 0.27 0.00
## y_2 0.13 0.00 0.00

Example (continuous): (This is from the textbook, Example 3.15) A privately owned business operates both a drive-in facility and a walk-in facility. On a randomly selected day, let X and Y, respectively, be the proportions of time that the drive-in and the walk-in facilities are in use and suppose that the joint density function of these random variables is

\[ f(x,y) = \dfrac{2}{5}(2x + 3y), 0 \le x \le 1, 0 \le y \le 1 \]

and 0 for other values of x and y.

  1. Verify \(\int_x \int_y f(x,y) dx dy = 1\)
  2. Find \(P[(X,Y) \in A]\), where \(A = \{(x,y)|0 < x < 1/2, 1/4 < y < 1/2\}\)

  1. (see the book for the full calculations)

\[ \int_x \int_y f(x,y) dx dy = \int_0^1 \int_0^1 \dfrac{2}{5}(2x+3y) dx dy = 1 \]

  2. (see the book for the full calculations)

\[ P[(X,Y) \in A] = \int_{1/4}^{1/2} \int_0^{1/2} \dfrac{2}{5}(2x+3y) dx dy = 13/160 \]
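
Both integrals can be checked numerically. The sketch below nests two calls to base R's `integrate()`; the helper name `rect_prob` is an assumption made for this illustration, and the results are numerical approximations rather than exact values.

```r
# Joint density on the unit square
f <- function(x, y) (2 / 5) * (2 * x + 3 * y)

# Integrate f over the rectangle [x_lo, x_hi] x [y_lo, y_hi]:
# inner integral over x for each y, then outer integral over y.
rect_prob <- function(x_lo, x_hi, y_lo, y_hi) {
  inner <- function(y) {
    # integrate() needs a vectorized integrand, hence sapply over y
    sapply(y, function(yy) integrate(function(x) f(x, yy), x_lo, x_hi)$value)
  }
  integrate(inner, y_lo, y_hi)$value
}

rect_prob(0, 1, 0, 1)        # total probability, approximately 1
rect_prob(0, 1/2, 1/4, 1/2)  # approximately 13/160 = 0.08125
```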

Example (with special distributions)

  1. Patients arrive at the doctor’s office according to a Poisson distribution with \(\lambda = 2\)/hour.

    1. What is the probability of at most 2 patients arriving within 2 hours?
    2. Suppose each arriving patient has a 50% chance of bringing one companion. There are 10 seats in the waiting room. At least how many hours must pass before there is at least a 50% probability that the waiting room is filled with patients and their companions?

Solution

  1. With \(\lambda t = 2 \cdot 2 = 4\), \(P(X\le 2|\lambda t = 4)= \sum_{i=0}^2 \dfrac{e^{-\lambda t}(\lambda t)^i}{i!}\)
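
A minimal sketch of this sum in R, using the built-in Poisson pmf `dpois()`. The variable names `rate` and `hours` are assumptions for illustration.

```r
rate <- 2   # patients per hour
hours <- 2  # length of the time window

# P(X <= 2) for X ~ Poisson(rate * hours), summing the pmf term by term
sum(dpois(0:2, rate * hours))
```

The cdf call `ppois(2, rate * hours)` gives the same value.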
## [1] 0.2381033
  2. First let’s define the problem. Let \(n_p\) be the number of patients and \(n_c\) the number of companions. We want \(n_p + n_c \ge 10\) with probability 50% or higher for a given \(t^*\). Equivalently, we want \(n_p + n_c \le 9\) with probability 50% or lower.

    What does \(n_c\) depend on? On \(n_p\): given the number of patients, the number of companions follows a binomial distribution. \(P(n_c = i|n_p) = \binom{n_p}{i} (0.5)^i*(0.5)^{n_p-i}\). It is even better if we use the cdf \(P(n_c \le k|n_p) = \sum_{i=0}^{k} \binom{n_p}{i} (0.5)^i*(0.5)^{n_p-i}\).

    We know the arrival of patients follows a Poisson distribution, so \(P(n_p = j|\lambda t^*) = \dfrac{e^{-\lambda t^*}(\lambda t^*)^j}{j!}\). Combining the two, \(P(n_p + n_c \le N) = \sum_{a=0}^N P(n_p = a|\lambda t^*)*P(n_c \le N-a | n_p = a)\), with \(N = 9\) here. The sum stops at \(a = N\) because more than \(N\) patients alone already exceed \(N\) people. Remember that \(n_c \le n_p\) always holds.
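
    The sum above can be sketched in R with `dpois()` and `pbinom()`. The function name `prob_at_most`, its argument names, and the trial times 2, 3 and 3.3 hours are assumptions made for this illustration; the original code is not shown in the text.

```r
# P(n_p + n_c <= n_max) after `hours` hours (names are hypothetical)
prob_at_most <- function(hours, n_max = 9, rate = 2, p_companion = 0.5) {
  a <- 0:n_max                                        # possible patient counts
  p_patients <- dpois(a, rate * hours)                # P(n_p = a)
  p_companions <- pbinom(n_max - a, a, p_companion)   # P(n_c <= n_max - a | n_p = a)
  sum(p_patients * p_companions)
}

# Trial values of t*; the probability should fall as time passes
prob_at_most(2)
prob_at_most(3)
prob_at_most(3.3)
```

    Instead of trying values by hand, a root finder such as `uniroot(function(h) prob_at_most(h) - 0.5, c(3, 3.3))$root` could locate the time at which the probability crosses 50%.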

## [1] 0.8631867
## [1] 0.5810261
## [1] 0.4905249

    These are the values of \(P(n_p + n_c \le 9)\) at \(t^* = 2\), \(3\) and \(3.3\) hours, respectively. The probability drops below 50% between 3 and 3.3 hours, so about 3.3 hours must pass.

8.0.1 Marginal Distributions

You can get a marginal distribution by summing or integrating the joint distribution over the other random variable, such as \(P(Y=y) = \sum_x f(x,y)\) or \(f(y) = \int_x f(x,y) dx\). Let’s calculate the marginal distribution of black balls (r.v. Y) in the ball example above.
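
A minimal sketch of this calculation: rebuild the joint table (rows for y, columns for x) and sum each row. The names `f_xy` and `joint_table` are assumptions made for illustration.

```r
# Joint pmf of white (x) and black (y) balls; name is hypothetical
f_xy <- function(x, y) {
  choose(3, x) * choose(4, y) * choose(3, 2 - x - y) / choose(10, 2)
}

joint_table <- outer(0:2, 0:2, function(y, x) f_xy(x, y))
dimnames(joint_table) <- list(paste0("y_", 0:2), paste0("x_", 0:2))

round(joint_table, 2)
rowSums(joint_table)  # marginal of Y: sum over x within each row
```

Using `colSums(joint_table)` instead would give the marginal distribution of X.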

##      x_0  x_1  x_2
## y_0 0.07 0.20 0.07
## y_1 0.27 0.27 0.00
## y_2 0.13 0.00 0.00
##       y_0       y_1       y_2 
## 0.3333333 0.5333333 0.1333333

The marginal distribution of y in the second (continuous) example is calculated by integrating over \(0 \le x \le 1\).

\[f(y) = \int_0^1 \dfrac{2}{5}(2x+3y) dx = \dfrac{2(1+3y)}{5}, \quad 0 \le y \le 1\]

8.0.2 Conditional Distribution

Similar to Bayes’ Rule, it is possible to calculate conditional probabilities from joint distributions. Let g(x) denote the marginal distribution of x and h(y) the marginal distribution of y. The conditional distribution of x given y is as follows.

\[f(x|y) = f(x,y)/h(y), \quad h(y) > 0\]

Note that conditioning adds no information if x and y are independent: in that case \(f(x,y) = g(x)h(y)\), so \(f(x|y) = g(x)\).
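
As an illustration, the conditional distribution of X given Y in the ball example can be obtained by dividing each row of the joint table by the matching marginal h(y). This is a minimal sketch; the names `f_xy`, `joint_table`, `h_y` and `cond_x_given_y` are assumptions.

```r
# Joint pmf of white (x) and black (y) balls; name is hypothetical
f_xy <- function(x, y) {
  choose(3, x) * choose(4, y) * choose(3, 2 - x - y) / choose(10, 2)
}

joint_table <- outer(0:2, 0:2, function(y, x) f_xy(x, y))
dimnames(joint_table) <- list(paste0("y_", 0:2), paste0("x_", 0:2))

h_y <- rowSums(joint_table)  # marginal distribution of Y

# Divide row i by h_y[i]; each row of the result is f(x | y) for one y
cond_x_given_y <- sweep(joint_table, 1, h_y, "/")

cond_x_given_y["y_1", ]  # distribution of X given Y = 1
```

Given one black ball, the other ball is white or red with equal probability, since \(f(0,1)/h(1) = f(1,1)/h(1) = (12/45)/(24/45) = 0.5\). The rows differ from one another, so X and Y are not independent here.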