Random Variables and Great Expectations

Some problems involving probability can seem really messy. For example, if we randomly put 100 letters into 100 addressed envelopes, on average how many letters will end up in their correct envelopes? Or, on average, in a class of size m, how many pairs of people will share the same birthday?

To solve such problems, we can turn to a formidable tool in probability called "linearity of expectation." To use this tool, we have to understand the concept of a random variable.

Random Variables

We begin with the notion of a finite probability distribution , which consists of a finite set of elements, or samples, where each has a weight, or probability, .

Example

Let the probability of heads coming up be . For HH, the probability is ; for TT, it is ; for TH, it is ; and for HT, it is .

Recall that an event is a subset of the sample space. So, if a sample space consists of seven samples with the following probabilities and the event is the subset of elements with the following probabilities , then .

Running example

We throw a white die and a black die. For this example, we have the following sample space of all 36 possible outcomes:

So, for a uniform distribution, we have , .

Suppose that the event in which we are interested is all outcomes that produce a sum of 3 or less:

= event that sum equals 3 or less. proportion of in .

A random variable is a (real-valued) function on .

So, a random variable can be used to describe the process of rolling a fair die and the possible outcomes . Another random variable might describe the possible outcomes of picking a random person and measuring his or her height.

Unlike the usual practice with other mathematical variables, a random variable cannot be assigned a value; a random variable does not describe the actual outcome of a particular experiment, but rather describes the possible, as-yet-undetermined outcomes in terms of real numbers.

There are many possible random variables. Consider the event of tossing a coin. Thus, , where is the event in which the coin falls head and is the event in which it falls tails. Let be the number of tails in the experiment. Then is a random variable.

Examples

We throw a white die and a black die. In this sample space, we have many possible choices of random variable.

Let the function be the value of the white die.

, and so on.

Let the function be the sum of the values of the two dice.

, and so on.

Let the function be the value of the white die taken to the power of the value of the black die.

, and so on.

Let be the function such that it equals 1 if the two dice are equal, 0 otherwise.

, and so on.

Example

Suppose that we toss a fair coin times. The sample space consists of all sequences of . We also have a uniform distribution on so .

Let the random variable be the number of heads.

For .

Let the random variable be 1 if the number of heads equals the number of tails, 0 otherwise.

For .

In terms of notation, we will use letters such as and for events and letters such as and for random variables.

We can think of a random variable as a function on the sample space to the reals . In other words, we can think of the input to the function as random. Or we can think of a random variable in terms of the induced distribution on : a variable with a probability distribution on its values. In effect, the randomness is "pushed" to the values of the function.

For two tossed coins, for example, the function counts the number of heads.

We can also consider it as a distribution on .

and . 2 is associated with a probability 1/4, 0 with a probability 1/4, and 1 with a probability 1/2.

Example

Again, we have two dice. The function is the sum of both dice.

We can also picture it as a probability distribution of , with the sums as the horizontal axis and the probabilities as the vertical axis.

Rplot001.png

From Random Variables to Events

For any random variable and value , we can define the event that .

Examples

For two tossed coins, the function counts the number of heads.

Suppose that we want the probability that one head and one tail will come up. In this case, .

For two dice, the function gives the sum. We want the probability of obtaining 6.

For any event , there is a simple random variable called the indicator random variable of , whose value tells us whether or not has occurred. We define the indicator random variable for as follows:

if and if

Expected Value

The expected value (or mathematical expectation) of a random variable is the sum of the probability of each possible outcome multiplied by its value. In effect, it represents the average amount one "expects" to win per bet if bets with identical odds are repeated many times

The expectation, or expected value, of a random variable is a weighted average of the possible values of by their corresponding probabilities. It is written as , and is given by the expression:

In this expression, is a function on the sample space and it also has a distribution on its values.

Example

For two tossed coins, we can write the following expression for the expected value.

What if we flip a coin three times? What is the expected number of heads?

Clearly, however, we cannot obtain 1.5 heads. So, . Note that the value itself may not be expected in the general sense; it may be unlikely or even impossible to obtain. Moral: Don't always expect the expected: may be 0!

For an indicator random variable for event (defined above), .

Linearity of Expectation

We can add random variables. If and are random variables on the same set , then is also a random variable.

For instance, we can roll one die and let be the random variable giving the die's value. We can roll a second die and let be the random variable giving this die's value. We can then define the random variable to be the sum of the two dice .

Example

Consider picking a random person in the world. Let length of the person's left arm in inches; length of the person's right arm in inches. Let . measures the combined arm lengths.

Formally, people in world , uniform distribution on .

Two random variables and are said to be independent if and only if the value of has no influence on the value of and vice versa. Hence. two random variables and are independent if for every the events and are independent.

Expected values obey a simple, very helpful rule called linearity of expectation. In its simplest form, it says that the expected value of a sum of random variables is the sum of the expected values of the variables.

If , then

The great thing about linearity of expectation is that no independence is required. This can be very useful. We often need to work with random variables that are not independent.

Proof

Also, we have for any constant .

Exercises

Consider the example of tossing a coin twice. Let be the first toss and be the second toss. Let total number of heads. What is ? ? ?

Or we can let be at least one coin heads and to be both coins heads. Are and independent? What is ? ? ?

Multiple Variables

Using induction, we can extend the idea of linearity of expectation to encompass random variables.

In general, the expectation of the sum is the sum of the expectations.

Example

If we randomly put 100 letters into 100 addressed envelopes, on average how many letters will end up in their correct envelopes?

Let be the event the letter ends up in its correct envelope. Let be the indicator random variable for .

if occurs. 0 otherwise.

Let . We are asking for .

What is ?

.

What is ?

.

So, in expectation, one card will be in the same position as it started. Note that it doesn't depend on how many letters we have! Were the independent? No. Consider what happens in the case .

Here is the general approach that we can use for solving such problems.

  • View the thing that we care about as the expected value of some random variable.
  • Write this random variable as the sum of simpler random variables (typically indicator random variables).
  • Solve for their expectations and add them up!

Example

We flip coins of bias . What is the expected number of heads?

We could do this by summing:

But now we know a better way.

Let number of heads when independent coins of bias are flipped.

Break into simpler random variables. if the coin is tails. if the coin is heads.

Product of Expectations

What about products of random variables? If , then ? No!

For example, if is the indicator random variable for "first flip is heads" and is the indicator random variable for "1st flip is tails" .

But it is true if the random variables are independent.

Proof

Suppose that we flip two fair coins, and we let indicator random variable for first coin being heads, indicator random variable for second coin being heads, and indicator random variable for "both are heads."

Note that and , so . In fact, is called the variance of .

Most of the time, however, power will come from using sums, mainly because linearity of expectations holds even if random variables are not independent.

Example

On average, in a class of size , how many pairs of people will have the same birthday?

Suppose that we have people, each with a uniformly chosen birthday from 1 to 366.

Let number of pairs of people with the same birthday.

We need to find .

We use indicator variables, one for each pair of people.

if person and person have the same birthday. otherwise.

Note that there are many dependencies among the indicator variables. For example, and and are dependent. But when using linearity of expectations, it doesn't matter.

Example

You pick a number . You roll three dice. If any match , you win 1. Otherwise, you pay me 1. Want to play?

Let event that die matches. Let indicator random variable for . Expected number of dice that match:

But this is not the same as Pr(at least one die matches). Pr(at least one die matches) = 1 - Pr(none match) = 1 - .

In general, be careful that you are modeling the problem correctly. And watch out for carnival games.

Topic revision: r7 - 2008-01-13 - 22:07:30 - SeverinHacker
 
This site is powered by the TWiki collaboration platformCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback