Rolling Dice with R
One of the foundational ways of exploring probability is calculating the possible outcomes of rolling a die, typically the standard six-sided kind. We could roll a real die over and over, record the outcomes, and use them to explore probability, but that is limited by the time and energy we have available. With R we can simulate rolling dice as many times as we want, and we can simulate six-sided, ten-sided, even fifty-sided dice.
To make it happen we'll use the sample() function. It takes a vector as input, which we already know how to create, and returns a sample of a specified size. We can sample with or without replacement. Rolling dice is not like drawing names from a hat: rolling a one doesn't remove it from the die. Let's roll a six-sided die once, using the values one through six in a vector:
> sample(1:6, 1)
[1] 5
Looks like we rolled a five! Now let's roll the die twice, using the replace = TRUE option. Remember, rolling a die doesn't eliminate possible outcomes, which is why we need to specify that option. Here's us rolling the die twice, a few times in a row:
> sample(1:6, 2, replace = TRUE)
[1] 5 5
> sample(1:6, 2, replace = TRUE)
[1] 5 3
> sample(1:6, 2, replace = TRUE)
[1] 6 3
Without the replace = TRUE option we'd never be able to roll two fives. Let's roll the die twenty times:
> sample(1:6, 20, replace = TRUE)
[1] 4 6 5 5 2 2 5 2 3 5 2 2 1 6 1 5 5 3 4 5
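The same pattern extends to the other dice mentioned earlier. Below is a minimal sketch that wraps sample() in a helper; the name roll_dice and its arguments sides and n are hypothetical, introduced here only for illustration. It also checks the point about replacement: without replace = TRUE, two values drawn together can never match.

```r
# roll_dice() is a hypothetical helper, not a built-in R function.
# sides: number of faces on the die; n: number of rolls.
roll_dice <- function(sides = 6, n = 1) {
  sample(1:sides, n, replace = TRUE)
}

ten_sided   <- roll_dice(sides = 10, n = 5)   # five rolls of a ten-sided die
fifty_sided <- roll_dice(sides = 50, n = 5)   # five rolls of a fifty-sided die

# Without replacement, a value can be drawn only once,
# so a pair of "rolls" can never be a double.
no_replace <- sample(1:6, 2)
no_replace[1] == no_replace[2]   # always FALSE
```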
Now let's roll a thousand times and save the output vector to a variable that we can do something with:
> rolls <- sample(1:6, 1000, replace = TRUE)
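Because sample() draws at random, the counts you get will differ from the ones shown in this post. If you want a run you can repeat exactly, one option is to call set.seed() first. The seed value 42 and the variable names below are arbitrary choices for illustration.

```r
# Seeding the random number generator makes the "random" rolls repeatable.
set.seed(42)   # 42 is an arbitrary seed chosen for this sketch
first_run <- sample(1:6, 1000, replace = TRUE)

set.seed(42)   # resetting the same seed replays the same sequence
second_run <- sample(1:6, 1000, replace = TRUE)

identical(first_run, second_run)   # TRUE
```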
The quickest way to see how many times we rolled each number is to use the table() function. Let's apply table() to the new rolls variable:
> table(rolls)
rolls
  1   2   3   4   5   6
157 163 175 155 175 175
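One thing to keep in mind: table() only lists values that actually occur. With a thousand rolls every face is almost sure to show up, but with a handful of rolls some may be missing from the output. A sketch of one way around that, assuming you want a count for every face, is to convert the rolls to a factor with explicit levels first. The vector few_rolls is made up for this example.

```r
# A small, hand-written set of rolls in which 3 and 5 never appear.
few_rolls <- c(1, 2, 2, 4, 6, 6, 6)

table(few_rolls)                        # only lists 1, 2, 4 and 6

# Declaring all six faces as factor levels keeps the zero counts.
all_faces <- table(factor(few_rolls, levels = 1:6))
all_faces                               # counts for 1 through 6: 1 2 0 1 0 3
```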
To see the proportion of rolls that landed on each number, we can divide table(rolls) by the total number of rolls (1000):
> table(rolls) / 1000
rolls
    1     2     3     4     5     6
0.157 0.163 0.175 0.155 0.175 0.175
We rolled threes, fives, and sixes each 17.5% of the time. We rolled fours the least, at just 15.5% of total rolls. For comparison, a fair die should land on each face about 16.7% of the time in the long run. Visualizing all our rolls with a histogram is pretty straightforward with the built-in hist() function:
> hist(rolls)
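Since a die has only six discrete outcomes, a bar chart of the proportions is a common alternative to a histogram. The sketch below is one way to do it, not the only one: it uses prop.table() to compute proportions without hard-coding the number of rolls, and draws a dashed reference line at 1/6, the proportion expected from a fair die. The variable names are illustrative.

```r
rolls <- sample(1:6, 1000, replace = TRUE)

# prop.table() divides each count by the total, so it works for any number of rolls.
proportions <- prop.table(table(rolls))

# One bar per face, with the fair-die expectation drawn for comparison.
barplot(proportions, xlab = "Face", ylab = "Proportion of rolls")
abline(h = 1/6, lty = 2)   # dashed line at the expected proportion
```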
Rolling a die 1000 times and recording the outcomes manually would be a very time-intensive and error-prone process. R makes it easy to create sample data and visualize the results.