01 Jan

# More R Random

## Generation of normals

If you want to generate 200 standard normals, then do:

`> xn <- rnorm(200)`

You will get different numbers in `xn` if you do the command again.

The `mean` and `sd` arguments control the mean and standard deviation.
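As a minimal sketch of a non-standard normal: the object name `xn2` and the values 10 and 3 below are arbitrary choices for illustration. Note that `sd` is the standard deviation, not the variance.

```r
# 200 normals with mean 10 and standard deviation 3
# (xn2, 10 and 3 are illustrative choices, not from the text above)
xn2 <- rnorm(200, mean = 10, sd = 3)

# the sample statistics should be near 10 and 3, but not exactly equal
mean(xn2)
sd(xn2)
```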

## Two types of uniform

A distribution in which all numbers in some range are equally likely is a continuous uniform. Alternatively, a distribution in which each member of some finite set of objects (such as a range of integers) is equally likely is a discrete uniform.

### Continuous uniform

You can generate 100 numbers that are continuously uniform between 0 and 1 with:

`> xcontu <- runif(100)`

You will get different numbers in `xcontu` if you do the command again.

The `min` and `max` arguments change the range.
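For instance, a sketch that draws from the interval between -1 and 1 instead; the name `xcontu2` and the endpoints are illustrative choices.

```r
# 100 continuous uniforms between -1 and 1
# (xcontu2 and the endpoints are illustrative choices)
xcontu2 <- runif(100, min = -1, max = 1)

# every value should fall inside the requested range
range(xcontu2)
```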

### Discrete uniform

Use the `sample` function to generate uniformly from some set of integers (or other types of objects). For example:

`> xdiscu <- sample(1:100, 4, replace=TRUE)`

selects 4 numbers between 1 and 100, inclusive, with replacement.

You will get different numbers in `xdiscu` if you do the command again.

You can get a random color from among the named colors with the command:

`> sample(colors(), 1)`

The `prob` argument to `sample` allows you to give different probabilities to the elements of the vector that is being selected from. Thus `sample` will perform non-uniform sampling as well.
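A small sketch of non-uniform sampling, assuming a loaded coin that lands heads three times as often as tails. The names `sides` and `xflip` and the weights are hypothetical.

```r
# a loaded coin: heads is three times as likely as tails
# (sides, xflip and the weights are illustrative assumptions)
sides <- c("heads", "tails")
xflip <- sample(sides, 1000, replace = TRUE, prob = c(0.75, 0.25))

# observed proportions should be near 0.75 and 0.25
table(xflip) / length(xflip)
```

The probabilities are matched to the elements by position, so `prob` should have the same length as the vector being sampled.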

## Random permutations

The `sample` function also does random permutations. In fact, that is its default behavior:

```
> xpermute <- sample(x)
> sample(1:9)
[1] 1 9 3 8 5 2 6 4 7
```

You will get a different order in `xpermute` if you do the command again. (We are assuming here that `x` is a vector with more than one element.)
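The assumption matters because `sample` treats a single number `n` (of at least 1) as shorthand for `1:n`. A sketch of the difference, using an illustrative value of 10; permuting through indices with `sample.int` is one common way around the problem, not the only one.

```r
x1 <- 10   # a vector of length one (illustrative value)

# this is a permutation of 1:10, not of the single value 10
p1 <- sample(x1)
length(p1)

# permuting the indices instead works for a vector of any length
p2 <- x1[sample.int(length(x1))]
p2
```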

## Seed setting

In all of the commands above, you get different answers as you repeat them. That is pretty much the point of them. However, it can be useful to know that you will get the same answers again even though you are generating random numbers. You can do that by setting the random seed.

In R there is an object called `.Random.seed` that holds the state of the random number generator. Once you have generated something random, there will be a `.Random.seed` object in your global environment. (It doesn’t show up in `ls()` because the name starts with a dot; you can see such objects by saying: `ls(all.names=TRUE)`.)

Calls to random functions change the value of `.Random.seed`. That is, these calls not only return a value, they also have the side effect of changing `.Random.seed`.
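The side effect can be observed directly. In this sketch the names `before` and `after` are illustrative, and `set.seed(1)` is used only to guarantee that `.Random.seed` exists.

```r
set.seed(1)              # guarantees that .Random.seed exists
before <- .Random.seed   # copy of the generator state
invisible(runif(1))      # any random call will do
after <- .Random.seed    # the state after the call

# the call has changed the state
identical(before, after)
```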

But if the random seed is the same at the start of a call, then the results will be the same. There are two ways of setting the seed: you can save the seed and then assign it, or you can use `set.seed`.

The preferred method is to use `set.seed`. You can just give a number as the first argument:

```
> set.seed(123)
> rnorm(4)
[1] -0.56047565 -0.23017749  1.55870831  0.07050839
> rnorm(4)
[1]  0.1292877  1.7150650  0.4609162 -1.2650612
> set.seed(123)
> rnorm(4)
[1] -0.56047565 -0.23017749  1.55870831  0.07050839
```
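The other method, saving the seed and assigning it back, might look like the sketch below. The names `saved_seed`, `a` and `b` are illustrative. The `assign` call is used so that the seed is restored in the global environment even when the code runs inside a function; at the top level a plain `.Random.seed <- saved_seed` does the same thing.

```r
invisible(runif(1))           # make sure .Random.seed exists
saved_seed <- .Random.seed    # save the current generator state
a <- rnorm(4)

# put the saved state back in the global environment
assign(".Random.seed", saved_seed, envir = globalenv())
b <- rnorm(4)

# same starting state, so the same numbers
identical(a, b)
```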

## Probability distributions

R has functions for a number of probability distributions. In general, there are four functions for each distribution as shown in Table 1.

Table 1

| Function name | Description |
|---|---|
| `rxxx` | random generation |
| `dxxx` | density function |
| `pxxx` | cumulative probability function |
| `qxxx` | quantile function |

For example `rnorm` is the random generation function for the normal distribution. `dnorm` is the density for the normal. `pnorm` is the cumulative probability function for the normal — that is, this gives the probability of being less than or equal to a given quantile. `qnorm` is the quantile function — the inverse of the probability function (that is, it returns a quantile given a probability).
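A brief sketch of how the last three fit together for the standard normal. The object names and the values 1.96, 0.975 and 1.5 are illustrative choices.

```r
p <- pnorm(1.96)            # probability of being <= 1.96, about 0.975
q <- qnorm(0.975)           # quantile with 97.5% below it, about 1.96
back <- qnorm(pnorm(1.5))   # round trip gives back 1.5 (up to rounding)
d0 <- dnorm(0)              # density at the mean: 1 / sqrt(2 * pi)

c(p = p, q = q, back = back, d0 = d0)
```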

Table 2 shows a few of the distributions that are available in R.

Table 2

| Distribution | Functions |
|---|---|
| Uniform | `runif` `dunif` `punif` `qunif` |
| Normal | `rnorm` `dnorm` `pnorm` `qnorm` |
| Student’s t | `rt` `dt` `pt` `qt` |
| F | `rf` `df` `pf` `qf` |
| Exponential | `rexp` `dexp` `pexp` `qexp` |
| Log normal | `rlnorm` `dlnorm` `plnorm` `qlnorm` |
| Beta | `rbeta` `dbeta` `pbeta` `qbeta` |
| Binomial | `rbinom` `dbinom` `pbinom` `qbinom` |
| Poisson | `rpois` `dpois` `ppois` `qpois` |

You can see a more complete list with the command:

`> ??distribution`

The `ecdf` function takes a data vector as an argument and returns a function that is the cumulative probability function of the data.
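A short sketch of `ecdf` in use. The data here are simulated normals, and the names `xdata` and `Fn` and the seed 42 are arbitrary choices.

```r
set.seed(42)          # arbitrary seed so the example is repeatable
xdata <- rnorm(50)    # illustrative data
Fn <- ecdf(xdata)     # Fn is itself a function

Fn(0)                 # proportion of xdata that is <= 0
mean(xdata <= 0)      # the same proportion computed directly
```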

Many contributed packages contain functions for additional distributions.

## Pseudorandomness

In a certain sense most of what is said on this page is a lie. When you use a function like `rnorm` or `sample`, you are not generating randomness at all. These are pseudorandom functions. Technically you are generating chaos when you use them, not randomness. There are two main reasons to use pseudorandomness rather than randomness.

The first is convenience. In the early days of computing there was no way to actually get true random values, so they had to invent pseudorandom methods. Now there is the possibility of using truly random values, but it is generally harder to do and seldom offers an advantage.

The second reason to prefer pseudorandomness is reproducibility. Random numbers (by definition) are not reproducible. A program without reproducible results is a program that cannot be debugged.

It is largely accidental that we have pseudorandom functions and not truly random functions. It’s a happy accident.

## Resources

This includes a discussion of probability distributions.