# Generating random numbers in R and Racket

R makes it easy to generate random numbers from a wide variety of distributions with a consistent interface. For example, `runif`

, `rnorm`

, and `rpois`

generate random numbers from uniform, normal, and Poisson distributions, respectively.

```
> x = runif(n = 1000, min = 4.6, max = 9.3)
> min(x)
[1] 4.60374
> max(x)
[1] 9.288063
>
> y = rnorm(n = 1000, mean = 81, sd = 9)
> mean(y)
[1] 81.2548
> sd(y)
[1] 9.179427
>
> z = rpois(n = 1000, lambda = 7.11)
> mean(z)
[1] 7.187
> var(z)
[1] 6.954986
```

The Racket `math`

library provides much of the same functionality as the base R functions, but I first want to work through a couple of examples without using the `math`

library because I think it is useful for better understanding Racket.

Let's create a Racket function that is similar to `runif`

in R.

```
#lang racket/base
(define (runif n [min 0] [max 1])
(for/list ([i (in-range n)])
(+ (* (- max min) (random)) min)))
```

The `random`

function draws a random number from a uniform distribution between 0 and 1. A list of `n`

random samples is drawn by looping through a sequence with `for/list`

. To draw a random number within a specified interval, we multiply the random number by the specified range `(- max min)`

and shift by the `min`

.

```
> (define unif (runif 1000 -7 33))
> (apply min unif) ; check that min is about -7
-6.987705647350004
> (apply max unif) ; check that max is about 33
32.97250528872971
```

The need to use `apply`

to calculate the `min`

of a list was a surprise to me. The `min`

function takes any number of real numbers as arguments but not a list. `apply`

uses the contents of the list as the arguments for `min`

.

```
> (min 3 1 2)
1
> (min '(3 1 2))
min: contract violation
expected: real?
given: '(3 1 2)
```

Another wrinkle is introduced if you want to generate a vector (with `for/vector`

) of random numbers rather than a list.

```
(define (runif-vec n [min 0] [max 1])
(for/vector ([i (in-range n)])
(+ (* (- max min) (random)) min)))
```

We can't use `apply`

with a vector.

```
> (define unif-vec (runif-vec 1000 0.5 1.5))
> (apply min unif-vec)
apply: contract violation
expected: list?
given: '#(1.3465423772765357 1.0623864994795045 1.3437265664094888 1.1048441535810904 0.7861090566755002 0.6470981723620602 0.5546597487687198 0.552946824350613 0.5251621516034304 1.3568612624022975 1.1217164560959263 1.3334770245391927 0.890117942389224...
argument position: 2nd
other arguments...:
```

StackOverflow provides a solution based on `for/fold`

.

```
> (for/fold ([m +inf.0]) ([x (in-vector unif-vec)]) (min m x))
0.5014616191163698
> (for/fold ([m -inf.0]) ([x (in-vector unif-vec)]) (max m x))
1.4994848021056595
```

In `for/fold`

, `m`

is an accumulator that is initialized as positive (`+inf.0`

) or negative infinity (`-inf.0`

) for min and max, respectively. When iterating through `unif-vec`

, `m`

is compared to `x`

and the value of `m`

is updated with the output of `min`

or `max`

.

Random values can also be drawn from a normal distribution using draws from the standard uniform distribution with `random`

and a formula from Rosetta Code

```
> (define (rnorm n [mean 0] [sd 1])
(for/list ([i (in-range n)])
(+ mean (* (sqrt (* -2 (log (random)))) (cos (* 2 pi (random))) sd))))
> (define norm (rnorm 1000 2 0.5))
> (require math) ; need math library for mean and standard deviation
> (mean norm) ; check that mean is about 2
1.9797112583083123
> (stddev norm) ; check that sd is about 0.5
0.5022756727111091
```

Now, let's simplify our life and use the functions provided by the `math`

library.

```
(require math)
(define (rnorm-2 n [mean 0] [sd 1])
(if (and (real? mean) (real? sd))
(flnormal-sample (real->double-flonum mean) (real->double-flonum sd) n)
(error "mean and sd arguments must be real numbers")))
```

The function, `flnormal-sample`

, requires that the mean and standard deviation are passed as floating-point numbers (and returns `flvector`

). In `rnorm-2`

, we check if the `mean`

and `sd`

arguments are real numbers and, if so, convert them to floating-point numbers (`real->double-flonum`

) before calling `flnormal-sample`

, which is otherwise similar to R's `runif`

function.

```
> (define norm-2 (rnorm-2 1000 2 0.5))
> (mean norm-2)
2.024086073688603
> (stddev norm-2)
0.5039448247564015
```

The `math`

library contains similar sample functions for numerous types of distributions, including uniform, Poisson, beta, binomial, gamma, and more.

Finally, let's take a look at the `statistics`

object provided by the `math`

library.

```
> (define s (update-statistics* empty-statistics norm-2))
> (statistics-mean s)
1.9947322187693184
> (statistics-stddev s)
0.4974799838849746
```

`empty-statistics`

is an empty statistics object. `update-statistics*`

populates the statistics object by iterating through `norm-2`

to compute summary statistics, which are extracted with accessor functions, e.g., `statistics-min`

, `statistics-mean`

, `statistics-skewness`

.

The question is why would you use the more verbose statistics object. One reason is that `statistics-min`

and `statistics-max`

are actually less verbose than using `for/fold`

with `min`

and `max`

. Also, if you are computing several summary statistics on a large sequence, then `update-statistics*`

is faster than the other assorted functions.

```
> (define norm-3 (rnorm-2 10000000 1000 100))
> (time
(mean norm-3)
(stddev norm-3)
(variance norm-3)
(for/fold ([m +inf.0]) ([x (in-flvector norm-3)]) (min m x))
(for/fold ([m -inf.0]) ([x (in-flvector norm-3)]) (max m x)))
cpu time: 72618 real time: 108130 gc time: 34828
> (time
(define s2 (update-statistics* empty-statistics norm-3))
(statistics-mean s2)
(statistics-stddev s2)
(statistics-variance s2)
(statistics-min s2)
(statistics-max s2))
cpu time: 12478 real time: 22668 gc time: 5512
```