11  Random Numbers, If Statements, Loops, and Writing Custom Functions

11.1 Generate random numbers

Many studies rely on random numbers, for example to simulate data or to draw random samples. The rnorm() function generates random values from a normal distribution:

rnorm(n=20, mean=0, sd=1) # random normal numbers
rnorm(20) # same as above (standard normal is default)
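
Because rnorm() returns different values on every call, results are hard to reproduce unless the random number generator is seeded first. A minimal sketch using base R's set.seed(); the seed value 123 and the variable names are arbitrary choices made for illustration:

```r
set.seed(123)                                  # fix the random number generator state
first_draw <- rnorm(n = 5, mean = 0, sd = 1)

set.seed(123)                                  # same seed, so the same sequence follows
second_draw <- rnorm(n = 5, mean = 0, sd = 1)

identical(first_draw, second_draw)             # TRUE: both draws match exactly
```

Without the second set.seed() call, the two vectors would almost certainly differ.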

11.2 Plot random numbers

x <- rnorm(1000)
hist(x) 
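
The default histogram can be refined with a few optional arguments. The sketch below plots densities instead of counts and overlays the theoretical standard normal curve for comparison. The seed, bin count, and labels are arbitrary choices, not part of the original example:

```r
set.seed(1)                                    # arbitrary seed for reproducibility
x <- rnorm(1000)

hist(x,
     breaks = 30,                              # suggest roughly 30 bins
     freq = FALSE,                             # plot densities instead of counts
     main = "1000 draws from a standard normal",
     xlab = "Value")
curve(dnorm(x), add = TRUE)                    # overlay the theoretical density
```

With 1000 draws the bars should follow the bell-shaped curve fairly closely; with a much smaller sample the match is usually rougher.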

11.3 Generate samples

It is often useful to draw a random sample from a given dataset. We can do this with the sample() function. Try these:

sample(x = 1:10, size = 5, replace = FALSE)
sample(x = 1:10, size = 5, replace = TRUE)

Why can ‘replace’ be left out here?

sample(x = c("Pearson", "Tukey", "Galton", "Fisher"), size = 3)
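
Two defaults of sample() are worth seeing in action: replace is FALSE unless stated otherwise, and size defaults to the length of x. The sketch below illustrates both; the seed and variable names are assumptions made for this example:

```r
set.seed(2024)                                 # arbitrary seed for reproducibility

# With only x supplied, size defaults to length(x): a random permutation
shuffled <- sample(1:10)

# Asking for more values than exist fails unless replace = TRUE
too_many <- tryCatch(
  sample(1:3, size = 5),
  error = function(e) "error: cannot draw 5 values from 3 without replacement"
)

with_replacement <- sample(1:3, size = 5, replace = TRUE)  # repeats are allowed

print(shuffled)
print(too_many)
print(with_replacement)
```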

11.4 If statements

Using | (or):

x <- 5

if (x > 3 | x == 3) {
  print("at least one of the conditions is true")
}
[1] "at least one of the conditions is true"

Using & (and):

a <- 200
b <- 33

if (b < a & a > 150) {
  print("both conditions are true")
} else {
  print("at least one condition is false")
}
[1] "both conditions are true"
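
The operators & and | compare vectors element by element. Inside if (), where a single TRUE or FALSE is expected, R also offers && and ||, which evaluate from left to right and stop as soon as the result is known. A brief sketch with made-up values:

```r
v <- c(1, 5, 10)
in_range <- v > 3 & v < 8          # element-wise: FALSE TRUE FALSE

x <- 5
if (x > 3 && x < 8) {              # && expects single values on each side
  message_text <- "x is between 3 and 8"
} else {
  message_text <- "x is outside that range"
}

print(in_range)
print(message_text)
```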

With else:

a <- 200
b <- 33

if (b > a) {
  print("b is greater than a")
} else {
  print("b is not greater than a")
}
[1] "b is not greater than a"

Nested if statements:

x <- 41

if (x > 10) {
  print("Above ten")
  if (x > 20) {
    print("and also above 20!")
  } else {
    print("but not above 20.")
  }
} else {
  print("ten or below.")
}
[1] "Above ten"
[1] "and also above 20!"
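
When the branches are mutually exclusive, an else if chain is often easier to read than nesting. A sketch with a different value of x (15, chosen for illustration) and a hypothetical variable label:

```r
x <- 15

if (x > 20) {
  label <- "above 20"
} else if (x > 10) {
  label <- "above 10 but not above 20"
} else {
  label <- "10 or below"
}

print(label)
```

Only the first branch whose condition is TRUE runs, so the order of the conditions matters.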

11.5 While loops

i <- 1
while (i < 6) {
  print(i)
  i <- i + 1
}

With the break statement, we can stop the loop even if the while condition is still TRUE:

i <- 1
while (i < 6) {
  print(i)
  i <- i + 1
  if (i == 4) {
    break
  }
}
[1] 1
[1] 2
[1] 3
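
While break leaves the loop entirely, the related next statement (not used elsewhere in this chapter) skips only the rest of the current iteration. A minimal sketch that collects the values in a hypothetical vector kept instead of printing them inside the loop:

```r
i <- 0
kept <- c()                  # values that were not skipped

while (i < 6) {
  i <- i + 1                 # increment first so that next cannot stall the loop
  if (i == 3) {
    next                     # skip 3 and move on to the next iteration
  }
  kept <- c(kept, i)
}

print(kept)                  # 1 2 4 5 6
```

Placing the increment before next matters here: if i were increased only after the check, the loop would never get past 3.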

11.6 For loops

for (x in 1:10) {
  print(x)
}

Loop over the faces of a die and print each one:

dice <- c(1, 2, 3, 4, 5, 6)

for (x in dice) {
  print(x)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
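
Loops are often used to accumulate a result rather than only to print values. The sketch below, using a hypothetical variable total, loops over positions with seq_along() and adds up the faces:

```r
dice <- c(1, 2, 3, 4, 5, 6)
total <- 0

for (i in seq_along(dice)) {   # i takes the positions 1, 2, ..., 6
  total <- total + dice[i]
}

print(total)                   # 21, the same as sum(dice)
```

In practice sum(dice) is the shorter way to get this number; the loop is shown only to make the accumulation pattern explicit.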

11.7 Writing custom functions

To create a function, use the function() keyword:

my_function <- function() { # create a function with the name my_function
  print("Hello World!")
}

my_function() # call the function named my_function
[1] "Hello World!"

A function can also take arguments and return a value with return():

add_fun <- function(x, y) {
  a <- x + y
  return(a)
}

add_fun(3, 4)
[1] 7
add_fun(add_fun(2, 2), add_fun(3, 3))
[1] 10
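
Two further features of R functions are worth a short illustration: arguments can have default values, and a function returns the value of its last evaluated expression even without an explicit return(). The function power_fun below is a hypothetical example, not part of the original text:

```r
power_fun <- function(x, exponent = 2) {
  x^exponent                   # the last evaluated expression is returned
}

power_fun(3)                   # uses the default exponent: 9
power_fun(2, exponent = 3)     # overrides the default: 8
```

An explicit return() is mainly needed to leave a function early, for example from inside an if statement.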

11.8 Reflection questions

  1. Why is it important to be able to generate random numbers in programming and data analysis?

  2. Why might you want to visualize randomly generated values with a histogram?

  3. What does the replace argument in the sample() function control? Can you think of a real-world scenario where sampling with replacement makes sense?

  4. What’s the difference between & and | in conditional statements? When would you use each?

  5. What is the purpose of the break statement? How can it prevent logic errors or infinite loops?

  6. Why might you want to create a custom function instead of repeating the same code multiple times?

  7. What are the benefits of naming your functions and arguments clearly?

  8. In your own words, describe what return() does in a function. Why might a function not need it?

11.9 Exercises

  1. What does rnorm() generate and why might a researcher use it?

  2. Use rnorm() to generate 100 random values from a normal distribution with a mean of 50 and a standard deviation of 10. What are the mean and standard deviation of the sample you generated?

  3. Create a histogram of a normal distribution with hist() and interpret the shape. Use rnorm(1000, 0, 1).

  4. What is the difference between sampling with and without replacement? Provide an example.

  5. Use sample() to draw 5 values from the vector c(1, 2, 3, 4, 5) with replacement. Then do it again without replacement. What is the difference?

  6. Create a sample of 10 observations from the categories “red”, “blue”, and “green” using sample(). Set the probability of red to be 20%, blue 50%, and green 30%.

  7. Use sample() to simulate rolling a six-sided die 20 times with replacement. Count how many times 6 appears. Use set.seed() to make the results reproducible.

  8. Use a pipe %>% with a vectorized if statement such as ifelse() inside dplyr::mutate() to assign “pass” if grade > 50, otherwise “fail”.

  9. Use the mutate() function in dplyr to add a new column that contains the square of another column in a data frame. Fix the code below:

library(dplyr)
df <- data.frame(id = 1:5, value = c(2, 4, 6, 8, 10))
df %>% mutate(new_column = value^2)

  10. Compare while and for loops. When would you use each?

  11. Use an if statement with & to check if a number is between 50 and 100, inclusive.

  12. Write an if-else block that checks if a number is even or odd.

  13. Write a nested if statement that checks if a value is positive, negative, or zero.

  14. Explain the purpose of curly brackets in R if statements and loops. Provide a short example.

  15. Write a while loop that prints numbers from 1 to 5.

  16. Write a for loop that prints the square of each number in c(1, 2, 3, 4).

  17. What does break do in a loop? Demonstrate by stopping a loop when a value reaches 3.

  18. What makes a custom function useful? Give a short real-life example.

  19. Write a custom function that takes a number and returns “Positive”, “Negative”, or “Zero”.

  20. Debug this code to calculate the sum of the first 10 even numbers:

x <- 2
sum <- 0
for (i in 1:10) {
  sum <- sum + i
}
print(sum)

  21. Debug the following code to correctly print the factorial of a number:

factorial_function <- function(n) {
  result <- 1
  for (i in 1:n) {
    result <- result * i
  }
  return(result)
}
factorial_function("5")

Solution:

factorial_function <- function(n) {
  # Reject non-numeric input such as "5" before doing any arithmetic
  if (!is.numeric(n)) {
    return("Input must be numeric")
  }
  result <- 1
  # seq_len(n) is empty when n is 0, so factorial_function(0) returns 1
  for (i in seq_len(n)) {
    result <- result * i
  }
  return(result)
}
factorial_function(5)  # Test with a numeric value

  22. Create a function that returns the square of a number. Test it on 6.

  23. Use rnorm() to create 10 values with mean 10 and sd 5, then round them all to the nearest whole number.

  24. Fix the code to properly round a vector of numbers to two decimal places:

numbers <- c(3.14159, 2.71828, 1.61803)
round_numbers <- round(numbers, digits = 1)
print(round_numbers)

  25. Use sample() to pick 4 fruits from c("apple", "banana", "cherry", "date", "fig") without replacement.

  26. Write a loop that prints whether each number in 1:5 is even or odd.

  27. Generate a reproducible normal sample of 5 values using set.seed().

  28. Explain why set.seed() is useful when using random functions.

  29. Use seq() to create a sequence from 5 to 50 in steps of 5.

  30. Fix this code to generate a vector of even numbers from 2 to 20 using seq():

even_numbers <- seq(2, 20, by = 3)
print(even_numbers)

  31. The following code tries to generate a sequence of numbers in reverse order but doesn’t work. Fix it:

seq(10, 1, by = 1)

  32. Use ifelse() to label numbers 1 to 5 as “small” if less than 3, otherwise “big”.

  33. Write a custom function that adds two numbers together. Try 7 and 9.

  34. What is the purpose of using return() in a function?

  35. Create a 3x3 matrix with numbers 1 to 9. Replace all values greater than 5 with 0 using ifelse().

  36. Use a for loop to calculate and print the sum of numbers from 1 to 100.

  37. Explain the purpose of mutate() in dplyr. How does it work with ifelse()?

  38. Use mutate() and sample() to add a new column called “color” to a dataframe with random values from “red”, “blue”, “green”.

  39. Use a for loop with break to stop when the sum of elements reaches 10.

  40. Create a function that checks if a number is divisible by 3 and 5. Return “FizzBuzz” if true.