21 July 2016

Functions

Why write functions?

  • Repetitive processes

  • Nothing may exist for our purpose

  • We can customise existing functions
    • We may need to tweak the output of a function
    • We may only need part of a larger function
    • We can customise for increased speed

The Basics of a Function

A function has three main parts

  1. The formals() - The arguments supplied to a function
  2. The body() - This is the code inside the function
  3. The environment() - Where the function was created

The Basics of a Function

formals()

formals(sd)
## $x
## 
## 
## $na.rm
## [1] FALSE

The Basics of a Function

body()

body(sd)
## sqrt(var(if (is.vector(x) || is.factor(x)) x else as.double(x), 
##     na.rm = na.rm))

The Basics of a Function

environment()

environment(sd)
## <environment: namespace:stats>

This is telling us that sd() comes from the package stats

The Basics of a Function

f <- function(x) {
  x + 1
}

What would the formals() be?

What would the body() be?

What would the environment() be?
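
One way to check your answers, as a minimal sketch. It assumes f was defined at the top level of an interactive session, in which case environment(f) reports the global environment; if f were created somewhere else, that environment would be reported instead.

```r
f <- function(x) {
  x + 1
}

names(formals(f))   # the names of the arguments f accepts
body(f)             # the code between the braces
environment(f)      # the environment in which f was created
```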

The Basics of a Function

Let's use our function

f(1:5)
## [1] 2 3 4 5 6

Lexical Scoping

Where does the function look for its values?

x <- 3
f <- function(x) {
  x + 1
}

What would we expect to see if we don't provide x?

f()
## Error in f(): argument "x" is missing, with no default

Lexical Scoping

Where does the function look for its values?

x <- 3
f <- function(y) {
  x + 1
}

Now what would we expect to see if we don't provide y?

f()
## [1] 4

Lexical Scoping

  • If an object in the function is declared as an argument:
    \(\implies\) The function uses the value supplied in the function call
  • If an object in the function is NOT declared as an argument:
    \(\implies\) The function looks inside the function first, then
    \(\implies\) looks in the Environment where the function was created (one level up).
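
The rules above can be sketched in a few lines. The function names addOne, useFree, shadow and makeAdder are hypothetical and only used for illustration.

```r
x <- 3

# x is an argument: the value supplied in the call is used
addOne <- function(x) {
  x + 1
}

# x is NOT an argument: R looks inside the function, finds nothing,
# then looks in the environment where useFree was created
useFree <- function(y) {
  x + 1
}

# x is created inside the function: that local copy is found first
shadow <- function() {
  x <- 100
  x + 1
}

# The lookup follows where a function was created, not where it is called
makeAdder <- function() {
  x <- 50
  function() x + 1
}
adder <- makeAdder()

addOne(10)    # uses the supplied 10
useFree(10)   # ignores y, uses the outer x
shadow()      # uses the local x
adder()       # uses the x inside makeAdder
x             # the outer x is unchanged
```

Note that useFree() quietly depends on an object outside the function, which is usually something to avoid in our own functions.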

Writing Our Own Functions

Our First Function

Place the following in a new script called myMean.R

myMean <- function(x){
  total <- sum(x)
  n <- length(x)
  mn <- total / n
  return(mn)
}

Select all, and hit Ctrl + Enter to send it to your Environment

What are the formals()?

Our First Function

Let's call our function

testVec <- 1:10
myMean(testVec)
## [1] 5.5

  • Note that we didn't have to enter x = testVec when we called the function
  • We could have just used mn as the last line
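
Both notes can be checked directly. In this sketch myMeanImplicit is a hypothetical name for the version that ends with mn instead of return(mn).

```r
myMean <- function(x){
  total <- sum(x)
  n <- length(x)
  mn <- total / n
  return(mn)
}

# Hypothetical variant: the value of the last line is returned
myMeanImplicit <- function(x){
  total <- sum(x)
  n <- length(x)
  mn <- total / n
  mn
}

testVec <- 1:10
myMean(testVec)            # argument matched by position
myMean(x = testVec)        # argument matched by name
myMeanImplicit(testVec)    # no explicit return()
```

All three calls should give the same result.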

Our First Function

Writing For Idiots

What if an idiot called our function?

  • In 6 months, I'll have no idea why I wrote any function
  • What problems might arise?
  • How can we idiot-proof this function?

Our First Function

Writing For Idiots

What if an idiot called our function?

  1. We could comment our code
  2. We could think like an idiot
    • Introduce checking steps
    • Deal with any problems we can think of
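
As a rough sketch of what checking steps could look like: myMeanChecked is a hypothetical name, and the two checks shown are only examples of problems we might think of, not a complete list.

```r
myMeanChecked <- function(x){
  # Checking steps: stop early with a clear message
  if (!is.numeric(x)) {
    stop("x must be a numeric vector")
  }
  if (length(x) == 0) {
    stop("x must contain at least one value")
  }
  total <- sum(x)
  n <- length(x)
  mn <- total / n
  return(mn)
}

myMeanChecked(1:10)
try(myMeanChecked(letters))   # try() lets the script carry on after the error
```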

Our First Function

Writing For Idiots

Add some comments

myMean <- function(x){
  total <- sum(x) # Get the total of all elements
  n <- length(x) # Count how many elements there are
  mn <- total / n # Calculate the mean
  return(mn)
}

NB: The comments are not reproduced by body(myMean)
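
A quick way to see this for yourself, as a sketch. The object name bodyText is just for illustration. In an interactive session, printing myMean itself will typically still show the comments, because R keeps the original source text alongside the function.

```r
myMean <- function(x){
  total <- sum(x) # Get the total of all elements
  n <- length(x) # Count how many elements there are
  mn <- total / n # Calculate the mean
  return(mn)
}

# Turn the body into text so we can search it
bodyText <- deparse(body(myMean))
bodyText

# Is there a comment character anywhere in the body?
any(grepl("#", bodyText, fixed = TRUE))
```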

Our First Function

Writing For Idiots

What might I do in 6 months which is not intelligent?

The biggest issue will be missing values.

Our First Function

Debugging

Let's create a vector with missing values

testVec <- c(NA, 1:10)
myMean(testVec)
## [1] NA

How can we figure out where the problem is?

We can use the command browser() to stop the function, and look inside its Environment

Our First Function

Debugging

  1. Add the extra line to your code
  2. Make sure you send it to your Environment

myMean <- function(x){
  total <- sum(x) # Get the total of all elements
  n <- length(x) # Count how many elements there are
  browser()
  mn <- total / n # Calculate the mean
  return(mn)
}

Our First Function

Debugging

The next time we call the function:

  • It will stop when it hits browser()
  • A new tab showing the function's code will open in your Script Window

Our First Function

Debugging

We are now in the internal Environment for the function

  • The internal objects total and n are shown in the Environment Tab
  • Our supplied vector is now called x
  • We can step through the code using Ctrl + Enter (we can't edit this script)
  • We exit the browser by hitting Stop (the red button)
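
The same steps can also be typed at the Browse[1]> prompt in the Console. The sketch below wraps browser() in if (interactive()), which is an assumption added here so the code still runs when sourced as a script; the commands listed in the comments are the standard browser commands.

```r
myMean <- function(x){
  total <- sum(x) # Get the total of all elements
  n <- length(x) # Count how many elements there are
  # Assumption: only pause when someone is there to type commands
  if (interactive()) browser()
  mn <- total / n # Calculate the mean
  return(mn)
}

# At the Browse[1]> prompt you can type:
#   ls()       list the objects inside the function
#   total      print an object by typing its name
#   print(n)   print n explicitly, because n on its own means "next line"
#   n          run the next line
#   c          continue to the end of the function
#   Q          quit the browser and the function
myMean(c(NA, 1:10))
```

Our function has an internal object called n, so print(n) is the safe way to look at it while browsing.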

Our First Function

Debugging

Where have things gone wrong?

  1. The line total <- sum(x) has clearly failed: total is NA
  2. Is n the correct value?
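
Both questions can be checked outside the function too. This is a small sketch using the same test vector.

```r
testVec <- c(NA, 1:10)

sum(testVec)                 # a single NA makes the whole total NA
length(testVec)              # length() counts the NA as an element
sum(testVec, na.rm = TRUE)   # the total we actually wanted
sum(!is.na(testVec))         # the number of non-missing values
```

So total is wrong, and n is also one larger than the number of values we want to average.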

How can we remove the NA values?

CLICK THE STOP BUTTON!!!

Our First Function

Debugging

  • Let's add an argument: na.rm
  • We'll give it a default value: na.rm = TRUE

Our First Function

Debugging

myMean <- function(x, na.rm = TRUE){
  if (na.rm){
    x <- x[!is.na(x)]
  }
  total <- sum(x) # Get the total of all elements
  n <- length(x) # Count how many elements there are
  browser()
  mn <- total / n # Calculate the mean
  return(mn)
}

We only have to specify this if we want na.rm = FALSE

Our First Function

Debugging

Call the function again:

myMean(testVec)

What did x[!is.na(x)] do?
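
Breaking the expression into its pieces may help. This sketch runs each step on the same vector, outside the function.

```r
x <- c(NA, 1:10)

is.na(x)        # TRUE wherever a value is missing
!is.na(x)       # flipped: TRUE wherever a value is present
x[!is.na(x)]    # keep only the elements marked TRUE

length(x)             # before: the NA is counted
length(x[!is.na(x)])  # after: the NA has gone
```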

Our First Function

Debugging

Now that we've protected our future selves, we can remove the browser() line (here it is commented out)

myMean <- function(x, na.rm = TRUE){
  if (na.rm){
    x <- x[!is.na(x)]
  }
  total <- sum(x) # Get the total of all elements
  n <- length(x) # Count how many elements there are
  # browser()
  mn <- total / n # Calculate the mean
  return(mn)
}
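
A short usage sketch for the finished function, compared against base R's mean(). Keep in mind that mean() defaults to na.rm = FALSE, whereas our function defaults to na.rm = TRUE.

```r
myMean <- function(x, na.rm = TRUE){
  if (na.rm){
    x <- x[!is.na(x)]
  }
  total <- sum(x) # Get the total of all elements
  n <- length(x) # Count how many elements there are
  mn <- total / n # Calculate the mean
  return(mn)
}

testVec <- c(NA, 1:10)
myMean(testVec)                  # missing values are dropped by default
myMean(testVec, na.rm = FALSE)   # missing values are kept
mean(testVec, na.rm = TRUE)      # base R, for comparison
```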