6 Functions
A function is a piece of code that takes input arguments, performs a specific task, and returns its output.
We have seen quite a few built-in functions of R, such as c(), length(), runif(), mean(), and sum().
However, we are not limited to built-in functions. We can define our own functions, each one typically dedicated to a single computing task.
A first example
Let us define a function that returns the square of a given number.
square <- function(x) {
  return(x^2)
}
We call the function by providing the input value as an argument. The function returns a value, which we can store in a variable.
foo <- square(3)
foo
[1] 9
If we type a function’s name and press Enter, we get back the definition of the function.
square
function (x)
{
    return(x^2)
}
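Because a function is an ordinary R object, its parts can also be inspected separately. The short sketch below uses the base R functions formals() and body() on the square() function defined above; it is only an illustration, not something the rest of the chapter depends on.

```r
square <- function(x) {
  return(x^2)
}

names(formals(square))  # "x": the names of the arguments
body(square)            # the code block that forms the function body
class(square)           # "function"
```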
General syntax of function definition
A function can take any number of input arguments. It performs some computations in its body and generates a return value. The returned value can be any R object (number, string, vector, list, data frame, etc.).
<function_name> <- function([<argument_1>, <argument_2>, ...]) {
  <statements>
  return(<return_value>)
}
The return() statement is optional. Without it, the function returns the value of the last expression evaluated in its body. So the function square() can also be defined as:
square <- function(x) {
  print(x)
  x^2
}
sq3 <- square(3)
[1] 3
sq3
[1] 9
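The two styles can be combined. As a minimal sketch, the hypothetical function safe_sqrt() below (a name chosen for this illustration only) uses return() for an early exit and relies on the last expression otherwise.

```r
# Hypothetical helper: square root that yields NA for negative input
safe_sqrt <- function(x) {
  if (x < 0) {
    return(NA)  # explicit return: leave the function early
  }
  sqrt(x)       # last expression: returned implicitly
}

safe_sqrt(9)    # 3
safe_sqrt(-4)   # NA
```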
The job of the braces {} is to combine several statements into one.
As we have only one statement here, the braces can be omitted and the function can be written in one line.
square <- function(x) x^2
square(3)
[1] 9
Function arguments
A function can be defined with any number of arguments.
f <- function(x,y,z){
  return(x + y*z)
}
f(1,2,3)
[1] 7
It is possible to change the order of arguments by using the argument names explicitly:
f(z=3,x=1,y=2)
[1] 7
You can even omit some names, and the unnamed arguments will be matched in order.
f(z=3,1,2)
[1] 7
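To see which value ends up in which argument, it can help to print the matching explicitly. The sketch below uses a hypothetical function show_args(), introduced only for this illustration: named arguments are matched first, and the remaining unnamed values fill the remaining arguments in order.

```r
# Hypothetical helper that reports how its arguments were matched
show_args <- function(x, y, z) {
  paste0("x=", x, ", y=", y, ", z=", z)
}

show_args(1, 2, 3)      # "x=1, y=2, z=3"
show_args(z = 3, 1, 2)  # "x=1, y=2, z=3"
show_args(y = 1, 2, 3)  # "x=2, y=1, z=3"
```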
Return values
The return value of the function can be any R object, such as a number, a vector, a matrix, a list, etc.
sumdiff <- function(x,y){
  return( c(x+y, x-y) )
}
sumdiff(5,8)
[1] 13 -3
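When a function returns several values, a list with named components can make the result easier to read than a plain vector. The following is a sketch of that variant; the name sumdiff_list() is made up for this example.

```r
# Hypothetical variant of sumdiff() that labels its two results
sumdiff_list <- function(x, y) {
  list(sum = x + y, diff = x - y)
}

res <- sumdiff_list(5, 8)
res$sum   # 13
res$diff  # -3
```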
Functions returning functions
A function itself is an R object, therefore we can easily write functions that return functions.
Here is a function that returns a power function with any order we like:
powerfun <- function(p){
  return(function(y){return(y^p)})
}
# Alternatively:
# powerfun <- function(p) function(x) x^p
Now we can use this function to generate other functions:
sq <- powerfun(2)
cube <- powerfun(3)
sq
function (y)
{
    return(y^p)
}
<environment: 0x5c68f1cc1cb8>
Evaluate both functions at the input value 5:
sq(5)
[1] 25
cube(5)
[1] 125
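Each generated function remembers the value it was created with, because that value is stored in the environment shown in the printout above. A minimal sketch of the same idea, using a hypothetical make_scaler() function and the base R functions force() and environment():

```r
# Hypothetical function factory: returns a function that multiplies by k
make_scaler <- function(k) {
  force(k)           # evaluate k now, so the returned function captures its value
  function(x) k * x
}

twice  <- make_scaler(2)
thrice <- make_scaler(3)

twice(10)             # 20
thrice(10)            # 30
environment(twice)$k  # 2: the value stored with the function
```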
Functions with side effects
Sometimes we call a function not for its return value, but for its side effect, such as generating a plot.
plot_random_walk <- function(n){
  x <- cumsum(sample(c(-1,1), n, replace=TRUE))
  plot(x, type="o", xlab="step number", ylab="Distance from origin")
  title("A random walk")
}
set.seed(7652)
plot_random_walk(100)
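Printing to the console is another common side effect. In the sketch below, the hypothetical function greet() is called only for the text it prints; its return value is just the NULL that cat() produces.

```r
# Hypothetical function used only for its side effect (printing)
greet <- function(name) {
  cat("Hello,", name, "\n")  # side effect: text written to the console
}

result <- greet("R")  # prints: Hello, R
is.null(result)       # TRUE: there is no useful return value
```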
Vectorization of functions
The simple function square() defined above happens to work with vector arguments without any modification, because the returned expression x^2 is valid for both single numbers and vectors.
square <- function(x) x^2
square(c(1,2,3,4,5))
[1]  1  4  9 16 25
However, not every function works with vector arguments as written. For example, consider a function that returns the sum of the integers from 1 up to its argument:
addupto <- function(n) sum(1:n)
addupto(10)
[1] 55
When we call this function with a vector argument, only the first element is used, and a warning is issued:
addupto(c(10,20)) # Internally it tries sum(1:c(10,20))
Warning in 1:n: numerical expression has 2 elements: only the first used
[1] 55
If you want this function to work with vector input, a common way in R is to use the built-in sapply() function, which applies a function to each element of a vector.
sapply(c(10,20, 30, 40, 50), addupto)
[1]   55  210  465  820 1275
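Two other routes lead to the same result, sketched here under the assumption that addupto() is defined as above. The base R function Vectorize() wraps a scalar function so that it accepts vectors, and for this particular task the closed-form formula n(n+1)/2 is built from arithmetic that is already vectorized. The names addupto_vec and addupto_formula are chosen for this illustration only.

```r
addupto <- function(n) sum(1:n)

# Option 1: wrap the scalar function with Vectorize()
addupto_vec <- Vectorize(addupto)
addupto_vec(c(10, 20, 30))      # 55 210 465

# Option 2: use a formula made of vectorized arithmetic
addupto_formula <- function(n) n * (n + 1) / 2
addupto_formula(c(10, 20, 30))  # 55 210 465
```

The second option avoids looping altogether, but it is only available when such a formula exists.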
Default arguments
When you define a function, you can set some of the arguments to default values. Then you don’t have to specify them at each call.
f <- function(capital, interest_rate=0.1) {
  capital * (1+interest_rate)
}
Without specifying the interest_rate value, 0.1 is assumed.
f(1000)
[1] 1100
But if you want to change it, you can provide it as an extra argument.
f(1000, 0.2)
[1] 1200
Calling the function with argument names is usually clearer for the reader.
f(capital = 1000, interest_rate = 0.2)
[1] 1200
You can change the order of the arguments when you use argument names.
f(interest_rate=0.2, capital=1000)
[1] 1200
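A function can have several default arguments, and a call can override any subset of them by name. The sketch below extends the idea with a hypothetical years argument; the function name grow() and the compounding formula are assumptions made for this example.

```r
# Hypothetical extension: compound the interest over a number of years
grow <- function(capital, interest_rate = 0.1, years = 1) {
  capital * (1 + interest_rate)^years
}

grow(1000)             # one year at 10%: about 1100
grow(1000, years = 2)  # two years at 10%: about 1210
grow(1000, 0.2, 2)     # two years at 20%: about 1440
```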
Scope of variables
- The value of a variable defined outside a function (a global variable) can be seen inside a function.
- However, a variable defined inside a function block is not recognized outside of it.
- In the example below, we say that the scope of the variable b is limited to the function f().
a <- 5 # a global variable
f <- function(){
  b <- 10 # a local variable
  cat("inside f(): a =",a,"b =",b,"\n")
}
f()
inside f(): a = 5 b = 10
cat("outside f(): a =",a," ")
outside f(): a = 5
cat("b =",b) # raises an error
Error: object 'b' not found
Inside the function, a local variable masks a global variable with the same name. The global variable itself is left unchanged.
a <- 5 # a global variable
cat("before f(): a =",a,"\n")
before f(): a = 5
f <- function(){
  a <- 10 # a local variable
  cat("inside f(): a =",a,"\n")
}
f()
inside f(): a = 10
cat("after f(): a =",a)
after f(): a = 5
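The same rules apply between nested functions: a function defined inside another function can see the variables of the enclosing one. A minimal sketch, with all names (outer_fun, inner_fun, scope_demo_y) invented for this illustration:

```r
outer_fun <- function() {
  scope_demo_y <- 2                          # local to outer_fun()
  inner_fun <- function() scope_demo_y * 10  # found in the enclosing function
  inner_fun()
}

outer_fun()             # 20
exists("scope_demo_y")  # FALSE: the variable is not visible outside outer_fun()
```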
Assigning values to upper-level variables
Although the values of variables defined in upper levels are available in lower levels, they cannot be modified in a lower level by ordinary assignment, because an assignment with <- only creates a local variable with the same name.
Using the superassignment operator <<-, it is possible to assign to a variable in a higher level.
a <- 5
cat("before f(): a =",a,"\n")
before f(): a = 5
f <- function(){
  a <<- 10
  cat("inside f(): a =",a,"\n")
}
f()
inside f(): a = 10
cat("after f(): a =",a)
after f(): a = 10
However, this is not recommended in general. It can cause subtle errors that are difficult to find, and you almost never need it.
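One way such an error can arise is sketched below, with hypothetical names. If the variable on the left of <<- does not exist in any enclosing level, R silently creates a new global variable, so a simple typo goes unnoticed.

```r
# Hypothetical example: the author means to update `total`,
# but mistypes the name as `totl`.
total <- 0
add_to_total <- function(x) {
  totl <<- total + x  # typo: silently creates a new global variable `totl`
}

add_to_total(5)
total           # 0: the intended variable was never updated
exists("totl")  # TRUE: a stray global variable now exists
```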
To modify a global variable, the most transparent way is to assign the function output to it explicitly.
a <- 5
cat("before f(): a =",a,"\n")
before f(): a = 5
f <- function(x) {x+5}
a <- f(a)
cat("after f(): a =",a)
after f(): a = 10
Unspecified arguments with ...
Some functions take an unlimited number of arguments, e.g. c().
c(1,2,3)
[1] 1 2 3
c(4,2,6,1,3,5,1)
[1] 4 2 6 1 3 5 1
The c() function is defined with an ellipsis (three dots) in its argument list, as its help page shows.
help(c)
The ellipsis has two use cases:
- Write a function that takes any number of arguments (like c() or sum()).
- Pass some arguments on to another function that is called inside the current function.
Let’s modify our function for generating and plotting a random walk. It accepts some unspecified arguments represented by the ellipsis, and passes them to plot():
plot_random_walk <- function(n, ...){
  x <- cumsum(sample(c(-1,1), n, replace=TRUE))
  plot(x, type="o", ...)
}
We can then call the function by specifying only the number of points:
options(repr.plot.width=10, repr.plot.height=4)
plot_random_walk(100)
or by specifying plot parameters:
plot_random_walk(100,
                 pch=4,
                 col="red",
                 main="A random walk",
                 xlab="step number",
                 ylab="displacement")
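Forwarding arguments with the ellipsis is not specific to plotting. As a small sketch that can be checked at the console, the hypothetical function shout() below passes everything it receives to the built-in paste() and converts the result to upper case.

```r
# Hypothetical wrapper: forwards all of its arguments to paste()
shout <- function(...) {
  toupper(paste(...))  # named and unnamed arguments are passed on unchanged
}

shout("a", "b")             # "A B"
shout("a", "b", sep = "-")  # "A-B"
```

The wrapper does not need to know about the sep argument; it is simply handed over to paste().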
The arguments passed with the ellipsis can be converted to a vector, so we can process them inside the function.
diff <- function(...) {
  # returns the difference between the first and the last argument
  arguments <- c(...)
  cat("Number of arguments = ",length(arguments))
  arguments[length(arguments)] - arguments[1] # last argument minus first argument
}
diff(1,4,2)
Number of arguments = 3
[1] 1
diff(1,4,2,6,3,1)
Number of arguments = 6
[1] 0
Ellipsis arguments can have arbitrary names, and can be converted to a list object (more on lists later).
f <- function(...){
  args <- list(...)
  print(args)
}
f(a=1, b=3, foo=7654)
$a
[1] 1
$b
[1] 3
$foo
[1] 7654
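Once the arguments are in a list, their names and values can be processed like any other data. The following sketch uses a hypothetical function describe_args() that builds a single string out of the names and values it receives; it assumes every argument is named and holds a single value.

```r
# Hypothetical helper: summarize named arguments as "name=value" pairs
describe_args <- function(...) {
  args <- list(...)
  paste0(names(args), "=", unlist(args), collapse = ", ")
}

describe_args(a = 1, b = 3)  # "a=1, b=3"
```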
Exercises
Write a function with the name FtoC that takes a temperature measurement in degrees Fahrenheit, and returns the equivalent value in degrees Celsius. Make sure that your function works with vector input, too.
Write a function with the name bmi that takes two arguments, height and weight, and returns the body-mass index calculated with these argument values. The function should work with vector input, too.
Write a function named range that takes a vector of numbers, and returns the difference between its maximum and minimum elements. Test your function with some randomly generated vectors.