6 Functions
A function is a piece of code that takes input arguments, performs a specific task, and returns its output.
We have seen quite a few built-in functions of R, such as c(), length(), runif(), mean(), or sum().
However, we are not limited to built-in functions. We can define our own functions to carry out specific computing tasks.
A first example
Let us define a function that returns the square of a given number.
square <- function(x) {
    return(x^2)
}
We call the function by providing the input value as an argument. The function returns a value, which we can store in a variable.
foo <- square(3)
foo
[1] 9
If we type a function’s name and press Enter, we get back the definition of the function.
square
function (x) 
{
    return(x^2)
}
General syntax of function definition
A function can take any number of input arguments. It performs some computations in its body, and generates a return value. The returned value can be any R object (number, string, vector, list, data frame, etc.).
<function_name> <- function([<argument_1>, <argument_2>, ...]) {
    <statements>
    return(<return_value>)
}
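As a concrete instance of this template, here is a minimal sketch of a two-argument function with more than one statement in its body. The name hypotenuse and its arguments a and b are illustrative and do not come from the text above.

```r
# Hypothetical example: length of the hypotenuse of a right triangle
hypotenuse <- function(a, b) {
    sum_of_squares <- a^2 + b^2   # intermediate computation in the body
    return(sqrt(sum_of_squares))  # value handed back to the caller
}

hypotenuse(3, 4)   # 5
```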
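The return statement is optional. Without it, the function returns the value of the last expression evaluated in its body. So the function square() can also be defined as follows (the print() call is there only to show that earlier statements do not determine the return value):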
square <- function(x) {
    print(x)
    x^2
}
sq3 <- square(3)
[1] 3
sq3
[1] 9
The job of the braces {} is to combine several statements into one.
Once the print() call is dropped, only one statement remains, so the braces can be omitted and the function can be written in one line.
square <- function(x) x^2
square(3)
[1] 9
Function arguments
A function can be defined with any number of arguments.
f <- function(x,y,z){
    return(x + y*z)
}
f(1,2,3)
[1] 7
It is possible to change the order of arguments by using the argument names explicitly:
f(z=3,x=1,y=2)
[1] 7
You can even name only some of the arguments. The unnamed ones are then matched, in order, to the remaining parameters.
f(z=3,1,2)
[1] 7
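Because f() gives 7 in all three calls above, it can help to see a case where the matching is visible in the result. The following sketch uses a hypothetical function power(), whose result depends on which value ends up in which argument.

```r
# Hypothetical helper: the result depends on which value goes to which argument
power <- function(base, exponent) {
    base^exponent
}

power(2, 3)                    # positional: base = 2, exponent = 3, gives 8
power(exponent = 3, base = 2)  # named, in a different order, still 8
power(exponent = 2, 3)         # exponent is named, so 3 is matched to base: 9
```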
Return values
The return value of the function can be any R object, such as a number, a vector, a matrix, a list, etc.
sumdiff <- function(x,y){
    return( c(x+y, x-y) )
}
sumdiff(5,8)
[1] 13 -3
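When a function returns several values, a named list can make the result easier to read than a plain vector. The variant below, sumdiff_list(), is a hypothetical illustration; lists themselves are covered later.

```r
# Hypothetical variant of sumdiff() that returns a named list
sumdiff_list <- function(x, y) {
    list(sum = x + y, difference = x - y)
}

result <- sumdiff_list(5, 8)
result$sum         # 13
result$difference  # -3
```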
Functions returning functions
A function itself is an R object, therefore we can easily write functions that return functions.
Here is a function that returns a power function with any exponent we like:
powerfun <- function(p){
    return(function(y){return(y^p)})
}
# Alternatively:
# powerfun <- function(p) function(x) x^p
Now we can use this function to generate other functions:
sq <- powerfun(2)
cube <- powerfun(3)
sq
function (y) 
{
    return(y^p)
}
<environment: 0x5c68f1cc1cb8>
Evaluate functions with input value 5.
sq(5)
[1] 25
cube(5)
[1] 125
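The <environment: ...> line in the printed definition hints at how this works: each function returned by powerfun() keeps a reference to the environment in which it was created, and that is where it finds its own value of p. The sketch below redefines powerfun() in its one-line form and inspects that value with the base R function environment().

```r
powerfun <- function(p) function(y) y^p

sq <- powerfun(2)
cube <- powerfun(3)

# Each generated function carries its own value of p
environment(sq)$p    # 2
environment(cube)$p  # 3
```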
Functions with side effects
Sometimes we call a function not for its return value, but for its side effect, such as generating a plot.
plot_random_walk <- function(n){
    x <- cumsum(sample(c(-1,1), n, replace=TRUE))
    plot(x, type="o", xlab="step number", ylab="Distance from origin")
    title("A random walk")
}
set.seed(7652)
plot_random_walk(100)
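A function called mainly for its side effect still returns a value. One common convention, sketched here with a hypothetical function report_mean(), is to return the input with invisible(), so that nothing extra is printed at the console but the value can still be stored.

```r
# Hypothetical example: print a summary line (side effect), return the data invisibly
report_mean <- function(x) {
    cat("mean =", mean(x), "\n")  # side effect: text printed to the console
    invisible(x)                  # returned, but not auto-printed
}

values <- report_mean(c(2, 4, 6))  # prints the mean, which is 4
values                             # the original vector: 2 4 6
```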
Vectorization of functions
The simple function square() defined above happens to work with vector arguments without any modification, because the returned expression x^2 is valid for both single numbers and vectors.
square <- function(x) x^2
square(c(1,2,3,4,5))
[1]  1  4  9 16 25
However, not every function works with vector arguments as written. For example, consider a function that returns the sum of the integers from 1 up to its argument value:
addupto <- function(n) sum(1:n)
addupto(10)
[1] 55
When we call this function with a vector argument, only the first element is used, and a warning message is issued.
addupto(c(10,20)) # Internally it tries sum(1:c(10,20))
Warning in 1:n: numerical expression has 2 elements: only the first used
[1] 55
If you want this function to work with vector input, a common way in R is to use the built-in sapply() function, which applies a function to each element of a vector.
sapply(c(10,20, 30, 40, 50), addupto)
[1]   55  210  465  820 1275
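Two other routes to the same result are sketched below. The first wraps addupto() with the base R function Vectorize(). The second replaces the body with the closed-form formula n(n+1)/2, which is not used in the text above but is built entirely from vectorized arithmetic. The names addupto_vec and addupto_formula are illustrative.

```r
addupto <- function(n) sum(1:n)

# Option 1: wrap the function with Vectorize(), which loops over the elements for us
addupto_vec <- Vectorize(addupto)
addupto_vec(c(10, 20, 30))      # 55 210 465

# Option 2: use the closed-form formula n(n+1)/2, built from vectorized arithmetic
addupto_formula <- function(n) n * (n + 1) / 2
addupto_formula(c(10, 20, 30))  # 55 210 465
```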
Default arguments
When you define a function, you can set some of the arguments to default values. Then you don’t have to specify them at each call.
f <- function(capital, interest_rate=0.1) {
    capital * (1+interest_rate)
}
Without specifying the interest_rate value, 0.1 is assumed.
f(1000)
[1] 1100
But if you want to change it, you can provide it as an extra argument.
f(1000, 0.2)
[1] 1200
Calling the function with argument names is usually clearer for the reader.
f(capital = 1000, interest_rate = 0.2)
[1] 1200
You can change the order of the arguments when you use argument names.
f(interest_rate=0.2, capital=1000)
[1] 1200
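Argument names become more useful when a function has several defaults, because a later default can be overridden while earlier ones are kept. The sketch below assumes a hypothetical function grow() that adds a years argument to the example above.

```r
# Hypothetical extension of f(): a second default argument, years = 1
grow <- function(capital, interest_rate = 0.1, years = 1) {
    capital * (1 + interest_rate)^years
}

grow(1000)                                  # both defaults used: about 1100
grow(1000, years = 2)                       # default rate, two years: about 1210
grow(1000, interest_rate = 0.2, years = 2)  # no defaults used: about 1440
```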
Scope of variables
- The value of a variable defined outside a function (a global variable) can be seen inside a function.
- However, a variable defined inside a function block is not recognized outside of it.
- We say that the scope of the variable b is limited to the function f().
a <- 5 # a global variable
f <- function(){
    b <- 10 # a local variable
    cat("inside f(): a =",a,"b =",b,"\n")
}
f()
inside f(): a = 5 b = 10 
cat("outside f(): a =",a," ")
outside f(): a = 5  
cat("b =",b)  # raises an error
Error: object 'b' not found
A local variable temporarily overrides a global variable with the same name.
a <- 5 # a global variable
cat("before f(): a =",a,"\n")
before f(): a = 5 
f <- function(){
    a <- 10 # a local variable
    cat("inside f(): a =",a,"\n")
}
f()
inside f(): a = 10 
cat("after f(): a =",a)
after f(): a = 5
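One consequence of a function being able to see global variables is that its result can change when the global value changes, because the value is looked up each time the function is called. The sketch below uses a hypothetical global variable rate and a hypothetical function with_interest() to show this.

```r
rate <- 0.1   # a global variable

# Hypothetical function that relies on the global variable rate
with_interest <- function(x) {
    x * (1 + rate)
}

with_interest(100)   # uses rate = 0.1, about 110

rate <- 0.2          # changing the global changes the function's result
with_interest(100)   # now uses rate = 0.2, about 120
```

Passing rate as an argument instead would make this dependence explicit.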
Assigning values to upper-level variables
Although the values of variables defined in upper levels are available in lower levels, they cannot be modified in a lower level, because an assignment will create only a local variable with the same name.
Using the superassignment operator <<- it is possible to assign to a variable in the higher level.
a <- 5
cat("before f(): a =",a,"\n")
before f(): a = 5 
f <- function(){
    a <<- 10
    cat("inside f(): a =",a,"\n")
}
f()
inside f(): a = 10 
cat("after f(): a =",a)
after f(): a = 10
However, this is not recommended in general. It can cause subtle errors that are difficult to find, and you almost never need it.
To modify a global variable, the most transparent way is to assign the function output to it explicitly.
a <- 5
cat("before f(): a =",a,"\n")
before f(): a = 5 
f <- function(x) {x+5}
a <- f(a)
cat("after f(): a =",a)
after f(): a = 10
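One of the few situations where <<- is commonly seen is a function that keeps private state between calls. In the sketch below, make_counter() is a hypothetical name. Here <<- updates the variable count in the enclosing function's environment, not a global variable, so the state stays hidden from the rest of the program.

```r
make_counter <- function() {
    count <- 0               # local to make_counter(), shared with the inner function
    function() {
        count <<- count + 1  # modifies count in the enclosing environment
        count
    }
}

counter <- make_counter()
counter()   # 1
counter()   # 2
```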
Unspecified arguments with ...
Some functions take an unlimited number of arguments, e.g. c().
c(1,2,3)
[1] 1 2 3
c(4,2,6,1,3,5,1)
[1] 4 2 6 1 3 5 1
The c() function is defined with an ellipsis (three dots) as the argument list.
help(c)
Ellipsis has two use cases:
- Write a function that takes any number of arguments (like c() or sum()).
- Pass some arguments to another function, called inside the current function.
Let’s modify our function for generating and plotting a random walk. The new version accepts some unspecified arguments, represented with the ellipsis, and passes them to plot().
plot_random_walk <- function(n, ...){
    x <- cumsum(sample(c(-1,1), n, replace=TRUE))
    plot(x, type="o", ...)
}
We can then call the function by specifying only the number of points:
options(repr.plot.width=10, repr.plot.height=4)
plot_random_walk(100)
or by specifying plot parameters:
plot_random_walk(100, 
                 pch=4, 
                 col="red", 
                 main="A random walk", 
                 xlab="step number", 
                 ylab="displacement")
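The same forwarding pattern works with functions that do not draw anything, which makes it easier to check in the console. In the sketch below, rounded_mean() is a hypothetical wrapper that passes any extra arguments on to mean().

```r
# Hypothetical wrapper: rounds the mean, forwarding extra arguments to mean()
rounded_mean <- function(x, digits = 1, ...) {
    round(mean(x, ...), digits)
}

x <- c(1, 2, NA, 6)
rounded_mean(x)                # NA, because mean() sees the missing value
rounded_mean(x, na.rm = TRUE)  # na.rm is passed on to mean(), giving 3
```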
The arguments passed with the ellipsis can be converted to a vector, so we can process them inside the function.
diff <- function(...) {
    # returns the difference between the first and the last argument
    arguments <- c(...)
    cat("Number of arguments = ",length(arguments),"\n")
    arguments[length(arguments)] - arguments[1] # last argument minus first argument
}
diff(1,4,2)
Number of arguments =  3
[1] 1
diff(1,4,2,6,3,1)
Number of arguments =  6
[1] 0
Ellipsis arguments can have arbitrary names, and can be converted to a list object (more on lists later).
f <- function(...){
    args <- list(...)
    print(args)
}
f(a=1, b=3, foo=7654)
$a
[1] 1
$b
[1] 3
$foo
[1] 7654
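Once the arguments are in a list, their names are available through names(). The sketch below, with the hypothetical function show_args(), assumes that every argument is passed with a name and prints each name-value pair on its own line.

```r
# Hypothetical helper: report each named argument on its own line
show_args <- function(...) {
    args <- list(...)
    for (name in names(args)) {
        cat(name, "=", args[[name]], "\n")
    }
    invisible(length(args))  # number of arguments, returned without printing
}

n <- show_args(a = 1, b = 3, foo = 7654)
n   # 3
```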
Exercises
Write a function with the name FtoC that takes a temperature measurement in degrees Fahrenheit, and returns the equivalent value in degrees Celsius. Make sure that your function works with vector input, too.
Write a function with the name bmi that takes two arguments, height and weight, and returns the body-mass index calculated with these argument values. The function should work with vector input, too.
Write a function named range that takes a vector of numbers, and returns the difference between its maximum and minimum elements. Test your function with some randomly generated vectors.