The Best Way to Load Packages in R
1 Problems with the Usual Approach of Loading Packages
Almost every time we work with R, we need functionality provided by one package or another. These packages can be installed from the package archive CRAN or through a package manager such as pacman on Linux, and most R scripts start by loading them.
What clutters up the code, though, is that you need to do two things for your script to work regardless of whose R installation it is running on:
- You need to make sure that the packages are installed and accessible.
- You need to actually load these packages, i.e. evaluate their code.
This is why you often see code chunks like
install.packages("foo")
library("foo")
Although calling these functions in succession works in principle, there are good reasons against doing it. First, install.packages("foo") by default installs foo regardless of whether it is already installed. This wastes time, especially on UNIX-like systems, where packages are typically built from source and byte-compiled during installation. Second, install.packages and library are called in combination in almost every case, so the same pair of calls is repeated for every package. Thus, this approach both consumes extra time and impairs the readability of your code.
2 Ensuring a Single Package
Luckily, R is a full-fledged programming language, so we can solve these problems with some very basic tools. Specifically, we want to write a function which
- only installs a package if it is not installed yet and
- automatically loads that package afterwards.
In the following, we will call this functionality ensuring a package. The first requirement is easily satisfied by a conditional statement:
if (!require("foo")) {
  install.packages("foo")
}
The statement can be read as follows: If it is not the case that the package called "foo" can be required, install it. Note that we use the function require here instead of library. The reason is that library("foo") throws an error if "foo" is not found, while require("foo") simply returns FALSE with a warning.
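The difference between the two functions can be checked with a minimal sketch. The package name below is a made-up placeholder that is assumed not to be installed, tryCatch is used only so that the error raised by library does not stop the script, and character.only = TRUE tells both functions that the name is passed as a string.

```r
# Hypothetical package name, assumed NOT to be installed on this machine.
missing_pkg <- "surelyNotARealPackage123"

# library() signals an error for a missing package; we catch it here.
library_result <- tryCatch(
  {
    library(missing_pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

# require() returns FALSE instead (its warning is silenced for brevity).
require_result <- suppressWarnings(
  require(missing_pkg, character.only = TRUE)
)

print(library_result)
print(require_result)
```

Because require reports failure through its return value, it can be used directly as the condition of an if statement.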
The second requirement can be implemented by appending a call to the library function. In a nutshell, then, we can define a function ensure_package which takes one argument package, installs package if it is not installed yet and (in either case) loads it.
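ensure_package <- function(package) {
  # Install the package only if it cannot be loaded yet.
  if (!require(package, character.only = TRUE)) {
    install.packages(package)
  }
  # Load the package in either case; this stops with an error if installation failed.
  library(package, character.only = TRUE)
}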
Note that the additional parameter character.only must be set to TRUE for both functions in order to tell R that the input is a string (as opposed to a symbol).
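The effect of character.only can be illustrated with a small sketch. The variable name pkg_name_holder is hypothetical, and the example assumes that no package with that literal name is installed; stats is used because it ships with every R installation.

```r
# A variable holding the name of a package that ships with R.
pkg_name_holder <- "stats"

# With character.only = TRUE, the VALUE of the variable ("stats") is used.
with_flag <- require(pkg_name_holder, character.only = TRUE)

# Without it, R treats the argument as a symbol and looks for a package
# literally called "pkg_name_holder" (assumed not to exist).
without_flag <- suppressWarnings(require(pkg_name_holder))

print(with_flag)
print(without_flag)
```

This is why a function that receives package names as strings has to pass character.only = TRUE on to require and library.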
3 Ensuring Several Packages at Once
Now it often happens that you need to load several packages in succession, so it would be even handier to have a function which takes a variable number of packages as arguments and ensures all of them. This is possible using the ... syntax: we store the arguments provided in the list packages and loop over it.
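ensure_packages <- function(...) {
  # Collect all package names passed as arguments.
  packages <- list(...)
  for (package in packages) {
    # Install the package only if it cannot be loaded yet.
    if (!require(package, character.only = TRUE)) {
      install.packages(package)
    }
    library(package, character.only = TRUE)
  }
}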
Note that this function also works when only a single argument is provided, simply because looping over a list whose only element is bar and applying the function foo to each element produces the same result as executing foo on bar directly.
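To see how ... collects its arguments without installing anything, here is a sketch of a dry-run helper. The name missing_packages is hypothetical and not part of the function above; it assumes that requireNamespace is an acceptable availability check. Note that requireNamespace loads a package's namespace as a side effect, although it does not attach the package.

```r
# Hypothetical helper: report which of the given packages are not available.
missing_packages <- function(...) {
  # Flatten the arguments into a character vector (empty if none are given).
  packages <- as.character(unlist(list(...)))
  # requireNamespace() returns TRUE or FALSE and never installs anything.
  available <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  packages[!available]
}

# Works with one argument as well as with several.
print(missing_packages("stats"))
print(missing_packages("stats", "utils", "surelyNotARealPackage123"))
```

The made-up name surelyNotARealPackage123 is assumed not to be installed; the helper would list it as missing, while packages that ship with R are not reported.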
4 Using Our Function
Now that we have defined our function, we can actually use it. Simply add its definition to the file you are working on and call ensure_packages with any number of package names you want. Note, though, that the order in which you provide them as arguments to ensure_packages is the order in which they are loaded.
ensure_packages <- function(...) {
  packages <- list(...)
  for (package in packages) {
    if (!require(package, character.only = TRUE)) {
      install.packages(package)
    }
    library(package, character.only = TRUE)
  }
}
ensure_packages("dplyr") # single package
ensure_packages("knitr", "rlang", "ggplot2", "tidyverse") # multiple packages
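The loading order matters because a package attached later is placed earlier on R's search path, so its functions mask same-named functions from packages attached before it. The following sketch illustrates this with tools and splines, two packages that ship with R; it assumes a fresh session in which neither of them is attached yet.

```r
# tools and splines ship with R but are not attached by default.
library("tools")
library("splines")

# search() lists the attached packages in lookup order.
path <- search()
pos_tools <- match("package:tools", path)
pos_splines <- match("package:splines", path)

# splines was attached last, so it is found before tools.
print(pos_splines < pos_tools)
```

If two of your packages export a function with the same name, pass the one whose version you want to use last.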