In this post, I will show you how to effectively manage your packages in R. In particular, I will explain why the usual way of loading packages is quite messy and how to write a function that makes for a better package-loading experience. If you are not interested in the details, you can simply skip to the last section.

1 Problems with the Usual Approach of Loading Packages

Almost every time we work with R, we need functionality provided by one package or another. These packages are easily installable from the package archive CRAN or, on Linux, through a system package manager such as pacman, and it is most common to start an R script by loading them.

What clutters up the code, though, is that you need to do two things in order for your script to work regardless of whose R installation it is running on:

  1. You need to make sure that the packages are installed and accessible.
  2. You need to actually load these packages, i.e. evaluate their code.

This is why you often see code chunks like

install.packages("foo")
library("foo")

Although calling these functions in succession works in principle, there are good reasons against doing it. On the one hand, install.packages("foo"), by default, installs foo regardless of whether or not it is already installed. This wastes time, especially on Linux, where install.packages typically builds packages from source and byte-compiles them during installation. On the other hand, the commands install.packages and library are called in combination in almost every case, so repeating the pair for every package adds clutter. Thus, the above way of doing things both consumes extra time and impairs the readability of your code.
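
To make the first point concrete, here is a minimal sketch of how one could check whether a package is already installed without loading or reinstalling it. The helper is_installed and the package name "surelyNotARealPackage123" are hypothetical and only serve as an illustration; the sketch assumes that no package of that name exists on your system.

```r
# Hypothetical helper (not part of the approach developed below): reports
# whether a package is installed, without loading or installing anything.
is_installed <- function(package) {
  # system.file() returns the empty string "" if the package cannot be found
  nzchar(system.file(package = package))
}

is_installed("stats")                     # a package that ships with R
is_installed("surelyNotARealPackage123")  # made-up name, assumed not installed
```

A check like this takes a fraction of a second, whereas an unconditional install.packages call repeats the whole installation every time the script runs.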

2 Ensuring a Single Package

Luckily, R is a full-fledged programming language, so we can solve the problems we have with some very basic tools. Specifically, we want to write a function which

  1. only installs a package if it is not installed yet and
  2. automatically loads that package afterwards.

In the following, we will call this functionality ensuring a package. The first requirement is easily satisfied by a conditional statement:

if(!require("foo")){
  install.packages("foo")
}

The statement can be read as follows: if the package called "foo" cannot be loaded, install it. Note that we use the function require here instead of library. The reason is that library("foo") throws an error if "foo" is not found, while require("foo") simply returns FALSE and issues a warning.
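
The difference between the two functions can be observed directly. The following sketch assumes that no package called "surelyNotARealPackage123" (a made-up name) is installed; the variable names are likewise only illustrative.

```r
# require() returns FALSE for a missing package; the warning is suppressed
# here only to keep the output clean.
require_result <- suppressWarnings(require("surelyNotARealPackage123"))

# library() signals an error for a missing package; we catch it so that the
# script keeps running.
library_result <- tryCatch(
  {
    library("surelyNotARealPackage123")
    "loaded"
  },
  error = function(e) "error"
)

print(require_result)
print(library_result)
```

Because require reports failure through its return value, it can be used as the condition of an if statement, which is exactly what the conditional above relies on.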

The second condition can be implemented by appending a call to the library function. In a nutshell, then, we can define a function ensure_package which takes one argument package and installs package if it is not installed yet and (in either case) loads package.

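ensure_package <- function(package) {
  # Try to load the package; require() returns FALSE if it is not installed
  if (!require(package, character.only = TRUE)) {
    install.packages(package)
  }
  # Load the package (has no further effect if require() already attached it)
  library(package, character.only = TRUE)
}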

Note that the additional parameter character.only must be set to TRUE for both functions. Otherwise, R would treat package as the literal name of the package to load rather than as a variable holding that name as a string.
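
The effect of character.only can be seen in a small experiment. The variable name package_name_variable is purely illustrative and was chosen so that it is very unlikely to coincide with the name of a real package; the sketch assumes that no such package is installed.

```r
# The variable holds the name of a package that ships with R.
package_name_variable <- "stats"

# Without character.only, R looks for a package literally called
# "package_name_variable" and fails (the warning is suppressed here).
without_flag <- suppressWarnings(require(package_name_variable))

# With character.only = TRUE, R uses the value of the variable, i.e. "stats".
with_flag <- require(package_name_variable, character.only = TRUE)

print(without_flag)
print(with_flag)
```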

3 Ensuring Several Packages at Once

Now it often happens that you need to load several packages in succession, so it would come in even more handy if we managed to write a function which takes a variable number of packages as arguments and ensures all of them. This is possible using the ... syntax: We store the arguments provided in the list packages and loop over it.

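ensure_packages <- function(...) {
  # Collect all arguments (package names given as strings) in a list
  packages <- list(...)
  for (package in packages) {
    # Install the package only if it cannot be loaded
    if (!require(package, character.only = TRUE)) {
      install.packages(package)
    }
    library(package, character.only = TRUE)
  }
}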

Note that this function also works when only a single argument is provided: looping over a list with a single element runs the loop body exactly once, which is equivalent to calling ensure_package on that element directly.
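
To see what the ... syntax does in isolation, consider the following sketch. The helper collect_names is hypothetical and does nothing but return the arguments it received; no packages are loaded or installed.

```r
# Hypothetical helper: returns whatever was passed via `...` as a list.
collect_names <- function(...) {
  list(...)
}

single <- collect_names("dplyr")
several <- collect_names("knitr", "rlang", "ggplot2")

length(single)   # 1
length(several)  # 3

# Looping over the list visits the names in the order they were given.
for (package in several) {
  print(package)
}
```

With a single argument, the list has length one and the loop body runs once, which is why ensure_packages needs no special case for one package.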

4 Using Our Function

Now that we have defined our function, we can actually use it. Simply add its definition to the file you are working on and call ensure_packages with any number of package names you want. Note, though, that the order in which you provide them as arguments to ensure_packages is the order in which they are loaded.

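ensure_packages <- function(...) {
  # Collect all arguments (package names given as strings) in a list
  packages <- list(...)
  for (package in packages) {
    # Install the package only if it cannot be loaded
    if (!require(package, character.only = TRUE)) {
      install.packages(package)
    }
    library(package, character.only = TRUE)
  }
}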

ensure_packages("dplyr") # single package
ensure_packages("knitr", "rlang", "ggplot2", "tidyverse") # multiple packages