A package is a collection of R functions, data, and compiled code bundled together. A library is the directory where installed packages are stored on your computer; the default is R_HOME/library.
Types of Packages
- Base packages: Come with the R installation, so no installation is needed. For example, stats, utils, datasets, methods.
- Recommended packages: Part of the R distribution but need to be loaded explicitly. For example, MASS, boot, class, cluster.
- Contributed packages: Created by the R community and available on CRAN, Bioconductor, GitHub, etc. For example, ggplot2, dplyr.
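To see which installed packages fall into each group, one option is to filter on the Priority column returned by installed.packages(). This is a minimal sketch; the variable names are illustrative, and it assumes that contributed packages carry no Priority value, which is the usual case.

```r
# Matrix with one row per installed package
ip <- installed.packages()

# "Priority" is "base" or "recommended"; it is normally NA for contributed packages
priority <- ip[, "Priority"]

base_pkgs        <- rownames(ip)[!is.na(priority) & priority == "base"]
recommended_pkgs <- rownames(ip)[!is.na(priority) & priority == "recommended"]
contributed_pkgs <- rownames(ip)[is.na(priority)]

print(base_pkgs)
print(recommended_pkgs)
```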
Installing a package
install.packages("xlsx")
Getting info about installed packages
installed.packages()
Installing multiple packages at once
install.packages(c("tidyverse", "xlsx", "dlookr"))
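Calling install.packages() reinstalls packages that are already present. A possible way to skip those is sketched below; install_if_missing is a hypothetical helper name, not part of R.

```r
# Hypothetical helper: install only the packages that are not installed yet
install_if_missing <- function(pkgs) {
  missing <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing) > 0) {
    install.packages(missing)
  }
  invisible(missing)  # names of the packages that needed installing
}

# Nothing is downloaded here because both packages ship with R
install_if_missing(c("stats", "utils"))

# For the packages used in this section the call would be:
# install_if_missing(c("tidyverse", "xlsx", "dlookr"))
```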
Installing from Bioconductor
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GenomicFeatures")
Installing from GitHub
install.packages("devtools")
devtools::install_github("username/repository")
Installing a specific version
devtools::install_version("dplyr", version = "1.0.0")
Loading a package
library(xlsx)
Loading multiple packages
packages1 <- c("tidyverse", "xlsx", "dlookr")
lapply(packages1, library, character.only = TRUE)
or
lapply(c("tidyverse", "xlsx", "dlookr"), require, character.only = TRUE)
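lapply() prints its return value at the console, and require() returns FALSE rather than stopping when a package is missing. The sketch below wraps the same idea in a function that loads quietly and stops with a clear message; load_or_stop is a hypothetical name, and base packages are used so the example runs anywhere.

```r
# Hypothetical helper: attach several packages, stop if any of them is unavailable
load_or_stop <- function(pkgs) {
  ok <- vapply(pkgs, function(p) {
    suppressPackageStartupMessages(
      require(p, character.only = TRUE, quietly = TRUE)
    )
  }, logical(1))
  if (!all(ok)) {
    stop("Could not load: ", paste(pkgs[!ok], collapse = ", "))
  }
  invisible(ok)  # named logical vector, one entry per package
}

# Base packages are used here so that the sketch runs without installing anything
load_or_stop(c("stats", "utils"))
```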
Checking if a particular package is installed
system.file(package = "ggplot2")
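system.file() returns the installation path of the package, or an empty string when the package is not installed. That can be turned into a TRUE/FALSE check, as in this small sketch (is_installed is a hypothetical helper name, and "notARealPackage12345" is assumed not to exist):

```r
# Hypothetical helper: TRUE if the package can be found in any library
is_installed <- function(pkg) {
  nzchar(system.file(package = pkg))
}

is_installed("stats")                 # TRUE, stats ships with R
is_installed("notARealPackage12345")  # FALSE, assuming no such package exists
```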
Checking for a package version
packageVersion("dplyr")
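packageVersion() returns a version object rather than a plain string, so versions can be compared component by component. A short sketch, using the stats package because it is always available; the version numbers are arbitrary examples:

```r
v <- packageVersion("stats")
print(v)

# A version object can be compared directly with a version string
if (v >= "3.0.0") {
  message("stats is at least version 3.0.0")
}

# Version objects compare numerically, plain strings compare character by character
as_versions <- package_version("1.10.0") > package_version("1.9.0")  # TRUE
as_strings  <- "1.10.0" > "1.9.0"                                    # FALSE
```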
Listing all loaded packages
sessionInfo()
Namespace management is useful when functions with the same name exist in different packages. Prefix the function with the package name and :: to state explicitly which one you mean.
dplyr::select(mtcars, mpg)
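The effect of the :: prefix can be shown without installing anything. In this sketch a user-defined function named sd deliberately hides the one from the stats package; the masking function is made up purely for illustration.

```r
# A user-defined function that hides stats::sd (illustration only)
sd <- function(x) "user-defined"

masked   <- sd(1:3)         # calls the user-defined function
explicit <- stats::sd(1:3)  # always calls the version from the stats package

print(masked)
print(explicit)

# Remove the masking function so that sd() refers to stats::sd again
rm(sd)
```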
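Dependency conflicts in R occur when multiple packages require different versions of the same dependency. This can lead to errors and broken code. A related problem is a name conflict, where an object in one package masks an object with the same name in another.
Checking for name conflicts on the search path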
conflicts()
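conflicts() reports names that exist in more than one place on the search path. The sketch below creates such a situation on purpose by defining a global object called mean, then uses find() from utils to see where the name lives. The masking function is made up for illustration.

```r
# Deliberately mask base::mean with a global object of the same name
assign("mean", function(x) "masked", envir = globalenv())

# The name now appears in both the global environment and package:base
has_conflict <- "mean" %in% conflicts()
where <- utils::find("mean")
print(has_conflict)
print(where)

# detail = TRUE groups the conflicting names by search-path entry
str(conflicts(detail = TRUE))

# Clean up so that mean() refers to base::mean again
rm("mean", envir = globalenv())
```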
Detaching a package
detach("package:dplyr")
Detaching a package and unloading its namespace
detach("package:dplyr", unload = TRUE)
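A plain detach() removes the package from the search path but leaves its namespace loaded in the session. The sketch below shows the difference using tools, a package that ships with R and is not attached by default.

```r
library(tools)
attached_before <- "package:tools" %in% search()   # on the search path

detach("package:tools")
attached_after <- "package:tools" %in% search()    # no longer on the search path
loaded_after   <- "tools" %in% loadedNamespaces()  # namespace is still loaded

print(c(attached_before, attached_after, loaded_after))
```

Functions from a detached package therefore remain reachable with the :: prefix, for example tools::file_ext("report.csv").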
Saving the list of all packages for reproducibility
writeLines(capture.output(sessionInfo()), "session_info.txt")
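sessionInfo() output is meant for reading. If the package list needs to be processed later, a table of names and versions may be more convenient. A sketch that writes one to a temporary CSV file; the file name is an arbitrary choice.

```r
ip <- installed.packages()

# Keep only the package name and version columns
pkg_table <- data.frame(
  Package = unname(ip[, "Package"]),
  Version = unname(ip[, "Version"]),
  stringsAsFactors = FALSE
)

# Written to the session's temporary directory; use a project path in practice
out_file <- file.path(tempdir(), "installed_packages.csv")
write.csv(pkg_table, out_file, row.names = FALSE)

# Read the file back to check the result
restored <- read.csv(out_file, stringsAsFactors = FALSE)
head(restored)
```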
Updating all packages
update.packages()
Removing packages
remove.packages("dplyr")
remove.packages(c("dplyr", "xlsx"))
Cleaning up unused dependencies
install.packages("pacman")
pacman::p_clean()
Repository management is the process of configuring the servers from which packages are installed. CRAN is
a network of mirrors around the world, not just a single website. Set the CRAN mirror to a location close to
you for faster installs, or use the cloud.r-project.org service to select a server automatically. RStudio's
package manager works like a CRAN mirror and includes packages from Bioconductor as well as CRAN.
Listing all available CRAN mirrors
getCRANmirrors()
To set a mirror
options(repos = c(CRAN = "https://cloud.r-project.org"))
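Assigning a new vector to repos replaces every repository that was configured before. If other repositories should be kept, one approach is to change only the CRAN entry, as sketched here:

```r
# Start from the current setting so that other repositories are preserved
repos <- getOption("repos")
repos["CRAN"] <- "https://cloud.r-project.org"
options(repos = repos)

# Confirm the active configuration
print(getOption("repos"))
```

Placing these lines in a .Rprofile file applies the setting to every new session.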
To list the current library paths
.libPaths()
To add new library location
.libPaths(c(.libPaths(), "path/to/new/library"))
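.libPaths() silently drops directories that do not exist, so the new location has to be created first. A sketch using a folder inside the session's temporary directory; the folder name is an arbitrary choice.

```r
# Create the directory before registering it; missing directories are ignored
new_lib <- file.path(tempdir(), "my-library")
dir.create(new_lib, recursive = TRUE, showWarnings = FALSE)

# Append the new location to the existing library paths
.libPaths(c(.libPaths(), new_lib))

# The new directory is now part of the search list for packages
print(.libPaths())
```

The first entry of .libPaths() is where install.packages() installs by default, so put the new directory first if it should become the install target.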
Best practices
- Load packages at the start of scripts
- Use explicit namespaces for clarity
- Document dependencies
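Put together, a script that follows these practices might start as in the sketch below. The script name and data are made up, and base packages are used so that it runs as written.

```r
# analysis.R (hypothetical script)
# Dependencies: stats, utils (both ship with R)

# 1. Load packages at the start of the script
library(stats)
library(utils)

# 2. Use explicit namespaces for clarity
values <- c(3, 1, 2)
middle <- stats::median(values)
print(middle)
print(utils::head(sort(values), 2))

# 3. Document dependencies by recording the session details
info_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(sessionInfo()), info_file)
```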