Data structures are the building blocks R uses to store and organize data, and understanding them is crucial for effective data analysis and programming. The primary data structures in R are vectors, matrices, arrays, data frames, and lists; this overview also covers tibbles and factors.
1. Vectors
A vector is the most basic data structure in R. It is a sequence of elements of the same type. There are different types of vectors, including numeric, integer, character, logical, and complex.
Creating Vectors –
- Numeric Vector:
> numeric_vector <- c(1, 2, 3, 4)
- Character Vector:
> char_vector <- c("a", "b", "c")
- Logical Vector:
> logical_vector <- c(TRUE, FALSE, TRUE)
Accessing Elements –
- Single Element:
> numeric_vector[2] # Output: 2
- Multiple Elements:
> numeric_vector[c(1, 3)] # Output: 1 3
- Storing a Selection:
> first_element <- numeric_vector[1]
> subset_vector <- char_vector[c(1, 3)]
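Because every element of a vector must share one type, R coerces mixed input to the most flexible type present. The sketch below illustrates this with typeof(); the variable names are illustrative only and are not used elsewhere in this text.

```r
# Mixing types in c() triggers coercion to a single common type.
mixed_num_lgl <- c(1, TRUE)       # logical TRUE becomes the number 1
mixed_chr <- c(1, "a", TRUE)      # everything becomes a character string

typeof(mixed_num_lgl)  # "double"
typeof(mixed_chr)      # "character"

# A trailing L requests integer storage instead of the default double.
int_vector <- c(1L, 2L, 3L)
typeof(int_vector)     # "integer"

# Negative indices drop elements rather than select them.
without_first <- int_vector[-1]   # 2 3
```

Checking typeof() after building a vector is a quick way to catch an unintended coercion, for example a stray string turning a numeric column into text.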
2. Matrices
A matrix is a two-dimensional structure in which every element has the same type. It can be viewed as a set of equal-length vectors arranged as rows or columns.
Creating Matrices –
- Using matrix() function (values fill column by column by default):
> matrix_1 <- matrix(1:9, nrow = 3, ncol = 3)
- Using rbind() to stack row vectors:
> row1 <- c(1, 2, 3)
> row2 <- c(4, 5, 6)
> matrix_2 <- rbind(row1, row2)
or, without intermediate variables:
> matrix_2 <- rbind(c(1, 2, 3), c(4, 5, 6))
Accessing Elements –
- Single Element:
> matrix_1[2, 3] # Second row, third column
- Entire Row or Column:
> matrix_1[2, ] # Second row
> matrix_1[, 3] # Third column
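One detail worth seeing in action: matrix() fills values column by column unless byrow = TRUE is supplied, which changes what a given index returns. A minimal sketch, with by_col, by_row, and row_as_matrix as illustrative names:

```r
by_col <- matrix(1:9, nrow = 3, ncol = 3)                # default: fill down columns
by_row <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)  # fill across rows

by_col[2, 3]  # 8
by_row[2, 3]  # 6

dim(by_col)   # 3 3 (rows, columns)

# Selecting one row or column returns a plain vector by default;
# drop = FALSE keeps the result as a one-row matrix.
row_as_matrix <- by_col[2, , drop = FALSE]
dim(row_as_matrix)  # 1 3
```

Using drop = FALSE is a common safeguard when later code expects a matrix regardless of how many rows were selected.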
3. Arrays
An array is similar to a matrix but can have more than two dimensions.
Creating Arrays –
> array_1 <- array(1:24, dim = c(3, 4, 2))
Accessing Elements –
- Single Element:
> array_1[2, 3, 1] # Second row, third column, first matrix
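To make the third dimension more concrete, the sketch below rebuilds the same 3 x 4 x 2 array under a separate name (array_demo, chosen only for illustration) and pulls out a whole layer as well as single values.

```r
array_demo <- array(1:24, dim = c(3, 4, 2))  # 3 rows, 4 columns, 2 layers

dim(array_demo)        # 3 4 2

# Leaving an index blank selects everything along that dimension,
# so this returns the whole second layer as a 3 x 4 matrix.
second_layer <- array_demo[, , 2]
dim(second_layer)      # 3 4

array_demo[2, 3, 1]    # 8
array_demo[2, 3, 2]    # 20 (same position, second layer)
```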
4. Data Frames
A data frame is a table-like, two-dimensional structure in which each column holds values of a single type, but different columns can hold different types (numeric, character, logical, etc.).
Creating Data Frames –
> data_frame <- data.frame(
Name = c("John", "Jane", "Tom"),
Age = c(28, 34, 23),
Gender = c("M", "F", "M")
)
Accessing Elements –
- Single Element:
> data_frame[1, 2] # First row, second column
- Entire Row:
> data_frame[1, ] # First row only
- Column by Name:
> data_frame$Age
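Beyond picking out single cells, data frames are usually inspected, filtered, and extended. The following sketch shows those three steps on the same sample data; the names people, over_25, and AgeNextYear are made up for this illustration.

```r
people <- data.frame(
  Name = c("John", "Jane", "Tom"),
  Age = c(28, 34, 23),
  Gender = c("M", "F", "M")
)

str(people)                 # compact summary of each column and its type

# Each column is a vector of a single type.
is.numeric(people$Age)      # TRUE

# A logical condition in the row position filters rows.
over_25 <- people[people$Age > 25, ]
over_25$Name                # John and Jane

# Assigning to a new name adds a column.
people$AgeNextYear <- people$Age + 1
ncol(people)                # 4
```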
5. Lists
A list is a versatile data structure that can hold elements of different types, such as numbers, strings, vectors, and even other lists.
Creating Lists –
> list_1 <- list(
Name = "John",
Age = 28,
Scores = c(85, 90, 92)
)
Accessing Elements –
- Single Element:
> list_1$Name
- Nested Elements:
> list_1$Scores[2]
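A frequent source of confusion with lists is the difference between single and double brackets. The sketch below contrasts them and adds a nested list; the person variable and the Address values are invented purely for illustration.

```r
person <- list(
  Name = "John",
  Age = 28,
  Scores = c(85, 90, 92)
)

# Single brackets return a smaller list; double brackets (or $)
# return the element itself.
as_list <- person["Scores"]
as_vector <- person[["Scores"]]

class(as_list)    # "list"
class(as_vector)  # "numeric"

# Lists can contain other lists; chain accessors to reach inner values.
# (The address below is placeholder data.)
person$Address <- list(City = "Springfield", Zip = "00000")
person$Address$City      # "Springfield"

mean(person$Scores)      # 89
```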
6. Tibbles
A tibble is a modern version of the data frame, provided by the tibble package, with enhanced features and more compact printing.
> install.packages("tibble")
> library(tibble)
> tibble_example <- tibble(
Name = c("John", "Jane", "Tom"),
Age = c(28, 34, 23)
)
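One practical difference from base data frames is how single-column selection behaves. The sketch below assumes the tibble package is already installed (it skips itself otherwise) and uses illustrative names such as people_df and people_tbl.

```r
# Assumes the tibble package is installed; the guard skips the demo otherwise.
if (requireNamespace("tibble", quietly = TRUE)) {
  people_df <- data.frame(Name = c("John", "Jane", "Tom"), Age = c(28, 34, 23))
  people_tbl <- tibble::tibble(Name = c("John", "Jane", "Tom"), Age = c(28, 34, 23))

  # Selecting one column with [ , ] simplifies a data frame to a vector...
  df_col <- people_df[, "Age"]
  # ...while the same selection on a tibble stays a tibble.
  tbl_col <- people_tbl[, "Age"]

  print(is.data.frame(df_col))   # FALSE
  print(is.data.frame(tbl_col))  # TRUE

  # A tibble is still a data frame, so base functions keep working on it.
  print(class(people_tbl))       # "tbl_df" "tbl" "data.frame"
}
```

This consistency is one reason tibbles can be easier to reason about in functions, since the result type does not depend on how many columns happen to be selected.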
7. Factors
A factor stores categorical data, that is, data whose values come from a fixed and known set of possible values (called levels).
> factor_example <- factor(c("High", "Medium", "Low", "Medium", "High"))
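By default the levels of a factor are sorted alphabetically, which rarely matches the natural order of categories such as Low, Medium, and High. The sketch below shows how explicit levels change that; ratings, default_factor, and ordered_factor are illustrative names.

```r
ratings <- c("High", "Medium", "Low", "Medium", "High")

# Without explicit levels, they are sorted alphabetically.
default_factor <- factor(ratings)
levels(default_factor)        # "High" "Low" "Medium"

# Supplying levels (and ordered = TRUE) records the intended ranking.
ordered_factor <- factor(
  ratings,
  levels = c("Low", "Medium", "High"),
  ordered = TRUE
)
levels(ordered_factor)        # "Low" "Medium" "High"

# Factors are stored as integer codes pointing into the levels.
as.integer(ordered_factor)    # 3 2 1 2 3

# Ordered factors support comparisons; table() counts each level.
ordered_factor[1] > ordered_factor[3]   # TRUE (High > Low)
table(ordered_factor)                   # Low 1, Medium 2, High 2
```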