My Blog

My WordPress Blog

Data Manipulation with dplyr

In this example, we’ll use the dplyr package for data manipulation. We’ll filter, summarize, and arrange data.

Step 1: Install and Load dplyr

If you don’t have dplyr installed yet, you can install it with:

install.packages("dplyr")

Then, load the library:

library(dplyr)
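
The install and load steps can also be combined so that the package is only installed when it is missing. This is a minimal sketch rather than a required setup; it assumes a CRAN mirror is configured for your R session.

```r
# Install dplyr only if it is not already available on this machine
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Attach the package so filter(), summarize(), arrange() and %>% are available
library(dplyr)
```

This is convenient in scripts you share with others, because rerunning the script does not reinstall the package every time.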

Step 2: Create a Sample Dataset

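You can reuse the dataset from the previous example, or create a new one with 100 individuals and randomly generated ages, heights, and weights. Setting a seed makes the random values reproducible: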

# Create a sample dataset
set.seed(456)
data <- data.frame(
  id = 1:100,
  age = sample(18:65, 100, replace = TRUE),
  height = rnorm(100, mean = 170, sd = 10),
  weight = rnorm(100, mean = 70, sd = 15)
)
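
Before manipulating the data, it can help to check that it has the structure you expect. The sketch below recreates the same dataset so it runs on its own, then inspects it; glimpse() comes from dplyr and summary() from base R.

```r
library(dplyr)

# Recreate the sample dataset so this snippet is self-contained
set.seed(456)
data <- data.frame(
  id = 1:100,
  age = sample(18:65, 100, replace = TRUE),
  height = rnorm(100, mean = 170, sd = 10),
  weight = rnorm(100, mean = 70, sd = 15)
)

# Compact overview: one line per column with its type and first values
glimpse(data)

# Minimum, quartiles, mean, and maximum for each column
summary(data)
```

The dataset should have 100 rows and 4 columns, with every age between 18 and 65.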

Step 3: Data Manipulation

  1. Filtering Data: Let’s filter for individuals who are older than 30.
# Filter data for individuals older than 30
filtered_data <- data %>% filter(age > 30)
head(filtered_data)
  2. Summarizing Data: We can calculate the average height and weight for this filtered group, along with the number of individuals in it.
# Summarize to get mean height and weight for individuals older than 30
summary_stats <- filtered_data %>%
  summarize(
    mean_height = mean(height),
    mean_weight = mean(weight),
    count = n()
  )
print(summary_stats)
  3. Arranging Data: Sort the dataset by height in descending order.
# Arrange data by height in descending order
arranged_data <- data %>% arrange(desc(height))
head(arranged_data)
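
The three operations above can also be chained into a single pipeline. The sketch below is one possible way to do that, not part of the original walkthrough: it adds a hypothetical age_band column with mutate() and uses group_by() so that the summary is computed per band before sorting.

```r
library(dplyr)

# Recreate the sample dataset so this snippet is self-contained
set.seed(456)
data <- data.frame(
  id = 1:100,
  age = sample(18:65, 100, replace = TRUE),
  height = rnorm(100, mean = 170, sd = 10),
  weight = rnorm(100, mean = 70, sd = 15)
)

pipeline_result <- data %>%
  # Keep individuals older than 30
  filter(age > 30) %>%
  # age_band is a hypothetical helper column introduced for illustration
  mutate(age_band = ifelse(age <= 45, "31-45", "46-65")) %>%
  # Compute the summary separately for each age band
  group_by(age_band) %>%
  summarize(
    mean_height = mean(height),
    mean_weight = mean(weight),
    count = n()
  ) %>%
  # Put the band with the greatest mean height first
  arrange(desc(mean_height))

print(pipeline_result)
```

Chaining avoids intermediate variables such as filtered_data, which keeps short analyses easier to read from top to bottom.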