R Markdown is a free and open-source R package that provides an authoring framework for building data science projects. Using it, we can write a single .Rmd file that combines narrative, code, and data plots, and then render this file in a selected output format.
What is RStudio?
RStudio is an open-source IDE (integrated development environment) that is widely used as a graphical front-end for working with the R programming language starting from version 3.0.1. It has many helpful features that make it very popular among R users. To learn more about what RStudio is, how to install it, and how to begin using it, you can follow the RStudio Tutorial.
What is a factor in R?
A factor in R is a specific data type that accepts categories (aka levels) from a predefined set of possible values. These categories look like characters, but under the hood, they are stored as integers. Often, such categories have an intrinsic order. For example, a column in a data frame that contains the options of the Likert scale for assessing views (“strongly agree,” “agree,” “somewhat agree,” “neither agree nor disagree,” “somewhat disagree,” “disagree,” “strongly disagree”) should be of factor type to capture this intrinsic order and adequately reflect it on the categorical types of plots.
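As a minimal sketch of the Likert example above, the snippet below builds an ordered factor with base R's factor() function. The response values and the variable names (responses, likert_levels, opinion) are made up for illustration and are not part of the original text.

```r
# Hypothetical survey answers (illustrative data only).
responses <- c("agree", "strongly agree", "disagree", "agree")

# The predefined set of categories, listed from most negative to most positive.
likert_levels <- c("strongly disagree", "disagree", "somewhat disagree",
                   "neither agree nor disagree", "somewhat agree",
                   "agree", "strongly agree")

# ordered = TRUE tells R that the levels have an intrinsic order.
opinion <- factor(responses, levels = likert_levels, ordered = TRUE)

levels(opinion)          # the categories (levels) the factor accepts
as.integer(opinion)      # the integer codes stored under the hood
opinion[1] < opinion[2]  # comparisons follow the level order
table(opinion)           # counts per level, shown in level order
```

Because the levels are declared explicitly, categories that do not occur in the data still appear (with a count of zero) in tables and on categorical plot axes, in the declared order.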
How to remove columns from a data frame in R?
1. By using the select() function from the dplyr package of the tidyverse collection. The name of each column to delete is passed in with a minus sign before it. If there are many columns to delete, it makes more sense to list the columns to keep rather than the columns to remove. In this case, the syntax is similar, but the names of the columns to keep aren't preceded by a minus sign.
2. By using the built-in subset() function of base R. To delete only one column, we assign to the select parameter of the function the column name preceded by a minus sign. To delete more than one column, we assign to this parameter a vector of the column names, with a minus sign before the vector. As with select(), if there are many columns to delete, it makes more sense to list the columns to keep. In this case, the syntax is similar, but no minus sign is added.
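The two approaches described above can be sketched as follows. The data frame df and its column names are hypothetical, and the snippet assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical data frame used only for illustration.
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 score = c(90, 85, 70),
                 group = c("x", "y", "x"))

# 1. dplyr::select(): a minus sign before each column to delete ...
dropped_dplyr <- select(df, -name, -group)
# ... or list only the columns to keep, without minus signs.
kept_dplyr <- select(df, id, score)

# 2. Base R subset(): a minus sign before a single column name ...
dropped_one <- subset(df, select = -name)
# ... or before a vector of column names to delete several at once.
dropped_many <- subset(df, select = -c(name, group))
# Without the minus sign, the listed columns are kept instead.
kept_base <- subset(df, select = c(id, score))
```

Neither function modifies df in place; each call returns a new data frame, so the result has to be assigned to a name to be kept.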
Rich Ecosystem
R has a vast ecosystem of packages for various purposes, including data manipulation (dplyr), visualization (ggplot2), and machine learning (caret, tidymodels).
How do you add a new column to a data frame in R?
In each of the three cases, we can assign a single value or a vector, or calculate the new column based on the existing columns of that data frame or other data frames.
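The original examples for this answer are not preserved here, so the sketch below is an assumption: it shows three common base R ways of adding a column (the $ operator, double square brackets, and cbind()), which may or may not be the three cases the text refers to. The data frame and column names are hypothetical.

```r
# Hypothetical data frame used only for illustration.
df <- data.frame(item = c("pen", "book", "lamp"),
                 price = c(2, 12, 30))

# Assumed way 1: the $ operator, assigning a vector.
df$quantity <- c(10, 4, 1)

# Assumed way 2: double square brackets, computing from existing columns.
df[["total"]] <- df$price * df$quantity

# Assumed way 3: cbind(), assigning a single value that is recycled to every row.
df <- cbind(df, currency = "USD")

print(df)
```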
CRAN
The Comprehensive R Archive Network (CRAN) hosts thousands of R packages, making it one of the largest repositories for statistical software. It serves as a crucial resource for users to find and install additional functionality.
Open Source
R is free and open-source software, which means anyone can use, modify, and distribute it. This has led to a vibrant community contributing to its development.
Origin of the Name
The name “R” comes from the first letter of the first names of its creators, Ross Ihaka and Robert Gentleman. It also plays on the name of the earlier S programming language.
How to create a data frame in R?
1. From one or more vectors of the same length, by using the data.frame() function.
2. From a matrix, by using the data.frame() function.
3. From a list of vectors of the same length, by using the data.frame() function.
4. From other data frames.
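The four routes listed above can be sketched as follows. All object names and values are hypothetical. For the fourth route, the text does not say which functions it has in mind, so the use of rbind() and cbind() here is an assumption.

```r
# 1. From vectors of the same length.
names_vec <- c("Ann", "Bob", "Cy")
ages_vec <- c(28, 35, 41)
df_vectors <- data.frame(name = names_vec, age = ages_vec)

# 2. From a matrix (column names are taken from the matrix dimnames).
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("x", "y")))
df_matrix <- data.frame(m)

# 3. From a list of vectors of the same length.
lst <- list(name = names_vec, age = ages_vec)
df_list <- data.frame(lst)

# 4. From other data frames (assumed: rbind() stacks rows, cbind() joins columns).
df_rows <- rbind(df_vectors, df_list)    # same columns, rows appended
df_cols <- cbind(df_vectors, df_matrix)  # same number of rows, columns appended

str(df_cols)
```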