9 R introduction
As your first introduction to R, we will make use of Data Carpentry’s Intro to R and RStudio for Genomics course. This self-paced workshop walks through the basics of R - and its most popular IDE, RStudio - in the context of genomics. The focus lies on the basic syntax of R, wrangling tabular data using the tidyverse - where we will encounter the VCF file format once again (Section 7.1.1.1) - and visualization using ggplot2.
The Data Carpentry workshop will sometimes refer to running RStudio in a cloud environment with pre-installed packages and files, but when going through these materials on your own, we recommend installing R and RStudio on your own machine. You can do so by following the instructions listed here: https://rstudio-education.github.io/hopr/starting.html. You will also need to download the following two files, which are used throughout the workshop: combined_tidy_vcf.csv and Ecoli_metadata.xlsx.
If you do run into trouble installing R and RStudio on your own computer, you could instead:
- Make use of the free version of Posit Cloud (formerly known as RStudio Cloud): https://posit.cloud/plans/free. Note, however, that free users only receive a limited number of hours of use.
- Alternatively, you can again make use of a GitHub Codespace, but you will need to select a different variant this time: RStudio Server.
- As a final option, you could make use of a Binder RStudio instance, such as this one.
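
Whichever environment you end up using, a quick check along the following lines can confirm that everything is in place before starting the workshop. This is only a minimal sketch under a few assumptions: that the tidyverse and readxl packages are the ones needed to read and wrangle the two files, and that both files were saved in the current working directory. Adjust the paths if you stored them elsewhere.

```r
# Assumption: these are the packages needed for the workshop files
# (tidyverse for wrangling and plotting, readxl for the .xlsx file).
needed <- c("tidyverse", "readxl")

# Check which packages are not installed yet, without attaching them.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing <- needed[!installed]

if (length(missing) > 0) {
  message("Not yet installed: ", paste(missing, collapse = ", "))
  # install.packages(missing)  # uncomment to install them
}

# Assumption: both files sit in the current working directory.
vcf_path <- "combined_tidy_vcf.csv"
meta_path <- "Ecoli_metadata.xlsx"

# Read the CSV with base R and show its structure, if the file is present.
if (file.exists(vcf_path)) {
  variants <- read.csv(vcf_path)
  str(variants)
} else {
  message("File not found: ", vcf_path)
}

# Read the Excel file, if both readxl and the file are available.
if (file.exists(meta_path) && requireNamespace("readxl", quietly = TRUE)) {
  metadata <- readxl::read_excel(meta_path)
  print(head(metadata))
} else {
  message("Skipping ", meta_path, " (file or readxl package missing)")
}
```

If a file is reported as not found, `getwd()` shows which directory R is currently looking in.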