diff --git a/_site.yml b/_site.yml
index 6d8e9871..99203eaa 100644
--- a/_site.yml
+++ b/_site.yml
@@ -37,4 +37,6 @@ navbar:
href: home_precourse.html
- text: Info
href: home_info.html
+ - text: Projects
+ href: home_projects.html
diff --git a/data/slide_programming/Data_Information_Knowledge.png b/data/slide_programming/Data_Information_Knowledge.png
new file mode 100644
index 00000000..f9d4c696
Binary files /dev/null and b/data/slide_programming/Data_Information_Knowledge.png differ
diff --git a/data/slide_programming/Data_classification.png b/data/slide_programming/Data_classification.png
new file mode 100644
index 00000000..bdcab023
Binary files /dev/null and b/data/slide_programming/Data_classification.png differ
diff --git a/data/slide_r_environment/ggplot2_CRAN.png b/data/slide_r_environment/ggplot2_CRAN.png
new file mode 100644
index 00000000..0f0d0e20
Binary files /dev/null and b/data/slide_r_environment/ggplot2_CRAN.png differ
diff --git a/home_content.Rmd b/home_content.Rmd
index 43589f33..838c6e4a 100644
--- a/home_content.Rmd
+++ b/home_content.Rmd
@@ -37,6 +37,7 @@ This page contains links to different lectures (slides) and practical exercises
* [Working with Vectors (Lab)](lab_vectors.html)
* [Dataframes (Lab)](lab_dataframes.html)
* [Loops and functions (Slides)](slide_r_elements_4.html)
+* [Loops and functions (Lab)](lab_loops.html)
**Data wrangling**
diff --git a/home_precourse.Rmd b/home_precourse.Rmd
index c564c6ec..e03c7be1 100644
--- a/home_precourse.Rmd
+++ b/home_precourse.Rmd
@@ -72,7 +72,7 @@ Extra R packages used in the workshop exercises (if any) are listed below. It is
pkg<-unique(renv::dependencies()$Package)
-pkg_discard<-c("mkteachr")
+pkg_discard<-c("mkteachr", "manipulateWidget")
pkg_list<-pkg[!pkg %in% pkg_discard]
diff --git a/home_projects.Rmd b/home_projects.Rmd
new file mode 100644
index 00000000..ffd8e8e0
--- /dev/null
+++ b/home_projects.Rmd
@@ -0,0 +1,224 @@
+---
+title: "Projects"
+output:
+ bookdown::html_document2:
+ highlight: textmate
+ toc: false
+ toc_float:
+ collapsed: true
+ smooth_scroll: true
+ print: false
+ toc_depth: 4
+ number_sections: false
+ df_print: default
+ code_folding: none
+ self_contained: false
+ keep_md: false
+ encoding: 'UTF-8'
+ css: "assets/lab.css"
+ include:
+ after_body: assets/footer-lab.html
+---
+
+```{r,child="assets/header-lab.Rmd"}
+```
+
+Hands-on analysis of actual data is the best way to learn R programming. This page contains some data sets that you can use to explore what you have learned in this course. For each data set, a brief description as well as download instructions are provided.
+
+
+ Try to focus on using the tools from the course to explore the data, rather than worrying about producing a perfect report with a coherent analysis workflow.
+
+
+
+On the last day you will present your Rmd file (or rather, the resulting html report) and share with the class what your data was about.
+
+---
+
+## Palmer penguins 🐧
+
+- This is a data set containing a series of measurements for three penguin species collected at Palmer Station, Antarctica.
+- Data description:
+
+
+ Download instructions
+```{r, warning=F, message=F}
+penguins <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/heplots/peng.csv", header = T, sep = ",")
+str(penguins)
+```
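+
+A quick first look at the data could be something like this (a sketch; column names such as `species` are assumed from the data set's description):
+
+```{r, eval=F}
+# summarise every column and count observations per species
+summary(penguins)
+table(penguins$species)
+```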
+
+
+---
+
+## Drinking habits 🍷
+
+- Data from a national survey on the drinking habits of American citizens in 2001 and 2002.
+- Data description:
+
+
+ Download instructions
+```{r}
+library(dplyr)
+# this will download the csv file directly from the web
+drinks <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/stevedata/nesarc_drinkspd.csv", header = T, sep = ",")
+# the lines below will take a sample from the full data set
+set.seed(seed = 2)
+drinks <- sample_n(drinks, size = 3000, replace = F)
+# and here we check the structure of the data
+str(drinks)
+```
+
+
+---
+
+## Car crashes 🚗
+
+- Data from car accidents in the US between 1997 and 2002.
+- Data description:
+
+
+ Download instructions
+```{r}
+library(dplyr)
+# this will download the csv file directly from the web
+crashes <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/DAAG/nassCDS.csv", header = T, sep = ",")
+# the lines below will take a sample from the full data set
+set.seed(seed = 2)
+crashes <- sample_n(crashes, size = 3000, replace = F)
+# and here we check the structure of the data
+str(crashes)
+```
+
+
+---
+
+## Gapminder health and wealth 📈
+
+- This is a collection of country indicators from the Gapminder dataset for the years 2000-2016.
+- Data description:
+
+
+ Download instructions
+```{r}
+library(dplyr)
+# this will download the csv file directly from the web
+gapminder <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/dslabs/gapminder.csv", header = T, sep = ",")
+# here we filter the data to remove anything before the year 2000
+gapminder <- gapminder |> filter(year >= 2000)
+# and here we check the structure of the data
+str(gapminder)
+```
+
+
+---
+
+## StackOverflow survey 🖥️
+
+- This is a downsampled and modified version of one of StackOverflow's annual surveys where users respond to a series of questions related to careers in technology and coding.
+- Data description:
+
+
+ Download instructions
+```{r}
+library(dplyr)
+# this will download the csv file directly from the web
+stackoverflow <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/modeldata/stackoverflow.csv", header = T, sep = ",")
+# the lines below will take a sample from the full data set
+set.seed(2)
+stackoverflow <- sample_n(stackoverflow, size = 3000)
+# and here we check the structure of the data
+str(stackoverflow)
+```
+
+
+---
+
+## Doctor visits 🤒
+
+- Data on the frequency of doctor visits in the past two weeks in Australia for the years 1977 and 1978.
+- Data description:
+
+
+ Download instructions
+```{r}
+library(dplyr)
+# this will download the csv file directly from the web
+doctor <- read.table("https://vincentarelbundock.github.io/Rdatasets/csv/AER/DoctorVisits.csv", header = T, sep = ",")
+# the lines below will take a sample from the full data set
+set.seed(2)
+doctor <- sample_n(doctor, size = 3000)
+# and here we check the structure of the data
+str(doctor)
+```
+
+
+---
+
+## Video Game Sales 🎮
+
+- This data set contains sales figures for video game titles released in 2001 and 2002.
+- Data description:
+ - Click on "Preview Data" and "VG Data Dictionary" to see the description for each column.
+
+
+ Download instructions
+```{r, warning=F, message=F}
+library(dplyr)
+library(lubridate)
+# this will download the file to your working directory
+download.file(url = "https://maven-datasets.s3.amazonaws.com/Video+Game+Sales/Video+Game+Sales.zip", destfile = "video_game_sales.zip")
+# this will unzip the file and read it into R
+videogames <- read.table(unz("video_game_sales.zip", filename = "vgchartz-2024.csv"), header = T, sep = ",", quote = "\"", fill = T)
+# this will select rows corresponding to years 2001 and 2002
+videogames <- filter(videogames, year(as_date(release_date)) %in% c(2001,2002))
+# and here we check the structure of the data
+str(videogames)
+```
+
+
+---
+
+## LEGO Sets 🏗️
+
+- This data set contains the description of all LEGO sets released from 2000 to 2009.
+- Data description:
+ - Click on "Preview Data" and "Data Dictionary" to see the description for each column.
+
+
+ Download instructions
+```{r, warning=F, message=F}
+library(dplyr)
+# this will download the file to your working directory
+download.file(url = "https://maven-datasets.s3.amazonaws.com/LEGO+Sets/LEGO+Sets.zip", destfile = "lego.csv.zip")
+# this will unzip the file and read it into R
+lego <- read.table(unz("lego.csv.zip", filename = "lego_sets.csv"), header = T, sep = ",", quote = "\"", fill = T)
+# this will select rows corresponding to years 2000-2009
+lego <- filter(lego, year %in% seq(2000,2009,1))
+# and here we check the structure of the data
+str(lego)
+```
+
+
+---
+
+## Shark attacks 🦈
+
+- This data set contains information on shark attack records from all over the world.
+- Data description:
+ - Click on "Preview Data" and "Data Dictionary" to see the description for each column.
+
+
+ Download instructions
+```{r, warning=F, message=F}
+library(dplyr)
+# this will download the file to your working directory
+download.file(url = "https://maven-datasets.s3.amazonaws.com/Shark+Attacks/attacks.csv.zip", destfile = "attacks.csv.zip")
+# this will unzip the file and read it into R
+sharks <- read.table(unz("attacks.csv.zip", filename = "attacks.csv"), header = T, sep = ",", quote = "\"", fill = T)
+# the lines below will take a sample from the full data set
+set.seed(seed = 2)
+sharks <- sample_n(sharks, size = 3000, replace = F)
+str(sharks)
+```
+
+
+***
diff --git a/schedule.csv b/schedule.csv
index 0e0dd773..8f6fb454 100644
--- a/schedule.csv
+++ b/schedule.csv
@@ -1,29 +1,37 @@
date;room;start_time;end_time;topic;teacher;assistant;link_slide;link_lab;link_room
-23/10/2023;Tripplet room;09:00;09:15;Welcome;Nima;NR, PA;;;
-;;09:15;09:30;Intro to R;Nima;NR, PA;slide_r_intro.html;;
-;;09:30;10:00;Intro to R programming;Nima;NR, PA;slide_r_programming_1.html;;
-;;10:15;10:45;Intro to R environment;Nima;NR, PA;slide_r_environment.html;;
-;;11:00;12:00;Using Rstudio;Nima;NR, PA;;https://www.dropbox.com/s/3sy4ou2o8jh5syf/RCourseVideo.mov?dl=0;
+28/10/2024;Experimental room;09:00;09:15;Welcome;Nima;;;;
+;;09:15;10:00;Introduction;Nima;;slide_r_intro.html;;
+;;10:00;11:00;Using Rstudio;Nima;;;https://youtu.be/suX6nsSUXDw?si=Vs1e22GU6UJ4Ty7u;
+;;11:00;12:00;Essential: Variable & operators;Nima;;slide_r_elements_1.html;;
;;12:00;13:00;Lunch;;;;;
-;;13:00;15:00;Variables & Operators;Nima;NR, PA, GD;slide_r_elements_1.html;;
-;;15:00;17:00;Data types;Nima;NR, PA, GD;;lab_datatypes.html;
-24/10/2023;Tripplet room;09:00;10:00;Vectors & Strings;Sebastian DiLorenzo;NR, PA, SD, GD;slide_r_elements_2.html;;
-;;10:00;11:00;Matrices, Lists and Dataframes;Prasoon;NR, PA, SD, GD;slide_r_elements_3.html;;
-;;11:00;12:00;Working with Vectors;Sebastian DiLorenzo;NR, PA, SD, GD;;lab_vectors.html;
+;;13:00;13:15;Projects and group discussion;Guilherme;;;;
+;;13:15;14:00;Essential: data types;Guilherme;;;lab_datatypes.html;
+;;14:00;15:00;Essential: Vectors & Strings;Guilherme;;slide_r_elements_2.html;;
+;;15:00;16:00;Essential: Working with Vectors;Guilherme;;;lab_vectors.html;
+;;16:00;17:00;Group discussion on projects;;;;;
+29/10/2024;Experimental room;09:00;10:00;Essential: Matrices, Lists and Dataframes;Guilherme;;slide_r_elements_3.html;;
+;;10:00;11:00;Essential: Matrices, Lists and Dataframes;Guilherme;;;lab_dataframes.html;
+;;11:00;12:00;Loading data into R;Guilherme;;slide_loading_data.html;;
;;12:00;13:00;Lunch;;;;;
-;;13:00;17:00;Working with Matrices, Lists and Dataframes;Prasoon;NR, PA, SD;;lab_dataframes.html;
-25/10/2023;Tripplet room;09:00;10:00;Loading data into R;Sebastian DiLorenzo;NR, PA, GD, SD;slide_loading_data.html;;
-;;10:00;12:00;Loading data into R;Sebastian DiLorenzo;NR, PA, GD, SD;;lab_loadingdata.html;
+;;13:00;15:00;Loading data into R;Guilherme;;;lab_loadingdata.html;
+;;15:00;15:30;Essential: Basic statistics;Nima;;slide_r_basic_statistic.html;;
+;;15:30;16:00;Essential: Basic statistics;Nima;;;;
+;;16:00;17:00;Group discussion on projects;;;;;
+30/10/2024;Experimental room;09:00;10:00;Essential: Loops, Conditionals, Functions;Miguel;;slide_r_elements_4.html;;
+;;10:00;12:00;Essential: Loops, Conditionals, Functions;Miguel;;;lab_loops.html;
;;12:00;13:00;Lunch;;;;;
-;;13:00;14:00;Control Structures, Iteration;Nima;NR, PA, GD;slide_r_elements_4.html;;
-;;14:00;17:00;Loops, Conditionals, Functions;Nima;NR, PA, GD;;lab_loops.html;
-26/10/2023;Tripplet room;09:00;10:00;Base graphics;Prasoon;PA, NR, GD;slide_base_graphics.html;;
-;;10:00;12:00;Base graphics;Prasoon;PA, NR, GD;;lab_graphics.html;
+;;13:00;14:00;Intro to Tidyverse;Marcin;;slide_tidyverse.html;;
+;;14:00;16:00;Intro to Tidyverse;Marcin;;;lab_tidyverse.html;
+;;16:00;17:00;Group discussion on projects;;;;;
+31/10/2024;Experimental room;09:00;10:00;Base graphics;Nima;;slide_base_graphics.html;;
+;;10:00;12:00;Base graphics;Nima;;;lab_graphics.html;
;;12:00;13:00;Lunch;;;;;
-;;13:00;14:00;Intro to Tidyverse;Marcin Kierczak;MK, PA, GD, NR;slide_tidyverse.html;;
-;;14:00;17:00;Intro to Tidyverse;Marcin Kierczak;MK, PA, GD, NR;;lab_tidyverse.html;
-27/10/2023;Tripplet room;09:00;10:00;Graphics using ggplot2;Prasoon;PA, NR;slide_ggplot2.html;;
-;;10:00;11:00;Topic of your interest;Nima/Prasoon;NR, PA;;;
-;;11:00;12:00;Q&A;Nima/Prasoon;NR, PA, MR;;;
+;;13:00;14:00;Graphics using ggplot2;Lokesh;;slide_ggplot2.html;;
+;;14:00;16:00;Working with ggplot2;Lokesh;;;lab_ggplot2.html;
+;;16:00;17:00;Group discussion on projects;;;;;
+1/11/2024;Experimental room;09:00;10:00;Group discussion on projects;;;;;
+;;10:00;12:00;Group discussion on projects;;;;;
;;12:00;13:00;Lunch;;;;;
-;;13:00;16:00;Working with ggplot2;Prasoon;PA, NR, MR;;lab_ggplot2.html;
\ No newline at end of file
+;;13:00;14:30;Group presentation;;;;;
+;;14:30;15:00;Q & A;;;;;
\ No newline at end of file
diff --git a/slide_loading_data.Rmd b/slide_loading_data.Rmd
index c9c7bdd0..c492ae14 100644
--- a/slide_loading_data.Rmd
+++ b/slide_loading_data.Rmd
@@ -1,6 +1,6 @@
---
title: "Reading (and writing) data in R"
-subtitle: "R Foundations for Life Scientists"
+subtitle: "R Foundations for Data Analysis"
author: "Marcin Kierczak"
keywords: "bioinformatics, course, scilifelab, nbis, R"
output:
@@ -46,15 +46,9 @@ name: reading_data
# Reading data
---
-
-* Reading data is one of the most consuming and most cumbersome aspects of bioinformatics...
-
---
-
-* R provides a number of ways to read and write data stored on different media (file, database, url, twitter, Facebook, etc.) and in different formats.
+* Reading data can be one of the most time-consuming and cumbersome aspects of data analysis.
---
+* R provides ways to read and write data stored on different media (e.g. file, database, URL) and in different formats.
* Package `foreign` contains a number of functions to import less common data formats.
@@ -63,11 +57,11 @@ name: reading_tables
# Reading tables
-Most often, we will use the `read.table()` function. It is really, really flexible and nice way to read your data into a data.frame structure with rows corresponding to observations and columns to particular variables.
+We can use the `read.table()` function. It is a flexible way to read your data into a data frame, with rows corresponding to observations and columns to variables.
The function is declared in the following way:
-```
+```{r, echo=T, eval=F}
read.table(file, header = FALSE, sep = "", quote = "\"'",
dec = ".", numerals = c("allow.loss", "warn.loss", "no.loss"),
row.names, col.names, as.is = !stringsAsFactors,
@@ -77,28 +71,33 @@ read.table(file, header = FALSE, sep = "", quote = "\"'",
comment.char = "#",
allowEscapes = FALSE, flush = FALSE,
stringsAsFactors = default.stringsAsFactors(),
- fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)*
+ fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)
+
+# or just
+read.table(file)
```
---
name: read_table_params
-# `read.table` parameters
+# `read.table()` parameters
+
+You can read all about the `read.table()` function using `?read.table`.
-You can read more about the *read.table* function on its man page, but the most important arguments are:
+The most important arguments are:
-* file – the path to the file that contains data,
-* header – a logical indicating whether the first line of the file contains variable names,
-* sep – a character determining variable delimiter, e.g. comma for csv files,
-* quote – a character telling R which character surrounds strings,
-* dec – acharacter determining the decimal separator,
-* row/col.names – vectors containing row and column names,
-* na.strings – a character used for missing data,
-* nrows – how many rows should be read,
-* skip – how many rows to skip,
-* as.is – a vector of logicals or numbers indicating which columns shall not be converted to factors,
-* fill – add NA to the end of shorter rows,
-* stringsAsFactors – a logical. Rather self explanatory.
+* **file** – the path to the file that contains data, e.g. `/path/to/my/file.csv`
+* **header** – a logical indicating whether the first line of the file contains variable names,
+* **sep** – a character determining variable delimiter, e.g. `","` for csv files,
+* **quote** – a character telling R which character surrounds strings,
+* **dec** – character determining the decimal separator,
+* **row/col.names** – vectors containing row and column names,
+* **na.strings** – a character used for missing data,
+* **nrows** – how many rows should be read,
+* **skip** – how many rows to skip,
+* **as.is** – a vector of logicals or numbers indicating which columns shall not be converted to factors,
+* **fill** – add NA to the end of shorter rows,
+* **stringsAsFactors** – a logical indicating whether character columns should be converted to factors.
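+
+A minimal sketch combining a few of these arguments (the file name `phenos.txt` is just a placeholder):
+
+```{r read-example, echo=T, eval=F}
+# a semicolon-separated file with a header row and comma as decimal separator
+phenos <- read.table("phenos.txt", header = TRUE, sep = ";",
+                     dec = ",", na.strings = "NA")
+```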
---
name: read_table_sibs
@@ -130,37 +129,25 @@ name: handling_errors
# What if you encounter errors?
-* StackOverflow,
+* R documentation `?` and `??`
* Google – just type R and copy the error you got without your variable names,
-* open the file – has the header line the same number of columns as the first line?
-* in Terminal (on Linux/OsX) you can type some useful commands.
+* Open the file using a text editor and see if you can spot anything unusual –
+ * e.g. has the header line the same number of columns as the first line?
--
-# Useful commands for debugging
-
---
+# Useful terminal commands for debugging (Linux/macOS)
* `cat phenos.txt | awk -F';' '{print NF}'` prints the number of words in each row. `-F';'` says that semicolon is the delimiter,
---
-
* `head -n 5 phenos.txt` prints the 5 first lines of the file,
---
-
* `tail -n 5 phenos.txt` prints the 5 last lines of the file,
---
-
* `head -n 5 phenos.txt | tail -n 2` will print lines 4 and 5...
---
-
* `wc -l phenos.txt` will print the number of lines in the file
---
-
* `head -n 2 phenos.txt > test.txt` will write the first 2 lines to a new file
--
@@ -176,19 +163,21 @@ name: writing
# Writing with `write.table()`
-`read.table()` has its counterpart, the `write.table()` function (as well ass its siblings, like write.csv()). You can read more about it in the documentation, let us show some examples:
+`read.table()` has its counterpart, the `write.table()` function (as well as its siblings, like `write.csv()`). You can read more about it in the documentation; here are some examples:
```{r write.table, echo=T, eval=F}
vec <- rnorm(10)
write.table(vec, '') # write to screen
write.table(vec, file = 'vector.txt')
+
# write to the system clipboard, handy!
-write.table(vec, 'clipboard', col.names=F,
- row.names=F)
+write.table(vec, 'clipboard', col.names=F, row.names=F)
+
+# or on macOS
clip <- pipe("pbcopy", "w")
write.table(vec, file=clip)
close(clip)
+
# To use in a spreadsheet
write.csv(vec, file = 'spreadsheet.csv')
```
@@ -213,12 +202,12 @@ name: read_xls_matlab
```{r xls, eval=F, echo=T}
library(readxl)
-data <- readxl::read_xlsx('myfile.xlsx')
+data <- read_xlsx('myfile.xlsx')
```
```{r matlab, eval=F, echo=T}
library(R.matlab)
-data <- R.matlab::readMat("mydata.mat")
+data <- readMat("mydata.mat")
```
---
diff --git a/slide_r_basic_statistic.Rmd b/slide_r_basic_statistic.Rmd
new file mode 100644
index 00000000..3459694e
--- /dev/null
+++ b/slide_r_basic_statistic.Rmd
@@ -0,0 +1,428 @@
+---
+title: "Brief introduction to statistics"
+subtitle: "Statistics"
+author: "Nima Rafati"
+keywords: bioinformatics, course, scilifelab, nbis, R
+output:
+ xaringan::moon_reader:
+ encoding: 'UTF-8'
+ self_contained: false
+ chakra: 'assets/remark-latest.min.js'
+ css: 'assets/slide.css'
+ lib_dir: libs
+ include: NULL
+ nature:
+ ratio: '4:3'
+ highlightLanguage: r
+ highlightStyle: github
+ highlightLines: true
+ countIncrementalSlides: false
+ slideNumberFormat: "%current%/%total%"
+---
+
+exclude: true
+count: false
+
+```{r,echo=FALSE,child="assets/header-slide.Rmd"}
+```
+
+
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo=TRUE, width=60)
+```
+
+```{r,include=FALSE}
+# load the packages you need
+#library(dplyr)
+#library(tidyr)
+#library(stringr)
+#library(ggplot2)
+#library(mkteachr)
+```
+
+---
+name: intro
+
+# Introduction
+
+**Why do we need statistics in our analysis?**
+
+- Make data understandable and insightful.
+
+- Evaluate patterns and trends.
+
+- Identify and quantify differences/similarities between groups.
+
+
+--
+
+
+**Types of statistics:**
+
+- Descriptive statistics: To summarize and describe main features of a dataset (Mean, median,...).
+
+- Inferential statistics: To make predictions or inferences about a population using a sample of data (Hypothesis testing, regression analysis,...).
+
+- Predictive statistics: To make predictions about future outcomes based on collected data (Regression models, time series forecasting, machine learning,...).
+
+- ......
+
+
+---
+name: Descriptive
+# Types of Descriptive Statistics
+
+Descriptive statistics helps to:
+
+- Summarize and describe the data.
+
+- Visualize the data.
+
+- Identify patterns (trends) and outliers in the data.
+
+- Provide insights for downstream analysis.
+
+---
+name: SomeStats
+# Some of the basic descriptive statistics
+
+1. **Measures of Central Tendency**
+ - Mean, Median, Mode.
+2. **Measures of Spread**
+ - Range, Interquartile Range, Standard Deviation, Variance.
+3. **Correlation**
+ - Relation between two variables (e.g. Pearson's correlation).
+
+---
+name: Mean
+# Central Tendency: Mean
+- Mean: The average value of the data.
+$$
+\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
+$$
+
+```{r Mean, eval = T, echo = F, fig.width = 10, fig.height=4}
+set.seed(123)
+par(mfrow = c(1, 2), mar = c(5, 4, 4, 2) + 0.1)
+data <- data.frame( var1 = rgamma(10000, shape = 2, scale = 2) * 12,
+ var2 = rnorm(10000, mean = 100, sd = 20))
+hist(data$var1,breaks = 50, main = 'var1 distribution', xlab = 'var1', col = 'skyblue', freq = TRUE)
+var1_mean = mean(data$var1)
+# Mean
+abline(v = var1_mean, col = 'red', lwd = 2)
+text(x = var1_mean + 10 , y = 700, labels = paste("Mean =", round(var1_mean, 2)), pos = 4, col = 'red', cex = 0.8)
+
+hist(data$var2,breaks = 50, main = 'var2 distribution', xlab = 'var2', col = 'skyblue', freq = TRUE)
+var2_mean = mean(data$var2)
+var2_median = median(data$var2)
+# Mean
+abline(v = var2_mean, col = 'red', lwd = 2)
+text(x = var2_mean + 10 , y = 700, labels = paste("Mean =", round(var2_mean, 2)), pos = 4, col = 'red', cex = 0.8)
+```
+
+```{r mean, eval = T, echo = T}
+mean(data$var1)
+mean(data$var2)
+```
+
+
+---
+name: Median
+
+# Central Tendency: Median
+
+- Median: The middle value when the data is sorted.
+```{r Median, eval = T, echo = F, fig.width = 10, fig.height=5}
+par(mfrow=c(1,2))
+hist(data$var1,breaks = 50, main = 'var1 distribution', xlab = 'var1', col = 'skyblue', freq = TRUE)
+var1_mean = mean(data$var1)
+var1_median = median(data$var1)
+# Mean
+abline(v = var1_mean, col = 'red', lwd = 2)
+text(x = var1_mean + 10 , y = 400, labels = paste("Mean =", round(var1_mean, 2)), pos = 4, col = 'red')
+# Median
+abline(v = var1_median, col = 'green', lwd = 2)
+text(x = var1_median + 10 , y = 500, labels = paste("Median =", round(var1_median, 2)), pos = 4, col = 'green')
+
+hist(data$var2,breaks = 50, main = 'var2 distribution', xlab = 'var2', col = 'skyblue', freq = TRUE)
+var2_mean = mean(data$var2)
+var2_median = median(data$var2)
+# Mean
+abline(v = var2_mean, col = 'red', lwd = 2)
+text(x = var2_mean + 30 , y = 400, labels = paste("Mean =", round(var2_mean, 2)), pos = 4, col = 'red')
+# Median
+abline(v = var2_median, col = 'green', lwd = 2)
+text(x = var2_median + 30 , y = 600, labels = paste("Median =", round(var2_median, 2)), pos = 4, col = 'green')
+```
+
+```{r}
+median(data$var1)
+median(data$var2)
+```
+
+---
+name: Mode
+# Central Tendency: Mode
+
+- Mode: The most frequently occurring value.
+```{r Mode-plot, eval = T, echo = F, fig.width = 10, fig.height=5}
+par(mfrow=c(1,2))
+hist(data$var1,breaks = 50, main = 'var1 distribution', xlab = 'var1', col = 'skyblue', freq = TRUE)
+var1_mean = mean(data$var1)
+var1_median = median(data$var1)
+# Mean
+abline(v = var1_mean, col = 'red', lwd = 2)
+text(x = var1_mean + 10 , y = 400, labels = paste("Mean =", round(var1_mean, 2)), pos = 4, col = 'red')
+# Median
+abline(v = var1_median, col = 'green', lwd = 2)
+text(x = var1_median + 10 , y = 500, labels = paste("Median =", round(var1_median, 2)), pos = 4, col = 'green')
+# Mode
+density_data <- density(data$var1)
+var1_mode <- density_data$x[which.max(density_data$y)]
+abline(v = var1_mode, col = 'purple', lwd = 2)
+text(x = var1_mode + 10 , y = 600, labels = paste("Mode =", round(var1_mode, 2)), pos = 4, col = 'purple')
+
+
+hist(data$var2,breaks = 50, main = 'var2 distribution', xlab = 'var2', col = 'skyblue', freq = TRUE)
+var2_mean = mean(data$var2)
+var2_median = median(data$var2)
+# Mean
+abline(v = var2_mean, col = 'red', lwd = 2)
+text(x = var2_mean + 30 , y = 400, labels = paste("Mean =", round(var2_mean, 2)), pos = 4, col = 'red')
+# Median
+abline(v = var2_median, col = 'green', lwd = 2)
+text(x = var2_median + 30 , y = 600, labels = paste("Median =", round(var2_median, 2)), pos = 4, col = 'green')
+# Mode
+density_data <- density(data$var2)
+var2_mode <- density_data$x[which.max(density_data$y)]
+abline(v = var2_mode, col = 'purple', lwd = 2)
+text(x = var2_mode - 90 , y = 600, labels = paste("Mode =", round(var2_mode, 2)), pos = 4, col = 'purple')
+```
+
+```{r mode, echo = T, eval = T}
+# Note: R's mode() returns the storage mode (e.g. "numeric"),
+# not the statistical mode. Estimate the mode as the peak of
+# a kernel density instead:
+density(data$var1)$x[which.max(density(data$var1)$y)]
+density(data$var2)$x[which.max(density(data$var2)$y)]
+```
+---
+name: Spread
+# Measures of spread: Range and Interquartile Range
+- Range: The difference between the maximum (`max(data$var2)`) and the minimum (`min(data$var2)`).
+- Interquartile Range: The sorted data is split into four equally sized groups whose boundaries are the **quartiles**; the distance between the first and third quartiles is called the **Interquartile Range** (IQR).
+
+```{r range, echo = F, eval = T}
+# Sample data
+set.seed(123)
+data_quartile <- c(24, 30, 33, 45, 47, 58, 60, 66, 70)
+
+# Calculate min, Q1, Q2 (median), Q3, max, IQR, and range
+min_val <- min(data_quartile)
+q1 <- quantile(data_quartile, 0.25)
+median_val <- median(data_quartile)
+q3 <- quantile(data_quartile, 0.75)
+max_val <- max(data_quartile)
+iqr_val <- IQR(data_quartile)
+range_val <- max_val - min_val
+
+# Plot the main line and quartiles
+plot(c(1, 9), c(0, 1), type = "n", xlab = "", ylab = "", xaxt = "n", yaxt = "n", bty = "n")
+
+# Main line (the range of the data)
+segments(1, 0.5, 9, 0.5, lwd = 2)
+
+# Draw vertical lines at min, Q1, median (Q2), Q3, max
+segments(1, 0.45, 1, 0.55, lwd = 2) # Min
+segments(3, 0.45, 3, 0.55, lwd = 2, col = "orange") # Q1
+segments(5, 0.45, 5, 0.55, lwd = 2, col = "red") # Q2 (Median)
+segments(7, 0.45, 7, 0.55, lwd = 2, col = "orange") # Q3
+segments(9, 0.45, 9, 0.55, lwd = 2) # Max
+
+# Add the values on top
+text(1, 0.6, min_val, cex = 1)
+text(3, 0.6, q1, cex = 1)
+text(5, 0.6, median_val, cex = 1, col = "red")
+text(7, 0.6, q3, cex = 1)
+text(9, 0.6, max_val, cex = 1)
+
+# Add labels for Min, Q1, Q2, Q3, Max
+text(1, 0.4, "Min", cex = 1, col = "blue")
+text(3, 0.4, "Q1", cex = 1, col = "blue")
+text(5, 0.4, "Q2", cex = 1, col = "blue")
+text(7, 0.4, "Q3", cex = 1, col = "blue")
+text(9, 0.4, "Max", cex = 1, col = "blue")
+
+# Add the IQR and Range arrows and labels
+arrows(3, 0.3, 7, 0.3, length = 0.1)
+text(5, 0.25, paste("IQR = Q3 - Q1 =", round(iqr_val, 2)), cex = 1)
+
+arrows(1, 0.2, 9, 0.2, length = 0.1)
+text(5, 0.15, paste("Range = Max - Min =", range_val), cex = 1)
+
+```
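+
+The same quantities can be computed directly in R (using the small `data_quartile` vector behind the figure above):
+
+```{r iqr, echo = T, eval = T}
+quantile(data_quartile)        # Min, Q1, Q2 (median), Q3, Max
+IQR(data_quartile)             # Q3 - Q1
+diff(range(data_quartile))     # Max - Min
+```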
+
+---
+name: Variance
+# Measures of spread: Variance
+
+- Variance: How far the data points spread out from the mean. Its unit is the square of the data's unit (e.g. $cm^2$).
+
+$$
+\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
+$$
+```{r var, echo=TRUE, eval=TRUE}
+var(data$var2)
+```
+---
+name: Stdev
+# Measures of spread: Standard deviation
+
+- Standard deviation (sd): The square root of the variance; a more intuitive measure of spread. Unlike the variance, the sd has the same unit as the data (e.g. cm).
+
+$$
+\sigma = \sqrt{\sigma^2}
+$$
+
+```{r sd-plot,echo = F, eval = T}
+var2_sd <- sd(data$var2)
+hist(data$var2,breaks = 50, main = 'var2 distribution', xlab = 'var2', col = 'skyblue', freq = TRUE, ylim = c(0,1200))
+abline(v = var2_mean, col = 'red', lwd = 2)
+rect(var2_mean - var2_sd, 0, var2_mean + var2_sd, 1100, col = rgb(0.9, 0.9, 0.9, 0.5), border = NA)
+rect(var2_mean - 2*var2_sd, 0, var2_mean - var2_sd, 1100, col = rgb(0.7, 0.7, 0.7, 0.5), border = NA)
+rect(var2_mean + 2*var2_sd, 0, var2_mean + var2_sd, 1100, col = rgb(0.7, 0.7, 0.7, 0.5), border = NA)
+
+rect(var2_mean - 3*var2_sd, 0, var2_mean - 2*var2_sd, 1100, col = rgb(0.5, 0.5, 0.5, 0.5), border = NA)
+rect(var2_mean + 3*var2_sd, 0, var2_mean + 2*var2_sd, 1100, col = rgb(0.5, 0.5, 0.5, 0.5), border = NA)
+
+text(x = var2_mean - 1 , y = 1200, labels = expression(bar(x)), pos = 4, col = 'red', cex = 0.8)
+text(x = var2_mean + 5, y = 1100, labels = expression(bar(x) + sd), pos = 4, col = 'black', cex = 0.8)
+text(x = var2_mean - var2_sd , y = 1100, labels = expression(bar(x) - sd), pos = 4, col = 'black', cex = 0.8)
+
+text(x = var2_mean + 2*var2_sd - 15 , y = 1100, labels = expression(bar(x) + 2*sd), pos = 4, col = 'black', cex = 0.7)
+text(x = var2_mean - 2*var2_sd , y = 1100, labels = expression(bar(x) - 2*sd), pos = 4, col = 'black', cex = 0.7)
+
+text(x = var2_mean + 3*var2_sd - 15 , y = 1100, labels = expression(bar(x) + 3*sd), pos = 4, col = 'black', cex = 0.7)
+text(x = var2_mean - 3*var2_sd , y = 1100, labels = expression(bar(x) - 3*sd), pos = 4, col = 'black', cex = 0.7)
+
+```
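+
+In R, the standard deviation and its relationship to the variance are simply:
+
+```{r sd, echo = T, eval = T}
+sd(data$var2)
+sqrt(var(data$var2))  # the sd is the square root of the variance
+```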
+---
+name: correlation
+# Correlation
+
+- Measuring the strength and direction of the **linear** relationship between two variables.
+
+ - Positive Correlation: As one variable increases, the other also increases.
+
+ - Negative Correlation: As one variable increases, the other decreases.
+
+ - No Correlation: No directional relationship between the variables.
+
+---
+name: Pearson
+# Types of correlation
+- Pearson's correlation coefficient: Correlation of two **continuous** variables.
+- Assumptions:
+ - Linear relationship.
+ - Normally distributed variables.
+
+```{r pearson,echo = F, eval = T, fig.width=10, fig.height=5}
+set.seed(123)
+
+# Generate data for perfect positive correlation
+x_pos <- seq(1, 100, length.out = 100)
+y_pos <- x_pos + rnorm(100, mean = 0, sd = 1) # adding a tiny bit of noise for realism
+
+# Generate data for perfect negative correlation
+x_neg <- seq(1, 100, length.out = 100)
+y_neg <- -x_neg + rnorm(100, mean = 0, sd = 1)
+
+# Generate data for no correlation
+x_none <- seq(1, 100, length.out = 100)
+y_none <- rnorm(100, mean = 50, sd = 20)
+
+# Combine all datasets into a data frame
+data <- data.frame(
+ x_pos = x_pos,
+ y_pos = y_pos,
+ x_neg = x_neg,
+ y_neg = y_neg,
+ x_none = x_none,
+ y_none = y_none
+)
+
+# Plot the data to visualize the correlations
+par(mfrow = c(1, 3), mar = c(5, 4, 4, 5) + 0.1)
+
+# Positive correlation
+plot(data$x_pos, data$y_pos, main = paste0("Positive (r =", round(cor(data$x_pos, data$y_pos), digits = 4), ")"), xlab = "X", ylab = "Y", col = "blue", pch = 19)
+abline(lm(data$y_pos ~ data$x_pos), col = "red", lwd = 2)
+
+# Negative correlation
+plot(data$x_neg, data$y_neg, main = paste0("Negative (r =", round(cor(data$x_neg, data$y_neg), digits = 4), ")"), xlab = "X", ylab = "Y", col = "blue", pch = 19)
+abline(lm(data$y_neg ~ data$x_neg), col = "red", lwd = 2)
+
+# No correlation
+plot(data$x_none, data$y_none, main = paste0("No Correlation (r =", round(cor(data$x_none, data$y_none), digits = 2), ")"), xlab = "X", ylab = "Y", col = "blue", pch = 19)
+abline(lm(data$y_none ~ data$x_none), col = "red", lwd = 2)
+
+```
+
+$$
+r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}
+$$
+
+
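+The formula above can be checked step by step in R; a minimal sketch with illustrative data, comparing the manual computation with `cor()`:
+
+```{r pearson_manual,eval=FALSE}
+x <- c(1, 2, 3, 4, 5)
+y <- c(2, 4, 5, 4, 6)
+# Numerator: sum of products of the centred values
+num <- sum((x - mean(x)) * (y - mean(y)))
+# Denominator: square root of the product of the sums of squares
+den <- sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))
+num / den        # identical to cor(x, y)
+cor(x, y)
+```
+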
+---
+name: Spearman
+# Types of correlation
+- Spearman's rank correlation coefficient: Measures the monotonic relationship between two **ranked** variables.
+- Assumptions:
+  - It is a non-parametric approach and does not require a linear relationship.
+  - The data does not need to be normally distributed.
+  - It works for both **continuous** and ordinal (categorical) variables.
+```{r spearman,echo = F, eval = T, fig.width=8, fig.height=4}
+# Create the ordinal dataset
+data_ordinal <- data.frame(
+ Satisfaction = c(5, 4, 3, 2, 1, 4, 5, 2, 3, 1),
+ Performance = c(9, 8, 7, 3, 2, 6, 10, 1, 5, 4)
+)
+
+# Calculate Spearman's rank correlation
+spearman_corr <- cor(data_ordinal$Satisfaction, data_ordinal$Performance, method = "spearman")
+
+# Plot to visualize the relationship
+plot(data_ordinal$Satisfaction, data_ordinal$Performance,
+ xlab = "Satisfaction (Ordinal)",
+ ylab = "Performance (Rank)",
+ main = paste("Spearman's Correlation =", round(spearman_corr, 2)),
+ pch = 19, col = "blue")
+
+# Add a line to show the trend
+abline(lm(data_ordinal$Performance ~ data_ordinal$Satisfaction), col = "red", lwd = 2)
+
+```
+
+$$
+\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}
+$$
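+
+When there are no tied ranks, the formula above agrees with `cor(..., method = "spearman")`; a minimal sketch with illustrative data:
+
+```{r spearman_manual,eval=FALSE}
+x <- c(10, 20, 30, 40, 50)
+y <- c(3, 1, 4, 2, 5)
+d <- rank(x) - rank(y)  # rank differences d_i
+n <- length(x)
+rho <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))
+rho                      # identical to cor(x, y, method = "spearman")
+cor(x, y, method = "spearman")
+```
+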
+---
+name: closing
+# More on statistics?
+- We discussed some very basic descriptive statistical measures.
+- You can read more [here](https://nbisweden.github.io/workshop-mlbiostatistics/session-descriptive/docs/index.html).
+
+
+---
+name: end_slide
+class: end-slide, middle
+count: false
+
+# See you at the next lecture!
+```{r, echo=FALSE,child="assets/footer-slide.Rmd"}
+```
+
+```{r,include=FALSE,eval=FALSE}
+# manually run this to render this document to HTML
+#rmarkdown::render("presentation_demo.Rmd")
+# manually run this to convert HTML to PDF
+#pagedown::chrome_print("presentation_demo.html",output="presentation_demo.pdf")
+```
diff --git a/slide_r_elements_1.Rmd b/slide_r_elements_1.Rmd
index c3a36ee4..0c8696c8 100644
--- a/slide_r_elements_1.Rmd
+++ b/slide_r_elements_1.Rmd
@@ -1,7 +1,7 @@
---
title: "Variables, Data types & Operators"
subtitle: "Elements of the R language"
-author: "Marcin Kierczak"
+author: "Marcin Kierczak & Nima Rafati"
keywords: bioinformatics, course, scilifelab, nbis, R
output:
xaringan::moon_reader:
@@ -210,7 +210,7 @@ class(x)
is.integer(x)
```
-> We need **casting** because sometimes a function requires data of some type!
+> We need **casting** because sometimes a function requires data of a certain type!
---
name: casting2
diff --git a/slide_r_intro.Rmd b/slide_r_intro.Rmd
index e9ba2404..baa159d8 100644
--- a/slide_r_intro.Rmd
+++ b/slide_r_intro.Rmd
@@ -1,7 +1,7 @@
---
title: "Introduction to R"
-subtitle: "R Foundations for Life Scientists"
-author: "Marcin Kierczak"
+subtitle: "R Foundations for Data Analysis"
+author: "Marcin Kierczak and Nima Rafati"
keywords: bioinformatics, course, scilifelab, nbis, R
output:
xaringan::moon_reader:
@@ -34,7 +34,7 @@ count: false
#library(tidyr)
#library(stringr)
#library(ggplot2)
-library(mkteachr)
+#library(mkteachr)
```
---
@@ -44,10 +44,11 @@ class: spaced
# Contents
* [About R](#about)
-* [Timeline](#timeline)
-* [Ideas behind R](#ideas)
* [Pros and cons of R](#pros_and_cons)
* [Ecosystem of packages](#num_packages)
+* [Programming language](#programming_language)
+* [Packages](#packages)
+* [Package installation](#pkg_cran_inst)
---
name: about
@@ -98,123 +99,177 @@ name: about
---
name: timeline
-# Timeline
+---
+name: pros_and_cons
+class: spaced
+
+# Pros and cons
+
+ steep learning curve
--
+ uniform, clear and clean system of documentation and help
-.pull-left-50[
+--
+ difficulties due to limited object-oriented programming capabilities,
+e.g. an agent-based simulation is a challenge
-![](data/slide_intro/Ihaka_and_Gentleman.jpg)
+--
+ good interconnectivity with compiled languages like Java or C
-* ca. 1992 — conceived by [Robert Gentleman](https://bit.ly/35kn99L) and [Ross Ihaka](https://en.wikipedia.org/wiki/Ross_Ihaka) (R&R) at the University of Auckland, NZ as a tool for **teaching statistics**
+--
+ cannot order a pizza for you (?)
-* 1994 — initial version
-* 2000 — stable version
+--
+ a very powerful ecosystem of packages
-]
+--
+ free and open source, GNU GPL and GNU GPL 2.0
--
+ easy to generate high quality graphics
-.pull-right-50[
+---
+name: programming_language
-![](data/slide_intro/jjallaire_siliconangle_com.jpg)
+# Programming Language
-* 2011 — [RStudio](https://en.wikipedia.org/wiki/RStudio), first release by J.J. Allaire
+--
+> Programming is the process of instructing a computer to perform a specific task. We write these instructions in a **programming language**. A program can be as simple as a calculation (like a calculator) or as complex as a full application.
-![](data/slide_intro/hadley-wickham.jpg)
+--
-* ca. 2017 — Tidyverse by [Hadley Wickham](https://en.wikipedia.org/wiki/Hadley_Wickham)
-]
+ * flow of _data_
+--
----
-name: ideas
+ * Data is collected information which qualitatively and/or quantitatively describes an entity.
+--
-# Ideas behind R
+ * Data is collected from quite diverse sources and comes in different data types.
+--
+
+ * Data processing.
+--
-* open-source solution — fast development
+ * Data cleaning.
--
-* based on the [S language](https://en.wikipedia.org/wiki/S_%28programming_language%29) created at the Bell Labs by [John Mc Kinley Chambers](https://bit.ly/2RhDqUx) to
+```{r,out.width="75%",fig.align='center',echo=FALSE}
+knitr::include_graphics("data/slide_programming/Data_Information_Knowledge.png")
+```
+---
+# Programming Language cted.
-> *turn ideas into software, quickly and faithfully*
+--
+ * from one _function_ to another
--
-* [lexical scope](https://en.wikipedia.org/wiki/Scope_%28computer_science%29%23Lexical_scoping) inspired by [Lisp](https://en.wikipedia.org/wiki/Lisp) syntax
+ * A function is a **reusable** chunk of code that performs a task. It takes **inputs** (**arguments**) to process.
+--
+ * each function does something to the data and returns output(s)
+--
+
+ * For example, `mean()` and `min()`.
--
+---
+# Three things to think about
-* since 1997 developed by the R Development Core Team (ca. 20 experts, with Chambers onboard; 6 are active)
+ * what *types* of data can I process?
--
-* overviewed by [The R Foundation for Statistical Computing](https://www.r-project.org/foundation/)
+ * how do I *write* what I want?
----
-name: packages
+--
-# Packages
+ * when does it *mean* anything?
-.pull-right-50[
-```{r, out.width="250pt", fig.align='center', echo=FALSE}
-knitr::include_graphics("data/slide_intro/packages.jpg")
+
+---
+# Data type
+
+```{r,out.width="75%",fig.align='center',echo=FALSE}
+knitr::include_graphics("data/slide_programming/Data_classification.png")
```
-]
+---
+
+# Three components of a language
--
-* developed by the community
+ * what *types* of data can I process — *type system*
--
-* cover several very diverse areas of science/life
+ * int — 1 2 5 9
+ * double — 1.23 -5.74
+ * char — a b test 7 9
+ * logical — TRUE/FALSE (T/F)
+
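+A quick way to inspect the type of a value is `typeof()`; a minimal sketch:
+
+```{r typeof_demo,eval=FALSE}
+typeof(7L)      # "integer"
+typeof(-5.74)   # "double"
+typeof("test")  # "character"
+typeof(TRUE)    # "logical"
+```
+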
--
-* uniformely structured and documented
+ * how do I *write* what I want — *syntax* defined by a language *grammar*
+
+ `2 * 1 + 1` vs. `(+ (* 2 1) 1)`
--
-* organised in repositiries:
- + [CRAN](https://cran.r-project.org)
- + [R-Forge](https://r-forge.r-project.org)
- + [Bioconductor](http://www.bioconductor.org)
- + [GitHub](https://github.com)
+ * when does it *mean* anything — *semantics*
+
+--
+
+ * *Colorful yellow train sleeps on a crazy wave.* — has no generally accepted meaning
+ * *There is $500 on his empty bank account.* — internal contradiction
---
-name: pros_and_cons
-class: spaced
+name: topic2
-# Pros and cons
+# Where to start?
- steep learning curve
---
- uniform, clear and clean system of documentation and help
+*Divide et impera* — divide and rule.
---
- difficulties due to a limited object-oriented programming capabilities,
-e.g. an agent-based simulation is a challenge
+**Top-down approach:** define the big problem and split it into smaller ones. Assume you have a solution to the smaller problems and continue, pushing the responsibility down.
+Wishful thinking!
---
- good interconnectivity with compiled languages like Java or C
+---
+
+name: packages
+
+# Packages
+
+.pull-right-50[
+```{r, out.width="250pt", fig.align='center', echo=FALSE}
+knitr::include_graphics("data/slide_intro/packages.jpg")
+```
+]
--
- cannot order a pizza for you (?)
+
+* developed by the community
--
- a very powerful ecosystem of packages
+
+* cover several very diverse areas of science/life
--
- free and open source, GNU GPL and GNU GPL 2.0
+
+* uniformly structured and documented
--
- easy to generate high quality graphics
+
+* organised in repositories:
+ + [CRAN](https://cran.r-project.org)
+ + [R-Forge](https://r-forge.r-project.org)
+ + [Bioconductor](http://www.bioconductor.org)
+ + [GitHub](https://github.com)
---
name: num_packages
-
# Ecosystem of R packages
@@ -232,6 +287,148 @@ gg
+
+---
+name: work_with_packages
+
+# Working with packages
+
+Packages are organised in repositories. The three main repositories are:
+
+* [CRAN](https://cran.r-project.org)
+* [R-Forge](http://r-forge.r-project.org)
+* [Bioconductor](http://www.bioconductor.org)
+
+We also have [GitHub](https://github.com).
+
+--
+# Working with packages -- CRAN example.
+
+```{r,out.width="80%",fig.align='center',echo=FALSE}
+knitr::include_graphics("data/slide_r_environment/ggplot2_CRAN.png")
+```
+
+---
+name: pkg_cran_inst
+
+# Working with packages -- installation
+
+Only a few packages are pre-installed:
+
+```{r pkg.err.ex,eval=TRUE,error=TRUE}
+library(XLConnect)
+```
+
+To install a package from the command line, use:
+
+```{r pkg.inst,eval=FALSE}
+install.packages("ggplot2",dependencies=TRUE)
+```
+
+---
+name: work_pkg_details
+
+# Working with packages -- details
+
+Sometimes you may also want to specify the repository, e.g. because it is geographically closer to you or because your default mirror is down:
+
+```{r pkg.inst.repo,eval=FALSE}
+install.packages('ggplot2',dependencies=TRUE,repos="http://cran.se.r-project.org")
+```
+
+But sometimes this does not work either, because the package is not available for your platform. In such a case, you need to *compile* it from its *source code*.
+
+---
+name: work_pkg_details2
+
+# Working with packages -- details cted.
+```{r,out.width="150%",fig.align='center',echo=FALSE}
+knitr::include_graphics("data/slide_r_environment/ggplot2_CRAN.png")
+```
+
+---
+name: source_pkg_inst
+
+# Working with packages -- installing from source.
+
+- Download the source file, in our example *ggplot2_3.4.3.tar.gz*.
+- Install it:
+
+```{r pkg.inst.src,eval=FALSE}
+install.packages("path/to/ggplot2_3.4.3.tar.gz",
+ repos=NULL,
+ type='source',
+ dependencies=TRUE)
+```
+
+- Load it:
+
+```{r pkg.load,eval=FALSE}
+library('ggplot2') # stops with an error if the package is missing
+require('ggplot2') # returns FALSE (with a warning) if missing
+```
+
+- Enjoy!
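+
+A common defensive idiom (a sketch, not part of the original slides) uses the logical value returned by `require()` to install a package only when it is missing:
+
+```{r pkg_require_idiom,eval=FALSE}
+# require() returns FALSE instead of stopping when the package is absent
+if (!require('ggplot2')) {
+  install.packages('ggplot2', dependencies = TRUE)
+  library('ggplot2')
+}
+```
+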
+
+---
+name: pkg_github
+
+# Packages -- GitHub
+
+Nowadays, more and more developers distribute their packages via GitHub. The easiest way to install packages from GitHub is via the *devtools* package:
+
+- Install the *devtools* package.
+- Load it.
+- Install.
+- Enjoy!
+
+```{r pkg.inst.devtools.github,eval=FALSE}
+install.packages('devtools',dependencies=TRUE)
+library('devtools')
+install_github('talgalili/installr')
+```
+
+---
+name: pkg_bioconductor
+
+# Packages -- Bioconductor
+
+```{r,out.width="200pt",fig.align='center',echo=FALSE}
+knitr::include_graphics("data/slide_r_environment/logo_bioconductor.png")
+```
+
+First install Bioconductor Manager:
+
+```{r inst.biocond,eval=FALSE}
+if (!requireNamespace("BiocManager",quietly = TRUE))
+ install.packages("BiocManager")
+```
+
+---
+name: pkg_bioconductor2
+
+# Packages -- Bioconductor cted.
+
+Now, you can install particular packages from Bioconductor:
+
+```{r biocond.inst.pkg,eval=FALSE}
+BiocManager::install("GenomicRanges")
+```
+
+For more info, visit [Bioconductor website](http://www.bioconductor.org/install/).
+
+---
+# One package to rule them all -- the magic of `renv`
+
+- the first time: run `renv::activate()` and `renv::init()`
+- while working: `renv::hydrate()` and `renv::snapshot()`
+
+Now, send `renv.lock` to your friend to share the environment, and she can:
+
+- restore the environment `renv::restore()`
+
+**Pure magic!**
+
---
@@ -240,8 +437,7 @@ class: end-slide, middle
count: false
# Thank you! Questions?
-
-```{r,echo=FALSE,child="assets/footer-slide.Rmd"}
+```{r, echo=FALSE,child="assets/footer-slide.Rmd"}
```
```{r,include=FALSE,eval=FALSE}