diff --git a/02_getting_started_with_r.qmd b/02_getting_started_with_r.qmd
index 0ace3a0..9870ed1 100644
--- a/02_getting_started_with_r.qmd
+++ b/02_getting_started_with_r.qmd
@@ -272,16 +272,68 @@ A shorter way of writing this is to use `?` before the name of the function.
 
 After you run the code, the help page is displayed in the "Help" tab in the "Files-Plots-Packages" pane (usually in the bottom right of **[R]{.sans-serif}**Studio).
 
-Help pages may seem arcane to the novice user, probably because they aim for shortness and use a lot of technical jargon. But this short jargon makes the explanations precise (most of them), so we can get the information we need without having to read the entire document. Also, all help pages are organized similarly, so we don't have to relearn how to navigate them. So, with a bit of practice, you will be able to find exactly what you need in mere seconds.
+Help pages may seem arcane to the novice user, probably because they aim for shortness and use a lot of technical jargon. But this short jargon makes (most of) the explanations precise, so we can use the information we need without having to read the entire document. Also, all help pages are organized similarly, so we don't have to relearn how to navigate them. So, with a bit of practice, you will be able to find exactly what you need in mere seconds.
+
+The first line of the help document displays the name of the function and the package that contains the function. Other sections are:
+
++ **Description**: a short description of the function.
+
++ **Usage**: names the arguments associated with the function and possible default values.
+
++ **Arguments**: explains each argument and what it does.
+
++ **Details**: a more detailed description of the function.
+
++ **Value**: if applicable, gives the type and structure of the object returned by the function or the operator.
+
++ **See Also**: leads to other help pages with similar or related content.
+
++ **Examples**: code examples on how to use the function. To see how they work, we just need to copy and paste them into the console. We can also access examples at any time by using the `example()` function (e.g. `example("round")`).
+
+The `help()` function is useful if we know the name of the function. But if all we remember is a keyword in the name, we can search through **[R]{.sans-serif}**'s help system using `help.search()`:
+
+```{r}
+#| eval: false
+help.search("round")
+```
+
+Or we can use the shortcut `??`
+
+```{r}
+#| eval: false
+??round
+```
+
+As before, the 'Help' tab in RStudio will display the results of the search. `help.search()` searches through the help documentation, code demonstrations, and package vignettes and displays the results as clickable links that we can follow.
+
+Another useful function is `apropos()`. This function can be used to list all functions containing a specified character string. For example, to find all functions with `round` in their name:
+
+```{r}
+apropos("round")
+```
+
+If we find the function we need, we can look for its documentation.
+
+```{r}
+#| eval: false
+help("round.Date")
+```
+
+Another useful function is `RSiteSearch()`, which allows us to search for keywords and phrases in function help pages and vignettes for *all* CRAN packages, and in CRAN task views. So, we can access the [online search engine](https://www.r-project.org/search.html) directly from the Console and display the results in our web browser.
+
+```{r}
+#| eval: false
+RSiteSearch("regression")
+```
 
 :::callout-tip
-## How to learn to read help pages?
-Start by reading the help pages of functions that you can already understand. This will teach you how to understand the structure of the pages and will familiarize you with many technical terms. As you use **[R]{.sans-serif}** you will likely need other, more complicated functions, so reading more help pages will happen almost naturally.
+## How to get started with help pages?
+Start by reading the help pages of functions that you already understand. This will teach you how to understand the structure of the pages and will familiarize you with the jargon. As you use **[R]{.sans-serif}** you will likely need other, more complicated functions, so reading more help pages will happen almost naturally.
 
 Just keep in mind that help pages are about code, not about the underlying concepts. If you don't know what it means to round a number, reading the documentation for `round()` will not help you.
 :::
 
-So far we have been working with only a single number at the time. But **[R]{.sans-serif}** can also work with groups of numbers by using something called "vectors".
+Now that we know how to get help, we can move on to more advanced stuff. Previously we worked with one number at a time. But we can also work with groups of numbers by using something called "vectors".
 
 ## Working with vectors
 
@@ -306,7 +358,7 @@ Note that we must enclose words in quotation marks to let **[R]{.sans-serif}** k
 vector_of_words <- c(monday, lemon)
 ```
 
-Later on we will work more with vectors of words, but for now let's focus on numberical vectors.
+Later on we will work more with vectors of words, but for now let's focus on numerical vectors.
 
 ### Operations with numerical vectors
 
diff --git a/docs/02_getting_started_with_r.html b/docs/02_getting_started_with_r.html
index 37c2c42..234460d 100644
--- a/docs/02_getting_started_with_r.html
+++ b/docs/02_getting_started_with_r.html
@@ -601,82 +601,116 @@

?round

After you run the code, the help page is displayed in the “Help” tab in the “Files-Plots-Packages” pane (usually in the bottom right of RStudio).

-

Help pages may seem arcane to the novice user, probably because they aim for shortness and use a lot of technical jargon. But this short jargon makes the explanations precise (most of them), so we can get the information we need without having to read the entire document. Also, all help pages are organized similarly, so we don’t have to relearn how to navigate them. So, with a bit of practice, you will be able to find exactly what you need in mere seconds.

+

Help pages may seem arcane to the novice user, probably because they aim for shortness and use a lot of technical jargon. But this short jargon makes (most of) the explanations precise, so we can use the information we need without having to read the entire document. Also, all help pages are organized similarly, so we don’t have to relearn how to navigate them. So, with a bit of practice, you will be able to find exactly what you need in mere seconds.

+

The first line of the help document displays the name of the function and the package that contains the function. Other sections are:

+ +

The help() function is useful if we know the name of the function. But if all we remember is a keyword in the name, we can search through R’s help system using help.search():

+
+
help.search("round")
+
+

Or we can use the shortcut ??

+
+
??round
+
+

As before, the ‘Help’ tab in RStudio will display the results of the search. help.search() searches through the help documentation, code demonstrations, and package vignettes and displays the results as clickable links that we can follow.

+

Another useful function is apropos(). This function can be used to list all functions containing a specified character string. For example, to find all functions with round in their name:

+
+
apropos("round")
+
+
[1] "round"        "round.Date"   "round.POSIXt"
+
+
+

If we find the function we need, we can look for its documentation.

+
+
help("round.Date")
+
+

Another useful function is RSiteSearch(), which allows us to search for keywords and phrases in function help pages and vignettes for all CRAN packages, and in CRAN task views. So, we can access the online search engine directly from the Console and display the results in our web browser.

+
+
RSiteSearch("regression")
+
-How to learn to read help pages? +How to get started with help pages?
-

Start by reading the help pages of functions that you can already understand. This will teach you how to understand the structure of the pages and will familiarize you with many technical terms. As you use R you will likely need other, more complicated functions, so reading more help pages will happen almost naturally.

+

Start by reading the help pages of functions that you already understand. This will teach you how to understand the structure of the pages and will familiarize you with the jargon. As you use R you will likely need other, more complicated functions, so reading more help pages will happen almost naturally.

Just keep in mind that help pages are about code, not about the underlying concepts. If you don’t know what it means to round a number, reading the documentation for round() will not help you.

-

So far we have been working with only a single number at the time. But R can also work with groups of numbers by using something called “vectors”.

+

Now that we know how to get help, we can move on to more advanced stuff. Previously we worked with one number at a time. But we can also work with groups of numbers by using something called “vectors”.

2.6 Working with vectors

In R, an ordered group of numbers is called a vector. To create a vector of numbers, we need to use the function c() (short for “combine”). The arguments of c() are the numbers you want to use in the vector, in the order you want to use them.

-
my_vec <- c(5, 3, 7, 1, 1, 8)
-my_vec
+
my_vec <- c(5, 3, 7, 1, 1, 8)
+my_vec
[1] 5 3 7 1 1 8

Vectors can also contain other types of data, like words

-
vector_of_words <- c("monday", "lemon")
-vector_of_words
+
vector_of_words <- c("monday", "lemon")
+vector_of_words
[1] "monday" "lemon" 

Note that we must enclose words in quotation marks to let R know that we want to use “monday” and “lemon” as values instead of the names of objects. Look at what happens if we forget the quotation marks:

-
vector_of_words <- c(monday, lemon)
+
vector_of_words <- c(monday, lemon)
Error in eval(expr, envir, enclos): object 'monday' not found
-

Later on we will work more with vectors of words, but for now let’s focus on numberical vectors.

+

Later on we will work more with vectors of words, but for now let’s focus on numerical vectors.

2.6.1 Operations with numerical vectors

R is a “vectorized” language, which means that it can often operate on an entire vector of numbers as easily as on a single number. All the logical and mathematical functions we used before work with vectors:

-
my_vec > 2
+
my_vec > 2
[1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE
-
my_vec <= 7
+
my_vec <= 7
[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
-
my_vec + 7
+
my_vec + 7
[1] 12 10 14  8  8 15
-
my_vec / 11
+
my_vec / 11
[1] 0.45454545 0.27272727 0.63636364 0.09090909 0.09090909 0.72727273
-
sqrt(my_vec)
+
sqrt(my_vec)
[1] 2.236068 1.732051 2.645751 1.000000 1.000000 2.828427
-
factorial(my_vec)
+
factorial(my_vec)
[1]   120     6  5040     1     1 40320
-
log(my_vec)
+
log(my_vec)
[1] 1.609438 1.098612 1.945910 0.000000 0.000000 2.079442
-
my_vec*my_vec
+
my_vec*my_vec
[1] 25  9 49  1  1 64
@@ -685,14 +719,14 @@

R will line up the vectors and perform a sequence of individual operations. For instance, in my_vec*my_vec, R multiplies the first element of vector 1 by the first element of vector 2, then the second element of vector 1 by the second element of vector 2, and so on, until all elements are multiplied. The result will be a new vector the same length as the first two.

If you give R two vectors of unequal lengths, R will repeat the shorter vector until it is as long as the longer vector, and then do the math.

-
my_vec * c(1, 2)
+
my_vec * c(1, 2)
[1]  5  6  7  2  1 16
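
The recycling rule described above can be made explicit with rep(). This is only a minimal sketch: it assumes my_vec still holds its original six values, and the name recycled is introduced here purely for illustration.

```r
my_vec <- c(5, 3, 7, 1, 1, 8)

# Repeat the shorter vector until it is as long as my_vec
recycled <- rep(c(1, 2), length.out = length(my_vec))
recycled

# Multiplying by the explicitly repeated vector gives the same result
# as letting R recycle c(1, 2) on its own
identical(my_vec * c(1, 2), my_vec * recycled)
```

Writing the repetition out by hand is rarely necessary, but it can help to check what R is doing when the lengths differ.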

If the length of the short vector does not divide evenly into the length of the long vector, R will do an “incomplete repeat” of the shorter vector and return a warning.

-
my_vec * c(1, 2, 3, 4)
+
my_vec * c(1, 2, 3, 4)
Warning in my_vec * c(1, 2, 3, 4): longer object length is not a multiple of
 shorter object length
@@ -705,7 +739,7 @@

R can also do vector and matrix multiplications, but we have to explicitly ask for them. For example, to get the inner product, we need the operator %*%:

-
my_vec %*% my_vec
+
my_vec %*% my_vec
     [,1]
 [1,]  149
@@ -713,7 +747,7 @@

-
my_vec %o% my_vec
+
my_vec %o% my_vec
     [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]   25   15   35    5    5   40
@@ -731,39 +765,39 @@ 

We can access specific elements of vectors using the square bracket [ ] notation. Write the name of the vector you want to extract from, followed by square brackets containing the index of the element you wish to extract. This index can be a position or the result of a logical test.

To extract elements based on their position we simply write the position inside the [ ]. Let’s first recall the values of my_vec.

-
my_vec
+
my_vec
[1] 5 3 7 1 1 8

To extract the 3rd value of my_vec, we use

-
my_vec[3]
+
my_vec[3]
[1] 7

We can store this value in another object

-
value_3 = my_vec[3]
+
value_3 = my_vec[3]

And we can extract multiple elements at a time by using a vector of indices inside the square brackets:

-
my_vec[c(1, 3, 5)]
+
my_vec[c(1, 3, 5)]
[1] 5 7 1

Or we can use : notation. Remember that : helps us create a sequence of values. For example,

-
3:6
+
3:6
[1] 3 4 5 6

So,

-
my_vec[3:6]
+
my_vec[3:6]
[1] 7 1 1 8
@@ -783,28 +817,28 @@

Another convenient way to extract elements from a vector is to use a logical expression as an index. For example, to extract all elements greater than 4 from my_vec, we do

-
my_vec[my_vec > 4]
+
my_vec[my_vec > 4]
[1] 5 7 8

This works because R uses element-wise operations even for logical statements. So, my_vec > 4 asks if each item of my_vec meets the condition “greater than four” and returns the corresponding vector of TRUE and FALSE. Then, when we place this result inside the square brackets, R examines each element of my_vec asking “should I extract this element?”. If the answer is TRUE, the value is extracted; if it’s FALSE, the value is ignored. Under the hood, my_vec[my_vec > 4] is equivalent to

-
my_vec[c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)]
+
my_vec[c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)]
[1] 5 7 8
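
To see the two steps separately, we can store the result of the logical test before using it as an index. This is a small sketch, assuming my_vec still holds its original values; the name keep is hypothetical and used only here.

```r
my_vec <- c(5, 3, 7, 1, 1, 8)

# The logical test on its own returns one TRUE/FALSE per element
keep <- my_vec > 4
keep

# Using the stored logical vector as an index extracts the same elements
my_vec[keep]

# which() turns the logical vector into the positions of the TRUE values
which(keep)
```

Storing the test in its own object is handy when the same condition is needed more than once.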

There are two useful functions to extract values of a vector. head() helps us extract the first few values of a vector.

-
head(my_vec, n = 3)
+
head(my_vec, n = 3)
[1] 5 3 7

tail() gives us the last few values of a vector

-
tail(my_vec, n = 3)
+
tail(my_vec, n = 3)
[1] 1 1 8
@@ -814,44 +848,44 @@

2.6.3 Replacing elements

We can replace the elements of a vector by combining the square bracket notation with the assignment operator. For example, to replace the second element of my_vec, we do

-
my_vec[2] <- 99
-my_vec
+
my_vec[2] <- 99
+my_vec
[1]  5 99  7  1  1  8

To replace multiple elements with the same value, say, elements 5 and 6, we do

-
my_vec[c(5, 6)] <- 55
-my_vec
+
my_vec[c(5, 6)] <- 55
+my_vec
[1]  5 99  7  1 55 55

R can also replace elements element-wise:

-
my_vec[c(5, 6)] <- c(100, 200)
-my_vec
+
my_vec[c(5, 6)] <- c(100, 200)
+my_vec
[1]   5  99   7   1 100 200

What happens if you try to replace two values with a vector that has three (or more) values?

-
my_vec[c(5, 6)] <- c(100, 200, 500)
+
my_vec[c(5, 6)] <- c(100, 200, 500)
Warning in my_vec[c(5, 6)] <- c(100, 200, 500): number of items to replace is
 not a multiple of replacement length
-
my_vec
+
my_vec
[1]   5  99   7   1 100 200

Logical expressions help us replace values that meet specific conditions without having to find them ourselves.

-
my_vec[my_vec > 44] <- -1
-my_vec
+
my_vec[my_vec > 44] <- -1
+my_vec
[1]  5 -1  7  1 -1 -1
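
Before replacing values by condition, it can be useful to check how many elements will be affected. The sketch below assumes my_vec holds the values it had just before the replacement above; to_replace is a hypothetical name.

```r
my_vec <- c(5, 99, 7, 1, 100, 200)

# Store the condition so we can inspect it before changing anything
to_replace <- my_vec > 44

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
sum(to_replace)

# Replace only the elements that meet the condition
my_vec[to_replace] <- -1
my_vec
```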
@@ -861,33 +895,33 @@

2.6.4 Reordering elements of vectors

To sort the elements of a vector from lowest to highest, we can use sort()

-
my_vec <- sort(my_vec)
-my_vec
+
my_vec <- sort(my_vec)
+my_vec
[1] -1 -1 -1  1  5  7

If we want to sort from highest to lowest, we need to set the optional argument decreasing to TRUE

-
my_vec <- sort(my_vec, decreasing = TRUE)
-my_vec
+
my_vec <- sort(my_vec, decreasing = TRUE)
+my_vec
[1]  7  5  1 -1 -1 -1

Another option is to first use sort() and then reverse the sorted vector using rev().

-
my_vec <- rev(sort(my_vec))
+
my_vec <- rev(sort(my_vec))

A more useful feature of vectors is that we can reorder their elements based on the values of other vectors. To show this, let’s first create a vector of countries and another vector with (my guess of) their typical daily temperatures.

-
countries <- c("Japan", "Egypt", "Mexico", "Finland")
-temperatures_fahrenheit <- c(50, 90, 65, -10)
+
countries <- c("Japan", "Egypt", "Mexico", "Finland")
+temperatures_fahrenheit <- c(50, 90, 65, -10)

Imagine we want to order the vector of countries, going from coldest to hottest. The first step to reorder the countries is to use order() to create a new variable called “temperatures_ordered”.

-
temperatures_ordered <- order(temperatures_fahrenheit)
-temperatures_ordered
+
temperatures_ordered <- order(temperatures_fahrenheit)
+temperatures_ordered
[1] 4 1 3 2
@@ -895,8 +929,8 @@

-
countries_ordered <- countries[temperatures_ordered]
-countries_ordered
+
countries_ordered <- countries[temperatures_ordered]
+countries_ordered
[1] "Finland" "Japan"   "Mexico"  "Egypt"  
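
The same vector of indices can be applied to both vectors, which keeps each country paired with its temperature. A minimal sketch using the vectors defined above; idx is a name introduced only for this example.

```r
countries <- c("Japan", "Egypt", "Mexico", "Finland")
temperatures_fahrenheit <- c(50, 90, 65, -10)

# One set of indices, applied to both vectors, keeps each pair together
idx <- order(temperatures_fahrenheit)
countries[idx]
temperatures_fahrenheit[idx]

# decreasing = TRUE orders from hottest to coldest instead
countries[order(temperatures_fahrenheit, decreasing = TRUE)]
```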
@@ -928,14 +962,14 @@

2.8 Writing our own functions

Functions, as you may remember, are objects that store commands, which is helpful when we want to do the same thing to different inputs. The three basic parts of a function are name, code to implement, and arguments. To assemble these parts, we can use the function() function (yes, really) followed by a pair of curly brackets {}:

-
my_function <- function() {}
+
my_function <- function() {}

function() will run the code that we write inside the curly brackets. This code is called the body of the function.

Let’s try something simple, like adding 1 + 1:

-
simple_function <- function() {
-    1 + 1
-}
+
simple_function <- function() {
+    1 + 1
+}
@@ -952,7 +986,7 @@

-
simple_function()
+
simple_function()
[1] 2
@@ -960,50 +994,50 @@

R which temperature to convert each time.

-
fahrenheit_to_celsius <- function(temperature) {
-    (temperature - 32) / 1.8
-}
-fahrenheit_to_celsius(27)
+
fahrenheit_to_celsius <- function(temperature) {
+    (temperature - 32) / 1.8
+}
+fahrenheit_to_celsius(27)
[1] -2.777778

Now let’s try something a bit more complicated: solving a quadratic equation. If we have an equation of the form \(ax^2 + bx + c = 0\), then the solutions are given by \(x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}\). We can write a function that will apply this formula for us.

-
solve_quadratic = function(a, b, c) {
-    # Quadratic equations can have two solutions
-    solution_1 = (-b + sqrt(b**2 - 4*a*c)) / 2*a # First solution to equation
-    solution_2 = (-b - sqrt(b**2 - 4*a*c)) / 2*a # Second solution to equation
-}
+
solve_quadratic = function(a, b, c) {
+    # Quadratic equations can have two solutions
+    solution_1 = (-b + sqrt(b**2 - 4*a*c)) / (2*a) # First solution to equation
+    solution_2 = (-b - sqrt(b**2 - 4*a*c)) / (2*a) # Second solution to equation
+}

Did you see the comments I added? R will ignore everything that comes after a hashtag #. This # is known in R as the comment symbol because it allows us to comment our code. This way we can explain confusing chunks of code, or warn about potential problems.

Now all we need is to identify values of \(a, b,\) and \(c\) to pass as arguments.

-
solve_quadratic(a = 1, b = -1, c = -3)
+
solve_quadratic(a = 1, b = -1, c = -3)

Why didn’t our function show a result? When we run a function, R runs the code in the body and returns the value of the last expression. Here the last expression is an assignment, and R returns the value of an assignment invisibly, so nothing is printed. So, we have to write something to ensure that solve_quadratic() displays the solutions. Since we are using a script (right?) it is easy to go back to our function and just add one more line:

-
solve_quadratic = function(a, b, c) {
-    # Quadratic equations can have two solutions
-    solution_1 = (-b + sqrt(b**2 - 4*a*c)) / 2*a # First solution to equation
-    solution_2 = (-b - sqrt(b**2 - 4*a*c)) / 2*a # Second solution to equation
-    c(solution_1, solution_2) # Show vector of solutions
-}
-solve_quadratic(a = 1, b = -1, c = -3)
+
solve_quadratic = function(a, b, c) {
+    # Quadratic equations can have two solutions
+    solution_1 = (-b + sqrt(b**2 - 4*a*c)) / (2*a) # First solution to equation
+    solution_2 = (-b - sqrt(b**2 - 4*a*c)) / (2*a) # Second solution to equation
+    c(solution_1, solution_2) # Show vector of solutions
+}
+solve_quadratic(a = 1, b = -1, c = -3)
[1]  2.302776 -1.302776
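
A quick way to gain confidence in a function like this is to substitute its results back into the equation. The sketch below is a hypothetical variant, not the function from the notes: the name solve_quadratic_checked and the check on the discriminant are additions made for illustration. R evaluates / and * from left to right, so a denominator of \(2a\) has to be written as (2 * a); with \(a = 1\) the difference is invisible, which is why this example uses \(a = 2\).

```r
# Hypothetical variant for illustration only
solve_quadratic_checked <- function(a, b, c) {
    discriminant <- b^2 - 4 * a * c
    # Stop early if the square root would not be a real number
    if (discriminant < 0) {
        stop("no real solutions: the discriminant is negative")
    }
    # Parentheses around 2 * a make the whole product the denominator
    c((-b + sqrt(discriminant)) / (2 * a),
      (-b - sqrt(discriminant)) / (2 * a))
}

solutions <- solve_quadratic_checked(a = 2, b = -2, c = -6)
solutions

# Substituting the solutions back should give values very close to zero
2 * solutions^2 - 2 * solutions - 6
```

Because the check is element-wise, both solutions are tested in a single line.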

Now imagine that we want to multiply the solutions to our equation by an arbitrary value (maybe because we want to convert the units of \(x\) to something else). And let’s pretend that, by default, we want to double the solutions. We can write another function that first calls solve_quadratic() and then multiplies the result by our arbitrary number.

-
multiply_solutions = function(a, b, c, multiplier = 2) {
-    solve_quadratic(a, b, c) * multiplier
-}
-multiply_solutions(a = 1, b = -1, c = -3)
+
multiply_solutions = function(a, b, c, multiplier = 2) {
+    solve_quadratic(a, b, c) * multiplier
+}
+multiply_solutions(a = 1, b = -1, c = -3)
[1]  4.605551 -2.605551
-
multiply_solutions(a = 1, b = -1, c = -3, multiplier = 10)
+
multiply_solutions(a = 1, b = -1, c = -3, multiplier = 10)
[1]  23.02776 -13.02776
diff --git a/docs/search.json b/docs/search.json index 1128e2a..300c023 100644 --- a/docs/search.json +++ b/docs/search.json @@ -565,7 +565,7 @@ "href": "02_getting_started_with_r.html#working-with-vectors", "title": "2  Getting Started with R", "section": "2.6 Working with vectors", - "text": "2.6 Working with vectors\nIn R, an ordered group of numbers is called a vector. To create a vector of numbers, we need to use the function c() (short for “combine”). The arguments of c() are the numbers you want to use in the vector, in the order you want to use them.\n\nmy_vec <- c(5, 3, 7, 1, 1, 8)\nmy_vec\n\n[1] 5 3 7 1 1 8\n\n\nVectors can also contain other types of data, like words\n\nvector_of_words <- c(\"monday\", \"lemon\")\nvector_of_words\n\n[1] \"monday\" \"lemon\" \n\n\nNote that we must enclose words in quotation marks to let R know that we want to use “monday” and “lemon” as values instead of the names of objects. Look at what happens if we forget the quotation marks:\n\nvector_of_words <- c(monday, lemon)\n\nError in eval(expr, envir, enclos): object 'monday' not found\n\n\nLater on we will work more with vectors of words, but for now let’s focus on numberical vectors.\n\n2.6.1 Operations with numerical vectors\nR is a “vectorized” language, which means that it can often operate on an entire vector of numbers as easily as on a single number. 
All the logical and mathematical functions we used before work with vectors:\n\nmy_vec > 2\n\n[1] TRUE TRUE TRUE FALSE FALSE TRUE\n\nmy_vec <= 7\n\n[1] TRUE TRUE TRUE TRUE TRUE FALSE\n\nmy_vec + 7\n\n[1] 12 10 14 8 8 15\n\nmy_vec / 11\n\n[1] 0.45454545 0.27272727 0.63636364 0.09090909 0.09090909 0.72727273\n\nsqrt(my_vec)\n\n[1] 2.236068 1.732051 2.645751 1.000000 1.000000 2.828427\n\nfactorial(my_vec)\n\n[1] 120 6 5040 1 1 40320\n\nlog(my_vec)\n\n[1] 1.609438 1.098612 1.945910 0.000000 0.000000 2.079442\n\nmy_vec*my_vec\n\n[1] 25 9 49 1 1 64\n\n\nIn the last example, R did not follow the rules of linear algebra to multiply two vectors. Instead, R uses “element-wise execution”, which means that R applies the same operation to each member of the vector. For example, my_vec + 7 adds 7 to each number inside my_vec.\nWhen you use two vectors with the same number of elements for an operation, R will line up the vectors and perform a sequence of individual operations. For instance, in my_vec*my_vec, R multiplies the first element of vector 1 by the first element of vector 2, then the second element of vector 1 by the second element of vector 2, and so on, until all elements are multiplied. 
The result will be a new vector the same length as the first two.\nIf you give R two vectors of unequal lengths, R will repeat the shorter vector until it is as long as the longer vector, and then do the math.\n\nmy_vec * c(1, 2)\n\n[1] 5 6 7 2 1 16\n\n\nIf the length of the short vector does not divide evenly into the length of the long vector, R will do an “incomplete repeat” of the shorter vector and return a warning.\n\nmy_vec * c(1, 2, 3, 4)\n\nWarning in my_vec * c(1, 2, 3, 4): longer object length is not a multiple of\nshorter object length\n\n\n[1] 5 6 21 4 1 16\n\n\nRepeating the numbers of the vector is known as “vector recycling”, and it helps R do element-wise operations.\nElement-wise operations allow us to manipulate entire data variables rather than one element at a time. When you start working with data sets, element-wise operations will ensure that values from one observation or case are only paired with values from the same observation or case. Element-wise operations also make it easier to write your own programs and functions in R.\nR can also do vector and matrix multiplications, but we have to explicitly ask for them. For example, to get the inner product, we need the operator %*%:\n\nmy_vec %*% my_vec\n\n [,1]\n[1,] 149\n\n\nAnd to get the outer product, we need %o%:\n\nmy_vec %o% my_vec\n\n [,1] [,2] [,3] [,4] [,5] [,6]\n[1,] 25 15 35 5 5 40\n[2,] 15 9 21 3 3 24\n[3,] 35 21 49 7 7 56\n[4,] 5 3 7 1 1 8\n[5,] 5 3 7 1 1 8\n[6,] 40 24 56 8 8 64\n\n\nIf you are not familiar with matrix operations, don’t worry. You won’t need them in these notes.\n\n\n2.6.2 Extracting elements\nWe can access specific elements of vectors using the square bracket [ ] notation. Write the name of the vector you want to extract from, followed by a the square brackets with an index of the element you wish to extract. This index can be a position or the result of a logical test.\nTo extract elements based on their position we simply write the position inside the [ ]. 
Let’s first recall the values of my_vec.\n\nmy_vec\n\n[1] 5 3 7 1 1 8\n\n\nTo extract the 3rd value of my_vec, we use\n\nmy_vec[3]\n\n[1] 7\n\n\nWe can store this value in another object\n\nvalue_3 = my_vec[3]\n\nAnd we can extract multiple elements at the time by using a vector of indices inside the square brackets:\n\nmy_vec[c(1, 3, 5)]\n\n[1] 5 7 1\n\n\nOr we can use : notation. Remember that : helps us create a sequence of values. For example,\n\n3:6\n\n[1] 3 4 5 6\n\n\nSo,\n\nmy_vec[3:6]\n\n[1] 7 1 1 8\n\n\n\n\n\n\n\n\nNote\n\n\n\nIn R, the positional index starts at 1, so to call the first element of a vector we need to use [1]. In most other programming languages (like Python and C++), the positional index starts at 0.\n\n\nAnother convenient way to extract elements from a vector is to use a logical expression as an index. For example, to extract all elements greater than 4 from my_vec, we do\n\nmy_vec[my_vec > 4]\n\n[1] 5 7 8\n\n\nThis works because R uses element-wise operations even for logical statements. So, my_vec > 4 asks if each item of my_vec meets the condition “greater than four” and returns the corresponding vector of TRUE and FALSE. Then, when we add this result to the square brackets, R examines each element of my_vec asking “should I extract this element?”. If the answer is TRUE, the value is extracted; if it’s FALSE, the value is ignored. Under the hood, my_vec > 4 is equivalent to\n\nmy_vec[c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)]\n\n[1] 5 7 8\n\n\nThere are two useful functions to extract values of a vector. head() helps us extract the first few values of a vector.\n\nhead(my_vec, n = 3)\n\n[1] 5 3 7\n\n\ntail() gives us the last few values of a vector\n\ntail(my_vec, n = 3)\n\n[1] 1 1 8\n\n\n\n\n2.6.3 Replacing elements\nWe can replace the elements of a vector by combining the square bracket notation with the assignment operator. 
For example, to replace the second element of my_vec, we do\n\nmy_vec[2] <- 99\nmy_vec\n\n[1] 5 99 7 1 1 8\n\n\nTo replace multiple elements with the same value, say, elements 5 and 6, we do\n\nmy_vec[c(5, 6)] <- 55\nmy_vec\n\n[1] 5 99 7 1 55 55\n\n\nR can also replace elements element wise:\n\nmy_vec[c(5, 6)] <- c(100, 200)\nmy_vec\n\n[1] 5 99 7 1 100 200\n\n\nWhat happens if you try to replace two values with a vector that has three (or more) values?\n\nmy_vec[c(5, 6)] <- c(100, 200, 500)\n\nWarning in my_vec[c(5, 6)] <- c(100, 200, 500): number of items to replace is\nnot a multiple of replacement length\n\nmy_vec\n\n[1] 5 99 7 1 100 200\n\n\nLogical expressions help us replace values that meet specific conditions without having to find them ourselves.\n\nmy_vec[my_vec > 44] <- -1\nmy_vec\n\n[1] 5 -1 7 1 -1 -1\n\n\n\n\n2.6.4 Reordering elements of vectors\nTo sort the elements of a vector from lowest to highest, we can use sort()\n\nmy_vec <- sort(my_vec)\nmy_vec\n\n[1] -1 -1 -1 1 5 7\n\n\nIf we want to sort from highest to lowest, we need to set the optional argument decreasing to TRUE\n\nmy_vec <- sort(my_vec, decreasing = TRUE)\nmy_vec\n\n[1] 7 5 1 -1 -1 -1\n\n\nAnother option is to first use sort() and then reverse the sorted vector using rev().\n\nmy_vec <- rev(sort(my_vec))\n\nA more useful feature of vectors is that we can reorder their elements based on the values of other vectors. To show this, let’s first create a vector of countries and another vector with (my guess of) their typical daily temperatures.\n\ncountries <- c(\"Japan\", \"Egypt\", \"Mexico\", \"Finland\")\ntemperatures_fahrenheit <- c(50, 90, 65, -10)\n\nImagine we want to order the vector of countries, going from coldest to hottest. 
The first step to reorder the countries is to use order() to create a new variable called “temperatures_ordered”.\n\ntemperatures_ordered <- order(temperatures_fahrenheit)\ntemperatures_ordered\n\n[1] 4 1 3 2\n\n\nThis output says that the lowest value in temperatures_fahrenheit is in the fourth position, the second lowest value is on the first position, and so on. So, we can think of temperatures_ordered as a vector of positional indices of temperatures in ascending order.\nNow we can use these indices to reorder the vector of countries.\n\ncountries_ordered <- countries[temperatures_ordered]\ncountries_ordered\n\n[1] \"Finland\" \"Japan\" \"Mexico\" \"Egypt\" \n\n\nTa-da!\nThese vector manipulations can do more than dazzle your friends. Imagine you have a dataset with two columns of data and you want to sort each column. If you just use sort() on each column separately, the values of each column will become uncoupled from each other. By using order() on one column, a vector of positional indices is created of the values of the column in ascending order. Then we can use this vector as the index of elements on the second column, which will return a vector of values based on the first column." + "text": "2.6 Working with vectors\nIn R, an ordered group of numbers is called a vector. To create a vector of numbers, we need to use the function c() (short for “combine”). The arguments of c() are the numbers you want to use in the vector, in the order you want to use them.\n\nmy_vec <- c(5, 3, 7, 1, 1, 8)\nmy_vec\n\n[1] 5 3 7 1 1 8\n\n\nVectors can also contain other types of data, like words\n\nvector_of_words <- c(\"monday\", \"lemon\")\nvector_of_words\n\n[1] \"monday\" \"lemon\" \n\n\nNote that we must enclose words in quotation marks to let R know that we want to use “monday” and “lemon” as values instead of the names of objects. 
Look at what happens if we forget the quotation marks:\n\nvector_of_words <- c(monday, lemon)\n\nError in eval(expr, envir, enclos): object 'monday' not found\n\n\nLater on we will work more with vectors of words, but for now let’s focus on numerical vectors.\n\n2.6.1 Operations with numerical vectors\nR is a “vectorized” language, which means that it can often operate on an entire vector of numbers as easily as on a single number. All the logical and mathematical functions we used before work with vectors:\n\nmy_vec > 2\n\n[1] TRUE TRUE TRUE FALSE FALSE TRUE\n\nmy_vec <= 7\n\n[1] TRUE TRUE TRUE TRUE TRUE FALSE\n\nmy_vec + 7\n\n[1] 12 10 14 8 8 15\n\nmy_vec / 11\n\n[1] 0.45454545 0.27272727 0.63636364 0.09090909 0.09090909 0.72727273\n\nsqrt(my_vec)\n\n[1] 2.236068 1.732051 2.645751 1.000000 1.000000 2.828427\n\nfactorial(my_vec)\n\n[1] 120 6 5040 1 1 40320\n\nlog(my_vec)\n\n[1] 1.609438 1.098612 1.945910 0.000000 0.000000 2.079442\n\nmy_vec*my_vec\n\n[1] 25 9 49 1 1 64\n\n\nIn the last example, R did not follow the rules of linear algebra to multiply two vectors. Instead, R uses “element-wise execution”, which means that R applies the same operation to each member of the vector. For example, my_vec + 7 adds 7 to each number inside my_vec.\nWhen you use two vectors with the same number of elements for an operation, R will line up the vectors and perform a sequence of individual operations. For instance, in my_vec*my_vec, R multiplies the first element of vector 1 by the first element of vector 2, then the second element of vector 1 by the second element of vector 2, and so on, until all elements are multiplied. 
The result will be a new vector the same length as the first two.\nIf you give R two vectors of unequal lengths, R will repeat the shorter vector until it is as long as the longer vector, and then do the math.\n\nmy_vec * c(1, 2)\n\n[1] 5 6 7 2 1 16\n\n\nIf the length of the short vector does not divide evenly into the length of the long vector, R will do an “incomplete repeat” of the shorter vector and return a warning.\n\nmy_vec * c(1, 2, 3, 4)\n\nWarning in my_vec * c(1, 2, 3, 4): longer object length is not a multiple of\nshorter object length\n\n\n[1] 5 6 21 4 1 16\n\n\nRepeating the numbers of the vector is known as “vector recycling”, and it helps R do element-wise operations.\nElement-wise operations allow us to manipulate entire data variables rather than one element at a time. When you start working with data sets, element-wise operations will ensure that values from one observation or case are only paired with values from the same observation or case. Element-wise operations also make it easier to write your own programs and functions in R.\nR can also do vector and matrix multiplications, but we have to explicitly ask for them. For example, to get the inner product, we need the operator %*%:\n\nmy_vec %*% my_vec\n\n [,1]\n[1,] 149\n\n\nAnd to get the outer product, we need %o%:\n\nmy_vec %o% my_vec\n\n [,1] [,2] [,3] [,4] [,5] [,6]\n[1,] 25 15 35 5 5 40\n[2,] 15 9 21 3 3 24\n[3,] 35 21 49 7 7 56\n[4,] 5 3 7 1 1 8\n[5,] 5 3 7 1 1 8\n[6,] 40 24 56 8 8 64\n\n\nIf you are not familiar with matrix operations, don’t worry. You won’t need them in these notes.\n\n\n2.6.2 Extracting elements\nWe can access specific elements of vectors using the square bracket [ ] notation. Write the name of the vector you want to extract from, followed by square brackets containing the index of the element you wish to extract. This index can be a position or the result of a logical test.\nTo extract elements based on their position we simply write the position inside the [ ]. 
Let’s first recall the values of my_vec.\n\nmy_vec\n\n[1] 5 3 7 1 1 8\n\n\nTo extract the 3rd value of my_vec, we use\n\nmy_vec[3]\n\n[1] 7\n\n\nWe can store this value in another object\n\nvalue_3 = my_vec[3]\n\nAnd we can extract multiple elements at a time by using a vector of indices inside the square brackets:\n\nmy_vec[c(1, 3, 5)]\n\n[1] 5 7 1\n\n\nOr we can use : notation. Remember that : helps us create a sequence of values. For example,\n\n3:6\n\n[1] 3 4 5 6\n\n\nSo,\n\nmy_vec[3:6]\n\n[1] 7 1 1 8\n\n\n\n\n\n\n\n\nNote\n\n\n\nIn R, the positional index starts at 1, so to call the first element of a vector we need to use [1]. In most other programming languages (like Python and C++), the positional index starts at 0.\n\n\nAnother convenient way to extract elements from a vector is to use a logical expression as an index. For example, to extract all elements greater than 4 from my_vec, we do\n\nmy_vec[my_vec > 4]\n\n[1] 5 7 8\n\n\nThis works because R uses element-wise operations even for logical statements. So, my_vec > 4 asks if each item of my_vec meets the condition “greater than four” and returns the corresponding vector of TRUE and FALSE. Then, when we place this result inside the square brackets, R examines each element of my_vec asking “should I extract this element?”. If the answer is TRUE, the value is extracted; if it’s FALSE, the value is ignored. Under the hood, my_vec[my_vec > 4] is equivalent to\n\nmy_vec[c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)]\n\n[1] 5 7 8\n\n\nThere are two useful functions to extract values of a vector. head() helps us extract the first few values of a vector.\n\nhead(my_vec, n = 3)\n\n[1] 5 3 7\n\n\ntail() gives us the last few values of a vector\n\ntail(my_vec, n = 3)\n\n[1] 1 1 8\n\n\n\n\n2.6.3 Replacing elements\nWe can replace the elements of a vector by combining the square bracket notation with the assignment operator. 
For example, to replace the second element of my_vec, we do\n\nmy_vec[2] <- 99\nmy_vec\n\n[1] 5 99 7 1 1 8\n\n\nTo replace multiple elements with the same value, say, elements 5 and 6, we do\n\nmy_vec[c(5, 6)] <- 55\nmy_vec\n\n[1] 5 99 7 1 55 55\n\n\nR can also replace elements element-wise:\n\nmy_vec[c(5, 6)] <- c(100, 200)\nmy_vec\n\n[1] 5 99 7 1 100 200\n\n\nWhat happens if you try to replace two values with a vector that has three (or more) values?\n\nmy_vec[c(5, 6)] <- c(100, 200, 500)\n\nWarning in my_vec[c(5, 6)] <- c(100, 200, 500): number of items to replace is\nnot a multiple of replacement length\n\nmy_vec\n\n[1] 5 99 7 1 100 200\n\n\nLogical expressions help us replace values that meet specific conditions without having to find them ourselves.\n\nmy_vec[my_vec > 44] <- -1\nmy_vec\n\n[1] 5 -1 7 1 -1 -1\n\n\n\n\n2.6.4 Reordering elements of vectors\nTo sort the elements of a vector from lowest to highest, we can use sort()\n\nmy_vec <- sort(my_vec)\nmy_vec\n\n[1] -1 -1 -1 1 5 7\n\n\nIf we want to sort from highest to lowest, we need to set the optional argument decreasing to TRUE\n\nmy_vec <- sort(my_vec, decreasing = TRUE)\nmy_vec\n\n[1] 7 5 1 -1 -1 -1\n\n\nAnother option is to first use sort() and then reverse the sorted vector using rev().\n\nmy_vec <- rev(sort(my_vec))\n\nA more useful feature of vectors is that we can reorder their elements based on the values of other vectors. To show this, let’s first create a vector of countries and another vector with (my guess of) their typical daily temperatures.\n\ncountries <- c(\"Japan\", \"Egypt\", \"Mexico\", \"Finland\")\ntemperatures_fahrenheit <- c(50, 90, 65, -10)\n\nImagine we want to order the vector of countries, going from coldest to hottest. 
The first step to reorder the countries is to use order() to create a new variable called “temperatures_ordered”.\n\ntemperatures_ordered <- order(temperatures_fahrenheit)\ntemperatures_ordered\n\n[1] 4 1 3 2\n\n\nThis output says that the lowest value in temperatures_fahrenheit is in the fourth position, the second lowest value is on the first position, and so on. So, we can think of temperatures_ordered as a vector of positional indices of temperatures in ascending order.\nNow we can use these indices to reorder the vector of countries.\n\ncountries_ordered <- countries[temperatures_ordered]\ncountries_ordered\n\n[1] \"Finland\" \"Japan\" \"Mexico\" \"Egypt\" \n\n\nTa-da!\nThese vector manipulations can do more than dazzle your friends. Imagine you have a dataset with two columns of data and you want to sort each column. If you just use sort() on each column separately, the values of each column will become uncoupled from each other. By using order() on one column, a vector of positional indices is created of the values of the column in ascending order. Then we can use this vector as the index of elements on the second column, which will return a vector of values based on the first column." }, { "objectID": "02_getting_started_with_r.html#working-with-scripts", @@ -600,6 +600,6 @@ "href": "02_getting_started_with_r.html#getting-help", "title": "2  Getting Started with R", "section": "2.5 Getting help", - "text": "2.5 Getting help\nTo access R’s built-in help information on any function simply use the help() function. For example, to open the help page for round(), we do\n\nhelp(\"round\")\n\nA shorter way of writing this is to use ? before the name of the function.\n\n?round\n\nAfter you run the code, the help page is displayed in the “Help” tab in the “Files-Plots-Packages” pane (usually in the bottom right of RStudio).\nHelp pages may seem arcane to the novice user, probably because they aim for shortness and use a lot of technical jargon. 
But this short jargon makes the explanations precise (most of them), so we can get the information we need without having to read the entire document. Also, all help pages are organized similarly, so we don’t have to relearn how to navigate them. So, with a bit of practice, you will be able to find exactly what you need in mere seconds.\n\n\n\n\n\n\nHow to learn to read help pages?\n\n\n\nStart by reading the help pages of functions that you can already understand. This will teach you how to understand the structure of the pages and will familiarize you with many technical terms. As you use R you will likely need other, more complicated functions, so reading more help pages will happen almost naturally.\nJust keep in mind that help pages are about code, not about the underlying concepts. If you don’t know what it means to round a number, reading the documentation for round() will not help you.\n\n\nSo far we have been working with only a single number at the time. But R can also work with groups of numbers by using something called “vectors”." + "text": "2.5 Getting help\nTo access R’s built-in help information on any function simply use the help() function. For example, to open the help page for round(), we do\n\nhelp(\"round\")\n\nA shorter way of writing this is to use ? before the name of the function.\n\n?round\n\nAfter you run the code, the help page is displayed in the “Help” tab in the “Files-Plots-Packages” pane (usually in the bottom right of RStudio).\nHelp pages may seem arcane to the novice user, probably because they aim for shortness and use a lot of technical jargon. But this short jargon makes (most of) the explanations precise, so we can use the information we need without having to read the entire document. Also, all help pages are organized similarly, so we don’t have to relearn how to navigate them. 
So, with a bit of practice, you will be able to find exactly what you need in mere seconds.\nThe first line of the help document displays the name of the function and the package that contains the function. Other sections are:\n\nDescription: a short description of the function.\nUsage: names the arguments associated with the function and possible default values.\nArguments: explains what each argument does.\nDetails: a more detailed description of the function.\nValue: if applicable, gives the type and structure of the object returned by the function or the operator.\nSee Also: leads to other help pages with similar or related content.\nExamples: code examples on how to use the function. To see how they work, we just need to copy and paste them into the console. We can also access examples at any time by using the example() function (e.g. example(\"round\")).\n\nThe help() function is useful if we know the name of the function. But if all we remember is a keyword in the name, we can search through R’s help system using help.search()\n\nhelp.search(\"round\")\n\nOr we can use the shortcut ??\n\n??round\n\nAs before, the ‘Help’ tab in RStudio will display the results of the search. help.search() searches through the help documentation, code demonstrations, and package vignettes and displays the results as clickable links that we can follow.\nAnother useful function is apropos(). This function can be used to list all functions whose names contain a specified character string. For example, to find all functions with round in their name, we do\n\napropos(\"round\")\n\n[1] \"round\" \"round.Date\" \"round.POSIXt\"\n\n\nIf we find the function we need, we can look for its documentation.\n\nhelp(\"round.Date\")\n\nAnother useful function is RSiteSearch(), which allows us to search for keywords and phrases in function help pages and vignettes for all CRAN packages, and in CRAN task views. 
So, we can access the online search engine directly from the Console and display the results in our web browser.\n\nRSiteSearch(\"regression\")\n\n\n\n\n\n\n\nHow to get started with help pages?\n\n\n\nStart by reading the help pages of functions that you already understand. This will teach you how to understand the structure of the pages and will familiarize you with the jargon. As you use R you will likely need other, more complicated functions, so reading more help pages will happen almost naturally.\nJust keep in mind that help pages are about code, not about the underlying concepts. If you don’t know what it means to round a number, reading the documentation for round() will not help you.\n\n\nNow that we know how to get help, we can move on to more advanced stuff. Previously we worked with one number at a time. But we can also work with groups of numbers by using something called “vectors”." } ] \ No newline at end of file