TeachR

Repository for lectures and notes from my office hours, etc.

pres: contains all Rmarkdown and knitted results. Contents below:
- ml2
  - Contains an overview of caret, knn, and naive bayes
- ml1
  - Contains overcomplicated knn code; good as an exercise in overly fancy R code and not much else
- html-scraping
  - Contains a primer on scraping websites with rvest, as well as a brief introduction to pipes and a few user-defined functions. Read this, then scraping.R as a follow-up with more code to play with.
- copyonmodify
  - Discusses how we should avoid for loops and "growing vectors" in general, due to some of the fun little quirks of R.
R: contains all R code. Contents below:
- eda1/R
  - Feature elimination tricks in R!!
- final.R
  - R example of automated EDA, some training of models, from our final office hours :(
- logo.R
  - R code used to make the logo for this repository
- scraping.R
  - Example html scraping code
- tidy1.R
  - a quick primer on dplyr
- count-and-pipes.R
  - Counting with dplyr and piping with magrittr
- applied.R
  - Contains the first really advanced material we will cover here: a review of apply/lapply (the R equivalent of a mathematical map), anonymous functions, lists of functions, and "function factories" (closures), all brought together in one crazy example. Will be made into a .Rmd soon enough
- lm_1.R
  - Contains the basics of linear modeling
- ml1.R
  - Contains overcomplicated knn
- json2gif
  - Contains simple sample code where a JSON is used to show the movement of bodies through time
src: contains C/C++ code that is used to speed up R. Currently this is empty, and we may not use this directory. Interested parties can open an issue, email me, or message me on Slack and we will work on this. For now, it is enough to know that this is part of the structure of a big R project.
fig: contains images and figures generated
data: contains minimal data. It is best to save data here as RData/rda rather than csv, because the compressed binary format is much lighter on disk.
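
As a rough illustration of the csv versus RData point, here is a minimal sketch. The object name cars_big and the temporary file paths are made up for the example; in this repository the files would live under data/. The sizes you see will depend on your data.

```r
# Hypothetical example data: the built-in mtcars data frame, repeated for bulk.
cars_big <- mtcars[rep(seq_len(nrow(mtcars)), 500), ]
original <- cars_big  # keep a copy to compare against after reloading

# Temporary paths stand in for files under data/.
csv_path <- tempfile(fileext = ".csv")
rda_path <- tempfile(fileext = ".RData")

write.csv(cars_big, csv_path, row.names = FALSE)
save(cars_big, file = rda_path)  # save() compresses by default

# Compare the sizes on disk, in bytes.
file.size(csv_path)
file.size(rda_path)

# load() restores the object under its saved name and returns that name.
rm(cars_big)
loaded_names <- load(rda_path)
identical(cars_big, original)
```

Note that the object comes back under the name it was saved with, so it helps to save objects with descriptive names.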

The basic structure of a good R project

This repository is a glimpse of what a well-structured R project looks like. In general, we put R code in the R directory, the pretty output in its own directory (pres here), images in their own directory (fig here), and low-level code in a src directory. If you intend to develop an R package, which I would be happy to discuss, a good reading is Hadley's "package structure". It is also useful information for your own R projects. I will provide (opinionated) thoughts on workflow, project structure, etc. later on.
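
If you want to start a project with the same layout, something like the following sketch would create the skeleton. The project name myproject and the use of a temporary directory are assumptions for the example; point project_root wherever you actually want the project to live.

```r
# Hypothetical project location; a temporary directory keeps the sketch harmless.
project_root <- file.path(tempdir(), "myproject")

# Directory names follow the layout of this repository.
dirs <- c("R", "pres", "fig", "data", "src")

# dir.create() takes one path at a time, so apply it to each directory.
created <- vapply(
  file.path(project_root, dirs),
  dir.create,
  logical(1),
  recursive = TRUE,
  showWarnings = FALSE
)

# List what now exists directly under the project root.
list.dirs(project_root, full.names = FALSE, recursive = FALSE)
```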

FAQ

Why won't my file knit?

Trying to use CRAN without setting a mirror

Simple solution: don't put install.packages in Rmarkdown files.

More complex solution: install.packages("packagename", repos = "http://cran.us.r-project.org")
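
If installation really has to happen while knitting, one possible approach is to set the mirror once and only install what is missing. This is a sketch, not a recommendation over the simple solution: install_if_missing is a hypothetical helper written for this example, not part of any package.

```r
# Set the CRAN mirror once, e.g. in the setup chunk of the .Rmd file.
options(repos = c(CRAN = "http://cran.us.r-project.org"))

# Hypothetical helper: install a package only when it is not already available.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # uses the mirror set above
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call installs nothing.
install_if_missing("stats")
```

Because the mirror is set through options(), every later install.packages() call in the same session picks it up without a repos argument.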

Cannot find my file

First attempt at answering: setwd() does not persist in knitr, because the working directory is reset after each chunk. Instead, in the R setup chunk, call knitr::opts_knit$set(root.dir = '/path/to/root/dir/of/project'), or set the root directory with the RStudio GUI.

A simpler but far less reproducible approach is to use absolute paths. In general it is better to use relative paths, so prefer the solution above. Setting the root directory tells knitr to execute your R code in a session whose working directory is the one you specified, so all paths relative to it should work.
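
As a sketch of what the setup chunk could contain: project_root below is a placeholder (a temporary directory here), and the requireNamespace() guard is only there so the snippet also runs on a machine without knitr installed.

```r
# Hypothetical project root; replace with the path to your own project.
project_root <- normalizePath(tempdir())

# In a real .Rmd this goes in the setup chunk, before any chunk that reads files.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_knit$set(root.dir = project_root)
  knitr::opts_knit$get("root.dir")
}
```

The setting applies to the chunks that follow, which is why it belongs in the setup chunk rather than next to the code that reads the file.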

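A final solution: say you have an Rmarkdown file in pres and a data file in data. Then, in the .Rmd file, we can write:

# load() restores the saved objects under their original names and
# returns only those names, so do not assign its result to df
load('../data/myfile.RData')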
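
To see the relative path at work outside of knitr, here is a small self-contained sketch. The project folder relpath-demo and the object name myfile_df are invented for the example, and setwd() is used only to imitate knitr, which by default runs chunks from the directory that holds the .Rmd file.

```r
# Build a throwaway project with pres/ and data/ (hypothetical layout and names).
project_root <- file.path(tempdir(), "relpath-demo")
dir.create(file.path(project_root, "pres"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project_root, "data"), recursive = TRUE, showWarnings = FALSE)

# Save an object named myfile_df into data/myfile.RData.
myfile_df <- data.frame(x = 1:3, y = c("a", "b", "c"))
save(myfile_df, file = file.path(project_root, "data", "myfile.RData"))
rm(myfile_df)

# By default knitr runs chunks from the directory of the .Rmd file, here pres/.
old_wd <- setwd(file.path(project_root, "pres"))
load("../data/myfile.RData")  # restores myfile_df
setwd(old_wd)                 # put the working directory back

myfile_df
```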

Links to other useful sites and readings

caret documentation
ml metrics
naive bayes overview
naive bayes math/fast naive bayes
awesome-msds
- a MSDS student's repository containing awesome resources for the program
awesome-r
- awesome R packages
rmarkdown manual
- an amazing resource for knitting
knitr options
- More knitting resources
why should I use functions
- Functions not only make your code more readable, they also make repetitive tasks easier. In my opinion, we should write many small functions and combine them into bigger ones. This makes our code more readable and more useful overall. See below for an example:

# let us say we want to be able to take the log of any number, and if it is negative
# we want to make it the absolute value. This is not directly useful, but in math
# it pops up a lot (see differential equations)
square <- function(x){
	x*x
}

# yes, there is an absolute value function, abs(), but this is for demonstration purposes
absval <- function(x){
	sqrt(square(x))
}

# We are including `...` because the log() function can take extra arguments, e.g.
# base. We want to be able to have those be allowed in our function too.
abslog <- function(x,...){
	log(absval(x),...)
}

abslog(-2)
# [1] 0.6931472
abslog(2)
# [1] 0.6931472

# Now let's see the ... in action
abslog(-10, base = 10)
# [1] 1
abslog(3432, base = 2)
# [1] 11.74483
abslog(-3432, base = 2)
# [1] 11.74483

more to come
even more
