Home

The main headings are tabs on the top bar.

Tab 1: Load Data

This tab has a file loader button and a button for selecting sample data.

Load any properly formatted csv file or load one of four datasets from agricolae.
Display the data as a segmented table.
Only continuous variables and factors are supported.
Data with missing values is not supported.
Show the code used to load the data.
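
The restrictions above (no missing values, only continuous variables and factors) could be checked right after loading. The following is a minimal sketch, not the application's actual code; the helper name check.data and the decision to accept character columns (on the assumption they will be coerced to factors) are illustrative choices.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
check.data <- function(my.data) {
  problems <- character(0)
  # Columns that contain at least one missing value.
  has.na <- vapply(my.data, anyNA, logical(1))
  if (any(has.na)) {
    problems <- c(problems,
                  paste("Missing values in:",
                        paste(names(my.data)[has.na], collapse = ", ")))
  }
  # Assumption: numeric columns count as continuous, and character columns
  # are acceptable because they can be coerced to factors later.
  supported <- vapply(my.data, function(column) {
    is.numeric(column) || is.factor(column) || is.character(column)
  }, logical(1))
  if (any(!supported)) {
    problems <- c(problems,
                  paste("Unsupported column type in:",
                        paste(names(my.data)[!supported], collapse = ", ")))
  }
  problems
}

# Made-up data frames for illustration only.
good <- data.frame(yield = c(1.2, 3.4, 2.2),
                   variety = factor(c("a", "b", "a")))
bad <- data.frame(yield = c(1.2, NA, 2.2),
                  day = as.Date("2020-01-01") + 0:2)
print(check.data(good))
print(check.data(bad))
```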

Types of Data

These are the types of data we will support for analyses:

A continuous independent variable and a continuous dependent variable (single-variate linear regression).
A continuous dependent variable + one independent factor variable from a completely randomized experiment design (CRD).
A continuous dependent variable + two independent factor variables from a completely randomized experiment design (CRD).
A continuous dependent variable + one independent "treatment" factor variable + one factor block variable from a randomized complete block design (RCBD).
A continuous dependent variable + two independent "treatment" factor variables + one factor block variable from a randomized complete block design (RCBD).
A continuous dependent variable + two independent "treatment" factor variables + one plot rep factor variable from a split plot completely randomized design.
A continuous dependent variable + two independent "treatment" factor variables + one block factor variable from a split plot randomized complete block design.
(MIXED EFFECT MODEL) A continuous dependent variable + one independent factor variable + one factor block + one RANDOM factor variable (e.g. location or year) (RCBD for mixed-effects ANOVA).
(MIXED EFFECT MODEL) A continuous dependent variable + one independent factor variable + one factor block + two RANDOM factor variables (e.g. location and year) (RCBD for mixed-effects ANOVA).

Example Code

Custom data:

my.data <- read.csv('path/to/data.csv')

Sample data:

library(agricolae)  # load "agricolae" package for the sample data
data("plots")  # either plots, corn, cotton, etc.
my.data <- plots

Tab 2: Analysis

Each sub-heading here represents a section in the side panel.

Side Panel Step 1: Experiment Design

The user will select from a drop down the type of experiment design they used:

Continuous variables (Linear Regression, LR)
One Treatment Completely Randomized Design (CRD1)
Two Treatment Completely Randomized Design (CRD2)
One Treatment Randomized Complete Block Design (RCBD1)
Two Treatment Randomized Complete Block Design (RCBD2)
Two Treatment Split Plot Completely Randomized Design (SPCRD2)
Two Treatment Split Plot Randomized Complete Block Design (SPRCBD2)
One Treatment Randomized Complete Block Design with One Random Effect (RCBDM1)
One Treatment Randomized Complete Block Design with Two Random Effects (RCBDM2)

Side Panel Step 2: Select the dependent variable

This is the same for all data types, i.e. the user selects a single dependent variable. A list of all the variables available in my.data is shown and the user selects one. This variable should always be continuous, but there will be no check that it is; the correct choice is up to the user.

Side Panel Step 3: Select the independent variable(s)

LR

The user can select any one of the remaining columns in the data set from a list. This column should be a continuous variable and there is no check for this.

This will create a base formula like:

Y ~ X

CRD1

The user can select one variable from the remaining columns for the treatment. This should be a factor (it will be coerced into one).

This will create a base formula like:

Y ~ X
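
The coercion mentioned above matters because a treatment coded with numbers (1, 2, 3, ...) would otherwise be treated as a continuous regressor. A minimal sketch of the coercion step follows; the helper name coerce.to.factors and the sample values are illustrative assumptions.

```r
# Hypothetical helper: coerce the named columns of a data frame to factors.
coerce.to.factors <- function(my.data, columns) {
  for (column in columns) {
    my.data[[column]] <- as.factor(my.data[[column]])
  }
  my.data
}

# Made-up data: the treatment X is coded with numbers.
my.data <- data.frame(Y = c(5.1, 4.8, 6.2, 6.0), X = c(1, 1, 2, 2))
my.data <- coerce.to.factors(my.data, "X")
str(my.data)
```

The same helper could be called with several column names for the designs that take a block or a second treatment.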

CRD2

The user can select two variables from the remaining columns for the treatments. These should both be factors (they will be coerced into factors).

This will create a base formula like:

Y ~ X + Z + X:Z

RCBD1

The user can select one variable from the remaining columns for the treatment and select one variable for the block. Both should be factors (they will be coerced into factors if not).

This will create a base formula like:

Y ~ X + BLK

RCBD2

The user can select two variables from the remaining columns for the treatments and select one variable for the block. All should be factors (they will be coerced into factors if not).

This will create a base formula like:

Y ~ X + Z + X:Z + BLK

SPCRD2

There are three drop downs for variable selection: main plot treatment (A), sub plot treatment (B), and replication (R). Each of these variables will be coerced into factors.

This will create a base formula like:

Y ~ A + B + A:B + Error(A:R)

SPRCBD2

There are three drop downs for factor variable selection: main plot treatment (X), sub plot treatment (Z), block (BLK). Each of these variables will be coerced into factors.

This will create a base formula like:

Y ~ X + Z + X:Z + BLK + Error(X:BLK)

RCBDM1

The user can select one variable from the remaining columns for the treatment, select one variable for the block and one variable as a random effect. All should be factors (they will be coerced into factors if not).

This will create a base formula like:

Y ~ X + (1|W/BLK)

RCBDM2

The user can select two variables from the remaining columns for the treatments, select one variable for the block and two variables as random effects. All should be factors (they will be coerced into factors if not).

This will create a base formula like:

Y ~ X + Z + X:Z + BLK + (1|V/W/BLK)

TODO : Make sure this formula is correct.
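
The base formulas for the fixed-effects designs in this step could be assembled from the selected column names. This is only a sketch under assumptions: the function build.formula and its arguments are hypothetical, and the two mixed-effects designs are left out because their formulas are still to be confirmed.

```r
# Hypothetical helper: build the base formula for a design code.
# `treatments` is a character vector of one or two column names.
build.formula <- function(design, y, treatments, block = NULL, rep = NULL) {
  a <- treatments[1]
  b <- if (length(treatments) > 1) treatments[2] else NULL
  # Main effects plus interaction for the two-treatment designs.
  two.way <- function() c(a, b, paste(a, b, sep = ":"))
  rhs <- switch(design,
    LR = ,
    CRD1 = a,
    CRD2 = two.way(),
    RCBD1 = c(a, block),
    RCBD2 = c(two.way(), block),
    SPCRD2 = c(two.way(), sprintf("Error(%s:%s)", a, rep)),
    SPRCBD2 = c(two.way(), block, sprintf("Error(%s:%s)", a, block)),
    stop("Unsupported design: ", design))
  as.formula(paste(y, "~", paste(rhs, collapse = " + ")))
}

print(build.formula("CRD1", "Y", "X"))
print(build.formula("RCBD2", "Y", c("X", "Z"), block = "BLK"))
print(build.formula("SPCRD2", "Y", c("A", "B"), rep = "R"))
```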

Side Panel Step 4: Transformations

This will allow the user to apply transformations to the dependent variable in the model. There will be a drop down to select between None, log10, sqrt, and power. Selecting a transformation will add a new column to my.data holding the transformed values and adjust the formulas used in the analysis to:

Y.pow ~ ...

or

Y.log10 ~ ...

or

Y.sqrt ~ ...

For sqrt and log10 the transformation is simply:

my.data$Y.log10 <- log10(my.data$Y)

and:

my.data$Y.sqrt <- sqrt(my.data$Y)

For the power transformation, an exponent is automatically computed with the following code:

# Compute the log10 mean and log10 variance of Y within each treatment group.
# For one independent variable:
mean.data <- aggregate(Y ~ A, data = my.data, function(x)
                       c(logmean=log10(mean(x)), logvar=log10(var(x))))
# For two independent variables (use this call instead of the one above):
mean.data <- aggregate(Y ~ A + B, data = my.data, function(x)
                       c(logmean=log10(mean(x)), logvar=log10(var(x))))
# Regress log variance on log mean; the exponent is 1 - slope / 2.
power.model <- lm(logvar ~ logmean, data = as.data.frame(mean.data$Y))
power <- 1 - summary(power.model)$coefficients[2, 1] / 2
my.data$Y.pow <- my.data$Y^power
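
To see how the exponent behaves, the same calculation can be wrapped in a small function and run on made-up data. The sketch below is illustrative only: the name estimate.power is an assumption, and the data are constructed so that the group standard deviation is proportional to the group mean, which gives a slope of 2 and therefore an exponent of about 0.

```r
# Hypothetical helper: estimate the power-transformation exponent from the
# slope of log10(group variance) on log10(group mean).
estimate.power <- function(y, groups) {
  log.mean <- log10(as.vector(tapply(y, groups, mean)))
  log.var <- log10(as.vector(tapply(y, groups, var)))
  slope <- unname(coef(lm(log.var ~ log.mean))[2])
  1 - slope / 2
}

# Made-up data: three groups whose spread grows in proportion to their mean.
my.data <- data.frame(
  A = factor(rep(c("low", "mid", "high"), each = 3)),
  Y = c(10, 100, 1000)[rep(1:3, each = 3)] * rep(c(0.8, 1.0, 1.2), times = 3)
)
power <- estimate.power(my.data$Y, my.data$A)
print(power)
```

For two treatment factors, interaction(my.data$A, my.data$B) could be passed as the grouping. An exponent close to 0 is conventionally read as a hint to use the log transformation instead, since Y^0 is constant; the calculation also assumes every group has a positive mean and a non-zero variance.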

Side Panel Step 5: Run Analysis

The user presses the "Run Analysis" button which then displays the code needed to run the analyses and the assumptions tests. The results from running the analyses code will be displayed in the main window, i.e. text results interspersed with graphs.
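
One way to keep the displayed code and the executed code identical is to generate the code as text and evaluate that same text. The sketch below assumes this approach, which may differ from how the application is actually built; generate.lr.code is a hypothetical name, and the built-in cars data set stands in for my.data.

```r
# Hypothetical generator: returns the analysis code for the LR design as text.
generate.lr.code <- function(y, x) {
  c(sprintf("fit <- lm(formula = %s ~ %s, data = my.data)", y, x),
    "summary(fit)",
    "shapiro.test(residuals(fit))")
}

code <- generate.lr.code("dist", "speed")
cat(code, sep = "\n")  # the text shown to the user

# Evaluate the same text in a separate environment that holds the data.
analysis.env <- new.env()
analysis.env$my.data <- cars
last.result <- eval(parse(text = code), envir = analysis.env)
print(last.result)  # result of the final expression (the Shapiro-Wilk test)
```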

LR

The linear regression will find the model fit, show the fit summary, show the results of the Shapiro-Wilk test, and make three plots: fitted vs residuals, Q-Q, and scatter plot with best fit line.

The code produced follows this form:

fit <- lm(formula = Y ~ X, data = my.data)
summary(fit)
shapiro.test(residuals(fit))
plot(fit, which = c(1, 2))
plot(formula = Y ~ X, data = my.data)
abline(fit)

CRD1

The CRD analyses will run a one-way ANOVA, show the ANOVA table, show the results of two assumptions tests (Shapiro-Wilk, Levene) and draw a box plot showing the effect of the levels of the independent variable on the dependent variable.

The code produced follows this form:

fit <- aov(formula = Y ~ X, data = my.data)
summary(fit)
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
        xlab = "X", ylab = "Y")
plot(fit, which = c(1, 2))
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)

CRD2

The CRD analyses will run a two-way ANOVA, show the ANOVA table, show the results of three assumptions tests (Shapiro-Wilk, Levene, Tukey) and draw box plots showing the effect of the levels of the independent variables on the dependent variable along with interaction plots.

The code produced follows this form:

fit <- aov(formula = Y ~ X + Z + X:Z, data = my.data)
summary(fit)
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
        xlab = "X", ylab = "Y")
boxplot(Y ~ Z, data = my.data, main = "Effect of Z on Y",
        xlab = "Z", ylab = "Y")
plot(fit, which = c(1, 2))
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)
leveneTest(Y ~ Z, data = my.data)
my.data$YP.SQ <- predict(fit)^2
tukey.one.df.model <- lm(formula = Y ~ X + Z + X:Z + YP.SQ,
                         data = my.data)
summary(tukey.one.df.model)
library('HH')
intxplot(Y ~ X, groups = Z, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ Z, groups = X, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)

RCBD1

A one-way ANOVA is fit to the RCBD data and three assumptions tests are run (Shapiro-Wilk, Levene, Tukey). The ANOVA table is shown for the fit and three plots are produced: fitted vs residuals, Q-Q, box plot showing the effect of the levels on the dependent variables.

fit <- aov(formula = Y ~ X + BLK, data = my.data)
summary(fit)
plot(fit, which = c(1, 2))
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
        xlab = "X", ylab = "Y")
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)
my.data$YP.SQ <- predict(fit)^2
tukey.one.dof.mod <- lm(formula = Y ~ X + BLK + YP.SQ, data = my.data)
summary(tukey.one.dof.mod)
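
The Tukey step above adds the squared predicted values as an extra regressor; the row for YP.SQ in the summary is the one-degree-of-freedom test for non-additivity. The sketch below runs the same steps on simulated data and pulls out that row. The data are invented for illustration (additive treatment and block effects plus noise), so no particular p-value should be read into them.

```r
# Simulated RCBD data: 4 treatments in 5 blocks with additive effects.
set.seed(1)
my.data <- expand.grid(X = factor(1:4), BLK = factor(1:5))
my.data$Y <- 10 + as.numeric(my.data$X) + 0.5 * as.numeric(my.data$BLK) +
  rnorm(nrow(my.data), sd = 0.3)

fit <- aov(formula = Y ~ X + BLK, data = my.data)
my.data$YP.SQ <- predict(fit)^2
tukey.one.dof.mod <- lm(formula = Y ~ X + BLK + YP.SQ, data = my.data)

# Extract the coefficient row for the squared predictions.
tukey.row <- coef(summary(tukey.one.dof.mod))["YP.SQ", ]
p.value <- unname(tukey.row["Pr(>|t|)"])
print(tukey.row)
```

A small p-value here would suggest that treatment and block effects are not additive.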

RCBD2

A three-way ANOVA is fit to the RCBD data and three assumptions tests are run (Shapiro-Wilk, Levene, Tukey). The ANOVA table is shown for the fit and six plots are produced: fitted vs residuals, Q-Q, box plots showing the effect of the levels on the dependent variables, and two interaction plots.

fit <- aov(formula = Y ~ BLK + X + Z + X:Z, data = my.data)
summary(fit)
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
        xlab = "X", ylab = "Y")
boxplot(Y ~ Z, data = my.data, main = "Effect of Z on Y",
        xlab = "Z", ylab = "Y")
plot(fit, which = c(1, 2))
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)
leveneTest(Y ~ Z, data = my.data)
my.data$YP.SQ <- predict(fit)^2
tukey.one.df.model <- lm(formula = Y ~ BLK + X + Z + X:Z + YP.SQ,
                         data = my.data)
summary(tukey.one.df.model)
library('HH')
intxplot(Y ~ X, groups = Z, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ Z, groups = X, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)

SPCRD2

The Split Plot CRD analyses will run a two-way ANOVA, show the ANOVA table, show the results of three assumptions tests (Shapiro-Wilk, Levene, Tukey) and plot the residuals vs fitted, Q-Q, and two interaction plots.

fit <- aov(formula = Y ~ A + B + A:B + Error(A:R), data = my.data)
summary(fit)
# Refit without the error term for the diagnostic plots and assumptions tests.
fit.no.error <- aov(formula = Y ~ A + B + A:B, data = my.data)
plot(fit.no.error, which = c(1, 2))
shapiro.test(residuals(fit.no.error))
library('car')
leveneTest(Y ~ A, data = my.data)
leveneTest(Y ~ B, data = my.data)
my.data$YP.SQ <- predict(fit.no.error)^2
tukey.one.df.fit <- lm(formula = Y ~ A + B + A:B + YP.SQ, data = my.data)
summary(tukey.one.df.fit)
library('HH')
intxplot(Y ~ A, groups = B, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ B, groups = A, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)

SPRCBD2

A three-way ANOVA is fit to the split plot RCBD data and three assumptions tests are run (Shapiro-Wilk, Levene, Tukey). The ANOVA table is shown for the fit and four plots are produced: fitted vs residuals, Q-Q, and two interaction plots.

fit <- aov(formula = Y ~ BLK + A + B + A:B + Error(A:BLK), data = my.data)
summary(fit)
# Refit without the error term for the diagnostic plots and assumptions tests.
fit.no.error <- aov(formula = Y ~ BLK + A + B + A:B, data = my.data)
plot(fit.no.error, which = c(1, 2))
shapiro.test(residuals(fit.no.error))
library('car')
leveneTest(Y ~ A, data = my.data)
leveneTest(Y ~ B, data = my.data)
my.data$YP.SQ <- predict(fit.no.error)^2
tukey.one.df.fit <- lm(formula = Y ~ BLK + A + B + A:B + YP.SQ, data = my.data)
summary(tukey.one.df.fit)
library('HH')
intxplot(Y ~ A, groups = B, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ B, groups = A, data = my.data, se = TRUE,
         ylim = range(my.data$Y), offset.scale = 500)

RCBDM1

TODO : Waiting on the complete example.

RCBDM2

TODO : Waiting on the complete example.

Tab 3: Post Hoc Analysis

This will have a button to run post hoc analyses.

TODO : This needs some help. Not sure if I know exactly what to do here.

We could let the user be in charge of choosing the factors to run LSD.test() on and not have any checking. This would be the easiest to implement.

Significant variables are identified.
For each significant variable, LSD.test() is run on that variable.
The text output of the $groups attribute is shown, i.e. a table that shows whether the factors are significantly different from each other.
Are any plots needed from this? Maybe just identify the "letters" for each factor level, determined by LSD.test, so they can be called as labels in a bar graph. The plots will be produced under Tab 5.
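
If the automatic route is chosen, the significant terms could be read from the ANOVA table and passed to LSD.test() one at a time. The following is a sketch under assumptions: significant.terms is a hypothetical helper, the 0.05 threshold is an arbitrary default, only main effects are handled, and the built-in PlantGrowth data set is used purely for illustration.

```r
# Hypothetical helper: names of the ANOVA terms with p-values below alpha.
significant.terms <- function(fit, alpha = 0.05) {
  anova.table <- summary(fit)[[1]]
  p.values <- anova.table[["Pr(>F)"]]
  term.names <- trimws(rownames(anova.table))
  term.names[!is.na(p.values) & p.values < alpha]
}

fit <- aov(formula = weight ~ group, data = PlantGrowth)
sig <- significant.terms(fit)
print(sig)

# Run LSD.test() on each significant main effect (interactions are skipped
# in this sketch). Guarded so the example still runs without agricolae.
if (requireNamespace("agricolae", quietly = TRUE)) {
  for (term in sig[!grepl(":", sig)]) {
    result <- agricolae::LSD.test(fit, term, console = FALSE)
    print(result$groups)  # means with their grouping letters
  }
}
```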

Tab 4: Downloads

A button to download a pdf report that includes R code and graphs.
A button to download an R script that will execute the analyses created by the GUI.
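
For the R script download, the generated code lines could simply be written to a file. The sketch below shows that step in base R and then sources the file in a fresh environment as a check that the script runs on its own. The file name, the header comment, and the use of the built-in cars data are illustrative assumptions; how the file is handed to the browser depends on the GUI framework and is not shown.

```r
# Code lines as they might be produced by the GUI (illustrative content).
script.lines <- c(
  "# Analysis script generated by the GUI",
  "my.data <- cars",
  "fit <- lm(formula = dist ~ speed, data = my.data)",
  "print(summary(fit))"
)

# Write the script to a temporary location.
script.path <- file.path(tempdir(), "analysis.R")
writeLines(script.lines, script.path)

# Check that the script is self-contained by running it in a new environment.
script.env <- new.env()
source(script.path, local = script.env)
```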

Tab 5: Help

Help text for all the functionality.

Tab 6: About

Image and link to USAID plus a disclaimer.
