Home
The main headings are tabs on the top bar.
This tab has a file loader button and a button for selecting sample data.
- Load any properly formatted csv file or load one of four datasets from agricolae.
- Display the data as a segmented table.
- Only continuous variables and factors are supported.
- Data with missing values is not supported.
- Show the code used to load the data.
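The two restrictions above (only continuous variables and factors, no missing values) could be checked right after loading. The following is a minimal sketch, not the app's actual code: the function name `check_supported` is hypothetical, and treating character columns as acceptable (because they can be coerced to factors later) is an assumption.

```r
# Hypothetical helper: returns a character vector of problems found in a
# loaded data frame (empty if the data looks usable).
check_supported <- function(data) {
  problems <- character(0)

  # Data with missing values is not supported.
  if (anyNA(data)) {
    problems <- c(problems, "data contains missing values")
  }

  # Only continuous (numeric) columns and factors are supported.
  # Assumption: character columns are accepted because they can be
  # coerced to factors later.
  ok.type <- vapply(
    data,
    function(col) is.numeric(col) || is.factor(col) || is.character(col),
    logical(1)
  )
  if (!all(ok.type)) {
    problems <- c(problems,
                  paste("unsupported column type:", names(data)[!ok.type]))
  }

  problems
}

good <- data.frame(Y = c(1.2, 3.4), X = c("a", "b"))
bad <- data.frame(Y = c(1.2, NA), X = c(TRUE, FALSE))
check_supported(good)
check_supported(bad)
```

An empty result means both restrictions are met; otherwise each element names one problem that could be shown to the user.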
These are the types of data we will support for analyses:
- A continuous independent variable and a continuous dependent variable (single-variable linear regression).
- A continuous dependent variable + one independent factor variable from a completely randomized design (CRD).
- A continuous dependent variable + two independent factor variables from a completely randomized design (CRD).
- A continuous dependent variable + one independent "treatment" factor variable + one block factor variable from a randomized complete block design (RCBD).
- A continuous dependent variable + two independent "treatment" factor variables + one block factor variable from a randomized complete block design (RCBD).
- A continuous dependent variable + two "treatment" factor variables + one plot rep factor variable from a split plot completely randomized design.
- A continuous dependent variable + two "treatment" factor variables + one block factor variable from a split plot randomized complete block design.
- (MIXED EFFECT MODEL) A continuous dependent variable + one factor variable + one block factor + one RANDOM factor variable (e.g. location or year) (RCBD for mixed-effects ANOVA).
- (MIXED EFFECT MODEL) A continuous dependent variable + one factor variable + one block factor + two RANDOM factor variables (e.g. location and year) (RCBD for mixed-effects ANOVA).
Custom data:
my.data <- read.csv('path/to/data.csv')
Sample data:
library(agricolae) # load "agricolae" package for the sample data
data("plots") # either plots, corn, cotton, etc
my.data <- plots
Each sub-heading here represents a section in the side panel.
The user will select from a drop down the type of experiment design they used:
- Continuous variables (Linear Regression, LR)
- One Treatment Completely Randomized Design (CRD1)
- Two Treatment Completely Randomized Design (CRD2)
- One Treatment Randomized Complete Block Design (RCBD1)
- Two Treatment Randomized Complete Block Design (RCBD2)
- Two Treatment Split Plot Completely Randomized Design (SPCRD2)
- Two Treatment Split Plot Randomized Complete Block Design (SPRCBD2)
- One Treatment Randomized Complete Block Design with One Random Effect (RCBDM1)
- One Treatment Randomized Complete Block Design with Two Random Effects (RCBDM2)
This is the same for all data types: the user selects a single dependent variable. A list of all the variables available in my.data is shown and the user selects one. This variable should always be continuous, but there will be no check that it is; the correct choice is up to the user.
The user can select any one of the remaining columns in the data set from a list. This column should be a continuous variable and there is no check for this.
This will create a base formula like:
Y ~ X
The user can select one variable from the remaining columns for the treatment. This should be a factor (it will be coerced into one).
This will create a base formula like:
Y ~ X
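The coercion matters because the one-treatment formula is textually the same as the linear regression one. The sketch below uses made-up data to show that a treatment stored as numeric codes is fit as a single slope (1 degree of freedom) unless it is coerced to a factor first.

```r
# Illustrative data only: three treatment levels coded as the numbers 1-3.
my.data <- data.frame(
  X = rep(1:3, each = 3),
  Y = c(9.9, 10.0, 10.1, 19.9, 20.0, 20.1, 14.9, 15.0, 15.1)
)

# Without coercion, X is treated as a continuous covariate.
fit.numeric <- aov(Y ~ X, data = my.data)
df.numeric <- summary(fit.numeric)[[1]][["Df"]][1]

# After coercion, X is a treatment factor with (levels - 1) degrees of freedom.
my.data$X <- as.factor(my.data$X)
fit.factor <- aov(Y ~ X, data = my.data)
df.factor <- summary(fit.factor)[[1]][["Df"]][1]

c(numeric = df.numeric, factor = df.factor)
```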
The user can select two variables from the remaining columns for the treatments. These should both be factors (they will be coerced into factors).
This will create a base formula like:
Y ~ X + Z + X:Z
The user can select one variable from the remaining columns for the treatment and select one variable for the block. Both should be factors (they will be coerced into factors if not).
This will create a base formula like:
Y ~ X + BLK
The user can select two variables from the remaining columns for the treatments and select one variable for the block. All should be factors (they will be coerced into factors if not).
This will create a base formula like:
Y ~ X + Z + X:Z + BLK
There are three drop downs for variable selection: main plot treatment (A), sub plot treatment (B), and replication (R). Each of these variables will be coerced into factors.
This will create a base formula like:
Y ~ A + B + A:B + Error(A:R)
There are three drop downs for factor variable selection: main plot treatment (X), sub plot treatment (Z), block (BLK). Each of these variables will be coerced into factors.
This will create a base formula like:
Y ~ X + Z + X:Z + BLK + Error(X:BLK)
The user can select one variable from the remaining columns for the treatment, select one variable for the block and one variable as a random effect. All should be factors (they will be coerced into factors if not).
This will create a base formula like:
Y ~ X + (1|W/BLK)
The user can select two variables from the remaining columns for the treatments, select one variable for the block and two variables as random effects. All should be factors (they will be coerced into factors if not).
This will create a base formula like:
Y ~ X + Z + X:Z + BLK + (1|V/W/BLK)
TODO : Make sure this formula is correct.
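The base formulas for the fixed-effect designs can be assembled from the selected column names. This is a sketch under assumptions: the function `build_formula` and its argument names are hypothetical, and the two mixed-effects designs are left out because their formulas are still marked TODO.

```r
# Hypothetical helper: build the base formula for a design code from the
# column names picked in the drop downs.
build_formula <- function(design, y, x, z = NULL, blk = NULL, rep = NULL) {
  rhs <- switch(design,
    LR = ,
    CRD1 = x,
    CRD2 = paste(x, z, paste0(x, ":", z), sep = " + "),
    RCBD1 = paste(x, blk, sep = " + "),
    RCBD2 = paste(x, z, paste0(x, ":", z), blk, sep = " + "),
    SPCRD2 = paste(x, z, paste0(x, ":", z),
                   sprintf("Error(%s:%s)", x, rep), sep = " + "),
    SPRCBD2 = paste(x, z, paste0(x, ":", z), blk,
                    sprintf("Error(%s:%s)", x, blk), sep = " + "),
    stop("unsupported design: ", design)
  )
  as.formula(paste(y, "~", rhs))
}

build_formula("RCBD2", y = "Y", x = "X", z = "Z", blk = "BLK")
build_formula("SPCRD2", y = "Y", x = "A", z = "B", rep = "R")
```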
This will allow the user to apply transformations to the dependent variable in the model. There will be a drop down to select between None, log10, sqrt, and power. This will effectively add a new column to my.data with one of the three transformations and adjust the formulas used in the analysis to:
Y.pow ~ ...
or
Y.log10 ~ ...
or
Y.sqrt ~ ...
For sqrt and log10 the transformation is simply:
my.data$Y.log10 <- log10(my.data$Y)
and:
my.data$Y.sqrt <- sqrt(my.data$Y)
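One way to add the transformed column and swap the response in the formula at the same time is `update()`. This is only a sketch: the helper `apply_transformation` and its arguments are hypothetical, and it covers the log10 and sqrt choices only.

```r
# Hypothetical helper: add the transformed column and return the data
# together with a formula whose response points at the new column.
apply_transformation <- function(data, formula, response = "Y",
                                 type = c("none", "log10", "sqrt")) {
  type <- match.arg(type)
  if (type == "none") {
    return(list(data = data, formula = formula))
  }
  new.name <- paste(response, type, sep = ".")
  data[[new.name]] <- switch(type,
    log10 = log10(data[[response]]),
    sqrt = sqrt(data[[response]])
  )
  # "." on the right-hand side keeps the existing model terms.
  new.formula <- update(formula, as.formula(paste(new.name, "~ .")))
  list(data = data, formula = new.formula)
}

example.data <- data.frame(Y = c(1, 10, 100), X = c(1, 2, 3))
res <- apply_transformation(example.data, Y ~ X, type = "log10")
res$formula
res$data
```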
For the power transformation, an exponent is automatically computed with the following code:
# Use one of the two aggregate() calls, depending on the number of
# independent variables.
# For one independent variable.
mean.data <- aggregate(Y ~ A, data = my.data, function(x)
  c(logmean = log10(mean(x)), logvar = log10(var(x))))
# For two independent variables.
mean.data <- aggregate(Y ~ A + B, data = my.data, function(x)
  c(logmean = log10(mean(x)), logvar = log10(var(x))))
# Regress log variance on log mean across the treatment groups; the slope
# sets the exponent.
power.model <- lm(logvar ~ logmean, data = as.data.frame(mean.data$Y))
power <- 1 - summary(power.model)$coefficients[2, 1] / 2
my.data$Y.pow <- my.data$Y^power
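As a worked check of the exponent calculation, the made-up data below is built so that each group's variance equals its mean. The slope of log variance on log mean is then 1, so the exponent comes out as 1 - 1/2 = 0.5, i.e. a square root transformation. The slope is read with `coef()` here, which returns the same estimate as the `summary()` coefficient table.

```r
# Illustrative data: group means 4, 9, 16, 25 with variances 4, 9, 16, 25.
my.data <- data.frame(
  A = factor(rep(c("g1", "g2", "g3", "g4"), each = 3)),
  Y = c(2, 4, 6, 6, 9, 12, 12, 16, 20, 20, 25, 30)
)

# Log mean and log variance per treatment group.
mean.data <- aggregate(Y ~ A, data = my.data, function(x)
  c(logmean = log10(mean(x)), logvar = log10(var(x))))

# The slope of log variance on log mean determines the exponent.
power.model <- lm(logvar ~ logmean, data = as.data.frame(mean.data$Y))
slope <- coef(power.model)[["logmean"]]
power <- 1 - slope / 2

my.data$Y.pow <- my.data$Y^power
power
```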
The user presses the "Run Analysis" button which then displays the code needed to run the analyses and the assumptions tests. The results from running the analyses code will be displayed in the main window, i.e. text results interspersed with graphs.
The linear regression will find the model fit, show the fit summary, show the results of the Shapiro-Wilk test, and make three plots: fitted vs residuals, Q-Q, and scatter plot with best fit line.
The code produced follows this form:
fit <- lm(formula = Y ~ X, data = my.data)
summary(fit)
shapiro.test(residuals(fit))
plot(fit, which = c(1, 2))
plot(Y ~ X, data = my.data)
abline(fit)
The one-treatment CRD analysis will run a one-way ANOVA, show the ANOVA table, show the results of two assumptions tests (Shapiro-Wilk, Levene), and plot a box plot showing the effect of the levels of the independent variable on the dependent variable.
The code produced follows this form:
fit <- aov(formula = Y ~ X, data = my.data)
summary(fit)
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
xlab = "X", ylab = "Y")
plot(fit, which = c(1, 2))
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)
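For readers without the car package, the idea behind the Levene test can be sketched in base R: it is a one-way ANOVA on the absolute deviations of each observation from its group center. The sketch below uses the group median, which is the default center in `car::leveneTest`; the data is made up for illustration.

```r
# Illustrative data: three treatment levels, four observations each.
my.data <- data.frame(
  X = factor(rep(c("a", "b", "c"), each = 4)),
  Y = c(9, 10, 11, 10.5, 18, 20, 22, 21, 14, 15, 16, 15.5)
)

# Absolute deviation of each observation from its group median.
group.median <- ave(my.data$Y, my.data$X, FUN = median)
my.data$abs.dev <- abs(my.data$Y - group.median)

# One-way ANOVA on the deviations; a small p-value suggests unequal variances.
levene.table <- anova(lm(abs.dev ~ X, data = my.data))
levene.p <- levene.table[["Pr(>F)"]][1]
levene.table
```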
The two-treatment CRD analysis will run a two-way ANOVA, show the ANOVA table, show the results of three assumptions tests (Shapiro-Wilk, Levene, Tukey), and plot box plots showing the effect of the levels of the independent variables on the dependent variable along with interaction plots.
The code produced follows this form:
fit <- aov(formula = Y ~ X + Z + X:Z, data = my.data)
summary(fit)
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
xlab = "X", ylab = "Y")
boxplot(Y ~ Z, data = my.data, main = "Effect of Z on Y",
xlab = "Z", ylab = "Y")
plot(fit, which = c(1, 2))
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)
leveneTest(Y ~ Z, data = my.data)
my.data$YP.SQ <- predict(fit)^2
tukey.one.df.model <- lm(formula = Y ~ X + Z + X:Z + YP.SQ,
data = my.data)
summary(tukey.one.df.model)
library('HH')
intxplot(Y ~ X, groups = Z, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ Z, groups = X, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
A one-way ANOVA is fit to the RCBD data and three assumptions tests are run (Shapiro-Wilk, Levene, Tukey). The ANOVA table is shown for the fit and three plots are produced: fitted vs residuals, Q-Q, box plot showing the effect of the levels on the dependent variables.
fit <- aov(formula = Y ~ X + BLK, data = my.data)
summary(fit)
plot(fit, which = c(1, 2))
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
xlab = "X", ylab = "Y")
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)
my.data$YP.SQ <- predict(fit)^2
tukey.one.dof.mod <- lm(formula = Y ~ X + BLK + YP.SQ, data = my.data)
summary(tukey.one.dof.mod)
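The last three lines above implement Tukey's one-degree-of-freedom test for nonadditivity: the squared fitted values are added as a covariate, and the test result is the row for that covariate in the coefficient table. The sketch below runs the same steps on simulated RCBD data (the data and the names `tukey.mod` and `tukey.p` are illustrative) and pulls out the p-value directly.

```r
# Simulated RCBD data: 4 treatments in 5 blocks, purely additive effects.
set.seed(42)
my.data <- expand.grid(
  X = factor(paste0("T", 1:4)),
  BLK = factor(paste0("B", 1:5))
)
my.data$Y <- 10 + 2 * as.integer(my.data$X) + as.integer(my.data$BLK) +
  rnorm(nrow(my.data), sd = 0.5)

fit <- aov(Y ~ X + BLK, data = my.data)

# Add the squared fitted values as a single extra covariate.
my.data$YP.SQ <- predict(fit)^2
tukey.mod <- lm(Y ~ X + BLK + YP.SQ, data = my.data)

# The p-value for YP.SQ is the test result; a small value suggests a
# treatment-by-block interaction (nonadditivity).
tukey.p <- summary(tukey.mod)$coefficients["YP.SQ", "Pr(>|t|)"]
tukey.p
```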
A three-way ANOVA is fit to the RCBD data and three assumptions tests are run (Shapiro-Wilk, Levene, Tukey). The ANOVA table is shown for the fit and six plots are produced: fitted vs residuals, Q-Q, box plots showing the effect of the levels on the dependent variables, and two interaction plots.
fit <- aov(formula = Y ~ BLK + X + Z + X:Z, data = my.data)
summary(fit)
boxplot(Y ~ X, data = my.data, main = "Effect of X on Y",
xlab = "X", ylab = "Y")
boxplot(Y ~ Z, data = my.data, main = "Effect of Z on Y",
xlab = "Z", ylab = "Y")
plot(fit, which = c(1, 2))
shapiro.test(residuals(fit))
library('car')
leveneTest(Y ~ X, data = my.data)
leveneTest(Y ~ Z, data = my.data)
my.data$YP.SQ <- predict(fit)^2
tukey.one.df.model <- lm(formula = Y ~ BLK + X + Z + X:Z + YP.SQ,
data = my.data)
summary(tukey.one.df.model)
library('HH')
intxplot(Y ~ X, groups = Z, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ Z, groups = X, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
The Split Plot CRD analysis will run a two-way ANOVA, show the ANOVA table, show the results of three assumptions tests (Shapiro-Wilk, Levene, Tukey), and plot the residuals vs fitted, Q-Q, and two interaction plots.
fit <- aov(formula = Y ~ A + B + A:B + Error(A:R), data = my.data)
summary(fit)
# The assumptions tests use a fit without the Error() term.
fit.no.error <- aov(formula = Y ~ A + B + A:B, data = my.data)
plot(fit.no.error, which = c(1, 2))
shapiro.test(residuals(fit.no.error))
library('car')
leveneTest(Y ~ A, data = my.data)
leveneTest(Y ~ B, data = my.data)
my.data$YP.SQ <- predict(fit.no.error)^2
tukey.one.df.fit <- lm(formula = Y ~ A + B + A:B + YP.SQ, data = my.data)
summary(tukey.one.df.fit)
library('HH')
intxplot(Y ~ A, groups = B, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ B, groups = A, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
A three-way ANOVA is fit to the split plot RCBD data and three assumptions tests are run (Shapiro-Wilk, Levene, Tukey). The ANOVA table is shown for the fit and four plots are produced: residuals vs fitted, Q-Q, and two interaction plots.
fit <- aov(formula = Y ~ BLK + A + B + A:B + Error(A:BLK), data = my.data)
summary(fit)
# The assumptions tests use a fit without the Error() term.
fit.no.error <- aov(formula = Y ~ BLK + A + B + A:B, data = my.data)
plot(fit.no.error, which = c(1, 2))
shapiro.test(residuals(fit.no.error))
library('car')
leveneTest(Y ~ A, data = my.data)
leveneTest(Y ~ B, data = my.data)
my.data$YP.SQ <- predict(fit.no.error)^2
tukey.one.df.fit <- lm(formula = Y ~ BLK + A + B + A:B + YP.SQ, data = my.data)
summary(tukey.one.df.fit)
library('HH')
intxplot(Y ~ A, groups = B, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
intxplot(Y ~ B, groups = A, data = my.data, se = TRUE,
ylim = range(my.data$Y), offset.scale = 500)
TODO : Waiting on the complete example.
TODO : Waiting on the complete example.
This will have a button to run post hoc analyses.
TODO : This needs some help. Not sure if I know exactly what to do here.
We could let the user be in charge of choosing the factors to run LSD.test()
on and not have any checking. This would be the easiest to implement.
- Significant variables are identified.
- For each significant variable, LSD.test() is run on that variable.
- The text output of the $groups attribute is shown, i.e. a table that shows whether the factors are significantly different from each other.
- Are any plots needed from this? Maybe just identify the "letters" for each factor level, determined by LSD.test, so they can be called as labels in a bar graph. The plots will be produced under Tab 5.
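The first three steps above might look like the following. This is a sketch, not a settled design: the 0.05 cutoff, the restriction to main effects, and the made-up data are all assumptions, and the `LSD.test()` call only runs if the agricolae package is installed.

```r
# Illustrative one-treatment data with clearly different group means.
my.data <- data.frame(
  X = factor(rep(c("a", "b", "c"), each = 3)),
  Y = c(9.9, 10.0, 10.1, 19.9, 20.0, 20.1, 14.9, 15.0, 15.1)
)
fit <- aov(Y ~ X, data = my.data)

# Step 1: identify significant terms from the ANOVA table.
anova.table <- summary(fit)[[1]]
term.names <- trimws(rownames(anova.table))
p.values <- anova.table[["Pr(>F)"]]
alpha <- 0.05 # assumed cutoff
sig.terms <- term.names[!is.na(p.values) & p.values < alpha]

# Assumption: only main effects are passed on, not interaction terms.
sig.terms <- sig.terms[!grepl(":", sig.terms, fixed = TRUE)]

# Steps 2 and 3: run LSD.test() per significant term and show $groups.
if (requireNamespace("agricolae", quietly = TRUE)) {
  for (term in sig.terms) {
    lsd <- agricolae::LSD.test(fit, term)
    print(lsd$groups)
  }
}
```

The letters in the `$groups` table are what a bar graph could later use as labels for each factor level.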
- A button to download a PDF report that includes R code and graphs.
- A button to download an R script that will execute the analyses created by the GUI.
Help text for all the functionality.
Image and link to USAID plus a disclaimer.