Home
The main headings are tabs on the top bar.
This tab has a file loader button and a button for selecting sample data.
- Load any properly formatted CSV file, or load one of four sample datasets from the agricolae package
- Display the data as a segmented table
- Only continuous variables and factors are supported.
- Show the code used to load the data
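The restriction to continuous variables and factors could be checked right after loading. The following is a minimal sketch, not the app's actual implementation; `classify_columns` is a hypothetical helper name, and treating character columns as factors is an assumption.

```r
# Hypothetical helper: label each column as continuous, factor, or unsupported.
classify_columns <- function(df) {
  vapply(df, function(col) {
    if (is.numeric(col)) {
      "continuous"
    } else if (is.factor(col) || is.character(col)) {
      "factor"        # assumption: character columns are treated as factors
    } else {
      "unsupported"   # e.g. dates or logicals
    }
  }, character(1))
}

# Small made-up data frame, for illustration only
example_data <- data.frame(
  yield = c(1.2, 3.4, 2.2),
  trt = c("a", "b", "a"),
  stringsAsFactors = FALSE
)
column_types <- classify_columns(example_data)
print(column_types)
```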
These are the types of data we will support for analyses.
- A pair of continuous variables (single-variate linear regression).
- A continuous dependent variable + one factor variable that has been randomized (CRD for ANOVA). Note: "randomized" does NOT mean that this is designated as a random variable; it refers to the DOE. (example)
- A continuous dependent variable + two factor variables that have been randomized (CRD with factorial treatment structure for ANOVA). (example)
- A continuous dependent variable + one factor variable + one factor block variable (RCBD for ANOVA). (example)
- (MIXED EFFECT MODEL) A continuous dependent variable + one factor variable + one factor block + one RANDOM factor variable (e.g. location or year) (RCBD for mixed-effects ANOVA). (concept, example)
- (MIXED EFFECT MODEL) A continuous dependent variable + one factor variable + one factor block + two RANDOM factor variables (e.g. location and year) (RCBD for mixed-effects ANOVA). (concept, example)
- A continuous dependent variable + two factor variables + one factor block variable (RCBD with factorial treatment structure for ANOVA). (example)
- A continuous dependent variable + two factor variables + one factor plot variable (Split Plot CRD for ANOVA). (example)
- A continuous dependent variable + two factor variables + one factor block variable (Split Plot RCBD for ANOVA). (example)
Custom data:
my_data <- read.csv('path/to/data.csv')
Sample data:
library(agricolae) # load "agricolae" package for the sample data
data("plots") # plots, corn, cotton, etc
my_data <- plots
Each sub-heading here represents a section in the side panel.
The user will select from a drop down the type of experiment design they used:
1. Generic data for correlation analysis
2. Completely Randomized Design (CRD)
3. Randomized Complete Block Design (RCBD)
4. Randomized Complete Block Design (RCBD) Mixed Effect
5. Split Plot CRD
6. Split Plot RCBD
If option 1 is selected in Step 1, this is set to "linear regression". If option 2, 3, 5, or 6 is selected in Step 1, ANOVA is selected automatically. If option 4 is selected, the analysis uses a mixed-effects model (e.g. lme).
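The mapping from design choice to analysis type described above can be sketched as a small lookup. `analysis_for_design` is a hypothetical name, and the numbering assumes the order in which the designs are listed in Step 1.

```r
# Hypothetical helper: map the Step 1 design number to the analysis type.
analysis_for_design <- function(design_number) {
  if (design_number == 1) {
    "linear regression"
  } else if (design_number %in% c(2, 3, 5, 6)) {
    "ANOVA"
  } else if (design_number == 4) {
    "mixed-effects model"
  } else {
    stop("Unknown design number: ", design_number)
  }
}

analysis_types <- vapply(1:6, analysis_for_design, character(1))
print(analysis_types)
```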
This is the same for all data types, i.e. the user selects a single dependent variable. A list of all the variables available in the data set is shown and the user selects one. This variable should always be continuous.
The user can select any one of the remaining columns in the data set from a list. This column should be a continuous variable.
This will create a base formula like:
y ~ x
The user can select 1 or 2 variables from the remaining columns for the treatment(s). These should be factors. They can also select an interaction if there are two variables.
This will create a base formula like:
y ~ x
or
y ~ x + z
or
y ~ x + z + x:z
The user can select 1 or 2 variables from the remaining columns for the treatment(s) and select a variable for the block. All these should be factors. In addition, the user can add an interaction term if two variables are selected.
This will create a base formula like:
y ~ x + blk
or
y ~ x + z + blk
or
y ~ x + z + x:z + blk
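The formulas above could be assembled from the user's selections with `reformulate()` from base R. This is only a sketch: `build_formula` and its argument names are hypothetical, not part of the app.

```r
# Hypothetical helper: build an ANOVA formula from the selected column names.
build_formula <- function(response, treatments, block = NULL, interaction = FALSE) {
  stopifnot(length(treatments) %in% c(1, 2))
  model_terms <- treatments
  # Add the interaction term only when two treatments were selected
  if (interaction && length(treatments) == 2) {
    model_terms <- c(model_terms, paste(treatments, collapse = ":"))
  }
  # The block term is optional (CRD has none, RCBD has one)
  if (!is.null(block)) {
    model_terms <- c(model_terms, block)
  }
  reformulate(model_terms, response = response)
}

crd_formula <- build_formula("y", "x")
rcbd_formula <- build_formula("y", c("x", "z"), block = "blk", interaction = TRUE)
print(crd_formula)
print(rcbd_formula)
```

Keeping the selections as plain column names until the last step makes it straightforward to display the same formula in the generated code.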
The user selects a continuous dependent variable, a fixed factor variable, and a random factor variable.
y ~ x + z + (1|w)
The user selects a continuous dependent variable, a fixed factor variable, one factor block, and two random factor variables.
y ~ x + z + blk + (1|w) + (1|v)
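The `(1|w)` notation is the random-intercept syntax used by the lme4 package. The sketch below assumes lme4 is the package eventually chosen, which the design does not settle, and uses made-up data whose column names simply mirror the formula symbols.

```r
# Made-up illustrative data: 24 rows, 4 levels of the random factor w
set.seed(1)
mixed_data <- data.frame(
  x = factor(rep(c("t1", "t2"), times = 12)),
  z = factor(rep(c("b1", "b2", "b3"), each = 2, times = 4)),
  w = factor(rep(c("loc1", "loc2", "loc3", "loc4"), each = 6))
)
mixed_data$y <- rnorm(nrow(mixed_data), mean = 10)

# Fixed effects x and z, random intercept for each level of w
mixed_formula <- y ~ x + z + (1 | w)

# Assumption: lme4 is used for fitting; the fit runs only if it is installed
if (requireNamespace("lme4", quietly = TRUE)) {
  mixed_fit <- lme4::lmer(mixed_formula, data = mixed_data)
  print(summary(mixed_fit))
}
```

With purely random noise as the response, lme4 may report a singular fit; that is expected for this toy data.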
There are three drop-downs for variable selection: main plot (A), sub plot (B), and replication (R). These produce the formula:
y ~ A + B + A:B + Error(A:R)
There are three drop downs for factor variable selection: main plot (A), sub plot (B), block (blk).
y ~ A + B + A:B + blk + Error(A:blk)
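A formula with an `Error()` term can be passed directly to `aov()` in base R, which then returns one ANOVA table per error stratum. The sketch below fits the split plot CRD form on made-up data; the column names A, B, R, and y mirror the symbols used above.

```r
# Made-up illustrative data: 2 main plot levels x 3 sub plot levels x 4 replications
set.seed(42)
sp_data <- expand.grid(
  A = factor(c("a1", "a2")),
  B = factor(c("b1", "b2", "b3")),
  R = factor(1:4)
)
sp_data$y <- rnorm(nrow(sp_data), mean = 50, sd = 5)

# Main plot factor A is tested against the A:R stratum,
# while B and A:B are tested against the within (residual) stratum
fit_sp <- aov(y ~ A + B + A:B + Error(A:R), data = sp_data)
print(summary(fit_sp))
```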
Each variable that was selected in the previous step will be listed along with a radio button allowing for different types: continuous or factor (grouping).
Note: This section is likely no longer needed because the variable types are restricted by the analysis type.
Both should be continuous so no selections are allowed.
The dependent variable should be continuous and the independent variables should be factors. No selections allowed.
This will allow the user to apply transformations to any of the model's continuous variables. The variables that were selected will have radio buttons or a drop down beside them with options to apply log, sqrt, power. This will modify the formula, e.g. for ANOVA:
y^2 ~ x1 + ... + xn + x1*x1 + ... + xn*xn
or
log(y) ~ x1 + ... + xn + x1*x1 + ... + xn*xn
For linear regression we should allow transforming both the dependent and independent variables. In an R formula, a power on the right-hand side must be wrapped in I():
y ~ I(x^2)
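One detail matters when generating transformed formulas: on the right-hand side of an R formula, `^` means factor crossing rather than arithmetic, so a power transform of a predictor needs `I()`. The short check below illustrates this with the built-in `cars` dataset.

```r
# Without I(), x^2 collapses to the plain term x
plain_terms <- attr(terms(y ~ x^2), "term.labels")
# With I(), the squared term is kept as an arithmetic transform
protected_terms <- attr(terms(y ~ I(x^2)), "term.labels")
print(plain_terms)
print(protected_terms)

# The same effect on a fitted model, using the built-in cars dataset
fit_plain <- lm(dist ~ speed^2, data = cars)      # fits dist ~ speed
fit_squared <- lm(dist ~ I(speed^2), data = cars) # fits dist on speed squared
print(names(coef(fit_plain)))
print(names(coef(fit_squared)))
```

Transformations such as `log(y)` or `sqrt(x)` are ordinary function calls and do not need `I()`.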
A text box that lets the user set the alpha value to something other than 0.05. The default is 0.05. This will be used in the confidence interval calcs and the post hoc analyses.
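The alpha value would feed into the confidence level as `1 - alpha`. A minimal sketch, using the built-in `PlantGrowth` dataset as a stand-in for the user's data:

```r
alpha <- 0.05  # default; the user may override this in the text box

# One-way ANOVA on a built-in dataset, standing in for the user's model
fit_alpha <- aov(weight ~ group, data = PlantGrowth)

# Confidence intervals at the level implied by alpha
ci <- confint(fit_alpha, level = 1 - alpha)
print(ci)
```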
The user presses the "Run Analysis" button, which displays the code needed to run the analysis and the assumption tests. Open question: are the assumption tests initiated here or in Tab #3 (see below)?
The code produced follows this form:
fit <- aov(formula = y ~ A + B + A:B, data = my_data)
anova(fit)
confint(fit, level = 0.95)
Several things are displayed in the main panel:
- The text output of the anova() function.
- The 95% confidence intervals.
- The text output of the shapiro.test() function.
- The text output of the leveneTest() function.
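shapiro.test(residuals(fit)) # Shapiro-Wilk needs numeric input, so test the model residuals rather than the factors A and B
library(car)
leveneTest(y ~ A * B, data = my_data) # TODO: confirm this call; see the worked example below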
***
R example for both the Shapiro-Wilk and Levene's tests:
Inform R that Block and Trt are factors:
data.set$Block <- as.factor(data.set$Block)
data.set$Trt <- as.factor(data.set$Trt)
The ANOVA:
ANOVA.model <- lm(DependentVariable ~ Trt + Block, data = data.set)
anova(ANOVA.model)
TESTING ASSUMPTIONS:
- Generate residual and predicted values
data.set$resids <- residuals(ANOVA.model)
data.set$preds <- predict(ANOVA.model)
data.set$sq_preds <- data.set$preds^2
- Look at a plot of residual vs. predicted values
plot(resids ~ preds, data = data.set,
xlab = "Predicted Values", ylab = "Residuals")
- Perform a Shapiro-Wilk test for normality of residuals
shapiro.test(data.set$resids)
- Perform Levene's test for homogeneity of variances
library(car)
# leveneTest() expects a single grouping factor or completely crossed factors, so test by treatment
leveneTest(DependentVariable ~ Trt, data = data.set)
This will have a button to run post hoc analyses.
TODO : This needs some help. Not sure if I know exactly what to do here.
We could let the user be in charge of choosing the factors to run LSD.test()
on and not have any checking. This would be the easiest to implement.
- Significant variables are identified.
- For each significant variable, LSD.test() is run on that variable.
- The text output of the $groups attribute is shown, i.e. a table that shows whether the factors are significantly different from each other.
- Are any plots needed from this? Maybe just identify the "letters" for each factor level, determined by LSD.test, so they can be called as labels in a bar graph. The plots will be produced under Tab 5.
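The automated variant of these steps can be sketched as follows, using the built-in `PlantGrowth` dataset as a stand-in. Picking significant terms from the `Pr(>F)` column of the ANOVA table is an assumption about how "significant" would be decided, and the `LSD.test()` call runs only if agricolae is installed.

```r
alpha <- 0.05
fit_ph <- aov(weight ~ group, data = PlantGrowth)

# Step 1: identify significant terms from the ANOVA table
anova_table <- anova(fit_ph)
p_values <- anova_table[["Pr(>F)"]]
sig_terms <- rownames(anova_table)[!is.na(p_values) & p_values < alpha]
print(sig_terms)

# Steps 2 and 3: run LSD.test() on each significant term and show its $groups table
if (requireNamespace("agricolae", quietly = TRUE)) {
  for (term in sig_terms) {
    lsd_result <- agricolae::LSD.test(fit_ph, term, alpha = alpha)
    print(lsd_result$groups)  # means with significance letters per factor level
  }
}
```

The letters in the `$groups` table are what the bar chart labels would draw on.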
The plots populate and update when you run the various analyses.
- Standard plots for a fit.
- A bar chart that compares the means and standard errors of the variables. (labeled with the significance letters if possible)
- Effects plots.
- Interaction plots (if any interactions are present). See page 5 of the example for a visual of an interaction plot. (example)
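plot(fit)
# bar chart
# TODO: Add example code.
# effects and interactions
library(effects)
# collect one effect object per model term (main effects and interactions)
term_names <- attr(terms(fit), "term.labels")
ef <- list()
for (i in term_names) {
ef[[i]] <- effect(i, fit)
}
lapply(ef, plot)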
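For the bar chart of means and standard errors, one possible base R sketch is shown below. It uses the built-in `PlantGrowth` dataset as a stand-in and leaves out the significance letters.

```r
# Group means and standard errors of the mean
means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
ses <- tapply(PlantGrowth$weight, PlantGrowth$group,
              function(v) sd(v) / sqrt(length(v)))

# barplot() returns the x positions of the bar midpoints
mids <- barplot(means,
                ylim = c(0, max(means + ses) * 1.1),
                xlab = "Group", ylab = "Mean weight")

# Error bars: mean +/- one standard error, drawn as flat-capped arrows
arrows(mids, means - ses, mids, means + ses,
       angle = 90, code = 3, length = 0.05)
```

Significance letters could later be added above each bar with `text()` at the same midpoints.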
- A button to download a PDF report that includes R code and graphs.
- A button to download an R script that will execute the analyses created by the GUI.
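The downloadable R script could be assembled from the same pieces of code the GUI already displays. The sketch below is only illustrative: `build_script` is a hypothetical helper, and how it would be wired to a download button is left open.

```r
# Hypothetical helper: assemble the analysis script as a character vector of lines.
build_script <- function(formula_text, data_path) {
  c(
    sprintf("my_data <- read.csv('%s')", data_path),
    sprintf("fit <- aov(formula = %s, data = my_data)", formula_text),
    "anova(fit)",
    "confint(fit, level = 0.95)"
  )
}

script_lines <- build_script("y ~ A + B + A:B", "path/to/data.csv")

# Write the script to a temporary file that a download button could serve
script_file <- tempfile(fileext = ".R")
writeLines(script_lines, script_file)
cat(readLines(script_file), sep = "\n")
```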
Help text for all the functionality.
Image and link to USAID plus a disclaimer.