Home
The main headings are tabs on the top bar.
This tab has a file loader button and a button for selecting sample data.
- Load any properly formatted csv file or load one of four datasets from agricolae
- Display the data as a segmented table
- Only continuous variables and factors are supported.
- Show the code used to load the data
These are the types of data we will support for analyses.
- A pair of continuous variables. (simple linear regression)
- A continuous dependent variable + any number of factor variables. (general ANOVA, includes t-test) ==> I think this contains all the possibilities below
- A continuous dependent variable + one factor variable that has been randomized (CRD for ANOVA)
- A continuous dependent variable + two factor variables that have been randomized (CRD with factorial treatment structure for ANOVA)
- A continuous dependent variable + one factor variable + one factor block variable (RCBD for ANOVA)
- A continuous dependent variable + two factor variables + one factor block variable (RCBD with factorial treatment structure for ANOVA)
- A continuous dependent variable + two factor variables + one factor plot variable (Split Plot CRD for ANOVA)
- A continuous dependent variable + two factor variables + one factor block variable (Split Plot RCBD for ANOVA)
Custom data:
my_data <- read.csv('path/to/data.csv')
Sample data:
library(agricolae) # load "agricolae" package for the sample data
data("plots") # plots, corn, cotton, etc
my_data <- plots
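Because only continuous variables and factors are supported, the loader will probably need to classify each column after import. The following is a minimal sketch of that check; `classify_columns` and `example_data` are hypothetical names used for illustration, not part of the app. Note that since R 4.0, `read.csv()` leaves text columns as character unless `stringsAsFactors = TRUE` is passed, so such columns would show up as unsupported here until converted.

```r
# Hypothetical helper (an assumption of this sketch): label each column of a
# data frame as "continuous", "factor", or "unsupported".
classify_columns <- function(df) {
  vapply(df, function(col) {
    if (is.factor(col)) {
      "factor"
    } else if (is.numeric(col)) {
      "continuous"
    } else {
      "unsupported"
    }
  }, character(1))
}

# Small made-up data frame, for illustration only
example_data <- data.frame(
  yield = c(4.1, 5.3, 4.8),
  variety = factor(c("a", "b", "a")),
  note = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

column_kinds <- classify_columns(example_data)
print(column_kinds)
```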
Each sub-heading here represents a section in the side panel.
The user will select from a drop down the type of experiment design they used:
- Generic data for correlation analysis
- Completely Randomized Design (CRD)
- Randomized Complete Block Design (RCBD)
- Split Plot CRD
- Split Plot RCBD
If #1 is selected in Step 1, the user can select either "linear regression" or "ANOVA". If any of #2-5 is selected in Step 1, then ANOVA is automatically selected and the user can't change it.
This is the same for all data types. A list of all the variables available in the data set is shown and the user selects one. This variable should always be continuous.
The user can select any one of the remaining columns in the data set from a list. This column should be a continuous variable.
This will create a base formula like:
y ~ x
The user can select any number of the remaining variables as independent variables. Combinations of variables can be selected for interactions.
This will create a base formula like:
y ~ x1 + ... + xn + x1:x2 + ... + xi:xj
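One way the GUI selections could be turned into a formula is with base R's `reformulate()`. This is a sketch under assumptions: `build_formula` and its arguments are hypothetical, and interactions are passed as a list of variable-name vectors.

```r
# Hypothetical helper: build a model formula from the selected response,
# main effects, and interaction combinations.
build_formula <- function(response, main_effects, interactions = list()) {
  # Each interaction is a character vector such as c("x1", "x2"),
  # which becomes the term "x1:x2".
  interaction_terms <- vapply(interactions, paste, character(1), collapse = ":")
  reformulate(c(main_effects, interaction_terms), response = response)
}

f <- build_formula("y", c("x1", "x2", "x3"), list(c("x1", "x2")))
print(f)
```

With no interactions selected, the same call reduces to a main-effects formula such as `y ~ x1 + x2`.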
The user can select 1 to 2 variables from the remaining columns for the treatment(s). This column should be a factor variable.
This will create a base formula like:
y ~ x or y ~ x + z or y ~ x + z + x*z
The user can select 1 to 2 variables from the remaining columns for the treatment(s) and select a variable for the block (two drop downs). Both of these should be factors.
This will create a base formula like:
y ~ x + blk, or y ~ x + z + blk, or y ~ x + z + x*z + blk
There are three drop downs for variable selection: main plot (A), sub plot (B), and replication (R).
y ~ A + B + A*B + Error(A:R)
There are three drop downs for factor variable selection: main plot (A), sub plot (B), block (blk).
y ~ blk + A + Error(A:blk) + B + A*B
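To see what a split-plot fit returns, here is a small sketch on simulated data (the data are random and carry no meaning). It spells the error term as `Error(blk/A)`, i.e. `blk + blk:A`, which is a common way to express the main-plot stratum; that spelling is an assumption of this sketch and differs from the `Error(A:blk)` form written above.

```r
set.seed(1)

# Simulated layout, for illustration only:
# 4 blocks x 3 main-plot levels (A) x 2 sub-plot levels (B)
d <- expand.grid(
  blk = factor(1:4),
  A = factor(c("a1", "a2", "a3")),
  B = factor(c("b1", "b2"))
)
d$y <- rnorm(nrow(d), mean = 10)

# Main-plot factor A is tested against the blk:A stratum,
# B and A:B against the within (sub-plot) stratum.
fit_sp <- aov(y ~ A * B + Error(blk / A), data = d)
print(summary(fit_sp))
```

A fit with an `Error()` term is an `aovlist` with one component per stratum, so the code that displays results may need to handle it differently from a plain `aov` object.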
Each variable that was selected in the previous step will be listed along with a radio button allowing for different types: continuous or factor (grouping).
Note: This section is likely no longer needed because the variable types are restricted by the analysis type.
Both should be continuous so no selections are allowed.
The dependent variable should be continuous and the independent variables should be factors. No selections allowed.
This will allow the user to apply transformations to any of the model's continuous variables. The variables that were selected will have radio buttons or a drop down beside them with options to apply log, sqrt, power. This will modify the formula, e.g. for ANOVA:
y^2 ~ x1 + ... + xn + x1:x2 + ... + xi:xj
or
log(y) ~ x1 + ... + xn + x1:x2 + ... + xi:xj
For linear regression we should allow transforming the dependent and independent variables:
y ~ I(x^2)
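One detail worth keeping in mind when generating transformed formulas: on the right-hand side of an R formula, `^` denotes term crossing rather than arithmetic, so a power transformation of a predictor has to be wrapped in `I()`. Left-hand-side expressions such as `log(y)` are evaluated as ordinary arithmetic. The sketch below illustrates this with the built-in `cars` data set.

```r
# In a formula, x^2 means "x crossed with itself", which collapses to x.
plain_terms <- attr(terms(y ~ x^2), "term.labels")

# I() protects the arithmetic, so the model gets a squared predictor.
squared_terms <- attr(terms(y ~ I(x^2)), "term.labels")

print(plain_terms)
print(squared_terms)

# Illustration on the built-in cars data set
fit_plain <- lm(dist ~ speed, data = cars)
fit_caret <- lm(dist ~ speed^2, data = cars)      # same model as fit_plain
fit_squared <- lm(dist ~ I(speed^2), data = cars) # quadratic predictor
```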
A text box lets the user set the alpha value to something other than the default of 0.05. The value is used in the confidence interval calculations and the post hoc analyses.
The user presses the "Run Analysis" button which then displays the code needed to run the analyses and the assumptions tests <== or are the assumption tests initiated in Tab #3 (see below)?
The code produced follows this form:
fit <- aov(formula = y ~ A + B + A * B, data = my_data)
anova(fit)
confint(fit, level=0.95)
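As a sketch of how a user-supplied alpha could feed into the generated code, the confidence level can be derived as `1 - alpha`. The example uses the built-in `warpbreaks` data set as a stand-in for `my_data`; the `alpha` value is a hypothetical user input.

```r
alpha <- 0.10  # hypothetical value typed into the alpha text box

# warpbreaks (built in) stands in for the user's data
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

print(anova(fit))

# Confidence level follows from alpha
ci <- confint(fit, level = 1 - alpha)
print(ci)
```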
Several things are displayed in the main panel:
- The text output of the anova() function.
- The 95% confidence intervals.
This tab will have a button to execute functions to test the ANOVA assumptions. It will display:
- The text output of the shapiro.test() functions.
- The text output of the leveneTest() function.
and the code to generate that:
shapiro.test(A)
shapiro.test(B)
library(car)
leveneTest(y ~ A * B, data=my_data) <== I don't think this code is correct
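One possible shape for these checks, offered as a sketch rather than the settled design: `shapiro.test()` needs a numeric vector, so it is usually applied to the model residuals rather than to the factor columns. To keep the example free of extra packages, the homogeneity check below is written out by hand as a one-way ANOVA on absolute deviations from the cell medians, which is the median-centered (Brown-Forsythe) form of Levene's test; it is a stand-in for `car::leveneTest()`, not a copy of it. The built-in `warpbreaks` data set stands in for `my_data`.

```r
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# Normality: test the residuals of the fitted model
res <- residuals(fit)
normality <- shapiro.test(res)
print(normality)

# Homogeneity of variance: hand-written median-centered Levene test.
# Each treatment combination (cell) is one group.
cell <- interaction(warpbreaks$wool, warpbreaks$tension)
abs_dev <- abs(warpbreaks$breaks - ave(warpbreaks$breaks, cell, FUN = median))
levene_table <- anova(lm(abs_dev ~ cell))
print(levene_table)
```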
***
R example for both Shapiro and Levene's test:
Inform R that Block and Trt are factors
data.set$Block <- as.factor(data.set$Block)
data.set$Trt <- as.factor(data.set$Trt)
The ANOVA
ANOVA.model<-lm(DependentVariable ~ Trt + Block, data.set)
anova(ANOVA.model)
TESTING ASSUMPTIONS
#Generate residual and predicted values
data.set$resids <- residuals(ANOVA.model)
data.set$preds <- predict(ANOVA.model)
data.set$sq_preds <- data.set$preds^2
Look at a plot of residual vs. predicted values
plot(resids ~ preds, data = data.set,
xlab = "Predicted Values", ylab = "Residuals")
Perform a Shapiro-Wilk test for normality of residuals
shapiro.test(data.set$resids)
Perform Levene's Test for homogeneity of variances
library(car)
leveneTest(DependentVariable ~ Trt + Block, data.set, show.table = TRUE)
This will have a button to run post hoc analyses.
TODO : This needs some help. Not sure if I know exactly what to do here.
- Significant variables are identified.
- For each significant variable, LSD.test() is run on that variable.
- The text output of the $groups element is shown, i.e. a table that shows whether the factor levels are significantly different from each other.
- Are any plots needed from this? Maybe just identify the "letters" for each factor level, determined by LSD.test, so they can be called as labels in a bar graph. The plots will be produced under Tab 5.
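The steps above could be sketched roughly as follows. Picking significant terms from the ANOVA table uses base R only; the `LSD.test()` call is guarded so the sketch still runs when agricolae is not installed. Restricting the post hoc step to main effects is an assumption of this sketch, and `warpbreaks` again stands in for the user's data.

```r
alpha <- 0.05  # would come from the alpha text box

fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# Identify terms whose p-value is below alpha (the Residuals row has NA)
tab <- anova(fit)
p_values <- tab[["Pr(>F)"]]
sig_terms <- rownames(tab)[!is.na(p_values) & p_values < alpha]
print(sig_terms)

# Run LSD.test() on each significant main effect, if agricolae is available
if (requireNamespace("agricolae", quietly = TRUE)) {
  main_effects <- sig_terms[!grepl(":", sig_terms, fixed = TRUE)]
  for (term in main_effects) {
    lsd <- agricolae::LSD.test(fit, term, alpha = alpha)
    print(lsd$groups)  # means with significance letters
  }
}
```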
The plots populate and update when you run the various analyses.
- Standard plots for a fit.
- A bar chart that compares the means and standard errors of the variables. (labeled with the significance letters if possible)
- Effects plots.
- Interaction plots (if any interactions are present). See page 5 of the example for a visual of an interaction plot.
plot(fit)
# bar chart
# TODO: Add example code.
# effects and interactions
library(effects)
ef <- list()
for (i in attr(terms(fit), "term.labels")) {
  ef[[i]] <- effect(i, fit)
}
lapply(ef, plot)
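For the bar chart, a base-graphics sketch along these lines may be enough as a starting point. It plots group means with standard-error bars for one factor of the built-in `warpbreaks` data set; the variable names are illustrative, and the significance letters are left out because they would come from the post hoc step.

```r
# Means and standard errors of the response for each level of one factor
group_means <- tapply(warpbreaks$breaks, warpbreaks$tension, mean)
group_ses <- tapply(
  warpbreaks$breaks, warpbreaks$tension,
  function(v) sd(v) / sqrt(length(v))
)

# barplot() returns the x positions of the bar midpoints
bar_mids <- barplot(
  group_means,
  ylim = c(0, max(group_means + group_ses) * 1.15),
  xlab = "tension", ylab = "mean breaks"
)

# Error bars: flat-headed arrows drawn from mean - SE to mean + SE
arrows(bar_mids, group_means - group_ses,
       bar_mids, group_means + group_ses,
       angle = 90, code = 3, length = 0.05)
```

Letters from `LSD.test()` could later be placed above the bars with `text()` at the `bar_mids` positions.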
- A button to download a pdf report that includes R code and graphs.
- A button to download an R script that will execute the analyses created by the GUI.
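A rough sketch of how the script download could assemble its content from the GUI state. `make_script` and its arguments are hypothetical; in the app the resulting lines would presumably be written from inside a Shiny `downloadHandler()`, which is an assumption here. The sketch only writes to a temporary file.

```r
# Hypothetical helper: turn the chosen formula, data name, and alpha
# into the lines of an R script.
make_script <- function(formula, data_name = "my_data", alpha = 0.05) {
  formula_text <- paste(deparse(formula), collapse = " ")
  c(
    sprintf("fit <- aov(formula = %s, data = %s)", formula_text, data_name),
    "anova(fit)",
    sprintf("confint(fit, level = %s)", format(1 - alpha))
  )
}

script_lines <- make_script(y ~ A * B)
script_path <- file.path(tempdir(), "analysis.R")
writeLines(script_lines, script_path)
print(script_lines)
```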
Help text for all the functionality.
Image and link to USAID plus a disclaimer.