-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathintro_to_polca.Rmd
253 lines (171 loc) · 8.04 KB
/
intro_to_polca.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
---
title: "Polytomous Latent Class Analysis"
author: "Dina Arch"
date: "Updated: `r format(Sys.time(), '%B %d, %Y')`"
output:
html_document:
toc: yes
toc_float: yes
theme: flatly
editor_options:
markdown:
wrap: sentence
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
warning = FALSE,
message = FALSE) #Here, I have made it so that when you knit your .rmd, warnings and messages will not show up in the html markdown.
```
------------------------------------------------------------------------
# IMMERSE Project
{style="float: left;" width="300"}
The Institute of Mixture Modeling for Equity-Oriented Researchers, Scholars, and Educators (IMMERSE) is an IES funded training grant (R305B220021) to support education scholars in integrating mixture modeling into their research.
- Please [visit our website](https://immerse.education.ucsb.edu/) to learn more.
- Visit our [GitHub](https://github.com/immerse-ucsb) account to access all the IMMERSE materials.
- Follow us on [Twitter](https://twitter.com/IMMERSE_UCSB)!
How to reference this workshop: Institute of Mixture Modeling for Equity-Oriented Researchers, Scholars, and Educators (2025).
IMMERSE Online Resources (IES No. 305B220021).
Institute of Education Sciences.
<https://immerse-ucsb.github.io/>
------------------------------------------------------------------------
Load packages
```{r, cache = FALSE}
library(poLCA)
library(tidyverse)
library(janitor)
library(gt)
library(MplusAutomation)
library(here)
library(glue)
here::i_am("intro_to_polca.Rmd")
```
------------------------------------------------------------------------
## Polytomous LCA
Polytomous LCA deals with variables that have more than two categories, such as survey questions with responses like `never`, `sometimes`, and `always`. The workflow of a polytomous LCA model is similar to that of an LCA model with binary indicators. However, polytomous LCA captures more complex response patterns, which can make interpretation a bit trickier. The following code demonstrates an example, along with a visualization of the model.
------------------------------------------------------------------------
## Example: Elections
"Two sets of six questions with four responses each, asking respondents’ opinions of how well various traits describe presidential candidates Al Gore and George W. Bush. Also potential covariates vote choice, age, education, gender, and party ID. Source: The National Election Studies (2000)." (poLCA, 2016) [See documentation here](https://cran.r-project.org/web/packages/poLCA/poLCA.pdf)
Two sets of six questions with four responses each, asking respondents’ opinions of how well various traits describe presidential candidates Al Gore and George W. Bush. In the election data set, respondents to the 2000 American National Election Study public opinion poll were asked to evaluate how well a series of traits—moral, caring, knowledgeable, good leader, dishonest, and intelligent—described presidential candidates Al Gore and George W. Bush. Each question had four possible choices: (1) extremely well; (2) quite well; (3) not too well; and (4) not well at all.
------------------------------------------------------------------------
### Prepare Data
```{r, eval=TRUE}
data(election)
# Detaching packages that mask the dpylr functions
detach(package:poLCA, unload = TRUE)
detach(package:MASS, unload = TRUE)
df_election <- election %>%
clean_names() %>%
select(moralb:dishonb) %>%
mutate(across(everything(),
~ as.factor(as.numeric(gsub("\\D", "", .))),
.names = "{.col}1"))
# Quick summary
summary(df_election)
```
------------------------------------------------------------------------
### Descriptive Statistics
```{r}
ds <- df_election %>%
pivot_longer(moralb1:dishonb1, names_to = "variable") %>%
count(variable, value) %>% # Count occurrences of each value for each variable
group_by(variable) %>%
mutate(prop = n / sum(n)) %>%
arrange(desc(variable))
# Create the table
prop_table <- ds %>%
gt() %>%
tab_header(title = md("**Descriptive Summary**")) %>%
cols_label(
variable = "Variable",
n = md("*N*"),
prop = md("Proportion")
) %>%
fmt_number(c("n", "prop"), decimals = 2) %>% # Format both n and prop columns
cols_align(
align = "center",
columns = c(prop, n)
)
# View the table
prop_table
# Save as a Word doc
#gtsave(prop_table, here("figures", "prop_table.docx"))
```
------------------------------------------------------------------------
### Enumeration
This code uses the `mplusObject` function in the `MplusAutomation` package.
```{r, eval=FALSE, cache = TRUE}
lca_enumeration <- lapply(1:6, function(k) {
lca_enum <- mplusObject(
TITLE = glue("{k}-Class"),
VARIABLE = glue(
"categorical = moralb1-dishonb1;
usevar = moralb1-dishonb1;
classes = c({k}); "),
ANALYSIS =
"estimator = mlr;
type = mixture;
starts = 500 100;
processors = 10;",
OUTPUT = "sampstat residual tech11 tech14 svalues;",
usevariables = colnames(df_election),
rdata = df_election)
lca_enum_fit <- mplusModeler(lca_enum,
dataout=glue(here("poLCA", "election.dat")),
modelout=glue(here("poLCA", "c{k}_election.inp")) ,
check=TRUE, run = TRUE, hashfilename = FALSE)
})
```
------------------------------------------------------------------------
### Table of Fit
```{r}
source("enum_table.txt")
output_election <- readModels(here("poLCA"), filefilter = "election", quiet = TRUE)
# To see rows:
#seeRows(output_election)
# Arguments for `enum_table`
# 1. readModels objects
# 2-5. Rows of successfully estimated models
enum_table(output_election, 1:6)
```
------------------------------------------------------------------------
Save table:
```{r, eval = FALSE}
#gtsave(fit_table1, here("figures", "fit_table1.png"))
```
------------------------------------------------------------------------
### Information Criteria Plot
```{r height=5, width=7}
ic_plot(output_election)
```
------------------------------------------------------------------------
Save figure:
```{r}
#ggsave(here("figures", "info_criteria.png"), dpi=300, height=5, width=7, units="in")
```
------------------------------------------------------------------------
### 4-Class Probability Plot
The functions `poLCA_stacked` and `poLCA_grouped` create visualizations of class probabilities for LCA with polytomous indicators. Each function takes the following arguments:
- **`model_name`**: The LCA model read into R using the `readModels` function from the `MplusAutomation` package.
- **`category_labels`**: A character vector of category labels for the response options (e.g., survey answers).
Note: Double check that the labels are in the correct order!
```{r, fig.width= 12, fig.height= 8}
source("poLCA_plot")
poLCA_stacked(output_election$c4_election.out, category_labels = c("1" = "1: Extremely well",
"2" = "2: Quite Well",
"3" = "3: Not Too Well",
"4" = "4: Not Well at All"))
```
Alternative plot
```{r, fig.width= 12, fig.height= 8}
poLCA_grouped(output_election$c4_election.out, category_labels = c("1" = "1: Extremely well",
"2" = "2: Quite Well",
"3" = "3: Not Too Well",
"4" = "4: Not Well at All"))
```
------------------------------------------------------------------------
Save figure:
```{r, eval = FALSE}
#ggsave(here("figures", "C2_Cheat_LCA_Plot.png"), dpi="retina", height=5, width=7, units="in")
```
------------------------------------------------------------------------
{width="75%"}