Use of text files to create and manage DAGs #118
Comments
So you basically want to be able to create them from files rather than dagitty or It's not pushed to the PR yet, but #117 is adding functionality for updating the data and dagitty components of the ggdag object. In theory, that could be extended to create a DAG from a data frame, e.g., you would read and join those files yourself, then supply the data to ggdag to set it up. Would that meet your need? I should also note that you can always just join extra information like
I think that would do. Essentially, I'd like to work with data frames rather than other objects. Starting from a To me it's more intuitive, and it reflects more closely what we usually do in applied epi research.
Proof of concept from #117:
library(ggdag, warn.conflicts = FALSE)
graph <- data.frame(
name = c("c", "c", "x"),
to = c("x", "y", "y")
)
metadata <- data.frame(
name = c("x", "y", "c"),
status = c("exposure", "outcome", "latent"),
adjusted = c("unadjusted", "unadjusted", "adjusted"),
variable_descriptions = c("a", "b", "d")
)
dag_data <- dplyr::full_join(
graph,
metadata,
by = "name"
)
tidy_dag_data <- as_tidy_dagitty(dag_data)
tidy_dag_data
#> # A DAG with 3 nodes and 4 edges
#> #
#> # Exposure: x
#> # Outcome: y
#> # Latent Variable: c
#> #
#> # A tibble: 4 × 11
#> name x y direction to xend yend circular status adjusted
#> <chr> <dbl> <dbl> <fct> <chr> <dbl> <dbl> <lgl> <chr> <chr>
#> 1 c -1.23 -0.0453 -> x -0.367 0.468 FALSE latent adjusted
#> 2 c -1.23 -0.0453 -> y -0.352 -0.532 FALSE latent adjusted
#> 3 x -0.367 0.468 -> y -0.352 -0.532 FALSE exposure unadjust…
#> 4 y -0.352 -0.532 -> <NA> NA NA FALSE outcome unadjust…
#> # ℹ 1 more variable: variable_descriptions <chr>
ggdag(tidy_dag_data)
Created on 2023-08-09 with reprex v2.0.2
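To make the shape of the joined table in the reprex easier to see, here is a rough base R sketch of the join step on its own. It assumes that `merge(..., all = TRUE)` is an acceptable stand-in for `dplyr::full_join()` for this purpose (row order may differ), and it keeps only the `status` column of the metadata for brevity.

```r
# Same edge list as in the proof of concept: one row per edge
graph <- data.frame(
  name = c("c", "c", "x"),
  to = c("x", "y", "y"),
  stringsAsFactors = FALSE
)

# Trimmed-down node metadata: one row per node
metadata <- data.frame(
  name = c("x", "y", "c"),
  status = c("exposure", "outcome", "latent"),
  stringsAsFactors = FALSE
)

# Full outer join on the node name (base R stand-in for dplyr::full_join)
joined <- merge(graph, metadata, by = "name", all = TRUE)

# Nodes without an outgoing edge still get a row, with `to` set to NA
sink_nodes <- joined$name[is.na(joined$to)]

print(joined)
print(sink_nodes)
```

The result has one row per edge plus one row for each node with no outgoing edge, which matches the four rows in the printed tibble, where `y` appears with `to` as `<NA>`.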
It looks really good and handy!
I find the current pipeline for building and managing DAGs not that streamlined (especially with `dagitty`). When I start working on new papers, I usually have to make a data request, which means choosing from the available variables and creating a `csv` file. The idea would be the following:
- One file for the edges, with the columns `from` and `to` (e.g., `node A` and `node B` of the graph).
- One file for the nodes, with the columns `node`, `variable_name` (the name of the node might not correspond to the variable in the dataset), `description` (a short text description of each variable), `labels` (if a factor, levels and labels), `observed` (whether it is observed or not), `status` (e.g., exposure).

These files would then be used to create a `ggdag` object. Since the files contain information on, e.g., levels and labels, it would then be possible to generate a publication-ready table of confounders (a sort of Table 1).

Would this be useful?