-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Fix file argument validation and add summary tree and agenda consistency checks #12
Conversation
NAs were being accepted as file path for chat and transcripts to be merged. Plus, some reindentation
Also, move validation function in their own file
WalkthroughThese updates focus on enhancing the reliability and consistency of data management and validation within a software project. Key improvements include the proper conversion of agenda times to clock time, ensuring consistency between summary trees and agendas, and refining the process of merging speech transcripts. Additionally, validation routines have been introduced to verify the integrity of agenda elements and the alignment between agendas and summary trees. Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 7
Configuration used: CodeRabbit UI
Files ignored due to path filters (3)
man/check_agenda_summary_tree_consistency.Rd
is excluded by:!man/*.Rd
man/check_summary_tree_consistency.Rd
is excluded by:!man/*.Rd
man/validate_agenda_element.Rd
is excluded by:!man/*.Rd
Files selected for processing (3)
- DESCRIPTION (1 hunks)
- R/data_management.R (6 hunks)
- R/validation.R (1 hunks)
Files skipped from review due to trivial changes (1)
- DESCRIPTION
#' Validate summary tree id consistency | ||
#' | ||
#' @param summary_tree A list containing the summary tree or a file path to it. | ||
#' | ||
#' @return Nothing, will throw an error if the summary tree is not consistent. | ||
check_summary_tree_consistency <- function(summary_tree) { | ||
|
||
if (is.character(summary_tree)) { | ||
summary_tree <- dget(summary_tree) | ||
} | ||
|
||
if (length(summary_tree) == 0) { | ||
stop("The summary tree is empty.") | ||
} | ||
|
||
obs_ids <- names(summary_tree) | ||
|
||
titles <- purrr::map(summary_tree, \(x) x[["title"]]) | ||
sessions <- purrr::map(summary_tree, \(x) { | ||
if (!purrr::is_empty(x[["session"]])) { | ||
x[["session"]] | ||
} else { | ||
x[["title"]] | ||
} | ||
}) | ||
|
||
exp_ids <- paste(sessions, titles, sep = "_") | ||
|
||
test <- all.equal(obs_ids, exp_ids) | ||
|
||
if (isTRUE(test)) { | ||
return() | ||
} | ||
|
||
stop("The summary tree is not consistent: ", test, ".\n", | ||
purrr::map(seq_along(obs_ids), \(i) { | ||
if (obs_ids[i] != exp_ids[i]) { | ||
sprintf('"%s" != "%s"', obs_ids[i], exp_ids[i]) | ||
} | ||
}) |> purrr::compact() |> unlist() |> paste(collapse = "\n") | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function check_summary_tree_consistency
effectively checks the consistency of IDs in the summary tree. It uses modern R syntax (\
for anonymous functions) and purrr
functions, which are appropriate. However, the error message concatenation could be simplified by using paste
directly instead of purrr::map
followed by purrr::compact
and unlist
.
- }) |> purrr::compact() |> unlist() |> paste(collapse = "\n")
+ }, .keep_all = TRUE) %>% unlist() %>% paste(collapse = "\n")
Simplify the error message generation for efficiency and readability.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
#' Validate summary tree id consistency | |
#' | |
#' @param summary_tree A list containing the summary tree or a file path to it. | |
#' | |
#' @return Nothing, will throw an error if the summary tree is not consistent. | |
check_summary_tree_consistency <- function(summary_tree) { | |
if (is.character(summary_tree)) { | |
summary_tree <- dget(summary_tree) | |
} | |
if (length(summary_tree) == 0) { | |
stop("The summary tree is empty.") | |
} | |
obs_ids <- names(summary_tree) | |
titles <- purrr::map(summary_tree, \(x) x[["title"]]) | |
sessions <- purrr::map(summary_tree, \(x) { | |
if (!purrr::is_empty(x[["session"]])) { | |
x[["session"]] | |
} else { | |
x[["title"]] | |
} | |
}) | |
exp_ids <- paste(sessions, titles, sep = "_") | |
test <- all.equal(obs_ids, exp_ids) | |
if (isTRUE(test)) { | |
return() | |
} | |
stop("The summary tree is not consistent: ", test, ".\n", | |
purrr::map(seq_along(obs_ids), \(i) { | |
if (obs_ids[i] != exp_ids[i]) { | |
sprintf('"%s" != "%s"', obs_ids[i], exp_ids[i]) | |
} | |
}) |> purrr::compact() |> unlist() |> paste(collapse = "\n") | |
) | |
} | |
stop("The summary tree is not consistent: ", test, ".\n", | |
purrr::map(seq_along(obs_ids), \(i) { | |
if (obs_ids[i] != exp_ids[i]) { | |
sprintf('"%s" != "%s"', obs_ids[i], exp_ids[i]) | |
} | |
}, .keep_all = TRUE) %>% unlist() %>% paste(collapse = "\n") | |
) |
# Convert to clock time | ||
cur_time <- agenda[[i]][[time]] | ||
|
||
agenda[[i]][[time]] <- ( | ||
event_start_time + | ||
lubridate::seconds_to_period(cur_time)) | ||
agenda[[i]][[time]] <- ( | ||
event_start_time + | ||
lubridate::seconds_to_period(cur_time)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the convert_agenda_times
function, the conversion from seconds to clock time uses lubridate::seconds_to_period
which is correct. However, there's a missing conversion step that directly formats the POSIXct
object into the desired conversion_format
. This could lead to incorrect time formats being returned.
- lubridate::seconds_to_period(cur_time))
+ lubridate::seconds(cur_time)) %>% format(conversion_format)
Ensure the conversion from seconds to clock time correctly formats the time according to conversion_format
.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
# Convert to clock time | |
cur_time <- agenda[[i]][[time]] | |
agenda[[i]][[time]] <- ( | |
event_start_time + | |
lubridate::seconds_to_period(cur_time)) | |
agenda[[i]][[time]] <- ( | |
event_start_time + | |
lubridate::seconds_to_period(cur_time)) | |
# Convert to clock time | |
cur_time <- agenda[[i]][[time]] | |
agenda[[i]][[time]] <- ( | |
event_start_time + | |
lubridate::seconds(cur_time)) %>% format(conversion_format) |
if (!purrr::is_empty(chat_file) && !is.na(chat_file)) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the speech_to_summary_workflow
function, adding chat transcripts assumes the chat_file
is not empty or NA without verifying the file's existence. This could lead to errors if the file is specified but does not exist.
+ if (!file.exists(chat_file)) {
+ stop("Chat file not found: ", chat_file)
+ }
Add a file existence check before attempting to add chat transcripts to ensure the file is accessible.
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
if (!purrr::is_empty(chat_file) && !is.na(chat_file)) { | |
if (!purrr::is_empty(chat_file) && !is.na(chat_file)) { | |
if (!file.exists(chat_file)) { | |
stop("Chat file not found: ", chat_file) | |
} |
This pull request includes the following changes:
Fixed detection of valid file arguments for chat and transcripts merging
Added consistency checks for agenda and summary tree ids
Additional changes:
Reindentation of code
Moved validation functions to their own file