Add R API & usage example #654

Open
yannickwurm opened this issue Jun 12, 2023 · 0 comments

Subtasks?

  1. Make a mini wrapper / R library for running the API commands documented at https://sequenceserver.com/doc/api/?
  2. Add some examples of how to take the extended tab-delimited results and filter them.
  3. Add a mechanism for it to work through the SequenceServer Cloud authentication mechanism (magiclinks…)
  4. (No need to make it possible to programmatically add new databases at this stage.)
  5. Create documentation / blog post

We already have a way of querying SequenceServer from the shell (https://sequenceserver.com/doc/api/).

And an R example online: https://sequenceserver.com/blog/disentangling-homology-orthology-parology-and-similarity-with-BLAST

Here is some R code for loading the tab-delimited results and pulling out the top hit for each query:

# Loading data
blast_results_file_tsv <- "sequenceserver-full_tsv_report.tsv" # a local file here, although an https example would be better
blast_results <- read.delim(blast_results_file_tsv, header = FALSE, comment.char = "#")
blast_results_header <- grep(pattern = "# Fields:", x = readLines(con = blast_results_file_tsv, n = 10), value = TRUE)[1]
blast_results_header <- sub(pattern = "# Fields: ", replacement = "", x = blast_results_header)
blast_results_colnames <- unlist(strsplit(x = blast_results_header, split = ", "))
blast_results_colnames <- gsub(pattern = " ", replacement = "_", x = blast_results_colnames)
colnames(blast_results) <- blast_results_colnames

# Example of pulling out top hit for each query
library(dplyr)
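top_hits <- blast_results %>%
  arrange(query_id, desc(bit_score)) %>%
  group_by(query_id) %>%
  slice_head(n = 1) %>%   # keep the best-scoring hit per query
  ungroup() %>%           # drop grouping so later steps act on the whole table
  arrange(desc(bit_score))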
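
As a possible starting point for subtasks 1 and 2, the loading steps above could be wrapped in a small function and followed by a filtering example. This is only a sketch: `read_sequenceserver_tsv()` is a hypothetical name, not an existing SequenceServer function, and the sample report below is made-up data with a reduced set of fields so that the example runs on its own.

```r
library(dplyr)

# Hypothetical helper (not part of SequenceServer): read a tab-delimited
# report and name its columns from the first "# Fields:" comment line.
read_sequenceserver_tsv <- function(path) {
  lines <- readLines(path)
  header <- grep("^# Fields: ", lines, value = TRUE)[1]
  if (is.na(header)) stop("No '# Fields:' line found in ", path)
  fields <- strsplit(sub("^# Fields: ", "", header), ", ", fixed = TRUE)[[1]]
  read.delim(text = lines, header = FALSE, comment.char = "#",
             col.names = gsub(" ", "_", fields), check.names = FALSE,
             stringsAsFactors = FALSE)
}

# Made-up sample report, for illustration only; a real report has more fields.
sample_report <- c(
  "# Fields: query id, subject id, % identity, evalue, bit score",
  "q1\ts1\t99.1\t1e-50\t200",
  "q1\ts2\t80.0\t1e-05\t50",
  "q2\ts3\t95.0\t1e-30\t150",
  "q2\ts4\t70.0\t0.5\t30"
)
tmp <- tempfile(fileext = ".tsv")
writeLines(sample_report, tmp)

hits <- read_sequenceserver_tsv(tmp)

# Keep only strong hits; the thresholds are arbitrary examples.
# "% identity" becomes `%_identity`, so it needs backticks in dplyr.
filtered <- hits %>%
  filter(evalue <= 1e-10, `%_identity` >= 90) %>%
  arrange(query_id, desc(bit_score))

print(filtered)
```

Reading the file once with `readLines()` and passing the lines to `read.delim(text = ...)` avoids opening the report twice, and it does not assume that the `# Fields:` line sits within the first 10 lines.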