ncbi_searcher XML parsing error #64

Closed
zachary-foster opened this issue Feb 10, 2016 · 4 comments
@zachary-foster
Contributor

I ran into an error when using ncbi_searcher in my package today.
It had worked in the past, so something must have changed in the past few months.
The error can be reproduced by running the example code for ncbi_searcher:

 out <- ncbi_searcher(taxa="Umbra limi", seqrange = "1:2000")
Retrieving data for taxon 'Umbra limi'

Working on Umbra limi...
...retrieving sequence IDs...
Error in UseMethod("xpathApply") : 
  no applicable method for 'xpathApply' applied to an object of class "c('xml_document', 'xml_node')"

The line of code (#143) that causes the error is:

esearch_result <- xpathApply(content(query_init, as = "parsed"), "//eSearchResult")[[1]]

It seems that httr::content now returns an object whose classes (xml_document, xml_node) are not among the ones XML::xpathApply has methods for:

class(content(query_init, as = "parsed"))
No encoding supplied: defaulting to UTF-8.
[1] "xml_document" "xml_node"    
library(XML)
methods(xpathApply)
[1] xpathApply.XMLInternalDocument* xpathApply.XMLInternalNode*     xpathApply.XMLNode*            
see '?methods' for accessing help and source code
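
The mismatch above is ordinary S3 dispatch failing to find a method. A minimal base-R sketch of the same situation follows; `find_nodes` is a hypothetical generic standing in for XML::xpathApply, and the two objects are empty placeholders that only carry the class names shown above, not real parsed documents.

```r
# Hypothetical generic standing in for XML::xpathApply (not the real function).
find_nodes <- function(doc, ...) UseMethod("find_nodes")

# A method exists only for the class the XML package produces.
find_nodes.XMLInternalDocument <- function(doc, ...) "dispatched to XML method"

# Placeholder objects that carry nothing but the class attributes.
old_style <- structure(list(), class = "XMLInternalDocument")
new_style <- structure(list(), class = c("xml_document", "xml_node"))

# Dispatch succeeds for the old class.
ok <- find_nodes(old_style)

# Dispatch fails for the xml2 classes, because there is no
# find_nodes.xml_document, find_nodes.xml_node, or find_nodes.default.
msg <- tryCatch(find_nodes(new_style), error = function(e) conditionMessage(e))
print(msg)
```

The captured message has the same "no applicable method" form as the error from ncbi_searcher.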

I noticed that Hadley Wickham made the httr package.
He also made the xml2 package, which seems to do many of the same things as the XML package.
I wonder if he changed the classes that httr::content outputs in order to be compatible with his new xml2 package.
Anyway, it looks like there is a function, xml2::xml_find_all, which does the same thing as XML::xpathApply.
In some simple tests, xml2::xml_find_all worked on the output of httr::content, as far as I can tell.
I guess we should switch to using xml2 instead of XML?
I am working on getting around the error now.
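
A rough sketch of what the xml2 route could look like. The XML string here is a hypothetical, trimmed-down stand-in for an esearch response (the real payload has more fields), and the variable names other than `esearch_result` are made up for illustration; this is not the code that ended up in the package.

```r
library(xml2)

# Hypothetical sample payload; a real response from NCBI contains more elements.
sample_xml <- "<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>111</Id>
    <Id>222</Id>
  </IdList>
</eSearchResult>"

# read_xml() yields the same classes that httr::content(as = "parsed") returns.
doc <- read_xml(sample_xml)
print(class(doc))

# xml_find_all() takes an XPath expression, much like XML::xpathApply.
esearch_result <- xml_find_all(doc, "//eSearchResult")[[1]]

# Pull the sequence IDs and the hit count out of the result node.
ids <- xml_text(xml_find_all(esearch_result, ".//Id"))
count <- as.integer(xml_text(xml_find_first(esearch_result, ".//Count")))
```

Note that xml2 nodes are read with accessors such as xml_text rather than XML::xmlValue, so any downstream code that handles `esearch_result` would need the same kind of change.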

@sckott sckott added the bug label Feb 10, 2016
@sckott
Contributor

sckott commented Feb 10, 2016

Good catch. Yep, httr now uses xml2 by default to parse XML, so if you let content() parse the data, it uses xml2. I've been going through all packages I work on (hadn't gotten to this one yet), updating them to use content(x, as = "text", encoding = "UTF-8") and then parse manually. The encoding has to be set explicitly, otherwise you get a warning.
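
A minimal sketch of the text-then-parse approach, assuming the XML package is kept for parsing. The payload string is a hypothetical stand-in so the snippet runs without a network call; in the package the string would come from the response object, as noted in the comment.

```r
library(XML)

# In the package this string would come from the httr response, e.g.
#   txt <- httr::content(query_init, as = "text", encoding = "UTF-8")
# Here a hypothetical, trimmed-down esearch payload stands in for it.
txt <- "<eSearchResult><Count>1</Count><IdList><Id>111</Id></IdList></eSearchResult>"

# Parse the string with XML so the result has a class xpathApply has methods for.
doc <- xmlParse(txt, asText = TRUE)
print(class(doc))

# The original call pattern then works on the manually parsed document.
esearch_result <- xpathApply(doc, "//eSearchResult")[[1]]
ids <- xpathSApply(esearch_result, ".//Id", xmlValue)
```

This leaves the existing xpathApply calls as they are and changes only how the document is obtained.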

see #65

@sckott
Contributor

sckott commented Feb 10, 2016

you sending a PR?

@zachary-foster
Contributor Author

Yes, I will send a PR if I can get it fixed. I think I can fix it shortly. Thanks for the quick response!

@sckott
Contributor

sckott commented Feb 10, 2016

thx

@sckott sckott closed this as completed in e8aaeda Feb 11, 2016
sckott added a commit that referenced this issue Feb 11, 2016
resolves #64; fixes ncbi_searcher XML parsing error