ncbi_searcher XML parsing error #64

zachary-foster · 2016-02-10T19:52:14Z

I ran into an error when using ncbi_searcher in my package today.
It had worked in the past, so something must have changed in the past few months.
The error can be produced by running the example code for ncbi_searcher:

out <- ncbi_searcher(taxa = "Umbra limi", seqrange = "1:2000")

Retrieving data for taxon 'Umbra limi'

Working on Umbra limi...
...retrieving sequence IDs...
Error in UseMethod("xpathApply") : 
  no applicable method for 'xpathApply' applied to an object of class "c('xml_document', 'xml_node')"

The line of code (#143) that causes the error is:

esearch_result <- xpathApply(content(query_init, as = "parsed"), "//eSearchResult")[[1]]

It seems that httr::content is returning an object whose classes ("xml_document", "xml_node") are not among the ones XML::xpathApply has methods for:

class(content(query_init, as = "parsed"))

No encoding supplied: defaulting to UTF-8.
[1] "xml_document" "xml_node"

library(XML)
methods(xpathApply)

[1] xpathApply.XMLInternalDocument* xpathApply.XMLInternalNode*     xpathApply.XMLNode*            
see '?methods' for accessing help and source code
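For anyone wondering why the class names matter: xpathApply is an S3 generic, so it only works if one of the object's classes has a registered method. Here is a minimal sketch of that dispatch failure in base R. The generic `f` and the empty object are hypothetical stand-ins, not code from the XML package or httr.

```r
# Hypothetical generic standing in for XML::xpathApply
f <- function(x, ...) UseMethod("f")

# A method exists only for the class the XML package produces
f.XMLInternalDocument <- function(x, ...) "handled"

# Object carrying the classes that httr::content() now returns
obj <- structure(list(), class = c("xml_document", "xml_node"))

# Neither class has a method and there is no default, so dispatch fails
res <- tryCatch(f(obj), error = function(e) conditionMessage(e))
print(res)
```

The message captured in `res` should have the same shape as the error above, just with `f` in place of xpathApply.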

I noticed that Hadley Wickham made the httr package.
He also made the xml2 package, which seems to do many of the same things as the XML package.
I wonder if he changed the classes that httr::content outputs in order to be compatible with his new xml2 package.
Anyway, it looks like there is a function xml2::xml_find_all which does the same thing as XML::xpathApply.
Using some simple tests, I was able to get xml2::xml_find_all to work with httr::content as far as I can tell.
I guess we should switch to using xml2 instead of XML?
I am working on getting around the error now.
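Roughly the kind of simple test I mean, as a sketch only. The XML string below is a made-up stand-in for an eSearch response (the IDs and count are invented), so no network call is needed; in the package the document would come from content(query_init, as = "parsed").

```r
library(xml2)

# Made-up stand-in for the parsed eSearch response body
body_txt <- "<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>111</Id>
    <Id>222</Id>
  </IdList>
</eSearchResult>"

doc <- read_xml(body_txt)  # class: xml_document, xml_node

# xml2 equivalent of xpathApply(doc, "//eSearchResult")[[1]]
esearch_result <- xml_find_all(doc, "//eSearchResult")[[1]]

# Pull the text out of the child nodes
ids <- xml_text(xml_find_all(esearch_result, ".//Id"))
count <- as.integer(xml_text(xml_find_all(esearch_result, ".//Count")))
```

Note that xml_find_all returns an xml_nodeset rather than a list of XML package nodes, so any downstream code that calls XML functions such as xmlValue would need to change too.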


sckott · 2016-02-10T20:13:38Z

Good catch. Yep, httr now uses xml2 by default to parse XML, so if you let content() parse the data, you get xml2 objects back. I've been going through all packages I work on (hadn't gotten to this one yet) updating them to use content(x, as = "text", encoding = "UTF-8") and then parse manually. The encoding has to be set explicitly, otherwise you get a warning.

see #65
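A sketch of the text-then-parse approach, assuming the XML package stays as the parser. The string below is a made-up stand-in for the response text so the example runs offline; it is not necessarily what the actual fix looks like.

```r
library(XML)

# In the package this would be something like:
#   body_txt <- httr::content(query_init, as = "text", encoding = "UTF-8")
# Here a made-up string stands in for the response text.
body_txt <- "<eSearchResult><Count>1</Count><IdList><Id>111</Id></IdList></eSearchResult>"

# Parse manually so the result is an XMLInternalDocument,
# which xpathApply has a method for
doc <- xmlParse(body_txt, asText = TRUE)

esearch_result <- xpathApply(doc, "//eSearchResult")[[1]]
ids <- xpathSApply(esearch_result, ".//Id", xmlValue)
```

Going this way keeps the existing xpathApply calls unchanged; only the content() call and the parsing step are touched.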

sckott · 2016-02-10T20:14:09Z

you sending a PR?

zachary-foster · 2016-02-10T22:21:04Z

Yes, I will send a PR if I can get it fixed. I think I can fix it shortly. Thanks for the quick response!

sckott · 2016-02-10T22:26:19Z

thx


sckott added the bug label Feb 10, 2016

zachary-foster mentioned this issue Feb 10, 2016

resolves #64; fixes ncbi_searcher XML parsing error #66

Merged

sckott closed this as completed in e8aaeda Feb 11, 2016

sckott added a commit that referenced this issue Feb 11, 2016

Merge pull request #66 from zachary-foster/master

3f13442

resolves #64; fixes ncbi_searcher XML parsing error
