Having trouble scraping historical data using rvest on CSS html pages #318
Comments
I believe this works with v1.0.1.
I still see a failure:
library(rvest)
test_url <- "https://www.moneycontrol.com/stocks/hist_stock_result.php?ex=B&sc_id=ITC&mycomp=ITC" %>% read_html()
test.table <- test_url %>%
  html_nodes("table") %>%
  html_table()
#> Error in matrix(unlist(values), ncol = width, byrow = TRUE): 'data' must be of a vector type, was 'NULL'
Created on 2021-08-03 by the reprex package (v2.0.0)
Sorry, my bad. This is solved with v1.0.1.9000 (current master on GitHub, but not yet on CRAN):
remotes::install_github("tidyverse/rvest", force = TRUE)
library(rvest)
"https://www.moneycontrol.com/stocks/hist_stock_result.php?ex=B&sc_id=ITC&mycomp=ITC" %>%
  read_html() %>%
  html_elements("table") %>%
  html_table()
yields
[[1]]
# A tibble: 1 x 1
  X1
  <lgl>
1 NA

[[2]]
# A tibble: 1 x 3
  X1            X2           X3
  <chr>         <chr>        <chr>
1 Period High : Period Low : Change in market-cap : 0%

[[3]]
# A tibble: 0 x 0

[[4]]
# A tibble: 1 x 2
  X1      X2
  <chr>   <chr>
1 AT (Rs) GAIN (Rs)

[[5]]
# A tibble: 1 x 2
  X1         X2
  <chr>      <chr>
1 RECO PRICE PEAK PRICE
Ah, our replies crossed 😀
@epiben nice, I can close this then 😄
I still see the error. FYI:
k <- read_html("https://www.geos.ed.ac.uk/sccs/project-info/1182")
k %>% html_table(".table", header = FALSE)
Error in matrix(unlist(values), ncol = width, byrow = TRUE) :
packageVersion("rvest")
[1] ‘1.0.3’
I'll look into it ASAP.
Turns out that the offending table (the eleventh) was empty in the sense of having no cells, but it did have a row. I proposed a fix to this in #360.
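For anyone stuck on a released version in the meantime, here is a minimal sketch of a possible workaround. It is not the fix proposed in #360; it only skips tables that have no cells before calling html_table(). The inline HTML is a made-up stand-in for the real page, and the names page, tables, has_cells and parsed are just illustrative.

```r
library(rvest)

# Hypothetical inline page (an assumption, not the real site):
# one ordinary table, and one table that has a row but no cells.
page <- minimal_html('
  <table>
    <tr><th>a</th><th>b</th></tr>
    <tr><td>1</td><td>2</td></tr>
  </table>
  <table><tr></tr></table>
')

tables <- html_elements(page, "table")

# Flag tables that contain at least one <td> or <th> cell.
has_cells <- vapply(
  tables,
  function(tbl) length(html_elements(tbl, "td, th")) > 0,
  logical(1)
)

# Parse only the tables that actually have cells.
parsed <- lapply(tables[has_cells], html_table)
```

Note that this drops the cell-less tables entirely, so the positions in parsed no longer line up with the positions of the tables on the page; keep has_cells around if you need to map back.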
Thanks @epiben. I just wanted to let you know, in case you wanted to handle this case too. Thanks again!
No worries, and thank you for raising it here so we can also handle this edge case; it's better to handle it up front. I never knew empty tables came in so many different forms.
library(rvest)
test_url <- "https://www.moneycontrol.com/stocks/hist_stock_result.php?ex=B&sc_id=ITC&mycomp=ITC" %>% read_html()
test.table <- test_url %>% html_nodes("table") %>% html_table()
Error in matrix(unlist(values), ncol = width, byrow = TRUE) :
  'data' must be of a vector type, was 'NULL'