Wrong fileEncoding in get_poll #17

Open
MarcoPortmann opened this issue Dec 20, 2021 · 2 comments
Labels
bug (Something isn't working), question (Further information is requested)

Comments

@MarcoPortmann

I have been investigating why get_poll returns far too few observations. I suspect that fileEncoding should not be set to "latin1".

Example:
nrow(swissdd::get_poll(bfsnr = 6380))
returns 13 rows.

This is equivalent to:

test_df <- read.csv("https://swissvotes.ch/vote/638.00/nachbefragung.csv", sep = ",", header = TRUE, stringsAsFactors = FALSE, fileEncoding = "latin1")
nrow(test_df)

The resulting data frame has 13 rows, and read.csv reports the following warnings:

Warning messages:
1: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  invalid input found on input connection 'https://swissvotes.ch/vote/638.00/nachbefragung.csv'
2: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  EOF within quoted string

However, the following seems to be fine:

test_df2 <- read.csv("https://swissvotes.ch/vote/638.00/nachbefragung.csv", sep = ",", header = TRUE, stringsAsFactors = FALSE)
nrow(test_df2)
# 3070 observations.
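
One way to check which encodings read a file completely is to count rows and collect warnings per encoding. The sketch below is only an illustration: count_rows is a hypothetical helper (not part of swissdd), and it runs on a small temporary file rather than on the swissvotes.ch CSV.

```r
# Hypothetical helper: read a CSV with a given fileEncoding and
# return the number of rows together with any warnings raised.
count_rows <- function(path, enc) {
  warns <- character(0)
  df <- withCallingHandlers(
    read.csv(path, sep = ",", header = TRUE,
             stringsAsFactors = FALSE, fileEncoding = enc),
    warning = function(w) {
      warns <<- c(warns, conditionMessage(w))  # remember the message
      invokeRestart("muffleWarning")           # and keep reading
    }
  )
  list(rows = nrow(df), warnings = warns)
}

# Small stand-in file: a header plus two data rows, one of which
# contains the UTF-8 bytes for "ä" (0xC3 0xA4).
tmp <- tempfile(fileext = ".csv")
bytes <- c(charToRaw("id,answer\n1,J"), as.raw(c(0xc3, 0xa4)),
           charToRaw("\n2,Nein\n"))
writeBin(bytes, tmp)

res_utf8 <- count_rows(tmp, "UTF-8")
res_utf8$rows
```

Pointing count_rows at the nachbefragung.csv URL with "latin1", "UTF-8" and "" (the read.csv default, meaning the native encoding) would show which setting drops rows on a given machine.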
@ThomasWilli
Contributor

Hi @MarcoPortmann
Thanks for pointing this out. We haven't encountered this error yet. Hence, could you provide some information regarding your locale?
Thanks

@ThomasWilli added the labels bug (Something isn't working) and question (Further information is requested) on Dec 22, 2021
@MarcoPortmann
Author

Is this the information you're referring to?

> Sys.getlocale (category = "LC_ALL")
[1] "LC_COLLATE=German_Switzerland.1252;LC_CTYPE=German_Switzerland.1252;LC_MONETARY=German_Switzerland.1252;LC_NUMERIC=C;LC_TIME=German_Switzerland.1252"
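
In case more detail helps, the locale settings that usually matter for an encoding problem can be collected in one go. This is a sketch using base R only; locale_report is just an illustrative name.

```r
# Gather the character-set locale, whether R considers the session
# UTF-8 or Latin-1, and the R version and platform.
locale_report <- list(
  lc_ctype  = Sys.getlocale("LC_CTYPE"),
  utf8      = l10n_info()[["UTF-8"]],
  latin1    = l10n_info()[["Latin-1"]],
  r_version = R.version.string,
  platform  = R.version$platform
)
str(locale_report)
```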
