dmy() Parsing of months in text failed #405

Closed
gordchan opened this issue Apr 26, 2016 · 8 comments

gordchan commented Apr 26, 2016

I had been using v.1.3.3 of lubridate and had no trouble running the following code:

start <- dmy("1 Jan 1987")

However, on another computer running v.1.5.6, the following warning was returned:

> start <- dmy("1 Jan 1987")
Warning message:
All formats failed to parse. No formats found.

Has parsing of textual month names been dropped in the new version? Thanks.

Contributor

cderv commented Apr 26, 2016

The problem may come from your locale settings. dmy takes a locale argument and guesses the format based on it.

Run Sys.getlocale("LC_TIME") to see your locale. That will help reproduce the example.

For me, dmy("1 Jan 1987") does not work either, but I am in a French locale, "French_France.1252".
So I have to write dmy("1 Janv. 1987") to get the correct guess.

You can check this with the lubridate::guess_formats function, which lets you change the locale setting:

# Working: French abbreviation in a French locale
lubridate::guess_formats("1 Janv. 1987", orders = "dmy", locale = "French_France.1252")
#>        dmy 
#> "%d %b %Y"
# Not working: English abbreviation in a French locale
lubridate::guess_formats("1 Jan 1987", orders = "dmy", locale = "French_France.1252")
#> NULL
# Working: English abbreviation in the "C" locale
lubridate::guess_formats("1 Jan 1987", orders = "dmy", locale = "C")
#>        dmy 
#> "%d %b %Y"

You can also look at it the other way around by formatting a date in several forms:

format(lubridate::dmy("1 01 1987"), "%b@%B")
#> [1] "janv.@janvier"

You can see that the abbreviated form (%b format) for January in my locale is not "Jan" but "janv."

You can list every abbreviation with this command (shown here for a French locale):

format(seq.Date(as.Date('2000-01-01'), by = 'month', len = 12), "%b")
#>  [1] "janv." "févr." "mars"  "avr."  "mai"   "juin"  "juil." "août" 
#>  [9] "sept." "oct."  "nov."  "déc."

Try running these commands to see whether this is the source of the problem. If you give me your locale, I may be able to help you more.

By the way, I could not try it with the previous version 1.3.3 because I was unable to install it. Sorry.
I will try to see what changed in the code, but I can help you get it working with the new version.

Member

vspinu commented Apr 26, 2016

@cderv is completely right. It's the locale issue. There is an open issue for adding alphabetic parsing of English names independently of the locale (#287).

@gordchan
Author

Thanks for the help @cderv and @vspinu, I have tried the following code on the problematic computer running v.1.5.6:

> Sys.getlocale()
[1] "LC_COLLATE=Chinese (Traditional)_Hong Kong S.A.R..950;LC_CTYPE=Chinese (Traditional)_Hong Kong S.A.R..950;LC_MONETARY=Chinese (Traditional)_Hong Kong S.A.R..950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Hong Kong S.A.R..950"
> format(seq.Date(as.Date('2000-01-01'), by = 'month', len = 12), "%b")
 [1] "一月"   "二月"   "三月"   "四月"   "五月"   "六月"   "七月"   "八月"  
 [9] "九月"   "十月"   "十一月" "十二月"

Comparing with the original computer running v.1.3.3:

> Sys.getlocale()
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"

I think it is unrelated to the version of lubridate. But could I set the locale before parsing the date? Not sure what would happen if I mess around with the locale there.

Member

vspinu commented Apr 27, 2016

But could I set the locale before parsing the date?

Yes. You can set the locale with Sys.setlocale("LC_TIME", "en_US.UTF-8") inside the R session (locale names are platform-specific; on Windows a name such as "English" is used instead). Or you can pass the locale argument directly to parsing functions like ymd.
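
To address the worry about messing with the locale, here is a minimal sketch of changing LC_TIME only for the duration of one parse and then restoring it. The helper name with_time_locale is hypothetical (it is not part of lubridate or base R), and the sketch uses base R's as.Date so that it runs without any package installed. It relies on the "C" locale, which uses English month names.

```r
# Hypothetical helper: evaluate `expr` with LC_TIME temporarily set to `locale`,
# then restore the previous setting even if `expr` raises an error.
with_time_locale <- function(locale, expr) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", locale)
  # `expr` is a lazily evaluated promise, so it runs only here,
  # after the locale has been switched.
  force(expr)
}

# Parse an English month abbreviation under the "C" locale.
start <- with_time_locale("C", as.Date("1 Jan 1987", format = "%d %b %Y"))
print(start)
```

With lubridate installed, the same wrapper should work around dmy("1 Jan 1987"), and passing locale = "C" directly to dmy, as suggested above, avoids touching the session locale yourself.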

vspinu closed this as completed Apr 27, 2016

Contributor

cderv commented Apr 27, 2016

@vspinu, if guess_formats checks only the given locale, the help page may be out of date, since it states:

locale locale to use, default to the current locale (also checks en_US)

It is not really a problem, just a misleading note, at least until the new feature is added.

Member

vspinu commented Apr 27, 2016

Indeed. I have just fixed that. Thanks!

Contributor

cderv commented Apr 27, 2016

Cool, glad I could help. Should I have opened a PR for that, or was it OK to just tell you?
I have often hesitated over the correct way to contribute.

Member

vspinu commented Apr 27, 2016

Either is fine. But PRs are always very much appreciated :)
