I get NA when I convert character to time (POSIXlt)

Question:

Why do I get NA when I do this character to POSIXlt conversion?

    library(bReeze)
    data(winddata)

    tempo <- winddata[,1]
    tempo[1:6] # Preview 
    # [1] "06.05.2009 11:20" "06.05.2009 11:30" "06.05.2009 11:40"

    tempo_POSIX <- strptime(tempo, format = "%d.%m.%Y %H:%M")
    sum(is.na(tempo_POSIX))
    # [1] 6

    valores_NA <- which(is.na(tempo_POSIX))
    tempo[valores_NA]
    # [1] "18.10.2009 00:00" "18.10.2009 00:10" "18.10.2009 00:20"
    # [4] "18.10.2009 00:30" "18.10.2009 00:40" "18.10.2009 00:50"

As you can see, the values that were converted to NA look perfectly normal: they follow the same format as the others.

Interestingly, the problem DOES NOT OCCUR if I pass a value to the tz argument:

    tempo_POSIX <- strptime(tempo, format = "%d.%m.%Y %H:%M", tz = "GMT")
    sum(is.na(tempo_POSIX))
    # [1] 0

My system information is:

    > sessionInfo()
    R version 3.0.2 (2013-09-25)
    Platform: x86_64-w64-mingw32/x64 (64-bit)

    locale:
    [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252   
    [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C                      
    [5] LC_TIME=Portuguese_Brazil.1252    

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     

    other attached packages:
    [1] bReeze_0.4-0

    loaded via a namespace (and not attached):
    [1] tools_3.0.2

Answer:

The as.POSIXlt help page has the following passage, which points out that converting between date-time classes requires a time zone, that the times are validated against that time zone, and that this can cause problems around daylight saving time (DST) transitions:

Character input is first converted to class "POSIXlt" by strptime: numeric input is first converted to "POSIXct". Any conversion that needs to go between the two date-time classes requires a time zone: conversion from "POSIXlt" to "POSIXct" will validate times in the selected time zone. One issue is what happens at transitions to and from DST

When you call strptime(tempo, format = "%d.%m.%Y %H:%M"), you are converting the object to the POSIXlt class.

    class(tempo_POSIX)
    # [1] "POSIXlt" "POSIXt"
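
As a small illustration that strptime itself has no trouble with the problematic strings, the fields stored in the resulting POSIXlt object can be inspected directly. This is only a sketch: the time zone name "America/Sao_Paulo" is an assumption about the asker's system setting, which is not stated in the question.

```r
# Parse one of the problematic strings in an assumed Brazilian time zone
x <- strptime("18.10.2009 00:00", format = "%d.%m.%Y %H:%M",
              tz = "America/Sao_Paulo")

# The broken-down fields are copied straight from the string
x$mday  # 18
x$mon   # 9 (months are counted from 0, so 9 is October)
x$hour  # 0
```

So the string is parsed as written; whether that wall-clock time actually exists in the time zone is a separate check.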

But when you call is.na(), the object is converted to POSIXct. Note that the is.na.POSIXlt method calls the as.POSIXct function:

    is.na.POSIXlt
    function (x) 
    is.na(as.POSIXct(x))
    <bytecode: 0x26519a14>
    <environment: namespace:base>

Daylight saving time in Brazil in 2009 started on October 18th at 00:00. In other words, with daylight saving time in effect there is no 00:00 in Brazil on October 18, 2009: once the clock passed 23:59 on the day before, it jumped straight to 01:00 in the morning.
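
The jump can be checked with POSIXct arithmetic, which moves along the real timeline rather than the wall clock. A minimal sketch, assuming the time zone name "America/Sao_Paulo" is available in the time zone database of your R installation:

```r
# One minute before the switch, in an assumed Brazilian time zone
before <- as.POSIXct("2009-10-17 23:59:00", tz = "America/Sao_Paulo")

# Adding 60 seconds advances real time by one minute
after <- before + 60

format(before, "%d.%m.%Y %H:%M")  # "17.10.2009 23:59"
format(after,  "%d.%m.%Y %H:%M")  # "18.10.2009 01:00"
```

No wall-clock time between 00:00 and 00:59 is ever reached, and that range is exactly where the six NA values fall.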

So when you call is.na(), the date is converted to POSIXct, and this conversion validates it against your time zone (probably Brazil/São Paulo: since you did not specify a time zone, the system's one is used). Because there is no 00:00 on October 18th in this time zone, the result is (correctly, but unexpectedly) NA. When you pass GMT, or any other time zone in which the date you supplied exists (like London), the conversion goes through normally. That is why it worked with tz = "GMT" (and why it worked for Djongs, who must be in another time zone).
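
A minimal sketch of the workaround, using only the six strings from the question so that it runs without the bReeze package. The choice of "UTC" is an assumption: it is appropriate only if the timestamps were recorded on a clock that does not observe DST.

```r
# The six strings from the question that came back as NA
tempo_gap <- c("18.10.2009 00:00", "18.10.2009 00:10", "18.10.2009 00:20",
               "18.10.2009 00:30", "18.10.2009 00:40", "18.10.2009 00:50")

# Parse in a zone without DST, so every wall-clock time exists
tempo_ct <- as.POSIXct(tempo_gap, format = "%d.%m.%Y %H:%M", tz = "UTC")

sum(is.na(tempo_ct))                        # 0
as.numeric(diff(tempo_ct), units = "mins")  # 10 10 10 10 10
```

Passing tz explicitly also makes the result independent of the system time zone of whoever runs the script.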
