Datasets from light loggers often have implicit gaps. These gaps are implicit
in the sense that consecutive timestamps (Datetimes) might not follow a
regular epoch/interval. This function fills these implicit gaps by creating a
gapless sequence of Datetimes and joining it to the dataset. The gapless
sequence is determined by the minimum and maximum Datetime in the dataset
(per group) and an epoch. The epoch can either be guessed from the dataset or
specified by the user. A sequence of gapless Datetimes can be created with
the gapless_Datetimes() function, whereas the dominant epoch in the data
can be checked with the dominant_epoch() function. The behavior argument
specifies how the data is combined. By default, the data is joined with a
full join, which means that all rows from the gapless sequence are kept, even
if there is no matching row in the dataset.
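The idea of a gapless sequence combined with a full join can be sketched in base R for a single, ungrouped series. This is only an illustration under stated assumptions, not the implementation used by gap_handler(); the data and the names obs, gapless, and joined are hypothetical.

```r
# Illustrative sketch only: not how gap_handler() is implemented internally.
# Hypothetical single-group data with a 1-day epoch and one missing day.
obs <- data.frame(
  Datetime = as.POSIXct(c("2023-01-01", "2023-01-02", "2023-01-04"), tz = "UTC"),
  value = c(10, 12, 9)
)

# Gapless sequence from the minimum to the maximum Datetime, one step per epoch
gapless <- data.frame(
  Datetime = seq(min(obs$Datetime), max(obs$Datetime), by = "1 day")
)

# Full join: keep every row of both tables; unmatched rows are filled with NA
joined <- merge(gapless, obs, by = "Datetime", all = TRUE)

# Rows that exist only in the gapless sequence are the implicit gaps
joined$is.implicit <- !(joined$Datetime %in% obs$Datetime)
joined
```

The missing day (2023-01-03) appears as an extra row with an NA measurement and is.implicit set to TRUE, which mirrors the flag described for the "full_sequence" behavior.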
Usage
gap_handler(
  dataset,
  Datetime.colname = Datetime,
  epoch = "dominant.epoch",
  behavior = c("full_sequence", "regulars", "irregulars", "gaps"),
  full.days = FALSE
)
Arguments
- dataset
A light logger dataset. Needs to be a dataframe.
- Datetime.colname
The column that contains the datetime. Needs to be a POSIXct and part of the dataset.
- epoch
The epoch to use for the gapless sequence. Can be either a lubridate::duration() or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data or a valid lubridate::duration() string, e.g., "1 day" or "10 sec".
- behavior
The behavior of the join of the dataset with the gapless sequence. Can be one of "full_sequence" (the default), "regulars", "irregulars", or "gaps". See the Value section for details.
- full.days
If TRUE, the gapless sequence will include the whole first and last day where there is data.
Value
A modified tibble similar to dataset but with handling of implicit gaps, depending on the behavior argument:
- "full_sequence" adds timestamps to the dataset that are missing based on a full sequence of Datetimes (i.e., the gapless sequence). The dataset is thus equal (no gaps) or greater in the number of rows than the input. One column is added: is.implicit indicates whether the row was added (TRUE) or not (FALSE). This helps to differentiate measurement values from values that might be imputed later on.
- "regulars" keeps only rows from the gapless sequence that have a matching row in the dataset. This can be interpreted as a row-reduced dataset with only regular timestamps according to the epoch. In case of no gaps this tibble has the same number of rows as the input.
- "irregulars" keeps only rows from the dataset that do not follow the regular sequence of Datetimes according to the epoch. In case of no gaps this tibble has 0 rows.
- "gaps" returns a tibble of all implicit gaps in the dataset. In case of no gaps this tibble has 0 rows.
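The relationship between the four behaviors can be pictured as set operations between the observed timestamps and the gapless sequence. The following base R sketch is an assumption-laden illustration for a single group, not the function's actual code; the names observed, gapless, and day are hypothetical, and gap_handler() additionally handles grouping and extra columns.

```r
# Illustrative sketch only; gap_handler() may differ in details such as
# grouping and the columns it returns.
# Hypothetical timestamps with a 1-day epoch: the second one is 12 hours off.
day <- 86400
observed <- as.POSIXct(c(0, 1.5 * day, 2 * day), origin = "1970-01-01", tz = "UTC")

# Regular grid between the first and last observed timestamp
gapless <- seq(min(observed), max(observed), by = "1 day")

# "full_sequence": every timestamp from either the data or the grid
full_sequence <- sort(unique(c(observed, gapless)))

# "regulars": observed timestamps that fall on the grid
regulars <- observed[observed %in% gapless]

# "irregulars": observed timestamps that do not fall on the grid
irregulars <- observed[!(observed %in% gapless)]

# "gaps": grid timestamps without an observation
gaps <- gapless[!(gapless %in% observed)]
```

Here full_sequence has four entries, regulars two, and irregulars and gaps one each: the off-grid timestamp is both the only irregular and the reason its grid slot counts as a gap.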
See also
Other regularize:
dominant_epoch(),
extract_gaps(),
gap_finder(),
gapless_Datetimes(),
has_gaps(),
has_irregulars()
Examples
dataset <-
  tibble::tibble(Id = c("A", "A", "A", "B", "B", "B"),
                 Datetime = lubridate::as_datetime(1) +
                   lubridate::days(c(0:2, 4, 6, 8)) +
                   lubridate::hours(c(0, 12, rep(0, 4)))) %>%
  dplyr::group_by(Id)
dataset
#> # A tibble: 6 × 2
#> # Groups:   Id [2]
#>   Id    Datetime
#>   <chr> <dttm>
#> 1 A     1970-01-01 00:00:01
#> 2 A     1970-01-02 12:00:01
#> 3 A     1970-01-03 00:00:01
#> 4 B     1970-01-05 00:00:01
#> 5 B     1970-01-07 00:00:01
#> 6 B     1970-01-09 00:00:01
#assuming the epoch is 1 day, we can add implicit data to our dataset
dataset %>% gap_handler(epoch = "1 day")
#> # A tibble: 9 × 3
#> # Groups:   Id [2]
#>   Id    Datetime            is.implicit
#>   <chr> <dttm>              <lgl>
#> 1 A     1970-01-01 00:00:01 FALSE
#> 2 A     1970-01-02 00:00:01 TRUE
#> 3 A     1970-01-02 12:00:01 FALSE
#> 4 A     1970-01-03 00:00:01 FALSE
#> 5 B     1970-01-05 00:00:01 FALSE
#> 6 B     1970-01-06 00:00:01 TRUE
#> 7 B     1970-01-07 00:00:01 FALSE
#> 8 B     1970-01-08 00:00:01 TRUE
#> 9 B     1970-01-09 00:00:01 FALSE
#we can also check whether there are irregular Datetimes in our dataset
dataset %>% gap_handler(epoch = "1 day", behavior = "irregulars")
#> # A tibble: 1 × 3
#> # Groups:   Id [1]
#>   Id    Datetime            is.implicit
#>   <chr> <dttm>              <lgl>
#> 1 A     1970-01-02 12:00:01 FALSE
#to get to the gaps, we can use the "gaps" behavior
dataset %>% gap_handler(epoch = "1 day", behavior = "gaps")
#> # A tibble: 3 × 2
#> # Groups:   Id [2]
#>   Id    Datetime
#>   <chr> <dttm>
#> 1 A     1970-01-02 00:00:01
#> 2 B     1970-01-06 00:00:01
#> 3 B     1970-01-08 00:00:01
#finally, we can also get just the regular Datetimes
dataset %>% gap_handler(epoch = "1 day", behavior = "regulars")
#> # A tibble: 5 × 3
#> # Groups:   Id [2]
#>   Id    Datetime            is.implicit
#>   <chr> <dttm>              <lgl>
#> 1 A     1970-01-01 00:00:01 FALSE
#> 2 A     1970-01-03 00:00:01 FALSE
#> 3 B     1970-01-05 00:00:01 FALSE
#> 4 B     1970-01-07 00:00:01 FALSE
#> 5 B     1970-01-09 00:00:01 FALSE
