gap_table() creates a gt::gt() table with one row per group, summarizing gaps and gap-related information about the dataset. The summary includes the available data, total duration, number of gaps, implicit and explicit missing data, and, optionally, irregular data.
Usage
gap_table(
  dataset,
  Variable.colname = MEDI,
  Variable.label = "melanopic EDI",
  title = "Summary of available and missing data",
  Datetime.colname = Datetime,
  epoch = "dominant.epoch",
  full.days = TRUE,
  include.implicit.gaps = TRUE,
  check.irregular = TRUE
)
Arguments
- dataset
A light logger dataset. Needs to be a data frame.
- Variable.colname
Column name of the variable to check for NA values. Expects a symbol.
- Variable.label
Clear name of the variable. Expects a string.
- title
Title string for the table.
- Datetime.colname
The column that contains the datetime. Needs to be a POSIXct and part of the dataset.
- epoch
The epoch to use for the gapless sequence. Can be either a lubridate::duration() or a string. If it is a string, it needs to be either "dominant.epoch" (the default) for a guess based on the data, or a valid lubridate::duration() string, e.g., "1 day" or "10 sec".
- full.days
If TRUE, the gapless sequence will include the whole first and last day where there is data.
- include.implicit.gaps
Logical. Whether to expand the datetime sequence and search for implicit gaps. Default is TRUE. If no Variable.colname is provided, this argument will be ignored. If there are implicit gaps, the gap calculation can be incorrect whenever explicit gaps flank implicit gaps.
- check.irregular
Logical. Whether to include irregular data in the summary, i.e., data points that do not fall on the regular sequence.
Examples
sample.data.environment |> dplyr::filter(MEDI <= 50000) |> gap_table()
[Rendered gt table: "Summary of available and missing data", Variable: melanopic EDI. Column headers: Time, %, n (footnote 1), n (footnotes 2, 1), N, ø, ø n (footnote 1). Rows: Overall, Environment, Participant.]
1 Number of (missing or actual) observations
2 If n > 0: it is possible that the other summary statistics are affected, as they are calculated based on the most prominent interval.
3 Based on times, not necessarily number of observations
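The arguments described above can also be set explicitly. The following is a minimal sketch, not a definitive recipe: it assumes that the LightLogR package (taken here to be the package that provides gap_table() and sample.data.environment) and dplyr are installed, and the label, title, and flag values are illustrative choices rather than recommendations.

```r
# Assumption: gap_table() and sample.data.environment come from LightLogR
library(LightLogR)

tbl <-
  sample.data.environment |>
  # same filter as in the example above
  dplyr::filter(MEDI <= 50000) |>
  gap_table(
    Variable.colname = MEDI,            # column checked for NA values
    Variable.label = "melanopic EDI",   # label shown in the table
    title = "Gap summary (illustrative title)",
    Datetime.colname = Datetime,
    epoch = "dominant.epoch",           # guess the epoch from the data
    full.days = FALSE,                  # do not pad to whole first/last days
    include.implicit.gaps = TRUE,       # expand the sequence to find implicit gaps
    check.irregular = TRUE              # also summarize off-sequence data points
  )

# Printing the object renders the table
tbl
```

Setting full.days = FALSE limits the gapless sequence to the span actually covered by the data, which may be preferable when recordings start or end mid-day.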