You are tasked with visualizing a day of light logger data, worn by a participant over the course of one week. You are supposed to show the data in comparison with what is available outside in unobstructed daylight. And, to make matters slightly more complex, you also want to show how the participant's luminous exposure compares to the current recommendations for healthy daytime, evening, and nighttime light exposure. While the figure is nothing special in itself, getting light logger data ready for plotting is fraught with hurdles: raw data structures that vary by device manufacturer, varying temporal resolutions, implicit missing data, irregular data, working with Datetimes, and so forth. LightLogR aims to make this process easier by providing a holistic workflow for the data, from import over validation up to and including figure generation and metric calculation.
This article guides you through the process of:

- importing Light Logger data from a participant as well as from the environment, plus sleep data about the same participant
- creating exploratory visualizations to select a good day for the figure
- connecting the datasets into a coherent whole and setting recommended light levels based on the sleep data
- creating an appealing visualization with various styles to finish the task
Let's head right in by loading the package! We will also work a bit with data manipulation, display the odd table, and enlist the help of several plotting aids from the ggplot2 package, so we will also need the tidyverse and the gt package. Later on, we want to combine plots, which is where patchwork will come in.
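Loading the packages mentioned above could look like this (a sketch, assuming all four packages are installed):

```r
library(LightLogR)  # import, validation, and visualization of light logger data
library(tidyverse)  # data manipulation and ggplot2
library(gt)         # display tables
library(patchwork)  # combining plots later on
```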
Importing Data
The data we need are part of the LightLogR package. They are unprocessed (after device export) data from light loggers (and a diary app for capturing sleep times). All data are anonymous, and we can access them through the following paths:
path <- system.file("extdata", package = "LightLogR")
file.LL <- "205_actlumus_Log_1020_20230904101707532.txt.zip"
file.env <- "cyepiamb_CW35_Log_1431_20230904081953614.txt.zip"
file.sleep <- "205_sleepdiary_all_20230904.csv"
Participant Light Logger Data
LightLogR provides convenient import functions for a range of supported devices (use the command supported_devices() if you want to see what devices are supported at present). Because LightLogR knows how the files from these devices are structured, it needs very little input. In fact, the mere filepath would suffice. It is, however, a good idea to also provide the timezone argument tz to specify that these measurements were made in the Europe/Berlin timezone. This makes your data future-proof for when it is used in comparison with other geolocations.
Every light logger dataset needs an Id to connect or separate observations from the same or different participant/device/study/etc. If we don't provide an Id to the import function (or the dataset doesn't contain an Id column), the filename will be used as the Id. As this would be rather cumbersome in our case, we will use a regex to extract the first three digits from the filename, which serve this purpose here.
tz <- "Europe/Berlin"
dataset.LL <- import$ActLumus(file.LL, path, auto.id = "^(\\d{3})", tz = tz)
#>
#> Successfully read in 61'016 observations across 1 Ids from 1 ActLumus-file(s).
#> Timezone set is Europe/Berlin.
#> The system timezone is UTC. Please correct if necessary!
#>
#> First Observation: 2023-08-28 08:47:54
#> Last Observation: 2023-09-04 10:17:04
#> Timespan: 7.1 days
#>
#> Observation intervals:
#> Id interval.time n pct
#> 1 205 10s 61015 100%
As you can see, the import is accompanied by a (hopefully) helpful message about the imported data. It contains the number of measurements, the timezone, start and end date, the timespan, and all observation intervals. In this case, the measurements all follow a 10-second epoch. We also get a plotted overview of the data. In our case, this is not particularly helpful, but it quickly helps to assess how different datasets compare to one another on the timeline. We could deactivate this plot by setting auto.plot = FALSE during import, or create it separately with the gg_overview() function.
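The two alternatives described above could be sketched like this (assuming auto.plot is accepted by the importer as described in the text):

```r
# suppress the automatic overview plot during import ...
dataset.LL <- import$ActLumus(file.LL, path, auto.id = "^(\\d{3})",
                              tz = tz, auto.plot = FALSE)
# ... and create the overview separately whenever needed
dataset.LL %>% gg_overview()
```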
Because we have no missing values that we would have to deal with first, this dataset is already good to go. If you, e.g., want to know the range of melanopic EDI (a measure of stimulus strength for the nonvisual system) for every day in the dataset, you can do that:
dataset.LL %>%
  group_by(Date = as_date(Datetime)) %>%
  summarize(range.MEDI = range(MEDI) %>% str_flatten(" - ")) %>%
  gt()
| Date | range.MEDI |
|---|---|
| 2023-08-28 | 0 - 10647.22 |
| 2023-08-29 | 0 - 7591.5 |
| 2023-08-30 | 0 - 10863.57 |
| 2023-08-31 | 0 - 10057.08 |
| 2023-09-01 | 0 - 67272.17 |
| 2023-09-02 | 0 - 106835.71 |
| 2023-09-03 | 0 - 57757.9 |
| 2023-09-04 | 0 - 64323.52 |
The same goes for visualization - it is always helpful to get a good look at the data immediately after import. The gg_day() function creates a simple ggplot of the data, stacked vertically by days. The function needs very little input beyond the dataset (in fact, it would even work without the size input, which just makes the default point size smaller; the interactive argument sends the output to plotly to facilitate data exploration). gg_day() features a lot of flexibility, and can be adapted and extended to fit various needs, as we will see shortly.
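A minimal call using the arguments described above might look like this (the size value of 0.25 is an illustrative choice, not taken from the source):

```r
# sketch: one stacked facet per day of recording;
# size = 0.25 shrinks the default point size, and setting
# interactive = TRUE would instead hand the plot to plotly
dataset.LL %>% gg_day(size = 0.25)
```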