this document contains the analysis documentation for the article How to deal with darkness: Modelling and visualization of zero-inflated personal light exposure data on a logarithmic scale.
A preprint of the publication is available at bioRxiv. This page will be updated with a link and DOI to the main publication in the Journal of Biological Rhythms upon release.
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(mgcv)
Loading required package: nlme
Attaching package: 'nlme'
The following object is masked from 'package:dplyr':
collapse
This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.
library(patchwork)library(gt)library(itsadug)
Loading required package: plotfunctions
Attaching package: 'plotfunctions'
The following object is masked from 'package:ggplot2':
alpha
Loaded package itsadug 2.4 (see 'help("itsadug")' ).
library(ggsci)library(cowplot)
Attaching package: 'cowplot'
The following object is masked from 'package:gt':
as_gtable
The following object is masked from 'package:patchwork':
align_plots
The following object is masked from 'package:lubridate':
stamp
library(tweedie)library(ggforce)
Baseline Data and visualization
In this section, we will copy relevant portions of a tutorial from LightLogR The whole game which will be the basis for both visualizations and analysis.
Successfully read in 61'016 observations across 1 Ids from 1 ActLumus-file(s).
Timezone set is Europe/Berlin.
First Observation: 2023-08-28 08:47:54
Last Observation: 2023-09-04 10:17:04
Timespan: 7.1 days
Observation intervals:
Id interval.time n pct
1 205 10s 61015 100%
Successfully read in 20'143 observations across 1 Ids from 1 ActLumus-file(s).
Timezone set is Europe/Berlin.
First Observation: 2023-08-28 08:28:39
Last Observation: 2023-09-04 08:19:38
Timespan: 7 days
Observation intervals:
Id interval.time n pct
1 CW35 29s 1 0%
2 CW35 30s 20141 100%
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for x is already present.
Adding another scale for x, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for x is already present.
Adding another scale for x, which will replace the existing scale.
Scale for x is already present.
Adding another scale for x, which will replace the existing scale.
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.
Scale for x is already present.
Adding another scale for x, which will replace the existing scale.
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Scale for x is already present.
Adding another scale for x, which will replace the existing scale.
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.
Scale for y is already present.
Adding another scale for y, which will replace the existing scale.
Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
-Inf
Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
-Inf
Scale for x is already present.
Adding another scale for x, which will replace the existing scale.
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Scale for colour is already present.
Adding another scale for colour, which will replace the existing scale.
Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
-Inf
Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
-Inf
Modelling
#setting the ends for the cyclic smoothknots_day <-list(time =c(0, 24*3600))model_data <-dataset.LL.partial %>%ungroup() %>%rename(Environment = Reference) %>%pivot_longer(names_to ="type", cols =c(MEDI, Environment)) %>%arrange(type) %>%mutate(time = hms::as_hms(Datetime) %>%as.numeric(),time =c(time[-n()], 24*3600),type =factor(type),start.event = time ==0,input_m1 =case_when( value ==0~NA,.default = value),input_m2 = value + .Machine$double.eps,input_m3 = value +0.1,input_m4 = value,.by = type )
Model type 1: 0 to NA
pattern_formula <-log10(input_m1) ~ type +s(time, by = type, bs ="cc", k =24)model_data_m1 <- model_data %>%group_by(type, case =is.na(input_m1)) %>%mutate(start.event =c(TRUE, rep(FALSE, n() -1)))model_1 <-bam(formula = pattern_formula, data = model_data_m1, method ="REML", knots = knots_day)r1 <-start_value_rho(model_1, plot=TRUE)
Method: REML Optimizer: outer newton
full convergence after 6 iterations.
Gradient range [-4.288836e-10,2.338929e-11]
(score 189.8507 & scale 0.1288363).
Hessian positive definite, eigenvalue range [2.029136,185.2563].
Model rank = 46 / 46
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
k' edf k-index p-value
s(time):typeEnvironment 22.0 14.5 0.87 0.005 **
s(time):typeMEDI 22.0 18.5 0.87 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Method: REML Optimizer: outer newton
full convergence after 5 iterations.
Gradient range [-2.069517e-07,5.04231e-09]
(score 949.2801 & scale 2.561581).
Hessian positive definite, eigenvalue range [4.66738,288.4646].
Model rank = 46 / 46
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
k' edf k-index p-value
s(time):typeEnvironment 22.0 16.8 0.99 0.35
s(time):typeMEDI 22.0 15.6 0.99 0.47
Method: REML Optimizer: outer newton
full convergence after 5 iterations.
Gradient range [-1.933665e-07,3.07114e-09]
(score 21.69085 & scale 0.09072546).
Hessian positive definite, eigenvalue range [5.504781,288.4603].
Model rank = 46 / 46
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
k' edf k-index p-value
s(time):typeEnvironment 22.0 17.6 1.1 0.99
s(time):typeMEDI 22.0 14.5 1.1 0.98
Method: REML Optimizer: outer newton
full convergence after 5 iterations.
Gradient range [-0.005965489,0.0009400968]
(score 2895.865 & scale 2.084384).
Hessian positive definite, eigenvalue range [2.112661,585.8741].
Model rank = 46 / 46
Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.
k' edf k-index p-value
s(time):typeEnvironment 22.0 15.7 1.12 1
s(time):typeMEDI 22.0 16.6 1.12 1