Research Question 1

Preface

This document contains the analysis for Research Question 1, as defined in the preregistration.

Research Question 1:
Do sites differ in objectively measured personal light exposure? If so, does an individual’s behaviour and micro-environment have a stronger effect than site?

H1: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.

H2: The variance of personal light exposure patterns (hourly geometric mean of melanopic EDI) by participants within sites is greater than the variance between sites.

Setup

In this section, we load in the preprocessed and prepared data, site data, and load required packages.

In [1]:

Code

Setup

#| message: false
#| warning: false
#| filename: Setup
library(tidyverse)

Warning: package 'ggplot2' was built under R version 4.5.2

Warning: package 'tibble' was built under R version 4.5.2

Warning: package 'tidyr' was built under R version 4.5.2

Warning: package 'readr' was built under R version 4.5.2

Warning: package 'purrr' was built under R version 4.5.2

Warning: package 'dplyr' was built under R version 4.5.2

Warning: package 'lubridate' was built under R version 4.5.2

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.1     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Code

#| message: false
#| warning: false
#| filename: Setup
library(melidosData)
library(LightLogR)
library(mgcv)

Loading required package: nlme

Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

This is mgcv 1.9-4. For overview type '?mgcv'.

Code

#| message: false
#| warning: false
#| filename: Setup
library(lme4)

Warning: package 'lme4' was built under R version 4.5.2

Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'lme4'

The following object is masked from 'package:nlme':

    lmList

Code

#| message: false
#| warning: false
#| filename: Setup
library(performance)

Warning: package 'performance' was built under R version 4.5.2

Code

#| message: false
#| warning: false
#| filename: Setup
library(cowplot)


Attaching package: 'cowplot'

The following object is masked from 'package:lubridate':

    stamp

Code

#| message: false
#| warning: false
#| filename: Setup
library(ggridges)
library(broom)

Warning: package 'broom' was built under R version 4.5.2

Code

#| message: false
#| warning: false
#| filename: Setup
library(itsadug)

Warning: package 'itsadug' was built under R version 4.5.2

Loading required package: plotfunctions

Warning: package 'plotfunctions' was built under R version 4.5.2


Attaching package: 'plotfunctions'

The following object is masked from 'package:ggplot2':

    alpha

Loaded package itsadug 2.5 (see 'help("itsadug")' ).

Code

#| message: false
#| warning: false
#| filename: Setup
library(gratia)

Warning: package 'gratia' was built under R version 4.5.2


Attaching package: 'gratia'

The following object is masked from 'package:itsadug':

    dispersion

The following object is masked from 'package:stringr':

    boundary

Code

#| message: false
#| warning: false
#| filename: Setup
library(emmeans)

Warning: package 'emmeans' was built under R version 4.5.2

Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'

Code

#| message: false
#| warning: false
#| filename: Setup
library(patchwork)


Attaching package: 'patchwork'

The following object is masked from 'package:cowplot':

    align_plots

Code

#| message: false
#| warning: false
#| filename: Setup
library(rlang)

Warning: package 'rlang' was built under R version 4.5.2


Attaching package: 'rlang'

The following objects are masked from 'package:purrr':

    flatten, flatten_chr, flatten_dbl, flatten_int, flatten_lgl,
    flatten_raw, invoke, splice

Code

#| message: false
#| warning: false
#| filename: Setup
library(broom.mixed)

Warning: package 'broom.mixed' was built under R version 4.5.2

Code

#| message: false
#| warning: false
#| filename: Setup
library(gt)

Warning: package 'gt' was built under R version 4.5.2


Attaching package: 'gt'

The following object is masked from 'package:cowplot':

    as_gtable

Code

#| message: false
#| warning: false
#| filename: Setup
library(performance)
library(glmmTMB)

Warning: package 'glmmTMB' was built under R version 4.5.2

Warning in check_dep_version(dep_pkg = "TMB"): package version mismatch: 
glmmTMB was built with TMB package version 1.9.19
Current TMB package version is 1.9.21
Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)

Code

#| message: false
#| warning: false
#| filename: Setup
library(gghighlight)
library(gtsummary)

Warning: package 'gtsummary' was built under R version 4.5.2

Code

#| message: false
#| warning: false
#| filename: Setup
source("scripts/fitting.R")
source("scripts/summaries.R")
source("scripts/H1_specific.R")

In [2]:

Code cell - Load data

#| message: false
#| code-summary: Code cell - Load data
load("data/metrics_glasses.RData")
metrics <- metrics_glasses
H1 <- "Hypothesis: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod."
H2 <- "Hypothesis: The variance of personal light exposure patterns (hourly geometric mean of melanopic EDI) by participants within sites is greater than the variance between sites."
sig.level <- 0.05

#for chest:
# load("data/metrics_chest.RData")
# metrics <- metrics_chest
# melidos_cities <- melidos_cities[names(melidos_cities)!="MPI"]

Hypothesis 1

Details

Hypothesis 1 states:

There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.

Type of model: LMM

Base formula: \text{Metric} \sim \text{Site} + \text{Photoperiod} + (1|\text{Site}:\text{Participant})

Additional notes: - For dynamics-based metrics LM without Participant random effect. - Because latitude and site are each individual values per site (i.e., each site has its unique latitude), they have to be modelled separately and will be compared via AIC.

Specific formulas: - $ + $ (for dynamics-based metrics) - \text{Metric} \sim \text{Latitude} + \text{Photoperiod} + (1|\text{Site}:\text{Participant}) (for the latitude version of the base formula) - \text{Metric} \sim \text{Photoperiod} + (1|\text{Site}) + (1|\text{Site}:\text{Participant}) (to check whether site is better represented as a random effect)

False-discovery-rate correction: 17 (number of metrics that are compared)

Testing whether photoperiod differs significantly by site

In [3]:

Code

by.day.data <- metrics$data[[3]]

model_phot <- lmer(photoperiod ~ site + (1|Id), data = by.day.data)
summary(model_phot)

Linear mixed model fit by REML ['lmerMod']
Formula: photoperiod ~ site + (1 | Id)
   Data: by.day.data

REML criterion at convergence: -580.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6766 -0.4604 -0.0003  0.4417  3.2394 

Random effects:
 Groups   Name        Variance Std.Dev.
 Id       (Intercept) 3.131047 1.76948 
 Residual             0.007516 0.08669 
Number of obs: 811, groups:  Id, 141

Fixed effects:
            Estimate Std. Error t value
(Intercept)  16.7092     0.4172  40.055
siteFUSPCEU  -5.5992     0.5570 -10.053
siteIZTECH   -3.7039     0.5986  -6.188
siteKNUST    -4.1498     0.6188  -6.707
siteMPI      -4.1705     0.5427  -7.685
siteRISE     -0.4959     0.6442  -0.770
siteTHUAS    -1.5697     0.6442  -2.437
siteTUM       0.3819     0.6980   0.547
siteUCR      -3.2370     0.8343  -3.880

Correlation of Fixed Effects:
            (Intr) sFUSPC sIZTEC sKNUST sitMPI stRISE sTHUAS sitTUM
siteFUSPCEU -0.749                                                 
siteIZTECH  -0.697  0.522                                          
siteKNUST   -0.674  0.505  0.470                                   
siteMPI     -0.769  0.576  0.536  0.518                            
siteRISE    -0.648  0.485  0.451  0.437  0.498                     
siteTHUAS   -0.648  0.485  0.451  0.437  0.498  0.419              
siteTUM     -0.598  0.448  0.416  0.403  0.459  0.387  0.387       
siteUCR     -0.500  0.374  0.348  0.337  0.384  0.324  0.324  0.299

Code

drop1(model_phot, test = "Chisq")

Single term deletions

Model:
photoperiod ~ site + (1 | Id)
       npar     AIC    LRT   Pr(Chi)    
<none>      -556.31                     
site      8 -445.65 126.66 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Code

model_phot |> 
  emmeans(~ site) |> 
  contrast() |> 
  as_tibble() |> 
  gt() |> 
  fmt_number() |> 
  fmt(p.value, fns = style_pvalue)

contrast	estimate	SE	df	t.ratio	p.value
BAUA effect	2.50	0.40	132.00	6.23	<0.001
FUSPCEU effect	−3.09	0.36	132.01	−8.51	<0.001
IZTECH effect	−1.20	0.41	132.00	−2.91	0.005
KNUST effect	−1.64	0.43	132.01	−3.79	<0.001
MPI effect	−1.67	0.35	132.00	−4.81	<0.001
RISE effect	2.01	0.46	131.99	4.34	<0.001
THUAS effect	0.94	0.46	132.00	2.02	0.051
TUM effect	2.89	0.52	131.99	5.56	<0.001
UCR effect	−0.73	0.66	132.02	−1.11	0.3

Sites do vary significantly in their photoperiod, with only Delft (NL) and San José (CR) non-significantly different from the average.

Preparation

In [4]:

Code cell - Model setup based on distribution

#| code-summary: Code cell - Model setup based on distribution
model_tbl <- tibble::tribble(
    ~name, ~engine, ~ response, ~family,
    "interdaily_stability", "lm", "qlogis(metric)", gaussian(),
    "intradaily_variability", "lm", "metric", gaussian(),
    "Mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "dose", "lmer", "log_zero_inflated(metric)", gaussian(),
    "MDER", "lmer", "metric", gaussian(),
    "darkest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "duration_above_1000", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_above_250", "lmer", NA, gaussian(),
    "duration_below_10", "lmer", NA, gaussian(),
    "duration_below_1", "lmer", NA, gaussian(),
    "duration_above_250_wake", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_10_pre-sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_1_sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "period_above_250", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_midpoint", "lmer", "metric", gaussian(),
    "brightest_10h_onset", "lmer", NA, gaussian(),
    "brightest_10h_offset", "lmer", NA, gaussian(),
    "darkest_10h_midpoint", "lmer", "metric", gaussian(),
    "darkest_10h_onset", "lmer", NA, gaussian(),
    "darkest_10h_offset", "lmer", NA, gaussian(),
    "mean_timing_above_250", "lmer", "metric", gaussian(),
    "first_timing_above_250", "lmer", "metric", gaussian(),
    "last_timing_above_250", "lmer", "metric", gaussian()
  )

In [5]:

Code cell - Formula setup for inference

#| code-summary: Code cell - Formula setup for inference
metrics <-
  metrics |> 
  mutate(H1_1 = case_when(type == "participant" ~ 
                                  "~ site + photoperiod",
                                type == "participant-day" ~ 
                                  "~ site + photoperiod + (1| site:Id)"),
         H1_0 = case_when(type == "participant" ~ 
                                   "~ photoperiod",
                                 type == "participant-day" ~ 
                                  "~ photoperiod + (1| site:Id)"),
         H1_lat = case_when(type == "participant" ~ 
                                   "~ lat + photoperiod",
                                 type == "participant-day" ~ 
                                  "~ lat + photoperiod + (1|site:Id)"),
         H1_phot = case_when(type == "participant" ~ 
                                   "~ site",
                                 type == "participant-day" ~ 
                                  "~ site + (1|site:Id)"),
         H1_rand = case_when(type == "participant" ~ 
                                   NA,
                                 type == "participant-day" ~ 
                                  "~ photoperiod + (1|site/Id)")
         ) |> 
  left_join(model_tbl, by = "name")

H1_formulas <-  names(metrics) |> subset(grepl("H1_", names(metrics)))

metrics <- 
  metrics |> relocate(all_of(H1_formulas), .after = family)
H1_fdr_n <- metrics |> filter(!is.na(response)) |> nrow()

In [6]:

Code

#| echo: false
metrics |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 1: metric model specifications",
             subtitle = H1) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=17 comparisons") |> 
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H1_1, H1_0))) |> 
  tab_footnote("These models will be checked via AIC against H1 1 to see whether there is indeed a site difference beyond the specific difference in the model. Also, this model is compared against the H1 0 model", locations = cells_column_labels(c("H1_lat", "H1_rand", "H1_phot")))

	Name	Type	Metric family	Engine	Response	Family	H1 1¹	H1 0¹	H1 lat²	H1 phot²	H1 rand²
Hypothesis 1: metric model specifications
Hypothesis: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.
duration
1	Duration above 1000	participant-day		glmmTMB	metric	tweedie	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
2	Duration above 250 wake	participant-day		glmmTMB	metric	tweedie	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
3	Duration below 10 pre-Sleep	participant-day		glmmTMB	metric	tweedie	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
4	Duration below 1 sleep	participant-day		glmmTMB	metric	tweedie	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
5	Period above 250	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
dynamics
6	Interdaily stability	participant	Interdaily stability	lm	qlogis(metric)	gaussian	~ site + photoperiod	~ photoperiod	~ lat + photoperiod	~ site
7	Intradaily variability	participant	Intradaily variability	lm	metric	gaussian	~ site + photoperiod	~ photoperiod	~ lat + photoperiod	~ site
exposure history
8	Dose	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
level
9	Mean	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
10	Brightest 10h mean	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
11	Darkest 10h mean	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
spectrum
12	MDER	participant-day	MDER	lmer	metric	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
timing
13	Brightest 10h midpoint	participant-day		lmer	metric	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
14	Darkest 10h midpoint	participant-day		lmer	metric	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
15	First timing above 250	participant-day		lmer	metric	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
16	Last timing above 250	participant-day		lmer	metric	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
17	Mean timing above 250	participant-day		lmer	metric	gaussian	~ site + photoperiod + (1\| site:Id)	~ photoperiod + (1\| site:Id)	~ lat + photoperiod + (1\|site:Id)	~ site + (1\|site:Id)	~ photoperiod + (1\|site/Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=17 comparisons
¹ These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful
² These models will be checked via AIC against H1 1 to see whether there is indeed a site difference beyond the specific difference in the model. Also, this model is compared against the H1 0 model

In [7]:

Code cell - Formula creation

#| code-summary: Code cell - Formula creation
metrics <-
  metrics |> 
  mutate(across(all_of(H1_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )

In [8]:

Code cell - Offset zero-values in mixed gamma models

#| code-summary: Code cell - Offset zero-values in mixed gamma models
 metrics <- 
  metrics |>
  mutate(data = map2(data, engine, \(data, engine) {
    if(engine == "glmer") {
      data |> mutate(metric = metric + 0.1)
    } else data
  }))

Analysis

Fit models

In [9]:

Code cell - Model fit

#| code-summary: Code cell - Model fit
#| message: false
metrics <-
  metrics |> 
  mutate(
    across(all_of(H1_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )

boundary (singular) fit: see help('isSingular')

Main model diagnostics

In [10]:

Code cell - Create diagnostic plots for main model

#| code-summary: Code cell - Create diagnostic plots for main model
#| message: false
metrics <-
  metrics |> 
  mutate(
    across(all_of(paste0(H1_formulas[1], "_model")),
           \(x) map(x, diagnostics),
           .names = "{str_remove(.col, '_model')}_diagnostics")
    )

`check_outliers()` does not yet support models of class `glmmTMB`.
`check_outliers()` does not yet support models of class `glmmTMB`.
`check_outliers()` does not yet support models of class `glmmTMB`.
`check_outliers()` does not yet support models of class `glmmTMB`.

Model diagnostics overall look good to OK. Anomalies are found but deemed non-critical. Alternative error families, like the Tweedie distribution have been used as well. For Tweedie models, the homogeneity of variance does not show in the plots (duration metrics), but was checked interactively.

Model comparisons

In [11]:

Code cell - Model comparisons

#| code-summary: Code cell - Model comparisons
#| message: false
metrics <- 
  metrics |> 
  mutate(
    H1_1_comp = map2(H1_0_model, H1_1_model, model_anova),
    H1_1_p = map2(H1_1_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_1_sig = H1_1_p <= sig.level,
    
    H1_lat_comp = map2(H1_0_model, H1_lat_model, model_anova),
    H1_lat_p = map2(H1_lat_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_lat_sig = H1_lat_p <= sig.level,
    
    H1_phot_comp = map2(H1_phot_model, H1_1_model, model_anova),
    H1_phot_p = map2(H1_phot_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_phot_sig = H1_phot_p <= sig.level,
    
    H1_rand_comp = map2(H1_0_model, H1_rand_model, model_anova),
    H1_rand_p = map2(H1_rand_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_rand_sig = H1_rand_p <= sig.level,
    
    AICs = pmap(list(H1_0_model, H1_1_model, H1_phot_model, H1_lat_model, 
                     H1_rand_model), model_AIC),
  
    summary = pmap(list(H1_1_sig, H1_0_model, H1_1_model), model_summary),
    summary_lat = pmap(list(H1_lat_sig, H1_0_model, H1_lat_model), model_summary),
    summary_rand = pmap(list(H1_rand_sig, H1_0_model, H1_rand_model),
                        model_summary),
  
    table = pmap(list(name, H1_1_p, summary, metric_type, 
                           response, family), model_table),
    table_lat = pmap(list(name, H1_lat_p, summary_lat, metric_type, 
                           response, family), model_table),
    table_lat = map(table_lat, \(x) if(is.null(x)) return() else x |>
                  mutate(across(any_of("lat"), \(x) x*10))),
    table = map2(table, H1_phot_sig, \(x,y) if(is.null(x)) return() else x |>
                  mutate(dAIC_photoperiod = ifelse(y, 0, -3))),
    table_rand = pmap(list(name, H1_rand_p, summary_rand, metric_type, 
                           response, family),model_table),
    table = pmap(list(table, table_lat, AICs, "lat", "H1_1", "H1_lat"),
                 H1_combining_tables),
    table = pmap(list(table, table_rand, AICs, "SD Site", "H1_1", "H1_rand"),
                 H1_combining_tables),
    )

refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)

Adjusted csum-coding for R1

In [12]:

Code cell - adjust model parameters for sum-coding of site

#| code-summary: Code cell - adjust model parameters for sum-coding of site
metrics <- 
  metrics |> 
  mutate(
    site_coefs = map2(H1_1_model, H1_1_sig, sum_coding),
    table = map2(table, site_coefs, merge_results)
  )

R^2

In [13]:

Code cell - calculate R^2

#| warning: false
#| code-summary: Code cell - calculate R^2
R2_table <- 
  metrics |> 
  filter_out(is.na(response)) |> 
  mutate(
    name = name,
    metric_type = metric_type,
    H1_1_r2 = pmap(list(H1_1_model, data, family, response), r2_helper), 
    H1_0_r2 = pmap(list(H1_0_model, data, family, response), r2_helper), 
    H1_phot_r2 = pmap(list(H1_phot_model, data, family, response), r2_helper), 
    H1_lat_r2 = pmap(list(H1_lat_model, data, family, response), r2_helper), 
    H1_phot_sig = H1_phot_sig,
    H1_lat_sig = H1_lat_sig,
    H1_1_sig = H1_1_sig,
    .keep = "none"
  )

In [14]:

Code cell - choose relevant R^2

#| code-summary: Code cell - choose relevant R^2
R2_table <- 
  R2_table |> 
  rowwise() |> 
  mutate(
    name = name,
    metric_type = metric_type,
    marginal_r2 = ifelse(H1_1_sig, H1_1_r2$R2_marginal, H1_0_r2$R2_marginal),
    conditional_r2 = ifelse(H1_1_sig, H1_0_r2$R2_conditional, H1_0_r2$R2_conditional),
    site_r2 = ifelse(H1_1_sig, H1_1_r2$R2_marginal - H1_0_r2$R2_marginal, NA),
    phot_r2 = ifelse(H1_phot_sig, H1_1_r2$R2_marginal - H1_phot_r2$R2_marginal, NA),
    rand_r2 = ifelse(H1_1_sig, H1_1_r2$R2_conditional - H1_1_r2$R2_marginal, #
                     H1_0_r2$R2_conditional - H1_0_r2$R2_marginal),
    rand_r2 = ifelse(rand_r2 == 0, NA, rand_r2),
    lat_r2 = ifelse(H1_lat_sig, H1_lat_r2$R2_marginal - H1_0_r2$R2_marginal, NA),
    .keep = "none"
  ) |> 
  mutate(across(c(marginal_r2, conditional_r2), 
                \(x) ifelse(metric_type == "dynamics", NA, x)
                ),
             phot_r2 = ifelse(is.na(site_r2) & !is.na(conditional_r2), marginal_r2, phot_r2)
         )

Results

In [15]:

Code cell - Table H1

#| message: false
#| warning: false
#| code-summary: Code cell - Table H1
#| label: tbl-H1
#| tab-name: Hypothesis 1 results
H1_tab <- 
  H1_table(metrics, H1_fdr_n)|> 
  cols_width(
    stub() ~ px(232),
    photoperiod ~ px(120),
    everything() ~ px(95)
  )
H1_tab 
#| message: false
#| warning: false
#| code-summary: Code cell - Table H1
#| label: tbl-H1
#| tab-name: Hypothesis 1 results
gtsave(H1_tab, "tables/H1.png", vwidth = 2000)

file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpXhfCfn/file13f9349b7e454.html screenshot completed

Code cell - Table H1

#| message: false
#| warning: false
#| code-summary: Code cell - Table H1
#| label: tbl-H1
#| tab-name: Hypothesis 1 results
save(metrics, file = "data/H1_results.RData")

In [16]:

	Coefs are²	p-value³	Intercept	Site coefficients¹									Photoperiod^4,5	SD Participant	SD Residual⁶	Comparative models
Model results for Hypothesis 1
Hypothesis: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.
	Coefs are²	p-value³	Intercept	Borås (SE)	Delft (NL)	Dortmund (DE)	Tübingen (DE)	Munich (DE)	Madrid (ES)	Izmir (TR)	San José (CR)	Kumasi (GH)	Photoperiod^4,5	SD Participant	SD Residual⁶	Latitude^4,7	SD Site^4,7
Duration
Duration above 1000	multiplied	>0.9	5m 43s	—	—	—	—	—	—	—	—	—	1.17×	1.84×	—	—	—
Duration above 250 wake	multiplied	0.062	38m 26s	—	—	—	—	—	—	—	—	—	1.10×	1.77×	—	—	—
Duration below 10 pre-sleep	multiplied	>0.9	2h 31m	—	—	—	—	—	—	—	—	—	0.97×	1.40×	—	—	—
Duration below 1 sleep	multiplied	>0.9	8h 57m	—	—	—	—	—	—	—	—	—	0.98×	1.20×	—	—	—
Period above 250	multiplied	>0.9	10m 27s	—	—	—	—	—	—	—	—	—	1.10×	1.63×	2.12×	—	—
Dynamics⁸
Interdaily stability	logit scale	>0.9	−0.15	—	—	—	—	—	—	—	—	—	−0.05	—	—	—	—
Intradaily variability	additive	>0.9	1.33	—	—	—	—	—	—	—	—	—	0.00	—	—	—	—
Exposure History
Dose	multiplied	>0.9	356	—	—	—	—	—	—	—	—	—	1.19×	2.11×	3.77×	—	—
Level
Mean	multiplied	0.007	4.70	1.03×	0.95×	0.91×	1.47×	0.98×	1.03×	1.34×	1.14×	0.49×	1.18×	1.82×	2.02×	1.15×	1.32×
Brightest 10h mean	multiplied	0.040	88.57	1.46×	1.08×	0.95×	1.46×	0.72×	1.42×	1.09×	1.01×	0.40×	1.25×	2.22×	3.17×	1.24×	—
Darkest 10h mean	multiplied	<0.001	0.23	0.74×	0.85×	0.83×	1.34×	1.33×	0.75×	1.48×	1.43×	0.67×	1.09×	1.63×	1.60×	—	1.34×
Spectrum
MDER	additive	0.3	0.48	—	—	—	—	—	—	—	—	—	0.02	0.08	0.08	—	—
Timing
Brightest 10h midpoint	additive	0.002	13:50	−35m 30s	10m 42s	12m 54s	10m 5s	39m 16s	38m 42s	34m 6s	−1h 7m	−43m 6s	2m 33s	50m 26s	1h 34m	12m 7s	32m 15s
Darkest 10h midpoint	additive	0.011	02:43	−23m 16s	−7m 58s	9m 34s	13m 12s	58m 1s	7m 27s	34m 45s	−34m 51s	−56m 54s	−8m 37s	51m 23s	1h 38m	—	—
First timing above 250	additive	0.4	11:23	—	—	—	—	—	—	—	—	—	−8m 18s	1h 14m	2h 6m	—	—
Last timing above 250	additive	<0.001	17:37	20m 25s	20m 1s	10m 5s	−6m 33s	44m 3s	1h 23m	54m 17s	−2h 5m	−1h 40m	19m 50s	1h 4m	2h 2m	28m 29s	1h 3m
Mean timing above 250	additive	<0.001	13:18	−14m 19s	31m 39s	13m 57s	15m 15s	25m 24s	53m 27s	53m 12s	−1h 53m	−1h 4m	7m 27s	33m 11s	1h 34m	21m 17s	52m 9s
¹ Sites have only entries if the (adjusted) p-value is significant, otherwise the results for the null-model are shown. Entries show the site-specific coefficient (relative to the overall mean level). Grey entries indicate a non-significant difference from the Overall mean.
² multiplied: coefficients are multiplicative. This comes from the fact that the models were calculated on a log scale and backtransformed in the table. The Intercept can be read as linear scale, with coefficients being multiplied with the intercept. additive: All values are on the linear scale and no transformation is necessary. logit scale: data were modelled on the logit scale. Coefficients can be added to the intercept, but have to be transformed with the logistic function.
³ p-values are adjusted for multiple comparisons using the false-discovery-rate for n= 17 comparisons
⁴ Photoperiod: coefficient per 1-hour change in photoperiod. Latitude: coefficient per 10° change in latitude. SD Site: random effect standard deviation for site.
⁵ If the coefficient is greyed out it means it is not significant in a model base minus photoperiod.
⁶ Models with a tweedie error distribution do not contain a residual standard deviation, as a dispersion parameter is estimated instead during modeling.
⁷ These coefficients (Latitude) and random effects (SD Site) come from two respective separate models that did not contain a fixed effect of site. If the random site or fixed latitude effect is significant, a value is presented. If AIC indicates a significant improvement over the base model (site as fixed effect), it is shown in bold. If the base model is significantly stronger, however, values are shown in grey.
⁸ Dynamics-based metrics are calculated per participant and thus are using linear/generalized models instead of the mixed-model framework

Interpretation

@tbl-H1 shows a comprehensive overview of the main inference results.

Timing-based metrics show a significant site-specific effect that is neither dependent on photoperiod, nor latitude
While level-based metrics show a significant site-specific effect, it can be drawn back to photoperiod and latitude for daytime metrics (but not for nighttime)
Duration-based metrics are photoperiod-dependent, but not site specific, same as exposure history and spectrum - these metrics are rather more dependent on inter- and intraindividual differences of participants
In general, interindividual differences highly relevant and trump site specific factors, although usually not as large as intraindividual differences
None of the above means that individual sites can not be significantly different from the group or another site. But rather that the variance across all nine sites is or is not sufficiently large.

R2

In [17]:

Code cell - R^2 table

#| code-summary: Code cell - R^2 table
R2_tbl <- 
R2_table |> 
  mutate(metric_type = str_to_title(metric_type)) |> 
  arrange(metric_type, name) |> 
  group_by(metric_type) |> 
  gt(rowname_col = "name") |> 
  cols_add(Residual = 1 - conditional_r2) |> 
  fmt_percent(decimals = 0) |>
  sub_missing() |> 
  tab_header(md("**H1: Variance explained by the model parameters ($R^{2}$)**")) |> 
  fmt(name, 
      rows = name != "MDER",
      fns = \(x) x |> str_to_sentence() |> str_replace_all("_", " ")) |> 
  cols_move(rand_r2, after = lat_r2) |> 
  cols_move(marginal_r2, after = conditional_r2) |> 
  cols_label(
    conditional_r2 = "Model",
    rand_r2 = "Participants",
    marginal_r2 = "Fixed effects",
    site_r2 = "Site",
    phot_r2 = "Photoperiod",
    lat_r2 = "Latitude"
  ) |> 
  tab_spanner("Unique contribution", columns = c(site_r2, phot_r2, lat_r2)) |> 
  cols_width(
    name ~ px(230),
    phot_r2 ~ px(100),
    c(rand_r2, Residual) ~ px(90),
    everything() ~ px(70)) |> 
  grand_summary_rows(
    columns = where(is.numeric),
    fns = list(
      `Grand average` ~ mean(., na.rm = TRUE)
    ),
        side = "top",
    fmt = ~ fmt_percent(., decimals = 0)
  ) |> 
  tab_style(
    cell_text(align = "center"),
    list(cells_body(),
         cells_grand_summary(),
         cells_column_labels())
  ) |> 
    tab_style(
    cell_text(weight = "bold"),
    list(cells_column_labels(),
         cells_row_groups(),
         cells_grand_summary(),
         cells_stub_grand_summary())
  ) |> 
  tab_footnote(
    md("*Model*: conditional $R^2$, *Participants*: partial $R^2$ due to interindividual differences on a participant level, Fixed effects: marginal $R^2$ (all fixed effects combined, based on a model of $site + photoperiod$), *Site/Photoperiod/Latitude*: unique contribution of the given parameter relative to a model without it. *Residual*: proportion of residual variance (e.g., intraindividual differences)")
  )

In [18]:

Code

R2_tbl

	Model	Fixed effects	Unique contribution			Participants	Residual
H1: Variance explained by the model parameters (R^{2})
	Model	Fixed effects	Site	Photoperiod	Latitude	Participants	Residual
Grand average	40%	12%	10%	7%	5%	28%	60%
Duration
Duration above 1000	49%	17%	—	17%	—	33%	51%
Duration above 250 wake	51%	9%	—	9%	—	42%	49%
Duration below 10 pre-sleep	31%	2%	—	2%	—	29%	69%
Duration below 1 sleep	38%	3%	—	3%	—	35%	62%
Period above 250	35%	8%	—	8%	—	27%	65%
Dynamics
Interdaily stability	—	—	—	—	—	—	—
Intradaily variability	—	—	—	—	—	—	—
Exposure History
Dose	31%	9%	—	9%	—	22%	69%
Level
Mean	55%	24%	7%	8%	3%	32%	45%
Brightest 10h mean	43%	18%	6%	6%	4%	26%	57%
Darkest 10h mean	62%	23%	15%	3%	—	40%	38%
Spectrum
MDER	54%	13%	—	13%	—	41%	46%
Timing
Brightest 10h midpoint	27%	8%	8%	—	2%	20%	73%
Darkest 10h midpoint	27%	9%	6%	—	—	19%	73%
First timing above 250	28%	2%	—	2%	—	25%	72%
Last timing above 250	39%	23%	12%	5%	6%	17%	61%
Mean timing above 250	25%	17%	14%	1%	7%	9%	75%
Model: conditional R^2, Participants: partial R^2 due to interindividual differences on a participant level, Fixed effects: marginal R^2 (all fixed effects combined, based on a model of site + photoperiod), Site/Photoperiod/Latitude: unique contribution of the given parameter relative to a model without it. Residual: proportion of residual variance (e.g., intraindividual differences)

Code

gtsave(R2_tbl, "tables/H1_R2_tbl.png")

file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpXhfCfn/file13f93229869a1.html screenshot completed

Overall model fit varies widely (≈25–62%), with highest explanatory power for level-based metrics (e.g., darkest/mean light levels) and lowest for timing metrics.
Fixed effects explain a modest share (≈2–24%), indicating environmental factors (site, photoperiod, latitude) contribute but are not dominant drivers.
Individual (participant-level) differences account for a large portion of variance (≈20–42%), often exceeding fixed-effect contributions.
Site effects are generally stronger than photoperiod, particularly for level metrics, but both show overlapping (shared) explanatory power.
Latitude contributes only marginally where included, suggesting limited additional explanatory value beyond photoperiod.
Residual variance remains high across most metrics (≈38–75%), indicating substantial unexplained within-individual or day-to-day variability.

Model summaries

The following tabs show the model summaries for either

\text{Metric} \sim \text{Site} + \text{Photoperiod} + (1|\text{Site}:\text{Participant})

\text{Metric} \sim \text{Photoperiod} + (1|\text{Site}:\text{Participant})

depending on whether site is significant

Model summary

Characteristic	Beta	95% CI	p-value
photoperiod	-0.05	-0.07, -0.02	<0.001
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
Model 1: qlogis(metric) ~ photoperiod
Model 2: qlogis(metric) ~ site + photoperiod	0.12

Model summary

Characteristic	Beta	95% CI	p-value
photoperiod	0.00	-0.03, 0.02	0.8
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
Model 1: metric ~ photoperiod
Model 2: metric ~ site + photoperiod	0.2

Model summary

Characteristic	Beta	95% CI
site
BAUA	0.00	—
FUSPCEU	0.05	-0.18, 0.29
IZTECH	0.17	-0.05, 0.38
KNUST	-0.27	-0.50, -0.04
MPI	0.21	0.00, 0.42
RISE	0.05	-0.15, 0.26
THUAS	0.02	-0.19, 0.23
TUM	0.03	-0.19, 0.26
UCR	0.10	-0.19, 0.38
photoperiod	0.07	0.05, 0.10
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 \| site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 \| site:Id)	<0.001

Model summary

Characteristic	Beta	95% CI
site
BAUA	0.00	—
FUSPCEU	0.18	-0.15, 0.51
IZTECH	0.06	-0.24, 0.36
KNUST	-0.37	-0.69, -0.05
MPI	0.19	-0.10, 0.48
RISE	0.19	-0.10, 0.48
THUAS	0.06	-0.24, 0.35
TUM	-0.12	-0.43, 0.19
UCR	0.03	-0.37, 0.43
photoperiod	0.10	0.06, 0.13
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 \| site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 \| site:Id)	0.002

Model summary

Characteristic	Beta	95% CI
site
BAUA	0.00	—
FUSPCEU	0.43	-0.44, 1.3
IZTECH	0.35	-0.45, 1.2
KNUST	-0.93	-1.8, -0.08
MPI	-0.05	-0.82, 0.72
RISE	-0.81	-1.6, -0.05
THUAS	-0.04	-0.81, 0.74
TUM	0.44	-0.38, 1.3
UCR	-1.3	-2.4, -0.27
photoperiod	0.04	-0.06, 0.14
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id)
model2: metric ~ site + photoperiod + (1 \| site:Id)	<0.001

Model summary

Characteristic	Beta	95% CI
site
BAUA	0.00	—
FUSPCEU	-0.04	-0.23, 0.14
IZTECH	0.25	0.08, 0.42
KNUST	-0.09	-0.28, 0.09
MPI	0.21	0.04, 0.37
RISE	-0.05	-0.22, 0.11
THUAS	0.01	-0.15, 0.18
TUM	0.20	0.03, 0.38
UCR	0.23	0.01, 0.46
photoperiod	0.04	0.01, 0.06
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 \| site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 \| site:Id)	<0.001

Model summary

Characteristic	Beta	95% CI
site
BAUA	0.00	—
FUSPCEU	-0.04	-0.93, 0.86
IZTECH	0.42	-0.40, 1.2
KNUST	-1.1	-2.0, -0.23
MPI	0.06	-0.73, 0.85
RISE	-0.55	-1.3, 0.23
THUAS	-0.29	-1.1, 0.50
TUM	0.81	-0.04, 1.7
UCR	-0.74	-1.8, 0.35
photoperiod	-0.14	-0.25, -0.04
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id)
model2: metric ~ site + photoperiod + (1 \| site:Id)	<0.001

Model summary

Warning: the 'nobars' function has moved to the reformulas package. Please update your imports, or ask an upstream package maintainer to do so.
This warning is displayed once per session.

ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.

Characteristic	Beta	95% CI	p-value
cond
photoperiod	0.16	0.12, 0.21	<0.001
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 \| site:Id), zi=~0, disp=~1	0.058

Model summary

Characteristic	Beta	95% CI
photoperiod	0.04	0.03, 0.06
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 \| site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 \| site:Id)	0.3

Model summary

Characteristic	Beta	95% CI
site
BAUA	0.00	—
FUSPCEU	0.66	-0.06, 1.4
IZTECH	0.65	0.00, 1.3
KNUST	-1.3	-2.0, -0.60
MPI	0.02	-0.61, 0.66
RISE	-0.47	-1.1, 0.14
THUAS	0.29	-0.34, 0.93
TUM	0.19	-0.48, 0.86
UCR	-2.1	-3.0, -1.3
photoperiod	0.12	0.04, 0.21
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id)
model2: metric ~ site + photoperiod + (1 \| site:Id)	<0.001

Model summary

Characteristic	Beta	95% CI
photoperiod	-0.14	-0.23, -0.04
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id)
model2: metric ~ site + photoperiod + (1 \| site:Id)	0.026

Model summary

Characteristic	Beta	95% CI
site
BAUA	0.00	—
FUSPCEU	1.2	0.08, 2.4
IZTECH	0.74	-0.31, 1.8
KNUST	-1.8	-3.0, -0.73
MPI	-0.28	-1.3, 0.73
RISE	0.17	-0.81, 1.2
THUAS	0.17	-0.84, 1.2
TUM	0.57	-0.50, 1.6
UCR	-2.3	-3.6, -0.88
photoperiod	0.33	0.20, 0.46
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id)
model2: metric ~ site + photoperiod + (1 \| site:Id)	<0.001

Model summary

Characteristic	Beta	95% CI
photoperiod	0.08	0.05, 0.10
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 \| site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 \| site:Id)	0.2

Model summary

Characteristic	Beta	95% CI
photoperiod	0.02	0.01, 0.02
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id)
model2: metric ~ site + photoperiod + (1 \| site:Id)	0.016

Model summary

ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.

Characteristic	Beta	95% CI	p-value
cond
photoperiod	-0.03	-0.06, -0.01	0.017
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 \| site:Id), zi=~0, disp=~1	0.2

Model summary

ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.

Characteristic	Beta	95% CI	p-value
cond
photoperiod	-0.02	-0.03, -0.01	0.005
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 \| site:Id), zi=~0, disp=~1	0.069

Model summary

ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.

Characteristic	Beta	95% CI	p-value
cond
photoperiod	0.10	0.06, 0.14	<0.001
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).

Characteristic	p-value
model1: metric ~ photoperiod + (1 \| site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 \| site:Id), zi=~0, disp=~1	0.004

Hypothesis 2

Details

Hypothesis 2 states:

The variance of personal light exposure patterns (30 minute mean of melanopic EDI) by participants within sites is greater than the variance between sites.

Type of model: HGAM (Hierarchical GAM)

Base formula: \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant})

Additional notes: - Site effects are modelled with the sz basis, that allows for the most identifiable modelling of parametric differences - Participant effects are modelled with the fs basis, which models the participant-effects as random smooths - melanopic EDI is scaled with log_zero_inflated() before model fitting.

Specific formulas: - \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant}) + s(\text{Time}, \text{Photoperiod}) (to test photoperiod effects) - \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Participant}) (comparison to the base formula)

False-discovery-rate correction: 1 (there is only one main comparison that tests the effect of site)

Preparation

In [19]:

Code

Load prepared data

#| filename: Load prepared data
load("data/metrics_separate_glasses.RData")
load("data/preprocessed_glasses_2.RData")

metric_participanthour <-
  metric_glasses_participanthour |>
  add_Time_col() |> 
  ungroup() |> 
  mutate(Time = as.numeric(Time)/3600 + 0.25,
         across(c(site, Id, sleep, State.Brown, wear, photoperiod.state),
                fct),
         Date = factor(Date),
         Id_date = interaction(Id, Date),
         lzMEDI = log_zero_inflated(MEDI),
         photoperiod = (dusk - dawn) |> as.numeric()
  ) |> 
  group_by(Id) |> 
  mutate(AR.start = ifelse(row_number() == 1, TRUE, FALSE))

In [20]:

Code

Overview plot of the data (median with IQR)

#| fig-height: 12
#| fig-width: 12
#| filename: Overview plot of the data (median with IQR)
metric_participanthour |>
  group_by(site) |> 
  aggregate_Date(
    numeric.handler = \(x) median(x, na.rm = TRUE),
    p75 = quantile(MEDI, p=0.75, na.rm = TRUE),
    p25 = quantile(MEDI, p=0.25, na.rm = TRUE)
  ) |> 
  site_conv_mutate() |> 
  gg_doubleplot(geom = "blank", jco_color = FALSE, facetting = FALSE,
                x.axis.breaks.repeat = ~Datetime_breaks(.x, by = "12 hours", shift =
    lubridate::duration(0, "hours"))) +
  scale_fill_manual(values = melidos_colors) +
  scale_color_manual(values = melidos_colors) +
  geom_ribbon(aes(ymin = p25, ymax = p75, color = site, fill = site)) +
  geom_line() +
  geom_hline(yintercept = 250, linetype = "dashed") +
  facet_wrap(~site, scales = "free_x") +
  guides(fill = "none", col = "none") +
  theme(panel.spacing = unit(2, "lines"))

In [21]:

Code

H2 formulas

#| filename: H2 formulas
H2_form_1 <- lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")
H2_form_0 <- lzMEDI ~ s(Time, k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")
H2_form_phot <- lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, photoperiod.state, bs = "sz", k=12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

Analysis

Base model (H2_1)

In [22]:

Code

Main model

#| filename: Main model
H2_model_1 <- 
  bam(H2_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H2_model_1, 
                      plot=TRUE)

Code

#| filename: Main model
H2_model_1 <- 
  bam(H2_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H2_model_1, split_pred = "AR.start")

Code

#| filename: Main model
H2_summary_1 <- summary(H2_model_1)
H2_summary_1


Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + 
    s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85202    0.02428   35.09   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf  Ref.df       F p-value    
s(Time)       10.88   10.98 246.416  <2e-16 ***
s(Time,site)  67.21   74.27   5.910  <2e-16 ***
s(Time,Id)   743.18 1392.00   3.605  <2e-16 ***
s(Id_date)   350.83  800.00   0.848  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.4%
fREML =  33251  Scale est. = 0.50672   n = 37603

Code

#| filename: Main model
H2_model_1 |> glance()

# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1173. -31318. 65000. 75086.   20059.      36430. 37603         0.767  2327

Code

#| filename: Main model
H2_model_1 |> appraise()

In [23]:

Code

Variance explained

#| filename: Variance explained
Xp <- predict(H2_model_1, type = "terms")
variance <- apply(Xp, 2, var)
cor(Xp)

                  s(Time) s(Time,site)  s(Time,Id)   s(Id_date)
s(Time)      1.000000e+00 3.414920e-02 0.003809311 4.374669e-05
s(Time,site) 3.414920e-02 1.000000e+00 0.019659500 3.124536e-05
s(Time,Id)   3.809311e-03 1.965950e-02 1.000000000 1.432294e-01
s(Id_date)   4.374669e-05 3.124536e-05 0.143229435 1.000000e+00

Code

#| filename: Variance explained
H2_R2 <- (variance/sum(variance)) |> enframe(value = "R2", name = "parameter")
H2_R2

# A tibble: 4 × 2
  parameter        R2
  <chr>         <dbl>
1 s(Time)      0.843 
2 s(Time,site) 0.0489
3 s(Time,Id)   0.0984
4 s(Id_date)   0.0100

Null model (H2_0)

In [24]:

Code

Null model

#| filename: Null model
H2_model_0 <- 
  bam(H2_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 8
      )

Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has repeated
1-d smooths of same variable.

Code

#| filename: Null model
r1 <- start_value_rho(H2_model_0, 
                      plot=TRUE)

Code

#| filename: Null model
H2_model_0 <- 
  bam(H2_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has repeated
1-d smooths of same variable.

Code

#| filename: Null model
acf_resid(H2_model_0, split_pred = "AR.start")

Code

#| filename: Null model
H2_summary_0 <- summary(H2_model_0)
H2_summary_0


Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, Id, bs = "fs") + s(Id_date, 
    bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85639    0.03083   27.78   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
              edf  Ref.df       F p-value    
s(Time)     10.88   10.97 288.872  <2e-16 ***
s(Time,Id) 888.54 1408.00   5.478  <2e-16 ***
s(Id_date) 332.29  809.00   0.788  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.768   Deviance explained = 77.5%
fREML =  33343  Scale est. = 0.5066    n = 37603

Code

#| filename: Null model
H2_model_0 |> glance()

# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1233. -31283. 65040. 75595.   19961.      36370. 37603         0.768  2231

Code

#| filename: Null model
H2_model_0 |> appraise()

Code

#| filename: Null model
H2_model_0 |> draw()

Photoperiod model

In [25]:

Code

Photoperiod model

#| filename: Photoperiod model
H2_model_phot <- 
  bam(H2_form_phot, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H2_model_phot, 
                      plot=TRUE)

Code

#| filename: Photoperiod model
H2_model_phot <- 
  bam(H2_form_phot, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H2_model_phot, split_pred = "AR.start")

Code

#| filename: Photoperiod model
H2_summary_phot <- summary(H2_model_phot)
H2_summary_phot


Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + 
    s(Time, photoperiod.state, bs = "sz", k = 12) + s(Time, Id, 
    bs = "fs") + s(Id_date, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.74254    0.03972    18.7   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                              edf   Ref.df       F p-value    
s(Time)                    10.728   10.859 189.248  <2e-16 ***
s(Time,site)               65.220   72.433   5.497  <2e-16 ***
s(Time,photoperiod.state)   6.372    7.415   6.726  <2e-16 ***
s(Time,Id)                739.267 1392.000   3.565  <2e-16 ***
s(Id_date)                352.959  800.000   0.857  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.5%
fREML =  33229  Scale est. = 0.50527   n = 37603

Code

#| filename: Photoperiod model
H2_model_phot |> glance()

# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1176. -31291. 64952. 75069.   20014.      36427. 37603         0.767  2339

Code

#| filename: Photoperiod model
H2_model_phot |> appraise()

Code

#| filename: Photoperiod model
H2_model_phot |> draw()

Model comparison

In [26]:

Code

H2_AICs <- AIC(H2_model_1, H2_model_0, H2_model_phot)
H2_AICs |> as_tibble(rownames = "model") |> gt()

model	df	AIC
H2_model_1	1181.730	64999.75
H2_model_0	1236.712	65039.74
H2_model_phot	1185.365	64952.13

Since the ∆AIC of the base model against the null model is negative and lower than two (< -2), the base model is considered significantly better to explain the variance. The photoperiod model is even better and will be used henceforth.

In [27]:

Code

Variance explained

#| filename: Variance explained
Xp <- predict(H2_model_phot, type = "terms")
variance <- apply(Xp, 2, var)

cor(Xp)

                                s(Time) s(Time,site) s(Time,photoperiod.state)
s(Time)                    1.0000000000 3.514387e-02               0.333386974
s(Time,site)               0.0351438694 1.000000e+00               0.070097241
s(Time,photoperiod.state)  0.3333869744 7.009724e-02               1.000000000
s(Time,Id)                 0.0043575648 2.105299e-02               0.020739525
s(Id_date)                -0.0001540493 1.501554e-05               0.001913428
                           s(Time,Id)    s(Id_date)
s(Time)                   0.004357565 -1.540493e-04
s(Time,site)              0.021052988  1.501554e-05
s(Time,photoperiod.state) 0.020739525  1.913428e-03
s(Time,Id)                1.000000000  1.441526e-01
s(Id_date)                0.144152613  1.000000e+00

Code

#| filename: Variance explained
H2_R2 <- (variance/sum(variance)) |> enframe(value = "partial R2", name = "parameter")
H2_R2 |> gt() |> fmt_percent()

parameter	partial R2
s(Time)	83.61%
s(Time,site)	4.87%
s(Time,photoperiod.state)	0.34%
s(Time,Id)	10.13%
s(Id_date)	1.05%

Model visualization

In [28]:

Code

prepare photoperiod model H2 plot

#| filename: prepare photoperiod model H2 plot
H2_site_data <- 
H2_model_phot |> 
  conditional_values(
    condition = list(Time = seq(0.25, 23.75, by = 0.5), "site", "photoperiod.state"),
    exclude = c("s(Time,Id)", "s(Id_date)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  )

H2_site_data <-
metric_participanthour |> 
  Datetime2Time() |> 
  group_by(site, Time) |> 
  summarize(
    .groups = "drop_last",
    dawn = dawn |> as.numeric() |> mean(na.rm = TRUE)/3600,
    dusk = dusk |> as.numeric() |> mean(na.rm = TRUE)/3600
  ) |> 
  mutate(
    photoperiod.state = ifelse(between(Time, dawn, dusk), "day", "night")
  ) |> 
  select(-c(dawn, dusk)) |> 
  left_join(H2_site_data, by = c("site", "Time", "photoperiod.state"))

H2_site_data_phot <- 
H2_site_data |> 
  number_states(photoperiod.state) |> 
  filter(photoperiod.state == "night") |> 
  group_by(site, photoperiod.state.count) |> 
  summarize(xmin = min(Time),
            xmax = max(Time),
            .groups = "drop_last") |> 
    site_conv_mutate(rev = FALSE)

In [29]:

Code

Visualize photoperiod model H2

#| filename: Visualize photoperiod model H2
#| fig-height: 9
#| fig-width: 10
source("scripts/Brown_bracket.R")

Loading required package: legendry

Loading required package: ggtext

Code

#| filename: Visualize photoperiod model H2
#| fig-height: 9
#| fig-width: 10
H2_pred <- 
H2_site_data |> 
  site_conv_mutate(rev = FALSE) |>
  ggplot(aes()) +
  map(c(1,10,250, 10^4), 
          \(x) geom_hline(aes(yintercept = x), 
                          col = "grey", linetype = "dashed")
      ) +
  geom_rect(data = H2_site_data_phot ,
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(x = Time, ymin = .lower_ci, ymax = .upper_ci, fill = site), alpha = 0.5) +
  geom_line(aes(x = Time, y = .fitted, col = site), linewidth = 0.1) +
  geom_line(aes(x = Time, y = .fitted), linewidth = 1) +
  scale_fill_manual(values = melidos_colors) +
  scale_color_manual(values = melidos_colors) +
  theme_cowplot() +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,10^4), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)", fill = "Site", col = "Site") +
  facet_wrap(~site) +
  # guides( y = guide_axis_stack(Brown_bracket, "axis")) +
  gghighlight(
    label_key = site,
    label_params = list(x = 12, y = 4.5, force = 0, hjust = 0.25)
  ) +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple())
  # labs(caption = Brown_caption)

In [30]:

Code

Visualize Individual

#| filename: Visualize Individual
#| fig-height: 5
#| fig-width: 8.5
P_individual <- 
H2_model_phot |> 
  conditional_values(
    condition = list("Time", "Id"),
    exclude = c("s(Time,site)", "s(Id_date)", "s(Time,photoperiod.state)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  ) |> 
  group_by(Id) |>
  select(-site) |> 
  separate_wider_delim(Id, delim = "_", names = c("site","Id")) |> 
  site_conv_mutate() |> 
  ggplot(aes(x = Time)) +
  map(c(1,10,250, 10^4), 
          \(x) geom_hline(aes(yintercept = x), col = "grey", linetype = "dashed")
      ) +
  # geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, group = Id), alpha = 0.025) +
  geom_line(aes(y = .fitted, group = interaction(Id, site), color = site), 
            linewidth = 1, alpha = 0.35) +
  theme_cowplot() +
  scale_color_manual(values = melidos_colors) +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,1.5*10^4), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)") +
  guides( y = guide_axis_stack(Brown_bracket, "axis"), color = "none") +
  theme(plot.caption = element_textbox_simple())
  # labs(caption = Brown_caption)

In [31]:

Code

prepare time model

#| filename: prepare time model
H2_data <- 
H2_model_phot |> 
  conditional_values(
    condition = list(Time = seq(0.25, 23.75, by = 0.5), "photoperiod.state"),
    exclude = c("s(Time,site)", "s(Time,Id)", "s(Id_date)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  )

H2_data <-
metric_participanthour |> 
  Datetime2Time() |> 
  group_by(Time) |> 
  summarize(
    .groups = "drop_last",
    dawn = dawn |> as.numeric() |> mean(na.rm = TRUE)/3600,
    dusk = dusk |> as.numeric() |> mean(na.rm = TRUE)/3600
  ) |> 
  mutate(
    photoperiod.state = ifelse(between(Time, dawn, dusk), "day", "night")
  ) |> 
  select(-c(dawn, dusk)) |> 
  left_join(H2_data, by = c("Time", "photoperiod.state"))

H2_data_phot <- 
H2_data |> 
  number_states(photoperiod.state) |> 
  filter(photoperiod.state == "night") |> 
  group_by(photoperiod.state.count) |> 
  summarize(xmin = min(Time),
            xmax = max(Time),
            .groups = "drop_last")

In [32]:

Code

Visualize Time

#| filename: Visualize Time
#| fig-height: 5
#| fig-width: 8.5
P_time <- 
H2_data |> 
  ggplot(aes(x = Time)) +
  map(c(1,10, 100, 250, 10^3), 
          \(x) geom_hline(aes(yintercept = x), col = "grey", linetype = "dashed")
      ) +
  geom_rect(data = H2_data_phot ,
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci), alpha = 0.5) +
  geom_line(aes(y = .fitted), 
            linewidth = 1, alpha = 1) +
  theme_cowplot() +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,1.5*10^3), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)") +
  guides( y = guide_axis_stack(Brown_bracket, "axis"), color = "none") +
  theme(plot.caption = element_textbox_simple())
  # labs(caption = Brown_caption)

In [33]:

Code

Visualize model terms

#| filename: Visualize model terms

P_terms <-
H2_model_phot |> 
  smooth_estimates(select = "s(Time,site)") |> 
  mutate(.lower_ci = .estimate -1.96*.se,
         .upper_ci = .estimate +1.96*.se,
         .fitted = .estimate,
         signif = .lower_ci > 0 | .upper_ci < 0,
         signif_id = consecutive_id(signif),
         .by = site) |> 
  site_conv_mutate(rev = FALSE) |>
  ggplot(aes()) +
  map(c(log10(c(0.1, 0.2, 0.5, 2, 5, 10))),
      \(x) geom_hline(yintercept = x, linewidth = 0.5, col = "grey", linetype = 2)) +
  geom_rect(data = H2_site_data_phot ,
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(x = Time, ymin = .lower_ci, ymax = .upper_ci, fill = site), alpha = 0.5) +
  geom_hline(yintercept = 0, linewidth = 1, linetype = 2) +
  geom_line(aes(x = Time, y = .fitted), 
            linewidth = 1) +
  geom_line(data = \(x) filter(x, signif),
            aes(x = Time, y = .fitted, group = signif_id), 
            col = "red", linewidth = 1) +
  scale_fill_manual(values = melidos_colors) +
  # scale_color_manual(values = c(melidos_colors)) +
  theme_cowplot() +
  scale_y_continuous(labels = expression(10^-1, 5^-1, 2^-1, "1  ", "2  ", "5  ", "10  "), 
                     breaks = log10(c(0.1, 0.2, 0.5, 1, 2, 5, 10)))+
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), expand = FALSE) +
  guides(fill = "none") +
  labs(y = "Factor", x = "Local time (hrs)", fill = "Site") +
  facet_wrap(~site) +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple())

In [34]:

Code

Combination of partial plots H2

#| fig-height: 17
#| fig-width: 13
#| filename: Combination of partial plots H2

((P_time / P_individual) | (H2_pred / P_terms)) +
  plot_annotation(tag_levels = "A", caption = Brown_caption,
                  theme = theme(plot.caption = element_textbox_simple())) +
  plot_layout()

Code

#| fig-height: 17
#| fig-width: 13
#| filename: Combination of partial plots H2

ggsave("figures/Fig4.pdf", height = 17, width = 13)
ggsave("figures/Fig4.png", height = 17, width = 13)

Interpretation

There is an overall trend of time and photoperiod (panel A). Daytime melanopic EDI light levels are < 100 lx in the morning and late afternoon, and between 100 and about 200 lx between late morning and early afternoon. During the evening, light levels are below 10 lx melanopic EDI, and generally below 1 lx at night.

Individual deviations from this trend vary greatly, especially during the day (panel B)

Model predictions (panel C) and deviations from the global trend (panel D) of each site shows specific differences:

Delft (NL), Tübingen (DE), and San José (CR) do not show significant deviations from the global trend. In the case of San José (CR) this is likely due to the small number of participants, as can be seen by the wide confidence interval compared to the other sites
Boras (SE), Dortmund (DE), and partially Munich (DE) show similar patterns of higher mel EDI in the hours after dawn and before dusk. The difference is about a factor 2-5 higher compared to the global trend.
Madrid (ES), Kumasi (GH), and Izmir (TR) have a significantly reduced mel EDI around dawn (factor 2-5 reduction)
Madrid (ES) and Kumasi (GH) have significantly reduced mel EDI around dusk (factor 2-5 reduction)
Izmir (TR) show significantly hightened melanopic EDI in the late evening (factor 2-5 increase)
Kumasi (GH) has significantly lower daytime illuminance both in the afternoon and after dusk (factor 2-5 reduction)

Session info

In [35]:

Code cell - Session info

#| code-summary: Code cell - Session info
sessionInfo()

R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin20
Running under: macOS 26.3.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] ggtext_0.1.2        legendry_0.2.4      gtsummary_2.5.0    
 [4] gghighlight_0.5.0   glmmTMB_1.1.14      gt_1.3.0           
 [7] broom.mixed_0.2.9.7 rlang_1.2.0         patchwork_1.3.2    
[10] emmeans_2.0.3       gratia_0.11.2       itsadug_2.5        
[13] plotfunctions_1.5   broom_1.0.12        ggridges_0.5.7     
[16] cowplot_1.2.0       performance_0.16.0  lme4_2.0-1         
[19] Matrix_1.7-3        mgcv_1.9-4          nlme_3.1-168       
[22] LightLogR_0.10.2    melidosData_1.0.6   lubridate_1.9.5    
[25] forcats_1.0.1       stringr_1.6.0       dplyr_1.2.1        
[28] purrr_1.2.2         readr_2.2.0         tidyr_1.3.2        
[31] tibble_3.3.1        ggplot2_4.0.2       tidyverse_2.0.0    

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3   rstudioapi_0.18.0    DHARMa_0.4.7        
  [4] jsonlite_2.0.0       datawizard_1.3.0     magrittr_2.0.5      
  [7] estimability_1.5.1   farver_2.1.2         nloptr_2.2.1        
 [10] rmarkdown_2.31       ragg_1.5.2           fs_2.0.1            
 [13] vctrs_0.7.3          minqa_1.2.8          base64enc_0.1-6     
 [16] effectsize_1.0.2     htmltools_0.5.9      curl_7.0.0          
 [19] haven_2.5.5          sass_0.4.10          parallelly_1.47.0   
 [22] htmlwidgets_1.6.4    pbkrtest_0.5.5       sandwich_3.1-1      
 [25] zoo_1.8-15           TMB_1.9.21           commonmark_2.0.0    
 [28] lifecycle_1.0.5      pkgconfig_2.0.3      webshot2_0.1.2      
 [31] R6_2.6.1             fastmap_1.2.0        rbibutils_2.4.1     
 [34] future_1.70.0        digest_0.6.39        numDeriv_2016.8-1.1 
 [37] furrr_0.4.0          textshaping_1.0.5    labeling_0.4.3      
 [40] timechange_0.4.0     compiler_4.5.0       withr_3.0.2         
 [43] S7_0.2.1-1           backports_1.5.1      MASS_7.3-65         
 [46] tools_4.5.0          chromote_0.5.1       otel_0.2.0          
 [49] glue_1.8.0           promises_1.5.0       gridtext_0.1.6      
 [52] grid_4.5.0           see_0.13.0           generics_0.1.4      
 [55] gtable_0.3.6         labelled_2.16.0      tzdb_0.5.0          
 [58] websocket_1.4.4      hms_1.1.4            xml2_1.5.2          
 [61] utf8_1.2.6           ggrepel_0.9.8        pillar_1.11.1       
 [64] markdown_2.0         later_1.4.8          splines_4.5.0       
 [67] lattice_0.22-7       renv_1.1.4           tidyselect_1.2.1    
 [70] juicyjuice_0.1.0     knitr_1.51           reformulas_0.4.4    
 [73] V8_8.1.0             litedown_0.9         xfun_0.57           
 [76] statmod_1.5.1        stringi_1.8.7        yaml_2.3.12         
 [79] boot_1.3-31          ggokabeito_0.1.0     evaluate_1.0.5      
 [82] codetools_0.2-20     cli_3.6.6            systemfonts_1.3.2   
 [85] tweedie_3.0.17       xtable_1.8-8         parameters_0.28.3   
 [88] Rdpack_2.6.6         processx_3.8.7       mirai_2.6.1         
 [91] Rcpp_1.1.1-1         globals_0.19.1       coda_0.19-4.1       
 [94] nanonext_1.8.2       parallel_4.5.0       bayestestR_0.17.0   
 [97] mvnfast_0.2.8        listenv_0.10.1       broom.helpers_1.22.0
[100] mvtnorm_1.3-7        scales_1.4.0         insight_1.5.0       
[103] cards_0.7.1