Back to Article
Research Question 1
Download Source

Do sites differ in objectively measured personal light exposure? If so, does an individual’s behaviour and micro-environment have a stronger effect than site?

Author

Johannes Zauner

Last modified:

May 20, 2026

Preface

This document contains the analysis for Research Question 1, as defined in the preregistration.

Research Question 1:
Do sites differ in objectively measured personal light exposure? If so, does an individual’s behaviour and micro-environment have a stronger effect than site?

H1: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.

H2: The variance of personal light exposure patterns (hourly geometric mean of melanopic EDI) by participants within sites is greater than the variance between sites.

Setup

In this section, we load in the preprocessed and prepared data, site data, and load required packages.

In [1]:
Code
Setup
#| message: false
#| warning: false
#| filename: Setup
library(tidyverse)
Warning: package 'ggplot2' was built under R version 4.5.2
Warning: package 'tibble' was built under R version 4.5.2
Warning: package 'tidyr' was built under R version 4.5.2
Warning: package 'readr' was built under R version 4.5.2
Warning: package 'purrr' was built under R version 4.5.2
Warning: package 'dplyr' was built under R version 4.5.2
Warning: package 'lubridate' was built under R version 4.5.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.1     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Code
#| message: false
#| warning: false
#| filename: Setup
library(melidosData)
library(LightLogR)
library(mgcv)
Loading required package: nlme

Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

This is mgcv 1.9-4. For overview type '?mgcv'.
Code
#| message: false
#| warning: false
#| filename: Setup
library(lme4)
Warning: package 'lme4' was built under R version 4.5.2
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'lme4'

The following object is masked from 'package:nlme':

    lmList
Code
#| message: false
#| warning: false
#| filename: Setup
library(performance)
Warning: package 'performance' was built under R version 4.5.2
Code
#| message: false
#| warning: false
#| filename: Setup
library(cowplot)

Attaching package: 'cowplot'

The following object is masked from 'package:lubridate':

    stamp
Code
#| message: false
#| warning: false
#| filename: Setup
library(ggridges)
library(broom)
Warning: package 'broom' was built under R version 4.5.2
Code
#| message: false
#| warning: false
#| filename: Setup
library(itsadug)
Warning: package 'itsadug' was built under R version 4.5.2
Loading required package: plotfunctions
Warning: package 'plotfunctions' was built under R version 4.5.2

Attaching package: 'plotfunctions'

The following object is masked from 'package:ggplot2':

    alpha

Loaded package itsadug 2.5 (see 'help("itsadug")' ).
Code
#| message: false
#| warning: false
#| filename: Setup
library(gratia)
Warning: package 'gratia' was built under R version 4.5.2

Attaching package: 'gratia'

The following object is masked from 'package:itsadug':

    dispersion

The following object is masked from 'package:stringr':

    boundary
Code
#| message: false
#| warning: false
#| filename: Setup
library(emmeans)
Warning: package 'emmeans' was built under R version 4.5.2
Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'
Code
#| message: false
#| warning: false
#| filename: Setup
library(patchwork)

Attaching package: 'patchwork'

The following object is masked from 'package:cowplot':

    align_plots
Code
#| message: false
#| warning: false
#| filename: Setup
library(rlang)
Warning: package 'rlang' was built under R version 4.5.2

Attaching package: 'rlang'

The following objects are masked from 'package:purrr':

    flatten, flatten_chr, flatten_dbl, flatten_int, flatten_lgl,
    flatten_raw, invoke, splice
Code
#| message: false
#| warning: false
#| filename: Setup
library(broom.mixed)
Warning: package 'broom.mixed' was built under R version 4.5.2
Code
#| message: false
#| warning: false
#| filename: Setup
library(gt)
Warning: package 'gt' was built under R version 4.5.2

Attaching package: 'gt'

The following object is masked from 'package:cowplot':

    as_gtable
Code
#| message: false
#| warning: false
#| filename: Setup
library(performance)
library(glmmTMB)
Warning: package 'glmmTMB' was built under R version 4.5.2
Warning in check_dep_version(dep_pkg = "TMB"): package version mismatch: 
glmmTMB was built with TMB package version 1.9.19
Current TMB package version is 1.9.21
Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)
Code
#| message: false
#| warning: false
#| filename: Setup
library(gghighlight)
library(gtsummary)
Warning: package 'gtsummary' was built under R version 4.5.2
Code
#| message: false
#| warning: false
#| filename: Setup
source("scripts/fitting.R")
source("scripts/summaries.R")
source("scripts/H1_specific.R")
In [2]:
Code cell - Load data
#| message: false
#| code-summary: Code cell - Load data
load("data/metrics_glasses.RData")
metrics <- metrics_glasses
H1 <- "Hypothesis: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod."
H2 <- "Hypothesis: The variance of personal light exposure patterns (hourly geometric mean of melanopic EDI) by participants within sites is greater than the variance between sites."
sig.level <- 0.05

#for chest:
# load("data/metrics_chest.RData")
# metrics <- metrics_chest
# melidos_cities <- melidos_cities[names(melidos_cities)!="MPI"]

Hypothesis 1

Details

Hypothesis 1 states:

There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.

Type of model: LMM

Base formula: \text{Metric} \sim \text{Site} + \text{Photoperiod} + (1|\text{Site}:\text{Participant})

Additional notes: - For dynamics-based metrics LM without Participant random effect. - Because latitude and site are each individual values per site (i.e., each site has its unique latitude), they have to be modelled separately and will be compared via AIC.

Specific formulas: - $ + $ (for dynamics-based metrics) - \text{Metric} \sim \text{Latitude} + \text{Photoperiod} + (1|\text{Site}:\text{Participant}) (for the latitude version of the base formula) - \text{Metric} \sim \text{Photoperiod} + (1|\text{Site}) + (1|\text{Site}:\text{Participant}) (to check whether site is better represented as a random effect)

False-discovery-rate correction: 17 (number of metrics that are compared)

Testing whether photoperiod differs significantly by site

In [3]:
Code
by.day.data <- metrics$data[[3]]

model_phot <- lmer(photoperiod ~ site + (1|Id), data = by.day.data)
summary(model_phot)
Linear mixed model fit by REML ['lmerMod']
Formula: photoperiod ~ site + (1 | Id)
   Data: by.day.data

REML criterion at convergence: -580.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6766 -0.4604 -0.0003  0.4417  3.2394 

Random effects:
 Groups   Name        Variance Std.Dev.
 Id       (Intercept) 3.131047 1.76948 
 Residual             0.007516 0.08669 
Number of obs: 811, groups:  Id, 141

Fixed effects:
            Estimate Std. Error t value
(Intercept)  16.7092     0.4172  40.055
siteFUSPCEU  -5.5992     0.5570 -10.053
siteIZTECH   -3.7039     0.5986  -6.188
siteKNUST    -4.1498     0.6188  -6.707
siteMPI      -4.1705     0.5427  -7.685
siteRISE     -0.4959     0.6442  -0.770
siteTHUAS    -1.5697     0.6442  -2.437
siteTUM       0.3819     0.6980   0.547
siteUCR      -3.2370     0.8343  -3.880

Correlation of Fixed Effects:
            (Intr) sFUSPC sIZTEC sKNUST sitMPI stRISE sTHUAS sitTUM
siteFUSPCEU -0.749                                                 
siteIZTECH  -0.697  0.522                                          
siteKNUST   -0.674  0.505  0.470                                   
siteMPI     -0.769  0.576  0.536  0.518                            
siteRISE    -0.648  0.485  0.451  0.437  0.498                     
siteTHUAS   -0.648  0.485  0.451  0.437  0.498  0.419              
siteTUM     -0.598  0.448  0.416  0.403  0.459  0.387  0.387       
siteUCR     -0.500  0.374  0.348  0.337  0.384  0.324  0.324  0.299
Code
drop1(model_phot, test = "Chisq")
Single term deletions

Model:
photoperiod ~ site + (1 | Id)
       npar     AIC    LRT   Pr(Chi)    
<none>      -556.31                     
site      8 -445.65 126.66 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
model_phot |> 
  emmeans(~ site) |> 
  contrast() |> 
  as_tibble() |> 
  gt() |> 
  fmt_number() |> 
  fmt(p.value, fns = style_pvalue)
contrast estimate SE df t.ratio p.value
BAUA effect 2.50 0.40 132.00 6.23 <0.001
FUSPCEU effect −3.09 0.36 132.01 −8.51 <0.001
IZTECH effect −1.20 0.41 132.00 −2.91 0.005
KNUST effect −1.64 0.43 132.01 −3.79 <0.001
MPI effect −1.67 0.35 132.00 −4.81 <0.001
RISE effect 2.01 0.46 131.99 4.34 <0.001
THUAS effect 0.94 0.46 132.00 2.02 0.051
TUM effect 2.89 0.52 131.99 5.56 <0.001
UCR effect −0.73 0.66 132.02 −1.11 0.3

Sites do vary significantly in their photoperiod, with only Delft (NL) and San José (CR) non-significantly different from the average.

Preparation

In [4]:
Code cell - Model setup based on distribution
#| code-summary: Code cell - Model setup based on distribution
model_tbl <- tibble::tribble(
    ~name, ~engine, ~ response, ~family,
    "interdaily_stability", "lm", "qlogis(metric)", gaussian(),
    "intradaily_variability", "lm", "metric", gaussian(),
    "Mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "dose", "lmer", "log_zero_inflated(metric)", gaussian(),
    "MDER", "lmer", "metric", gaussian(),
    "darkest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "duration_above_1000", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_above_250", "lmer", NA, gaussian(),
    "duration_below_10", "lmer", NA, gaussian(),
    "duration_below_1", "lmer", NA, gaussian(),
    "duration_above_250_wake", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_10_pre-sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_1_sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "period_above_250", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_midpoint", "lmer", "metric", gaussian(),
    "brightest_10h_onset", "lmer", NA, gaussian(),
    "brightest_10h_offset", "lmer", NA, gaussian(),
    "darkest_10h_midpoint", "lmer", "metric", gaussian(),
    "darkest_10h_onset", "lmer", NA, gaussian(),
    "darkest_10h_offset", "lmer", NA, gaussian(),
    "mean_timing_above_250", "lmer", "metric", gaussian(),
    "first_timing_above_250", "lmer", "metric", gaussian(),
    "last_timing_above_250", "lmer", "metric", gaussian()
  )
In [5]:
Code cell - Formula setup for inference
#| code-summary: Code cell - Formula setup for inference
metrics <-
  metrics |> 
  mutate(H1_1 = case_when(type == "participant" ~ 
                                  "~ site + photoperiod",
                                type == "participant-day" ~ 
                                  "~ site + photoperiod + (1| site:Id)"),
         H1_0 = case_when(type == "participant" ~ 
                                   "~ photoperiod",
                                 type == "participant-day" ~ 
                                  "~ photoperiod + (1| site:Id)"),
         H1_lat = case_when(type == "participant" ~ 
                                   "~ lat + photoperiod",
                                 type == "participant-day" ~ 
                                  "~ lat + photoperiod + (1|site:Id)"),
         H1_phot = case_when(type == "participant" ~ 
                                   "~ site",
                                 type == "participant-day" ~ 
                                  "~ site + (1|site:Id)"),
         H1_rand = case_when(type == "participant" ~ 
                                   NA,
                                 type == "participant-day" ~ 
                                  "~ photoperiod + (1|site/Id)")
         ) |> 
  left_join(model_tbl, by = "name")

H1_formulas <-  names(metrics) |> subset(grepl("H1_", names(metrics)))

metrics <- 
  metrics |> relocate(all_of(H1_formulas), .after = family)
H1_fdr_n <- metrics |> filter(!is.na(response)) |> nrow()
In [6]:
Code
#| echo: false
metrics |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 1: metric model specifications",
             subtitle = H1) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=17 comparisons") |> 
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H1_1, H1_0))) |> 
  tab_footnote("These models will be checked via AIC against H1 1 to see whether there is indeed a site difference beyond the specific difference in the model. Also, this model is compared against the H1 0 model", locations = cells_column_labels(c("H1_lat", "H1_rand", "H1_phot")))
Hypothesis 1: metric model specifications
Hypothesis: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.
Name Type Metric family Engine Response Family H1 11 H1 01 H1 lat2 H1 phot2 H1 rand2
duration
1 Duration above 1000 participant-day
glmmTMB metric tweedie ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
2 Duration above 250 wake participant-day
glmmTMB metric tweedie ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
3 Duration below 10 pre-Sleep participant-day
glmmTMB metric tweedie ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
4 Duration below 1 sleep participant-day
glmmTMB metric tweedie ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
5 Period above 250 participant-day
lmer log_zero_inflated(metric) gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
dynamics
6 Interdaily stability participant Interdaily stability lm qlogis(metric) gaussian ~ site + photoperiod ~ photoperiod ~ lat + photoperiod ~ site
7 Intradaily variability participant Intradaily variability lm metric gaussian ~ site + photoperiod ~ photoperiod ~ lat + photoperiod ~ site
exposure history
8 Dose participant-day
lmer log_zero_inflated(metric) gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
level
9 Mean participant-day
lmer log_zero_inflated(metric) gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
10 Brightest 10h mean participant-day
lmer log_zero_inflated(metric) gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
11 Darkest 10h mean participant-day
lmer log_zero_inflated(metric) gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
spectrum
12 MDER participant-day MDER lmer metric gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
timing
13 Brightest 10h midpoint participant-day
lmer metric gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
14 Darkest 10h midpoint participant-day
lmer metric gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
15 First timing above 250 participant-day
lmer metric gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
16 Last timing above 250 participant-day
lmer metric gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
17 Mean timing above 250 participant-day
lmer metric gaussian ~ site + photoperiod + (1| site:Id) ~ photoperiod + (1| site:Id) ~ lat + photoperiod + (1|site:Id) ~ site + (1|site:Id) ~ photoperiod + (1|site/Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=17 comparisons
1 These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful
2 These models will be checked via AIC against H1 1 to see whether there is indeed a site difference beyond the specific difference in the model. Also, this model is compared against the H1 0 model
In [7]:
Code cell - Formula creation
#| code-summary: Code cell - Formula creation
metrics <-
  metrics |> 
  mutate(across(all_of(H1_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )
In [8]:
Code cell - Offset zero-values in mixed gamma models
#| code-summary: Code cell - Offset zero-values in mixed gamma models
 metrics <- 
  metrics |>
  mutate(data = map2(data, engine, \(data, engine) {
    if(engine == "glmer") {
      data |> mutate(metric = metric + 0.1)
    } else data
  }))

Analysis

Fit models

In [9]:
Code cell - Model fit
#| code-summary: Code cell - Model fit
#| message: false
metrics <-
  metrics |> 
  mutate(
    across(all_of(H1_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )
boundary (singular) fit: see help('isSingular')

Main model diagnostics

In [10]:
Code cell - Create diagnostic plots for main model
#| code-summary: Code cell - Create diagnostic plots for main model
#| message: false
metrics <-
  metrics |> 
  mutate(
    across(all_of(paste0(H1_formulas[1], "_model")),
           \(x) map(x, diagnostics),
           .names = "{str_remove(.col, '_model')}_diagnostics")
    )
`check_outliers()` does not yet support models of class `glmmTMB`.
`check_outliers()` does not yet support models of class `glmmTMB`.
`check_outliers()` does not yet support models of class `glmmTMB`.
`check_outliers()` does not yet support models of class `glmmTMB`.

Model diagnostics overall look good to OK. Anomalies are found but deemed non-critical. Alternative error families, like the Tweedie distribution have been used as well. For Tweedie models, the homogeneity of variance does not show in the plots (duration metrics), but was checked interactively.

Model comparisons

In [11]:
Code cell - Model comparisons
#| code-summary: Code cell - Model comparisons
#| message: false
metrics <- 
  metrics |> 
  mutate(
    H1_1_comp = map2(H1_0_model, H1_1_model, model_anova),
    H1_1_p = map2(H1_1_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_1_sig = H1_1_p <= sig.level,
    
    H1_lat_comp = map2(H1_0_model, H1_lat_model, model_anova),
    H1_lat_p = map2(H1_lat_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_lat_sig = H1_lat_p <= sig.level,
    
    H1_phot_comp = map2(H1_phot_model, H1_1_model, model_anova),
    H1_phot_p = map2(H1_phot_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_phot_sig = H1_phot_p <= sig.level,
    
    H1_rand_comp = map2(H1_0_model, H1_rand_model, model_anova),
    H1_rand_p = map2(H1_rand_comp, engine, model_comp_p, n = H1_fdr_n) |> list_c(),
    H1_rand_sig = H1_rand_p <= sig.level,
    
    AICs = pmap(list(H1_0_model, H1_1_model, H1_phot_model, H1_lat_model, 
                     H1_rand_model), model_AIC),
  
    summary = pmap(list(H1_1_sig, H1_0_model, H1_1_model), model_summary),
    summary_lat = pmap(list(H1_lat_sig, H1_0_model, H1_lat_model), model_summary),
    summary_rand = pmap(list(H1_rand_sig, H1_0_model, H1_rand_model),
                        model_summary),
  
    table = pmap(list(name, H1_1_p, summary, metric_type, 
                           response, family), model_table),
    table_lat = pmap(list(name, H1_lat_p, summary_lat, metric_type, 
                           response, family), model_table),
    table_lat = map(table_lat, \(x) if(is.null(x)) return() else x |>
                  mutate(across(any_of("lat"), \(x) x*10))),
    table = map2(table, H1_phot_sig, \(x,y) if(is.null(x)) return() else x |>
                  mutate(dAIC_photoperiod = ifelse(y, 0, -3))),
    table_rand = pmap(list(name, H1_rand_p, summary_rand, metric_type, 
                           response, family),model_table),
    table = pmap(list(table, table_lat, AICs, "lat", "H1_1", "H1_lat"),
                 H1_combining_tables),
    table = pmap(list(table, table_rand, AICs, "SD Site", "H1_1", "H1_rand"),
                 H1_combining_tables),
    )
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)

Adjusted csum-coding for R1

In [12]:
Code cell - adjust model parameters for sum-coding of site
#| code-summary: Code cell - adjust model parameters for sum-coding of site
metrics <- 
  metrics |> 
  mutate(
    site_coefs = map2(H1_1_model, H1_1_sig, sum_coding),
    table = map2(table, site_coefs, merge_results)
  )

R^2

In [13]:
Code cell - calculate R^2
#| warning: false
#| code-summary: Code cell - calculate R^2
R2_table <- 
  metrics |> 
  filter_out(is.na(response)) |> 
  mutate(
    name = name,
    metric_type = metric_type,
    H1_1_r2 = pmap(list(H1_1_model, data, family, response), r2_helper), 
    H1_0_r2 = pmap(list(H1_0_model, data, family, response), r2_helper), 
    H1_phot_r2 = pmap(list(H1_phot_model, data, family, response), r2_helper), 
    H1_lat_r2 = pmap(list(H1_lat_model, data, family, response), r2_helper), 
    H1_phot_sig = H1_phot_sig,
    H1_lat_sig = H1_lat_sig,
    H1_1_sig = H1_1_sig,
    .keep = "none"
  )
In [14]:
Code cell - choose relevant R^2
#| code-summary: Code cell - choose relevant R^2
R2_table <- 
  R2_table |> 
  rowwise() |> 
  mutate(
    name = name,
    metric_type = metric_type,
    marginal_r2 = ifelse(H1_1_sig, H1_1_r2$R2_marginal, H1_0_r2$R2_marginal),
    conditional_r2 = ifelse(H1_1_sig, H1_0_r2$R2_conditional, H1_0_r2$R2_conditional),
    site_r2 = ifelse(H1_1_sig, H1_1_r2$R2_marginal - H1_0_r2$R2_marginal, NA),
    phot_r2 = ifelse(H1_phot_sig, H1_1_r2$R2_marginal - H1_phot_r2$R2_marginal, NA),
    rand_r2 = ifelse(H1_1_sig, H1_1_r2$R2_conditional - H1_1_r2$R2_marginal, #
                     H1_0_r2$R2_conditional - H1_0_r2$R2_marginal),
    rand_r2 = ifelse(rand_r2 == 0, NA, rand_r2),
    lat_r2 = ifelse(H1_lat_sig, H1_lat_r2$R2_marginal - H1_0_r2$R2_marginal, NA),
    .keep = "none"
  ) |> 
  mutate(across(c(marginal_r2, conditional_r2), 
                \(x) ifelse(metric_type == "dynamics", NA, x)
                ),
             phot_r2 = ifelse(is.na(site_r2) & !is.na(conditional_r2), marginal_r2, phot_r2)
         )

Results

In [15]:
Code cell - Table H1
#| message: false
#| warning: false
#| code-summary: Code cell - Table H1
#| label: tbl-H1
#| tab-name: Hypothesis 1 results
H1_tab <- 
  H1_table(metrics, H1_fdr_n)|> 
  cols_width(
    stub() ~ px(232),
    photoperiod ~ px(120),
    everything() ~ px(95)
  )
H1_tab 
#| message: false
#| warning: false
#| code-summary: Code cell - Table H1
#| label: tbl-H1
#| tab-name: Hypothesis 1 results
gtsave(H1_tab, "tables/H1.png", vwidth = 2000)
file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpXhfCfn/file13f9349b7e454.html screenshot completed
Code cell - Table H1
#| message: false
#| warning: false
#| code-summary: Code cell - Table H1
#| label: tbl-H1
#| tab-name: Hypothesis 1 results
save(metrics, file = "data/H1_results.RData")
In [16]:
Model results for Hypothesis 1
Hypothesis: There is a significant difference in personal light-exposure metrics across sites, even after accounting for latitude and photoperiod.
Coefs are2 p-value3 Intercept
Site coefficients1
Photoperiod4,5 SD Participant SD Residual6
Comparative models
Borås (SE) Delft (NL) Dortmund (DE) Tübingen (DE) Munich (DE) Madrid (ES) Izmir (TR) San José (CR) Kumasi (GH) Latitude4,7 SD Site4,7
Duration
Duration above 1000 multiplied >0.9 5m 43s 1.17× 1.84×
Duration above 250 wake multiplied 0.062 38m 26s 1.10× 1.77×
Duration below 10 pre-sleep multiplied >0.9 2h 31m 0.97× 1.40×
Duration below 1 sleep multiplied >0.9 8h 57m 0.98× 1.20×
Period above 250 multiplied >0.9 10m 27s 1.10× 1.63× 2.12×
Dynamics8
Interdaily stability logit scale >0.9 −0.15 −0.05
Intradaily variability additive >0.9 1.33 0.00
Exposure History
Dose multiplied >0.9 356 1.19× 2.11× 3.77×
Level
Mean multiplied 0.007 4.70 1.03× 0.95× 0.91× 1.47× 0.98× 1.03× 1.34× 1.14× 0.49× 1.18× 1.82× 2.02× 1.15× 1.32×
Brightest 10h mean multiplied 0.040 88.57 1.46× 1.08× 0.95× 1.46× 0.72× 1.42× 1.09× 1.01× 0.40× 1.25× 2.22× 3.17× 1.24×
Darkest 10h mean multiplied <0.001 0.23 0.74× 0.85× 0.83× 1.34× 1.33× 0.75× 1.48× 1.43× 0.67× 1.09× 1.63× 1.60× 1.34×
Spectrum
MDER additive 0.3 0.48 0.02 0.08 0.08
Timing
Brightest 10h midpoint additive 0.002 13:50 −35m 30s 10m 42s 12m 54s 10m 5s 39m 16s 38m 42s 34m 6s −1h 7m −43m 6s 2m 33s 50m 26s 1h 34m 12m 7s 32m 15s
Darkest 10h midpoint additive 0.011 02:43 −23m 16s −7m 58s 9m 34s 13m 12s 58m 1s 7m 27s 34m 45s −34m 51s −56m 54s −8m 37s 51m 23s 1h 38m
First timing above 250 additive 0.4 11:23 −8m 18s 1h 14m 2h 6m
Last timing above 250 additive <0.001 17:37 20m 25s 20m 1s 10m 5s −6m 33s 44m 3s 1h 23m 54m 17s −2h 5m −1h 40m 19m 50s 1h 4m 2h 2m 28m 29s 1h 3m
Mean timing above 250 additive <0.001 13:18 −14m 19s 31m 39s 13m 57s 15m 15s 25m 24s 53m 27s 53m 12s −1h 53m −1h 4m 7m 27s 33m 11s 1h 34m 21m 17s 52m 9s
1 Sites have only entries if the (adjusted) p-value is significant, otherwise the results for the null-model are shown. Entries show the site-specific coefficient (relative to the overall mean level). Grey entries indicate a non-significant difference from the Overall mean.
2 multiplied: coefficients are multiplicative. This comes from the fact that the models were calculated on a log scale and backtransformed in the table. The Intercept can be read as linear scale, with coefficients being multiplied with the intercept. additive: All values are on the linear scale and no transformation is necessary. logit scale: data were modelled on the logit scale. Coefficients can be added to the intercept, but have to be transformed with the logistic function.
3 p-values are adjusted for multiple comparisons using the false-discovery-rate for n= 17 comparisons
4 Photoperiod: coefficient per 1-hour change in photoperiod. Latitude: coefficient per 10° change in latitude. SD Site: random effect standard deviation for site.
5 If the coefficient is greyed out it means it is not significant in a model base minus photoperiod.
6 Models with a tweedie error distribution do not contain a residual standard deviation, as a dispersion parameter is estimated instead during modeling.
7 These coefficients (Latitude) and random effects (SD Site) come from two respective separate models that did not contain a fixed effect of site. If the random site or fixed latitude effect is significant, a value is presented. If AIC indicates a significant improvement over the base model (site as fixed effect), it is shown in bold. If the base model is significantly stronger, however, values are shown in grey.
8 Dynamics-based metrics are calculated per participant and thus are using linear/generalized models instead of the mixed-model framework

Interpretation

@tbl-H1 shows a comprehensive overview of the main inference results.

  • Timing-based metrics show a significant site-specific effect that is neither dependent on photoperiod, nor latitude
  • While level-based metrics show a significant site-specific effect, it can be drawn back to photoperiod and latitude for daytime metrics (but not for nighttime)
  • Duration-based metrics are photoperiod-dependent, but not site specific, same as exposure history and spectrum - these metrics are rather more dependent on inter- and intraindividual differences of participants
  • In general, interindividual differences highly relevant and trump site specific factors, although usually not as large as intraindividual differences
  • None of the above means that individual sites can not be significantly different from the group or another site. But rather that the variance across all nine sites is or is not sufficiently large.

R2

In [17]:
Code cell - R^2 table
#| code-summary: Code cell - R^2 table
R2_tbl <- 
R2_table |> 
  mutate(metric_type = str_to_title(metric_type)) |> 
  arrange(metric_type, name) |> 
  group_by(metric_type) |> 
  gt(rowname_col = "name") |> 
  cols_add(Residual = 1 - conditional_r2) |> 
  fmt_percent(decimals = 0) |>
  sub_missing() |> 
  tab_header(md("**H1: Variance explained by the model parameters ($R^{2}$)**")) |> 
  fmt(name, 
      rows = name != "MDER",
      fns = \(x) x |> str_to_sentence() |> str_replace_all("_", " ")) |> 
  cols_move(rand_r2, after = lat_r2) |> 
  cols_move(marginal_r2, after = conditional_r2) |> 
  cols_label(
    conditional_r2 = "Model",
    rand_r2 = "Participants",
    marginal_r2 = "Fixed effects",
    site_r2 = "Site",
    phot_r2 = "Photoperiod",
    lat_r2 = "Latitude"
  ) |> 
  tab_spanner("Unique contribution", columns = c(site_r2, phot_r2, lat_r2)) |> 
  cols_width(
    name ~ px(230),
    phot_r2 ~ px(100),
    c(rand_r2, Residual) ~ px(90),
    everything() ~ px(70)) |> 
  grand_summary_rows(
    columns = where(is.numeric),
    fns = list(
      `Grand average` ~ mean(., na.rm = TRUE)
    ),
        side = "top",
    fmt = ~ fmt_percent(., decimals = 0)
  ) |> 
  tab_style(
    cell_text(align = "center"),
    list(cells_body(),
         cells_grand_summary(),
         cells_column_labels())
  ) |> 
    tab_style(
    cell_text(weight = "bold"),
    list(cells_column_labels(),
         cells_row_groups(),
         cells_grand_summary(),
         cells_stub_grand_summary())
  ) |> 
  tab_footnote(
    md("*Model*: conditional $R^2$, *Participants*: partial $R^2$ due to interindividual differences on a participant level, Fixed effects: marginal $R^2$ (all fixed effects combined, based on a model of $site + photoperiod$), *Site/Photoperiod/Latitude*: unique contribution of the given parameter relative to a model without it. *Residual*: proportion of residual variance (e.g., intraindividual differences)")
  )
In [18]:
Code
R2_tbl
H1: Variance explained by the model parameters (R^{2})
Model Fixed effects
Unique contribution
Participants Residual
Site Photoperiod Latitude
Grand average 40% 12% 10% 7% 5% 28% 60%
Duration
Duration above 1000 49% 17% 17% 33% 51%
Duration above 250 wake 51% 9% 9% 42% 49%
Duration below 10 pre-sleep 31% 2% 2% 29% 69%
Duration below 1 sleep 38% 3% 3% 35% 62%
Period above 250 35% 8% 8% 27% 65%
Dynamics
Interdaily stability
Intradaily variability
Exposure History
Dose 31% 9% 9% 22% 69%
Level
Mean 55% 24% 7% 8% 3% 32% 45%
Brightest 10h mean 43% 18% 6% 6% 4% 26% 57%
Darkest 10h mean 62% 23% 15% 3% 40% 38%
Spectrum
MDER 54% 13% 13% 41% 46%
Timing
Brightest 10h midpoint 27% 8% 8% 2% 20% 73%
Darkest 10h midpoint 27% 9% 6% 19% 73%
First timing above 250 28% 2% 2% 25% 72%
Last timing above 250 39% 23% 12% 5% 6% 17% 61%
Mean timing above 250 25% 17% 14% 1% 7% 9% 75%
Model: conditional R^2, Participants: partial R^2 due to interindividual differences on a participant level, Fixed effects: marginal R^2 (all fixed effects combined, based on a model of site + photoperiod), Site/Photoperiod/Latitude: unique contribution of the given parameter relative to a model without it. Residual: proportion of residual variance (e.g., intraindividual differences)
Code
gtsave(R2_tbl, "tables/H1_R2_tbl.png")
file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpXhfCfn/file13f93229869a1.html screenshot completed
  • Overall model fit varies widely (≈25–62%), with highest explanatory power for level-based metrics (e.g., darkest/mean light levels) and lowest for timing metrics.
  • Fixed effects explain a modest share (≈2–24%), indicating environmental factors (site, photoperiod, latitude) contribute but are not dominant drivers.
  • Individual (participant-level) differences account for a large portion of variance (≈20–42%), often exceeding fixed-effect contributions.
  • Site effects are generally stronger than photoperiod, particularly for level metrics, but both show overlapping (shared) explanatory power.
  • Latitude contributes only marginally where included, suggesting limited additional explanatory value beyond photoperiod.
  • Residual variance remains high across most metrics (≈38–75%), indicating substantial unexplained within-individual or day-to-day variability.

Model summaries

The following tabs show the model summaries for either

  • \text{Metric} \sim \text{Site} + \text{Photoperiod} + (1|\text{Site}:\text{Participant})

or

  • \text{Metric} \sim \text{Photoperiod} + (1|\text{Site}:\text{Participant})

depending on whether site is significant

Model summary

Characteristic Beta 95% CI p-value
photoperiod -0.05 -0.07, -0.02 <0.001
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
Model 1: qlogis(metric) ~ photoperiod
Model 2: qlogis(metric) ~ site + photoperiod 0.12

Model summary

Characteristic Beta 95% CI p-value
photoperiod 0.00 -0.03, 0.02 0.8
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
Model 1: metric ~ photoperiod
Model 2: metric ~ site + photoperiod 0.2

Model summary

Characteristic Beta 95% CI
site

    BAUA 0.00
    FUSPCEU 0.05 -0.18, 0.29
    IZTECH 0.17 -0.05, 0.38
    KNUST -0.27 -0.50, -0.04
    MPI 0.21 0.00, 0.42
    RISE 0.05 -0.15, 0.26
    THUAS 0.02 -0.19, 0.23
    TUM 0.03 -0.19, 0.26
    UCR 0.10 -0.19, 0.38
photoperiod 0.07 0.05, 0.10
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 | site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 | site:Id) <0.001

Model summary

Characteristic Beta 95% CI
site

    BAUA 0.00
    FUSPCEU 0.18 -0.15, 0.51
    IZTECH 0.06 -0.24, 0.36
    KNUST -0.37 -0.69, -0.05
    MPI 0.19 -0.10, 0.48
    RISE 0.19 -0.10, 0.48
    THUAS 0.06 -0.24, 0.35
    TUM -0.12 -0.43, 0.19
    UCR 0.03 -0.37, 0.43
photoperiod 0.10 0.06, 0.13
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 | site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 | site:Id) 0.002

Model summary

Characteristic Beta 95% CI
site

    BAUA 0.00
    FUSPCEU 0.43 -0.44, 1.3
    IZTECH 0.35 -0.45, 1.2
    KNUST -0.93 -1.8, -0.08
    MPI -0.05 -0.82, 0.72
    RISE -0.81 -1.6, -0.05
    THUAS -0.04 -0.81, 0.74
    TUM 0.44 -0.38, 1.3
    UCR -1.3 -2.4, -0.27
photoperiod 0.04 -0.06, 0.14
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id)
model2: metric ~ site + photoperiod + (1 | site:Id) <0.001

Model summary

Characteristic Beta 95% CI
site

    BAUA 0.00
    FUSPCEU -0.04 -0.23, 0.14
    IZTECH 0.25 0.08, 0.42
    KNUST -0.09 -0.28, 0.09
    MPI 0.21 0.04, 0.37
    RISE -0.05 -0.22, 0.11
    THUAS 0.01 -0.15, 0.18
    TUM 0.20 0.03, 0.38
    UCR 0.23 0.01, 0.46
photoperiod 0.04 0.01, 0.06
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 | site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 | site:Id) <0.001

Model summary

Characteristic Beta 95% CI
site

    BAUA 0.00
    FUSPCEU -0.04 -0.93, 0.86
    IZTECH 0.42 -0.40, 1.2
    KNUST -1.1 -2.0, -0.23
    MPI 0.06 -0.73, 0.85
    RISE -0.55 -1.3, 0.23
    THUAS -0.29 -1.1, 0.50
    TUM 0.81 -0.04, 1.7
    UCR -0.74 -1.8, 0.35
photoperiod -0.14 -0.25, -0.04
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id)
model2: metric ~ site + photoperiod + (1 | site:Id) <0.001

Model summary

Warning: the 'nobars' function has moved to the reformulas package. Please update your imports, or ask an upstream package maintainer to do so.
This warning is displayed once per session.
ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.
Characteristic Beta 95% CI p-value
cond
photoperiod 0.16 0.12, 0.21 <0.001
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 | site:Id), zi=~0, disp=~1 0.058

Model summary

Characteristic Beta 95% CI
photoperiod 0.04 0.03, 0.06
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 | site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 | site:Id) 0.3

Model summary

Characteristic Beta 95% CI
site

    BAUA 0.00
    FUSPCEU 0.66 -0.06, 1.4
    IZTECH 0.65 0.00, 1.3
    KNUST -1.3 -2.0, -0.60
    MPI 0.02 -0.61, 0.66
    RISE -0.47 -1.1, 0.14
    THUAS 0.29 -0.34, 0.93
    TUM 0.19 -0.48, 0.86
    UCR -2.1 -3.0, -1.3
photoperiod 0.12 0.04, 0.21
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id)
model2: metric ~ site + photoperiod + (1 | site:Id) <0.001

Model summary

Characteristic Beta 95% CI
photoperiod -0.14 -0.23, -0.04
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id)
model2: metric ~ site + photoperiod + (1 | site:Id) 0.026

Model summary

Characteristic Beta 95% CI
site

    BAUA 0.00
    FUSPCEU 1.2 0.08, 2.4
    IZTECH 0.74 -0.31, 1.8
    KNUST -1.8 -3.0, -0.73
    MPI -0.28 -1.3, 0.73
    RISE 0.17 -0.81, 1.2
    THUAS 0.17 -0.84, 1.2
    TUM 0.57 -0.50, 1.6
    UCR -2.3 -3.6, -0.88
photoperiod 0.33 0.20, 0.46
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id)
model2: metric ~ site + photoperiod + (1 | site:Id) <0.001

Model summary

Characteristic Beta 95% CI
photoperiod 0.08 0.05, 0.10
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: log_zero_inflated(metric) ~ photoperiod + (1 | site:Id)
model2: log_zero_inflated(metric) ~ site + photoperiod + (1 | site:Id) 0.2

Model summary

Characteristic Beta 95% CI
photoperiod 0.02 0.01, 0.02
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id)
model2: metric ~ site + photoperiod + (1 | site:Id) 0.016

Model summary

ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.
Characteristic Beta 95% CI p-value
cond
photoperiod -0.03 -0.06, -0.01 0.017
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 | site:Id), zi=~0, disp=~1 0.2

Model summary

ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.
Characteristic Beta 95% CI p-value
cond
photoperiod -0.02 -0.03, -0.01 0.005
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 | site:Id), zi=~0, disp=~1 0.069

Model summary

ℹ Multinomial models, multi-component models and other groups models have a
  different underlying structure than the models gtsummary was designed for.
• Functions designed to work with `tbl_regression()` objects may yield
  unexpected results.
ℹ Suppress this message with `?suppressMessages()`.
Characteristic Beta 95% CI p-value
cond
photoperiod 0.10 0.06, 0.14 <0.001
Abbreviation: CI = Confidence Interval

Model comparison

✖ Unable to identify the list of variables.

This is usually due to an error calling `stats::model.frame(x)`or `stats::model.matrix(x)`.
It could be the case if that type of model does not implement these methods.
Rarely, this error may occur if the model object was created within
a functional programming framework (e.g. using `lappy()`, `purrr::map()`, etc.).
Characteristic p-value
model1: metric ~ photoperiod + (1 | site:Id), zi=~0, disp=~1
model2: metric ~ site + photoperiod + (1 | site:Id), zi=~0, disp=~1 0.004

Hypothesis 2

Details

Hypothesis 2 states:

The variance of personal light exposure patterns (30 minute mean of melanopic EDI) by participants within sites is greater than the variance between sites.

Type of model: HGAM (Hierarchical GAM)

Base formula: \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant})

Additional notes: - Site effects are modelled with the sz basis, that allows for the most identifiable modelling of parametric differences - Participant effects are modelled with the fs basis, which models the participant-effects as random smooths - melanopic EDI is scaled with log_zero_inflated() before model fitting.

Specific formulas: - \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant}) + s(\text{Time}, \text{Photoperiod}) (to test photoperiod effects) - \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Participant}) (comparison to the base formula)

False-discovery-rate correction: 1 (there is only one main comparison that tests the effect of site)

Preparation

In [19]:
Code
Load prepared data
#| filename: Load prepared data
load("data/metrics_separate_glasses.RData")
load("data/preprocessed_glasses_2.RData")

metric_participanthour <-
  metric_glasses_participanthour |>
  add_Time_col() |> 
  ungroup() |> 
  mutate(Time = as.numeric(Time)/3600 + 0.25,
         across(c(site, Id, sleep, State.Brown, wear, photoperiod.state),
                fct),
         Date = factor(Date),
         Id_date = interaction(Id, Date),
         lzMEDI = log_zero_inflated(MEDI),
         photoperiod = (dusk - dawn) |> as.numeric()
  ) |> 
  group_by(Id) |> 
  mutate(AR.start = ifelse(row_number() == 1, TRUE, FALSE))
In [20]:
Code
Overview plot of the data (median with IQR)
#| fig-height: 12
#| fig-width: 12
#| filename: Overview plot of the data (median with IQR)
metric_participanthour |>
  group_by(site) |> 
  aggregate_Date(
    numeric.handler = \(x) median(x, na.rm = TRUE),
    p75 = quantile(MEDI, p=0.75, na.rm = TRUE),
    p25 = quantile(MEDI, p=0.25, na.rm = TRUE)
  ) |> 
  site_conv_mutate() |> 
  gg_doubleplot(geom = "blank", jco_color = FALSE, facetting = FALSE,
                x.axis.breaks.repeat = ~Datetime_breaks(.x, by = "12 hours", shift =
    lubridate::duration(0, "hours"))) +
  scale_fill_manual(values = melidos_colors) +
  scale_color_manual(values = melidos_colors) +
  geom_ribbon(aes(ymin = p25, ymax = p75, color = site, fill = site)) +
  geom_line() +
  geom_hline(yintercept = 250, linetype = "dashed") +
  facet_wrap(~site, scales = "free_x") +
  guides(fill = "none", col = "none") +
  theme(panel.spacing = unit(2, "lines"))

In [21]:
Code
H2 formulas
#| filename: H2 formulas
H2_form_1 <- lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")
H2_form_0 <- lzMEDI ~ s(Time, k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")
H2_form_phot <- lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, photoperiod.state, bs = "sz", k=12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

Analysis

Base model (H2_1)

In [22]:
Code
Main model
#| filename: Main model
H2_model_1 <- 
  bam(H2_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H2_model_1, 
                      plot=TRUE)

Code
#| filename: Main model
H2_model_1 <- 
  bam(H2_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H2_model_1, split_pred = "AR.start")

Code
#| filename: Main model
H2_summary_1 <- summary(H2_model_1)
H2_summary_1

Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + 
    s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85202    0.02428   35.09   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf  Ref.df       F p-value    
s(Time)       10.88   10.98 246.416  <2e-16 ***
s(Time,site)  67.21   74.27   5.910  <2e-16 ***
s(Time,Id)   743.18 1392.00   3.605  <2e-16 ***
s(Id_date)   350.83  800.00   0.848  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.4%
fREML =  33251  Scale est. = 0.50672   n = 37603
Code
#| filename: Main model
H2_model_1 |> glance()
# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1173. -31318. 65000. 75086.   20059.      36430. 37603         0.767  2327
Code
#| filename: Main model
H2_model_1 |> appraise()

In [23]:
Code
Variance explained
#| filename: Variance explained
Xp <- predict(H2_model_1, type = "terms")
variance <- apply(Xp, 2, var)
cor(Xp)
                  s(Time) s(Time,site)  s(Time,Id)   s(Id_date)
s(Time)      1.000000e+00 3.414920e-02 0.003809311 4.374669e-05
s(Time,site) 3.414920e-02 1.000000e+00 0.019659500 3.124536e-05
s(Time,Id)   3.809311e-03 1.965950e-02 1.000000000 1.432294e-01
s(Id_date)   4.374669e-05 3.124536e-05 0.143229435 1.000000e+00
Code
#| filename: Variance explained
H2_R2 <- (variance/sum(variance)) |> enframe(value = "R2", name = "parameter")
H2_R2
# A tibble: 4 × 2
  parameter        R2
  <chr>         <dbl>
1 s(Time)      0.843 
2 s(Time,site) 0.0489
3 s(Time,Id)   0.0984
4 s(Id_date)   0.0100

Null model (H2_0)

In [24]:
Code
Null model
#| filename: Null model
H2_model_0 <- 
  bam(H2_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 8
      )
Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has repeated
1-d smooths of same variable.
Code
#| filename: Null model
r1 <- start_value_rho(H2_model_0, 
                      plot=TRUE)

Code
#| filename: Null model
H2_model_0 <- 
  bam(H2_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )
Warning in gam.side(sm, X, tol = .Machine$double.eps^0.5): model has repeated
1-d smooths of same variable.
Code
#| filename: Null model
acf_resid(H2_model_0, split_pred = "AR.start")

Code
#| filename: Null model
H2_summary_0 <- summary(H2_model_0)
H2_summary_0

Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, Id, bs = "fs") + s(Id_date, 
    bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85639    0.03083   27.78   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
              edf  Ref.df       F p-value    
s(Time)     10.88   10.97 288.872  <2e-16 ***
s(Time,Id) 888.54 1408.00   5.478  <2e-16 ***
s(Id_date) 332.29  809.00   0.788  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.768   Deviance explained = 77.5%
fREML =  33343  Scale est. = 0.5066    n = 37603
Code
#| filename: Null model
H2_model_0 |> glance()
# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1233. -31283. 65040. 75595.   19961.      36370. 37603         0.768  2231
Code
#| filename: Null model
H2_model_0 |> appraise()

Code
#| filename: Null model
H2_model_0 |> draw()

Photoperiod model

In [25]:
Code
Photoperiod model
#| filename: Photoperiod model
H2_model_phot <- 
  bam(H2_form_phot, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H2_model_phot, 
                      plot=TRUE)

Code
#| filename: Photoperiod model
H2_model_phot <- 
  bam(H2_form_phot, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H2_model_phot, split_pred = "AR.start")

Code
#| filename: Photoperiod model
H2_summary_phot <- summary(H2_model_phot)
H2_summary_phot

Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + 
    s(Time, photoperiod.state, bs = "sz", k = 12) + s(Time, Id, 
    bs = "fs") + s(Id_date, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.74254    0.03972    18.7   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                              edf   Ref.df       F p-value    
s(Time)                    10.728   10.859 189.248  <2e-16 ***
s(Time,site)               65.220   72.433   5.497  <2e-16 ***
s(Time,photoperiod.state)   6.372    7.415   6.726  <2e-16 ***
s(Time,Id)                739.267 1392.000   3.565  <2e-16 ***
s(Id_date)                352.959  800.000   0.857  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.5%
fREML =  33229  Scale est. = 0.50527   n = 37603
Code
#| filename: Photoperiod model
H2_model_phot |> glance()
# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1176. -31291. 64952. 75069.   20014.      36427. 37603         0.767  2339
Code
#| filename: Photoperiod model
H2_model_phot |> appraise()

Code
#| filename: Photoperiod model
H2_model_phot |> draw()

Model comparison

In [26]:
Code
H2_AICs <- AIC(H2_model_1, H2_model_0, H2_model_phot)
H2_AICs |> as_tibble(rownames = "model") |> gt()
model df AIC
H2_model_1 1181.730 64999.75
H2_model_0 1236.712 65039.74
H2_model_phot 1185.365 64952.13

Since the ∆AIC of the base model against the null model is negative and lower than two (< -2), the base model is considered significantly better to explain the variance. The photoperiod model is even better and will be used henceforth.

In [27]:
Code
Variance explained
#| filename: Variance explained
Xp <- predict(H2_model_phot, type = "terms")
variance <- apply(Xp, 2, var)

cor(Xp)
                                s(Time) s(Time,site) s(Time,photoperiod.state)
s(Time)                    1.0000000000 3.514387e-02               0.333386974
s(Time,site)               0.0351438694 1.000000e+00               0.070097241
s(Time,photoperiod.state)  0.3333869744 7.009724e-02               1.000000000
s(Time,Id)                 0.0043575648 2.105299e-02               0.020739525
s(Id_date)                -0.0001540493 1.501554e-05               0.001913428
                           s(Time,Id)    s(Id_date)
s(Time)                   0.004357565 -1.540493e-04
s(Time,site)              0.021052988  1.501554e-05
s(Time,photoperiod.state) 0.020739525  1.913428e-03
s(Time,Id)                1.000000000  1.441526e-01
s(Id_date)                0.144152613  1.000000e+00
Code
#| filename: Variance explained
H2_R2 <- (variance/sum(variance)) |> enframe(value = "partial R2", name = "parameter")
H2_R2 |> gt() |> fmt_percent()
parameter partial R2
s(Time) 83.61%
s(Time,site) 4.87%
s(Time,photoperiod.state) 0.34%
s(Time,Id) 10.13%
s(Id_date) 1.05%

Model visualization

In [28]:
Code
prepare photoperiod model H2 plot
#| filename: prepare photoperiod model H2 plot
H2_site_data <- 
H2_model_phot |> 
  conditional_values(
    condition = list(Time = seq(0.25, 23.75, by = 0.5), "site", "photoperiod.state"),
    exclude = c("s(Time,Id)", "s(Id_date)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  )

H2_site_data <-
metric_participanthour |> 
  Datetime2Time() |> 
  group_by(site, Time) |> 
  summarize(
    .groups = "drop_last",
    dawn = dawn |> as.numeric() |> mean(na.rm = TRUE)/3600,
    dusk = dusk |> as.numeric() |> mean(na.rm = TRUE)/3600
  ) |> 
  mutate(
    photoperiod.state = ifelse(between(Time, dawn, dusk), "day", "night")
  ) |> 
  select(-c(dawn, dusk)) |> 
  left_join(H2_site_data, by = c("site", "Time", "photoperiod.state"))

H2_site_data_phot <- 
H2_site_data |> 
  number_states(photoperiod.state) |> 
  filter(photoperiod.state == "night") |> 
  group_by(site, photoperiod.state.count) |> 
  summarize(xmin = min(Time),
            xmax = max(Time),
            .groups = "drop_last") |> 
    site_conv_mutate(rev = FALSE)
In [29]:
Code
Visualize photoperiod model H2
#| filename: Visualize photoperiod model H2
#| fig-height: 9
#| fig-width: 10
source("scripts/Brown_bracket.R")
Loading required package: legendry
Loading required package: ggtext
Code
#| filename: Visualize photoperiod model H2
#| fig-height: 9
#| fig-width: 10
H2_pred <- 
H2_site_data |> 
  site_conv_mutate(rev = FALSE) |>
  ggplot(aes()) +
  map(c(1,10,250, 10^4), 
          \(x) geom_hline(aes(yintercept = x), 
                          col = "grey", linetype = "dashed")
      ) +
  geom_rect(data = H2_site_data_phot ,
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(x = Time, ymin = .lower_ci, ymax = .upper_ci, fill = site), alpha = 0.5) +
  geom_line(aes(x = Time, y = .fitted, col = site), linewidth = 0.1) +
  geom_line(aes(x = Time, y = .fitted), linewidth = 1) +
  scale_fill_manual(values = melidos_colors) +
  scale_color_manual(values = melidos_colors) +
  theme_cowplot() +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,10^4), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)", fill = "Site", col = "Site") +
  facet_wrap(~site) +
  # guides( y = guide_axis_stack(Brown_bracket, "axis")) +
  gghighlight(
    label_key = site,
    label_params = list(x = 12, y = 4.5, force = 0, hjust = 0.25)
  ) +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple())
  # labs(caption = Brown_caption)
In [30]:
Code
Visualize Individual
#| filename: Visualize Individual
#| fig-height: 5
#| fig-width: 8.5
P_individual <- 
H2_model_phot |> 
  conditional_values(
    condition = list("Time", "Id"),
    exclude = c("s(Time,site)", "s(Id_date)", "s(Time,photoperiod.state)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  ) |> 
  group_by(Id) |>
  select(-site) |> 
  separate_wider_delim(Id, delim = "_", names = c("site","Id")) |> 
  site_conv_mutate() |> 
  ggplot(aes(x = Time)) +
  map(c(1,10,250, 10^4), 
          \(x) geom_hline(aes(yintercept = x), col = "grey", linetype = "dashed")
      ) +
  # geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, group = Id), alpha = 0.025) +
  geom_line(aes(y = .fitted, group = interaction(Id, site), color = site), 
            linewidth = 1, alpha = 0.35) +
  theme_cowplot() +
  scale_color_manual(values = melidos_colors) +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,1.5*10^4), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)") +
  guides( y = guide_axis_stack(Brown_bracket, "axis"), color = "none") +
  theme(plot.caption = element_textbox_simple())
  # labs(caption = Brown_caption)
In [31]:
Code
prepare time model
#| filename: prepare time model
H2_data <- 
H2_model_phot |> 
  conditional_values(
    condition = list(Time = seq(0.25, 23.75, by = 0.5), "photoperiod.state"),
    exclude = c("s(Time,site)", "s(Time,Id)", "s(Id_date)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  )

H2_data <-
metric_participanthour |> 
  Datetime2Time() |> 
  group_by(Time) |> 
  summarize(
    .groups = "drop_last",
    dawn = dawn |> as.numeric() |> mean(na.rm = TRUE)/3600,
    dusk = dusk |> as.numeric() |> mean(na.rm = TRUE)/3600
  ) |> 
  mutate(
    photoperiod.state = ifelse(between(Time, dawn, dusk), "day", "night")
  ) |> 
  select(-c(dawn, dusk)) |> 
  left_join(H2_data, by = c("Time", "photoperiod.state"))

H2_data_phot <- 
H2_data |> 
  number_states(photoperiod.state) |> 
  filter(photoperiod.state == "night") |> 
  group_by(photoperiod.state.count) |> 
  summarize(xmin = min(Time),
            xmax = max(Time),
            .groups = "drop_last")
In [32]:
Code
Visualize Time
#| filename: Visualize Time
#| fig-height: 5
#| fig-width: 8.5
P_time <- 
H2_data |> 
  ggplot(aes(x = Time)) +
  map(c(1,10, 100, 250, 10^3), 
          \(x) geom_hline(aes(yintercept = x), col = "grey", linetype = "dashed")
      ) +
  geom_rect(data = H2_data_phot ,
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci), alpha = 0.5) +
  geom_line(aes(y = .fitted), 
            linewidth = 1, alpha = 1) +
  theme_cowplot() +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,1.5*10^3), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)") +
  guides( y = guide_axis_stack(Brown_bracket, "axis"), color = "none") +
  theme(plot.caption = element_textbox_simple())
  # labs(caption = Brown_caption)
In [33]:
Code
Visualize model terms
#| filename: Visualize model terms

P_terms <-
H2_model_phot |> 
  smooth_estimates(select = "s(Time,site)") |> 
  mutate(.lower_ci = .estimate -1.96*.se,
         .upper_ci = .estimate +1.96*.se,
         .fitted = .estimate,
         signif = .lower_ci > 0 | .upper_ci < 0,
         signif_id = consecutive_id(signif),
         .by = site) |> 
  site_conv_mutate(rev = FALSE) |>
  ggplot(aes()) +
  map(c(log10(c(0.1, 0.2, 0.5, 2, 5, 10))),
      \(x) geom_hline(yintercept = x, linewidth = 0.5, col = "grey", linetype = 2)) +
  geom_rect(data = H2_site_data_phot ,
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(x = Time, ymin = .lower_ci, ymax = .upper_ci, fill = site), alpha = 0.5) +
  geom_hline(yintercept = 0, linewidth = 1, linetype = 2) +
  geom_line(aes(x = Time, y = .fitted), 
            linewidth = 1) +
  geom_line(data = \(x) filter(x, signif),
            aes(x = Time, y = .fitted, group = signif_id), 
            col = "red", linewidth = 1) +
  scale_fill_manual(values = melidos_colors) +
  # scale_color_manual(values = c(melidos_colors)) +
  theme_cowplot() +
  scale_y_continuous(labels = expression(10^-1, 5^-1, 2^-1, "1  ", "2  ", "5  ", "10  "), 
                     breaks = log10(c(0.1, 0.2, 0.5, 1, 2, 5, 10)))+
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), expand = FALSE) +
  guides(fill = "none") +
  labs(y = "Factor", x = "Local time (hrs)", fill = "Site") +
  facet_wrap(~site) +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple())
In [34]:
Code
Combination of partial plots H2
#| fig-height: 17
#| fig-width: 13
#| filename: Combination of partial plots H2

((P_time / P_individual) | (H2_pred / P_terms)) +
  plot_annotation(tag_levels = "A", caption = Brown_caption,
                  theme = theme(plot.caption = element_textbox_simple())) +
  plot_layout()

Code
#| fig-height: 17
#| fig-width: 13
#| filename: Combination of partial plots H2

ggsave("figures/Fig4.pdf", height = 17, width = 13)
ggsave("figures/Fig4.png", height = 17, width = 13)

Interpretation

There is an overall trend of time and photoperiod (panel A). Daytime melanopic EDI light levels are < 100 lx in the morning and late afternoon, and between 100 and about 200 lx between late morning and early afternoon. During the evening, light levels are below 10 lx melanopic EDI, and generally below 1 lx at night.

Individual deviations from this trend vary greatly, especially during the day (panel B)

Model predictions (panel C) and deviations from the global trend (panel D) of each site shows specific differences:

  • Delft (NL), Tübingen (DE), and San José (CR) do not show significant deviations from the global trend. In the case of San José (CR) this is likely due to the small number of participants, as can be seen by the wide confidence interval compared to the other sites

  • Boras (SE), Dortmund (DE), and partially Munich (DE) show similar patterns of higher mel EDI in the hours after dawn and before dusk. The difference is about a factor 2-5 higher compared to the global trend.

  • Madrid (ES), Kumasi (GH), and Izmir (TR) have a significantly reduced mel EDI around dawn (factor 2-5 reduction)

  • Madrid (ES) and Kumasi (GH) have significantly reduced mel EDI around dusk (factor 2-5 reduction)

  • Izmir (TR) show significantly hightened melanopic EDI in the late evening (factor 2-5 increase)

  • Kumasi (GH) has significantly lower daytime illuminance both in the afternoon and after dusk (factor 2-5 reduction)

Session info

In [35]:
Code cell - Session info
#| code-summary: Code cell - Session info
sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin20
Running under: macOS 26.3.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] ggtext_0.1.2        legendry_0.2.4      gtsummary_2.5.0    
 [4] gghighlight_0.5.0   glmmTMB_1.1.14      gt_1.3.0           
 [7] broom.mixed_0.2.9.7 rlang_1.2.0         patchwork_1.3.2    
[10] emmeans_2.0.3       gratia_0.11.2       itsadug_2.5        
[13] plotfunctions_1.5   broom_1.0.12        ggridges_0.5.7     
[16] cowplot_1.2.0       performance_0.16.0  lme4_2.0-1         
[19] Matrix_1.7-3        mgcv_1.9-4          nlme_3.1-168       
[22] LightLogR_0.10.2    melidosData_1.0.6   lubridate_1.9.5    
[25] forcats_1.0.1       stringr_1.6.0       dplyr_1.2.1        
[28] purrr_1.2.2         readr_2.2.0         tidyr_1.3.2        
[31] tibble_3.3.1        ggplot2_4.0.2       tidyverse_2.0.0    

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3   rstudioapi_0.18.0    DHARMa_0.4.7        
  [4] jsonlite_2.0.0       datawizard_1.3.0     magrittr_2.0.5      
  [7] estimability_1.5.1   farver_2.1.2         nloptr_2.2.1        
 [10] rmarkdown_2.31       ragg_1.5.2           fs_2.0.1            
 [13] vctrs_0.7.3          minqa_1.2.8          base64enc_0.1-6     
 [16] effectsize_1.0.2     htmltools_0.5.9      curl_7.0.0          
 [19] haven_2.5.5          sass_0.4.10          parallelly_1.47.0   
 [22] htmlwidgets_1.6.4    pbkrtest_0.5.5       sandwich_3.1-1      
 [25] zoo_1.8-15           TMB_1.9.21           commonmark_2.0.0    
 [28] lifecycle_1.0.5      pkgconfig_2.0.3      webshot2_0.1.2      
 [31] R6_2.6.1             fastmap_1.2.0        rbibutils_2.4.1     
 [34] future_1.70.0        digest_0.6.39        numDeriv_2016.8-1.1 
 [37] furrr_0.4.0          textshaping_1.0.5    labeling_0.4.3      
 [40] timechange_0.4.0     compiler_4.5.0       withr_3.0.2         
 [43] S7_0.2.1-1           backports_1.5.1      MASS_7.3-65         
 [46] tools_4.5.0          chromote_0.5.1       otel_0.2.0          
 [49] glue_1.8.0           promises_1.5.0       gridtext_0.1.6      
 [52] grid_4.5.0           see_0.13.0           generics_0.1.4      
 [55] gtable_0.3.6         labelled_2.16.0      tzdb_0.5.0          
 [58] websocket_1.4.4      hms_1.1.4            xml2_1.5.2          
 [61] utf8_1.2.6           ggrepel_0.9.8        pillar_1.11.1       
 [64] markdown_2.0         later_1.4.8          splines_4.5.0       
 [67] lattice_0.22-7       renv_1.1.4           tidyselect_1.2.1    
 [70] juicyjuice_0.1.0     knitr_1.51           reformulas_0.4.4    
 [73] V8_8.1.0             litedown_0.9         xfun_0.57           
 [76] statmod_1.5.1        stringi_1.8.7        yaml_2.3.12         
 [79] boot_1.3-31          ggokabeito_0.1.0     evaluate_1.0.5      
 [82] codetools_0.2-20     cli_3.6.6            systemfonts_1.3.2   
 [85] tweedie_3.0.17       xtable_1.8-8         parameters_0.28.3   
 [88] Rdpack_2.6.6         processx_3.8.7       mirai_2.6.1         
 [91] Rcpp_1.1.1-1         globals_0.19.1       coda_0.19-4.1       
 [94] nanonext_1.8.2       parallel_4.5.0       bayestestR_0.17.0   
 [97] mvnfast_0.2.8        listenv_0.10.1       broom.helpers_1.22.0
[100] mvtnorm_1.3-7        scales_1.4.0         insight_1.5.0       
[103] cards_0.7.1