Back to Article
Research Question 3
Download Source

Is personal light exposure related to participant characteristics?

Author

Johannes Zauner

Last modified:

April 15, 2026

Preface

This document contains the analysis for Research Question 3, as defined in the preregistration.

Research Question 1:
Is personal light exposure related to participant characteristics?

H8: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8)

H9: Timing-based metrics of personal light exposure are significantly associated with chronotype.

H10: Personal light exposure metrics depend on age and sex

H11: Personal light exposure patterns depend on sex

Setup

In this section, we load in the preprocessed and prepared data, site data, and load required packages.

In [1]:
Code cell - Setup
library(tidyverse)
Warning: package 'ggplot2' was built under R version 4.5.2
Warning: package 'tibble' was built under R version 4.5.2
Warning: package 'tidyr' was built under R version 4.5.2
Warning: package 'readr' was built under R version 4.5.2
Warning: package 'purrr' was built under R version 4.5.2
Warning: package 'dplyr' was built under R version 4.5.2
Warning: package 'lubridate' was built under R version 4.5.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.1     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Code cell - Setup
library(melidosData)
library(LightLogR)
Warning: package 'LightLogR' was built under R version 4.5.2
Code cell - Setup
library(mgcv)
Loading required package: nlme

Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

This is mgcv 1.9-4. For overview type '?mgcv'.
Code cell - Setup
library(lme4)
Warning: package 'lme4' was built under R version 4.5.2
Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'lme4'

The following object is masked from 'package:nlme':

    lmList
Code cell - Setup
library(performance)
Warning: package 'performance' was built under R version 4.5.2
Code cell - Setup
library(cowplot)

Attaching package: 'cowplot'

The following object is masked from 'package:lubridate':

    stamp
Code cell - Setup
library(ggridges)
library(ggtext)
library(sjPlot)

Attaching package: 'sjPlot'

The following objects are masked from 'package:cowplot':

    plot_grid, save_plot

The following object is masked from 'package:ggplot2':

    set_theme
Code cell - Setup
library(broom)
Warning: package 'broom' was built under R version 4.5.2
Code cell - Setup
library(itsadug)
Warning: package 'itsadug' was built under R version 4.5.2
Loading required package: plotfunctions
Warning: package 'plotfunctions' was built under R version 4.5.2

Attaching package: 'plotfunctions'

The following object is masked from 'package:ggplot2':

    alpha

Loaded package itsadug 2.5 (see 'help("itsadug")' ).
Code cell - Setup
library(gratia)
Warning: package 'gratia' was built under R version 4.5.2

Attaching package: 'gratia'

The following object is masked from 'package:itsadug':

    dispersion

The following object is masked from 'package:stringr':

    boundary
Code cell - Setup
library(emmeans)
Warning: package 'emmeans' was built under R version 4.5.2
Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'
Code cell - Setup
library(patchwork)

Attaching package: 'patchwork'

The following object is masked from 'package:cowplot':

    align_plots
Code cell - Setup
library(rlang)
Warning: package 'rlang' was built under R version 4.5.2

Attaching package: 'rlang'

The following objects are masked from 'package:purrr':

    flatten, flatten_chr, flatten_dbl, flatten_int, flatten_lgl,
    flatten_raw, invoke, splice
Code cell - Setup
library(broom.mixed)
Warning: package 'broom.mixed' was built under R version 4.5.2
Code cell - Setup
library(gt)
Warning: package 'gt' was built under R version 4.5.2

Attaching package: 'gt'

The following object is masked from 'package:cowplot':

    as_gtable
Code cell - Setup
library(performance)
library(glmmTMB)
Warning: package 'glmmTMB' was built under R version 4.5.2
Warning in check_dep_version(dep_pkg = "TMB"): package version mismatch: 
glmmTMB was built with TMB package version 1.9.19
Current TMB package version is 1.9.21
Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)
Code cell - Setup
library(gghighlight)
library(gtsummary)
Warning: package 'gtsummary' was built under R version 4.5.2
Code cell - Setup
source("scripts/fitting.R")
source("scripts/summaries.R")
# source("scripts/RQ3d_specific.R")
In [2]:
Code
load("data/metrics_glasses.RData")
load("data/preprocessed_glasses_2.RData")
metrics <- metrics_glasses
hourly_data <-  light_glasses_processed2
H8 <- "Hypothesis: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8)."
H9 <- "Hypothesis: Timing-based metrics of personal light exposure are significantly associated with chronotype."
H10 <- "Hypothesis: Personal light exposure metrics depend on age and sex."
H11 <- "Hypothesis: Personal light exposure patterns depend on sex."
sig.level <- 0.05

#for chest:
# load("data/metrics_chest.RData")
# metrics <- metrics_chest
# melidos_cities <- melidos_cities[names(melidos_cities)!="MPI"]

Hypothesis 8

Details

Hypothesis 3 states:

Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8)

Type of model: GLMM

Base formula: \(\text{metric} \sim \text{VLSQ-8}*\text{Site} + (1|\text{Participant})\)

Notes: - Site was sum-coded, so as to calculate an average effect of site - Error family depends on the metric

Specific formulas: - \(\text{mel EDI} \sim \text{Site} + (1|\text{Site}:\text{Participant})\) (Null-model) - \(\text{mel EDI} \sim \text{VLSQ-8} + \text{Site} + (1|\text{Site}:\text{Participant})\) (Null-model interaction) - \(\text{mel EDI} \sim \text{VLSQ-8} + (1|\text{Participant})\) (Null-model site)

False-discovery-rate correction: - 9 metrics with one confirmatory analysis: n=9 - For sites (interaction), n=9

Preparation

In [3]:
Code
Load VLSQ-8 data
vlsq8 <- load_data("vlsq8") |> flatten_data()
In [4]:
Code cell - Model setup based on distribution
model_tbl <- tibble::tribble(
    ~name, ~engine, ~ response, ~family,
    "interdaily_stability", "lm", "qlogis(metric)", gaussian(),
    "intradaily_variability", "lm", "metric", gaussian(),
    "Mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "dose", "lmer", "log_zero_inflated(metric)", gaussian(),
    "MDER", "lmer", "metric", gaussian(),
    "darkest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "duration_above_1000", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_above_250", "lmer", NA, gaussian(),
    "duration_below_10", "lmer", NA, gaussian(),
    "duration_below_1", "lmer", NA, gaussian(),
    "duration_above_250_wake", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_10_pre-sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_1_sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "period_above_250", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_midpoint", "lmer", "metric", gaussian(),
    "brightest_10h_onset", "lmer", NA, gaussian(),
    "brightest_10h_offset", "lmer", NA, gaussian(),
    "darkest_10h_midpoint", "lmer", "metric", gaussian(),
    "darkest_10h_onset", "lmer", NA, gaussian(),
    "darkest_10h_offset", "lmer", NA, gaussian(),
    "mean_timing_above_250", "lmer", "metric", gaussian(),
    "first_timing_above_250", "lmer", "metric", gaussian(),
    "last_timing_above_250", "lmer", "metric", gaussian()
  )
In [5]:
Code cell - Formula setup for inference
H8_data <-
  metrics |> 
  filter(metric_type %in% c("duration", "exposure history", "level")) |> 
  filter_out(name == "duration_above_250") |> 
  mutate(
    H8_1 = "~ site*VLSQ8 + (1|Id)",
    H8_0 = "~ site + (1|Id)",
    H8_ni = "~ site + VLSQ8 + (1|Id)",
    H8_ns = "~ VLSQ8 + (1|Id)",
         ) |> 
  left_join(model_tbl, by = "name")

H8_formulas <-  names(H8_data) |> subset(grepl("H8_", names(H8_data)))

H8_data <- 
  H8_data |> relocate(all_of(H8_formulas), .after = family)
H8_fdr_n <- H8_data |> filter(!is.na(response)) |> nrow()
In [6]:
Code
H8_data <-
  H8_data |> 
  mutate(data = data |> map(\(x) x |> left_join(vlsq8, by = c("site", "Id"))))
In [7]:
Code
H8_data |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 8: metric model specifications",
             subtitle = H8) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=9 comparisons") |> 
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H8_1, H8_0))) |> 
  tab_footnote("These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects", locations = cells_column_labels(c(H8_ni, H8_ns)))
Hypothesis 8: metric model specifications
Hypothesis: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8).
Name Type Engine Response Family H8 11 H8 01 H8 ni2 H8 ns2
duration
1 Duration above 1000 participant-day glmmTMB metric tweedie ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
2 Duration above 250 wake participant-day glmmTMB metric tweedie ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
3 Duration below 10 pre-Sleep participant-day glmmTMB metric tweedie ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
4 Duration below 1 sleep participant-day glmmTMB metric tweedie ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
5 Period above 250 participant-day lmer log_zero_inflated(metric) gaussian ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
exposure history
6 Dose participant-day lmer log_zero_inflated(metric) gaussian ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
level
7 Mean participant-day lmer log_zero_inflated(metric) gaussian ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
8 Brightest 10h mean participant-day lmer log_zero_inflated(metric) gaussian ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
9 Darkest 10h mean participant-day lmer log_zero_inflated(metric) gaussian ~ site*VLSQ8 + (1|Id) ~ site + (1|Id) ~ site + VLSQ8 + (1|Id) ~ VLSQ8 + (1|Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=9 comparisons
1 These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful
2 These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects
In [8]:
Code cell - Formula creation
H8_data <-
  H8_data |> 
  mutate(across(all_of(H8_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )
In [9]:
Code
H8_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = VLSQ8, y = metric)) + 
  geom_point() + 
  facet_wrap(~name, scales = "free")
Warning: Removed 108 rows containing missing values or values outside the scale range
(`geom_point()`).

Analysis

Fit models

In [10]:
Code cell - Model fit
H8_data <-
  H8_data |> 
  mutate(
    across(all_of(H8_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )

Model diagnostics are omitted here. We checked in RQ1 for sensible base models

Model comparisons

In [11]:
Code cell - Model comparisons
H8_data <- 
  H8_data |> 
  mutate(
    H8_1_comp = map2(H8_0_model, H8_1_model, model_anova),
    H8_1_p = map2(H8_1_comp, engine, model_comp_p, n = H8_fdr_n) |> list_c(),
    H8_1_sig = H8_1_p <= sig.level,
  
    H8_i_comp = map2(H8_ni_model, H8_1_model, model_anova),
    H8_i_p = map2(H8_i_comp, engine, model_comp_p, n = H8_fdr_n) |> list_c(),
    H8_i_sig = H8_i_p <= sig.level,
  
    H8_s_comp = map2(H8_ns_model, H8_ni_model, model_anova),
    H8_s_p = map2(H8_s_comp, engine, model_comp_p, n = H8_fdr_n) |> list_c(),
    H8_s_sig = H8_s_p <= sig.level,
  )
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
In [12]:
Code
H8_data |> 
  select(metric_type, name, H8_1_p, H8_1_sig) |> 
  mutate(H8_1_p = style_pvalue(H8_1_p)) |> 
  gt() |> 
  tab_header(
    "H8: Model results",
    subtitle = H8
  )
H8: Model results
Hypothesis: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8).
metric_type name H8_1_p H8_1_sig
level Mean >0.9 FALSE
level brightest_10h_mean >0.9 FALSE
level darkest_10h_mean >0.9 FALSE
duration duration_above_1000 >0.9 FALSE
duration period_above_250 >0.9 FALSE
exposure history dose >0.9 FALSE
duration duration_below_10_pre-sleep >0.9 FALSE
duration duration_below_1_sleep >0.9 FALSE
duration duration_above_250_wake >0.9 FALSE

Interpretation

None of the light exposure metrics are significantly associated with VLSQ-8 scores

Hypothesis 9

Details

Hypothesis 9 states:

Timing-based metrics of personal light exposure are significantly associated with chronotype.

Type of model: GLMM

Base formula: \(\text{metric} \sim \text{Chronotype}*\text{Site} + (1|\text{Participant})\)

Notes: - Site was sum-coded, so as to calculate an average effect of site - Chronotype is operationalized through the MCTQ-score \(msf_{sc}\) (Mid-sleep on free days. Sleep corrected) - Error family depends on the metric

Specific formulas: - \(\text{mel EDI} \sim \text{Site} + (1|\text{Site}:\text{Participant})\) (Null-model) - \(\text{mel EDI} \sim \text{Chronotype} + \text{Site} + (1|\text{Site}:\text{Participant})\) (Null-model interaction) - \(\text{mel EDI} \sim \text{Chronotype} + (1|\text{Participant})\) (Null-model site)

False-discovery-rate correction: - 5 metrics with one confirmatory analysis: n=5 - For sites (interaction), n=9

Preparation

In [13]:
Code
Load chronotype data
ct <- load_data("chronotype") |> flatten_data()
loading modality: chronotype ■■■■■■■■■■■■■■■■■■■■■■■■■■■■      89% |  ETA:  0s
Code
ct$msf_sc |> range(na.rm = TRUE) |> hms::as_hms()
01:15:42.857143
08:07:51.428571
Code
ct <- ct |> mutate(msf_sc = as.numeric(msf_sc)/3600)

Since all msf_sc values are after midnight, we will use linear modelling instead of circular.

In [14]:
Code cell - Formula setup for inference
H9_data <-
  metrics |> 
  filter(metric_type %in% c("timing")) |> 
  filter_out(str_detect(name, "onset|offset")) |> 
  mutate(
    H9_1 = "~ site*msf_sc + (1|Id)",
    H9_0 = "~ site + (1|Id)",
    H9_00 = "~ 1 + (1|Id)",
    H9_ni = "~ site + msf_sc + (1|Id)",
    H9_ns = "~ msf_sc + (1|Id)",
         ) |> 
  left_join(model_tbl, by = "name")

H9_formulas <-  names(H9_data) |> subset(grepl("H9_", names(H9_data)))

H9_data <- 
  H9_data |> relocate(all_of(H9_formulas), .after = family)
H9_fdr_n <- H9_data |> filter(!is.na(response)) |> nrow()
In [15]:
Code
H9_data <-
  H9_data |> 
  mutate(data = data |> map(\(x) x |> left_join(ct, by = c("site", "Id")) |> 
                              drop_na(msf_sc)))
In [16]:
Code
H9_data |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 9: metric model specifications",
             subtitle = H9) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=5 comparisons") |> 
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H9_ni, H9_00))) |> 
  tab_footnote("These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects", locations = cells_column_labels(c(H9_1, H9_ns, H9_0)))
Hypothesis 9: metric model specifications
Hypothesis: Timing-based metrics of personal light exposure are significantly associated with chronotype.
Name Type Engine Response Family H9 11 H9 01 H9 002 H9 ni2 H9 ns1
timing
1 Brightest 10h midpoint participant-day lmer metric gaussian ~ site*msf_sc + (1|Id) ~ site + (1|Id) ~ 1 + (1|Id) ~ site + msf_sc + (1|Id) ~ msf_sc + (1|Id)
2 Darkest 10h midpoint participant-day lmer metric gaussian ~ site*msf_sc + (1|Id) ~ site + (1|Id) ~ 1 + (1|Id) ~ site + msf_sc + (1|Id) ~ msf_sc + (1|Id)
3 First timing above 250 participant-day lmer metric gaussian ~ site*msf_sc + (1|Id) ~ site + (1|Id) ~ 1 + (1|Id) ~ site + msf_sc + (1|Id) ~ msf_sc + (1|Id)
4 Last timing above 250 participant-day lmer metric gaussian ~ site*msf_sc + (1|Id) ~ site + (1|Id) ~ 1 + (1|Id) ~ site + msf_sc + (1|Id) ~ msf_sc + (1|Id)
5 Mean timing above 250 participant-day lmer metric gaussian ~ site*msf_sc + (1|Id) ~ site + (1|Id) ~ 1 + (1|Id) ~ site + msf_sc + (1|Id) ~ msf_sc + (1|Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=5 comparisons
1 These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects
2 These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful
In [17]:
Code cell - Formula creation
H9_data <-
  H9_data |> 
  mutate(across(all_of(H9_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )
In [18]:
Code
H9_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = msf_sc, y = metric)) + 
  geom_point() + 
  facet_wrap(~name, scales = "free") +
  geom_smooth()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Warning: Removed 99 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 99 rows containing missing values or values outside the scale range
(`geom_point()`).

Analysis

Fit models

In [19]:
Code cell - Model fit
H9_data <-
  H9_data |> 
  mutate(
    across(all_of(H9_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )

Model diagnostics are omitted here. We checked in RQ1 for sensible base models

In [20]:
Code cell - Model comparisons
H9_data <- 
  H9_data |> 
  mutate(
    H9_1_comp = map2(H9_00_model, H9_ns_model, model_anova),
    H9_1_p = map2(H9_1_comp, engine, model_comp_p, n = H9_fdr_n) |> list_c(),
    H9_1_sig = H9_1_p <= sig.level,
  
    H9_i_comp = map2(H9_ni_model, H9_1_model, model_anova),
    H9_i_p = map2(H9_i_comp, engine, model_comp_p, n = H9_fdr_n) |> list_c(),
    H9_i_sig = H9_i_p <= sig.level,
  
    H9_s_comp = map2(H9_ns_model, H9_ni_model, model_anova),
    H9_s_p = map2(H9_s_comp, engine, model_comp_p, n = H9_fdr_n) |> list_c(),
    H9_s_sig = H9_s_p <= sig.level,
    
    H9_model = pmap(
      list(H9_1_sig, H9_i_sig, H9_s_sig, H9_00_model, H9_0_model, H9_ns_model, H9_ni_model, H9_1_model), 
      \(sig1, sig2, sig3, model00, model_site, model_ct, model_both, model_interaction) {
        if(all(sig1, sig2, sig3)) {
          model <- model_interaction 
        } else if(all(sig1, sig3) & !sig2) {
            model <- model_both
        } else if(sig1 & !sig2 & !sig3) {
            model <- model_ct
        } else if(!sig1 & sig3 & !sig2) {
            model <- model_site
          } else model <- model00
          model
      }
      ),
    
    marginals = map(H9_ns_model, \(x) emtrends(x, ~ msf_sc, var = "msf_sc")),
    marginals = map2(H9_1_sig, marginals,
                     \(sig, marginal) {
                        if(sig) return(marginal |> as.tibble()) else return(tibble(msf_sc = NA))
                          }
    )
  )
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `marginals = map2(...)`.
Caused by warning:
! `as.tibble()` was deprecated in tibble 2.0.0.
ℹ Please use `as_tibble()` instead.
ℹ The signature and semantics have changed, see `?as_tibble`.
In [21]:
Code
H9_data <- 
H9_data |> 
  mutate(
    H9_r2 = pmap(list(H9_ns_model, data, family, response), r2_helper),
    H9_marginal = map_dbl(H9_r2, \(x) x$R2_marginal),
    H9_marginal = ifelse(H9_1_sig, H9_marginal, NA)
  ) 
# |> pull(H9_r2)

Basic table

In [22]:
Code
H9_tab <- 
H9_data |> 
  select(
    metric_type, name, H9_1_sig, H9_1_p, marginals, H9_marginal
  ) |> 
  unnest(marginals) |> 
  gt() |> 
  fmt_number() |>
    sub_missing(H9_marginal) |>
  fmt(H9_1_p, fns = style_pvalue) |>
  cols_merge(
    c(msf_sc.trend, SE, lower.CL, upper.CL),
    pattern = "<<**{1}** ±{2} ({3} - {4})>>"
  ) |> 
  fmt(name, fns = \(x) x |> str_replace_all("_", " ") |> str_to_sentence()) |> 
  as.data.frame() |> 
  gt() |> 

  tab_style(style = cell_text(weight = "bold"),
            locations = cells_body(H9_1_p, H9_1_sig == "TRUE")) |> 
  cols_hide(c(H9_1_sig, df, msf_sc, metric_type, SE, lower.CL, upper.CL)) |> 
  cols_label(
    H9_1_p = "p-value",
    msf_sc.trend = md("$MSF_{sc}$"),
    name = "Metric",
    H9_marginal = md("$R^2_{marginal}$")
  ) |> 
  fmt_markdown(msf_sc.trend) |> 
  tab_header(
    "H9: Model results",
    subtitle = H9
  ) |> 
  tab_footnote(
    "p-value adjustment for n=5 model comparisons with FDR correction",
    locations = cells_column_labels(H9_1_p)
  ) |> 
  sub_missing()
In [23]:
Code
H9_tab
H9: Model results
Hypothesis: Timing-based metrics of personal light exposure are significantly associated with chronotype.
Metric p-value1 \(MSF_{sc}\) \(R^2_{marginal}\)
Brightest 10h midpoint <0.001 0.31 ±0.07 (0.17 - 0.44) 0.05
Darkest 10h midpoint <0.001 0.40 ±0.07 (0.27 - 0.53) 0.08
Mean timing above 250 <0.001 0.32 ±0.07 (0.19 - 0.45) 0.06
First timing above 250 <0.001 0.44 ±0.09 (0.26 - 0.62) 0.06
Last timing above 250 0.12
1 p-value adjustment for n=5 model comparisons with FDR correction
Code
H9_tab |> gtsave("tables/H9.png")
file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpLH6dtc/file14bf5286af4b9.html screenshot completed

Visualization

In [24]:
Code
H9_plot1 <- 
H9_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = FALSE) |> 
  filter_out(name == "last_timing_above_250") |> 
  ggplot(aes(x = msf_sc, y = metric)) + 
  geom_point(aes(color = site), alpha = 0.5) + 
  facet_wrap(~name, scales = "free") +
  guides(color = "none") +
  theme_cowplot() +
  labs(y = "Local time of day (hr, 0 is midnight)", x = "MSF_sc") +
  geom_smooth(method = "lm") +
  scale_color_manual(values = melidos_colors)

H9_plot2 <- 
H9_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = TRUE) |> 
  filter_out(name == "last_timing_above_250") |> 
  ggplot(aes(x = msf_sc, y = site, color = site)) + 
  geom_boxplot() + 
  theme_cowplot() +
  guides(color = "none") +
  labs(y = NULL, x = "MSF_sc") +
  scale_color_manual(values = melidos_colors)

H9_plot1 + H9_plot2 + plot_layout(widths = c(3,2)) + plot_annotation(tag_levels = "A")
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 66 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 66 rows containing missing values or values outside the scale range
(`geom_point()`).

Code
ggsave("figures/Fig7.pdf", width = 10, height = 5)
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 66 rows containing non-finite outside the scale range
(`stat_smooth()`).
Removed 66 rows containing missing values or values outside the scale range
(`geom_point()`).
In [25]:
Code
lm(msf_sc ~ site, data = ct, contrasts = list(site = "contr.sum")) |> 
  emmeans(~ site) |> contrast()
 contrast       estimate    SE  df t.ratio p.value
 BAUA effect      -0.502 0.207 176  -2.424  0.0210
 FUSPCEU effect    0.762 0.215 176   3.548  0.0011
 IZTECH effect     1.186 0.240 176   4.937 <0.0001
 KNUST effect     -1.711 0.254 176  -6.738 <0.0001
 MPI effect        0.421 0.200 176   2.104  0.0414
 RISE effect      -0.907 0.240 176  -3.775  0.0007
 THUAS effect      0.206 0.254 176   0.812  0.4180
 TUM effect        1.001 0.305 176   3.277  0.0023
 UCR effect       -0.457 0.170 176  -2.681  0.0120

P value adjustment: fdr method for 9 tests 

Interpretation

  • Chronotype is an important dependency timing-based metrics. Per one hour later MSF_sc, participants experience important aspects of the day about 0.31 to 0.44 hours later in the day, including the brightest and darkest 10h midpoint, mean and first timing about 250lx melanopic EDI. The last timing above 250 lx melanopic EDI was not significant
  • Chronotype is significantly different between sites

Hypothesis 10

Details

Hypothesis 10 states:

Personal light exposure metrics depend on age and sex

Type of model: GLMM

Base formulas: - \(\text{mel EDI} \sim \text{sex/age} + \text{Site} + (1|\text{Participant})\)

Notes: - Site was sum-coded, so as to calculate an average effect of site - Error family depends on the metric

Specific formulas: - \(\text{mel EDI} \sim \text{Site} + (1|\text{Participant})\) (Null-model) - \(\text{metric} \sim \text{sex/age}*\text{Site} + (1|\text{Participant})\) (Interaction-model)

False-discovery-rate correction: - 17 metrics with one confirmatory analysis times 2 variables: n=34 - For sites (interaction), n=9

Preparation

In [26]:
Code
load demographics
demographics <- load_data("demographics") |> flatten_data()
In [27]:
Code
lm(age ~ site, data = demographics) |> emmeans(~ site) |> contrast()
 contrast       estimate   SE  df t.ratio p.value
 BAUA effect        4.72 1.85 182   2.552  0.0259
 FUSPCEU effect     3.57 1.88 182   1.898  0.1066
 IZTECH effect     -6.30 2.15 182  -2.935  0.0113
 KNUST effect      -7.95 2.27 182  -3.501  0.0026
 MPI effect        -2.95 1.79 182  -1.648  0.1456
 RISE effect        9.41 2.15 182   4.381  0.0002
 THUAS effect       2.30 2.00 182   1.152  0.2823
 TUM effect        -4.35 2.73 182  -1.591  0.1456
 UCR effect         1.54 1.52 182   1.013  0.3122

P value adjustment: fdr method for 9 tests 
Code
demographics |> ggplot(aes(x = site, y = age)) + geom_boxplot()

In [28]:
Code
glm(sex ~ site, data = demographics, family = binomial()) |> emmeans(~ site) |> contrast()
 contrast       estimate    SE  df z.ratio p.value
 BAUA effect     0.00554 0.394 Inf   0.014  0.9888
 FUSPCEU effect -0.45601 0.417 Inf  -1.093  0.6631
 IZTECH effect  -0.43354 0.475 Inf  -0.913  0.6631
 KNUST effect    0.03906 0.483 Inf   0.081  0.9888
 MPI effect      0.01844 0.381 Inf   0.048  0.9888
 RISE effect     0.77873 0.475 Inf   1.640  0.6631
 THUAS effect    0.57806 0.433 Inf   1.336  0.6631
 TUM effect     -0.23287 0.591 Inf  -0.394  0.9888
 UCR effect     -0.29741 0.331 Inf  -0.899  0.6631

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: fdr method for 9 tests 
Code
demographics |> ggplot(aes(x = site, fill = sex)) + geom_bar(position = position_dodge())

age shows significant differences between sites, sex does not

In [29]:
Code cell - Formula setup for inference
H10_data <-
  metrics |> 
  left_join(model_tbl, by = "name") |> 
  filter_out(is.na(response)) |> 
  mutate(
    H10_1a = "~ site + age + (1|Id)",
    H10_1s = "~ site + sex + (1|Id)",
    H10_0 = "~ site + (1|Id)",
    H10_a = "~ age + (1|Id)",
    H10_00 = "~ 1 + (1|Id)",
    H10_ia = "~ site*age + (1|Id)",
    H10_is = "~ site*sex + (1|Id)",
    across(H10_1a:H10_is, 
           \(x) ifelse(engine == "lm", str_remove(x, " \\+ \\(1\\|Id\\)"), x)
             ),
         )

H10_formulas <-  names(H10_data) |> subset(grepl("H10_", names(H10_data)))

H10_data <- 
  H10_data |> relocate(all_of(H10_formulas), .after = family)
H10_fdr_n <- H10_data |> filter(!is.na(response)) |> nrow() *2
In [30]:
Code
H10_data |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 9: metric model specifications",
             subtitle = H10) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=34 comparisons") |>  
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H10_1s, H10_1a, H10_0))) |> 
  tab_footnote("These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects", locations = cells_column_labels(c(H10_is, H10_ia, H10_00)))
Hypothesis 9: metric model specifications
Hypothesis: Personal light exposure metrics depend on age and sex.
Name Type Engine Response Family H10 1a1 H10 1s1 H10 01 H10 a H10 002 H10 ia2 H10 is2
duration
1 Duration above 1000 participant-day glmmTMB metric tweedie ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
2 Duration above 250 wake participant-day glmmTMB metric tweedie ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
3 Duration below 10 pre-Sleep participant-day glmmTMB metric tweedie ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
4 Duration below 1 sleep participant-day glmmTMB metric tweedie ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
5 Period above 250 participant-day lmer log_zero_inflated(metric) gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
dynamics
6 Interdaily stability participant lm qlogis(metric) gaussian ~ site + age ~ site + sex ~ site ~ age ~ 1 ~ site*age ~ site*sex
7 Intradaily variability participant lm metric gaussian ~ site + age ~ site + sex ~ site ~ age ~ 1 ~ site*age ~ site*sex
exposure history
8 Dose participant-day lmer log_zero_inflated(metric) gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
level
9 Mean participant-day lmer log_zero_inflated(metric) gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
10 Brightest 10h mean participant-day lmer log_zero_inflated(metric) gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
11 Darkest 10h mean participant-day lmer log_zero_inflated(metric) gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
spectrum
12 MDER participant-day lmer metric gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
timing
13 Brightest 10h midpoint participant-day lmer metric gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
14 Darkest 10h midpoint participant-day lmer metric gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
15 First timing above 250 participant-day lmer metric gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
16 Last timing above 250 participant-day lmer metric gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
17 Mean timing above 250 participant-day lmer metric gaussian ~ site + age + (1|Id) ~ site + sex + (1|Id) ~ site + (1|Id) ~ age + (1|Id) ~ 1 + (1|Id) ~ site*age + (1|Id) ~ site*sex + (1|Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=34 comparisons
1 These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful
2 These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects
In [31]:
Code cell - Formula creation
H10_data <-
  H10_data |> 
  mutate(across(all_of(H10_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )
In [32]:
Code
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = age, y = metric)) + 
  geom_point() + 
  facet_wrap(~name, scales = "free") +
  geom_smooth()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Warning: Removed 293 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 293 rows containing missing values or values outside the scale range
(`geom_point()`).

In [33]:
Code
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = sex, y = metric)) + 
  geom_boxplot() + 
  facet_wrap(~name, scales = "free") +
  geom_smooth()
Warning: Removed 293 rows containing non-finite outside the scale range
(`stat_boxplot()`).
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Warning: Removed 293 rows containing non-finite outside the scale range
(`stat_smooth()`).

Analysis

Fit models

In [34]:
Code cell - Model fit
H10_data <-
  H10_data |> 
  mutate(
    across(all_of(H10_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )

Model diagnostics are omitted here. We checked in RQ1 for sensible base models

In [35]:
Code cell - Model comparisons
H10_data <- 
  H10_data |> 
  mutate(
    H10_1s_comp = map2(H10_0_model, H10_1s_model, model_anova),
    H10_1s_p = map2(H10_1s_comp, engine, model_comp_p, n = H10_fdr_n) |> list_c(),
    H10_1s_sig = H10_1s_p <= sig.level,
    
    H10_1a_comp = map2(H10_0_model, H10_1a_model, model_anova),
    H10_1a_p = map2(H10_1a_comp, engine, model_comp_p, n = H10_fdr_n) |> list_c(),
    H10_1a_sig = H10_1a_p <= sig.level,
  
    H10_ia_comp = map2(H10_1a_model, H10_ia_model, model_anova),
    H10_ia_p = map2(H10_ia_comp, engine, model_comp_p, n = H10_fdr_n) |> list_c(),
    H10_ia_sig = H10_ia_p <= sig.level,
    
    marginals = map(H10_1a_model, \(x) emtrends(x, ~ age, var = "age") |> as.tibble()),
  )
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
In [36]:
Code
H10_data |> 
  select(metric_type, name, ends_with(c("_p", "_sig"))) |> 
  gt() |> 
  fmt(where(is.numeric), fns = style_pvalue) |> 
  tab_style(cell_text(weight = "bold"),
            locations = cells_body(H10_1a_p, H10_1a_sig)) |> 
  cols_hide(ends_with("_sig")) |> 
  tab_header("H10: Overall model results",
             subtitle = H10) |> 
  tab_footnote("H10_1s_p: p-values for sex, H10_1a_p: p-values for age, H10_ia_p: p-values for interaction of age with site. p-values where adjusted with FDR method and n=34", locations = cells_column_labels(ends_with("_p")))
H10: Overall model results
Hypothesis: Personal light exposure metrics depend on age and sex.
metric_type name H10_1s_p1 H10_1a_p1 H10_ia_p1
dynamics interdaily_stability >0.9 >0.9 >0.9
dynamics intradaily_variability >0.9 0.8 >0.9
level Mean >0.9 >0.9 >0.9
level brightest_10h_mean >0.9 0.14 >0.9
timing brightest_10h_midpoint >0.9 >0.9 0.3
level darkest_10h_mean >0.9 >0.9 0.8
timing darkest_10h_midpoint >0.9 >0.9 >0.9
duration duration_above_1000 0.7 0.017 >0.9
duration period_above_250 >0.9 0.5 >0.9
timing mean_timing_above_250 >0.9 >0.9 >0.9
timing first_timing_above_250 >0.9 >0.9 >0.9
timing last_timing_above_250 >0.9 >0.9 >0.9
exposure history dose >0.9 0.2 0.5
spectrum MDER >0.9 >0.9 >0.9
duration duration_below_10_pre-sleep >0.9 >0.9 >0.9
duration duration_below_1_sleep >0.9 >0.9 >0.9
duration duration_above_250_wake >0.9 >0.9 >0.9
1 H10_1s_p: p-values for sex, H10_1a_p: p-values for age, H10_ia_p: p-values for interaction of age with site. p-values where adjusted with FDR method and n=34

After the initial testing, only a single model was significant. The next table summarizes this model

In [37]:
Code
H10_data <- 
H10_data |> 
  mutate(
    H10_r2 = pmap(list(H10_a_model, data, family, response), r2_helper),
    H10_marginal = map_dbl(H10_r2, \(x) x$R2_marginal),
    H10_marginal = ifelse(H10_1a_sig, H10_marginal, NA)
  ) 
# |> pull(H10_r2)

Basic table

In [38]:
Code
H10_tab <-
H10_data |> 
  filter(H10_1a_sig) |> 
  select(
    metric_type, name, H10_1a_sig, H10_1a_p, marginals, H10_marginal
  ) |> 
  unnest(marginals) |> 
  gt() |> 
  fmt_number(decimals = 3) |>
    sub_missing(H10_marginal) |>
  fmt(H10_1a_p, fns = style_pvalue) |>
  cols_merge(
    c(age.trend, SE, asymp.LCL, asymp.UCL),
    pattern = "<<**{1}** ±{2} ({3} - {4})>>"
  ) |> 
  fmt(name, fns = \(x) x |> str_replace_all("_", " ") |> str_to_sentence()) |> 
  as.data.frame() |> 
  gt() |> 

  tab_style(style = cell_text(weight = "bold"),
            locations = cells_body(H10_1a_p, H10_1a_sig == "TRUE")) |> 
  cols_hide(c(H10_1a_sig, df, age, metric_type, SE, asymp.LCL, asymp.UCL)) |> 
  cols_label(
    H10_1a_p = "p-value",
    age.trend = "Age",
    name = "Metric",
    H10_marginal = md("$R^2_{marginal}$")
  ) |> 
  fmt_markdown(age.trend) |> 
  tab_header(
    "H10: Model results",
    subtitle = H10
  ) |> 
  tab_footnote(
    "p-value adjustment for n=34 model comparisons with FDR correction",
    locations = cells_column_labels(H10_1a_p)
  ) |> 
  sub_missing()
In [39]:
Code
H10_tab
H10: Model results
Hypothesis: Personal light exposure metrics depend on age and sex.
Metric p-value1 Age \(R^2_{marginal}\)
Duration above 1000 0.017 0.024 ±0.007 (0.011 - 0.037) 0.074
1 p-value adjustment for n=34 model comparisons with FDR correction
Code
H10_tab |> gtsave("tables/H10.png")
file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpLH6dtc/file14bf5eb17da8.html screenshot completed

Visualization

In [40]:
Code
H10_plot1 <- 
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = FALSE) |> 
  filter(name == "duration_above_1000") |> 
  ggplot(aes(x = age, y = metric)) + 
  geom_point(aes(color = site), alpha = 0.5) + 
  facet_wrap(~name, scales = "free") +
  guides(color = "none") +
  theme_cowplot() +
  labs(y = "Local time of day (hr)", x = "Age (yrs)") +
  geom_smooth(method = "lm") +
  scale_color_manual(values = melidos_colors)

H10_plot2 <- 
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = TRUE) |> 
  filter_out(name == "last_timing_above_250") |> 
  ggplot(aes(x = age, y = site, color = site)) + 
  geom_boxplot() + 
  theme_cowplot() +
  guides(color = "none") +
  labs(y = NULL, x = "Age (yrs)") +
  scale_color_manual(values = melidos_colors)

H10_plot1 + H10_plot2 + plot_layout(widths = c(3,2)) + plot_annotation(tag_levels = "A")
`geom_smooth()` using formula = 'y ~ x'

Code
ggsave("figures/Fig8.pdf", width = 10, height = 5)
`geom_smooth()` using formula = 'y ~ x'

Interpretation

Overall, there are no strong connections between melanopic EDI based metrics and age or sex in these datasets. The execption is a strong dependency of age with Duration above 1000 lx melanopic EDI, where for every 10 years of age, about 15 minutes more daily time above 1000 lx is reached. This effect is marginally relevant, with about 7.4% of variance explained.

Hypothesis 11

Hypothesis 11 states:

Personal light exposure patterns depend on sex

Type of model: HGAM (Hierarchical GAM)

Base formula: \(\text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Sex}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant})\)

Additional notes: - Site effects are modelled with the sz basis, that allows for the most identifiable modelling of parametric differences - Participant effects are modelled with the fs basis, which models the participant-effects as random smooths - melanopic EDI is scaled with log_zero_inflated() before model fitting. - The modle is based on the result from H2

Specific formulas: - \(\text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant})\) (comparison to the base formula)

False-discovery-rate correction: 1 (there is only one main comparison that tests the effect of site)

Preparation

In [41]:
Code
Load prepared data
load("data/metrics_separate_glasses.RData")
load("data/preprocessed_glasses_2.RData")

metric_participanthour <-
  metric_glasses_participanthour |>
  add_Time_col() |> 
  ungroup() |> 
  mutate(Time = as.numeric(Time)/3600 + 0.25,
         across(c(site, Id, sleep, State.Brown, wear, photoperiod.state),
                fct),
         Date = factor(Date),
         Id_date = interaction(Id, Date),
         lzMEDI = log_zero_inflated(MEDI),
         photoperiod = (dusk - dawn) |> as.numeric()
  ) |> 
  group_by(Id) |> 
  mutate(AR.start = ifelse(row_number() == 1, TRUE, FALSE)) |> 
  ungroup() |> 
  left_join(demographics, by = c("site", "Id")) |> 
  mutate(across(c(site, Id), fct))
In [42]:
Code
Overview plot of the data (median with IQR)
metric_participanthour |>
  group_by(sex) |> 
  aggregate_Date(
    date.handler = \(x) ymd("2025-01-01"),
    numeric.handler = \(x) median(x, na.rm = TRUE),
    p75 = quantile(MEDI, p=0.75, na.rm = TRUE),
    p25 = quantile(MEDI, p=0.25, na.rm = TRUE)
  ) |> 
  gg_doubleplot(geom = "blank", jco_color = FALSE, facetting = FALSE,
                x.axis.breaks.repeat = ~Datetime_breaks(.x, by = "12 hours", shift =
    lubridate::duration(0, "hours"))) +
  geom_ribbon(aes(ymin = p25, ymax = p75, fill = sex), alpha = 0.5) +
  geom_line(aes(color = sex)) +
  geom_hline(yintercept = 250, linetype = "dashed") +
  # facet_wrap(~sex, scales = "free_x") +
  guides(fill = "none") +
  theme(panel.spacing = unit(2, "lines"))

In [43]:
Code
H11 formulas
H11_form_1 <- lzMEDI ~ s(Time, k = 12) + s(Time, sex, bs = "sz", k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")
H11_form_0 <- lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

Analysis

Base model (H11_1)

In [44]:
Code
Main model
H11_model_1 <- 
  bam(H11_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H11_model_1, 
                      plot=TRUE)

Code
H11_model_1 <- 
  bam(H11_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H11_model_1, split_pred = "AR.start")

Code
H11_summary_1 <- summary(H11_model_1)
H11_summary_1

Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, sex, bs = "sz", k = 12) + 
    s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + 
    s(Id_date, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85781    0.02432   35.27   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                 edf  Ref.df       F  p-value    
s(Time)       10.886   10.98 251.382  < 2e-16 ***
s(Time,sex)    9.775   10.94   3.458 6.86e-05 ***
s(Time,site)  67.093   74.23   5.822  < 2e-16 ***
s(Time,Id)   728.904 1390.00   3.539  < 2e-16 ***
s(Id_date)   350.493  800.00   0.846  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.4%
fREML =  33247  Scale est. = 0.50669   n = 37603
Code
H11_model_1 |> glance()
# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1168. -31318. 64991. 75039.   20064.      36435. 37603         0.767  2339
Code
H11_model_1 |> appraise()

In [45]:
Code
Variance explained
Xp <- predict(H11_model_1, type = "terms")
variance <- apply(Xp, 2, var)
cor(Xp)
                   s(Time)   s(Time,sex)  s(Time,site)   s(Time,Id)
s(Time)       1.000000e+00 -0.0420963875  3.397128e-02  0.003486141
s(Time,sex)  -4.209639e-02  1.0000000000  5.746969e-02 -0.006491081
s(Time,site)  3.397128e-02  0.0574696905  1.000000e+00  0.018949167
s(Time,Id)    3.486141e-03 -0.0064910814  1.894917e-02  1.000000000
s(Id_date)   -5.126648e-05  0.0004281968 -3.737668e-06  0.142041813
                s(Id_date)
s(Time)      -5.126648e-05
s(Time,sex)   4.281968e-04
s(Time,site) -3.737668e-06
s(Time,Id)    1.420418e-01
s(Id_date)    1.000000e+00
Code
H11_R2 <- (variance/sum(variance)) |> enframe(value = "R2", name = "parameter")
H11_R2
# A tibble: 5 × 2
  parameter         R2
  <chr>          <dbl>
1 s(Time)      0.844  
2 s(Time,sex)  0.00247
3 s(Time,site) 0.0476 
4 s(Time,Id)   0.0963 
5 s(Id_date)   0.00996

Null model (H11_0)

In [46]:
Code
Null model
H11_model_0 <- 
  bam(H11_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 8
      )

r1 <- start_value_rho(H11_model_0, 
                      plot=TRUE)

Code
H11_model_0 <- 
  bam(H11_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H11_model_0, split_pred = "AR.start")

Code
H11_summary_0 <- summary(H11_model_0)
H11_summary_0

Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, site, bs = "sz", k = 12) + 
    s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.85202    0.02428   35.09   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf  Ref.df       F p-value    
s(Time)       10.88   10.98 246.405  <2e-16 ***
s(Time,site)  67.22   74.28   5.909  <2e-16 ***
s(Time,Id)   743.18 1392.00   3.604  <2e-16 ***
s(Id_date)   350.78  801.00   0.846  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.4%
fREML =  33250  Scale est. = 0.50672   n = 37603
Code
H11_model_0 |> glance()
# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1173. -31318. 64999. 75084.   20059.      36430. 37603         0.767  2327
Code
H11_model_0 |> appraise()

Code
H11_model_0 |> draw()

Model comparison

In [47]:
Code
H11_AICs <- AIC(H11_model_1, H11_model_0)
H11_AICs |> as_tibble(rownames = "model") |> gt()
model df AIC
H11_model_1 1177.244 64991.19
H11_model_0 1181.694 64998.72

While sex only explains very little of the overall structure (<1% of variance explained), is has a ∆AIC >2 and will thus be included

Model visualization

In [48]:
Code
prepare photoperiod model H2 plot
H11_sex_data <- 
H11_model_1 |> 
  conditional_values(
    condition = list(Time = seq(0.25, 23.75, by = 0.5), "sex"),
    exclude = c("s(Time,site)", "s(Time,Id)", "s(Id_date)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  )
In [49]:
Code
Visualize sex model H11
source("scripts/Brown_bracket.R")
Loading required package: legendry
Code
H11_pred <-
H11_sex_data |> 
  ggplot(aes()) +
  map(c(1,10,250, 10^4), 
          \(x) geom_hline(aes(yintercept = x), 
                          col = "grey", linetype = "dashed")
      ) +
  geom_ribbon(aes(x = Time, ymin = .lower_ci, ymax = .upper_ci, fill = sex), alpha = 0.5) +
  geom_line(aes(x = Time, y = .fitted, col = sex), linewidth = 0.1) +
  geom_line(aes(x = Time, y = .fitted), linewidth = 1) +
  theme_cowplot() +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,10^4), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)", fill = "Sex", col = "Sex") +
  facet_wrap(~sex) +
  # guides( y = guide_axis_stack(Brown_bracket, "axis")) +
  gghighlight(
    label_key = sex,
    label_params = list(x = 12, y = 4, force = 0, hjust = 0.25)
  ) +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple())
  # labs(caption = Brown_caption)
In [50]:
Code
Visualize model terms
H11_P_terms <-
H11_model_1 |> 
  smooth_estimates(select = "s(Time,sex)") |> 
  mutate(.lower_ci = .estimate -1.96*.se,
         .upper_ci = .estimate +1.96*.se,
         .fitted = .estimate,
         signif = .lower_ci > 0 | .upper_ci < 0,
         signif_id = consecutive_id(signif),
         .by = sex) |> 
  ggplot(aes()) +
  map(c(log10(c(0.1, 0.2, 0.5, 2, 5, 10))),
      \(x) geom_hline(yintercept = x, linewidth = 0.5, col = "grey", linetype = 2)) +
  geom_ribbon(aes(x = Time, ymin = .lower_ci, ymax = .upper_ci, fill = sex), alpha = 0.5) +
  geom_hline(yintercept = 0, linewidth = 1, linetype = 2) +
  geom_line(aes(x = Time, y = .fitted), 
            linewidth = 1) +
  geom_line(data = \(x) filter(x, signif),
            aes(x = Time, y = .fitted, group = signif_id), 
            col = "red", linewidth = 1) +
  # scale_color_manual(values = c(melidos_colors)) +
  theme_cowplot() +
  scale_y_continuous(labels = expression(10^-1, 5^-1, 2^-1, "1  ", "2  ", "5  ", "10  "), 
                     breaks = log10(c(0.1, 0.2, 0.5, 1, 2, 5, 10)))+
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), expand = FALSE) +
  # guides(fill = "none") +
  labs(y = "Factor", x = "Local time (hrs)", fill = "Sex") +
  facet_wrap(~sex) +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple(),
        legend.position = "bottom")
In [51]:
Code
H11_P_terms

Code
ggsave("figures/Fig8.pdf")
Saving 7 x 5 in image

Interpretation

  • Sex is a minor differentiator in terms of melanopic stimulus, that explanes less than 1% of the variance. The only significant difference between the sexes is a lower value during the morning for females compared to males. This phase lasts about 2 hours and has a peak difference of about factor 2

Session Info

In [52]:
Code
sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin20
Running under: macOS 26.3.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] legendry_0.2.4      gtsummary_2.5.0     gghighlight_0.5.0  
 [4] glmmTMB_1.1.14      gt_1.3.0            broom.mixed_0.2.9.7
 [7] rlang_1.2.0         patchwork_1.3.2     emmeans_2.0.3      
[10] gratia_0.11.2       itsadug_2.5         plotfunctions_1.5  
[13] broom_1.0.12        sjPlot_2.9.0        ggtext_0.1.2       
[16] ggridges_0.5.7      cowplot_1.2.0       performance_0.16.0 
[19] lme4_2.0-1          Matrix_1.7-3        mgcv_1.9-4         
[22] nlme_3.1-168        LightLogR_0.10.0    melidosData_1.0.3  
[25] lubridate_1.9.5     forcats_1.0.1       stringr_1.6.0      
[28] dplyr_1.2.1         purrr_1.2.2         readr_2.2.0        
[31] tidyr_1.3.2         tibble_3.3.1        ggplot2_4.0.2      
[34] tidyverse_2.0.0    

loaded via a namespace (and not attached):
 [1] Rdpack_2.6.6        sandwich_3.1-1      magrittr_2.0.5     
 [4] furrr_0.4.0         otel_0.2.0          compiler_4.5.0     
 [7] systemfonts_1.3.2   vctrs_0.7.3         pkgconfig_2.0.3    
[10] fastmap_1.2.0       backports_1.5.1     labeling_0.4.3     
[13] utf8_1.2.6          promises_1.5.0      rmarkdown_2.31     
[16] markdown_2.0        tzdb_0.5.0          nloptr_2.2.1       
[19] ragg_1.5.2          xfun_0.57           litedown_0.9       
[22] jsonlite_2.0.0      later_1.4.8         parallel_4.5.0     
[25] R6_2.6.1            stringi_1.8.7       RColorBrewer_1.1-3 
[28] parallelly_1.47.0   boot_1.3-31         numDeriv_2016.8-1.1
[31] estimability_1.5.1  Rcpp_1.1.1-1        knitr_1.51         
[34] zoo_1.8-15          base64enc_0.1-6     mvnfast_0.2.8      
[37] splines_4.5.0       timechange_0.4.0    tidyselect_1.2.1   
[40] rstudioapi_0.18.0   yaml_2.3.12         websocket_1.4.4    
[43] TMB_1.9.21          codetools_0.2-20    processx_3.8.7     
[46] listenv_0.10.1      lattice_0.22-7      withr_3.0.2        
[49] S7_0.2.1-1          coda_0.19-4.1       evaluate_1.0.5     
[52] future_1.70.0       xml2_1.5.2          pillar_1.11.1      
[55] renv_1.1.4          reformulas_0.4.4    insight_1.5.0      
[58] generics_0.1.4      chromote_0.5.1      hms_1.1.4          
[61] commonmark_2.0.0    mirai_2.6.1         scales_1.4.0       
[64] minqa_1.2.8         globals_0.19.1      glue_1.8.0         
[67] webshot2_0.1.2      tools_4.5.0         fs_2.0.1           
[70] mvtnorm_1.3-7       grid_4.5.0          ggokabeito_0.1.0   
[73] rbibutils_2.4.1     cards_0.7.1         cli_3.6.6          
[76] textshaping_1.0.5   gtable_0.3.6        tweedie_3.0.17     
[79] sass_0.4.10         digest_0.6.39       ggrepel_0.9.8      
[82] pbkrtest_0.5.5      nanonext_1.8.2      htmlwidgets_1.6.4  
[85] farver_2.1.2        htmltools_0.5.9     lifecycle_1.0.5    
[88] statmod_1.5.1       gridtext_0.1.6      MASS_7.3-65