Research Question 3

Preface

This document contains the analysis for Research Question 3, as defined in the preregistration.

Research Question 3:
Is personal light exposure related to participant characteristics?

H8: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8)

H9: Timing-based metrics of personal light exposure are significantly associated with chronotype.

H10: Personal light exposure metrics depend on age and sex

H11: Personal light exposure patterns depend on sex

Setup

In this section, we load in the preprocessed and prepared data, site data, and load required packages.

In [1]:

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(tidyverse)

Warning: package 'ggplot2' was built under R version 4.5.2

Warning: package 'tibble' was built under R version 4.5.2

Warning: package 'tidyr' was built under R version 4.5.2

Warning: package 'readr' was built under R version 4.5.2

Warning: package 'purrr' was built under R version 4.5.2

Warning: package 'dplyr' was built under R version 4.5.2

Warning: package 'lubridate' was built under R version 4.5.2

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.1     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(melidosData)
library(LightLogR)
library(mgcv)

Loading required package: nlme

Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

This is mgcv 1.9-4. For overview type '?mgcv'.

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(lme4)

Warning: package 'lme4' was built under R version 4.5.2

Loading required package: Matrix

Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack


Attaching package: 'lme4'

The following object is masked from 'package:nlme':

    lmList

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(performance)

Warning: package 'performance' was built under R version 4.5.2

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(cowplot)


Attaching package: 'cowplot'

The following object is masked from 'package:lubridate':

    stamp

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(ggridges)
library(ggtext)
library(sjPlot)


Attaching package: 'sjPlot'

The following objects are masked from 'package:cowplot':

    plot_grid, save_plot

The following object is masked from 'package:ggplot2':

    set_theme

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(broom)

Warning: package 'broom' was built under R version 4.5.2

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(itsadug)

Warning: package 'itsadug' was built under R version 4.5.2

Loading required package: plotfunctions

Warning: package 'plotfunctions' was built under R version 4.5.2


Attaching package: 'plotfunctions'

The following object is masked from 'package:ggplot2':

    alpha

Loaded package itsadug 2.5 (see 'help("itsadug")' ).

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(gratia)

Warning: package 'gratia' was built under R version 4.5.2


Attaching package: 'gratia'

The following object is masked from 'package:itsadug':

    dispersion

The following object is masked from 'package:stringr':

    boundary

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(emmeans)

Warning: package 'emmeans' was built under R version 4.5.2

Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(patchwork)


Attaching package: 'patchwork'

The following object is masked from 'package:cowplot':

    align_plots

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(rlang)

Warning: package 'rlang' was built under R version 4.5.2


Attaching package: 'rlang'

The following objects are masked from 'package:purrr':

    flatten, flatten_chr, flatten_dbl, flatten_int, flatten_lgl,
    flatten_raw, invoke, splice

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(broom.mixed)

Warning: package 'broom.mixed' was built under R version 4.5.2

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(gt)

Warning: package 'gt' was built under R version 4.5.2


Attaching package: 'gt'

The following object is masked from 'package:cowplot':

    as_gtable

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(performance)
library(glmmTMB)

Warning: package 'glmmTMB' was built under R version 4.5.2

Warning in check_dep_version(dep_pkg = "TMB"): package version mismatch: 
glmmTMB was built with TMB package version 1.9.19
Current TMB package version is 1.9.21
Please re-install glmmTMB from source or restore original 'TMB' package (see '?reinstalling' for more information)

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
library(gghighlight)
library(gtsummary)

Warning: package 'gtsummary' was built under R version 4.5.2

Code cell - Setup

#| message: false
#| warning: false
#| code-summary: Code cell - Setup
source("scripts/fitting.R")
source("scripts/summaries.R")
source("scripts/RQ2_specific.R")

In [2]:

Code

#| message: false
#| file-name: Load data
load("data/metrics_glasses.RData")
load("data/preprocessed_glasses_2.RData")
metrics <- metrics_glasses
hourly_data <-  light_glasses_processed2
H8 <- "Hypothesis: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8)."
H9 <- "Hypothesis: Timing-based metrics of personal light exposure are significantly associated with chronotype."
H10 <- "Hypothesis: Personal light exposure metrics depend on age and sex."
H11 <- "Hypothesis: Personal light exposure patterns depend on sex."
sig.level <- 0.05

#for chest:
# load("data/metrics_chest.RData")
# metrics <- metrics_chest
# melidos_cities <- melidos_cities[names(melidos_cities)!="MPI"]

Hypothesis 8

Details

Hypothesis 3 states:

Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8)

Type of model: GLMM

Base formula: \text{metric} \sim \text{VLSQ-8}*\text{Site} + (1|\text{Participant})

Notes: - Site was sum-coded, so as to calculate an average effect of site - Error family depends on the metric

Specific formulas: - \text{mel EDI} \sim \text{Site} + (1|\text{Site}:\text{Participant}) (Null-model) - \text{mel EDI} \sim \text{VLSQ-8} + \text{Site} + (1|\text{Site}:\text{Participant}) (Null-model interaction) - \text{mel EDI} \sim \text{VLSQ-8} + (1|\text{Participant}) (Null-model site)

False-discovery-rate correction: - 9 metrics with one confirmatory analysis: n=9 - For sites (interaction), n=9

Preparation

In [3]:

Code

Load VLSQ-8 data

#| filename: Load VLSQ-8 data
vlsq8 <- load_data("vlsq8") |> flatten_data()

loading modality: vlsq8 ■■■■■■■■■■■                       33% |  ETA:  2s

In [4]:

Code cell - Model setup based on distribution

#| code-summary: Code cell - Model setup based on distribution
model_tbl <- tibble::tribble(
    ~name, ~engine, ~ response, ~family,
    "interdaily_stability", "lm", "qlogis(metric)", gaussian(),
    "intradaily_variability", "lm", "metric", gaussian(),
    "Mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "dose", "lmer", "log_zero_inflated(metric)", gaussian(),
    "MDER", "lmer", "metric", gaussian(),
    "darkest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_mean", "lmer", "log_zero_inflated(metric)", gaussian(),
    "duration_above_1000", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_above_250", "lmer", NA, gaussian(),
    "duration_below_10", "lmer", NA, gaussian(),
    "duration_below_1", "lmer", NA, gaussian(),
    "duration_above_250_wake", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_10_pre-sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "duration_below_1_sleep", "glmmTMB", "metric", tweedie(link = "log"),
    "period_above_250", "lmer", "log_zero_inflated(metric)", gaussian(),
    "brightest_10h_midpoint", "lmer", "metric", gaussian(),
    "brightest_10h_onset", "lmer", NA, gaussian(),
    "brightest_10h_offset", "lmer", NA, gaussian(),
    "darkest_10h_midpoint", "lmer", "metric", gaussian(),
    "darkest_10h_onset", "lmer", NA, gaussian(),
    "darkest_10h_offset", "lmer", NA, gaussian(),
    "mean_timing_above_250", "lmer", "metric", gaussian(),
    "first_timing_above_250", "lmer", "metric", gaussian(),
    "last_timing_above_250", "lmer", "metric", gaussian()
  )

In [5]:

Code cell - Formula setup for inference

#| code-summary: Code cell - Formula setup for inference
H8_data <-
  metrics |> 
  filter(metric_type %in% c("duration", "exposure history", "level")) |> 
  filter_out(name == "duration_above_250") |> 
  mutate(
    H8_1 = "~ site*VLSQ8 + (1|Id)",
    H8_0 = "~ site + (1|Id)",
    H8_ni = "~ site + VLSQ8 + (1|Id)",
    H8_ns = "~ VLSQ8 + (1|Id)",
         ) |> 
  left_join(model_tbl, by = "name")

H8_formulas <-  names(H8_data) |> subset(grepl("H8_", names(H8_data)))

H8_data <- 
  H8_data |> relocate(all_of(H8_formulas), .after = family)
H8_fdr_n <- H8_data |> filter(!is.na(response)) |> nrow()

In [6]:

Code

H8_data <-
  H8_data |> 
  mutate(data = data |> map(\(x) x |> left_join(vlsq8, by = c("site", "Id"))))

In [7]:

Code

#| echo: false
H8_data |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 8: metric model specifications",
             subtitle = H8) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=9 comparisons") |> 
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H8_1, H8_0))) |> 
  tab_footnote("These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects", locations = cells_column_labels(c(H8_ni, H8_ns)))

	Name	Type	Engine	Response	Family	H8 1¹	H8 0¹	H8 ni²	H8 ns²
Hypothesis 8: metric model specifications
Hypothesis: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8).
duration
1	Duration above 1000	participant-day	glmmTMB	metric	tweedie	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
2	Duration above 250 wake	participant-day	glmmTMB	metric	tweedie	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
3	Duration below 10 pre-Sleep	participant-day	glmmTMB	metric	tweedie	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
4	Duration below 1 sleep	participant-day	glmmTMB	metric	tweedie	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
5	Period above 250	participant-day	lmer	log_zero_inflated(metric)	gaussian	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
exposure history
6	Dose	participant-day	lmer	log_zero_inflated(metric)	gaussian	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
level
7	Mean	participant-day	lmer	log_zero_inflated(metric)	gaussian	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
8	Brightest 10h mean	participant-day	lmer	log_zero_inflated(metric)	gaussian	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
9	Darkest 10h mean	participant-day	lmer	log_zero_inflated(metric)	gaussian	~ site*VLSQ8 + (1\|Id)	~ site + (1\|Id)	~ site + VLSQ8 + (1\|Id)	~ VLSQ8 + (1\|Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=9 comparisons
¹ These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful
² These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects

In [8]:

Code cell - Formula creation

#| code-summary: Code cell - Formula creation
H8_data <-
  H8_data |> 
  mutate(across(all_of(H8_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )

In [9]:

Code

H8_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = VLSQ8, y = metric)) + 
  geom_point() + 
  facet_wrap(~name, scales = "free")

Warning: Removed 108 rows containing missing values or values outside the scale range
(`geom_point()`).

Analysis

Fit models

In [10]:

Code cell - Model fit

#| code-summary: Code cell - Model fit
#| message: false
H8_data <-
  H8_data |> 
  mutate(
    across(all_of(H8_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )

Model diagnostics are omitted here. We checked in RQ1 for sensible base models

Model comparisons

In [11]:

Code cell - Model comparisons

#| code-summary: Code cell - Model comparisons
#| message: false
H8_data <- 
  H8_data |> 
  mutate(
    H8_1_comp = map2(H8_0_model, H8_1_model, model_anova),
    H8_1_p = map2(H8_1_comp, engine, model_comp_p, n = H8_fdr_n) |> list_c(),
    H8_1_sig = H8_1_p <= sig.level,
  
    H8_i_comp = map2(H8_ni_model, H8_1_model, model_anova),
    H8_i_p = map2(H8_i_comp, engine, model_comp_p, n = H8_fdr_n) |> list_c(),
    H8_i_sig = H8_i_p <= sig.level,
  
    H8_s_comp = map2(H8_ns_model, H8_ni_model, model_anova),
    H8_s_p = map2(H8_s_comp, engine, model_comp_p, n = H8_fdr_n) |> list_c(),
    H8_s_sig = H8_s_p <= sig.level,
  )

refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)

In [12]:

Code

H8_tab <- 
H8_data |> 
  select(metric_type, name, H8_1_p, H8_1_sig) |> 
  mutate(H8_1_p = style_pvalue(H8_1_p)) |> 
  gt() |> 
  tab_header(
    "H8: Model results",
    subtitle = H8
  )

In [13]:

Code

H8_tab

metric_type	name	H8_1_p	H8_1_sig
H8: Model results
Hypothesis: Duration, exposure-history, and level-based metrics of personal light exposure are significantly associated with personal light sensitivity (VLSQ-8).
level	Mean	>0.9	FALSE
level	brightest_10h_mean	>0.9	FALSE
level	darkest_10h_mean	>0.9	FALSE
duration	duration_above_1000	>0.9	FALSE
duration	period_above_250	>0.9	FALSE
exposure history	dose	>0.9	FALSE
duration	duration_below_10_pre-sleep	>0.9	FALSE
duration	duration_below_1_sleep	>0.9	FALSE
duration	duration_above_250_wake	>0.9	FALSE

Code

gtsave(H8_tab, "tables/H8.png", vwidth = 1200)

file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpMZyGIO/file7a7e50dba803.html screenshot completed

Code

gtsave(H8_tab, "tables/H8.docx", vwidth = 1200)

Interpretation

None of the light exposure metrics are significantly associated with VLSQ-8 scores

Hypothesis 9

Details

Hypothesis 9 states:

Timing-based metrics of personal light exposure are significantly associated with chronotype.

Type of model: GLMM

Base formula: \text{metric} \sim \text{Chronotype}*\text{Site} + (1|\text{Participant})

Notes: - Site was sum-coded, so as to calculate an average effect of site - Chronotype is operationalized through the MCTQ-score msf_{sc} (Mid-sleep on free days. Sleep corrected) - Error family depends on the metric

Specific formulas: - \text{mel EDI} \sim \text{Site} + (1|\text{Site}:\text{Participant}) (Null-model) - \text{mel EDI} \sim \text{Chronotype} + \text{Site} + (1|\text{Site}:\text{Participant}) (Null-model interaction) - \text{mel EDI} \sim \text{Chronotype} + (1|\text{Participant}) (Null-model site)

False-discovery-rate correction: - 5 metrics with one confirmatory analysis: n=5 - For sites (interaction), n=9

Preparation

In [14]:

Code

Load chronotype data

#| filename: Load chronotype data
ct <- load_data("chronotype") |> flatten_data()
ct$msf_sc |> range(na.rm = TRUE) |> hms::as_hms()

01:15:42.857143
08:07:51.428571

Code

#| filename: Load chronotype data
ct <- ct |> mutate(msf_sc = as.numeric(msf_sc)/3600)

Since all msf_sc values are after midnight, we will use linear modelling instead of circular.

In [15]:

Code cell - Formula setup for inference

#| code-summary: Code cell - Formula setup for inference
H9_data <-
  metrics |> 
  filter(metric_type %in% c("timing")) |> 
  filter_out(str_detect(name, "onset|offset")) |> 
  mutate(
    H9_1 = "~ site*msf_sc + (1|Id)",
    H9_0 = "~ site + (1|Id)",
    H9_00 = "~ 1 + (1|Id)",
    H9_ni = "~ site + msf_sc + (1|Id)",
    H9_ns = "~ msf_sc + (1|Id)",
         ) |> 
  left_join(model_tbl, by = "name")

H9_formulas <-  names(H9_data) |> subset(grepl("H9_", names(H9_data)))

H9_data <- 
  H9_data |> relocate(all_of(H9_formulas), .after = family)
H9_fdr_n <- H9_data |> filter(!is.na(response)) |> nrow()

In [16]:

Code

H9_data <-
  H9_data |> 
  mutate(data = data |> map(\(x) x |> left_join(ct, by = c("site", "Id")) |> 
                              drop_na(msf_sc)))

In [17]:

Code

#| echo: false
H9_data |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 9: metric model specifications",
             subtitle = H9) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=5 comparisons") |> 
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H9_ni, H9_00))) |> 
  tab_footnote("These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects", locations = cells_column_labels(c(H9_1, H9_ns, H9_0)))

	Name	Type	Engine	Response	Family	H9 1¹	H9 0¹	H9 00²	H9 ni²	H9 ns¹
Hypothesis 9: metric model specifications
Hypothesis: Timing-based metrics of personal light exposure are significantly associated with chronotype.
timing
1	Brightest 10h midpoint	participant-day	lmer	metric	gaussian	~ site*msf_sc + (1\|Id)	~ site + (1\|Id)	~ 1 + (1\|Id)	~ site + msf_sc + (1\|Id)	~ msf_sc + (1\|Id)
2	Darkest 10h midpoint	participant-day	lmer	metric	gaussian	~ site*msf_sc + (1\|Id)	~ site + (1\|Id)	~ 1 + (1\|Id)	~ site + msf_sc + (1\|Id)	~ msf_sc + (1\|Id)
3	First timing above 250	participant-day	lmer	metric	gaussian	~ site*msf_sc + (1\|Id)	~ site + (1\|Id)	~ 1 + (1\|Id)	~ site + msf_sc + (1\|Id)	~ msf_sc + (1\|Id)
4	Last timing above 250	participant-day	lmer	metric	gaussian	~ site*msf_sc + (1\|Id)	~ site + (1\|Id)	~ 1 + (1\|Id)	~ site + msf_sc + (1\|Id)	~ msf_sc + (1\|Id)
5	Mean timing above 250	participant-day	lmer	metric	gaussian	~ site*msf_sc + (1\|Id)	~ site + (1\|Id)	~ 1 + (1\|Id)	~ site + msf_sc + (1\|Id)	~ msf_sc + (1\|Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=5 comparisons
¹ These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects
² These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful

In [18]:

Code cell - Formula creation

#| code-summary: Code cell - Formula creation
H9_data <-
  H9_data |> 
  mutate(across(all_of(H9_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )

In [19]:

Code

H9_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = msf_sc, y = metric)) + 
  geom_point() + 
  facet_wrap(~name, scales = "free") +
  geom_smooth()

`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Warning: Removed 99 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 99 rows containing missing values or values outside the scale range
(`geom_point()`).

Analysis

Fit models

In [20]:

Code cell - Model fit

#| code-summary: Code cell - Model fit
#| message: false
H9_data <-
  H9_data |> 
  mutate(
    across(all_of(H9_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )

Model diagnostics are omitted here. We checked in RQ1 for sensible base models

In [21]:

Code cell - Model comparisons

#| code-summary: Code cell - Model comparisons
#| message: false
H9_data <- 
  H9_data |> 
  mutate(
    H9_1_comp = map2(H9_00_model, H9_ns_model, model_anova),
    H9_1_p = map2(H9_1_comp, engine, model_comp_p, n = H9_fdr_n) |> list_c(),
    H9_1_sig = H9_1_p <= sig.level,
  
    H9_i_comp = map2(H9_ni_model, H9_1_model, model_anova),
    H9_i_p = map2(H9_i_comp, engine, model_comp_p, n = H9_fdr_n) |> list_c(),
    H9_i_sig = H9_i_p <= sig.level,
  
    H9_s_comp = map2(H9_ns_model, H9_ni_model, model_anova),
    H9_s_p = map2(H9_s_comp, engine, model_comp_p, n = H9_fdr_n) |> list_c(),
    H9_s_sig = H9_s_p <= sig.level,
    
    H9_model = pmap(
      list(H9_1_sig, H9_i_sig, H9_s_sig, H9_00_model, H9_0_model, H9_ns_model, H9_ni_model, H9_1_model), 
      \(sig1, sig2, sig3, model00, model_site, model_ct, model_both, model_interaction) {
        if(all(sig1, sig2, sig3)) {
          model <- model_interaction 
        } else if(all(sig1, sig3) & !sig2) {
            model <- model_both
        } else if(sig1 & !sig2 & !sig3) {
            model <- model_ct
        } else if(!sig1 & sig3 & !sig2) {
            model <- model_site
          } else model <- model00
          model
      }
      ),
    
    marginals = map(H9_ns_model, \(x) emtrends(x, ~ msf_sc, var = "msf_sc")),
    marginals = map2(H9_1_sig, marginals,
                     \(sig, marginal) {
                        if(sig) return(marginal |> as.tibble()) else return(tibble(msf_sc = NA))
                          }
    )
  )

refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)

Warning: There was 1 warning in `mutate()`.
ℹ In argument: `marginals = map2(...)`.
Caused by warning:
! `as.tibble()` was deprecated in tibble 2.0.0.
ℹ Please use `as_tibble()` instead.
ℹ The signature and semantics have changed, see `?as_tibble`.

In [22]:

Code

H9_data <- 
H9_data |> 
  mutate(
    H9_r2 = pmap(list(H9_ns_model, data, family, response), r2_helper),
    H9_marginal = map_dbl(H9_r2, \(x) x$R2_marginal),
    H9_marginal = ifelse(H9_1_sig, H9_marginal, NA)
  ) 
# |> pull(H9_r2)

Basic table

In [23]:

Code

H9_tab <- 
H9_data |> 
  select(
    metric_type, name, H9_1_sig, H9_1_p, marginals, H9_marginal
  ) |> 
  unnest(marginals) |> 
  gt() |> 
  fmt_number() |>
    sub_missing(H9_marginal) |>
  fmt(H9_1_p, fns = style_pvalue) |>
  cols_merge(
    c(msf_sc.trend, SE, lower.CL, upper.CL),
    pattern = "<<**{1}** ±{2} ({3} - {4})>>"
  ) |> 
  fmt(name, fns = \(x) x |> str_replace_all("_", " ") |> str_to_sentence()) |> 
  as.data.frame() |> 
  gt() |> 

  tab_style(style = cell_text(weight = "bold"),
            locations = cells_body(H9_1_p, H9_1_sig == "TRUE")) |> 
  cols_hide(c(H9_1_sig, df, msf_sc, metric_type, SE, lower.CL, upper.CL)) |> 
  cols_label(
    H9_1_p = "p-value",
    msf_sc.trend = md("$MSF_{sc}$"),
    name = "Metric",
    H9_marginal = md("$R^2_{marginal}$")
  ) |> 
  fmt_markdown(msf_sc.trend) |> 
  tab_header(
    "H9: Model results",
    subtitle = H9
  ) |> 
  tab_footnote(
    "p-value adjustment for n=5 model comparisons with FDR correction",
    locations = cells_column_labels(H9_1_p)
  ) |> 
  sub_missing()

In [24]:

Code

H9_tab

Metric	p-value¹	MSF_{sc}	R^2_{marginal}
H9: Model results
Hypothesis: Timing-based metrics of personal light exposure are significantly associated with chronotype.
Brightest 10h midpoint	<0.001	0.31 ±0.07 (0.17 - 0.44)	0.05
Darkest 10h midpoint	<0.001	0.40 ±0.07 (0.27 - 0.53)	0.08
Mean timing above 250	<0.001	0.32 ±0.07 (0.19 - 0.45)	0.06
First timing above 250	<0.001	0.44 ±0.09 (0.26 - 0.62)	0.06
Last timing above 250	0.12		—
¹ p-value adjustment for n=5 model comparisons with FDR correction

Code

H9_tab |> gtsave("tables/H9.png")

file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpMZyGIO/file7a7e292ab1c8.html screenshot completed

Visualization

In [25]:

Code

#| fig-width: 10
#| fig-height: 5
H9_plot1 <- 
H9_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = FALSE) |> 
  filter_out(name == "last_timing_above_250") |> 
  ggplot(aes(x = msf_sc, y = metric)) + 
  geom_point(aes(color = site), alpha = 0.5) + 
  facet_wrap(~name, scales = "free") +
  guides(color = "none") +
  theme_cowplot() +
  labs(y = "Local time of day (hr, 0 is midnight)", x = "MSF_sc") +
  geom_smooth(method = "lm") +
  scale_color_manual(values = melidos_colors)

H9_plot2 <- 
H9_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = TRUE) |> 
  filter_out(name == "last_timing_above_250") |> 
  ggplot(aes(x = msf_sc, y = site, color = site)) + 
  geom_boxplot() + 
  theme_cowplot() +
  guides(color = "none") +
  labs(y = NULL, x = "MSF_sc") +
  scale_color_manual(values = melidos_colors)

H9_plot1 + H9_plot2 + plot_layout(widths = c(3,2)) + plot_annotation(tag_levels = "A")

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 66 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 66 rows containing missing values or values outside the scale range
(`geom_point()`).

Code

#| fig-width: 10
#| fig-height: 5
ggsave("figures/Fig7.pdf", width = 10, height = 5)

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 66 rows containing non-finite outside the scale range
(`stat_smooth()`).
Removed 66 rows containing missing values or values outside the scale range
(`geom_point()`).

Code

#| fig-width: 10
#| fig-height: 5
ggsave("figures/Fig7.png", width = 10, height = 5)

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 66 rows containing non-finite outside the scale range
(`stat_smooth()`).
Removed 66 rows containing missing values or values outside the scale range
(`geom_point()`).

In [26]:

Code

lm(msf_sc ~ site, data = ct, contrasts = list(site = "contr.sum")) |> drop1(test = "Chisq")

Single term deletions

Model:
msf_sc ~ site
       Df Sum of Sq    RSS     AIC  Pr(>Chi)    
<none>              195.53  28.246              
site    8    123.54 319.08 102.839 3.524e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Code

lm(msf_sc ~ site, data = ct, contrasts = list(site = "contr.sum")) |> 
  emmeans(~ site) |> contrast() |> plot()

Code

lm(msf_sc ~ photoperiod + site, data = H9_data$data[[1]]) |> drop1(test = "Chisq")

Single term deletions

Model:
msf_sc ~ photoperiod + site
            Df Sum of Sq     RSS    AIC  Pr(>Chi)    
<none>                    868.75  81.35              
photoperiod  1     30.26  899.01 106.91 1.519e-07 ***
site         8    564.55 1433.29 468.39 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Code

lm(msf_sc ~ photoperiod + site, data = H9_data$data[[1]]) |> summary()


Call:
lm(formula = msf_sc ~ photoperiod + site, data = H9_data$data[[1]])

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8954 -0.7269 -0.1027  0.6803  3.2822 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  5.52364    0.36600  15.092  < 2e-16 ***
photoperiod -0.11086    0.02107  -5.263 1.83e-07 ***
siteFUSPCEU  0.63382    0.18265   3.470 0.000548 ***
siteIZTECH   1.26405    0.16457   7.681 4.65e-14 ***
siteKNUST   -1.65067    0.17764  -9.292  < 2e-16 ***
siteMPI      0.44636    0.15966   2.796 0.005305 ** 
siteRISE    -0.28559    0.15627  -1.828 0.067990 .  
siteTHUAS    0.50039    0.15955   3.136 0.001774 ** 
siteTUM      1.53958    0.16909   9.105  < 2e-16 ***
siteUCR     -0.46223    0.22657  -2.040 0.041671 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.045 on 795 degrees of freedom
Multiple R-squared:  0.4194,    Adjusted R-squared:  0.4128 
F-statistic: 63.81 on 9 and 795 DF,  p-value: < 2.2e-16

Code

lm(msf_sc ~ photoperiod + site, data = H9_data$data[[1]]) |> 
  emmeans(~ site) |> contrast()

 contrast       estimate     SE  df t.ratio p.value
 BAUA effect      -0.221 0.1120 795  -1.977  0.0484
 FUSPCEU effect    0.413 0.1140 795   3.627  0.0005
 IZTECH effect     1.043 0.1030 795  10.130 <0.0001
 KNUST effect     -1.871 0.1160 795 -16.136 <0.0001
 MPI effect        0.226 0.0928 795   2.432  0.0171
 RISE effect      -0.506 0.1200 795  -4.222 <0.0001
 THUAS effect      0.280 0.1140 795   2.462  0.0171
 TUM effect        1.319 0.1400 795   9.429 <0.0001
 UCR effect       -0.683 0.1740 795  -3.929  0.0002

P value adjustment: fdr method for 9 tests

Interpretation

Chronotype is an important dependency timing-based metrics. Per one hour later MSF_sc, participants experience important aspects of the day about 0.31 to 0.44 hours later in the day, including the brightest and darkest 10h midpoint, mean and first timing about 250lx melanopic EDI. The last timing above 250 lx melanopic EDI was not significant
Chronotype is significantly different between sites

Hypothesis 10

Details

Hypothesis 10 states:

Personal light exposure metrics depend on age and sex

Type of model: GLMM

Base formulas: - \text{mel EDI} \sim \text{sex/age} + \text{Site} + (1|\text{Participant})

Notes: - Site was sum-coded, so as to calculate an average effect of site - Error family depends on the metric

Specific formulas: - \text{mel EDI} \sim \text{Site} + (1|\text{Participant}) (Null-model) - \text{metric} \sim \text{sex/age}*\text{Site} + (1|\text{Participant}) (Interaction-model)

False-discovery-rate correction: - 17 metrics with one confirmatory analysis times 2 variables: n=34 - For sites (interaction), n=9

Preparation

In [27]:

Code

load demographics

#| filename: load demographics
demographics <- load_data("demographics") |> flatten_data()

loading modality: demographics ■■■■■■■■■■■                       33% |  ETA:  2s

In [28]:

Code

lm(age ~ site, data = demographics) |> emmeans(~ site) |> contrast()

 contrast       estimate   SE  df t.ratio p.value
 BAUA effect        4.72 1.85 182   2.552  0.0259
 FUSPCEU effect     3.57 1.88 182   1.898  0.1066
 IZTECH effect     -6.30 2.15 182  -2.935  0.0113
 KNUST effect      -7.95 2.27 182  -3.501  0.0026
 MPI effect        -2.95 1.79 182  -1.648  0.1456
 RISE effect        9.41 2.15 182   4.381  0.0002
 THUAS effect       2.30 2.00 182   1.152  0.2823
 TUM effect        -4.35 2.73 182  -1.591  0.1456
 UCR effect         1.54 1.52 182   1.013  0.3122

P value adjustment: fdr method for 9 tests

Code

lm(age ~ site, data = demographics) |> drop1(test = "Chisq")

Single term deletions

Model:
age ~ site
       Df Sum of Sq   RSS    AIC  Pr(>Chi)    
<none>              16217 866.33              
site    8    4498.5 20715 897.09 1.702e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Code

demographics |> ggplot(aes(x = site, y = age)) + geom_boxplot()

In [29]:

Code

glm(sex ~ site, data = demographics, family = binomial()) |> emmeans(~ site) |> contrast()

 contrast       estimate    SE  df z.ratio p.value
 BAUA effect     0.00554 0.394 Inf   0.014  0.9888
 FUSPCEU effect -0.45601 0.417 Inf  -1.093  0.6631
 IZTECH effect  -0.43354 0.475 Inf  -0.913  0.6631
 KNUST effect    0.03906 0.483 Inf   0.081  0.9888
 MPI effect      0.01844 0.381 Inf   0.048  0.9888
 RISE effect     0.77873 0.475 Inf   1.640  0.6631
 THUAS effect    0.57806 0.433 Inf   1.336  0.6631
 TUM effect     -0.23287 0.591 Inf  -0.394  0.9888
 UCR effect     -0.29741 0.331 Inf  -0.899  0.6631

Results are given on the log odds ratio (not the response) scale. 
P value adjustment: fdr method for 9 tests

Code

glm(sex ~ site, data = demographics, family = binomial()) |> drop1(test = "Chisq")

Single term deletions

Model:
sex ~ site
       Df Deviance    AIC   LRT Pr(>Chi)
<none>      255.94 273.94               
site    8   262.89 264.89 6.948   0.5423

Code

demographics |> ggplot(aes(x = site, fill = sex)) + geom_bar(position = position_dodge())

age shows significant differences between sites, sex does not

In [30]:

Code cell - Formula setup for inference

#| code-summary: Code cell - Formula setup for inference
H10_data <-
  metrics |> 
  left_join(model_tbl, by = "name") |> 
  filter_out(is.na(response)) |> 
  mutate(
    H10_1a = "~ site + age + (1|Id)",
    H10_1s = "~ site + sex + (1|Id)",
    H10_0 = "~ site + (1|Id)",
    H10_a = "~ age + (1|Id)",
    H10_00 = "~ 1 + (1|Id)",
    H10_ia = "~ site*age + (1|Id)",
    H10_is = "~ site*sex + (1|Id)",
    across(H10_1a:H10_is, 
           \(x) ifelse(engine == "lm", str_remove(x, " \\+ \\(1\\|Id\\)"), x)
             ),
         )

H10_formulas <-  names(H10_data) |> subset(grepl("H10_", names(H10_data)))

H10_data <- 
  H10_data |> relocate(all_of(H10_formulas), .after = family)
H10_fdr_n <- H10_data |> filter(!is.na(response)) |> nrow() *2

In [31]:

Code

#| echo: false
H10_data |> 
  select(-data) |>
  filter(!is.na(response)) |> 
  arrange(metric_type, name) |> 
  mutate(family = map(family, 1),
         rowname = seq_along(1:length(name))) |> 
  group_by(metric_type) |> 
  gt() |> 
  cols_label_with(fn = \(x) x |> str_to_title() |> str_replace_all("_", " ")) |> 
  sub_missing(missing_text = "") |> 
  sub_values(values = "NULL", replacement = "") |> 
  fmt(name, fns = \(x) x |> str_to_title() |>  str_replace_all("_", " ")) |> 
  sub_values(values = "MDER", replacement = "MDER") |> 
  tab_header("Hypothesis 9: metric model specifications",
             subtitle = H10) |> 
  tab_footnote("Confirmatory analysis will use a p-value correction through the false-discovery rate for n=34 comparisons") |>  
  tab_footnote("These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful", locations = cells_column_labels(c(H10_1s, H10_1a, H10_0))) |> 
  tab_footnote("These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects", locations = cells_column_labels(c(H10_is, H10_ia, H10_00)))

	Name	Type	Metric family	Engine	Response	Family	H10 1a¹	H10 1s¹	H10 0¹	H10 a	H10 00²	H10 ia²	H10 is²
Hypothesis 9: metric model specifications
Hypothesis: Personal light exposure metrics depend on age and sex.
duration
1	Duration above 1000	participant-day		glmmTMB	metric	tweedie	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
2	Duration above 250 wake	participant-day		glmmTMB	metric	tweedie	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
3	Duration below 10 pre-Sleep	participant-day		glmmTMB	metric	tweedie	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
4	Duration below 1 sleep	participant-day		glmmTMB	metric	tweedie	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
5	Period above 250	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
dynamics
6	Interdaily stability	participant	Interdaily stability	lm	qlogis(metric)	gaussian	~ site + age	~ site + sex	~ site	~ age	~ 1	~ site*age	~ site*sex
7	Intradaily variability	participant	Intradaily variability	lm	metric	gaussian	~ site + age	~ site + sex	~ site	~ age	~ 1	~ site*age	~ site*sex
exposure history
8	Dose	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
level
9	Mean	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
10	Brightest 10h mean	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
11	Darkest 10h mean	participant-day		lmer	log_zero_inflated(metric)	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
spectrum
12	MDER	participant-day	MDER	lmer	metric	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
timing
13	Brightest 10h midpoint	participant-day		lmer	metric	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
14	Darkest 10h midpoint	participant-day		lmer	metric	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
15	First timing above 250	participant-day		lmer	metric	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
16	Last timing above 250	participant-day		lmer	metric	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
17	Mean timing above 250	participant-day		lmer	metric	gaussian	~ site + age + (1\|Id)	~ site + sex + (1\|Id)	~ site + (1\|Id)	~ age + (1\|Id)	~ 1 + (1\|Id)	~ site*age + (1\|Id)	~ site*sex + (1\|Id)
Confirmatory analysis will use a p-value correction through the false-discovery rate for n=34 comparisons
¹ These are the models used for the confirmatory analysis. Each H1 1 and H1 0 pair are compaired with likelihood-ratio tests to determine whether site is meaningful
² These are the models used for exploratory analyses. Models are compared with all sensible combinations of hierarchical effects

In [32]:

Code cell - Formula creation

#| code-summary: Code cell - Formula creation
H10_data <-
  H10_data |> 
  mutate(across(all_of(H10_formulas),
                \(x) map2(x, response, \(y,z) make_formula(z, y))
                )
         )

In [33]:

Code

#| fig-height: 8
#| fig-width: 15
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = age, y = metric)) + 
  geom_point() + 
  facet_wrap(~name, scales = "free") +
  geom_smooth()

`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Warning: Removed 293 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 293 rows containing missing values or values outside the scale range
(`geom_point()`).

In [34]:

Code

#| fig-height: 8
#| fig-width: 15
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  ggplot(aes(x = sex, y = metric)) + 
  geom_boxplot() + 
  facet_wrap(~name, scales = "free") +
  geom_smooth()

Warning: Removed 293 rows containing non-finite outside the scale range
(`stat_boxplot()`).

`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

Warning: Removed 293 rows containing non-finite outside the scale range
(`stat_smooth()`).

Analysis

Fit models

In [35]:

Code cell - Model fit

#| code-summary: Code cell - Model fit
#| message: false
H10_data <-
  H10_data |> 
  mutate(
    across(all_of(H10_formulas),
           \(x) pmap(list(data, x, engine, family), fit_model),
           .names = "{.col}_model")
    )

Model diagnostics are omitted here. We checked in RQ1 for sensible base models

In [36]:

Code cell - Model comparisons

#| code-summary: Code cell - Model comparisons
#| message: false
H10_data <- 
  H10_data |> 
  mutate(
    H10_1s_comp = map2(H10_0_model, H10_1s_model, model_anova),
    H10_1s_p = map2(H10_1s_comp, engine, model_comp_p, n = H10_fdr_n) |> list_c(),
    H10_1s_sig = H10_1s_p <= sig.level,
    
    H10_1a_comp = map2(H10_0_model, H10_1a_model, model_anova),
    H10_1a_p = map2(H10_1a_comp, engine, model_comp_p, n = H10_fdr_n) |> list_c(),
    H10_1a_sig = H10_1a_p <= sig.level,
  
    H10_ia_comp = map2(H10_1a_model, H10_ia_model, model_anova),
    H10_ia_p = map2(H10_ia_comp, engine, model_comp_p, n = H10_fdr_n) |> list_c(),
    H10_ia_sig = H10_ia_p <= sig.level,
    
    marginals = map(H10_1a_model, \(x) emtrends(x, ~ age, var = "age") |> as.tibble()),
  )

refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)
refitting model(s) with ML (instead of REML)

In [37]:

Code

H10_data |> 
  select(metric_type, name, ends_with(c("_p", "_sig"))) |> 
  gt() |> 
  fmt(where(is.numeric), fns = style_pvalue) |> 
  tab_style(cell_text(weight = "bold"),
            locations = cells_body(H10_1a_p, H10_1a_sig)) |> 
  cols_hide(ends_with("_sig")) |> 
  tab_header("H10: Overall model results",
             subtitle = H10) |> 
  tab_footnote("H10_1s_p: p-values for sex, H10_1a_p: p-values for age, H10_ia_p: p-values for interaction of age with site. p-values where adjusted with FDR method and n=34", locations = cells_column_labels(ends_with("_p")))

metric_type	name	H10_1s_p¹	H10_1a_p¹	H10_ia_p¹
H10: Overall model results
Hypothesis: Personal light exposure metrics depend on age and sex.
dynamics	interdaily_stability	>0.9	>0.9	>0.9
dynamics	intradaily_variability	>0.9	0.8	>0.9
level	Mean	>0.9	>0.9	>0.9
level	brightest_10h_mean	>0.9	0.14	>0.9
timing	brightest_10h_midpoint	>0.9	>0.9	0.3
level	darkest_10h_mean	>0.9	>0.9	0.8
timing	darkest_10h_midpoint	>0.9	>0.9	>0.9
duration	duration_above_1000	0.7	0.017	>0.9
duration	period_above_250	>0.9	0.5	>0.9
timing	mean_timing_above_250	>0.9	>0.9	>0.9
timing	first_timing_above_250	>0.9	>0.9	>0.9
timing	last_timing_above_250	>0.9	>0.9	>0.9
exposure history	dose	>0.9	0.2	0.5
spectrum	MDER	>0.9	>0.9	>0.9
duration	duration_below_10_pre-sleep	>0.9	>0.9	>0.9
duration	duration_below_1_sleep	>0.9	>0.9	>0.9
duration	duration_above_250_wake	>0.9	>0.9	>0.9
¹ H10_1s_p: p-values for sex, H10_1a_p: p-values for age, H10_ia_p: p-values for interaction of age with site. p-values where adjusted with FDR method and n=34

After the initial testing, only a single model was significant. The next table summarizes this model

In [38]:

Code

H10_data <- 
H10_data |> 
  mutate(
    H10_r2 = pmap(list(H10_a_model, data, family, response), r2_helper),
    H10_marginal = map_dbl(H10_r2, \(x) x$R2_marginal),
    H10_marginal = ifelse(H10_1a_sig, H10_marginal, NA)
  ) 
# |> pull(H10_r2)

Basic table

In [39]:

Code

H10_tab <-
H10_data |> 
  filter(H10_1a_sig) |> 
  select(
    metric_type, name, H10_1a_sig, H10_1a_p, marginals, H10_marginal
  ) |> 
  unnest(marginals) |> 
  gt() |> 
  fmt_number(decimals = 3) |>
    sub_missing(H10_marginal) |>
  fmt(H10_1a_p, fns = style_pvalue) |>
  cols_merge(
    c(age.trend, SE, asymp.LCL, asymp.UCL),
    pattern = "<<**{1}** ±{2} ({3} - {4})>>"
  ) |> 
  fmt(name, fns = \(x) x |> str_replace_all("_", " ") |> str_to_sentence()) |> 
  as.data.frame() |> 
  gt() |> 

  tab_style(style = cell_text(weight = "bold"),
            locations = cells_body(H10_1a_p, H10_1a_sig == "TRUE")) |> 
  cols_hide(c(H10_1a_sig, df, age, metric_type, SE, asymp.LCL, asymp.UCL)) |> 
  cols_label(
    H10_1a_p = "p-value",
    age.trend = "Age",
    name = "Metric",
    H10_marginal = md("$R^2_{marginal}$")
  ) |> 
  fmt_markdown(age.trend) |> 
  tab_header(
    "H10: Model results",
    subtitle = H10
  ) |> 
  tab_footnote(
    "p-value adjustment for n=34 model comparisons with FDR correction",
    locations = cells_column_labels(H10_1a_p)
  ) |> 
  sub_missing()

In [40]:

Code

H10_tab

Metric	p-value¹	Age	R^2_{marginal}
H10: Model results
Hypothesis: Personal light exposure metrics depend on age and sex.
Duration above 1000	0.017	0.024 ±0.007 (0.011 - 0.037)	0.074
¹ p-value adjustment for n=34 model comparisons with FDR correction

Code

H10_tab |> gtsave("tables/H10.png")

file:////var/folders/9p/326_k3kx43qbn_cyl1rqfhb00000gn/T//RtmpMZyGIO/file7a7e198de9a1.html screenshot completed

Visualization

In [41]:

Code

#| fig-width: 10
#| fig-height: 5
H10_plot1 <- 
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = FALSE) |> 
  filter(name == "duration_above_1000") |> 
  ggplot(aes(x = age, y = metric)) + 
  geom_point(aes(color = site), alpha = 0.5) + 
  facet_wrap(~name, scales = "free") +
  guides(color = "none") +
  theme_cowplot() +
  labs(y = "Local time of day (hr)", x = "Age (yrs)") +
  geom_smooth(method = "lm") +
  scale_color_manual(values = melidos_colors)

H10_plot2 <- 
H10_data |> 
  select(name, data) |> 
  unnest(data) |> 
  site_conv_mutate(rev = TRUE) |> 
  filter_out(name == "last_timing_above_250") |> 
  ggplot(aes(x = age, y = site, color = site)) + 
  geom_boxplot() + 
  theme_cowplot() +
  guides(color = "none") +
  labs(y = NULL, x = "Age (yrs)") +
  scale_color_manual(values = melidos_colors)

H10_plot1 + H10_plot2 + plot_layout(widths = c(3,2)) + plot_annotation(tag_levels = "A")

`geom_smooth()` using formula = 'y ~ x'

Code

#| fig-width: 10
#| fig-height: 5
ggsave("figures/Fig8.pdf", width = 10, height = 5)

`geom_smooth()` using formula = 'y ~ x'

Code

#| fig-width: 10
#| fig-height: 5
ggsave("figures/Fig8.png", width = 10, height = 5)

`geom_smooth()` using formula = 'y ~ x'

Interpretation

Overall, there are no strong connections between melanopic EDI based metrics and age or sex in these datasets. The execption is a strong dependency of age with Duration above 1000 lx melanopic EDI, where for every 10 years of age, about 15 minutes more daily time above 1000 lx is reached. This effect is marginally relevant, with about 7.4% of variance explained.

Hypothesis 11

Hypothesis 11 states:

Personal light exposure patterns depend on sex

Type of model: HGAM (Hierarchical GAM)

Base formula: \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Sex}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant})

Additional notes: - Site effects are modelled with the sz basis, that allows for the most identifiable modelling of parametric differences - Participant effects are modelled with the fs basis, which models the participant-effects as random smooths - melanopic EDI is scaled with log_zero_inflated() before model fitting. - The modle is based on the result from H2

Specific formulas: - \text{melanopic EDI} \sim s(\text{Time}) + s(\text{Time}, \text{Site}) + s(\text{Time}, \text{Participant}) (comparison to the base formula)

False-discovery-rate correction: 1 (there is only one main comparison that tests the effect of site)

Preparation

In [42]:

Code

Load prepared data

#| filename: Load prepared data
load("data/metrics_separate_glasses.RData")
load("data/preprocessed_glasses_2.RData")

metric_participanthour <-
  metric_glasses_participanthour |>
  add_Time_col() |> 
  ungroup() |> 
  mutate(Time = as.numeric(Time)/3600 + 0.25,
         across(c(site, Id, sleep, State.Brown, wear, photoperiod.state),
                fct),
         Date = factor(Date),
         Id_date = interaction(Id, Date),
         lzMEDI = log_zero_inflated(MEDI),
         photoperiod = (dusk - dawn) |> as.numeric()
  ) |> 
  group_by(Id) |> 
  mutate(AR.start = ifelse(row_number() == 1, TRUE, FALSE)) |> 
  ungroup() |> 
  left_join(demographics, by = c("site", "Id")) |> 
  mutate(across(c(site, Id), fct))

In [43]:

Code

Overview plot of the data (median with IQR)

#| filename: Overview plot of the data (median with IQR)
metric_participanthour |>
  group_by(sex) |> 
  aggregate_Date(
    date.handler = \(x) ymd("2025-01-01"),
    numeric.handler = \(x) median(x, na.rm = TRUE),
    p75 = quantile(MEDI, p=0.75, na.rm = TRUE),
    p25 = quantile(MEDI, p=0.25, na.rm = TRUE)
  ) |> 
  gg_doubleplot(geom = "blank", jco_color = FALSE, facetting = FALSE,
                x.axis.breaks.repeat = ~Datetime_breaks(.x, by = "12 hours", shift =
    lubridate::duration(0, "hours"))) +
  geom_ribbon(aes(ymin = p25, ymax = p75, fill = sex), alpha = 0.5) +
  geom_line(aes(color = sex)) +
  geom_hline(yintercept = 250, linetype = "dashed") +
  # facet_wrap(~sex, scales = "free_x") +
  guides(fill = "none") +
  theme(panel.spacing = unit(2, "lines"))

In [44]:

Code

H11 formulas

#| filename: H11 formulas
H11_form_1 <- lzMEDI ~ s(Time, k = 12) + s(Time, photoperiod.state, bs = "sz", k=12) + s(Time, sex, bs = "sz", k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")
H11_form_0 <- lzMEDI ~ s(Time, k = 12) + s(Time, photoperiod.state, bs = "sz", k=12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

Analysis

Base model (H11_1)

In [45]:

Code

Main model

#| filename: Main model
H11_model_1 <- 
  bam(H11_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H11_model_1, 
                      plot=TRUE)

Code

#| filename: Main model
H11_model_1 <- 
  bam(H11_form_1, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H11_model_1, split_pred = "AR.start")

Code

#| filename: Main model
H11_summary_1 <- summary(H11_model_1)
H11_summary_1


Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, photoperiod.state, bs = "sz", 
    k = 12) + s(Time, sex, bs = "sz", k = 12) + s(Time, site, 
    bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, 
    bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.74880    0.03973   18.84   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                              edf   Ref.df       F  p-value    
s(Time)                    10.733   10.862 192.567  < 2e-16 ***
s(Time,photoperiod.state)   6.369    7.411   6.702  < 2e-16 ***
s(Time,sex)                 9.745   10.915   3.410 8.24e-05 ***
s(Time,site)               65.108   72.415   5.398  < 2e-16 ***
s(Time,Id)                725.287 1390.000   3.504  < 2e-16 ***
s(Id_date)                352.651  800.000   0.856  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.5%
fREML =  33225  Scale est. = 0.50524   n = 37603

Code

#| filename: Main model
H11_model_1 |> glance()

# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1171. -31291. 64944. 75026.   20018.      36432. 37603         0.767  2351

Code

#| filename: Main model
H11_model_1 |> appraise()

In [46]:

Code

Variance explained

#| filename: Variance explained
Xp <- predict(H11_model_1, type = "terms")
variance <- apply(Xp, 2, var)
cor(Xp)

                                s(Time) s(Time,photoperiod.state)   s(Time,sex)
s(Time)                    1.0000000000               0.322103335 -0.0434117252
s(Time,photoperiod.state)  0.3221033348               1.000000000  0.0059135114
s(Time,sex)               -0.0434117252               0.005913511  1.0000000000
s(Time,site)               0.0349948576               0.069968384  0.0561000618
s(Time,Id)                 0.0040160603               0.019865415 -0.0067274337
s(Id_date)                -0.0002508802               0.001535659  0.0004158915
                           s(Time,site)   s(Time,Id)    s(Id_date)
s(Time)                    3.499486e-02  0.004016060 -2.508802e-04
s(Time,photoperiod.state)  6.996838e-02  0.019865415  1.535659e-03
s(Time,sex)                5.610006e-02 -0.006727434  4.158915e-04
s(Time,site)               1.000000e+00  0.020393842 -1.175562e-05
s(Time,Id)                 2.039384e-02  1.000000000  1.429862e-01
s(Id_date)                -1.175562e-05  0.142986197  1.000000e+00

Code

#| filename: Variance explained
H11_R2 <- (variance/sum(variance)) |> enframe(value = "R2", name = "parameter")
H11_R2

# A tibble: 6 × 2
  parameter                      R2
  <chr>                       <dbl>
1 s(Time)                   0.837  
2 s(Time,photoperiod.state) 0.00334
3 s(Time,sex)               0.00251
4 s(Time,site)              0.0473 
5 s(Time,Id)                0.0990 
6 s(Id_date)                0.0105

Null model (H11_0)

In [47]:

Code

Null model

#| filename: Null model
H11_model_0 <- 
  bam(H11_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 8
      )

r1 <- start_value_rho(H11_model_0, 
                      plot=TRUE)

Code

#| filename: Null model
H11_model_0 <- 
  bam(H11_form_0, 
      data = metric_participanthour, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H11_model_0, split_pred = "AR.start")

Code

#| filename: Null model
H11_summary_0 <- summary(H11_model_0)
H11_summary_0


Family: gaussian 
Link function: identity 

Formula:
lzMEDI ~ s(Time, k = 12) + s(Time, photoperiod.state, bs = "sz", 
    k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, 
    bs = "fs") + s(Id_date, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.74254    0.03972    18.7   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                              edf   Ref.df       F p-value    
s(Time)                    10.728   10.859 189.248  <2e-16 ***
s(Time,photoperiod.state)   6.372    7.415   6.726  <2e-16 ***
s(Time,site)               65.220   72.433   5.497  <2e-16 ***
s(Time,Id)                739.267 1392.000   3.565  <2e-16 ***
s(Id_date)                352.959  800.000   0.857  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.767   Deviance explained = 77.5%
fREML =  33229  Scale est. = 0.50527   n = 37603

Code

#| filename: Null model
H11_model_0 |> glance()

# A tibble: 1 × 9
     df  logLik    AIC    BIC deviance df.residual  nobs adj.r.squared  npar
  <dbl>   <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <int>         <dbl> <int>
1 1176. -31291. 64952. 75069.   20014.      36427. 37603         0.767  2339

Code

#| filename: Null model
H11_model_0 |> appraise()

Code

#| filename: Null model
H11_model_0 |> draw()

Model comparison

In [48]:

Code

H11_AICs <- AIC(H11_model_1, H11_model_0)
H11_AICs |> as_tibble(rownames = "model") |> gt()

model	df	AIC
H11_model_1	1181.214	64944.31
H11_model_0	1185.365	64952.13

While sex only explains very little of the overall structure (<1% of variance explained), is has a ∆AIC >2 and will thus be included

Model visualization

In [49]:

Code

prepare lightsource model H11 plot

#| filename: prepare lightsource model H11 plot
H11_data <- 
H11_model_1 |> 
  conditional_values(
    condition = list(Time = seq(0.25, 23.75, by = 0.5), "sex", "photoperiod.state"),
    exclude = c("s(Time,Id)", "s(Id_date)", "s(Time,site)")
    ) |> 
  mutate(across(c(.fitted, .lower_ci, .upper_ci),
                exp_zero_inflated)
  )

H11_data <-
metric_participanthour |> 
  Datetime2Time() |> 
  group_by(sex, Time) |> 
  summarize(
    .groups = "drop_last",
    dawn = dawn |> as.numeric() |> mean(na.rm = TRUE)/3600,
    dusk = dusk |> as.numeric() |> mean(na.rm = TRUE)/3600,
    n = n()
  ) |> 
  mutate(
    photoperiod.state = ifelse(between(Time, dawn, dusk), "day", "night")
  ) |> 
  select(-c(dawn, dusk)) |> 
  left_join(H11_data, by = c("sex", "Time", "photoperiod.state"))

H11_data_phot <- 
H11_data |> 
  number_states(photoperiod.state) |> 
  filter(photoperiod.state == "night") |> 
  group_by(sex, photoperiod.state.count) |> 
  summarize(xmin = min(Time),
            xmax = max(Time),
            .groups = "drop_last") |> 
  mutate(across(c(xmin, xmax), \(x) replace_values(x,
                                                   0.5 ~ 0,
                                                   23.5 ~ 24)))

In [50]:

Code

Visualize lightsource model H11

#| filename: Visualize lightsource model H11
#| fig-height: 5
#| fig-width: 14
source("scripts/Brown_bracket.R")

Loading required package: legendry

Code

#| filename: Visualize lightsource model H11
#| fig-height: 5
#| fig-width: 14
H11_pred <-
H11_data |> 
  ggplot(aes()) +
  map(c(1,10,250, 10^4), 
          \(x) geom_hline(aes(yintercept = x), 
                          col = "grey", linetype = "dashed")
      ) +
  geom_rect(data = H11_data_phot |> filter(sex == "Female"),
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(x = Time, ymin = .lower_ci, 
                  ymax = .upper_ci, fill = sex), alpha = 0.5) +
  
  geom_line(aes(x = Time, y = .fitted, 
                col = sex), linewidth = 1) +
  # scale_fill_manual(values = act_colors) +
  # scale_color_manual(values = act_colors) +
  theme_cowplot() +
  scale_y_continuous(trans = "symlog", breaks = c(0,1,10,100,250, 1000)) +
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), ylim = c(0,10^4), expand = FALSE) +
  labs(y = "Melanopic EDI (lx)", x = "Local time (hrs)", fill = "Sex", col = "Sex") +
  # facet_wrap(~sex, nrow = 1) +
  guides( y = guide_axis_stack(Brown_bracket, "axis")) +
  # gghighlight(
    # label_key = sex,
    # label_params = list(x = 12, y = 4.5, force = 0, hjust = 0.5)
  # ) +
  # geom_text(aes(label = paste0("|", n, "|"), x = Time, y = 7000), size = rel(1.5)) +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple(),
        legend.position = "inside",
        legend.position.inside = c(0.45, 0.15)
        )
  # labs(caption = Brown_caption)
H11_pred

In [51]:

Code

df <- H11_model_1$model |> distinct(Time, .keep_all = TRUE)
H11_termplot <- 
difference_sz(H11_model_1, 
              df, 
              "Time", "sex", "Female", "Male",
              "s(Time,sex)",
              exclude = c("s(Time,site)", "s(Time,Id)", "s(Id_date)"),
              unconditional = FALSE,
              level = 0.95
  ) |> 
    mutate(
         signif = lower > 0 | upper < 0,
         signif_id = consecutive_id(signif)) |> 
  ggplot(aes()) +
  map(c(log10(c(0.1, 0.2, 0.5, 2, 5, 10))),
      \(x) geom_hline(yintercept = x, linewidth = 0.5, col = "grey", linetype = 2)) +
  geom_rect(data = H11_data_phot |> filter(sex == "Female"),
            aes(xmin = xmin, ymin = -Inf, ymax = Inf, xmax = xmax),
            alpha = 0.25, inherit.aes = FALSE) +
  geom_ribbon(aes(x = Time, ymin = lower, ymax = upper), alpha = 0.5) +
  geom_hline(yintercept = 0, linewidth = 1, linetype = 2) +
  geom_line(aes(x = Time, y = diff), 
            linewidth = 1) +
  geom_line(data = \(x) filter(x, signif),
            aes(x = Time, y = diff, group = signif_id), 
            col = "red", linewidth = 1) +
    theme_cowplot() +
  scale_y_continuous(labels = expression(10^-1, 5^-1, 2^-1, "1  ", "2  ", "5  ", "10  "), 
                     breaks = log10(c(0.1, 0.2, 0.5, 1, 2, 5, 10)))+
  scale_x_continuous(breaks = seq(0, 24, by = 6)) +
  coord_cartesian(xlim = c(0,24), expand = FALSE) +
  guides(fill = "none") +
  labs(y = "Rel. difference Female to Male", x = "Local time (hrs)") +
  theme_sub_strip(background = element_blank(),
                  text = element_blank()) +
  theme(panel.spacing = unit(1, "lines"),
        plot.caption = element_textbox_simple())

In [52]:

Code

Combination of partial plots H11

#| fig-height: 8
#| fig-width: 8
#| filename: Combination of partial plots H11

H11_pred / H11_termplot +
  plot_annotation(tag_levels = "A", 
                  caption = paste(Brown_caption, 
                                  "Panel B: Significant deviations of work from free days are shown in red", sep = "<br>"),
                  theme = 
                    theme(plot.caption = element_textbox_simple(size = rel(1))
                          )
                  )

Code

#| fig-height: 8
#| fig-width: 8
#| filename: Combination of partial plots H11

ggsave("figures/Fig13.pdf", height = 8, width = 7.3)
ggsave("figures/Fig13.png", height = 8, width = 7.3)

Interpretation

Sex is a minor differentiator in terms of melanopic stimulus, that explanes less than 1% of the variance. The only significant difference between the sexes is a lower value during the morning for females compared to males. This phase lasts about 2 hours and has a peak difference of about factor 2.

Testing for other drivers

In [53]:

Code

mHLEA <- load_data("lightexposurediary") |> flatten_data()

loading modality: lightexposurediary ■■■■■■■■■■■■■■                    44% |  E…

Code

mHLEA_labels <- 
  mHLEA|> extract_labels() |> 
  enframe() |> 
  filter(str_detect(name, "act_")) |> 
  mutate(name = name |> str_remove("act_"))

hourly_data <- 
  metric_participanthour #|> 
  # aggregate_Datetime(
  #   "1 hour",
  #   type = "floor",
  #   numeric.handler = \(x) x |> mean(na.rm = TRUE),
  #   geo.MEDI = MEDI |> log_zero_inflated() |> mean(na.rm = TRUE) |> exp_zero_inflated()
  # )|>
  # add_Date_col(group.by = TRUE) |> 
  # mutate(static = all(MEDI == MEDI[1])) |> 
  # filter_out(static) |> 
  # select(-static) |> 
  # ungroup(Date)

hourly_data <- 
  hourly_data |> 
  group_by(site, Id, Date) |> 
  add_states(mHLEA |> mutate(Date = factor(Date)))

hourly_data2 <-
  hourly_data |> 
  group_by(site, lightsource_primary) |> 
  mutate(n = n(),
         lightsource_primary = lightsource_primary |> 
           replace_when((n < 20 ) ~ NA)
           ) |> 
  group_by(lightsource_primary) |> 
  mutate(n = n(),
         lightsource_primary = lightsource_primary |> 
           replace_when((n < 200 ) ~ NA,
                        lightsource_primary == "Light entering from outside during sleep" ~ NA),
         site = factor(site)
           )

activity_data <- 
hourly_data2 |> 
  ungroup() |> 
  select(site, sex, Id, Time, lzMEDI, Id_date, photoperiod, photoperiod.state, starts_with("act_"), Datetime, lightsource_primary) |> 
  pivot_longer(cols = starts_with("act_"), names_to = "activity", names_prefix = "act_", values_to = "value") |> 
  filter(value, !is.na(lzMEDI)) |>
  distinct(.keep_all = TRUE, site, sex, Time, Id, Datetime, photoperiod.state, activity) |>
  select(-value) |> 
  mutate(
    Id = factor(Id)
  ) |> 
    group_by(Id) |> 
  mutate(AR.start = ifelse(row_number() == 1, TRUE, FALSE)) |> 
  ungroup()

activity_data <-
  activity_data |> 
  mutate(activity = 
           fct(activity) |> 
           fct_collapse(outdoor = c("free_outdoor", "working_outdoor", "road_open")) |> 
           fct_recode(NULL = "other") |> 
           fct_relevel("home")
         )

In [54]:

Code

H11 formulas

#| filename: H11 formulas
H11_form_1b <- lzMEDI ~ s(Time, k = 12) + s(Time, photoperiod.state, bs = "sz", k=12) + s(Time, activity, bs = "sz", k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")
H11_form_2 <- lzMEDI ~ s(Time, k = 12) + s(Time, photoperiod.state, bs = "sz", k=12) + s(Time, sex, bs = "sz", k = 12) + s(Time, activity, bs = "sz", k = 12) + s(Time, site, bs = "sz", k = 12) + s(Time, Id, bs = "fs") + s(Id_date, bs = "re")

In [55]:

Code

Main model

#| filename: Main model
H11_model_1b <- 
  bam(H11_form_1b, 
      data = activity_data, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H11_model_1b, 
                      plot=TRUE)

Code

#| filename: Main model
H11_model_1b <- 
  bam(H11_form_1b, 
      data = activity_data, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H11_model_1b, split_pred = "AR.start")

In [56]:

Code

Activity model

#| filename: Activity model
H11_model_2 <- 
  bam(H11_form_2, 
      data = activity_data, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10
      )

r1 <- start_value_rho(H11_model_2, 
                      plot=TRUE)

Code

#| filename: Activity model
H11_model_2 <- 
  bam(H11_form_2, 
      data = activity_data, 
      discrete = TRUE,
      method = "fREML",
      nthreads = 10,
      rho = r1,
      AR.start = AR.start
      )

acf_resid(H11_model_2, split_pred = "AR.start")

In [57]:

Code

# H11_model_1b |> draw(select = "s(Time,sex)")
H11_model_2 |> draw(select = "s(Time,sex)")

Code

H11_model_2 |> draw(select = "s(Time,activity)")

Code

AIC(H11_model_1b, H11_model_2)

                   df      AIC
H11_model_1b 1190.520 55694.61
H11_model_2  1194.293 55692.87

Session Info

In [58]:

Code

sessionInfo()

R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin20
Running under: macOS 26.3.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] legendry_0.2.4      gtsummary_2.5.0     gghighlight_0.5.0  
 [4] glmmTMB_1.1.14      gt_1.3.0            broom.mixed_0.2.9.7
 [7] rlang_1.2.0         patchwork_1.3.2     emmeans_2.0.3      
[10] gratia_0.11.2       itsadug_2.5         plotfunctions_1.5  
[13] broom_1.0.12        sjPlot_2.9.0        ggtext_0.1.2       
[16] ggridges_0.5.7      cowplot_1.2.0       performance_0.16.0 
[19] lme4_2.0-1          Matrix_1.7-3        mgcv_1.9-4         
[22] nlme_3.1-168        LightLogR_0.10.2    melidosData_1.0.6  
[25] lubridate_1.9.5     forcats_1.0.1       stringr_1.6.0      
[28] dplyr_1.2.1         purrr_1.2.2         readr_2.2.0        
[31] tidyr_1.3.2         tibble_3.3.1        ggplot2_4.0.2      
[34] tidyverse_2.0.0    

loaded via a namespace (and not attached):
 [1] Rdpack_2.6.6        sandwich_3.1-1      magrittr_2.0.5     
 [4] furrr_0.4.0         otel_0.2.0          compiler_4.5.0     
 [7] systemfonts_1.3.2   vctrs_0.7.3         pkgconfig_2.0.3    
[10] fastmap_1.2.0       backports_1.5.1     labeling_0.4.3     
[13] utf8_1.2.6          promises_1.5.0      rmarkdown_2.31     
[16] markdown_2.0        tzdb_0.5.0          nloptr_2.2.1       
[19] ragg_1.5.2          xfun_0.57           litedown_0.9       
[22] jsonlite_2.0.0      later_1.4.8         parallel_4.5.0     
[25] R6_2.6.1            stringi_1.8.7       RColorBrewer_1.1-3 
[28] parallelly_1.47.0   boot_1.3-31         numDeriv_2016.8-1.1
[31] estimability_1.5.1  Rcpp_1.1.1-1        knitr_1.51         
[34] zoo_1.8-15          base64enc_0.1-6     mvnfast_0.2.8      
[37] splines_4.5.0       timechange_0.4.0    tidyselect_1.2.1   
[40] rstudioapi_0.18.0   yaml_2.3.12         websocket_1.4.4    
[43] TMB_1.9.21          codetools_0.2-20    processx_3.8.7     
[46] listenv_0.10.1      lattice_0.22-7      withr_3.0.2        
[49] S7_0.2.1-1          coda_0.19-4.1       evaluate_1.0.5     
[52] future_1.70.0       xml2_1.5.2          pillar_1.11.1      
[55] renv_1.1.4          reformulas_0.4.4    insight_1.5.0      
[58] generics_0.1.4      chromote_0.5.1      hms_1.1.4          
[61] commonmark_2.0.0    mirai_2.6.1         scales_1.4.0       
[64] minqa_1.2.8         globals_0.19.1      xtable_1.8-8       
[67] glue_1.8.0          webshot2_0.1.2      tools_4.5.0        
[70] fs_2.0.1            mvtnorm_1.3-7       grid_4.5.0         
[73] ggokabeito_0.1.0    rbibutils_2.4.1     cards_0.7.1        
[76] cli_3.6.6           textshaping_1.0.5   gtable_0.3.6       
[79] tweedie_3.0.17      sass_0.4.10         digest_0.6.39      
[82] pbkrtest_0.5.5      nanonext_1.8.2      htmlwidgets_1.6.4  
[85] farver_2.1.2        htmltools_0.5.9     lifecycle_1.0.5    
[88] statmod_1.5.1       gridtext_0.1.6      MASS_7.3-65