Analysis of human visual experience data

Authors

Affiliations

Johannes Zauner

Technical University of Munich, Germany

Max Planck Institute for Biological Cybernetics, Germany

Aaron Nicholls

Reality Labs Research, USA

Lisa A. Ostrin

University of Houston College of Optometry, USA

Manuel Spitschan

Technical University of Munich, Germany

Max Planck Institute for Biological Cybernetics, Germany

Technical University of Munich, Institute for Advanced Study (TUM-IAS), Germany

Last modified:

December 19, 2025

DOI

10.5281/zenodo.16566014

Abstract

Exposure to the optical environment, often referred to as visual experience, profoundly influences human physiology and behavior across multiple time scales. In controlled laboratory settings, stimuli can be held constant or manipulated parametrically. However, such exposures rarely replicate real-world conditions, which are inherently complex and dynamic, generating high-dimensional datasets that demand rigorous and flexible analysis strategies. This tutorial presents an analysis pipeline for visual experience datasets, with a focus on reproducible workflows for human chronobiology and myopia research. We provide step-by-step instructions for importing, visualizing, and processing viewing distance and light exposure data. This includes time-series analyses for working distance, biologically relevant light metrics, and spectral characteristics. The tasks are standardized through the open-source R package LightLogR. By leveraging a modular approach, the tutorial supports researchers in building flexible and robust pipelines that accommodate diverse experimental paradigms and measurement systems.

Keywords

wearable, light logging, viewing-distance, visual experience, metrics, circadian, myopia, risk factors, spectral analysis, open-source, reproducibility

1 Introduction

Exposure to the optical environment — often referred to as visual experience — profoundly influences human physiology and behavior across multiple time scales. Two notable examples, from distinct research domains, can be understood through a common retinally-referenced framework.

The first example relates to the non-visual effects of light on human circadian and neuroendocrine physiology. The light–dark cycle entrains the circadian clock, and light exposure at night suppresses melatonin production (Brown et al. 2022; Blume, Garbazza, and Spitschan 2019). The second example concerns the influence of visual experience on ocular development, particularly myopia. Time spent outdoors — which features distinct optical environments — has been consistently associated with protective effects on ocular growth and health outcomes (Dahlmann-Noor et al. 2025).

In controlled laboratory settings, light exposure can be held constant or manipulated parametrically. In contrast, real-world conditions are inherently complex and dynamic, and cannot be captured by single spot measurements. As people move in and between spaces (indoors and outdoors) and move their body, head, and eyes, exposure to the optical environment varies significantly (Webler et al. 2019) and is modulated by behavior (Biller, Balakrishnan, and Spitschan 2024). Wearable devices for measuring light exposure have thus emerged as vital tools to capture the ecological visual experience. These tools generate high-dimensional datasets that demand rigorous and flexible analysis strategies.

Starting in the 1980s (Okudaira, Kripke, and Webster 1983), technology to measure optical exposure has matured, with miniaturized illuminance sensors now (in 2025) very common in consumer wearables (van Duijnhoven et al. 2025). In research, several devices are available that differ in functionality, ranging from small pins measuring ambient illuminance (Mohamed et al. 2021) to head-mounted multi-modal devices capturing nearly all relevant aspects of visual experience (Gibaldi et al. 2024). Increased capabilities in wearables bring complex, dense datasets. These go hand-in-hand with a proliferation of metrics, as highlighted by recent review papers in both circadian and myopia research (Hönekopp and Weigelt 2023; Hartmeyer and Andersen 2023).

At present, the analysis processes to derive metrics are often implemented on a per-laboratory or even per-researcher basis. This fragmentation is a potential source of errors and inconsistencies between studies, consumes considerable researcher time (Hartmeyer, Webler, and Andersen 2022), and hinders harmonization and meta-analysis across studies, because the resulting processes and formats are bespoke. Very often, more time is spent preparing data than gaining insights through rigorous statistical analysis. These preparation tasks are best handled, or at least facilitated, by standardized, transparent, community-based analysis pipelines (J. Zauner, Udovicic, and Spitschan 2024).

In circadian research, the R package LightLogR was developed to address this need (J. Zauner, Hartmeyer, and Spitschan 2025). LightLogR is an open-source, MIT-licensed, community-driven package specifically designed for data from wearable light loggers and optical radiation dosimeters. It contains functions to calculate over sixty different metrics used in the field (Hartmeyer and Andersen 2023). The package functions come with light-related defaults, but they remain fundamentally agnostic to modality. As a result, parameters like viewing distance and light spectra, both highly relevant to myopia research (Hönekopp and Weigelt 2023), can easily be handled.

In this article, we demonstrate that LightLogR’s analysis pipelines and metric functions apply broadly across the field of visual experience research, not just to circadian rhythms and chronobiology. Our approach is modular and extensible, allowing researchers to adapt it to a variety of devices and research questions. Emphasis is placed on clarity, transparency, and reproducibility, aligning with best practices in scientific computing and open science. We use example data from two devices (worn by different individuals and at different times) to showcase the LightLogR workflow with metrics relevant to myopia research, covering working distance, (day)light exposure, and spectral analysis. Readers are encouraged to recreate the analysis using the provided code. All necessary data and code are openly available in the GitHub repository.

Scope

This article focuses on workflows for deriving condensed metrics from time-series data collected with wearable devices in the visual-experience domain. Specifically, we address illuminance, viewing distance, and spectral irradiance. Example datasets from two types of wearable devices are used for illustration.

Many relevant considerations arise when collecting data with wearable devices. This article covers only a subset of these. In particular, it does not address:

device selection (see, e.g., van Duijnhoven et al. 2025; Johannes Zauner, Stefani, et al. 2025)
measurement accuracy or device calibration
auxiliary data such as sleep/wake information (see, e.g., J. Zauner et al. 2025; Guidolin et al. 2024)

More information on those aspects can be found in the Technical guide for wearable optical radiation dosimetry and visual experience assessment (Johannes Zauner, Baraas, et al. 2025).

To demonstrate the workflows, this article uses expert-informed definitions of metrics and metric parameters (see, e.g., Table 1, Table 2, and the non-wear detection rules based on activity data described in Supplement 1). These definitions and thresholds should not be interpreted as universal standards, nor are they hard-coded into the software package. For any application, parameter choices must be tailored to the research domain, study context and design, and the specifications of the wearable device.

Further, the article is split into the main analysis, where all metrics are calculated, and Supplement 1, where data are imported, screened, and prepared. Readers are therefore referred to Supplement 1 for all aspects of data formats, preparation steps, and the handling of gaps, i.e., missing data.

Lastly, the example data used in the article do not stem from a controlled experimental data collection but consist of pilot data gathered in an ecological setting without a fixed protocol. Given the substantial interindividual differences in visual experience metrics, and because the analyses focus on one participant at a time, the reported results should be interpreted as illustrative rather than representative of typical or population-level values.

2 Methods and materials

2.1 Software

This tutorial was built with Quarto, an open-source scientific and technical publishing system that integrates text, code, and code output into a single document. The source code to reproduce all results is included and accessible via the Quarto document’s code tool menu. All analyses were conducted in R (version 4.5.0, “How About a Twenty-Six”) using LightLogR (version 0.10.0 “High noon”). We also used the tidyverse suite (version 2.0.0) for data manipulation (which LightLogR follows in its design), and the gt package (version 1.1.0) for generating summary tables. A comprehensive overview of the R computing environment is provided in the session info (see Session info section).

2.2 Metric selection and definitions

In March 2025, two workshops with myopia researchers, initiated by the Research Data Alliance (RDA) Working Group on Optical Radiation Exposure and Visual Experience Data, focused on current needs and future opportunities in data analysis, including the development and standardization of metrics. Based on expert input from these workshops, the authors of this tutorial compiled a list of visual experience metrics, shown in Table 1. These include many currently used metrics and definitions (Wen et al. 2020, 2019; Bhandari and Ostrin 2020; Williams et al. 2019), as well as new metrics enabled by spectrally-resolved measurements. Although they were not derived through a formal consensus process, they are expert-informed and used in current scientific research, and thus serve as example definitions for metrics and thresholds throughout this article.

Table 1: Overview of metrics as they are used in this article. In all cases, the averages for weekday, weekend, and the mean daily value are calculated through mean_daily().

No.	Name	Implementation¹
	Distance
1	Total wear time daily	durations()
2	Duration of Near work, Intermediate Work, Near + Intermediate Work, or per each Distance range (10cm steps)	filter for distance range + durations() (for single ranges) or grouping by distance range + durations() (for all ranges)
3	Frequency of Continuous near work	extract_clusters() + summarize_numeric()
4	Frequency, duration, and distances of Near Work episodes	extract_clusters() + extract_metric() + summarize_numeric()
5	Frequency and duration of Visual breaks	extract_clusters() + filter
	Light
6	Light exposure (in lux)	summarize_numeric()
7	Duration per Outdoor range	grouping by Outdoor range + durations()
8	The number of times light level changes from indoor (<1000 lx) to outdoor (>1000 lx)	extract_states() + summarize_numeric()
9	Longest period above 1000 lx	period_above_threshold()
	Spectrum
10	Ratio of short vs. long wavelength light	spectral_integration() + summarize_numeric()
11	Melanopic daylight efficacy ratio (MDER)	spectral_integration() + summarize_numeric()
12	Short-wavelength light at certain times of day	spectral_integration() + filter_Time() (for defined times) or cut_Datetime() (for regular time intervals) or add_photoperiod() (for solar times) + grouping by time state + summarize_numeric()
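To make the logic behind metrics 8 and 9 in Table 1 concrete, the following is a minimal base-R sketch that counts indoor-to-outdoor transitions and finds the longest period at or above 1000 lx. It deliberately does not use the LightLogR functions named in the table. The illuminance values, the 5-second interval, and all object names are assumptions made up for illustration, and the 1000 lx boundary follows the Outdoor range in Table 2.

```r
# Sampling interval in seconds (assumed for this example)
dt <- 5

# Hypothetical illuminance readings in lux, one per sampling interval
lux <- c(200, 500, 1200, 1500, 800, 300, 2500, 3000, 1100, 1000, 400)

# Classify each observation: TRUE = outdoor (>= 1000 lx), FALSE = indoor
outdoor <- lux >= 1000

# A transition from indoor to outdoor is a FALSE -> TRUE step (difference of +1)
transitions <- sum(diff(outdoor) == 1)

# Run-length encoding groups consecutive identical states
runs <- rle(outdoor)

# Longest uninterrupted outdoor run, converted from samples to seconds
longest_s <- max(runs$lengths[runs$values]) * dt

transitions
longest_s
```

With real data, gaps and short interruptions would need explicit handling before counting runs, which is part of what the packaged functions are meant to take care of.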

Table 2 provides definitions for the terms used in Table 1. Note that specific definitions may vary depending on the research question or device capabilities.

Table 2: Definitions of mean daily and conditions for distance and illuminance calculation as used in the article

Metric	Description / pseudo formula
Total wear time	$\sum(t) \cdot dt, \textrm{ where } t\textrm{: valid observations}$
Mean daily	$\frac{5\,\overline{\textrm{weekday}} + 2\,\overline{\textrm{weekend}}}{7}$
Near work	$\textrm{working distance}, [15,60)\,\textrm{cm}$
Intermediate Work	$\textrm{working distance}, [60,100)\,\textrm{cm}$
Total work²	$\textrm{working distance}, [15,120)\,\textrm{cm}$
Distance range	$\textrm{working distance}, \begin{cases} [15,20)\,\textrm{cm} & \textrm{Extremely near} \\ [20,30)\,\textrm{cm} & \textrm{Very near} \\ [30,40)\,\textrm{cm} & \textrm{Fairly near} \\ [40,50)\,\textrm{cm} & \textrm{Near} \\ [50,60)\,\textrm{cm} & \textrm{Moderately near} \\ [60,70)\,\textrm{cm} & \textrm{Near intermediate} \\ [70,80)\,\textrm{cm} & \textrm{Intermediate} \\ [80,90)\,\textrm{cm} & \textrm{Moderately intermediate} \\ [90,100)\,\textrm{cm} & \textrm{Far intermediate} \end{cases}$
Continuous near work	$\textrm{working distance}, [20,60)\,\textrm{cm},\ T_\textrm{duration} \geq 30\,\textrm{minutes},\ T_\textrm{interruptions} \leq 1\,\textrm{minute}$
Near work episodes	$\textrm{working distance}, [20,60)\,\textrm{cm},\ T_\textrm{interruptions} \leq 20\,\textrm{seconds}$
Ratio of daily near work	$\frac{T_\textrm{near work}}{T_\textrm{total wear}}$
Visual break	$\textrm{working distance} \geq 100\,\textrm{cm},\ T_\textrm{duration} \geq 20\,\textrm{seconds},\ T_\textrm{previous episode} \leq 20\,\textrm{minutes}$
Outdoor range	$\textrm{illuminance}, \begin{cases} [1000,2000)\,\textrm{lx} & \textrm{Outdoor bright} \\ [2000,3000)\,\textrm{lx} & \textrm{Outdoor very bright} \\ [3000,\infty)\,\textrm{lx} & \textrm{Outdoor extremely bright} \end{cases}$
Light exposure³	$\overline{\textrm{illuminance}}$
Spectral bands	$\textrm{spectral irradiance}, \begin{cases} [380,500]\,\textrm{nm} & \textrm{short wavelength light} \\ [600,780]\,\textrm{nm} & \textrm{long wavelength light} \end{cases}$
Ratio of short vs. long wavelength light	$\frac{E_{e\textrm{,short wavelength}}}{E_{e\textrm{,long wavelength}}}$
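Several definitions in Table 2 reduce to simple operations on a regular time series. The base-R sketch below illustrates three of them: assigning observations to the near and intermediate work ranges, converting counts into durations, and computing the weighted mean daily value. It is not the LightLogR implementation; the sampling interval, the distance values, the daily totals, and all object names are assumptions chosen for the example.

```r
# Sampling interval in seconds (assumed; matches the Clouclip example data)
dt <- 5

# Hypothetical working distances in cm; NA marks a missing observation
distance_cm <- c(25, 35, 58, 60, 75, 99, 100, 150, NA, 45)

# Left-closed, right-open bins: [15, 60) near work, [60, 100) intermediate work
category <- cut(distance_cm, breaks = c(15, 60, 100),
                labels = c("Near work", "Intermediate work"), right = FALSE)

# Duration per category = number of observations x sampling interval
duration_s <- setNames(as.vector(table(category)) * dt, levels(category))

# Total wear time = number of valid observations x sampling interval
wear_time_s <- sum(!is.na(distance_cm)) * dt

# Hypothetical daily near-work totals (minutes) for an incomplete week
daily <- data.frame(
  Date = as.Date(c("2025-01-06", "2025-01-08", "2025-01-10",
                   "2025-01-11", "2025-01-12")),
  near_work_min = c(60, 80, 100, 30, 50)
)

# wday: 0 = Sunday, 6 = Saturday (independent of the system locale)
is_weekend <- as.POSIXlt(daily$Date)$wday %in% c(0, 6)

# Mean daily: weekday mean weighted by 5, weekend mean weighted by 2
mean_daily <- (5 * mean(daily$near_work_min[!is_weekend]) +
               2 * mean(daily$near_work_min[is_weekend])) / 7
```

Because only three weekdays are present here, the weighted value (about 68.6 minutes) differs from the plain average of the five recorded days (64 minutes), which is the reason for weighting by day type.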

It should be noted that although daylight levels can far exceed the thresholds defined in Table 2 (and may reach or even exceed 10^5 lux), empirical daylight levels measured at eye level are much lower, typically around 10^3 lux, especially when considering time-series data aggregated over minutes to hours.

2.3 Devices

Data from two wearable devices are used in this analysis:

Clouclip: A wearable device that measures viewing distance and ambient light simultaneously (Glasson Technology Co., Ltd, Hangzhou, China; Wen et al. 2021, 2020). The Clouclip provides a simple data output with only distance (working distance, in centimeters) and illuminance (ambient light, in lux). Data in our example were recorded at 5-second intervals. Approximately one week of data (~120,960 observations) is about 1.6 MB in size.
Visual Environment Evaluation Tool (VEET): A head-mounted multi-modal device that logs multiple data streams (Reality Labs Research, Menlo Park, CA, USA; Sah, Narra, and Ostrin 2025; Sullivan et al. 2024). The VEET dataset used here contains simultaneous measurements of distance (via a time-of-flight sensor), ambient light (illuminance), activity (accelerometer & gyroscope), and spectral irradiance (multi-channel light sensor). Data were recorded at 2-second intervals, yielding a very dense dataset (~270 MB per week).
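The observation count quoted for the Clouclip follows directly from the sampling interval. As a quick check, the snippet below computes the number of observations in one uninterrupted week for a given interval; the function name is hypothetical, and the VEET figure is derived here for comparison rather than taken from the device description.

```r
# Seconds in one week
week_s <- 7 * 24 * 60 * 60

# Number of observations in a gap-free week for a given sampling interval
obs_per_week <- function(interval_s) week_s / interval_s

obs_per_week(5)  # Clouclip, 5-second interval: 120960
obs_per_week(2)  # 2-second interval, per data stream: 302400
```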

2.4 Data processing summary

The Results section uses imported and pre-processed data from the two devices to calculate metrics. Supplement 1 contains the annotated code and a description of the steps involved, which are summarized below and shown in Figure 1. Please refer to the supplement for details.

Code

flowchart TD

classDef input fill:#f7f7f7,stroke:#2f3d4a,color:#0f1a22,stroke-width:1px;
classDef process fill:#ffffff,stroke:#2f3d4a,color:#0f1a22,stroke-width:1px;
classDef output fill:#eef2f7,stroke:#2f3d4a,color:#0f1a22,stroke-width:1px;

%% ===== Raw inputs =====
RAW_CC[(Raw Clouclip file)]:::input
RAW_VEET[(Raw VEET file)]:::input

%% ===== Clouclip preprocessing =====
CC_IMP[Import<br/>Clouclip data]:::process
CC_QC[Inspect data,<br/>check for gaps and<br/>irregular data]:::process
CC_REG[Regularize timestamps<br/>and handle gaps]:::process
CC_VIS[Visualize cleaned<br/>Clouclip data]:::process

RAW_CC --> Clouclip
subgraph Clouclip[**Light and distance**]
CC_IMP --> CC_QC
CC_QC --> CC_REG
CC_REG --> CC_VIS
end

%% ===== VEET ALS light preprocessing =====
ALS_IMP[Import VEET<br/>*ALS*<br/>light modality]:::process
ALS_QC[Inspect, adjust interval,<br/>handle gaps, remove high<br/>missing days]:::process
IMU_IMP[Import VEET<br/>*IMU*<br/>actigraphy modality]:::process
IMU_NONWEAR[Detect **non-wear**<br/>using activity rules]:::process
ALS_NONWEAR[Add, label and visualize<br/>**non-wear**]:::process
ALS_CLEAN[Remove **non-wear**<br/>observations]:::process

RAW_VEET --> ALS_IMP
RAW_VEET --> IMU_IMP
subgraph Light[**Ambient light**]
ALS_IMP --> ALS_QC
ALS_QC --> ALS_NONWEAR
IMU_IMP --> IMU_NONWEAR
IMU_NONWEAR --> ALS_NONWEAR
ALS_NONWEAR --> ALS_CLEAN
end

%% ===== VEET spectral preprocessing =====
SPEC_IMP[Import VEET<br/>*PHO*<br/>spectral channels]:::process
SPEC_NORM[Normalize spectral<br/>channels by gain and<br/>integration time]:::process
SPEC_CLEAN[Inspect, adjust interval,<br/>handle gaps, remove high<br/>missing days]:::process
SPEC_CALfile[(Calibration matrix)]:::input
SPEC_CAL[Import calibration matrix]:::process
SPEC_RECON[Reconstruct and visualize<br/>spectra,<br/>calculate illuminance]:::process

RAW_VEET --> SPEC_IMP
subgraph Spectrum[**Spectral data**]
SPEC_IMP --> SPEC_NORM
SPEC_NORM --> SPEC_CLEAN
SPEC_CLEAN --> SPEC_RECON
SPEC_CALfile --> SPEC_CAL
SPEC_CAL --> SPEC_RECON
end

%% ===== VEET distance preprocessing =====
TOF_IMP[Import VEET distance<br/>*TOF*<br/>modality]:::process
TOF_QC[Inspect, adjust interval,<br/>handle gaps, remove high<br/>missing days]:::process
TOF_PIV[Pivot from *wide* to *long*<br/>format and label spatial<br/>position]:::process
TOF_NONWEAR[Add, label and remove<br/>**non-wear**]:::process

IMU_NONWEAR -.-> TOF_NONWEAR
RAW_VEET --> Distance
subgraph Distance[**Spatial distance**]
TOF_IMP --> TOF_QC
TOF_QC --> TOF_PIV
TOF_PIV --> TOF_NONWEAR
end
%% ===== Save outputs =====
SAVE[(Pre-processed datasets)]:::input

CC_VIS -- Cleaned Clouclip<br/>dataset<br/>**dataCC** --> SAVE
ALS_CLEAN -- Cleaned VEET<br/>ALS dataset<br/>**dataVEET** --> SAVE
SPEC_RECON -- Cleaned VEET<br/>spectral dataset<br/>**dataVEET2** --> SAVE
TOF_NONWEAR -- Cleaned VEET<br/>distance dataset<br/>**dataVEET3** --> SAVE

class Clouclip,Distance,Spectrum,Light output;

Figure 1: Pre-processing steps in the supplement document

Data import: We imported raw data from the Clouclip and VEET devices using LightLogR’s built-in import functions, which automatically handle device-specific formats and idiosyncrasies.

The Clouclip export file (provided as a tab-delimited text file) contains timestamped records of distance (cm) and illuminance (lux). LightLogR’s import$Clouclip function reads this file, after specifying the device’s recording timezone, and converts device-specific sentinel codes into proper missing values. For instance, the Clouclip uses special numeric codes to indicate when it is in “sleep mode” or when a reading is out of the sensor’s range, rather than recording a normal value. LightLogR identifies -1 (for both distance and lux) as indicating the device’s sleep mode and 204 (for distance) as indicating the object was beyond the measurable range, replacing these with NA and logging their status in separate columns. The import routine also provides an initial summary of the dataset, including start and end times and any irregular sampling intervals or gaps.
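The sentinel-code handling described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the code inside the import function; the column names and the status labels are assumptions, while the codes -1 and 204 are those given in the text.

```r
# Hypothetical raw Clouclip-style readings with sentinel codes
raw <- data.frame(
  Distance = c(35, -1, 204, 52),   # cm; -1 = sleep mode, 204 = out of range
  Lux      = c(310, -1, 120, 95)   # lx; -1 = sleep mode
)

clean <- raw

# Record the device state in separate status columns (labels are assumed)
clean$Distance_status <- ifelse(raw$Distance == -1, "sleep mode",
                         ifelse(raw$Distance == 204, "out of range",
                                "operational"))
clean$Lux_status <- ifelse(raw$Lux == -1, "sleep mode", "operational")

# Replace the sentinel codes with proper missing values
clean$Distance[raw$Distance %in% c(-1, 204)] <- NA
clean$Lux[raw$Lux == -1] <- NA

clean
```

Keeping the status in its own column preserves the reason a value is missing, which matters later when separating non-wear from out-of-range readings.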

For the VEET device, data were provided as CSV logs (zipped on GitHub, due to size). We focused on the ambient light sensor modality first. Using import$VEET(..., modality = "ALS"), we extracted the illuminance (Lux) data stream and its timestamps. The raw VEET data similarly contains irregular intervals and can contain missing periods (e.g., if the device stopped recording or was reset); the import summary flags these issues.

Besides the Clouclip and VEET, LightLogR 0.10.0 contains import functions for 18 more wearable devices. Because export formats evolve, the package also supports multiple format versions per device, and it includes documentation for both code-based and code-less additions of new device import functions.

Irregular intervals, gaps, non-wear times: Both datasets showed irregular timing and missing data, i.e., gaps. Irregular data means that some observations did not align to the nominal sampling interval (e.g., slight timing drift or pauses in recording). For the Clouclip 5-second data, we detected irregular timestamps spanning all but the first and last day of the recording. Handling such irregularities is important because many downstream analyses assume a regular time series. We evaluated strategies to address this, including:

Removing an initial portion of data if irregularities occur mainly during device start-up.
Rounding all timestamps to the nearest regular interval (5 s in this case).
Aggregating to a coarser time interval (with some loss of temporal resolution).

Based on the import summary and visual inspection of the time gaps, we chose to round the observation times to the nearest 5-second mark, as this addressed the minor offsets without significant data loss. After rounding timestamps, we added an explicit date column for convenient grouping by day.
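Rounding timestamps to the nearest regular interval can be illustrated with plain base R. The sketch below is an assumption-laden example rather than the procedure used in Supplement 1: the timestamps are made up, UTC is used for simplicity, and the object names are hypothetical.

```r
# Nominal sampling interval in seconds
interval <- 5

# Hypothetical timestamps with small offsets from the 5-second grid
timestamps <- as.POSIXct(
  c("2025-01-01 08:00:02", "2025-01-01 08:00:06", "2025-01-01 08:00:13"),
  tz = "UTC"
)

# Round seconds since the epoch to the nearest multiple of the interval
rounded <- as.POSIXct(
  round(as.numeric(timestamps) / interval) * interval,
  origin = "1970-01-01", tz = "UTC"
)

# Explicit date column for grouping by day
regular <- data.frame(Datetime = rounded, Date = as.Date(rounded, tz = "UTC"))

format(regular$Datetime, "%H:%M:%S")
```

Rounding can map two nearby observations onto the same timestamp, so it is worth checking for duplicates afterwards.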

We then generated a summary of missing data for each day. Implicit gaps (intervals where the device should have recorded data but did not) were converted into explicit missing entries using LightLogR’s gap-handling functions. We also removed days that had very little data to focus on days with substantial wear time. In our Clouclip example, days with <1 hour of recordings were dropped. This threshold should be adjusted according to how much complete days matter for the analysis at hand. For example, in circadian science, the metrics interdaily stability and intradaily variability require measurements for each hour of the day.
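The difference between implicit and explicit gaps, and the per-day threshold, can be shown with a small base-R example. This is a sketch of the concept only and does not reproduce LightLogR’s gap-handling functions; the data, the UTC time zone, and the object names are assumptions.

```r
interval <- 5  # nominal sampling interval in seconds (assumed)

# Hypothetical recording with two missing samples (at +15 s and +20 s)
start <- as.POSIXct("2025-01-01 08:00:00", tz = "UTC")
obs <- data.frame(Datetime = start + c(0, 5, 10, 25, 30),
                  Distance = c(40, 42, 41, 55, 54))

# Complete, regular sequence of expected timestamps
expected <- seq(min(obs$Datetime), max(obs$Datetime), by = interval)

# Look up each expected timestamp; unmatched ones become explicit NA rows
idx <- match(as.numeric(expected), as.numeric(obs$Datetime))
explicit <- data.frame(Datetime = expected, Distance = obs$Distance[idx])
explicit$is_gap <- is.na(idx)

# Recorded time per day, compared against a one-hour minimum
day <- format(explicit$Datetime, "%Y-%m-%d", tz = "UTC")
recorded_s <- tapply(!explicit$is_gap, day, sum) * interval
keep_day <- recorded_s >= 3600

explicit
keep_day
```

In this toy example the single day contains only 25 seconds of data and would therefore be dropped under the one-hour rule.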

After these preprocessing steps, the Clouclip dataset had no irregular timestamps remaining and contained explicit markers for all periods of missing data (e.g., times when the device was off or not worn). The distance and illuminance values were now ready for metric calculations. Because the device was put in sleep mode when not worn, there are no measurements during non-wear times.

The VEET illuminance data underwent a similar cleaning procedure. To make the VEET’s 2-second illuminance data more comparable to the Clouclip’s and to reduce computational load, we aggregated the illuminance time series to 5-second intervals. Aggregation was performed with the arithmetic mean of values in a 5-second bin. We then inserted explicit missing entries for each whole day and removed days with more than one hour of missing illuminance data. After cleaning, six days of VEET illuminance data with good coverage remained for analysis (see Supplement 1 for details).
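Aggregating 2-second data into 5-second bins with the arithmetic mean can be sketched as follows. The example uses base R and made-up values; assigning each observation to the bin that starts at or before its timestamp is an assumption of this sketch, and the object names are hypothetical.

```r
bin <- 5  # target interval in seconds

# Hypothetical 2-second illuminance series (lux)
t0 <- as.POSIXct("2025-01-01 12:00:00", tz = "UTC")
veet <- data.frame(
  Datetime = t0 + seq(0, 18, by = 2),
  Lux = c(100, 120, 140, 200, 220, 300, 320, 340, 400, 420)
)

# Start of the 5-second bin each observation falls into (epoch seconds)
bin_epoch <- floor(as.numeric(veet$Datetime) / bin) * bin

# Arithmetic mean of all values within each bin
means <- tapply(veet$Lux, bin_epoch, mean)

agg <- data.frame(
  Datetime = as.POSIXct(as.numeric(names(means)),
                        origin = "1970-01-01", tz = "UTC"),
  Lux = as.numeric(means)
)

agg
```

Note that bins alternate between three and two contributing samples, because the 2-second source interval does not divide evenly into 5 seconds.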

Finally, for spectral analysis, we imported the VEET’s spectral sensor modality, and, for the distance analysis, the time-of-flight modality. This required additional processing. The raw spectral data consist of counts from nine wavelength-specific channels (approximately 415 nm through 940 nm, unequally spaced at intervals of 30 to 50 nm), one broadband clear channel covering the whole range of the individual channels, another broadband channel for flicker detection, and a dark channel, along with a sensor gain setting. We aggregated the spectral data to 5-minute intervals to focus on broader trends and reduce data volume. Each channel’s counts were normalized by the appropriate gain. Using a calibration matrix provided by the manufacturer (specific to the spectral sensor model), we reconstructed full spectral power distributions for each 5-minute interval. The end result is a list-column in the dataset where each entry is the estimated spectral irradiance across wavelengths for that time interval. Detailed spectral preprocessing steps, including the calibration and normalization, are provided in Supplement 1. After spectral reconstruction, the dataset was ready for calculating example spectrum-based metrics.
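The general shape of gain normalization followed by reconstruction with a calibration matrix can be sketched as a matrix product. Everything numeric below is invented for illustration: three channels and four wavelengths stand in for the real sensor, the matrix values are not the manufacturer’s, and treating reconstruction as a single matrix-vector multiplication is an assumption of this sketch. See Supplement 1 for the actual procedure.

```r
# Hypothetical raw counts from three sensor channels and the gain setting
counts <- c(ch1 = 1000, ch2 = 2000, ch3 = 500)
gain <- 2

# Normalize counts by gain so readings at different gains are comparable
normalized <- counts / gain

# Hypothetical calibration matrix: rows = wavelengths, columns = channels
calibration <- matrix(
  c(0.002, 0.001, 0.000, 0.000,   # contribution of channel 1
    0.000, 0.001, 0.002, 0.001,   # contribution of channel 2
    0.000, 0.000, 0.001, 0.004),  # contribution of channel 3
  nrow = 4,
  dimnames = list(c("450", "550", "650", "750"), names(counts))
)

# Reconstructed spectrum: one estimated irradiance value per wavelength
spectrum <- calibration %*% normalized

data.frame(wavelength_nm = as.numeric(rownames(calibration)),
           irradiance = as.numeric(spectrum))
```

Storing one such data frame per time interval is one way to arrive at the list-column structure described above.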

Similarly, the time-of-flight modality contains 256 values per observation, encoding an 8x8 grid of distance and confidence measurements for up to two objects (8x8 grid, times two objects, times one distance and one confidence column for each object and grid point = 256 values). For computational reasons, only the first object was kept. These data were pivoted into a long format, where each row contains the distance and confidence data for a given position in the grid and a given datetime. After pivoting and converting grid positions into a deviation angle from central view, the dataset was ready to be used for distance analysis.
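The count of 256 values and the resulting long format can be reproduced with a small base-R example. The column names, the placeholder readings, and the use of merge() for pairing distance with confidence are assumptions of this sketch, and the conversion of grid positions to deviation angles is omitted because it depends on sensor geometry not given here.

```r
# All values in one observation: 8 x 8 grid, 2 objects, 2 measures each
layout <- expand.grid(x = 1:8, y = 1:8, object = 1:2,
                      measure = c("distance", "confidence"),
                      stringsAsFactors = FALSE)
n_values <- nrow(layout)  # 8 * 8 * 2 * 2

# Placeholder readings standing in for one wide row of sensor output
layout$value <- seq_len(n_values)

# Keep only the first object
first <- layout[layout$object == 1, ]

# Long format: one row per grid position, distance and confidence side by side
long <- merge(
  first[first$measure == "distance", c("x", "y", "value")],
  first[first$measure == "confidence", c("x", "y", "value")],
  by = c("x", "y"), suffixes = c("_distance", "_confidence")
)

n_values
dim(long)
```

For a full recording, the same reshaping yields 64 rows per timestamp, which is why restricting the data to the first object noticeably reduces the computational load.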

Because the VEET devices record even when not worn, a non-wear detection using the devices’ actigraphy modality was implemented. This process used the standard deviation of a linear motion sensor in a 5-minute bin with a visually derived threshold to separate wear from non-wear time. Measurements of illuminance and distance were consequently removed during the calculated non-wear times.
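The principle of this non-wear detection can be sketched with simulated data in base R. The signal, the threshold value of 0.05, and all object names are assumptions for illustration only; the actual rules and the visually derived threshold are described in Supplement 1.

```r
set.seed(42)

# Simulated motion signal at 2-second intervals, 20 minutes in total:
# first 10 minutes "worn" (variable), last 10 minutes "not worn" (almost flat)
t0 <- as.POSIXct("2025-01-01 00:00:00", tz = "UTC")
n <- 600
imu <- data.frame(
  Datetime = t0 + (seq_len(n) - 1) * 2,
  accel = c(rnorm(300, mean = 0, sd = 0.5), rnorm(300, mean = 0, sd = 0.001))
)

# Assign each observation to a 5-minute bin (epoch seconds of the bin start)
bin <- 5 * 60
imu$bin <- floor(as.numeric(imu$Datetime) / bin) * bin

# Standard deviation of the motion signal within each bin
sd_bin <- tapply(imu$accel, imu$bin, sd)

# Bins with very little variation are flagged as non-wear
threshold <- 0.05  # hypothetical value; derive from the data in practice
nonwear <- sd_bin < threshold

# Carry the flag back to the individual observations
imu$nonwear <- as.vector(nonwear[as.character(imu$bin)])

table(imu$nonwear)
```

The same flags can then be joined to the illuminance and distance data by time bin so that measurements during non-wear periods can be removed.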

This tutorial starts with the Clouclip dataset and an overview of its data. The Clouclip export is considerably simpler than the VEET export, containing only distance and illuminance measurements. The VEET dataset is used later for the spectrum-related metrics.

Load libraries and preprocessed data

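library(LightLogR)  # import, processing, and metric functions
library(tidyverse)  # data manipulation and plotting
library(gt)         # summary tables
load("data/cleaned/data.RData")  # pre-processed datasets from Supplement 1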

Store coordinates of data collection

coordinates <- c(29.75, -95.36) # Houston, Texas (latitude, longitude); needed later to calculate and visualize photoperiods

3 Results

Figure 2 gives an overview of the workflows covered in the Results section.

Code

%%{init: {"flowchart": {"diagramPadding": 50}}}%%

flowchart TD

classDef input fill:#f7f7f7,stroke:#2f3d4a,color:#0f1a22,stroke-width:1px;
classDef process fill:#ffffff,stroke:#2f3d4a,color:#0f1a22,stroke-width:1px;
classDef output fill:#eef2f7,stroke:#2f3d4a,color:#0f1a22,stroke-width:1px;

%% =========================
  %% MANUSCRIPT (ANALYSIS / METRICS)
  %% =========================
    FILE[(Pre-processed datasets)]:::input

    %% Distance branch
    subgraph MS_DIST["**Distance**"]
      direction LR
      Clouclip[Clouclip data<br/>*dataCC*]:::input
      M_WEAR["Total wear time: daily duration"]:::process
      M_Near["Near work: daily duration"]:::process
      M_RANGES["Binned ranges: daily duration"]:::process
      M_CONTNW["Continuous near work:<br/>daily frequency + plot"]:::process
      M_NEARW["Near work episodes:<br/>frequency, duration, distance"]:::process
      M_BREAKS["Visual breaks: frequency"]:::process
      Distance[VEET<br/>spatial distance<br/>*dataVEET3*]:::input
      M_GRID["Plot example distance grids"]:::process
      M_CONDENSE["Condense grid"]:::process
    end
    
  FILE --> MS_DIST
  Clouclip --> M_WEAR 
  Clouclip --> M_Near
  Clouclip --> M_RANGES 
  Clouclip --> M_CONTNW 
  Clouclip --> M_NEARW
  Clouclip --> M_BREAKS
  Distance --> M_GRID
  Distance --> M_CONDENSE

%% Light branch (VEET ALS)
    subgraph MS_LIGHT["**Light**"]
      direction LR
      VEET_light["VEET<br/>light data<br/>*dataVEET*"]:::input
      M_HIST["Distribution plots<br/>of light exposure<br/>for Clouclip and VEET"]:::process
      M_MEAN_LUX["Illuminance:<br/>daily average with and<br/>without logarithmic<br/>adjustment"]:::process
      M_OUTDOOR["Outdoor conditions:<br/>daily duration"]:::process
      M_TRANS["Indoor to outdoor<br/>transitions:<br/>frequency"]:::process
      M_PAT_TAT["Longest period and<br/>total time above threshold"]:::process
      M_MERGE["Merge device data,<br/>recalculate daily average<br/>with both datasets at once"]:::process
      Clouclip2[Clouclip data<br/>*dataCC*]:::input
    end
    
  FILE --> MS_LIGHT
  VEET_light --> M_HIST
  VEET_light --> M_MEAN_LUX
  VEET_light --> M_OUTDOOR
  VEET_light --> M_TRANS
  VEET_light --> M_PAT_TAT
  VEET_light --> M_MERGE
  Clouclip2 --> M_MERGE

%% Spectrum branch (VEET PHO)
    subgraph MS_SPEC["**Spectrum**"]
      direction LR
      VEET_spectrum["VEET<br/>spectral data<br/>*dataVEET2*"]:::input
      M_SHORTLONG["Integrate spectrum across<br/>wavelength bands and<br/>calculate short/long<br/>wavelength ratio"]:::process
      M_MDER["Compute illuminance and<br/>melanopic EDI<br/>from spectrum"]:::process
      M_MDER2["Calculate the melanopic<br/>daylight efficacy ratio<br/>(*MDER*)"]:::process
      M_DIURNAL["Calculate morning<br/>blue light exposure"]:::process
      M_HOURLY["Plot relative hourly<br/>blue light exposure"]:::process
      M_PHOTOPER["Calculate day vs night<br/>comparison of<br/>blue light exposure"]:::process
    end

 FILE --> MS_SPEC
 VEET_spectrum --> M_SHORTLONG
 VEET_spectrum --> M_MDER
 M_MDER --> VEET_spectrum
 VEET_spectrum --> M_MDER2
 VEET_spectrum --> M_DIURNAL
 VEET_spectrum --> M_HOURLY
 VEET_spectrum --> M_PHOTOPER
 
 class MS_DIST,MS_LIGHT,MS_SPEC output;

Figure 2: Workflows that are covered in the results section. The pre-processed datasets are covered in detail in the supplement.

3.1 Distance

We first examine metrics related to viewing distance, using the processed Clouclip dataset. Many distance-based metrics are computed for each day and then averaged over weekdays, weekends, or across all days. To facilitate this, we define a helper function that will take daily metric values and calculate the mean values for weekdays, weekends, and the overall daily average:

Define helper function to_mean_daily()

to_mean_daily <- function(data, prefix = "average_") {
  data |> 
    ungroup(Date) |>                                          # (1)
    mean_daily(prefix = prefix) |>                            # (2)
    rename_with(.fn = \(x) str_replace_all(x, "_", " ")) |>   # (3)
    gt()                                                      # (4)
}

1: Ungroup by days
2: Calculate the averages per grouping
3: Remove underscores in names
4: Format as a gt table for display

3.1.1 Total wear time daily

Total wear time daily refers to the amount of time the device was actively collecting distance data each day (i.e. the time the device was worn and operational). We compute this by summing all intervals where a valid distance measurement is present, ignoring periods where data are missing or the device was off. The results are shown in Table 3.

Calculate total wear time

dataCC |> 
  durations(Dis) |>             # (1)
  to_mean_daily("Total wear ")  # (2)

1: Calculate total duration of data per day
2: Using the helper function defined above
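
As a rough illustration of the idea behind this step (not LightLogR's implementation), wear time can be sketched in base R as the number of valid distance samples multiplied by the sampling interval. The toy vector and the variable names are assumptions made for illustration only.

```r
# Toy distance series in cm; NA marks intervals without a valid measurement
dis <- c(35, 40, NA, NA, 55, 120, NA, 30)
epoch_s <- 5  # assumed sampling interval in seconds

# Wear time = number of valid samples x sampling interval
wear_time_s <- sum(!is.na(dis)) * epoch_s
wear_time_s
```

In the pipeline above, the same quantity is computed per day because the data are grouped by date.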

Table 3: Total wear time per day (average across days)

Date	Total wear duration
Clouclip
Mean daily	31448s (~8.74 hours)
Weekday	34460s (~9.57 hours)
Weekend	23918s (~6.64 hours)
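
The values in Table 3 are consistent with a mean daily value that weights the weekday and weekend averages 5:2. The short check below reproduces that arithmetic from the table; that mean_daily() derives its result this way is inferred from the reported numbers, and the variable names are chosen for illustration.

```r
# Values in seconds, taken from Table 3
weekday_mean <- 34460
weekend_mean <- 23918

# Weighted average over a 7-day week (5 weekdays, 2 weekend days)
mean_daily_est <- (5 * weekday_mean + 2 * weekend_mean) / 7
mean_daily_est
```

The result matches the "Mean daily" row of Table 3, which is why a week with very different weekday and weekend behavior is not simply averaged across the two categories.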

3.1.2 Duration within distance ranges

Many myopia-relevant metrics concern the time spent at certain viewing distances (e.g., “near work” vs. intermediate or far distances). We calculate the duration of time spent in specific distance ranges. Table 4 shows the average daily duration of near work, defined here as time viewing at 15–60 cm. Table 5 provides a more detailed breakdown across multiple distance bands.

Calculate daily duration of near work

dataCC |> 
  filter(Dis >= 15, Dis < 60) |>  # (1)
  durations(Dis) |>               # (2)
  to_mean_daily("Near work ")

1: Consider only distances in [15, 60) cm
2: Total duration in that range per day

Table 4: Daily duration of near work (15–60 cm viewing distance)

Date	Near work duration
Clouclip
Mean daily	17309s (~4.81 hours)
Weekday	21173s (~5.88 hours)
Weekend	7648s (~2.12 hours)

First, we define a set of distance breakpoints and descriptive labels for each range:

Defining distance ranges (in cm)

dist_breaks <- c(15, 20, 30, 40, 50, 60, 70, 80, 90, 100, Inf)
# Note: cut() builds right-closed intervals by default, e.g. (15, 20]
dist_labels <- c(
    "Extremely near",          # (15, 20]
    "Very near",               # (20, 30]
    "Fairly near",             # (30, 40]
    "Near",                    # (40, 50]
    "Moderately near",         # (50, 60]
    "Near intermediate",       # (60, 70]
    "Intermediate",            # (70, 80]
    "Moderately intermediate", # (80, 90]
    "Far intermediate",        # (90, 100]
    "Far"                      # (100, Inf]
  )

Now we cut the distance data into these ranges and compute the daily duration spent in each range:

Calculate daily duration within viewing distance range

dataCC |> 
  mutate(Dis_range = cut(Dis, breaks = dist_breaks, labels = dist_labels)) |>  # (1)
  drop_na(Dis_range) |>                                                        # (2)
  group_by(Dis_range, .add = TRUE) |>                                          # (3)
  durations(Dis) |>                                                            # (4)
  pivot_wider(names_from = Dis_range, values_from = duration) |>               # (5)
  ungroup() |> 
  mean_daily(prefix = "") |> 
  pivot_longer(-Date) |> 
  pivot_wider(names_from = Date) |> 
  mutate(name = factor(name, levels = rev(dist_labels))) |> 
  arrange(name) |> 
  gt() |> 
  fmt_duration(input_units = "seconds", output_units = "minutes")              # (6)

1: Categorize distances
2: Remove intervals with no data
3: Group by distance range (in addition to the date)
4: Duration per range per day
5: Pivot data from long to wide format (ranges as columns)
6: Convert seconds to minutes
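
Because the binning relies on base R's cut(), it is worth checking how boundary values are assigned. The sketch below uses a shortened, hypothetical set of breakpoints to show that the default is right-closed intervals, and that right = FALSE yields left-closed intervals instead.

```r
breaks <- c(15, 20, 30)   # shortened breakpoints for illustration
x <- c(15, 20, 25, 30)    # values on and between the boundaries

# Default: intervals are (15,20] and (20,30]; the lowest break itself is excluded
default_bins <- as.character(cut(x, breaks = breaks))

# right = FALSE: intervals are [15,20) and [20,30); the highest break is excluded
left_closed_bins <- as.character(cut(x, breaks = breaks, right = FALSE))

data.frame(x, default_bins, left_closed_bins)
```

Whichever convention is chosen, stating it alongside the breakpoints keeps the reported ranges reproducible.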

Table 5: Daily duration in each viewing distance range

name	Mean daily	Weekday	Weekend
Far	16m	20m	5m
Far intermediate	11m	14m	2m
Moderately intermediate	5m	6m	3m
Intermediate	4m	6m	1m
Near intermediate	7m	7m	8m
Moderately near	13m	16m	5m
Near	27m	36m	5m
Fairly near	46m	60m	12m
Very near	102m	128m	38m
Extremely near	74m	88m	40m

To visualize this, Figure 3 illustrates the relative proportion of time spent in each distance range:

Plot time spent within each viewing distance range

dataCC |> 
  mutate(Dis_range = cut(Dis, breaks = dist_breaks, labels = dist_labels)) |>  # (1)
  drop_na(Dis_range) |>
  group_by(Dis_range, .add = TRUE) |>
  durations(Dis) |>
  group_by(Dis_range) |>
  mean_daily(prefix = "") |>
  ungroup() |>
  mutate(Dis_range = forcats::fct_relabel(Dis_range, \(x) str_replace(x, " ", "\n"))) |> 
  mutate(duration = duration/sum(duration), .by = Date) |>                     # (2)
  ggplot(aes(x = Dis_range, y = duration, fill = Date)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::label_percent()) +
  ggsci::scale_fill_jco() +
  theme_minimal() +
  labs(y = "Relative duration (%)", x = NULL, fill = NULL) +
  coord_flip()
ggsave("manuscript/figures/Figure3.png", width = 7, height = 5)

1: Broadly, this portion repeats the prior code cell
2: Convert to percentage of daily total

Figure 3: Percentage of total time spent in each viewing distance range for an average day (mean daily), average weekday, or weekend

3.1.3 Frequency of continuous near work

Continuous near-work can be understood as sustained viewing within a near distance for some minimum duration, allowing only brief interruptions. We use LightLogR’s cluster function to identify episodes of continuous near work. Here, we define a near-work episode as viewing distance between 20 cm and 60 cm that lasts at least 30 minutes, with interruptions of up to 1 minute allowed (meaning short breaks ≤1 min do not end the episode). Using extract_clusters() with those parameters, we count how many such episodes occur per day.

Table 6 summarizes the average frequency of continuous near-work episodes per day, and Figure 4 provides an example visualization of these episodes on the distance time series.

Calculate the frequency of continuous near-work episodes per day

dataCC |> 
  extract_clusters(
    Dis >= 20 & Dis < 60,              # (1)
    cluster.duration = "30 mins",      # (2)
    interruption.duration = "1 min",   # (3)
    drop.empty.groups = FALSE          # (4)
  ) |> 
  summarize_numeric(remove = c("start", "end", "epoch", "duration"),
                    add.total.duration = FALSE) |>  # (5)
  mean_daily(prefix = "Frequency of ") |>           # (6)
  gt() |> fmt_number()

1: Condition: near-work distance
2: Minimum duration of a continuous episode
3: Maximum gap allowed within an episode
4: Keep days with zero episodes in output
5: Count number of episodes per day
6: Compute daily mean frequency
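
To make the episode logic concrete, the following base R sketch finds episodes in a logical series by bridging short interruptions and then applying a minimum duration. It is an illustration under stated assumptions, not LightLogR's implementation: find_episodes() is a hypothetical helper, bridged interruptions are counted toward the episode duration, and the toy parameters (60 s minimum, 10 s gap) are far shorter than those used above.

```r
# Hypothetical helper: returns start/end sample indices of qualifying episodes
find_episodes <- function(cond, epoch_s, min_dur_s, max_gap_s) {
  cond[is.na(cond)] <- FALSE
  runs <- rle(cond)
  n <- length(runs$values)
  # A gap is a FALSE run that is short enough and lies between two TRUE runs
  is_gap <- !runs$values & runs$lengths * epoch_s <= max_gap_s
  interior <- seq_len(n) > 1 & seq_len(n) < n
  runs$values[is_gap & interior] <- TRUE
  # Re-encode after bridging so adjacent TRUE runs merge into one episode
  bridged <- rle(inverse.rle(runs))
  ends <- cumsum(bridged$lengths)
  starts <- ends - bridged$lengths + 1
  keep <- bridged$values & bridged$lengths * epoch_s >= min_dur_s
  data.frame(start = starts[keep], end = ends[keep])
}

# Toy series at 5-second resolution: 50 s near, 10 s gap, 50 s near,
# 100 s away, 15 s near
near <- c(rep(TRUE, 10), rep(FALSE, 2), rep(TRUE, 10),
          rep(FALSE, 20), rep(TRUE, 3))
episodes <- find_episodes(near, epoch_s = 5, min_dur_s = 60, max_gap_s = 10)
episodes
```

Here the two 50-second stretches only qualify because the 10-second interruption between them is bridged; the final 15-second stretch is too short to count.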

Table 6: Frequency of continuous near-work episodes per day

Date	Frequency of episodes
Clouclip
Mean daily	0.86
Weekday	1.20
Weekend	0.00

Plot continuous near-work episodes

dataCC |>                                    # (1)
  add_clusters(
    Dis >= 20 & Dis < 60,
    cluster.duration = "30 mins",
    interruption.duration = "1 min"
  ) |>
  gg_day(y.axis = Dis, y.axis.label = "Distance (cm)", geom = "line",
         y.scale = "identity", y.axis.breaks = seq(0, 100, by = 20)) |> 
  gg_photoperiod(coordinates) |>             # (2)
  gg_state(state, fill = "red") +            # (3)
  geom_hline(yintercept = c(20, 60), col = "red", linetype = "dashed") +
  coord_cartesian(ylim = c(0, 100))

1: As in code cell above
2: Add photoperiod information
3: Add state bands

Figure 4: Example of continuous near-work episodes. Red shaded areas indicate periods of continuous near work (20–60 cm for ≥30 min, allowing ≤1 min interruptions). Black trace is viewing distance over time; red dashed lines mark the 20 cm and 60 cm boundaries. Grey shaded areas indicate nighttime.

3.1.4 Near-work episodes

Beyond frequency, we can characterize near-work episodes by their duration and typical viewing distance. This section extracts all near-work episodes (using a 5-second minimum duration to capture more routine near-work bouts) and summarizes three aspects:

frequency (count of episodes per day),
average duration of episodes, and
average distance during those episodes.

These results are combined in Table 7.

Calculate near-work episodes

dataCC |> 
  extract_clusters(
    Dis >= 20 & Dis < 60,
    cluster.duration = "5 secs",                                # (1)
    interruption.duration = "20 secs",
    drop.empty.groups = FALSE
  ) |> 
  extract_metric(dataCC, distance = mean(Dis, na.rm = TRUE)) |> # (2)
  summarize_numeric(remove = c("start", "end", "epoch"),        # (3)
                    prefix = "",
                    add.total.duration = FALSE) |>
  mean_daily(prefix = "") |>                                    # (4)
  gt() |>                                                       # (5)
  fmt_number(c(distance, episodes), decimals = 0) |>
  cols_units(distance = "cm")

1: Minimal duration to count as an episode (set to interval level of dataCC)
2: Calculate mean distance during each episode
3: Calculate averages for all numeric columns per group
4: Daily averages for each metric
5: Table generation

Table 7: Near-work episodes: frequency, mean duration, and mean viewing distance

Date	duration	distance, cm	episodes
Clouclip
Mean daily	233s (~3.88 minutes)	32	57
Weekday	284s (~4.73 minutes)	32	64
Weekend	104s (~1.73 minutes)	32	40

In the code cell above, extract_metric(..., distance = mean(Dis, ...)) computes the mean viewing distance during each episode, and the subsequent summarize_numeric and mean_daily steps derive daily averages of episode count, duration, and distance.

3.1.5 Visual breaks

Visual breaks, as defined in this article, require a minimum break length and depend on the preceding episode. This leads to a two-step process: we first extract instances where the distance is at least 100 cm for at least 20 seconds, and then keep only those preceded by an episode of at most 20 minutes. Table 8 provides the daily frequency of visual breaks, and Figure 5 shows when they occur.

Calculate visual breaks

dataCC |> 
  extract_clusters(Dis >= 100,                                    # (1)
                   cluster.duration = "20 secs",                  # (2)
                   return.only.clusters = FALSE,                  # (3)
                   drop.empty.groups = FALSE                      # (4)
                   ) |> 
  filter((start - lag(end) <= duration("20 mins")), is.cluster) |>  # (5)
  summarize_numeric(remove = c("start", "end", "epoch", "is.cluster", "duration"),  # (6)
                    prefix = "",
                    add.total.duration = FALSE) |>
  mean_daily(prefix = "Daily ") |>                                # (7)
  gt() |>                                                         # (8)
  fmt_number(decimals = 1)

1: Define the condition: viewing distance of at least 100 cm
2: Define the minimum duration
3: Return non-clusters as well
4: Keep all days, even without clusters
5: Return only clusters with previous episode lengths of maximum 20 minutes
6: Count the number of episodes
7: Calculate daily means
8: Table generation

Table 8: Frequency of visual breaks

Date	Daily episodes
Clouclip
Mean daily	5.9
Weekday	6.2
Weekend	5.0

Plot visual breaks

dataCC |>                                                         # (1)
  extract_clusters(Dis >= 100,
                   cluster.duration = "20 secs",
                   return.only.clusters = FALSE,
                   drop.empty.groups = FALSE
                   ) |>
  filter((start - lag(end) <= duration("20 mins")), is.cluster) %>%
  add_states(dataCC, .) |>                                        # (2)
  gg_day(y.axis = Dis, y.axis.label = "Distance (cm)", geom = "line") |> 
  gg_photoperiod(coordinates) +                                   # (3)
  geom_point(data = \(x) filter(x, is.cluster), col = "red")

1: As in the code cell above
2: Add the resulting states
3: Add photoperiod information to the plot

Figure 5: Plot of visual breaks (red dots). Black traces show distance measurement data. Grey shaded areas show nighttime between civil dusk and civil dawn

3.1.6 Distance with spatial distribution

The Clouclip device outputs a singular measure for distance, while the visual environment in natural conditions contains many distances, depending on the solid angle and direction of the measurement. A device like the VEET increases the spatial resolution of these measurements, allowing for more in-depth analyses of the size and position of an object within the field of view. In the case of the VEET, data are collected from an 8x8 measurement grid, spanning 52° vertically and 41° horizontally. Figure 6 shows sample observations from six different days at the same time.

Plot spatial distance grids for different days

dataVEET3 |> 
  filter_Time(start = "13:14:02", length = "00:00:05") |>  # (1)
  mutate(dist1 = ifelse(dist1 == 0, Inf, dist1)) |>        # (2)
  filter(conf1 >= 0.1 | dist1 == Inf) |>                   # (3)
  ggplot(aes(x = x.pos, y = y.pos, fill = dist1/10)) +     # (4)
  extras +
  facet_wrap(~Datetime)                                    # (5)

1: Choose a particular observation
2: Replace 0 distances with infinity
3: Remove data that has less than 10% confidence
4: Plot the data and add the plot partials from the code cell above
5: Show one plot per day

Figure 6: Example observations of the measurement grid at 1:14 p.m. for each measurement day. Text values show distance in cm. Empty grid points show values with low confidence. Zero-distance values were replaced with infinite distance and plotted despite low confidence.

To use these distance data in the framework shown above for the Clouclip device, the grid first has to be condensed with a sensible method. A spatially resolved distance measure can inform several questions:

Where in the field of view are objects in close range?
How large are near objects in the field of view?
How varied are distances within the field of view?
How close are objects / is viewing distance in a region of interest within the field of view?

Possible methods include:

average across all (high confidence) distance values within the grid
closest (high confidence) distance within the grid
(high confidence) values at or around a given grid position, e.g., ±10 degrees around the central view (0°)

Many more options are available based on the spatial dataset, e.g., condensation rules based on the number of points in the grid with a given condition, or the variation within the grid.

We will demonstrate these three methods for a single day (2024-06-10), all leading to a data structure akin to the Clouclip, i.e., to be used for further calculation of visual experience metrics.

Calculating and plotting method results

dataVEET3_part <-                                                # (1)
  dataVEET3 |>
  filter_Date(start = "2024-06-10", length = "1 day")

dataVEET3_condensed <- 
  dataVEET3_part |> 
  group_by(Datetime, .add = TRUE) |>                             # (2)
  filter(conf1 >= 0.1) |>                                        # (3)
  summarize(
    distance_mean = mean(dist1),                                 # (4)
    distance_min = min(dist1),                                   # (5)
    distance_central = mean(dist1[between(x.pos, -10, 10) & between(y.pos, -10, 10)]),  # (6)
    n = n(),                                                     # (7)
    .groups = "drop_last"
  )

dataVEET3_condensed |> 
  aggregate_Datetime("15 mins", numeric.handler = \(x) mean(x, na.rm = TRUE)) |>  # (8)
  remove_partial_data(by.date = TRUE) |>                         # (9)
  pivot_longer(contains("distance"),                             # (10)
               names_to = c(".value", "method"),
               names_pattern = "(distance)_(mean|min|max|central)"
  ) |>
  gg_day(y.axis = distance/10,                                   # (11)
         geom = "line",
         aes_col = method,
         group = method,
         linewidth = 1,
         alpha = 0.75,
         y.scale = "identity",
         y.axis.breaks = seq(0, 150, by = 20),
         y.axis.label = "Distance (cm)"
         ) |>
  gg_photoperiod(coordinates)

1: Filter one day
2: Group additionally by every observation
3: Remove data with low confidence
4: Average across all distance values
5: Closest across all distance values
6: Central distance
7: Number of (valid) grid points
8: Aggregate to 15 minute data
9: Remove data points that fall exactly on midnight of the following day
10: Pivoting the method results from wide to long for plotting
11: Setting up the plot for distance.
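
The three condensation rules can also be traced by hand on a single time point. The base R sketch below uses a hypothetical 3 x 3 grid (the real grid is 8 x 8) with made-up distances in mm and confidence values; only the column names mirror those used above.

```r
# Toy grid for one time point: positions in degrees, distances in mm
grid <- expand.grid(x.pos = c(-20, 0, 20), y.pos = c(-20, 0, 20))
grid$dist1 <- c(400, 450, 500, 600, 300, 650, 900, 950, 1000)
grid$conf1 <- c(0.9, 0.8, 0.05, 0.7, 0.9, 0.6, 0.9, 0.02, 0.8)

# Keep only grid points with at least 10% confidence
valid <- grid[grid$conf1 >= 0.1, ]

# Grid points within +/- 10 degrees of the central view
central <- abs(valid$x.pos) <= 10 & abs(valid$y.pos) <= 10

condensed <- c(
  mean    = mean(valid$dist1),          # average across valid points
  min     = min(valid$dist1),           # closest valid point
  central = mean(valid$dist1[central]), # central region only
  n       = nrow(valid)                 # number of valid points
)
condensed
```

Even in this small example the mean (about 614 mm) and the minimum (300 mm) differ by a factor of two, which illustrates why the choice of rule matters for downstream metrics.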

Figure 7: Comparison of condensation methods for spatial grid of distance measurements. The lines represent an average across all data points (yellow), the minimum distance (grey), or the central ±10° (blue). Data points with confidence less than 10% were removed prior to calculation.

As Figure 7 shows, the overall pattern is similar regardless of the method used, but there are notable differences between the methods, which will affect downstream analyses. Most importantly, the process of condensation has to be well documented and reproducible, as shown above. Any of these data could now be used to calculate the frequency of continuous near work, visual breaks, or near-work episodes as described above.

3.2 Light

The Clouclip illuminance data in our example cover indoor environments and are thus comparatively low, which would make certain daylight exposure summaries trivial or not meaningful. To better illustrate light exposure metrics, we turn to a different dataset, this one taken from the VEET device’s illuminance data, which capture a broader range of lighting conditions (though both device types are able to capture broadly the same range of illuminance). We import the VEET ambient light data (already preprocessed to have regular 5-second intervals as described above) and briefly examine its distribution.

Illuminance distribution: The illuminance values from the Clouclip were comparatively low, while the VEET data include outdoor exposures up to several thousand lux. The contrast is evident from comparing histograms of the two datasets’ lux values (Clouclip vs. VEET), where the main peak is similarly positioned between 10 and 100 lx, but the tails differ. The VEET illuminance histogram (see Figure 9) shows a heavily skewed distribution with a considerable number of zero lx values (indicating intervals of complete darkness or the sensor being covered) and a long tail extending to very high lux values. Such zero-inflated and skewed data are common in wearable light measurements (J. Zauner, Guidolin, and Spitschan).

Figure 8: Histogram of illuminance values from the Clouclip dataset (5-second data). The values are typical of indoor conditions.

Figure 9: Histogram of illuminance values from the VEET dataset (aggregated to 5 s). Note the logarithmic x-axis: the distribution is highly skewed with many low values (including zeros) and a long tail of high lux readings. Outdoor light exposures in bright conditions are distinguishable around 10^3 to 10^4 lx.

After confirming that the VEET data cover a broad dynamic range of lighting, we proceed with calculating light exposure metrics. (The VEET data had been cleaned for gaps and irregularities as described earlier, and non-wear times were removed; see Supplement 1 for the details.)

3.2.1 Average light exposure

A basic metric is the average illuminance over the day. Table 9 shows the mean illuminance (in lux) for weekdays, weekends, and the overall daily mean, calculated directly from the raw lux values.

Calculating mean light exposure per day

dataVEET |> 
  select(Id, Date, Datetime, Lux) |> 
  summarize_numeric(prefix = "mean ", remove = c("Datetime")) |> 
  to_mean_daily() |>
  fmt_number(decimals = 1) |> 
  cols_hide(`average episodes`) |>
  cols_label(`average mean Lux` = "Mean photopic illuminance (lx)")

Table 9: Mean light exposure (illuminance) per day

Date	Mean photopic illuminance (lx)
VEET
Mean daily	481.8
Weekday	538.1
Weekend	341.1

However, because illuminance data tend to be extremely skewed and contain many zero values (periods of darkness), the arithmetic mean can be misleading. A common approach is to apply a logarithmic transform to illuminance before averaging, which down-weights extreme values and accounts for the multiplicative nature of light intensity effects. LightLogR provides helper functions log_zero_inflated() and its inverse exp_zero_inflated() to handle log-transformation when zeros are present (by adding a small offset before log, and back-transforming after averaging). Using this approach, we recompute the daily mean illuminance. The results in Table 10 show that the log-transformed mean (back-transformed to lux) is much lower, reflecting the fact that for much of the time illuminance was near zero. This transformed mean is often more representative of typical exposure for skewed data.

Calculating mean light exposure per day with log transformation

dataVEET |> 
  select(Id, Date, Datetime, Lux) |> 
  mutate(Lux = Lux |> log_zero_inflated()) |>               # (1)
  summarize_numeric(prefix = "mean ", remove = c("Datetime")) |> 
  mean_daily(prefix = "") |>                                # (2)
  mutate(`mean Lux` = `mean Lux` |> exp_zero_inflated()) |> # (3)
  gt() |> fmt_number(decimals = 1) |> cols_hide(episodes) |> 
  cols_label(`mean Lux` = "Mean photopic illuminance (lx)")

1: Log transform with zero handling (base 10)
2: Calculate daily mean of log-lux
3: Back-transform to lux
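
The effect of the transformation is easy to see on a handful of values. The base R sketch below mimics the described approach (add a small offset, take the base-10 logarithm, average, back-transform). The helper names and the offset of 0.1 lx are assumptions for illustration and are not taken from the text.

```r
# Hypothetical stand-ins for the log / back-transform pair described above
log_offset <- function(x, offset = 0.1) log10(x + offset)
exp_offset <- function(x, offset = 0.1) 10^x - offset

# Toy illuminance values in lux: darkness, indoor light, and one outdoor reading
lux <- c(0, 0, 10, 100, 10000)

arithmetic_mean <- mean(lux)
log_mean <- exp_offset(mean(log_offset(lux)))

c(arithmetic = arithmetic_mean, log_transformed = log_mean)
```

The single outdoor reading dominates the arithmetic mean (about 2000 lx), whereas the back-transformed log mean stays near 10 lx, closer to what the toy series looks like for most of the time.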

Table 10: Mean light exposure per day (after logarithmic transformation to account for zero inflation and skewness)

Date	Mean photopic illuminance (lx)
VEET
Mean daily	57.0
Weekday	70.1
Weekend	33.9

3.2.2 Duration in high-light (outdoor) conditions

Another important metric is the amount of time spent under bright light, often used as a proxy for outdoor exposure. We define thresholds corresponding to outdoor light levels (e.g. 1000 lx and above). Here, we categorize each 5-second interval of illuminance into bands: Indoor (below 1000 lx), Outdoor bright (1000 to 2000 lx), Outdoor very bright (2000 to 3000 lx), and Outdoor extremely bright (above 3000 lx). We then sum the duration in each category per day.

While daylight levels can far exceed the light levels recorded here, such values are usually measured in direct sunlight and without obstruction. Under normal viewing conditions, at eye level, and avoiding glare, daylight levels of a few thousand lux are at the higher end of the distribution (Murukesu, Zauner, and Spitschan 2025). Figure 9 shows a bimodal distribution, with the right mode representing outdoor lighting conditions. In a 2023 review of light dosimeters to investigate the light-myopia relationship (Hönekopp and Weigelt 2023), 1000 lx was the predominant cutoff value to distinguish indoor vs. outdoor environments. This cutoff is not without critique, however: other thresholds (Patterson Gentile et al. 2025) and alternative classification methods (Tabandeh and Spitschan 2025) have been proposed.

Define outdoor illuminance thresholds (in lux)

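out_breaks <- c(0, 1e3, 2e3, 3e3, Inf)
# Note: cut() builds right-closed intervals by default, so readings of exactly
# 0 lx fall outside the first interval and become NA
out_labels <- c(
    "Indoor",                  # (0, 1000] lx
    "Outdoor bright",          # (1000, 2000] lx
    "Outdoor very bright",     # (2000, 3000] lx
    "Outdoor extremely bright" # (3000, ∞) lx
  )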

dataVEET <- dataVEET |> 
  mutate(Lux_range = cut(Lux, breaks = out_breaks, labels = out_labels))

Now we compute the mean daily duration spent in each of these outdoor light ranges (Table 11):

Calculate the mean daily duration spent in each light range

dataVEET |> 
  drop_na(Lux_range) |> 
  group_by(Lux_range, .add = TRUE) |> 
  durations(Lux) |>                         
  pivot_wider(names_from = Lux_range, values_from = duration) |> 
  to_mean_daily("") |> 
  fmt_duration(input_units = "seconds", output_units = "minutes")

Table 11: Average daily duration in outdoor-equivalent light conditions

Date	Indoor	Outdoor bright	Outdoor very bright	Outdoor extremely bright
VEET
Mean daily	663m	24m	28m	42m
Weekday	688m	29m	35m	48m
Weekend	602m	10m	10m	28m

It is also informative to visualize when these high-light conditions occurred. Figure 10 shows a timeline plot with periods of outdoor-level illuminance highlighted in color. In this example, violet denotes ≥1000 lx, green ≥2000 lx, and yellow ≥3000 lx. Grey shading indicates nighttime (from civil dusk to dawn) for context.

Visualize time spent outdoors

dataVEET |> 
  aggregate_Datetime("2 mins", type = "floor") |>           # (1)
  mutate(Lux_range = fct_recode(Lux_range, NULL = "Indoor")) |>  # (2)
  gg_day(y.axis = Lux,                                      # (3)
         y.axis.label = "Photopic illuminance (lx)",
         geom = "line",
         jco_color = FALSE) |>
  gg_state(Lux_range, aes_fill = fct_rev(Lux_range),        # (4)
           alpha = 0.75, ymin = 10^3, ymax = 10^4) +
  scale_fill_viridis_d() +
  labs(fill = "Illuminance range") +
  theme(legend.position = "bottom") +
  coord_cartesian(xlim = c(8, 19.5)*3600)                   # (5)

1: Aggregating data to 2-minute bins
2: Removing the indoor condition
3: Setting up the basic plot
4: Adding state information on the illuminance ranges
5: Setting the x-axis limits to cover daytime hours

Figure 10: Outdoor light exposure over time with 2-minute interval. Colored bands indicate periods when illuminance exceeded outdoor thresholds for at least half of each interval: violet for ≥1000 lx, green for ≥2000 lx, and yellow for ≥3000 lx. Grey shaded regions denote night (from civil dusk to dawn).

3.2.3 Frequency of transitions from indoor to outdoor light

We next consider how often the subject moved from an indoor light environment to an outdoor-equivalent environment. We operationally define an “outdoor transition” as a change from <1000 lx to ≥1000 lx. Using the cleaned VEET data, we extract all instances where illuminance crosses that threshold from below to above.

Table 12 shows the average number of such transitions per day. Note that if data are recorded at a fine temporal resolution (5 s here), very brief excursions above 1000 lx could count as transitions and inflate this number. Indeed, the initial count is fairly high, reflecting fleeting spikes above 1000 lx that might not represent meaningful outdoor exposures.

Calculate the number of transitions from indoor to outdoor

dataVEET |> 
  extract_states(Outdoor, Lux >= 1000, group.by.state = FALSE) |>  # (1)
  filter(!lag(Outdoor) & Outdoor) |>                               # (2)
  summarize_numeric(prefix = "mean ",
    remove = c("Datetime", "Outdoor", "start", "end", "duration"),
    add.total.duration = FALSE) |> 
  mean_daily(prefix = "") |> 
  gt() |> 
  fmt_number(episodes, decimals = 0) |> 
  fmt_duration(`mean epoch`, input_units = "seconds", output_units = "seconds")

1: Label each interval as Outdoor (Lux≥1000) or not
2: Find instances where the previous interval was “indoor” and current is “outdoor”
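
Stripped of the data handling, counting upward threshold crossings amounts to counting positions where a logical series switches from FALSE to TRUE. The base R sketch below shows this on a toy illuminance vector; the values and variable names are illustrative only.

```r
# Toy illuminance series in lux at regular intervals
lux <- c(10, 1500, 20, 2000, 2500, 30, 1200)

outdoor <- lux >= 1000

# diff() on a logical vector gives +1 for FALSE -> TRUE and -1 for TRUE -> FALSE
n_transitions <- sum(diff(outdoor) == 1)
n_transitions
```

Note that every single sample above the threshold that follows a sample below it counts, which is exactly why brief spikes inflate the number at fine temporal resolution.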

Table 12: Average daily count of transitions from indoor (<1000 lx) to outdoor (≥1000 lx) lighting when looking at 5-second epochs

Date	mean epoch	episodes
VEET
Mean daily	5s	64
Weekday	5s	72
Weekend	5s	46

To obtain a more meaningful measure, we can require that the outdoor state persists for some minimum duration to count as a true transition (filtering out momentary fluctuations around the 1000 lx mark). For example, we can require that once ≥1000 lx is reached, it continues for at least 5 minutes (allowing short interruptions up to 20 s). Table 13 applies this criterion, resulting in a lower, more plausible transition count.

Calculate the number of transitions from indoor to outdoor with clusters

dataVEET |> 
  extract_clusters(Lux >= 1000,
                   cluster.duration = "5 min", 
                   interruption.duration = "20 secs",
                   return.only.clusters = FALSE,
                   drop.empty.groups = FALSE) |> 
  filter(!lag(is.cluster) & is.cluster) |> 
  summarize_numeric(prefix = "mean ",
    remove = c("Datetime", "start", "end", "duration"),
    add.total.duration = FALSE) |> 
  mean_daily(prefix = "") |> 
  gt() |> fmt_number(episodes, decimals = 0)

Table 13: Daily indoor-to-outdoor transition count (requiring ≥5 min duration of ≥1000 lx to count)

Date	mean epoch	episodes
VEET
Mean daily	5s	5
Weekday	5s	6
Weekend	5s	4

3.2.4 Longest sustained bright-light period

The final light exposure metric we illustrate is the longest continuous period above a certain illuminance threshold (often termed longest period above threshold, e.g. PAT₁₀₀₀ for 1000 lx). This gives us a sense of the longest outdoor exposure in a day. Along with it, one might report the total duration above that threshold in the day (TAT₁₀₀₀). While we could derive these from the earlier analyses, LightLogR provides dedicated metric functions for such calculations, which can compute multiple related metrics at once.

Using the function period_above_threshold() for PAT and duration_above_threshold() for TAT, we calculate both metrics for the 1000 lx threshold. Table 14 shows the mean of these metrics across days (i.e., average longest bright period and average total bright time per day).

Calculate PAT1000 and TAT1000

dataVEET |> 
  summarize(
    period_above_threshold(
      Lux, Datetime, threshold = 1000, na.rm = TRUE, as.df = TRUE),
    duration_above_threshold(
      Lux, Datetime, threshold = 1000, na.rm = TRUE, as.df = TRUE),
    .groups = "keep"
  ) |> 
  to_mean_daily("")

Table 14: Longest period and total duration of illuminance above 1000 lx (PAT1000 and TAT1000)

Date	period above 1000	duration above 1000
VEET
Mean daily	1208s (~20.13 minutes)	5703s (~1.58 hours)
Weekday	1469s (~24.48 minutes)	6815s (~1.89 hours)
Weekend	555s (~9.25 minutes)	2922s (~48.7 minutes)
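
For intuition, both metrics can be derived from a run-length encoding of the thresholded series. The base R sketch below is a simplified illustration on toy data, not the package implementation; it assumes regular 5-second sampling and that values equal to the threshold count as above it.

```r
# Toy illuminance series in lux, one value per 5-second epoch
lux <- c(500, 1200, 1500, 800, 2000, 2500, 3000, 900)
epoch_s <- 5

above <- lux >= 1000
runs <- rle(above)

# PAT: longest uninterrupted run above threshold (0 if there is none)
pat_s <- max(c(0, runs$lengths[runs$values])) * epoch_s

# TAT: total time above threshold, regardless of interruptions
tat_s <- sum(above) * epoch_s

c(PAT = pat_s, TAT = tat_s)
```

In this example the longest run spans three epochs (15 s) while five epochs in total lie above the threshold (25 s), so TAT is always at least as large as PAT.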

3.2.5 Merging data streams

Note that while imports from different devices can be merged, devices differ in their sensors, electronics, housing or diffuser form factors, and on-device data-processing pipelines. All of these factors affect the comparability of measurements, even when devices output the same variable (e.g., illuminance or distance). If data from different devices with the same measurement variable are to be merged, the corresponding variable names should be standardized beforehand, for example by renaming Lux and LIGHT to illuminance. To analyze the VEET data together with the Clouclip data, we do not have to rename anything, as both carry their illuminance measurements in the variable Lux. The following example shows how the two datasets are combined and how that affects analysis outcomes. It is the responsibility of the researcher to perform device calibration and/or checks for a similar measurement fidelity.

Merge Clouclip and VEET data

data <- join_datasets(dataCC, dataVEET)
data |> summary_overview(Lux, threshold.missing = 0.5) |> gt()

Table 15: Overview of the merged dataset

name	mean	min	max
Participants	2.00	NA	NA
Participant-days	13.00	6.00	7.00
Days ≥50% complete	6.00	3.00	3.00
Missing/Irregular	0.52	0.38	0.86
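For devices that do not share a variable name, the standardization step mentioned above could look like the following base-R sketch. The two data frames, their values, and the column name `LIGHT` for the second device are purely illustrative; with real imports, the merge itself would be done with join_datasets() as shown above.

```r
# Two toy device exports with different illuminance column names (made-up values)
t0 <- as.POSIXct("2025-06-18 12:00:00", tz = "UTC")
dev_a <- data.frame(Id = "A", Datetime = t0 + c(0, 60), Lux   = c(120, 150))
dev_b <- data.frame(Id = "B", Datetime = t0 + c(0, 60), LIGHT = c(300, 280))

# Standardize the variable name before merging
names(dev_a)[names(dev_a) == "Lux"]   <- "illuminance"
names(dev_b)[names(dev_b) == "LIGHT"] <- "illuminance"

# Row-wise combination; Id keeps the devices/participants apart
merged <- rbind(dev_a, dev_b)
merged
```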

We will reuse the example from Section 3.2.1, but instead of one participant, we now have data from two devices and two participants.

Re-calculate mean photopic illuminance with the merged dataset

data |>                                # (1)
  select(Id, Date, Datetime, Lux) |>   # (2)
  mutate(Lux = Lux |> log_zero_inflated()) |>
  summarize_numeric(prefix = "mean ", remove = c("Datetime")) |>
  mean_daily(prefix = "") |>
  mutate(`mean Lux` = `mean Lux` |> exp_zero_inflated()) |>
  gt() |> fmt_number(decimals = 1) |> cols_hide(episodes) |>
  cols_label(`mean Lux` = "Mean photopic illuminance (lx)")

1: Instead of dataVEET we now supply the merged data object
2: Verbatim from Section 3.2.1

Table 16: Recalculation of the mean light exposure per day (after logarithmic transformation to account for zero inflation and skewness) with the merged dataset

Date	Mean photopic illuminance (lx)
Clouclip
Mean daily	17.6
Weekday	18.6
Weekend	15.2
VEET
Mean daily	57.0
Weekday	70.1
Weekend	33.9
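The effect of averaging on a logarithmic scale can be illustrated with a small base-R sketch. The helpers `log_zi()` and `exp_zi()` are hypothetical stand-ins for the transformation and its inverse; the offset of 0.1 and the base of 10 are assumptions made for this illustration, and the illuminance values are made up.

```r
# Hypothetical helpers; offset and base are assumptions for illustration only
log_zi <- function(x, offset = 0.1, base = 10) log(x + offset, base = base)
exp_zi <- function(x, offset = 0.1, base = 10) base^x - offset

# Made-up, zero-inflated and right-skewed illuminance samples (lx)
lux <- c(0, 0, 5, 20, 150, 10000)

arithmetic_mean <- mean(lux)                   # dominated by the 10000 lx sample
log_scale_mean  <- exp_zi(mean(log_zi(lux)))   # mean on the log scale, back-transformed

arithmetic_mean
log_scale_mean
```

The offset allows zero values to enter the logarithm, and the back-transformed mean is far less sensitive to a few very bright samples than the arithmetic mean.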

3.3 Spectrum

The VEET device’s spectral sensor provides multimodal data beyond simple lux values, but it requires reconstruction of the actual light spectrum from raw sensor counts. We processed the spectral sensor data in order to compute two example spectrum-based metrics. Detailed data import, normalization, and spectral reconstruction steps are given in Supplement 1; here we present the resulting metrics. Briefly, the VEET’s spectral sensor recorded counts in nine wavelength bands (roughly 415 nm to 910 nm), plus a Dark, a Clear, and a flicker detection channel⁴. After normalizing by sensor gain and applying the calibration matrix, we obtained an estimated spectral irradiance distribution for each 5-minute interval in the recording. With these reconstructed spectra, we can derive novel metrics that consider spectral content of the light.

Note

Spectrum-based metrics in wearable data are relatively new and less established compared to distance or broadband light metrics. The following examples illustrate potential uses of spectral data in a theoretical sense, which can be adapted as needed for specific research questions.

3.3.1 Ratio of short- vs. long-wavelength light

Our first spectral metric is the ratio of short-wavelength light to long-wavelength light, which is relevant, for example, in assessing the blue-light content of exposure. We define “short” wavelengths as 400–500 nm and “long” as 600–700 nm (which are not standardized thresholds and can be freely adjusted). Using the list-column of spectra in our dataset, we integrate each spectrum over these ranges (using spectral_integration()), and then compute the ratio short/long for each time interval. We then summarize these ratios per day.

Extract wavelength sections and integrate over them

dataVEET <- dataVEET2 |> 
  select(Id, Date, Datetime, Spectrum) |>                     # (1)
  mutate(
    short = 
      Spectrum |> map_dbl(spectral_integration, wavelength.range = c(400, 500)),
    long  = 
      Spectrum |> map_dbl(spectral_integration, wavelength.range = c(600, 700)),
    `sl ratio` = 
      ifelse(is.nan(short / long), NA, short / long)          # (2)
  )

1: Focus on ID, date, time, and spectrum
2: Compute short-to-long wavelength ratio
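As an illustration of what integrating a spectrum over a wavelength band means, the following base-R sketch applies the trapezoidal rule to a made-up spectrum. The helper `integrate_band()` is hypothetical and is not how spectral_integration() is implemented; it only shows the principle.

```r
# Hypothetical helper: trapezoidal integration of a spectrum over a wavelength band
integrate_band <- function(wavelength, irradiance, range) {
  keep <- wavelength >= range[1] & wavelength <= range[2]
  w <- wavelength[keep]
  e <- irradiance[keep]
  # Sum of trapezoid areas between neighbouring wavelength samples
  sum(diff(w) * (e[-length(e)] + e[-1]) / 2)
}

# Made-up spectrum in W/(m² nm): twice as much power at or below 500 nm
wl  <- seq(380, 780, by = 5)
spd <- ifelse(wl <= 500, 0.002, 0.001)

short <- integrate_band(wl, spd, c(400, 500))   # 100 nm x 0.002 = 0.2 W/m²
long  <- integrate_band(wl, spd, c(600, 700))   # 100 nm x 0.001 = 0.1 W/m²

ratio <- short / long
ratio[is.nan(ratio)] <- NA                      # 0/0 (complete darkness) becomes NA
ratio
```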

Table 17 shows the average short/long wavelength ratio, averaged over each day (and then as weekday/weekend means if applicable). In this dataset, the values give an indication of the spectral balance of the light the individual was exposed to (higher values mean relatively more short-wavelength content).

Calculate daily average values of short and long wavelength content

dataVEET |> 
  summarize_numeric(prefix = "", remove = c("Datetime", "Spectrum")) |> 
  gt() |> 
  fmt_number(decimals = 1, scale_by = 1000) |>
  fmt_number(`sl ratio`, decimals = 3) |>
  cols_hide(episodes)

Table 17: Average short- and long-wavelength irradiance (mW/m²) and the ratio of short-wavelength (400–500 nm) to long-wavelength (600–700 nm) light

Date	short	long	sl ratio
VEET
2025-06-18	83.7	71.7	0.610
2025-06-20	114.0	81.6	0.372
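Note that the sl ratio column in Table 17 is not the quotient of the short and long columns (e.g., 83.7 / 71.7 ≈ 1.17 versus 0.610). This is consistent with the ratio being computed for each time interval first and then averaged per day, which generally differs from dividing the daily averages. A minimal base-R sketch with made-up numbers shows the difference:

```r
# Made-up per-interval irradiances (arbitrary units)
short <- c(1, 10, 100)
long  <- c(2, 10, 400)

mean_of_ratios <- mean(short / long)         # (0.5 + 1 + 0.25) / 3
ratio_of_means <- mean(short) / mean(long)   # 111 / 412

mean_of_ratios
ratio_of_means
```

Which of the two summaries is appropriate depends on the research question; the first weights every interval equally, whereas the second is dominated by the brightest intervals.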

3.3.2 Melanopic daylight efficacy ratio (MDER)

The same idea underlies the melanopic daylight efficacy ratio (MDER), which is defined by the CIE (“CIE System for Metrology of Optical Radiation for ipRGC-Influenced Responses to Light” 2018) as the melanopic EDI divided by the photopic illuminance (Hartmeyer and Andersen 2023). In this case, instead of a simple integration over a wavelength band, we apply an action spectrum to the spectral power distribution (SPD), integrate over the weighted SPD, and apply a correction factor. All alphaopic action spectra are implemented in the spectral_integration() function. This yields the photopic illuminance and the melanopic equivalent daylight illuminance (melEDI), from which the MDER is calculated. Results are shown in Table 18.

Calculate melEDI and illuminance

dataVEET <- 
  dataVEET |> 
  select(Id, Date, Datetime, Spectrum, short, long, `sl ratio`) |>
  mutate(
    melEDI =                                                   # (1)
      Spectrum |> map_dbl(spectral_integration, action.spectrum = "melanopic"),
    illuminance  =                                             # (2)
      Spectrum |> map_dbl(spectral_integration, action.spectrum = "photopic")
  )

1: Calculate melanopic EDI by applying the $s_{\mathrm{mel}}(\lambda)$ action spectrum, integrating, and weighting
2: Calculate photopic illuminance by applying the $V(\lambda)$ action spectrum, integrating, and weighting
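Written out, the weighting, integration, and correction steps correspond to the following relations. The notation is ours and is meant as a sketch of the CIE S 026 definitions rather than a description of the internals of spectral_integration(); $E_{e,\lambda}(\lambda)$ denotes the spectral irradiance, $K_m \approx 683\ \mathrm{lm/W}$ the maximum luminous efficacy, and $K_{\mathrm{mel},v}^{\mathrm{D65}} \approx 1.3262\ \mathrm{mW/lm}$ the melanopic efficacy of luminous radiation for daylight (D65).

$$
\begin{aligned}
E_{\mathrm{mel}} &= \int E_{e,\lambda}(\lambda)\, s_{\mathrm{mel}}(\lambda)\, \mathrm{d}\lambda \\
E_{v} &= K_{m} \int E_{e,\lambda}(\lambda)\, V(\lambda)\, \mathrm{d}\lambda \\
\mathrm{melEDI} &= \frac{E_{\mathrm{mel}}}{K_{\mathrm{mel},v}^{\mathrm{D65}}} \\
\mathrm{MDER} &= \frac{\mathrm{melEDI}}{E_{v}}
\end{aligned}
$$

By construction, light with the spectral composition of D65 daylight has an MDER of 1; values below 1 indicate relatively less melanopic content than daylight at the same illuminance.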

Calculate MDER

dataVEET  |> 
  summarize_numeric(prefix = "", remove = c("Datetime", "Spectrum")) |> 
  mutate(MDER = melEDI / illuminance) |>
  gt() |> 
  fmt_number(-`sl ratio`, decimals = 1, scale_by = 1000) |>
  fmt_number(c(MDER, `sl ratio`), decimals = 3) |>
  cols_hide(episodes)

Table 18: Average melanopic daylight efficacy ratio (MDER)

Date	short	long	sl ratio	melEDI	illuminance	MDER
VEET
2025-06-18	83.7	71.7	0.610	80.6	103.7	0.777
2025-06-20	114.0	81.6	0.372	108.0	123.9	0.871

3.3.3 Short-wavelength light at specific times of day

The third spectral example examines short-wavelength light exposure as a function of time of day. Certain studies might be interested in, for instance, blue-light exposure during midday versus morning or night. We demonstrate three approaches: (a) filtering the data to a specific local time window, (b) aggregating by hour of day to see a daily profile of short-wavelength exposure, and (c) comparing day and night periods.

Table 19 isolates the time window between 7:00 and 11:00 each day and computes the average short-wavelength irradiance in that interval. This represents a straightforward query: “How much blue light does the subject get in the morning on average?”

Calculate short-wavelength light before noon

dataVEET |> 
  filter_Time(start = "7:00:00", end = "11:00:00") |>   # (1)
  select(c(Id, Date, short)) |>
  summarize_numeric(prefix = "") |> 
  gt() |> 
  fmt_number(short, scale_by = 1000) |> 
  cols_label(short = "Short-wavelength irradiance (mW/m²)") |> 
  cols_hide(episodes)

1: Filter data to local 7am–11am

Table 19: Average short-wavelength light (400–500 nm) exposure between 7:00 and 11:00 each day

Date	Short-wavelength irradiance (mW/m²)
VEET
2025-06-18	5.44
2025-06-20	0.95
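The logic of a time-of-day filter can be sketched in base R as follows. The timestamps and irradiance values are made up, UTC is used only to keep the example deterministic, and we assume that both ends of the window are inclusive.

```r
# Made-up hourly samples from 06:00 to 13:00 (short-wavelength irradiance in W/m²)
datetime <- as.POSIXct("2025-06-18 06:00:00", tz = "UTC") +
  seq(0, by = 3600, length.out = 8)
short <- c(0.001, 0.004, 0.006, 0.005, 0.007, 0.020, 0.030, 0.025)

# Seconds since local midnight for every timestamp
secs <- as.numeric(format(datetime, "%H")) * 3600 +
  as.numeric(format(datetime, "%M")) * 60 +
  as.numeric(format(datetime, "%S"))

# Keep 07:00 to 11:00 (assumed: both ends inclusive)
in_window <- secs >= 7 * 3600 & secs <= 11 * 3600

# Mean irradiance in the window, converted to mW/m²
morning_mean <- mean(short[in_window]) * 1000
morning_mean   # 8.4
```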

To visualize short-wavelength exposure over the course of a day, we aggregate the data into hourly bins. We cut the timeline into 1-hour segments (using local time) and compute the mean short-wavelength irradiance in each hour for each day. Figure 11 shows the resulting diurnal profile, with short-wavelength exposure expressed as a fraction of the daily maximum for easier comparison.

Plot a diurnal profile

dataVEETtime <- dataVEET |>                                          # (1)
  cut_Datetime(unit = "1 hour", type = "floor", group_by = TRUE) |>  # (2)
  select(-c(Spectrum, long, Datetime)) |>
  summarize_numeric(prefix = "") |> 
  add_Time_col(Datetime.rounded)  |>                                 # (3)
  mutate(rel_short = short / max(short))

dataVEETtime |>                                                      # (4)
  ggplot(aes(x=Time, y = rel_short)) +
  geom_col(aes(fill = factor(Date)), position = "dodge") +
  ggsci::scale_fill_jco() +
  theme_minimal() +
  labs(y = "Normalized short-wavelength irradiance", 
       x = "Local time (HH:MM)",
       fill = "Date") + 
  scale_y_continuous(labels = scales::label_percent()) +
  scale_x_time(labels = scales::label_time(format = "%H:%M"))

1: Prepare hourly binned data
2: Bin timestamps by hour
3: Add a Time column (hour of day)
4: Creating the plot

Figure 11: Diurnal profile of short-wavelength light exposure. Each bar represents the average short-wavelength irradiance at that hour of the day (0–23 h), normalized to the daily maximum.
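The binning and normalization behind such a profile can be sketched in base R for a single day. All values are made up, and the sketch groups by hour of day only, so it is a simplification of the per-day grouping used above.

```r
# Made-up 20-minute samples covering 09:00 to 11:40 on one day
datetime <- as.POSIXct("2025-06-18 09:00:00", tz = "UTC") +
  seq(0, by = 1200, length.out = 9)
short <- c(1, 2, 3, 4, 5, 6, 10, 10, 10)

# Assign every sample to its hour of day and average within each hour
hour_of_day <- format(datetime, "%H")
hourly_mean <- tapply(short, hour_of_day, mean)   # 09: 2, 10: 5, 11: 10

# Express each hourly mean as a fraction of the day's maximum
rel_short <- hourly_mean / max(hourly_mean)
rel_short   # 0.2, 0.5, 1.0
```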

Finally, we compare short-wavelength exposure during daytime vs. nighttime. Using civil dawn and dusk information (based on geographic coordinates, here set for Houston, TX, USA), we label each measurement as day or night and then compute the mean short-wavelength irradiance in each period. Table 20 summarizes the average short-wavelength irradiance received during the day vs. during the night for each date.

Calculate photoperiod dependent measures

dataVEET |>
  select(-c(Spectrum, long, `sl ratio`, melEDI, illuminance)) |>
  add_photoperiod(coordinates) |> 
  group_by(photoperiod.state, .add = TRUE) |> 
  summarize_numeric(prefix = "", 
                    remove = c("dawn", "dusk", "photoperiod", "Datetime")) |> 
  group_by(Id, photoperiod.state) |> 
  select(-episodes) |> 
  pivot_wider(names_from =photoperiod.state, values_from = short) |> 
  gt() |> 
  fmt_number(scale_by = 1000, decimals = 1)

Table 20: Mean short-wavelength irradiance (mW/m²) during the day and at night

Date	day	night
VEET
2025-06-18	126.5	12.3
2025-06-20	181.8	1.0

Note

In the code cell above, add_photoperiod(coordinates) is used as a convenient way to add columns to the data frame, indicating for each timestamp whether it was day or night, given the latitude/longitude.
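Conceptually, the day/night labelling compares each timestamp against the dawn and dusk times of that day. The following base-R sketch uses fixed, assumed dawn and dusk times and made-up measurements; in the actual analysis these times are derived from the date and the coordinates, and the column names here are hypothetical.

```r
# Assumed civil dawn and dusk for one day (illustrative values only)
dawn <- as.POSIXct("2025-06-18 05:55:00", tz = "UTC")
dusk <- as.POSIXct("2025-06-18 20:50:00", tz = "UTC")

# Made-up samples every four hours, starting at 04:00
datetime <- as.POSIXct("2025-06-18 04:00:00", tz = "UTC") +
  seq(0, by = 4 * 3600, length.out = 6)
short <- c(0.001, 0.050, 0.200, 0.150, 0.010, 0.002)   # W/m²

# Label each sample: "day" between dawn and dusk, "night" otherwise
state <- ifelse(datetime >= dawn & datetime < dusk, "day", "night")

# Mean short-wavelength irradiance per state
state_mean <- tapply(short, state, mean)
state_mean   # day: 0.1025, night: 0.0015
```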

4 Discussion and conclusion

This tutorial demonstrates a standardized, step-by-step pipeline to calculate a variety of visual experience metrics. We illustrated how a combination of LightLogR functions and tidyverse workflows can yield clear and reproducible analyses for wearable device data. Although the full pipeline is detailed, each metric is computed through a dedicated sequence of well-documented steps that remains configurable for different metric definitions or thresholds.

By leveraging LightLogR’s framework alongside common data analysis approaches, the process remains transparent and relatively easy to follow. The overall goal is to make analysis transparent (with open-source functions), accessible (through thorough documentation, tutorials, and human-readable function naming, all under an MIT license), robust (the package includes >900 unit tests and continuous integration with bug tracking on GitHub), and community-driven (open feature requests and contributions via GitHub).

Even with standardized pipelines, researchers must still make and document many decisions during data cleaning, time-series handling, and metric calculations — especially for complex metrics that involve grouping data in multiple ways (for example, grouping by distance range as well as by duration for cluster metrics). We have highlighted these decision points in the tutorial (such as how to handle irregular intervals, choosing thresholds for “near” distances or “outdoor” light, and deciding on minimum durations for sustained events). Explicitly considering and reporting these choices is important for reproducibility and for comparing results across studies.

The broad set of features in LightLogR — ranging from data import and cleaning tools (for handling time gaps and irregularities) to visualization functions and metric calculators — makes it a powerful toolkit for visual experience research. Our examples spanned circadian-light metrics and myopia-related metrics, demonstrating the versatility of a unified analysis approach. By using community-supported tools and workflows, researchers in vision science, chronobiology, myopia, and related fields can reduce time spent on low-level data wrangling and focus more on interpreting results and advancing scientific understanding.

5 Session info

sessionInfo()

R version 4.5.2 (2025-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] gt_1.1.0         lubridate_1.9.4  forcats_1.0.1    stringr_1.6.0   
 [5] dplyr_1.1.4      purrr_1.2.0      readr_2.1.6      tidyr_1.3.1     
 [9] tibble_3.3.0     ggplot2_4.0.1    tidyverse_2.0.0  LightLogR_0.10.0

loaded via a namespace (and not attached):
 [1] sass_0.4.10        generics_0.1.4     renv_1.1.5         class_7.3-23      
 [5] xml2_1.5.1         KernSmooth_2.23-26 stringi_1.8.7      hms_1.1.4         
 [9] digest_0.6.39      magrittr_2.0.4     evaluate_1.0.5     grid_4.5.2        
[13] timechange_0.3.0   RColorBrewer_1.1-3 fastmap_1.2.0      jsonlite_2.0.0    
[17] e1071_1.7-16       DBI_1.2.3          viridisLite_0.4.2  scales_1.4.0      
[21] textshaping_1.0.4  cli_3.6.5          rlang_1.1.6        units_1.0-0       
[25] cowplot_1.2.0      withr_3.0.2        yaml_2.3.12        tools_4.5.2       
[29] tzdb_0.5.0         vctrs_0.6.5        R6_2.6.1           proxy_0.4-27      
[33] classInt_0.4-11    lifecycle_1.0.4    fs_1.6.6           htmlwidgets_1.6.4 
[37] ragg_1.5.0         pkgconfig_2.0.3    pillar_1.11.1      gtable_0.3.6      
[41] Rcpp_1.1.0         glue_1.8.0         sf_1.0-23          systemfonts_1.3.1 
[45] xfun_0.54          tidyselect_1.2.1   knitr_1.50         farver_2.1.2      
[49] htmltools_0.5.9    rmarkdown_2.30     ggsci_4.1.0        labeling_0.4.3    
[53] suntools_1.1.0     compiler_4.5.2     S7_0.2.1

6 Statements

6.1 Acknowledgement

We thank George Hatoun and David Sullivan (Reality Labs Research) for reviewing a draft of the tutorial manuscript and providing sample data for the VEET for spectral analyses.

6.2 Data availability statement

All data and code in this tutorial and Supplement 1 are available from the GitHub repository: https://github.com/tscnlab/ZaunerEtAl_JVis_2026/, archived on Zenodo: https://doi.org/10.5281/zenodo.16566014 under an MIT license (data under a CC-BY license).

6.3 Funding statement

JZ’s position is funded by the MeLiDos project. The project has received funding from the European Partnership on Metrology (22NRM05 MeLiDos), co-financed from the European Union’s Horizon Europe Research and Innovation Programme and by the Participating States. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or EURAMET. Neither the European Union nor the granting authority can be held responsible for them. JZ, LAO, and MS received research funding from Reality Labs Research. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

6.4 Conflict of interest statement

JZ declares the following potential conflict of interest in the past five years (2021-2025). Funding: Received research funding from Reality Labs Research.

AN declares the following potential conflicts of interest in the past five years (2021-2025): None.

LAO declares the following potential conflict of interest in the past five years (2021-2025). Consultancy: Zeiss, Alcon, EssilorLuxottica; Research support: Topcon, Meta, LLC; Patents: US 11375890 B2

MS declares the following potential conflicts of interest in the past five years (2021–2025). Academic roles: Member of the Board of Directors, Society of Light, Rhythms, and Circadian Health (SLRCH); Chair of Joint Technical Committee 20 (JTC20) of the International Commission on Illumination (CIE); Member of the Daylight Academy; Chair of Research Data Alliance Working Group Optical Radiation and Visual Experience Data. Remunerated roles: Speaker of the Steering Committee of the Daylight Academy; Ad-hoc reviewer for the Health and Digital Executive Agency of the European Commission; Ad-hoc reviewer for the Swedish Research Council; Associate Editor for LEUKOS, journal of the Illuminating Engineering Society; Examiner, University of Manchester; Examiner, Flinders University; Examiner, University of Southern Norway. Funding: Received research funding and support from the Max Planck Society, Max Planck Foundation, Max Planck Innovation, Technical University of Munich, Wellcome Trust, National Research Foundation Singapore, European Partnership on Metrology, VELUX Foundation, Bayerisch-Tschechische Hochschulagentur (BTHA), BayFrance (Bayerisch-Französisches Hochschulzentrum), BayFOR (Bayerische Forschungsallianz), and Reality Labs Research. Honoraria for talks: Received honoraria from the ISGlobal, Research Foundation of the City University of New York and the Stadt Ebersberg, Museum Wald und Umwelt. Travel reimbursements: Daimler und Benz Stiftung. Patents: Named on European Patent Application EP23159999.4A (“System and method for corneal-plane physiologically-relevant light logging with an application to personalized light interventions related to health and well-being”). With the exception of the funding source supporting this work, M.S. declares no influence of the disclosed roles or relationships on the work presented herein.

6.5 Statement of generative AI and AI-assisted technologies in the writing process

The authors used ChatGPT during the preparation of this work. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

Use of AI in contributor roles ⁵:
Conceptualization: no
Data curation: no
Formal analysis: bug fixing
Methodology: no
Software: bug fixing
Validation: no
Visualization: tweaking of options
Writing – original draft: abstract refinement
Writing – review & editing: improve readability and language

7 References

Bhandari, Khob R, and Lisa A Ostrin. 2020. “Validation of the Clouclip and Utility in Measuring Viewing Distance in Adults.” Ophthalmic and Physiological Optics 40 (6): 801–14. https://doi.org/10.1111/opo.12735.

Biller, A. M., P. Balakrishnan, and M. Spitschan. 2024. “Behavioural Determinants of Physiologically-Relevant Light Exposure.” Commun Psychol 2 (1): 114. https://doi.org/10.1038/s44271-024-00159-5.

Blume, C., C. Garbazza, and M. Spitschan. 2019. “Effects of Light on Human Circadian Rhythms, Sleep and Mood.” Somnologie (Berl) 23 (3): 147–56. https://doi.org/10.1007/s11818-019-00215-x.

Brown, T. M., G. C. Brainard, C. Cajochen, C. A. Czeisler, J. P. Hanifin, S. W. Lockley, R. J. Lucas, et al. 2022. “Recommendations for Daytime, Evening, and Nighttime Indoor Light Exposure to Best Support Physiology, Sleep, and Wakefulness in Healthy Adults.” PLoS Biol 20 (3): e3001571. https://doi.org/10.1371/journal.pbio.3001571.

“CIE System for Metrology of Optical Radiation for ipRGC-Influenced Responses to Light.” 2018, no. CIE S 026/E:2018. https://doi.org/10.25039/S026.2018.

Dahlmann-Noor, A. H., D. Bokre, M. Khazova, and L. L. A. Price. 2025. “Measuring the Visual Environment of Children and Young People at Risk of Myopia: A Scoping Review.” Graefes Arch Clin Exp Ophthalmol. https://doi.org/10.1007/s00417-024-06719-z.

Gibaldi, A., E. N. Harb, C. F. Wildsoet, and M. S. Banks. 2024. “A Child-Friendly Wearable Device for Quantifying Environmental Risk Factors for Myopia.” Transl Vis Sci Technol 13 (10): 28. https://doi.org/10.1167/tvst.13.10.28.

Guidolin, Carolina, Johannes Zauner, Steffen Lutz Hartmeyer, and Manuel Spitschan. 2024. “Collecting, Detecting and Handling Non-Wear Intervals in Longitudinal Light Exposure Data.” bioRxiv. https://doi.org/10.1101/2024.12.23.627604.

Hartmeyer, S. L., and M. Andersen. 2023. “Towards a Framework for Light-Dosimetry Studies: Quantification Metrics.” Lighting Research & Technology 56 (4): 337–65. https://doi.org/10.1177/14771535231170500.

Hartmeyer, S. L., F. S. Webler, and M. Andersen. 2022. “Towards a Framework for Light-Dosimetry Studies: Methodological Considerations.” Lighting Research & Technology 55 (4-5): 377–99. https://doi.org/10.1177/14771535221103258.

Hönekopp, A., and S. Weigelt. 2023. “Using Light Meters to Investigate the Light-Myopia Association - a Literature Review of Devices and Research Methods.” Clin Ophthalmol 17: 2737–60. https://doi.org/10.2147/OPTH.S420631.

Mohamed, A., V. Kalavally, S. W. Cain, A. J. K. Phillips, E. M. McGlashan, and C. P. Tan. 2021. “Wearable Light Spectral Sensor Optimized for Measuring Daily Alpha-Opic Light Exposure.” Opt Express 29 (17): 27612–27. https://doi.org/10.1364/OE.431373.

Murukesu, Resshaya Roobini, Johannes Zauner, and Manuel Spitschan. 2025. “A Day in Daylight,” November. https://doi.org/10.5281/zenodo.17513996.

Okudaira, N., D. F. Kripke, and J. B. Webster. 1983. “Naturalistic Studies of Human Light Exposure.” Am J Physiol 245 (4): R613–5. https://doi.org/10.1152/ajpregu.1983.245.4.R613.

Patterson Gentile, Carlyn, Ryan Shah, Blanca Marquez De Prado, Nichelle Raj, Christina L. Szperka, Andrew D. Hershey, and Geoffrey K. Aguirre. 2025. “Daily Light Exposure Habits of Youth with Migraine: A Prospective Exploratory Study.” Npj Biological Timing and Sleep 2 (1). https://doi.org/10.1038/s44323-025-00056-y.

Sah, Raman Prasad, Pavan Kalyan Narra, and Lisa A. Ostrin. 2025. “A Novel Wearable Sensor for Objective Measurement of Distance and Illumination.” Ophthalmic and Physiological Optics 00 (n/a): 1–13. https://doi.org/10.1111/opo.13523.

Sullivan, David, Aaron Nicholls, George Hatoun, Samuel Thompson, Cory Schwarzmiller, Fathollah Memarzanjany, Alyssa Gunderson, et al. 2024. “The Visual Experience Evaluation Tool: A Myopia Research Instrument for Quantifying Visual Experience.” bioRxiv. https://doi.org/10.1101/2024.09.20.614212.

Tabandeh, Niloufar, and Manuel Spitschan. 2025. “Photoreceptor-Specific Scene Statistics Reveal Melanopic Structure in Natural Environments,” November. https://doi.org/10.1101/2025.11.10.687567.

van Duijnhoven, J., S. L. Hartmeyer, A. Didikoglu, O. Stefani, K. W. Houser, V. Kalavally, and M. Spitschan. 2025. “Measuring Light Exposure in Daily Life: A Review of Wearable Light Loggers.” Build Environ 274. https://doi.org/10.1016/j.buildenv.2025.112771.

Webler, F. S., M. Spitschan, R. G. Foster, M. Andersen, and S. N. Peirson. 2019. “What Is the ’Spectral Diet’ of Humans?” Curr Opin Behav Sci 30: 80–86. https://doi.org/10.1016/j.cobeha.2019.06.006.

Wen, Longbo, Yingpin Cao, Qian Cheng, Xiaoning Li, Lun Pan, Lei Li, HaoGang Zhu, Weizhong Lan, and Zhikuan Yang. 2020. “Objectively Measured Near Work, Outdoor Exposure and Myopia in Children.” British Journal of Ophthalmology 104 (11): 1542–47. https://doi.org/10.1136/bjophthalmol-2019-315258.

Wen, Longbo, Qian Cheng, Yingpin Cao, Xiaoning Li, Lun Pan, Lei Li, Haogang Zhu, Ian Mogran, Weizhong Lan, and Zhikuan Yang. 2021. “The Clouclip, a Wearable Device for Measuring Near-Work and Outdoor Time: Validation and Comparison of Objective Measures with Questionnaire Estimates.” Acta Ophthalmologica 99 (7): e1222–35. https://doi.org/10.1111/aos.14785.

Wen, Longbo, Qian Cheng, Weizhong Lan, Yingpin Cao, Xiaoning Li, Yiqiu Lu, Zhenghua Lin, Lun Pan, Haogang Zhu, and Zhikuan Yang. 2019. “An Objective Comparison of Light Intensity and Near-Visual Tasks Between Rural and Urban School Children in China by a Wearable Device Clouclip.” Translational Vision Science & Technology 8 (6): 15–15. https://doi.org/10.1167/tvst.8.6.15.

Williams, Rachel, Suyash Bakshi, Edwin J Ostrin, and Lisa A Ostrin. 2019. “Continuous Objective Assessment of Near Work.” Scientific Reports 9 (1): 6901.

Zauner, J., C. Guidolin, and M. Spitschan. “How to Deal with Darkness: Modeling and Visualization of Zero-Inflated Personal Light Exposure Data on a Logarithmic Scale.” Journal of Biological Rhythms 0 (0): 07487304251336624. https://doi.org/10.1177/07487304251336624.

Zauner, J., S. Hartmeyer, and M. Spitschan. 2025. “LightLogR: Reproducible Analysis of Personal Light Exposure Data.” J Open Source Softw 10 (107): 7601. https://doi.org/10.21105/joss.07601.

Zauner, Johannes, Rigmor C. Baraas, Elise N. Harb, Ali Heshmati, Francisco Imai, Pauline Kang, Raymond P. Najjar, et al. 2025. “Technical Guide for Wearable Optical Radiation Dosimetry and Visual Experience Assessment.” https://doi.org/10.17617/6XCA-CG59.

Zauner, Johannes, Oliver Stefani, Anna M. Biller, Carolina Guidolin, and Manuel Spitschan. 2025. “A Web-Based Specification Tool for Wearable Light Loggers and Optical Radiation Dosimeters.” Preprints. https://doi.org/10.20944/preprints202511.0644.v1.

Zauner, J., O. Stefani, G. Bocanegra, C. Guidolin, B. Schrader, L. Udovicic, and M. Spitschan. 2025. “Auxiliary Data, Quality Assurance and Quality Control for Wearable Light Loggers and Optical Radiation Dosimeters.” bioRxiv. https://doi.org/10.1101/2025.09.11.675633.

Zauner, J., L. Udovicic, and M. Spitschan. 2024. “Power Analysis for Personal Light Exposure Measurements and Interventions.” PLOS ONE 19 (12): 1–15. https://doi.org/10.1371/journal.pone.0308768.

Footnotes

1. Functions from LightLogR are presented as links to the function documentation. General analysis functions (from package dplyr) are presented as normal text.
2. The upper threshold refers to the Clouclip's maximum distance measurement and is not theoretically based.
3. This deviates from the common definition of luminous exposure, which is the sum of illuminance measurements scaled to hourly observation intervals.
4. Note that older firmware versions of the VEET prior to 2.1.7 contained two Clear channels and the highest spectral channel was indicated as 940 nm. Data collected with this early firmware version are not suitable for spectral reconstruction in the context of research projects.
5. Based on the CRediT taxonomy. Funding acquisition, investigation, project administration, resources, and supervision were deemed irrelevant in this context and thus removed.