GLOBEM — Work in Progress

PHASE 1 · BLOCK 0: BUSINESS UNDERSTANDING

Problem framing: operational definition of deterioration

Completed

Question solved: How is "deterioration" defined operationally when the dataset's clinical label (BDI-II) is unreliable and population-level methods have failed to predict it?

Prerequisite papers for this block

Bhattacharya et al. (2024) — "Imputation Strategies for Longitudinal Behavioral Studies: Predicting Depression Using GLOBEM Datasets".

It pinpoints the gap this project fills: even the most recent work on GLOBEM does not address the intra-subject perspective.

Domain

Hunt, Auriemma & Cashaw (2003) — "Self-report bias and underreporting of depression on the BDI-II".

It provides the methodological justification for rejecting the BDI-II as the target. It shows experimentally that participants report more symptoms when the purpose of the questionnaire is masked, which is direct evidence of systematic underreporting.

Domain

Decision

Reject the dataset's label (BDI-II) and build the project's own operational definition of deterioration, based on passive behavioral signals and temporal persistence. Adopt the N-of-1 framework as the design and SPC (statistical process control) as the technique.

Trade-off given up

Direct comparability with the 18-algorithm benchmark of Xu et al. (2022). An apples-to-apples comparison is not possible once the underlying question changes.

Real-world use

Continuous passive monitoring, with no questionnaires or periodic assessments. The system runs in the background, like a smoke detector rather than a clinical consultation.

Trade-off

Early, non-intrusive detection is gained. What is lost is the certainty of measuring clinical depression exactly as the DSM defines it. It is a conscious and defensible trade-off.

PHASE 2 · BLOCK 1: TECHNICAL DATASET VALIDATION

Dataset validation: coverage, warmup and feasibility

Completed

Question solved: Which cohorts, sensors and time windows of the dataset are technically viable for building an individual monitoring system?

Prerequisite papers for this block

Xu et al. (2022) — "GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling".

The source of the dataset and the reference benchmark. The 54.7% accuracy of its best model is the starting point that motivates the change of approach.

Domain

Decision

Work exclusively with INS-W_1 (155 participants, pre-COVID). Use 7dhist features rather than 14dhist. Exclude WiFi (96.8% NaN), Call (ambiguous NaN) and Bluetooth (signal with no behavioral value).

Trade-off given up

The three remaining studies in the dataset (2019–2021). They offer a larger sample but a less reliable context. Sacrificing a clean baseline for a larger n is not worth it.

Real-world use

Any real monitoring system needs per-person warmup data before it can detect deviations. Here that period is set at a minimum of 7 days per subject.

Trade-off

7dhist over 14dhist: history is lost, participant coverage is gained. Excluding WiFi, Call and Bluetooth: signals are lost, reliability is gained. INS-W_1 only: sample size is lost, a clean context is gained.

PHASE 3 · BLOCK 2: DATA PREPARATION

Individual baseline construction

Completed

Question solved: What is each participant's "normal behavior", how is it measured day by day, and how is it decided when it can serve as a reference?

Prerequisite papers for this block

Difrancesco et al. (2019) — "Sleep, circadian rhythm, and physical activity patterns in depressive and anxiety disorders". Depression and Anxiety.

Baseline / signal mapping: empirical support for choosing sleep and activity as core dimensions.

Domain

Daza, E. J. (2018). Causal Analysis of Self-tracked Time Series Data Using a Counterfactual Framework for N-of-1 Trials.

Counterfactual framework for observational N-of-1 studies. It formally defines the N1OS design as distinct from the N1RT and grounds idiographic inference without randomization, the conceptual core of the project's intra-subject approach.

N-of-1

Vieira et al. (2017). Dynamic modelling of n-of-1 data: powerful and flexible data analytics applied to individualised studies.

Dynamic modelling of N-of-1 data. It provides the formal justification for handling trend and autocorrelation in single-subject longitudinal series, the technical basis for the treatment of autocorrelation between z-scores in Block 4.

N-of-1

Decision

Buffered rolling baseline, computed per person, dimension and day. The individual reference is updated every day from the participant's recent history, which captures the non-stationarity inherent in real behavioral series and avoids, by construction, the exclusion bias against non-routine profiles. Operationally: a (mean, standard deviation) pair over the 21 days of the window [d-28, d-8], so that a 7-day buffer (d-7 to d-1) separates the window from each evaluated day d. The buffer prevents the observation for day d from sharing raw data with its own reference, given that 7dhist features are 7-day rolling sums. Primary feature per dimension: steps, radius of gyration, sleep duration, screen duration. Coverage criterion: at least 70% non-null days in the baseline window.
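The window arithmetic above can be made concrete with a minimal sketch. It assumes one daily series per person and dimension, indexed by day, with `None` for missing days. The function name `rolling_baseline`, the use of the sample standard deviation and the zero-variance guard are illustrative assumptions, not details taken from the project's code.

```python
from statistics import mean, stdev
from typing import Optional, Sequence

WINDOW_DAYS = 21     # length of the baseline window [d-28, d-8]
BUFFER_DAYS = 7      # days d-7 .. d-1, kept out of the baseline
MIN_COVERAGE = 0.70  # minimum share of non-null days in the window


def rolling_baseline(
    series: Sequence[Optional[float]], d: int
) -> Optional[tuple[float, float]]:
    """Return (mean, std) of the buffered window for day index d, or None.

    None marks day d as not evaluable: too little history, coverage
    below MIN_COVERAGE, or zero variance (assumed guard) in the window.
    """
    start = d - (WINDOW_DAYS + BUFFER_DAYS)  # index of day d-28
    end = d - BUFFER_DAYS - 1                # index of day d-8 (inclusive)
    if start < 0:
        return None
    observed = [x for x in series[start : end + 1] if x is not None]
    if len(observed) < MIN_COVERAGE * WINDOW_DAYS:
        return None
    sd = stdev(observed)  # sample standard deviation (assumption)
    if sd == 0:
        return None
    return mean(observed), sd
```

With 21 days in the window, the 70% criterion amounts to at least 15 observed days. Returning `None` instead of raising keeps the person in the pipeline on days when a dimension is not evaluable.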

Trade-off given up

A fixed baseline at the start of the record is given up, although it would be simpler to audit. Its structural limitation is that it requires the initial window to meet a stability criterion before monitoring can begin. Every participant who happens to be in transition during that stretch (relocation, job change, a different phase of their routine, a period of higher variability) is therefore systematically excluded, not because they cannot be evaluated but because their first three weeks were not representative. The system ends up operating preferentially on routine profiles and the rest become invisible. The per-dimension, day-by-day evaluability classification that the rolling design retains is not something given up: it is a system feature (a per person-day detectability module) that decides when automatic monitoring can operate and when it cannot.

Real-world use

A user's first 5 weeks on a digital health platform are history accumulation: no alerts are triggered while the system gathers the prior days needed to build the first buffered rolling baseline. From day 35 onwards, the system checks day by day which dimensions have a computable baseline for that user and applies the 2/3 convergence rule over the evaluable days. If coverage drops in a dimension for some period, the system does not operate on that dimension on those days, but the user stays in the pipeline.

Trade-off

Buffered rolling baseline versus fixed baseline. Rolling is chosen because the individual reference stays operative across the whole cohort without biasing the system towards routine behavioral profiles. The costs accepted are greater audit complexity (the baseline is no longer a single pair per person and dimension but a function of the day) and the possibility that gradual deterioration is absorbed into the baseline. The latter is mitigated by the 7-day persistence rule, the 2/3 convergence rule and the structural change detection methods of Block 7.

PHASE 3 · BLOCK 3: DATA PREPARATION

Intra-subject z-score / Shewhart control chart

Completed

Question solved: Once the individual baseline has been built, how is each day of behavior converted into a measure of distance from that person's own pattern?

Decision

Standardize each person's deviation from their own baseline by expressing each day as the number of standard deviations it lies from the baseline mean: z = (observation − baseline_mean) / baseline_std. The formula applies per person, dimension and evaluable day with a valid baseline. Standardization makes the four dimensions (steps, radius of gyration, sleep, screen) comparable even though their raw values are on completely different scales.
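As a small illustration of the formula, the sketch below applies it to two hypothetical people with very different step counts. The function name `intra_subject_z` and all numeric values are made up for the example, and returning `None` for days without a usable baseline is an assumption.

```python
from typing import Optional


def intra_subject_z(
    observation: Optional[float], baseline_mean: float, baseline_std: float
) -> Optional[float]:
    """z = (observation - baseline_mean) / baseline_std, or None if not evaluable."""
    if observation is None or baseline_std <= 0:
        return None
    return (observation - baseline_mean) / baseline_std


# Hypothetical daily step counts: an active person and a sedentary person.
z_active = intra_subject_z(6000.0, baseline_mean=12000.0, baseline_std=3000.0)
z_sedentary = intra_subject_z(1000.0, baseline_mean=3000.0, baseline_std=1000.0)
print(z_active, z_sedentary)  # -2.0 -2.0
```

Both people sit two of their own standard deviations below their own mean, even though their raw step counts differ by a factor of six.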

Trade-off given up

Comparison between people is given up. The intra-subject z-score does not say whether a person is high or low relative to the population; it says whether they are high or low relative to themselves. A z-score of +2 means the same for a very active person as for a sedentary one: they have deviated by twice their own usual variability. The z-scores of two different participants cannot be compared directly.

Real-world use

In production, the intra-subject z-score is computed every day for each active user with a valid baseline. If a person has had a high z-score in some dimension for several days, the system records the signal without needing to compare it with any other user. The threshold that triggers the alert is the same for everyone, even though each person's behavior is completely different.

Trade-off

Individual sensitivity is gained: the system is as capable of detecting deterioration in a habitually active person as in a sedentary one. The ability to establish population references or to compare people is lost. The system cannot answer "this person is at the Xth percentile of the population"; it can only answer "this person has moved X standard deviations away from their own pattern".

PHASE 3 · BLOCK 4: DATA PREPARATION

Time series · Rolling windows · Autocorrelation

Completed

Question solved: What temporal structure do the z-score series have? Is the autocorrelation observed between consecutive days a real behavioral signal, or a consequence of the baseline calculation itself?

Prerequisite papers for this block

Wichers, Smit, Snippe (2020) — "Early Warning Signals Based on Momentary Affect Dynamics can Expose Nearby Transitions in Depression: A Confirmatory Single-Subject Time-Series Study"

Formally introduces the idea that rising variance and rising autocorrelation in momentary affect are early-warning indicators of psychopathological transitions.

Domain

Olthof, Hasselman, Strunk (2020) — "Critical Fluctuations as an Early-Warning Signal for Sudden Gains and Losses in Patients Receiving Psychotherapy for Mood Disorders"

A concrete empirical application on 328 psychotherapy cases. It shows that critical fluctuations anticipate sustained changes in clinical sessions.

Domain

Decision

Build the temporal diagnosis of the z-score series in three steps: visual inspection of individual trajectories, autocorrelation analysis with simulated noise as the structural baseline, and characterization with a rolling mean and rolling standard deviation over a 7-day window. The 7-day window is not optimized here: it is fixed for consistency with the dataset's 7dhist features and with the persistence rule from Block 0. The central finding of the block is that the autocorrelation observed between consecutive days of the z-score is almost entirely an artifact of the rolling baseline calculation itself, not a net behavioral signal.
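The idea of using simulated noise as a structural baseline can be sketched as follows. This is not the project's analysis: it is a toy simulation that assumes independent Gaussian daily values, a 7dhist-style feature built as a 7-day rolling sum, and the buffered window [d-28, d-8] for the z-score. The function names are hypothetical. Because the raw input has no day-to-day dependence by construction, any lag-1 autocorrelation in the resulting z-score series comes from the calculation pipeline alone.

```python
import random
from statistics import mean, stdev


def lag1_autocorr(xs: list[float]) -> float:
    """Sample lag-1 autocorrelation of a series."""
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den


def simulate_z(n_days: int = 400, seed: int = 0) -> tuple[list[float], list[float]]:
    """Push white noise through an assumed 7dhist + buffered-baseline pipeline."""
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in range(n_days)]
    # 7dhist-like feature: sum of the 7 raw days ending on each day.
    hist7 = [sum(raw[i - 6 : i + 1]) for i in range(6, n_days)]
    z = []
    for d in range(28, len(hist7)):
        window = hist7[d - 28 : d - 7]  # days d-28 .. d-8, 21 values
        z.append((hist7[d] - mean(window)) / stdev(window))
    return raw, z


raw, z = simulate_z()
print("lag-1 autocorrelation, raw noise:", round(lag1_autocorr(raw), 2))
print("lag-1 autocorrelation, z-score  :", round(lag1_autocorr(z), 2))
```

The exact figures depend on the seed. The comparison of interest is between the near-zero value for the raw noise and the clearly positive value for the z-score series. Autocorrelation observed in real data can then be read against that structural level rather than against zero.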

Trade-off given up

Short z-score lags are given up as direct features for Block 5. Because lag-1 autocorrelation turns out to be structural, features such as "yesterday's z-score" or "difference between today and yesterday" would be useless: they would capture noise from the calculation pipeline rather than real behavioral signal. Block 5 features are built on sustained z-score levels, not on daily variations.

Real-world use

In production, the system operates on the sustained level of the z-score, not on its daily variation. The 7-day rolling mean of the z-score is what the system reads to decide whether a dimension is persistently deviated. The daily z-score series is kept as a raw record for retrospective analysis, not as a direct intervention signal.

Trade-off

The 7-day rolling window introduces inertia: if deterioration begins today, the rolling mean will not fully reflect it until several days later. In return, the system gains robustness to daily noise (isolated atypical days do not trigger it) and consistency with the framework's persistence rule. The tension between early detection and robustness is documented here and is addressed with the CUSUM/EWMA methods of Block 7.

PHASE 3 · BLOCK 5: DATA PREPARATION

Temporal feature engineering

Completed

Question solved: How is the variable that defines deterioration built in practice? What design decisions are needed to go from the z-score series to 2/3 convergence with seven-day persistence?

Decision

Materialize the operational definition of deterioration from Block 0 as a binary variable per person and day: 2/3 convergence with 7-day persistence. The construction has four layers: a daily vote threshold per dimension on the rolling mean of the z-score; a persistence indicator that requires 7 consecutive days of vote; the composition of the three behavioral pillars (sleep, activation, passive use), with the activation pillar fed by steps and location under an OR rule; and the final 2/3 convergence rule. Two technical decisions inherited from Block 4 are applied: winsorization of the location rolling mean at p95, and the OR rule for the activation pillar.

Trade-off given up

Generic time-series features (slopes, rates of change, lag features) are given up as input to the Block 6 model. The 2/3 convergence variable encapsulates the project's theory; features not anchored in the operational definition of deterioration would add selection complexity without conceptual justification. If Block 6 needed more signal, this would be revisited using criteria derived from the evaluation.

Real-world use

In production, the system computes every day, for each active user, whether the three pillars have been in a vote state over the last 7 days and whether at least two of them have converged. The result is a single bit per person and day: alert active or not. That signal can be audited by any clinician: "the system fired because sleep and activation were deviated for 7 consecutive days".

Trade-off

The 2/3 convergence is an explicit and auditable rule. What is lost is the flexibility of a model that weights the dimensions numerically and could discover unanticipated relationships. The trade-off is accepted in favor of transparency and interpretability, especially in a clinical domain where end-user trust is critical and an opaque rule hinders adoption.

PHASE 3 · BLOCK 5: DATA PREPARATION

Temporal feature engineering

Completed

Question solved: How is the variable that signals sustained behavioral deviation built in practice, and where is its operating point set? What design decisions lead from the z-score series to a per-person, per-day alert with 2/3 convergence and five-day persistence?

Decision

Materialize the operational definition of deviation from Block 0 as a binary variable per person and day. The construction has four layers: a daily vote threshold per dimension on the rolling mean of the z-score; a persistence indicator that requires five consecutive days of vote; the composition of the three behavioral pillars (sleep, activation, passive use), with the activation pillar fed by steps and location under an OR rule; and the final 2/3 convergence rule. The operating point (threshold 2.0, 5-day persistence) is set through an explicit calibration that prioritizes credibility over volume: the most sensitive configuration flagged nearly half the cohort, too many for a clinician to trust, so the cleanest point that still alerts on a useful number of people is chosen. Two decisions inherited from Block 4 are applied: winsorization of the location rolling mean at p95, and the OR rule for the activation pillar.
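The four layers can be sketched as a small rule engine. Several details are assumptions made only for this illustration: the vote uses the absolute value of the rolling mean (the text does not state the direction of the threshold), a missing day resets the persistence counter, the OR rule is applied after persistence, the screen dimension stands in for the passive-use pillar, and the p95 winsorization of location is left out. The function names are hypothetical.

```python
from typing import Optional, Sequence

THRESHOLD = 2.0  # vote threshold on the rolling mean of z (operating point)
PERSISTENCE = 5  # consecutive voting days required (operating point)

Series = Sequence[Optional[float]]


def persistent_vote(
    zbar: Series, threshold: float = THRESHOLD, k: int = PERSISTENCE
) -> list[bool]:
    """Layers 1 and 2: True on day t if |zbar| >= threshold on each of the k days ending at t."""
    out, run = [], 0
    for v in zbar:
        # A missing day breaks the run (assumption).
        run = run + 1 if (v is not None and abs(v) >= threshold) else 0
        out.append(run >= k)
    return out


def daily_alert(sleep: Series, steps: Series, location: Series, screen: Series) -> list[bool]:
    """Layers 3 and 4: compose the three pillars and apply 2/3 convergence."""
    p_sleep = persistent_vote(sleep)
    p_steps = persistent_vote(steps)
    p_location = persistent_vote(location)
    p_passive = persistent_vote(screen)  # screen as passive-use pillar (assumption)
    alerts = []
    for s, st, lo, pa in zip(p_sleep, p_steps, p_location, p_passive):
        activation = st or lo  # OR rule for the activation pillar
        alerts.append(s + activation + pa >= 2)  # at least 2 of 3 pillars
    return alerts
```

Each intermediate list (`p_sleep`, `p_steps`, and so on) can be stored alongside the final bit, which is what makes the alert auditable: the reason for a given day's alert is simply the set of pillars that were `True` on that day.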

Trade-off given up

Part of the detection is given up. The chosen configuration misses roughly half of the people who end up with depression, more than a more sensitive threshold would. That cost is accepted in exchange for credibility: a system that flags too many people stops being trusted by whoever uses it, and the clinician's trust outweighs a few extra detections. The final operating-point value is left open pending a real pilot.

Real-world use

In production, the system computes every day, for each active user, whether the pillars have been in a vote state over the last five days and whether at least two of the three have converged. The result is a single bit per person and day: alert active or not. That signal can be audited by any clinician: "the system alerted because sleep and activation had been deviated for five consecutive days".

Trade-off

The 2/3 convergence is an explicit and auditable rule. What is given up is the flexibility of a model that weights the dimensions numerically and could discover unanticipated relationships. That trade-off is accepted in favor of transparency and interpretability, which are critical in a clinical domain where end-user trust drives adoption and an opaque rule hinders it. It is a choice that the next block puts to the test by checking whether any model beats the rule.

FASE 4 · BLOQUE 6 — MODELADO PHASE 4 · BLOCK 6 — MODELING Modelado con validación temporal Modeling with temporal validation

Completado Completed

Pregunta resuelta: ¿Qué arquitectura de validación (walk-forward vs. expanding window) preserva mejor la integridad temporal de los datos longitudinales y evita data leakage?

Question solved: Which validation architecture (walk-forward vs. expanding window) best preserves the temporal integrity of longitudinal data and avoids data leakage?
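The two validation schemes can be contrasted with a small split generator. This is a sketch under stated assumptions: "walk-forward" is read here as a fixed-length training window that slides forward, "expanding window" as a training set that always starts at day 0, and the function name and parameters are hypothetical. In both cases every training index precedes every test index, which is the property that prevents leakage.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def temporal_splits(
    n_days: int, train_size: int, test_size: int, expanding: bool
) -> Iterator[Split]:
    """Yield (train_days, test_days) index lists in chronological order.

    expanding=False: fixed-length window that slides forward (assumed
    meaning of "walk-forward" here).
    expanding=True: training always starts at day 0 and grows each fold.
    """
    end_train = train_size
    while end_train + test_size <= n_days:
        start_train = 0 if expanding else end_train - train_size
        train = list(range(start_train, end_train))
        test = list(range(end_train, end_train + test_size))
        yield train, test
        end_train += test_size  # advance by one test block


rolling = list(temporal_splits(10, train_size=4, test_size=2, expanding=False))
growing = list(temporal_splits(10, train_size=4, test_size=2, expanding=True))
```

With 10 days, a 4-day initial training window and 2-day test blocks, both schemes produce three folds; they differ only in whether old days are dropped from the training set.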

FASE 5 · BLOQUE 7 — EVALUACIÓN PHASE 5 · BLOCK 7 — EVALUATION Evaluación orientada a intervención: CUSUM-EWMA | BOCPD-PELT Intervention-oriented evaluation: CUSUM-EWMA | BOCPD-PELT

Completado Completed

Pregunta resuelta: ¿Cuántos días antes de la ventana clínica de riesgo detecta la señal CUSUM/EWMA el deterioro? ¿Cómo se localiza el punto exacto en el que la trayectoria conductual cambia de régimen?

Question solved: How many days before the clinical risk window does the CUSUM/EWMA signal detect deterioration? How is the exact point at which the behavioral trajectory changes regime located?
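A rough sketch of how a CUSUM or EWMA alarm and its lead time could be computed on a series of per-person z-scores follows. The parameter values (k = 0.5, h = 4, lambda = 0.2, L = 3) are common textbook defaults, not values taken from this project, and the function names and the `window_start_day` argument are hypothetical. The EWMA limit uses the asymptotic control limit for unit-variance input.

```python
import math
from typing import List, Optional


def cusum_first_alarm(z: List[float], k: float = 0.5, h: float = 4.0) -> Optional[int]:
    """One-sided upper CUSUM: S_t = max(0, S_{t-1} + z_t - k); alarm when S_t > h."""
    s = 0.0
    for day, value in enumerate(z):
        s = max(0.0, s + value - k)
        if s > h:
            return day
    return None


def ewma_first_alarm(z: List[float], lam: float = 0.2, L: float = 3.0) -> Optional[int]:
    """EWMA: E_t = lam * z_t + (1 - lam) * E_{t-1}; alarm when |E_t| exceeds
    the asymptotic limit L * sqrt(lam / (2 - lam))."""
    limit = L * math.sqrt(lam / (2.0 - lam))
    e = 0.0
    for day, value in enumerate(z):
        e = lam * value + (1.0 - lam) * e
        if abs(e) > limit:
            return day
    return None


def lead_time_days(alarm_day: Optional[int], window_start_day: int) -> Optional[int]:
    """Days between the first alarm and the start of the risk window."""
    return None if alarm_day is None else window_start_day - alarm_day


# Synthetic series: 10 in-control days, then a sustained shift of 1.5 SD.
z_scores = [0.0] * 10 + [1.5] * 10
cusum_day = cusum_first_alarm(z_scores)
ewma_day = ewma_first_alarm(z_scores)
```

Both charts accumulate evidence over consecutive days, so a sustained moderate shift triggers an alarm that a single-day threshold on the z-score would miss. Locating the exact regime change (the BOCPD/PELT part of the block) is a separate step that this sketch does not cover.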

FASE 5 · BLOQUE 8 — EVALUACIÓN PHASE 5 · BLOCK 8 — EVALUATION Análisis individual - Trayectorias por persona Individual analysis - Per-person trajectories

En curso In progress

Pregunta que resolverá: ¿Cómo se evalúa el sistema persona a persona, no solo en métricas globales? ¿Qué patrones de deterioro emergen al analizar trayectorias individuales?

Question to be solved: How is the system evaluated person-by-person, not only in global metrics? Which deterioration patterns emerge when analyzing individual trajectories?

FASE 6 · BLOQUE 9 — DESPLIEGUE PHASE 6 · BLOCK 9 — DEPLOYMENT Umbral de intervención - Decisión final (OCAP) Intervention threshold - Final decision (OCAP)

Pendiente Pending

Pregunta que resolverá: ¿Cómo se traduce la señal estadística de deterioro en una regla de disparo accionable que un clínico o una plataforma pueda usar sin necesitar entender el modelo?

Question to be solved: How is the statistical deterioration signal translated into an actionable trigger rule that a clinician or platform can use without needing to understand the model?

FASE 6 · BLOQUE 10 — PRODUCCIÓN Y DESPLIEGUE PHASE 6 · BLOCK 10 — PRODUCTION & DEPLOYMENT Producción del sistema — Arquitectura, API, contenedorización y despliegue público System productionization — Architecture, API, containerization and public deployment

Pendiente Pending

Pregunta que resolverá: ¿Cómo se transforma un sistema desarrollado en un notebook exploratorio con variables globales en un sistema modular, reproducible, contenedorizado y operable como servicio público?

Question to be solved: How is a system developed in an exploratory notebook with global variables transformed into a modular, reproducible, containerized system operable as a public service?

Capas del bloque

Layers of this block

Capa 1 — Refactorización a módulos en src/

Separación del código del notebook en módulos Python por responsabilidad: preprocessing, baseline, deviation, spc, modeling, evaluation, visualization. El notebook queda como orquestación; las funciones core viven en src/.

Separation of notebook code into Python modules by responsibility: preprocessing, baseline, deviation, spc, modeling, evaluation, visualization. The notebook remains as orchestration; core functions live in src/.

Modularización

Capa 2 — API con FastAPI (api/main.py)

Capa HTTP que expone las funciones de src/ como endpoints. Diseño alineado con la naturaleza idiográfica del proyecto: /participants/{id}/zscores, /alerts, /trajectory, /dashboard. No copia el patrón genérico de clasificador ML (POST /predict_risk con score único).

HTTP layer that exposes src/ functions as endpoints. Design aligned with the project's idiographic nature: /participants/{id}/zscores, /alerts, /trajectory, /dashboard. Does not copy the generic ML classifier pattern (POST /predict_risk with a single score).

API

Capa 3 — Aplicación Streamlit

Interfaz web que consume la API FastAPI por HTTP y renderiza resultados. Selector de participante, vistas temporales de las cuatro dimensiones con baseline rolling sobreimpreso, z-scores, disparos CUSUM/EWMA, alertas SPC con explicación SHAP, métricas globales y OCAP. La demostración interactiva del proyecto.

Web interface that consumes the FastAPI API via HTTP and renders results. Participant selector, temporal views of the four dimensions with overlaid rolling baseline, z-scores, CUSUM/EWMA triggers, SPC alerts with SHAP explanation, global metrics and OCAP. The project's interactive demo.

UI

Capa 4 — Tests con pytest

Tests automatizados sobre funciones de src/ (baseline, z-score, CUSUM, EWMA, walk-forward validation) y sobre endpoints de api/ (códigos HTTP, manejo de errores, validación de inputs). El sello de calidad que separa prototipo de código profesional defendible.

Automated tests on src/ functions (baseline, z-score, CUSUM, EWMA, walk-forward validation) and on api/ endpoints (HTTP codes, error handling, input validation). The quality seal that separates prototype from defensible professional code.
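As an illustration of the kind of test this layer refers to, the sketch below pairs a hypothetical `rolling_zscore` function with three pytest-style tests. The function, its 14-day default window and the choice to return `None` when the baseline is incomplete or flat are assumptions made for the example; the project's actual baseline logic may differ. The tests use plain `assert` statements, which pytest collects from any function whose name starts with `test_`.

```python
import statistics
from typing import List, Optional


def rolling_zscore(values: List[float], window: int = 14) -> List[Optional[float]]:
    """z-score of each day against the same person's preceding `window` days.

    The current day is excluded from its own baseline. Days without a full
    baseline, or with a zero-variance baseline, get None.
    """
    scores: List[Optional[float]] = []
    for day, value in enumerate(values):
        past = values[max(0, day - window):day]
        if len(past) < window:
            scores.append(None)
            continue
        mean = statistics.fmean(past)
        sd = statistics.stdev(past)
        scores.append(None if sd == 0 else (value - mean) / sd)
    return scores


def test_no_score_before_full_baseline():
    assert rolling_zscore([1.0, 2.0, 3.0], window=3) == [None, None, None]


def test_baseline_excludes_current_day():
    # Baseline [1, 2, 3] has mean 2 and sample SD 1, so 100 scores 98.
    scores = rolling_zscore([1.0, 2.0, 3.0, 100.0], window=3)
    assert scores[3] == 98.0


def test_flat_baseline_gives_none():
    scores = rolling_zscore([5.0, 5.0, 5.0, 9.0], window=3)
    assert scores[3] is None
```

The second test guards against a leakage bug: if the current day were included in its own baseline, an extreme value would inflate the mean and standard deviation and shrink its own z-score.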

Testing

Capa 5 — Contenedorización con Docker

Empaquetado del sistema (API + Streamlit + dependencias) en contenedores Docker. Dockerfile en la raíz, .dockerignore, docker-compose.yml para levantar ambos servicios con un único comando. Reproducibilidad y portabilidad.

Packaging of the system (API + Streamlit + dependencies) in Docker containers. Dockerfile at root, .dockerignore, docker-compose.yml to bring up both services with a single command. Reproducibility and portability.

Contenedorización

Capa 6 — Repositorio GitHub profesional

Repositorio público con README extenso (problema clínico, marco metodológico, decisiones de diseño, arquitectura, instrucciones de uso, resultados, limitaciones), requirements.txt con versiones fijadas, .gitignore correcto, estructura limpia (src/, api/, notebooks/, tests/, app.py, Dockerfile, docker-compose.yml).

Public repository with an extensive README (clinical problem, methodological framework, design decisions, architecture, usage instructions, results, limitations), a requirements.txt with pinned versions, a correct .gitignore and a clean structure (src/, api/, notebooks/, tests/, app.py, Dockerfile, docker-compose.yml).

Repositorio

Capa 7 — Despliegue público

Despliegue del sistema completo (FastAPI + Streamlit orquestados con docker-compose) en un servicio cloud compatible con contenedores (Render, Railway o Fly.io). URL pública accesible desde cualquier navegador sin instalación local.

Deployment of the complete system (FastAPI + Streamlit orchestrated with docker-compose) on a container-compatible cloud service (Render, Railway or Fly.io). Public URL accessible from any browser without local installation.

Cloud

Decisión

Decision

Ejecutar las siete capas en orden tras cerrar el Bloque 10: refactorización a módulos en src/, exposición de la lógica vía API con FastAPI, construcción de UI con Streamlit consumiendo la API, validación con tests pytest, contenedorización con Docker, publicación en GitHub y despliegue público accesible vía URL. No introducir MLOps a gran escala (Kubernetes, Airflow, MLflow, CI/CD complejo).

Execute the seven layers in order after closing Block 10: refactoring to src/ modules, exposing logic via API with FastAPI, building UI with Streamlit consuming the API, validating with pytest tests, containerization with Docker, publishing on GitHub and public deployment accessible via URL. Do not introduce large-scale MLOps (Kubernetes, Airflow, MLflow, complex CI/CD).

Renuncia

What is given up

Se renuncia a la simplicidad de mantener todo en un notebook único. Se renuncia también a una capa MLOps completa (orquestación con Airflow, tracking con MLflow, serving escalable, CI/CD avanzado), que pertenece a perfiles de ML Engineer y desviaría el foco del trabajo Health Data Scientist hacia ingeniería de plataforma sin retorno proporcional en el mercado objetivo.

The simplicity of keeping everything in a single notebook is given up. A full MLOps stack (Airflow orchestration, MLflow tracking, scalable serving, advanced CI/CD) is also given up, as it belongs to ML Engineer profiles and would shift the focus from Health Data Scientist work to platform engineering without proportional return in the target market.

Uso real

Real-world use

Cualquier persona puede abrir la URL pública y ver el sistema funcionando sin instalar nada. Cualquier desarrollador puede clonar el repositorio y levantar el entorno completo con un único comando (docker compose up). La lógica de cálculo (z-scores, SPC, alertas) queda accesible programáticamente vía API para integración futura con otros sistemas (apps clínicas, dashboards hospitalarios, EHR).

Anyone can open the public URL and see the system working without installing anything. Any developer can clone the repository and bring up the complete environment with a single command (docker compose up). The computation logic (z-scores, SPC, alerts) is accessible programmatically via the API for future integration with other systems (clinical apps, hospital dashboards, EHR).

Trade-off

Se gana posicionamiento profesional como Data Scientist aplicado moderno, no solo investigador de notebook: se demuestra separación arquitectónica de responsabilidades, reproducibilidad por contenedores y entrega del trabajo como servicio operable. Se acepta el coste temporal de aprender y aplicar FastAPI, Docker y configuración de despliegue, herramientas no usadas durante el desarrollo del notebook pero que el mercado actual de Health DS espera ver en un portfolio competitivo.

What is gained is professional positioning as a modern applied Data Scientist rather than only a notebook researcher: the project demonstrates architectural separation of responsibilities, reproducibility through containers and delivery of the work as an operable service. The time cost of learning and applying FastAPI, Docker and deployment configuration is accepted; these tools were not used during notebook development, but the current Health DS market expects to see them in a competitive portfolio.

Detección temprana de desviación conductual a partir de datos conductuales procedentes de wearables

Early detection of behavioral deviations based on behavioral data collected via wearable devices

Una consulta de Psicología dura 50 minutos.
Lo que pasa entre una cita y la siguiente, donde pasa casi todo, no entra.

A psychology session lasts 50 minutes.
What happens between one appointment and the next, where almost everything happens, never makes it into the room.

El problema

The problem

Qué hace el sistema

What the system does

Qué ganas

What you gain

Cómo está construido el proyecto

How the project is built

Conceptos claves del proyecto

Key Concepts of the Project

Diseño longitudinal intra-sujeto + Control Estadístico de Procesos

Intra-subject longitudinal design + Statistical Process Control

Diseño longitudinal intra-sujeto

Intra-subject longitudinal design

Control estadístico sobre baseline individual

Statistical control over individual baseline

Dónde estamos y adónde vamos

Where we are and where we're going

Timeline

Cada bloque, su flujo de trabajo completo

Each block, its complete workflow

¿Quieres bajar al código?

Want to dive into the code?
