The UCLA Institutional Review Board approved all study procedures (NCT03503669). All participants provided written informed consent. Participants were recruited from the UCLA Neuropsychiatric Hospital inpatient and outpatient services and from community advertising between May 2018 and February 2021. A total of 359 women were assessed for eligibility, of whom 251 declined to participate or did not meet inclusion criteria. The remaining 108 participants were screened by phone, with 29 subsequently excluded for screening cancellation/no show (n = 8), failure to meet inclusion criteria (n = 10), or dropout prior to randomization (n = 11). Of the 79 consented participants, 40 were randomized to the yoga intervention and 39 to the memory training intervention. The CONSORT diagram for the study is shown in Fig. S1.
Inclusion/exclusion criteria
Eligibility criteria were as follows: 1) age ≥ 50 years with self-reported menopause; 2) self-reported subjective cognitive decline (SCD) from the prior year’s functioning; 3) the presence of one or more cardiovascular risk factors (assessed by the Cerebrovascular Risk Factor Prediction Chart and hematologic testing), which included (a) history of myocardial infarction no less than 6 months prior, (b) prior diagnosis of diabetes, (c) current pharmacological treatment for blood pressure (>140/90), or (d) current pharmacological treatment for hyperlipidemia (LDL > 160); 4) sufficient English proficiency to comprehend the intervention instructions and materials; and 5) sufficient mental capacity to provide informed consent.
SCD is defined as the subjective experience of declining memory function despite memory function in the normal range on neuropsychological measures. We employed the criteria set by Innes and colleagues [15], which require that an individual meet all of the following: (1) self-reported memory problems within the past 6 months; (2) frequency of memory problems at least once per week; (3) ability to give an example in which memory problems occur in everyday life; (4) belief that one’s memory capacity has declined in comparison to 5–10 years previously; (5) absence of overt cognitive deficits or dementia diagnosis; and (6) concerns/worries regarding memory problems.
Exclusion criteria were as follows: (1) prior history of psychiatric illness, including psychosis, bipolar disorder, drug or alcohol dependence, or a neurological disorder; (2) surgery within the past three months or planned surgery within the next year, as well as unstable medical conditions; (3) disabilities, such as severe visual or hearing impairment; (4) insufficient English proficiency; (5) a diagnosis of dementia by Mini Mental State Examination (MMSE) [16] ≤ 23 or Clinical Dementia Rating Scale (CDR) [17] ≥ 0.5; (6) current participation in cognitive training in a therapeutic setting; (7) current treatment with a psychoactive medication; (8) prior experience with Kundalini yoga; or (9) myocardial infarction within the past 6 months. Patients were not excluded for a prior history of major depressive disorder or current antidepressant treatment.
Interventions
Kundalini yoga (KY)
The KY intervention consisted of weekly, 60-min in-person lessons with a certified KY instructor for 12 weeks. Each class of 6–10 participants followed the same structure: (1) tuning in (5 min); (2) warm up (15 min); (3) breathing techniques “Pranayama” (15 min); (4) Kirtan Kriya (KK; 12 min); (5) final resting pose “Savasana” (10 min) and closing (3 min). In addition, each participant received a CD containing a 12-min KK recording with gentle background music and guidance for the exercise sequence. Participants performed this exercise at home every day. They were instructed to chant along with their eyes closed in a seated position with their feet flat on the floor (i.e., relaxed with a straight spine), and to visualize a beam of white light entering the center of the top of the head and exiting the middle of the forehead, which is spiritually considered the third eye. While chanting, the thumb of each hand touched the other fingers sequentially (“mudras”) along with the words “Saa” (thumb touches second finger), “Taa” (middle finger), “Naa” (ring finger), and “Maa” (fifth finger). Saa Taa Naa Maa translates to “Birth, Life, Death, and Rebirth”. The first round is chanted out loud, the next round whispered, the third is thought silently, the fourth is also whispered, and the fifth round is chanted out loud again. This sequence is repeated for 11 min with the last minute of energetic integration and meditation (total 12 min). This technique is thought to engage different senses simultaneously (visualization, vocalization, motor, and sensory stimulation). Furthermore, the chanting and breathing pattern modulate respiratory muscles, lung volume, cardiovascular and autonomic nervous system functions [18].
Memory training (MET)
MET involved 12 weekly in-person group classes presented by a qualified memory training instructor. The classes aimed to teach memory strategies, while participants completed weekly homework assignments and handed them in to ascertain participant compliance. MET was developed by researchers at the UCLA Longevity Center. This MET program involves a scripted curriculum for the trainer and a companion workbook for each participant. The detailed standard protocol for MET was derived from evidence-based techniques that use verbal and visual association, as well as practical strategies for memory learning [19, 20]. MET is performed in small group sessions of 6–10 people and includes (1) education about memory; (2) introduction to memory strategies; (3) instruction of the use of specific memory strategies; (4) home practice along with logs to track activity; and (5) the discussion of non-cognitive factors, such as self-confidence, anxiety, and negative expectations. Each weekly session has the same structure; trainers (1) document the number of participants per session, engage patients in alternative treatments, and collect homework completion logs; (2) review the previous homework exercises to reinforce learned techniques; (3) teach new techniques, review, and conduct exercises in the group session; and (4) assign new homework for the following week. Participants were directed to spend approximately 20 min daily on homework and document their activity in their logs. Each group session was devoted to learning and practicing memory techniques, and 15 min were reserved for reviewing the completed homework. 
Specific techniques taught include the following: verbal associative techniques (such as the use of stories) to remember lists; organizational strategies (categorizing items on a grocery list); visual associative strategies for learning faces and names [21]; learning to implement memory habits to recall where the person placed an item and what recent activities they performed (e.g., locking doors, turning off appliances); and how they can remember future tasks (e.g., appointments).
Adherence and side effects
Staff members tracked the attendance of participants for their weekly in-person training classes. Each participant was allowed a maximum of two missed classes. Participants self-reported if they had completed their homework. Completed homework sheets were submitted to staff during class or testing sessions. Additionally, participants were asked not to participate in any other mind-body practices during the trial period, such as Tai Chi, Qi Gong, or yoga. Side effects and adverse events of interventions were monitored using the UKU Side Effect Rating Scale [22].
Assessments
Cognitive domain
A delayed recall domain score was computed from three tests: (1) Hopkins Verbal Learning Test-Revised (delayed recall), (2) Wechsler Memory Scale-IV (Verbal Paired Associates, delayed recall), and (3) Rey-Osterrieth Complex Figure delayed recall trial. An executive function domain score was calculated from two tests: (1) Stroop Interference [Golden version] [23], and (2) Trail Making Test B [24]. Trails B was reverse scored such that higher values indicate better performance. Cognitive domain assessments were completed at baseline and 24-week follow-up.
Subjective memory
The Memory Functioning Questionnaire (MFQ) assesses subjective memory functioning and consists of 64 items rated on a seven-point scale, and provides four unit-weight factor scores measuring: Factor 1, frequency of forgetting (including ratings of how often forgetting occurs in 28 specific situations and five ratings of general memory performance; 33 items); Factor 2, seriousness of forgetting (memory failure ratings from 18 different situations; 18 items); Factor 3, retrospective functioning (changes in current memory ability relative to five time points earlier in life; 5 items), and Factor 4, mnemonics usage (frequency of mnemonics usage in eight specific situations; 8 items). Higher scores indicate higher levels of perceived memory functioning, i.e., fewer forgetting incidents, less frequent use of mnemonics. Factor structure is stable across age groups and internal consistency is high, with Cronbach’s alpha values for its four factor scores ranging from 0.83 to 0.94 [25]. In the present study, we focused on two of the more commonly used factors, namely the frequency of forgetting (MFQ1), and seriousness of forgetting (MFQ2) which have been shown to more robustly reflect AD pathology than other MFQ components [26]; higher scores indicate better functioning. The MFQ was administered at baseline, 12-week, and 24-week follow-up.
Patients additionally completed the Hamilton Anxiety Rating Scale (HAM-A) [27], Connor-Davidson Resilience Scale (CD-RISC-25) [28], Perceived Stress Scale (PSS) [29], 36-Item Short Form Survey (SF-36), and Beck Depression Inventory (BDI) [30]. These assessments were administered at baseline, 12-week, and 24-week follow-up.
Outcomes
Primary outcomes of interest were changes in (1) cognitive domain scores (delayed recall, executive functioning) at 24-week follow-up compared to baseline; and (2) subjective memory (MFQ) scores at 12- and 24-week follow-up compared to baseline. Secondary outcomes examined were changes in depression (BDI), anxiety (HAM-A), perceived stress (PSS), resilience (CD-RISC-25), and health-related quality of life (SF-36, all subscales).
Statistical analysis
Data were entered at the time of collection and analyzed after completion of the trial. All data were inspected for outliers, homogeneity of variance and other assumptions to ensure their appropriateness for parametric statistical tests. Intervention groups were compared using t-tests (continuous variables) or chi-squared tests (categorical variables) on all demographic and outcomes measures at baseline. For cognitive domain scores, raw scores were z-transformed for each test according to the study sample’s mean and the z-scores were averaged within each domain to produce domain z-scores. Continuous outcomes were analyzed using a mixed effects general linear model, as implemented in SAS PROC MIXED, including treatment group, time, and the interaction between time and treatment group. Age, sex, and education (only for cognitive outcomes) were used as covariates. Significance of the interaction between time and intervention group was used to assess whether the groups differed in changes in outcome measures. Post-hoc analyses determined the significance of specific pair-wise group differences and within-group changes. Changes in test scores and statistics as well as effect sizes (Cohen’s d) for group differences are provided. All analyses were conducted using SAS 9.4 (SAS Institute, Cary, North Carolina).
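The domain-score construction described above can be sketched as follows. This is a minimal illustration in Python rather than the SAS code used in the study; the function names, the example values, the use of the sample standard deviation, and the sign flip used for reverse scoring are all assumptions made for the sketch.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize raw scores against the sample mean and (sample) SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def domain_scores(tests, reverse=()):
    """Average per-test z-scores within a domain, one value per participant.

    tests   : dict mapping test name -> list of raw scores (same participant order)
    reverse : names of tests where a lower raw score means better performance
    """
    standardized = []
    for name, raw in tests.items():
        z = z_scores(raw)
        if name in reverse:
            z = [-v for v in z]  # assumed reverse scoring: flip the sign
        standardized.append(z)
    # zip(*...) regroups the per-test lists into per-participant tuples
    return [mean(per_person) for per_person in zip(*standardized)]

# Hypothetical raw scores for four participants (illustrative values only).
executive = {
    "stroop_interference": [40, 45, 50, 55],
    "trails_b_seconds": [120, 100, 80, 60],
}
scores = domain_scores(executive, reverse={"trails_b_seconds"})
```

Because each test is standardized before averaging, tests measured in different units contribute equally to the domain score.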
Cytokine/chemokine assay & analysis
ACD-anticoagulated blood was transported at room temperature and processed within 18 h of blood draw. Whole blood was centrifuged at 2000 rpm for 10 min and plasma immediately stored at −80 °C. Human 38-plex magnetic cytokine/chemokine kits (EMD Millipore, HCYTMAG-60K-PX38, Burlington, MA) were used per manufacturer’s instructions and as previously described [31, 32]. The panel includes IL-1RA, IL-10, IL-1α, IL-1β, IL-6, IFN-α2, TNF/TNF-α, TNF-β/LT-α, sCD40L, IL-12p40, IFN-γ, IL-12/IL-12p70, IL-4, IL-5, IL-13, IL-9, IL-17A, GRO/CXCL1, IL-8/CXCL8, eotaxin-1/CCL11, MDC/CCL22, fractalkine/CX3CL1, IP-10/CXCL10, MCP-1/CCL2, MCP-3/CCL7, MIP-1α/CCL3, MIP-1β/CCL4, IL-2, IL-7, IL-15, GM-CSF, Flt-3L/CD135, G-CSF, IL-3, EGF, FGF-2, TGF-α, and VEGF. Fluorescence was quantified using a Luminex 200™ instrument (Austin, TX). Cytokine/chemokine concentrations were calculated using Milliplex Analyst software version 4.2 (EMD Millipore, Burlington, MA). Luminex assay and analysis were performed by the UCLA Immune Assessment Core. Manufacturer’s recommended quality control procedures were followed to ensure validity. Only cytokines that were undetectable in no more than 25% of samples were included in analyses. Seventeen analytes (EGF, FGF-2, eotaxin-1/CCL11, Flt-3L/CD135, IFN-γ, GRO/CXCL1, IL-10, MCP-3/CCL7, IL-12p40, MDC/CCL22, sCD40L, IL-1RA, IL-8/CXCL8, IP-10/CXCL10, MCP-1/CCL2, MIP-1β/CCL4) were identified in this manner. Cytokine concentration levels were log-transformed before analyses. Significance was set at p ≤ 0.05 for all analyses.
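The detectability filter and log transformation can be illustrated with a short Python sketch. The function name, the example concentrations, the use of None to mark undetectable samples, and the choice of natural logarithm are assumptions; the text does not state the log base or how undetectable values in retained analytes were handled.

```python
import math

def filter_and_log(analytes, max_undetectable=0.25):
    """Keep analytes with at most `max_undetectable` undetectable samples,
    then log-transform the detected concentrations.

    analytes : dict mapping analyte name -> list of concentrations,
               with None marking a sample below the detection limit
    """
    kept = {}
    for name, values in analytes.items():
        frac_missing = sum(v is None for v in values) / len(values)
        if frac_missing <= max_undetectable:
            # Natural log is an assumption; undetectable values stay None.
            kept[name] = [math.log(v) if v is not None else None for v in values]
    return kept

# Hypothetical concentrations (pg/mL) for four samples.
panel = {
    "eotaxin-1": [100.0, 120.0, None, 90.0],  # 25% undetectable -> kept
    "IL-3": [None, None, 1.0, None],          # 75% undetectable -> dropped
}
result = filter_and_log(panel)
```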
RNA-Sequencing
Sample collection & processing
Peripheral whole blood samples were collected at baseline, 12-week, and 24-week follow-up in EDTA-coated tubes. Samples were incubated in red blood cell lysis buffer, washed, pelleted, then stored at −80 °C in RNAprotect Tissue Reagent (Qiagen, Valencia, CA) until processing. Total RNA extraction and cDNA library construction were carried out by the UCLA Technology Center for Genomics & Bioinformatics (TCGB, Los Angeles, CA). A total of 156 samples were sequenced on two Illumina NovaSeq S4 lanes using 150 bp paired-end chemistry (Illumina, San Diego, CA). A total of 2.6 × 10⁹ reads were generated from 156 RNA samples (mean ± SD, 33.9 ± 4.4 million reads per sample). Prior to read trimming and quality filtering, 83% of all forward and reverse reads had an average quality score ≥ Q30 with a total aligned percentage of 71%.
Read trimming, quality filtering, and mapping
The quality of raw paired-end reads was assessed with fastqc (v0.11.8) [33] and multiqc (v1.13) [34]. Reads were evaluated for insert size, average sequence quality, and percentage GC content. Adapter removal, quality trimming and filtering (Q ≥ 20, average read quality score ≥ 25, and read length ≥ 50 bp), and base corrections were done using fastp (v0.23.2) [35]. Quality-processed reads were then re-assessed using multiqc to ensure the effectiveness of filtering. Approximately 90% of reads in each sample were ≥ Q30 after trimming and filtering. Transcript abundance for the quality-processed reads was estimated using salmon (version 1.10.1) in selective alignment mode with the “seqBias” and “gcBias” parameters using a decoy-aware reference transcriptome index (built from Ensembl GRCh38.97) [36]. Read mapping efficacy, by percentage and total reads mapped, was assessed using multiqc to ensure that a minimum of 50% of filtered reads or 10 million total reads mapped to the transcriptome index.
Differential gene expression analyses
All subsequent analyses were performed in R (v4.2.3). Transcripts per million (TPM) quantifications of transcript abundance were used to estimate gene-level pseudo-read counts with tximeta (v1.16.1), which corrects count estimates for library size and read length biases (i.e., within-sample normalization) [37]. Hierarchical cluster (single linkage) trees were used to identify outlier samples. A total of 9 samples were discarded. Quantile normalization was used for between-sample normalization using the qsmooth approach [38]. Only protein-coding genes were considered for further analysis. Genes with zero counts across all samples were removed, leaving a total of n = genes for further analysis. Differential gene expression analysis was performed using edgeR (v3.40.2) with treatment and timepoint as a combined covariate (each timepoint for each treatment defining one group level) in a negative binomial generalized linear model evaluated by a quasi-likelihood F-test [39]. Genes were considered differentially expressed if log2 fold change > 1 and FDR < 0.1.
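The final thresholding step can be sketched in Python as follows. This is not the edgeR code used in the study: the gene names and values are hypothetical, the Benjamini-Hochberg procedure is assumed as the FDR method, and applying the fold-change cutoff to the absolute value is an assumption, since the text states only "log2 fold change > 1".

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = n - rank_from_top  # 1-based rank among ascending p-values
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def call_de(genes, lfc_cutoff=1.0, fdr_cutoff=0.1):
    """Names of genes passing both cutoffs (absolute log2FC is an assumption)."""
    fdr = benjamini_hochberg([g["p"] for g in genes])
    return [
        g["gene"]
        for g, q in zip(genes, fdr)
        if abs(g["log2fc"]) > lfc_cutoff and q < fdr_cutoff
    ]

# Hypothetical test results for four genes.
results = [
    {"gene": "geneA", "log2fc": 1.5, "p": 0.001},
    {"gene": "geneB", "log2fc": -1.2, "p": 0.02},
    {"gene": "geneC", "log2fc": 0.4, "p": 0.03},  # fails the fold-change cutoff
    {"gene": "geneD", "log2fc": 2.0, "p": 0.5},   # fails the FDR cutoff
]
hits = call_de(results)
```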
To assess for differential expression in a threshold-free manner, a stratified Rank-Rank Hypergeometric Overlap (RRHO) test using the RRHO2 package (v1.0) was performed [40]. Differential gene expression lists from the edgeR analysis were ranked by their log2 fold change values. The RRHO test calculates a p-value for each rank pair, representing the probability of observing the overlap by chance. The rank-rank plot displays the extent and significance of the overlap between the two gene lists. Discordant genes were examined for enrichment using the enrichR package (v3.2) using the GTEx_Aging_Signatures_2021 database [41, 42]. Results were ranked based on the adjusted p-values (Benjamini-Hochberg method) and the combined score, a measure that considers both the p-value and the z-score for each enriched term. A term was deemed statistically significant if the adjusted p-value was less than 0.05 and the combined score was greater than 1.
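The p-value computed for a single rank pair can be illustrated with a plain hypergeometric tail probability. The sketch below covers only the basic one-sided over-enrichment case for the top of two ranked lists; it is not the stratified procedure implemented in RRHO2, and the function name and gene labels are hypothetical.

```python
from math import comb

def overlap_pvalue(list_a, list_b, i, j):
    """Upper-tail hypergeometric p-value for the overlap between the top i
    genes of list_a and the top j genes of list_b.

    Both lists must be rankings of the same set of genes.
    """
    n = len(list_a)
    k = len(set(list_a[:i]) & set(list_b[:j]))  # observed overlap
    total = comb(n, j)
    # P(overlap >= k) when j genes are drawn at random from n, i of them "marked"
    tail = sum(comb(i, x) * comb(n - i, j - x) for x in range(k, min(i, j) + 1))
    return tail / total

# Hypothetical rankings of ten genes that share the same top three.
ranked_a = [f"g{n}" for n in range(10)]
ranked_b = ["g2", "g0", "g1"] + [f"g{n}" for n in range(3, 10)]
p = overlap_pvalue(ranked_a, ranked_b, 3, 3)
```

Repeating this calculation over a grid of (i, j) rank pairs yields the matrix of p-values that the rank-rank plot visualizes.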
Weighted gene co-expression network analysis (WGCNA)
WGCNA was performed to identify co-expressed gene modules and investigate their association with phenotypic traits. The analysis was carried out using the WGCNA package (v1.72.1) in R [43]. To reduce noise, the top 5000 genes by coefficient of variation were selected from the normalized counts (CV = (standard deviation / mean) × 100%). To construct the signed co-expression network, a pair-wise Pearson’s correlation matrix was calculated across all samples for each gene pair. The correlation matrix was subsequently transformed into an adjacency matrix using a power adjacency function, with the soft-thresholding power (β) selected based on the criterion of approximate scale-free topology (R² > 0.8). This was achieved using the pickSoftThreshold function provided by the WGCNA package, which computes the power that best satisfies the scale-free topology criterion while preserving the connectivity of the network. Next, the adjacency matrix was transformed into a topological overlap matrix (TOM) to capture the relative interconnectedness of genes within the network. The TOM-based dissimilarity matrix (1-TOM) was used for average linkage hierarchical clustering, and gene modules were identified using the dynamic tree cut algorithm with the following parameters: minimum module size of 30 genes, deepSplit = 4, and cut height = 0.2. Module eigengenes (MEs) were calculated as the first principal component of each module’s expression data, which represents the overall gene expression profile of the module. To associate the gene modules with phenotypic traits, Pearson’s correlation coefficients were calculated between MEs and the traits of interest. Modules with significant correlations (p < 0.1) were considered associated with the given traits.
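Two of the steps above, gene selection by CV and the signed power adjacency, can be sketched in plain Python. This is an illustration rather than the WGCNA R code used in the study; the function names and expression values are hypothetical, and β = 12 is an assumed value because the selected soft-thresholding power is not reported in the text.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV = (standard deviation / mean) x 100%."""
    return stdev(values) / mean(values) * 100.0

def top_by_cv(expr, k):
    """Names of the k genes with the largest CV across samples."""
    return sorted(expr, key=lambda g: coefficient_of_variation(expr[g]), reverse=True)[:k]

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def signed_adjacency(x, y, beta):
    """Signed-network adjacency: ((1 + cor) / 2) ** beta."""
    return ((1.0 + pearson(x, y)) / 2.0) ** beta

# Hypothetical normalized counts for four genes across four samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [5.0, 5.0, 5.1, 5.0],  # nearly constant, so lowest CV
}
selected = top_by_cv(expr, 3)
beta = 12  # assumed soft-thresholding power, for illustration only
a_pos = signed_adjacency(expr["g1"], expr["g2"], beta)  # perfectly correlated pair
a_neg = signed_adjacency(expr["g1"], expr["g3"], beta)  # perfectly anti-correlated pair
```

In a signed network, strongly anti-correlated genes receive an adjacency near zero rather than near one, so modules group genes that move in the same direction.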
RESULTS
KY and MET are well-tolerated interventions in postmenopausal women with cardiovascular risk factors and subjective cognitive impairment
The baseline demographic, clinical, and cognitive characteristics of the randomized sample (N = 79) are summarized in Table 1 by treatment group, either Kundalini yoga (KY, N = 40) or Memory Enhancement Training (MET, N = 39). The mean age of all participants at baseline was 66.5 (SD = 9.2) years, mean BMI was 27.2 (SD = 6.0), mean CVRF was 10.1 (SD = 4.6), and mean MMSE was 28.4 (SD = 1.4). At baseline, treatment groups did not differ significantly in age, race, years of education, BMI, CVRF, HAM-A, MFQ, CD-RISC, PSS, SF-36, or BDI. Two participants (2.5% of the sample, 1 KY and 1 MET) met criteria for MCI at baseline (defined as scoring >1 standard deviation below normal on Hopkins Verbal Learning Test-Revised or Rey-Osterrieth delayed recall). No significant differences were noted in cognitive domain scores between the two treatment groups. Twenty-six KY (65%) and 37 MET (95%) participants completed the trial and post-treatment assessment at 6 months (χ2(1) = 10.9, p < 0.001). Pre-intervention dropout rate did not significantly differ between the two arms (5 KY (12.5%) and 1 MET (2.6%), χ2(1) = 2.8, p = 0.1), but differences were noted in discontinuation during intervention (9 KY (25.7%) and 1 MET (2.6%), χ2(1) = 10.9, p < 0.001). Tolerability and number of side effects also did not differ. Class attendance for the two treatment arms was comparable.
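As a worked check of the completion comparison, the statistic can be recomputed from the reported counts (26 of 40 KY and 37 of 39 MET completers). The sketch below uses the uncorrected Pearson chi-squared formula for a 2 × 2 table; that the original analysis omitted a continuity correction is an assumption, although the result agrees with the reported value.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction)
    for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Rows: KY, MET. Columns: completed, did not complete (counts from the text).
chi2 = chi_square_2x2(26, 40 - 26, 37, 39 - 37)
print(round(chi2, 1))  # 10.9
```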
KY participants experienced long-term benefits in subjective memory measures compared to MET participants but reduced delayed recall
Changes in all outcome measures at 12 weeks and 24 weeks from baseline for the two study arms, as well as between-group and within-group statistics, are presented in Table 2, and estimated effect sizes (Cohen’s d) with associated 95% confidence intervals are presented in Table 3. At 12-week and 24-week follow-up, both interventions demonstrated improvement in frequency of forgetting (MFQ-Factor 1). Between-group differences, however, were not significant (F(1, 76) = 0.2, p = 0.7). At 24 weeks, KY participants demonstrated between- and within-group improvements in seriousness of forgetting/MFQ-Factor 2 (KY mean change (SD) = 0.65 (1.25), t(76) = 2.1, p = 0.04; MET mean change (SD) = −0.31 (1.35), t(76) = −0.9, p = 0.4; F(1, 76) = 4.9, p = 0.03; effect size (95% confidence interval) = −0.73 (−1.26, −0.19)). KY participants demonstrated between- and within-group decline in delayed recall scores at 24 weeks (KY mean change (SD) = −0.31 (0.37), t(76) = −3.8, p = 0.0003; MET mean change (SD) = 0.02 (0.55), t(76) = 0.5, p = 0.6; F(1, 76) = 10.3, p = 0.002; effect size (95% confidence interval) = 0.69 (0.17, 1.21)). Executive functioning, however, showed no between- or within-group differences (F(1, 76) = 0.8, p = 0.4). Removing the two participants with MCI did not change the direction or significance of any results (data not shown). Significant differences were not observed among secondary outcomes at 12 or 24 weeks, within or between groups, save for a within-group decrease in SF-36 Role limitations (emotional) subscore for MET only at 24-week follow-up.
Subjective cognitive decline measures associate with underlying gene expression signatures at baseline
Weighted gene co-expression network analysis (WGCNA) was conducted on baseline gene expression from all participants to examine whether distinct gene expression signatures underlie outcome measures at baseline. Prior to the analysis, the suitability of the data for WGCNA was assessed by examining the scale-free topology index (R²), which was greater than 0.8, indicating a scale-free network structure (Fig. 1A, B). The subjective memory outcomes, MFQ1 (frequency of forgetting) and MFQ2 (seriousness of forgetting), demonstrated significant association with 8 modules, with the overall pattern of correlation to modules being similar for both measures (Fig. 1C). The genes from these modules were combined and enrichment analysis was performed using the MSigDB Hallmark Pathways dataset. The analysis revealed that the modules associated with the subjective memory measures were enriched for pathways related to TNF-alpha signaling, inflammatory response, KRAS signaling, interferon gamma response, apoptosis, and IL-2/STAT5 signaling (Fig. 1D).
KY participants demonstrate reversal of aging-associated gene expression signatures
To examine differences in gene expression induced by the two interventions, rank-rank hypergeometric overlap (RRHO) analysis was performed to identify discordant gene expression patterns in response to KY and MET treatments at 12 and 24 weeks following intervention initiation compared to baseline (Fig. 2A, D). The RRHO test is a robust, threshold-free method for comparing ranked gene lists, as it considers the rank order of genes while estimating the significance of the overlap between the lists, accounting for the global patterns of gene expression changes. A total of 1123 genes were expressed in a discordant fashion between treatments at 12-week follow-up (307 repressed following KY but overexpressed with MET; 816 overexpressed following KY intervention but repressed with MET), and 1338 genes were discordant at 24 weeks (500 repressed following KY but overexpressed following MET; 838 overexpressed following KY intervention but repressed with MET). Enrichment analysis was performed using the Genotype-Tissue Expression (GTEx) aging signatures database, which characterizes patterns of gene expression at progressively increasing ages compared to a baseline of 20–29 years [9, 44].
At 12 weeks and 24 weeks post-intervention (Supplemental Table S1), discordant genes demonstrated significant enrichment (FDR < 0.05) of aging signatures expressed in the opposing direction of that observed during aging (e.g., genes upregulated in older age were repressed following KY intervention, or vice versa; Fig. 2B, C, E, F). At 12 and 24 weeks post-treatment, discordant genes were enriched for the 70–79 aging signature in a pattern opposing that observed following the KY intervention (12 W: Combined Score = 25.2, adjusted p-value = 0.007; 24 W: Combined Score = 42.4, adjusted p-value < 0.0001). Thus, at 12-week follow-up, KY participants demonstrated significant downregulation of SEC14L3, CPB2, IFNG, ANKRD33, SAA4, CCL4, CCL3, APOA1, KIR3DL1, AKR1C4, BAAT, and SLC38A3, which are upregulated in the 70–79 aging signature and significantly upregulated in MET participants compared to baseline. At 24-week follow-up, KY participants demonstrated significant downregulation of PAQR9, IL22, OR52H1, CCL4L2, ANKRD33, CCL3L3, SAA4, APOA1, BAAT, F2, BARX1, CXCL12, C9, CFHR3, CCL4, CCL3, HAO1, CTSE, SLC38A3, and ACTRT3, which are upregulated in the 70–79 aging signature and significantly upregulated in MET participants compared to baseline. MET discordant genes were not significantly enriched for any aging signature at 12-week follow-up. At 24-week follow-up, MET participants displayed downregulation of the 40–49 upregulated aging signature (Combined Score = 18.6, adjusted p-value = 0.003) and upregulation of the 30–39 downregulated aging signature (Combined Score = 12.8, adjusted p-value = 0.004). Differentially expressed genes using a conventional threshold-based approach were also reviewed (Supplementary Fig. S2).
MET, but not KY, participants demonstrated increased levels of the aging-associated chemokine eotaxin-1
No differences in cytokine/chemokine concentrations in peripheral blood were detected between interventions at baseline (Table 4). At 12- and 24-week follow-up, MET participants displayed a significant increase in eotaxin-1 concentrations (MET 12 W: t(67) = −2.12, p = 0.04; MET 24 W: t(67) = −2.12, p = 0.04). Levels in KY participants were unchanged. The between-group difference was significant (F(2,67) = 3.94, p = 0.02). Both MET and KY participants demonstrated FGF increases at 12-week follow-up (KY: t(67) = −2.5, p = 0.01; MET: t(67) = −2.4, p = 0.02), but the between-group difference was not significant. No baseline cytokine concentrations were predictive of changes in cognitive domain scores at 24-week follow-up or MFQ scores at 12- or 24-week follow-up.