Efficacy of adaptive cognitive training through desktop virtual reality and paper-and-pencil in the treatment of mental and behavioral disorders

Cognitive deficits are a core feature of mental and behavioral disorders, leading to poor treatment adherence and functionality. Virtual reality (VR) methodologies are promising solutions for cognitive interventions in psychiatry because they provide greater ecological validity. This study assessed and compared two content-equivalent cognitive training (CT) interventions, delivered in desktop VR (Reh@City v2.0) and paper-and-pencil (Task Generator (TG)) formats, in patients with mental and behavioral disorders. Thirty patients were randomly assigned to either the Reh@City v2.0 group or the TG group. Both groups underwent a time-matched 24-session intervention. Neuropsychological assessments were performed at baseline, post-intervention, and follow-up. A within-groups analysis revealed significant improvements in visual memory and depressive symptomatology after the Reh@City intervention. The TG group improved in processing speed, verbal memory, and quality of life (social relationships and environmental domains). Between groups, Reh@City led to a greater reduction in depressive symptomatology, whereas the TG group showed greater improvements in the social relationships domain of quality of life. At follow-up, previous gains were maintained and new improvements were found in the Reh@City (global cognitive function, language, visuospatial and executive functions) and the TG groups (attention). The Reh@City significantly reduced depressive symptomatology, and the TG led to greater improvements in processing speed, abstraction, and the social relationships domain of quality of life at follow-up. Both interventions were associated with important cognitive, emotional, and quality of life benefits, which were maintained after two months. Reh@City and TG should be considered as complementary CT methods for patients with mental and behavioral disorders. Trial registration: The trial is registered at ClinicalTrials.gov, number NCT04291586.


Introduction
Around 792 million people worldwide live with a mental disorder (Ritchie and Roser 2018). In Europe, it is estimated that each year 164.8 million people are affected by mental disorders, with anxiety and major depression being two of the most common conditions (Wittchen et al. 2011). In 2010 the American Psychiatric Association estimated that neuropsychiatric disorders, which encompass mental and behavioral disorders and neurological disorders, are the third leading cause of disability worldwide, corresponding to over 10% of the global burden (American Psychiatric Association 2010). In a large-scale community epidemiological survey (n = 73,441) conducted in 15 countries, most respondents attributed higher levels of disability to mental disorders than to physical disorders (Ormel et al. 2008).
It is well established that cognitive impairment is a core clinical feature of most psychiatric conditions (Rock et al. 2014; Robinson et al. 2006; McIntyre et al. 2013). Although specific symptoms of psychiatric disorders can be mitigated by current pharmacological treatments (e.g., depression, anxiety, delusions), cognitive deficits tend to be persistent, occurring in both acute and remitted stages of the disorder (Millan et al. 2012). Additionally, mood symptoms (e.g., depression and anxiety) have the potential to exacerbate cognitive deficits (Iosifescu 2012). According to Reichenberg et al. (2009), all patients from four different diagnostic groups, namely schizophrenia (n = 94), schizoaffective disorder (n = 15), bipolar disorder (n = 78) and major depression (n = 48), exhibited multidomain cognitive impairment (e.g., processing speed, attention, memory and executive functions). A more in-depth analysis revealed that 67% and 53% of people with schizoaffective disorder and schizophrenia, respectively, were classified as neuropsychologically impaired, whereas the degree of impairment was consistently lower among people with major depressive disorder (35%) and bipolar disorder (23%). Previous studies indicate that cognitive deficits are a robust predictor of treatment response and functional ability (e.g., activities of daily living (ADLs)) in various psychiatric conditions, and thus an essential target for intervention (Groves et al. 2018; Harvey 2011; McCleery and Nuechterlein 2019; Green 2006).
Cognitive training (CT) interventions, including both computerized and paper-and-pencil programs, have proven to be an effective method to improve cognitive, emotional, and functional outcomes in patients with psychiatric conditions (Grynszpan et al. 2011; Kim et al. 2018). Regarding computerized CT in particular, a meta-analysis revealed small to large effects on cognitive function (e.g., attention, working memory, and global cognitive function), depressive symptomatology, and daily functioning in patients with major depression (Motter et al. 2016). In another meta-analysis conducted among patients with schizophrenia-spectrum disorders, results indicated small to moderate effects of computerized CT on processing speed, memory, verbal fluency, and verbal and visual learning. However, almost no effect on social cognition and functional outcome measures was found (Prikken et al. 2019). Unfortunately, inconsistent results in the field of CT are frequent and typically due to methodological issues, such as small sample sizes and diversity of CT programs in terms of target domains, duration, frequency, structure, types of exercises, and outcome measures.
In the last few years, virtual reality (VR) has emerged as a valuable approach in the context of neurological and psychiatric disorders. In a recent literature overview about the application of VR in the treatment of psychiatric disorders, Park et al. (2019) conclude that VR can be used with different therapeutic objectives, such as exposure therapy (traumatic and anxiety-inducing situations), social skills training and medication management skills, in a wide range of disorders, namely phobia, posttraumatic stress disorder (PTSD), and schizophrenia. Another systematic review conducted by Cieślik et al. (2020) indicates that VR approaches are also useful for the assessment and treatment of bulimia and binge eating disorders, neurodevelopmental disorders, and neurocognitive disorders. Focusing specifically on the application of VR-based approaches for CT purposes, few studies assess the impact of these kinds of interventions in cognitive and non-cognitive domains in patients with mental and behavioral disorders. VR-based interventions have numerous advantages over traditional paper-and-pencil methods, namely: the systematic and hierarchical presentation of stimuli and challenges; adaptation and personalization of training content to the patient's cognitive profile; immediate feedback; gaming elements to enhance motivation and engagement; increased ecological validity and, possibly, greater transfer of gains to everyday life (Gamito et al. 2015; Parsons 2016). Previous systematic reviews highlight VR's positive impact on cognitive and non-cognitive outcomes in patients with neurological disorders (Coyle et al. 2015; Maggio et al. 2019). In addition, prior studies emphasize the beneficial effect of combining VR-based CT with more conventional CT methods on cognitive function of stroke and brain tumor patients (Kim et al. 2011; Yang et al. 2014).
There are many VR environments (e.g., cities, malls, kitchens, supermarkets) devised for CT, which comprise simulations of ADLs that allow the training of multiple cognitive domains simultaneously with the aim of promoting not only cognitive function but also functional abilities (Oliveira et al. 2020; Rand et al. 2005). However, most of these VR-based CT interventions are administered in the context of neurological conditions. For instance, a study from Zając-Lamparska et al. (2019) evaluated the effectiveness of VR-based CT using the GRADYS game in healthy older adults and older adults with mild dementia. Findings indicated that the GRADYS game was more effective in healthy older adults, even though both groups showed progress in the course of training, with mild dementia patients exhibiting less progress. Maier et al. (2020) conducted a study with chronic community-dwelling stroke patients to assess a VR program that provided multidimensional CT. Results revealed positive influences of the VR program on attention, spatial awareness, and depressive symptomatology. Gamito et al. (2015) assessed the effectiveness of a VR-based CT application that encompassed several daily activities on stroke patients in comparison with a waiting list control group. Results showed significant improvements in attention and memory in the VR-based CT intervention. Another example of a virtual environment is the Reh@City, which consists of a virtual city with several ADL simulations clinically validated for stroke CT.
Some Reh@City studies had the following aims: (1) to evaluate the efficacy of the Reh@City v1.0 compared to conventional rehabilitation (CT exercises administered by an occupational therapist) in several outcome measures; (2) to compare patients' performance in the Reh@City and in its paper-and-pencil version (Task Generator (TG)); and (3) to evaluate the short-term and long-term effectiveness of the Reh@City v2.0 compared to the TG, in various primary and secondary domains. In the first study, Reh@City v1.0 proved to be more effective than conventional rehabilitation in overall cognitive functioning, processing speed, attention, memory, visuospatial abilities, executive functions, and self-reported general health status (e.g., strength and mobility, memory, emotion, social participation). Regarding the second study, Faria et al. (2019) compared stroke patients' performance in the Reh@City and the TG, two content-equivalent CT tools based on the same difficulty adaptation progression framework. The authors concluded that there was no effect of training methodology on patients' overall performance, which means that both groups of patients performed at the same level irrespective of the CT method employed. Finally, in a more recent study, Faria et al. (2020) performed a one-month longitudinal randomized controlled trial (RCT) with 42 chronic stroke patients comparing the Reh@City v2.0 with the TG. The results demonstrated that Reh@City v2.0, an ecologically valid intervention, showed higher effectiveness, with improvements in different cognitive domains (e.g., global cognitive functioning, attention, visuospatial abilities, executive functions, processing speed and verbal memory) and in self-perceived cognitive deficits in everyday life, whereas the TG retained fewer cognitive gains (orientation, processing speed and verbal memory) that were maintained at follow-up.
As to the use of VR for CT purposes in psychiatric disorders, there are considerably fewer studies assessing its effectiveness, and the existing research is conducted mostly among schizophrenia patients. A pilot study from Marques et al. (2008) evaluated the effectiveness of the Integrated Virtual Environment for Cognitive Rehabilitation (AVIRC) and an adaptation of the Virtual Environments for panic disorder (VEPD) in 14 patients with schizophrenia. The AVIRC consisted of a virtual city with streets, houses, shops, a church, and a supermarket, and the VEPD encompassed a coffee house, an urban context, and a supermarket. Their results suggested that the VR intervention led to improvements in several cognitive measures, such as processing speed, attention, perceptual organization, working memory, verbal comprehension, and executive functions, and contributed to the reduction of relapse, re-hospitalization, and drop-out rates. Chan et al. (2010) examined the impact of a VR-based CT program adapted from the Interactive Rehabilitation Exercise System (IREX) in older adults with chronic schizophrenia. Results indicated that the VR group showed significant improvements in global cognitive function (assessed with the Mini Mental State Examination (MMSE)). Similarly, Plagia et al. (2013) found that schizophrenia patients enrolled in a VR-based CT intervention showed an increase in global cognitive function, sustained attention, and executive function measures, as opposed to the control condition, which only revealed gains in sustained attention.
Due to the scarcity of research in this field, we intend to contribute with evidence on the effectiveness of desktop VR-based CT compared to adaptive paper-and-pencil CT. Therefore, the main aims of the present study are to assess and compare the impact of two CT interventions initially developed and clinically validated for stroke CT (the Reh@City v2.0 and the TG) in a heterogeneous sample of patients with mental and behavioral disorders from a long-term care psychiatric setting. To the best of our knowledge, this is the first two-month RCT investigating the impact of two content-equivalent CT tools, developed under the same adaptation and personalization framework, and delivered in different formats in this clinical population.

Participants and trial design
The recruitment of participants was carried out at Casa de Saúde Câmara Pestana (CSCP) (Funchal, Madeira), which is a female mental health institution that belongs to the Instituto das Irmãs Hospitaleiras do Sagrado Coração de Jesus. For this study, we defined the following inclusion criteria: (a) attending the psychosocial rehabilitation program; (b) having sufficiently preserved expressive and receptive language abilities in order to communicate difficulties when facing demanding tasks and following instructions; (c) maintaining visual and auditory acuity; and (d) being motivated to participate. Patients experiencing an acute psychiatric episode were excluded. All patients completed informed consent prior to participation.
The study was approved by the CSCP's Ethics Committee (reference number: 1/2020) and registered at ClinicalTrials.gov (number NCT04291586).
A total of 30 patients diagnosed with mental and behavioral disorders, according to the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) (WHO 2010), met the inclusion criteria. They were further allocated to one of two interventions (Reh@City v2.0 or TG) by the psychologists involved in data collection (Fig. 1). The randomization process was conducted using a free web-based resource named Research Randomizer (https://www.randomizer.org/). Concerning patients' allocation to the CT interventions, 15 patients were randomly assigned to the Reh@City v2.0 group (3 were lost at follow-up) and 15 to the TG group (1 dropped out after the baseline assessment and 2 were lost at follow-up).

Intervention protocol
The study was run between June 2019 and February 2020. We conducted a single-blind RCT comparing the impact of the two interventions, Reh@City v2.0 and TG. All patients involved in this study were also enrolled in the CSCP's psychosocial rehabilitation program. The ultimate goals of the CSCP's psychosocial rehabilitation program are to increase autonomy, by enhancing cognitive, emotional, social, and functional abilities, and to facilitate social reintegration. This program was devised for patients with clinically stable conditions, irrespective of their mental and behavioral disorder, and focuses on helping them to learn or relearn essential skills for independent living (e.g., communication, social interaction, basic and instrumental activities of daily living (IADLs), and occupational attainment) so that they can reintegrate into the community. In addition, this program encompassed psychotherapy and monthly multidisciplinary meetings to assess patients' current situation and progress.
All patients went through a detailed neuropsychological assessment at baseline, post-intervention, and two-month follow-up. The baseline assessments were performed a week before the beginning of the intervention. Psychologists delivered two 30 min CT sessions per week for three months. After the 24th CT session, all participants were assessed within a week. To evaluate the long-term impact of both interventions, a follow-up neuropsychological assessment was performed two months after the end of both interventions. Each assessment session lasted approximately one hour and thirty minutes.
Before starting the intervention, all patients from both groups participated in a brief individual training session that lasted between 10 and 15 min, in order to get familiar with the CT tools. Concerning patients in the Reh@City v2.0 group, the initial session was devoted to a short training in the Reh@City v2.0 platform, where patients interacted with the software using the joystick. As to the patients in the TG group, the initial training session consisted of performing some of its paper-and-pencil tasks. After the initial training session, the two groups underwent 24 time-matched sessions of CT supervised by certified psychologists.

Task Generator: paper-and-pencil intervention
The TG is a free web-based tool that generates personalized paper-and-pencil CT tasks in a PDF format that are tailored to the user's cognitive profile. The TG comprises 11 different CT tasks, namely: cancellation, numeric sequences, problem solving, association, comprehension of contexts, image pairs, word search, mazes, categorization, action sequencing, and memory of stories and pictures. These tasks were selected, and models for their personalization were created through a participatory design process involving rehabilitation experts (Faria et al. 2018). In order to personalize the CT program through the TG, the psychologist only needs to access the TG website. In the website, it is necessary to complete the following steps to personalize the training content: (1) perform task parameterization (i.e., definition of the attention level, memory level, executive function level,

Reh@City v2.0: desktop VR-based intervention
The Reh@City v2.0 consists of a virtual city with streets, sidewalks, and buildings, where participants are required to perform several CT tasks that resemble common ADLs, in eight different locations (e.g., pharmacy, supermarket, bank, home) (Paulino et al. 2019). In this virtual city, participants are presented with specific errands they need to run, such as: baking cookies at home (Fig. 3a); collecting jewelry items in the store (Fig. 3b); buying groceries in the supermarket (Fig. 3c); and paying the supermarket's bill (Fig. 3d).
During task performance, participants can access different information by pressing specific buttons on the keyboard, namely: task instructions (e.g., go to the post-office); a mini-map and/or a street arrow, illustrating the optimal navigation route to reach the intended location; and time and point counters, which are used as visual feedback elements related to the accomplishment of task objectives. Regarding the point system, participants accumulate points whenever they complete an intermediate task (+1) and the overall task (+20), and lose points (-1) when a mistake is made or the help button is used. To enhance the relatedness of the VR tasks to the real world, the eight locations display billboards and products that are found in Portugal's retail stores. Reh@City v2.0 tasks are ecologically valid desktop VR versions of the same paper-and-pencil tasks that compose the TG (see Table 1). Despite the absence of a desktop VR version of the comprehension of contexts and the word search tasks, we did not remove these two paper-and-pencil tasks. Specific computational models were developed after a participatory design study involving 20 rehabilitation experts to generate both interventions' content and to adjust difficulty parameterization (Faria et al. 2018). Removing CT content at this point could hamper the construct validity and methodological foundations of both tools. Faria et al. (2019) compared both interventions and found that patients had a similar performance irrespective of the CT method employed. In former studies (Faria et al. , 2020), all tasks were kept. Therefore, these two paper-and-pencil tasks were maintained to preserve the validity of the comparison.
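The point system described above can be illustrated with a minimal Python sketch. Only the point values (+1, +20, -1) come from the text; the class and method names are hypothetical, and the sketch assumes that the counter is allowed to fall below zero, which the text does not specify. It is not the Reh@City v2.0 implementation.

```python
from dataclasses import dataclass

# Point values taken from the description of the point system.
INTERMEDIATE_TASK_POINTS = 1
OVERALL_TASK_POINTS = 20
PENALTY_POINTS = 1


@dataclass
class PointCounter:
    """Hypothetical point counter used as visual feedback."""
    points: int = 0

    def intermediate_task_completed(self) -> None:
        self.points += INTERMEDIATE_TASK_POINTS

    def overall_task_completed(self) -> None:
        self.points += OVERALL_TASK_POINTS

    def mistake_made(self) -> None:
        # Assumption: no lower bound is applied to the score.
        self.points -= PENALTY_POINTS

    def help_used(self) -> None:
        self.points -= PENALTY_POINTS


# Example errand: three intermediate steps, one mistake, one help request,
# then completion of the overall task.
counter = PointCounter()
for _ in range(3):
    counter.intermediate_task_completed()
counter.mistake_made()
counter.help_used()
counter.overall_task_completed()
print(counter.points)  # prints 21
```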

Personalization procedure
Both TG and Reh@City v2.0 tools, besides sharing the same CT content, are personalized in the same manner, i.e., through the definition of five parameters, namely attention, memory, language, executive functions, and overall difficulty level. To personalize both interventions, we administered the Montreal Cognitive Assessment (MoCA) to all patients with the goal of determining their global cognitive profile. Then, different parameters were established as follows: Attention parameter: MoCA's attention domain score (0-6 points); Memory parameter: MoCA's memory domain score (i.e., delayed recall and spatial and temporal orientation) (0-11 points); Executive functions parameter: MoCA's visuospatial, executive and abstraction domains scores (0-7 points); Language parameter: MoCA's naming and language domains scores (0-6 points); Difficulty parameter: MoCA's total score (0-30 points). All CT tasks have a different multi-domain cognitive profile, defined through the various parameters, namely the attention, memory, language, executive functions, and difficulty parameters. This means that the generated CT tasks do not train a specific cognitive domain but tackle all the above-mentioned domains to different extents, according to the established difficulty level.
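As an illustration of the parameter definitions above, the following Python sketch derives the five parameters from MoCA sub-domain scores. The function and class names are hypothetical, and the sketch assumes that the total score is the plain sum of the sub-domain scores (no education correction is modeled); it is not the code used by either tool.

```python
from dataclasses import dataclass

# Maximum score of each MoCA sub-domain (standard MoCA scoring, total 30).
MOCA_MAXIMA = {
    "attention": 6,
    "delayed_recall": 5,
    "orientation": 6,
    "visuospatial_executive": 5,
    "abstraction": 2,
    "naming": 3,
    "language": 3,
}


@dataclass(frozen=True)
class TrainingParameters:
    """Hypothetical container for the five personalization parameters."""
    attention: int            # 0-6
    memory: int               # 0-11
    executive_functions: int  # 0-7
    language: int             # 0-6
    difficulty: int           # 0-30


def parameters_from_moca(scores: dict) -> TrainingParameters:
    """Group MoCA sub-domain scores into the five parameters."""
    for domain, maximum in MOCA_MAXIMA.items():
        if not 0 <= scores[domain] <= maximum:
            raise ValueError(f"{domain} must be between 0 and {maximum}")
    attention = scores["attention"]
    memory = scores["delayed_recall"] + scores["orientation"]
    executive = scores["visuospatial_executive"] + scores["abstraction"]
    language = scores["naming"] + scores["language"]
    # Assumption: total score = sum of sub-domains, without education bonus.
    difficulty = attention + memory + executive + language
    return TrainingParameters(attention, memory, executive, language, difficulty)


example = parameters_from_moca({
    "attention": 4, "delayed_recall": 2, "orientation": 5,
    "visuospatial_executive": 3, "abstraction": 1,
    "naming": 3, "language": 2,
})
print(example)
```

In this made-up example the patient would receive attention = 4, memory = 7, executive functions = 4, language = 5, and difficulty = 20.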
Consistent with previous work, after every training session, a mean performance score was calculated using a 0-100% scale, and the difficulty for the next set of tasks was established with the following reasoning: (a) the difficulty parameter was decreased by 0.5 if the mean performance was below 50%; (b) the difficulty parameter was increased by 0.5 if the mean performance was higher than 70%; and (c) the difficulty was maintained if the mean performance ranged from 50 to 70% (Faria and Bermúdez i Badia 2018). In the TG, the personalization and adaptation processes were done manually. The Reh@City v2.0 presents additional features, when compared to the TG, namely the automatic normalization of the different parameters and adjustment of the desktop VR CT tasks based on the participant's performance, as well as the ability to save the participant's performance from session to session.
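The adaptation rule above can be written as a short Python function. This is a sketch under stated assumptions: the function name is hypothetical, and no lower or upper bound is applied to the difficulty value because the text does not give one.

```python
STEP = 0.5             # change applied to the difficulty parameter
LOWER_THRESHOLD = 50.0  # mean performance (%) below which difficulty drops
UPPER_THRESHOLD = 70.0  # mean performance (%) above which difficulty rises


def next_difficulty(current: float, mean_performance: float) -> float:
    """Return the difficulty for the next session's set of tasks."""
    if not 0.0 <= mean_performance <= 100.0:
        raise ValueError("mean_performance must be on a 0-100% scale")
    if mean_performance < LOWER_THRESHOLD:
        return current - STEP  # rule (a)
    if mean_performance > UPPER_THRESHOLD:
        return current + STEP  # rule (b)
    return current             # rule (c): 50-70% inclusive


print(next_difficulty(5.0, 42.0))  # prints 4.5
print(next_difficulty(5.0, 65.0))  # prints 5.0
print(next_difficulty(5.0, 88.0))  # prints 5.5
```

The boundary values 50% and 70% fall under rule (c), so the difficulty is unchanged at exactly those scores.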

Task Generator (TG)
The TG consists of an online application that is freely available at https://neurorehabilitation.m-iti.org/TaskGenerator/. On this website, it is possible to personalize the content of the CT tasks through the definition of the five aforementioned parameters (obtained from MoCA). After this step, it is possible to generate a set of 11 personalized CT tasks, presented in a PDF file, that are to be downloaded and then printed. Once the tasks were printed, patients completed them with the assistance of certified psychologists.

Reh@City v2.0
The Reh@City v2.0 was implemented using the Unity 3D game engine and installed on a PC. Patients worked on a tabletop, facing an LCD monitor, and interacted with the virtual environment through a joystick handle with two buttons, one for selection and another for help. This simplified user interface (PC and joystick) was considered more suitable for our sample for several reasons: (a) patients' clinical diagnosis [patients with chronic mental and behavioral disorders are more prone to display a low tolerance to external stimuli (e.g., head-mounted displays (HMDs))]; (b) patients' chronic consumption of psychotropic medication and associated unwanted adverse effects (e.g., dizziness, nausea, and increased fall risk); and (c) patients' potential low digital literacy.

Table 1 (partial) Paper-and-pencil (TG) tasks and their desktop VR (Reh@City v2.0) versions
Mazes. TG: Find the route from the start to the exit of a labyrinth. Reh@City v2.0: Find the shortest navigation route to arrive at a given location in the virtual city.
Categorization. TG: Identify the category of different images. Reh@City v2.0: Select target items from a clothing store according to a given category (e.g., shoes, sunglasses, women's clothing).
Action sequencing. TG: Organize a set of actions to perform a given activity. Reh@City v2.0: Select, in the proper order, the steps needed to accomplish an activity of daily living at home.
Memory of stories or pictures. TG: Memorize information about a story or picture and recall it after a few minutes by answering true or false questions. Reh@City v2.0: Memorize verbal or visual information from a newspaper or magazine at the kiosk and then answer true or false questions, regarding the previous information, when reaching the next location.

Outcome measures
Primary outcome measures: global cognitive functioning, processing speed, sustained and selective attention, verbal memory, visual memory, and executive functions
The Montreal Cognitive Assessment (MoCA) was used to assess global cognitive functioning (Freitas et al. 2011). Moreover, we selected specific processing speed, attention, memory, and executive function instruments. To assess processing speed, we used the Digit Symbol and Symbol Search subtests of the Wechsler Adult Intelligence Scale III (WAIS III) (Wechsler 1997a). The assessment of sustained and selective attention was performed with the Toulouse-Piéron, which is a widely used ten-minute cancellation test (Toulouse and Piéron 1986). To assess verbal memory and visual memory, we used the Verbal Paired Associates I subtest of the Wechsler Memory Scale-III (WMS-III) (Wechsler 1997b) and the Rey-Osterrieth Complex Figure Test (RCFT) (Rey 1998), respectively. Finally, executive functions were assessed with the Semantic and Phonemic Verbal Fluency Tests (Cavaco et al. 2013).

Secondary outcome measures: depressive symptomatology and quality of life
We assessed the presence and severity of depressive symptomatology with the Beck Depression Inventory II (BDI-II), which is a 21-item self-report rating inventory (Beck et al. 1996

Statistical analysis
The statistical analysis was performed using the Statistical Package for the Social Sciences version 26 (SPSS Inc., Chicago IL, USA). Normality of the data was assessed with the Shapiro-Wilk test. Normally distributed continuous variables (age and years of schooling) were presented as mean and standard deviation, and categorical variables (diagnosis) as frequency and percentage. Differences between both groups in terms of demographic and clinical data were assessed with the independent samples t-test and Fisher's exact test, respectively. Considering that most of the neuropsychological assessment data were not normally distributed, non-parametric tests were run to evaluate within- and between-groups differences at the different assessment moments. Thus, non-normally distributed continuous variables were presented as median and interquartile range (IQR). The Wilcoxon signed-rank test was used to analyze within-groups differences over time, whereas the two-tailed Mann-Whitney test was employed to assess between-groups differences from baseline to post-intervention and from baseline to follow-up. Effect size (r) estimates were calculated (r = Z/√N) and interpreted as follows: 0.2 = small, 0.5 = medium, and 0.8 = large (Cohen 1988). In all statistical analyses, p-values ≤ 0.05 are reported and considered statistically significant.
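The effect size formula above can be checked with a short Python sketch. The function name is hypothetical. The absolute value of Z is taken because the effect sizes reported in this paper are positive, and N = 14 is assumed to be the sample size behind the TG group's Verbal Paired Associates I result (Z = −2.113), which is reported with r = 0.56 in the results.

```python
import math


def effect_size_r(z: float, n: int) -> float:
    """Effect size r = |Z| / sqrt(N) for a non-parametric test statistic."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return abs(z) / math.sqrt(n)


# Worked example with the values reported for the TG group
# (Z = -2.113; N = 14 is an assumption about the underlying sample size).
r = effect_size_r(z=-2.113, n=14)
print(round(r, 2))  # prints 0.56
```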

Sample description
The demographic and clinical characteristics of the sample are reported in Table 2. All 29 patients were female, had an average age of 55.93 (SD = 11.57) years old, and an average of 5.55 (SD = 4.24) years of schooling. Most patients (51.72%) had a schizophrenia diagnosis, while the remaining presented distinct clinical conditions (e.g., mental retardation, recurrent depressive disorder, schizoaffective disorder). In addition, digital literacy was informally assessed by simply asking patients if they had previous computer experience and internet navigation skills. We found that the majority of our sample had low digital literacy, with only 4 (13.79%) out of 29 patients (2 from the Reh@City v2.0 group and 2 from the TG group) having used computers prior to this study and displaying basic Internet skills. According to the Shapiro-Wilk test, data were normally distributed in both groups for age ( Fisher's exact test = 5.424, p = 0.592). Regarding the neuropsychological assessment baseline scores, the Mann-Whitney test indicated that there were no statistically significant differences between the Reh@City v2.0 and the TG groups in any of the primary and secondary outcome measures at baseline (see the supplementary material). Table 3 describes

Verbal paired associates I-verbal memory
The Verbal Paired Associates I scores for both groups at baseline, post-intervention, and follow-up are illustrated in Table 6. Only the TG group demonstrated a significant improvement in the recognition trial [Baseline: Mdn = 21.5, IQR = 8.4; Post: Mdn = 24, IQR = 3.2 (W (14) = 33.000, Z = −2.113, p = 0.035, r = 0.56)] at post-intervention. No between-groups differences were identified in any of the three assessment moments (see the supplementary material).

Semantic and phonemic verbal fluency-executive functions
The Semantic and Phonemic Verbal Fluency scores for both groups in the various assessment moments are listed in Table 8. No within- or between-group differences were identified at baseline, post-intervention, and follow-up.

Beck depression inventory II-depressive symptomatology
The BDI-II scores in both groups at baseline, post-intervention, and follow-up are presented in Table 9.

Discussion
VR applications to CT have been growing in the last few years, with encouraging results in neurological and psychiatric populations. VR methods provide a more ecological training experience through the performance of simulations of familiar ADLs. Also, VR simulations of ADLs are a rather global way of training cognitive functions, while traditional CT methods typically tackle domain-specific processes. In this two-month RCT, we aimed to evaluate and compare the impact of two content-equivalent CT approaches, delivered in a desktop VR format (Reh@City v2.0) and a paper-and-pencil format (TG), in a sample of institutionalized patients with chronic mental and behavioral disorders. We intended to contribute with additional research on the impact of two CT methodologies, grounded on the same adaptation and personalization framework, in an understudied and clinically diverse sample of patients from a long-term care psychiatric facility. This study highlighted both CT interventions' cognitive and non-cognitive benefits, suggesting that the combination of both interventions could lead to better outcomes. This is important since traditional and technology-based CT interventions should be viewed as complementary training approaches.

Primary outcome measures
The within-group analysis, from pre- to post-intervention, revealed statistically significant improvements in the Reh@City v2.0 group in visual memory (immediate recognition trial of RCFT). At two-month follow-up, the desktop VR group maintained previous gains in visual memory, and new improvements were found in global cognitive function and other specific cognitive domains, namely language, visuospatial abilities, and executive functions, as assessed by MoCA. Concerning the TG group, there were improvements in processing speed (coding and incidental learning trials of Digit Symbol) and verbal memory (recognition trial of Verbal Paired Associates I). At follow-up, gains were maintained by the TG group, and new improvements were identified in sustained and selective attention (work efficiency index of Toulouse-Piéron). The between-groups comparison showed greater gains in processing speed and abstraction (MoCA) in the TG group. These findings are consistent with previous research. Visual memory improvements after VR-based CT have been reported in Zając-Lamparska et al. (2019), which assessed the effects of the GRADYS VR system in a sample of older adults living without and with mild dementia. In this study, older adults without dementia had more improvements in visual memory and visuospatial processing than older adults with mild dementia, although the latter also showed positive changes in easier visual memory tasks (i.e., copy task of RCFT). Additionally, in Gamito et al. (2015), despite the absence of statistically significant changes, data of the VR group indicated an improvement in visual memory at follow-up. Also, the positive impact of VR application to CT in global cognitive function is well documented in prior studies with schizophrenia (Chan et al. 2010; Plagia et al. 2013) and stroke patients (Faria et al. , 2020). In the Faria et al. (2016) study, which compared the Reh@City v1.0 intervention with conventional CT, stroke patients in the desktop VR group exhibited significantly higher improvements, not only in global cognitive function, but also in more specific cognitive domains, such as attention, memory and visuospatial abilities, at post-intervention. Similar results were found in Faria et al. (2020), where stroke patients in the Reh@City v2.0 group showed significantly higher improvements in the MoCA global cognitive function and its attention, visuospatial, and executive functions domains at post-intervention. Our results corroborate the positive impact of Reh@City v2.0 on global cognitive function, visuospatial abilities, and executive functions. Interestingly, these changes only appeared at follow-up, two months after the intervention. We also found an additional improvement in language that is consistent with the Marques et al. (2008) study in schizophrenia patients. TG's positive impact is also in line with previous findings in stroke patients (Faria et al. 2020), namely the processing speed (Symbol Search (WAIS III)), sustained attention (TMT A execution time), and verbal memory improvements (retention trial of Verbal Paired Associates I). Nonetheless, we found greater improvements in the TG group's performance in processing speed and abstraction (MoCA) at follow-up. TG's data must be considered with caution as results can be related to the similar nature of the TG's CT tasks and the cognitive assessment measures. Given that the TG's paper-and-pencil CT tasks tackle specific cognitive processes, which is also a common feature of cognitive instruments, it is reasonable to assume that there may be a more expressive translation of training gains to the assessment measures. However, studies show that these gains may not be necessarily representative of functional improvements in ADLs (Coyle et al. 2015; Prikken et al. 2019).
On the other hand, the Reh@City v2.0 offers a multidimensional CT that may have a larger impact on broader domains (e.g., global cognitive function) and possibly a greater transfer of training gains to everyday life, which is the ultimate goal of a rehabilitation program. Most of our participants are inpatients and only basic ADLs are routinely assessed at the institution. As such, we could not assess the transfer of training gains to more advanced functional outcomes (i.e., instrumental activities of daily living).
It is also important to note that, since both interventions resulted in differential gains in cognitive function, it could be useful to combine desktop VR CT and adaptive paper-and-pencil CT in order to enhance cognitive improvements. As reported by Kim et al. (2011) and Young et al. (2014), the combined use of VR and computerized conventional CT resulted in additional gains in cognition, both in stroke and brain tumor patients, in comparison with computerized CT alone.

Secondary outcome measures
Apart from cognitive functions, we evaluated the impact of both interventions on emotional and quality of life measures. The within- and between-groups analyses, from pre- to post-intervention, indicated that the Reh@City v2.0 group showed a statistically significant reduction in depressive symptomatology. The TG group revealed improvements in the social relationships and environmental domains of quality of life, although only the improvement in the social relationships domain was significant in the between-groups analysis. At two-month follow-up, both groups maintained the previous gains, with the Reh@City v2.0 group exhibiting a higher reduction in depressive symptomatology and the TG group revealing greater improvements in the social relationships domain of quality of life.
The improvement in depressive symptomatology after the desktop VR-based CT intervention is consistent with a study performed by Maier et al. (2020) that compared adaptive conjunctive cognitive training (ACCT) in VR with conventional at-home CT in chronic stroke patients. Results indicated that the ACCT VR intervention was associated with lower levels of depressive symptomatology and that improvements in particular cognitive domains, namely attention and memory, predicted lower levels of depression. Also, a systematic review from Coyle et al. (2015) exploring the effects of computerized CT and VR-based CT for individuals with an increased risk of cognitive decline (e.g., mild cognitive impairment and dementia) confirmed the beneficial impact of VR interventions on psychological outcomes, such as depressive symptomatology and anxiety. In the case of our study, we speculate that the depressive symptomatology reduction observed in the Reh@City v2.0 group could be due to improvements in cognitive outcome measures and, possibly, to the resemblance of the desktop VR-based CT tasks to everyday life. Regarding the first reason, evidence suggests that depression is correlated with cognitive function (Harvey 2011; Iosifescu 2012); in this sense, we can hypothesize that the global cognitive functioning improvements in the desktop VR group may explain the amelioration of depressive symptomatology, or, conversely, that the desktop VR-based CT intervention itself could have impacted mood, which resulted in cognitive gains. Either way, this finding emphasizes the need to devise interventions that consider both cognitive and emotional aspects of human functioning, given their inseparable nature. The second reason is associated with the naturalistic features of the desktop VR-based intervention, i.e., the existence of ADL simulations using popular Portuguese brands, which increase the similarity of the training setting to daily life. 
These features could have positively influenced motivation and engagement in the desktop VR group, and therefore reduced depressive symptomatology, given that these patients experienced a sense of relatedness with day-to-day functioning, which could have lent greater significance to this type of training. In addition, replicating ADLs in an institutionalized setting could lead to stronger emotional engagement due to the recreation of daily living outer environments, of which patients are normally deprived. Concerning the TG, positive effects in the social relationships domain of quality of life may be related to the fact that this intervention involved more interaction with the psychologist, which could have strengthened the therapeutic relationship. The Reh@City v2.0 group required less help from the psychologist during sessions, which resulted in greater independence and autonomy while executing the tasks. On the other hand, patients in the TG group typically required more assistance from the psychologist while executing the tasks, since there is no feedback system in the paper-and-pencil CT. This social interaction element, particularly present in the paper-and-pencil intervention, represents a crucial part of every therapeutic process and is known to contribute to patients' improvement.

Limitations
Our results must be interpreted with caution due to some limitations that should be acknowledged. Firstly, there was considerable heterogeneity in terms of clinical diagnosis in both groups. Since having a clinically stable condition, independently of the formal diagnosis, was one of the main criteria for admission to the CSCP's psychosocial rehabilitation program, the included patients had quite different clinical diagnoses. Consequently, it was not possible to create homogeneous patient groups. Perhaps patients diagnosed with a certain type of mental and behavioral disorder could have benefited more from the interventions; however, we cannot ascertain that. Besides, we did not control for important clinical information, such as medication regimen, chronicity of illness or institutionalization, and relapse history. Concerning specifically the medication regimen, all patients were prescribed different medications according to their diagnosis and medical history. As such, we did not measure the influence of the pharmacological treatments on patients' performance and, therefore, cannot exclude the effects of medication, nor of the aforementioned clinical variables, on our patients' response to the cognitive interventions. It is also important to note that only participants were blinded to the interventions. Psychologists were aware of each participant's assignment to the treatment conditions and performed all the assessments and interventions. Being aware of this information may increase the likelihood of bias, since this knowledge might influence psychologists' behavior toward participants, even if unconsciously. Furthermore, the Reh@City v2.0 and the TG were originally designed for a different clinical condition (stroke), which poses challenges when transferring these interventions to, and assessing them in, other patient populations. 
In addition, despite being two content-equivalent CT methods, the Reh@City v2.0 and the TG imply different interaction patterns with the CT training method (i.e., desktop VR vs. paper-and-pencil) and also with the psychologist. The type of CT content (i.e., virtual simulations of ADLs) displayed in the Reh@City v2.0 platform may be more engaging and motivating for patients living in a long-term psychiatric setting, who are deprived of performing these types of activities in the community setting. In the TG, the printed CT tasks do not consist of everyday life tasks and often contain artificial stimuli (e.g., a cancellation task with squares and circles). We assume that this type of content can be less motivating because it is harder for the patient to understand the significance of the training, i.e., how will a cancellation task with squares and circles be helpful for the improvement of attentional ability in their daily life? As to the interaction between psychologist and patient, the two approaches also differ. In the Reh@City v2.0 intervention, the patient is seated in front of a PC and the system provides the instructions and performance feedback, while in the TG it is the psychologist who gives both instructions and feedback, resulting in more interaction with the participants. For the reasons described above, we cannot exclude the influence of these two types of interaction (training method and psychologist-patient relationship) on our results. Regarding the neuropsychological assessment protocol, it would have been useful to include a measure of IADLs to assess the transfer of cognitive gains to more complex ADLs. Finally, there may have been learning effects given that most neuropsychological assessment measures lacked parallel versions for multiple assessments, except for MoCA. 
Nonetheless, the same neuropsychological assessment protocol was administered to both groups, so, irrespective of the existence of learning effects, the between-groups comparison remains valid.

Conclusions
To the best of our knowledge, this is the first longitudinal RCT investigating the impact of two content-equivalent cognitive interventions, delivered in different formats, namely desktop VR (Reh@City v2.0) and paper-and-pencil (TG), in a sample of patients with mental and behavioral disorders from a long-term care psychiatric setting. Overall, patients from both groups revealed differential positive changes in primary and secondary outcome measures at the different assessment moments, which may suggest that the combined use of these two interventions could lead to higher short- and long-term benefits in the assessed domains. In terms of cognitive gains, the Reh@City v2.0 group, at post-intervention, showed improvements in visual memory, and at follow-up, besides maintaining former gains, revealed additional improvements in global cognitive functioning, language, visuospatial abilities, and executive functions. The TG group exhibited processing speed and verbal memory improvements at post-intervention. At two-month follow-up, this group maintained processing speed gains and revealed new improvements in sustained and selective attention. Considering emotional and quality of life outcomes, Reh@City v2.0 contributed to lower levels of depressive symptomatology and the TG resulted in improvements in social relationships and environmental aspects of quality of life at post-intervention and follow-up. When comparing both interventions, we found that the desktop VR-based CT intervention consistently led to a higher reduction in depressive symptomatology, observed at both post-intervention and follow-up, while the adaptive paper-and-pencil CT intervention was superior in social relationships aspects of quality of life, processing speed, and abstraction at follow-up. These results are quite important in the context of mental and behavioral disorders, since these conditions often result in cognitive and functional impairments that substantially affect mood and quality of life. 
Improving mood and quality of life through CT methods emphasizes the inseparable relationship between cognitive and emotional domains, indicating that CT has a broader impact on different aspects of human functioning.
Future research in this field should take into account the aforementioned limitations and further explore the influence of desktop VR-based CT methods on mood symptoms (e.g., depression and anxiety), since these are a common clinical feature of several mental and behavioral disorders and are known to aggravate cognitive deficits and, therefore, compromise quality of life and functional abilities. Also, given the differential benefits of both interventions, it is important to investigate the combined use of desktop VR and conventional CT methods in cognitive and non-cognitive outcomes, due to their potentially complementary nature and cumulative benefits.