The abstracts of 432 articles identified using the search strategy described above were considered (Figure 2). Fifty-one potentially relevant articles were retrieved for comprehensive review. Twenty-four articles were excluded as they did not meet criteria for an observational study of depression (see exclusion criteria). Forty articles from 17 observational prospective cohort studies were identified: 27 articles from the original search [32–58] and 13 from secondary references [59–71].
Methods used
Table 1 describes the methods for the 17 longitudinal studies included in the review. The studies varied widely in their original purpose. The studies can be grouped into those that focused on depressive symptoms (n = 7) [45, 51, 54, 61, 63, 66, 68] and those that focused on major depressive disorder satisfying DSM IV criteria (n = 10) [37, 38, 43, 44, 48, 49, 59, 64, 67, 69]. Nine of the 17 studies aimed to describe the course of depression over time and identify risk factors associated with recovery or improvement in depression [37, 43, 44, 54, 61, 63, 64, 68, 69]. Four studies were interested in examining detection of depression by the practitioner and depression outcome [38, 49, 64, 66]. One study examined the seasonality prevalence and incidence of depressive disorder [59], one examined the process and outcomes of rural depression care [67], one examined the outcomes for cases 'missed' at the screening encounter [45], one examined the prevalence of bipolar II disorder with depressive and anxiety subtypes [48], and one examined whether managed care was associated with reduced access to mental health specialists and poorer outcomes among patients with depressive symptoms [51].
Studies varied widely in methods, including the screening and assessment instruments used, eligibility for inclusion in the cohort and the length of follow-up (Table 1). Cohort sizes ranged from 35 [68] to 1336 [51] patients. Follow-up ranged from 20 weeks [68] to 3.5 years [64]. The majority of the studies followed patients for 12 months, with nine studies following patients for less than 12 months (range 20 weeks to 9 months) [37, 43, 48, 49, 51, 61, 66–68] (Table 1).
Setting
Most studies were undertaken in the US or Europe (Figure 2). The review also includes two multi-country studies: the Longitudinal Investigation of Depression Outcomes in primary care (LIDO) [44] and the World Health Organization (WHO) Collaborative Project on Psychological Problems in General Health Care [69]. Several of the sites involved in the multi-country WHO Collaborative Project (Netherlands [55], Italy [36] and US [57, 71]) have published results independently. The individual results from these sites are presented alongside the cumulative results from the WHO Collaborative Project in Table 2.
Selection procedures
Consecutive and convenience sampling methods were used in the majority of studies. No study recruited a random sample of patients from primary care. Only six studies detailed how the settings or clinicians were selected: GPs were randomly selected in two studies [43, 63]; settings were selected in two studies based on research experience and capacity, and on previous collaboration [44, 69]; GPs were a representative sample from the total population of GPs in the area in one study [64]; and in another, GPs were a consenting sub-sample from a larger study on physician referral [51]. The remaining studies used convenience samples selected from health centres [59], general practices or family practice clinics [37, 45, 48, 54, 66, 68], rural practices [67], family physicians and a University Psychiatry Outpatient Department [49], or multi-specialty clinics that had mental health care services, outpatient services, day care and inpatient services [61]. Using statewide telephone screening, one study identified and followed a cohort with current major depression who made one or more visits to a primary care physician during the six months following baseline [38].
Inclusion criteria
Six studies included only patients with major depression [38, 43, 44, 49, 59, 67], three studies included patients with depressive symptoms [51, 66, 68], two studies included patients with depression or anxiety disorders or symptoms [37, 48] and one included 'new' (i.e. a psychiatric diagnosis had not been made during the 12 months prior to the index visit) and 'old' (i.e. a psychiatric diagnosis had been made during the 12 months prior to the index visit) patients with depression and anxiety disorders, including borderline disorders and non-specific psychiatric symptoms [35]. Of the remaining five studies, four included patients with depressive symptoms and asymptomatic patients in the follow-up [45, 54, 61, 63] and one included patients with current psychiatric disorder and a random sample of patients without a current psychiatric disorder in the follow-up [69]. Two studies excluded patients who had received recent treatment for depression: one in the previous three months [44] and one where patients had contact with the clinic where the research was being conducted in the six months prior to the study [66]. One study included only non-referred patients presenting with anxiety or depression [48]. One study screened people for depression from random households using state-wide telephone lists and presented follow-up data on those who had visited a general practitioner in the six months following baseline interview [38].
Screening procedures
Seventeen different instruments were used at baseline to measure depressive symptoms or disorders. The Center for Epidemiologic Studies-Depression Scale (CES-D) [72] and various versions of the General Health Questionnaire (GHQ) [73] were the most commonly used screening instruments (Table 1).
Comorbidity measures
The majority of studies also measured co-morbid psychiatric symptoms or disorders, mainly anxiety related. Only six examined physical co-morbidities or days out of role [38, 44, 48, 54, 67, 69]. Grembowski et al. [51] reported measuring 21 co-morbid conditions; however, these conditions were not reported.
Treatment and health service use
Only one study did not report on the care received by patients [59]. Two of the sixteen studies that report collecting data on health service use did not report the findings [38, 61]. While Kessler et al. [61] reviewed medical records, their purpose was to determine point recognition and validate mental disorders given in the context of an associated physical disorder, not to examine care received. Rost et al. [38] examined medical, pharmaceutical and insurance records to determine the detected and undetected depression during follow-up. Five studies examined medication use [44, 48, 63, 68, 69] and two presented data on the use of antidepressant medication [51, 67]. Seven studies examined health care/service utilization [44, 49, 54, 63, 66, 67, 69] and one described use of and referral to mental health specialists [51]. Parker et al. [68] examined GP and psychiatric care, and four studies examined GP treatment [37, 45, 63, 64]. The MaGPie Research Group [63] also examined barriers to care and patients' attitudes to their GP. Seven studies asked patients to self report on the care they received between baseline and follow-up [37, 44, 51, 54, 63, 66, 69], four studies asked the primary care physician to report on care [43, 49, 63, 64] and eight reviewed medical records, chart evidence, insurance records and/or pharmaceutical records [37, 38, 45, 48, 51, 63, 66, 67] [results not mutually exclusive]. Parker et al. [68] collected data on medication, GP and psychiatric care during follow-up, however they did not report whether the data were patient self-report, physician report or a review of records.
Methodological quality of the studies included in the review
Many of the studies have methodological limitations, including small sample sizes (range 35–108 patients at baseline and 20–59 patients at follow-up) [48, 49, 67, 68] and small numbers in the cohort with depression [35, 66]. Furthermore, Schulberg et al. [66] was the only study to compare characteristics of the sample screened with those of the primary care population to determine whether the study patients were representative. No study randomly selected general practices and then approached a random selection of their patient list to avoid the frequency of attendance bias present in studies recruiting consecutive patients. Previous research has highlighted that a high proportion of eligible patients are missed when recruiting patients from general practice waiting rooms, thus limiting the generalisability of the findings [74]. Moreover, only three studies included a random or representative sample of primary care physicians [43, 63, 64], and seven studies recruited patients from just one centre or general practice [37, 45, 48, 54, 59, 61, 66]. These methodological limitations must be acknowledged when considering whether findings from the studies included in this review can be generalised to primary care populations.
Representativeness of samples to the primary care population
Only one study was able to compare characteristics of the sample screened with the primary care population [66]. Schulberg et al. [66] compared their cohort of patients with depressive symptoms with the total clinic population and noted the cohort was younger and had more females than would be expected from the medical facilities where the research took place. Rost et al. [38] compared patients who agreed to take part in the baseline interview with those patients who were eligible but refused, and reported no differences in socio-demographic data and clinical characteristics including the severity of depression, except that the cohort were younger and more likely to live in metropolitan areas.
Between 62% and 91% of patients were retained in the studies at 12 months, and between 67% and 93% at six months (Table 1). However, as some cohorts included asymptomatic patients [45, 54, 61, 63, 69], the power of some studies to determine predictors of depression outcomes is limited.
Course of depression
History and duration
Limosin et al. [43] reported that the current episode of depression was not the first for 38% of the patients in their study. In that study, the average number of previous depressive episodes was 2.1 (SD 1.7 episodes; range 1–12) and the average time reported between the first and current depressive episode was 5.9 years (SD 5.8 years; range 0.5 to 30 years).
Only four of the 17 studies provide information on the chronicity of depressive symptoms [37, 43, 58, 65]. The mean duration of the current episode of depression varied across the two studies that reported it. Limosin et al. [43] reported the average length of the current episode of depression was 2.8 months (SD 7 months; range 0.5 to 8 years), while in the Groningen Primary Care Study, the mean episode duration for depressive disorders was 9 to 10 months [65]. However, the small sample size in the Groningen study and the non-random selection of patients in both studies limit the generalisability of these findings. Ronalds et al. [37] reported a greater improvement in depressive disorder for patients with a depressive disorder of less than six months' duration compared to patients with a duration of greater than six months. In the Groningen Primary Care Study, patients with depression, anxiety or neurasthenia disorders with a recent onset or exacerbation were twice as likely to have that disorder recognized by a GP and to have improved at follow-up as patients with chronic psychological disorders [64].
Recovery
Eight studies have presented data on recovery from major depressive disorder. Among these, 32% of patients had recovered at four months [67], 65–71% had recovered at six months [43, 61, 66], 35% had recovered at nine months [52], 32%–67% had recovered at 12 months [41, 42, 54, 64, 69, 70] and 47% had recovered at 3.5 years [35] (Table 2). It is not possible to meaningfully compare the findings across studies as there was no consistency in methods, with studies using different instruments for screening and diagnosis, and different methods of recruitment and administration (clinician administered, researcher administered or self report) of instruments (Table 1). Furthermore, four of these studies included small numbers of patients with depression; Schulberg et al. [66] followed up 17 patients with major depressive disorder, Ormel et al. [35] followed up 20 patients with depression and 13 with borderline depression, Rost et al. [67] followed up 38 patients with major depressive disorder and Wagner et al. [54] followed up 51 patients with major depression and 66 with minor depression. There were three studies with large sample sizes that presented data on recovery from major depression/depressive disorder. In France, Limosin et al. [43] found that 65% (308/476) of patients with major depressive disorder recovered without relapse at six months; at nine month follow-up, the LIDO study [52] conducted in six countries, found that 35% (340/968) of patients reported complete remission from major depressive disorder (ranged from 25%–48% across countries); and in the WHO Collaborative Project on Psychological Problems in General Health Care conducted in 14 countries, 67% (482/725) of patients with depressive episodes recovered at 12 months [41]. Although both the LIDO and WHO studies administered (different versions of) the CIDI, the results of the studies are very different. 
The authors of the LIDO study offer no explanation for the lower rate of recovery in their study compared to other studies [52]. However, the LIDO and WHO follow-ups were done at nine and 12 months respectively, which may contribute to the difference in recovery rates.
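As a simple arithmetic check on the three large studies described above, the recovery percentages can be recomputed from the counts quoted in the text. This is only an illustrative sketch: the dictionary keys and the helper name `percent` are introduced here for convenience and do not come from the original studies.

```python
# (recovered, followed up, percentage stated in the text) for the three large studies.
reported = {
    "Limosin et al., 6 months": (308, 476, 65),
    "LIDO, 9 months": (340, 968, 35),
    "WHO Collaborative Project, 12 months": (482, 725, 67),
}


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Recompute each percentage from its raw counts.
recomputed = {name: percent(n, d) for name, (n, d, _) in reported.items()}

for name, (n, d, stated) in reported.items():
    print(f"{name}: {n}/{d} = {recomputed[name]}% (stated as {stated}%)")
```

The recomputed values agree with the stated percentages to within one percentage point, so the differences between studies are not an artefact of rounding.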
The interpretation of the findings on recovery is further complicated when recovery rates are compared for patients whose depression was detected or undetected [38, 46, 49, 64, 66]. Despite the methodological limitations of some of the studies presented, such as small sample size, the results suggest there is no difference in depression outcomes between patients whose depression is recognized or unrecognized. Rost et al. [38] reported that 47.2% of undetected patients and 39% of detected patients no longer met criteria for major depression at 12 month follow-up. At six month follow-up, Schulberg et al. [66] found that "psychiatric status at initial assessment and the number of assigned medical diagnoses rather than the physician's recognition and treatment of depression strongly predict continued affective disorder" (p.312); however, only 6.2% and 2.9% of this cohort had major depressive disorder at baseline and follow-up respectively. In the WHO Psychological Problems in General Health Care study conducted in 14 countries, Simon et al. [46] found that at baseline recognized patients had significantly higher mean GHQ scores and were more disabled than unrecognized patients. At three month follow-up, recognized patients reported a significantly greater reduction in GHQ scores than unrecognized patients; however, by 12 month follow-up there was no difference between recognized and unrecognized patients in change in GHQ score or change in diagnostic status from baseline. The authors conclude that "recognition and appropriate diagnosis of depression in primary care is associated with significantly greater short-term improvement [and] that increasing recognition of depression in primary care is only a first step toward more appropriate treatment" (p.97).
In the Groningen Primary Care Study, at 12 month follow-up, psychopathology had improved for 75% of patients whose psychological disorder was recognized (n = 100) by a GP compared to 33% of unrecognized patients (n = 79) (p < 0.001) (OR 6.1 for PSE-ID) [64]. A similar pattern was found with improvement in social disability: a significantly greater proportion of recognized patients compared to unrecognized patients reported improvement at 12 months (56% vs. 24%, p < 0.001) (OR = 4.0). The majority of participants in this study were 'new' cases (i.e. a psychiatric diagnosis had not been made during the 12 months prior to the index visit), which may explain in part why the results conflict with the results from the other studies presented.
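To show how the odds ratios quoted for the Groningen Primary Care Study relate to the improvement percentages, the following sketch computes crude (unadjusted) odds ratios from the rounded proportions given in the text. It is an illustration only: the function names are hypothetical, and it is an assumption that the published odds ratios were derived in a comparable unadjusted way.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    return p / (1 - p)


def odds_ratio(p_group_a, p_group_b):
    """Crude odds ratio of group A relative to group B, from two proportions."""
    return odds(p_group_a) / odds(p_group_b)


# Improvement in psychopathology: 75% of recognized vs 33% of unrecognized patients.
psychopathology_or = odds_ratio(0.75, 0.33)

# Improvement in social disability: 56% of recognized vs 24% of unrecognized patients.
disability_or = odds_ratio(0.56, 0.24)

print(f"Psychopathology: OR = {psychopathology_or:.1f}")
print(f"Social disability: OR = {disability_or:.1f}")
```

Rounded to one decimal place, the crude values are 6.1 and 4.0, in line with the figures reported in the text. An odds ratio of about 6 does not mean that recognized patients were six times as likely to improve; the corresponding ratio of proportions is 75%/33%, or roughly 2.3.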
Relapse rates
Only two studies in the current review presented data on relapse rates [32, 43]. Limosin et al. [43] reported at six months that 65% (308/476) of patients with major depressive disorder had recovered without relapse, 25% (117/476) developed a chronic condition and 11% (51/476) relapsed after recovery. In the Groningen Primary Care Study, 93% of depressed patients had remitted from index episode at 12 months [32] and the relapse rate among depressed patients was 30% (relapse being described as "transition from an asymptomatic state of at least two months to a state of mental disorder"); however, the cohort included only 20 participants with major depression. Limosin et al. [43] found that a history of recurrent major depressive disorder was associated with a higher risk of relapse at six months, while Parker et al. [68] found patients with episodic or recurrent episodes were more likely to improve at 20 weeks than those with other patterns of depression. However, due to the latter study's small sample size and short follow-up time (20 weeks), its results should be interpreted with caution.
Risk factors for the course of major depressive disorder and depressive symptoms
Six studies examined the predictors of the course of depression [32, 41–43, 52, 58, 67].
Chronicity of depression
Longer pre-baseline duration of the depressive episode in the WHO Collaborative Project on Psychological Problems in General Health Care study was a predictor of a poor course of depression [58]. Multivariate analysis showed that, among those whose pre-baseline duration was at least one year, the odds of a poor short-term course of depression (no full recovery within half a year) were over five times higher than among those whose pre-baseline duration was less than three months (OR = 5.22, 95% CI 2.45–11.15). The same was found for long-term outcomes, with those who had a pre-baseline duration of one year being more likely to report a poor outcome compared to those whose pre-baseline duration was less than one year (OR = 3.54, 95% CI 1.67–7.52). In the Groningen Primary Care study, duration of index episode was not associated with the occurrence of a relapse within the 12 month follow-up after remission [32].
Severity of depression
Wagner et al. [54] found that a greater proportion of patients with minor depression (56%, 37/66) than major depression (37%, 19/51) at baseline were asymptomatic at 12 months. They found that a diagnosis of minor depression was associated with almost the same degree of impairment in health status, functional status and disability, and psychiatric service utilization as a diagnosis of major depression. However, 20% (13/66) of patients with minor depression at baseline met criteria for major depression at 12 months, while 22% (11/51) of patients with major depression at baseline met criteria for minor depression at 12 months. The authors conclude that sub-threshold depression or the persistence of depressive symptoms is a risk for developing major depression. In the Groningen Primary Care Study, 31% of patients with borderline depression had recovered at 12 months and 70% at 3.5 years [35]. Indeed, the Groningen Primary Care Study [35] found that partial remission rather than complete recovery "was the rule and was associated with residual disability" (p.759). This study also found that depression had better outcomes than anxiety and mixed anxiety-depression. At baseline, patients with both anxiety and depression reported the highest symptom levels on the Present State Exam. However, given the small sample size with each disorder, these results should be interpreted with caution.
The data from studies measuring depressive symptoms are also difficult to compare for similar reasons. Parker et al. [68] reported a 6% improvement in depressive symptoms at 20 weeks, and others reported that depressive symptoms had significantly reduced at six month follow-up [37, 51]. At four and a half month follow-up, one study found a significant reduction in depressive symptom scores among primary care patients whose depression was not detected compared to no significant reduction in depressive symptom scores among detected patients [49]. Kessler et al. [45] reported that of the 88 patients who met criteria for a case on the GHQ, 50% (16/32) of those not detected by a GP at baseline or during the three year follow-up were no longer cases at three year follow-up. Grembowski et al. [51] and Ronalds et al. [37] retained a large sample of patients at follow-up; however, Grembowski et al. [51] only included insured patients and therefore their findings are limited to a sample of mainly middle-class, Caucasian adults with depressive symptoms. The other three studies followed up small numbers of patients and may not have had sufficient power to determine reduction in depressive symptoms [45, 49, 68].
There were two studies where improvement in depressive symptoms was presented [37, 68]. Parker et al. [68] found that baseline predictors of a "better outcome" for the 20 patients with depressive symptoms at baseline who were followed up for 20 weeks were: having a history of episodic or recurrent episodes; a more severe depression; lower social class; break up of an intimate relationship as a precipitant; a neutralizing life event and family support. Multivariate analysis conducted by Ronalds et al. [37] found that at six month follow-up, high baseline depression score, higher educational level and current employment were associated with greater reduction in depression score among patients with major depressive disorder and generalized anxiety or panic disorders at baseline. The factors associated with outcome in this study were not reported for patients with each disorder, therefore it is difficult to draw any firm conclusions about the factors associated with improvement in depression among depressed patients.
Comorbidity
Gaynes et al. [42] reported that the risk for persistent depression at 12 months for those with major depression at baseline was 44% greater in those with co-existing anxiety disorder (RR = 1.44, 95% CI 1.02–2.04). In the Groningen Primary Care Study, half of the patients who experienced a positive life change remitted within four months [32]; the probability of remission was 2.3 times higher following positive life change (HR = 2.3). Positive life change increased the probability of remission fourfold in women (HR = 4.4) but not in men. Multivariate analysis found that quicker time to remission was associated with low severity of pre-morbid difficulties (HR = 0.7), high self-esteem (HR = 1.4), and a coping style aimed at reducing tension (HR = 1.4).
Treatment
Rost et al. [67] reported that patients with major depression who received pharmacologic treatment concordant with guidelines between baseline and five month follow-up were more likely to be in remission at follow-up than subjects who did not; however, the sample size was small and, of the 38 patients followed up, only 11 received such treatment. The findings on whether being prescribed antidepressants was associated with recovery were conflicting: in contrast to Rost et al. [67], Barkow et al. [41] found antidepressant use was related to persisting depression at 12 month follow-up. The WHO Mental Disorders in General Health Care Study found that while patients receiving antidepressants reported significantly fewer symptoms on the GHQ at three months than patients receiving sedatives, this was not the case at 12 months [56]. However, the authors highlight that as the study was not a trial, efficacy of psychoactive drugs cannot be inferred.
Sex
Despite the majority of patients in the 17 studies being female, only three studies reported on outcome by sex [37, 41, 68]. All three studies reported no difference in depression outcome between males and females at follow-up. However, two of these studies may not have had sufficient power to detect differences between males and females [37, 68].
Predictors of the course of depression from multivariate analyses
Whilst there are difficulties in comparing results across the three large scale studies that measured risk factors for persistence or recovery from depression [41, 43, 52, 58], some conclusions can be drawn. Remission from depression at nine months was associated with higher level of education (OR = 1.06, 95% CI 1.051–1.11), higher quality of life (OR = 0.94, 95% CI 0.92–0.97) and experiencing key life events (OR = 0.71, 95% CI 0.66–0.83) in the LIDO study, after adjusting for centres, socio-demographic data, severity of depression, co-morbidity and general quality of life [52]. A significantly greater proportion of patients whose major depression had remitted at nine month follow-up had medical conditions, dysthymia or anxiety disorders than patients who were not in complete remission. While the authors found that there was no consistent variable that predicted remission across the six country sites, they believed this may have been a result of the "modest" sample size (range in cohort sizes by country 142–185). In the WHO Collaborative Project on Psychological Problems in General Health Care study, sustained non-remission (i.e. presence of a non-remitted or new depression) at 12 month follow-up was associated with lower levels of education (0 years versus 11+ years: OR = 3.78, 95% CI 1.83–7.79; 1–5 years versus 11+ years OR = 1.81, 95% CI 1.02–3.19), unemployment (employed versus unemployed: OR = 1.57, 95% CI 1.02–2.43), severity of depression (severe versus moderate: OR = 3.27, 95% CI 1.91–5.62), antidepressant use (OR = 1.79, 95% CI 1.06–3.03), repeated suicidal thoughts ("crossed my mind" versus no suicidal thoughts) (OR = 1.82, 95% CI 1.14–2.93), and abdominal pain as main reason for consulting the general practitioner (OR = 2.30, 95% CI 1.17–4.52) [41]. 
The study also reported that patients had a greater probability of a poor long term course (no recovery over the 12 month follow-up period) if the severity of their depression was moderate or worse (versus mild) (OR = 3.38, 95% CI 1.49–7.65), their pre-baseline duration was greater than one year (versus less than one year) (OR = 3.54, 95% CI 1.67–7.52), they did not have a chronic physical illness (OR = 0.31, 95% CI 0.13–0.73), they had low social support (versus high/average) (OR = 0.45, 95% CI 0.19–1.07), and they had lower levels of education (≥ 13 years versus < 10 years) (OR = 0.18, 95% CI 0.07–0.47) [58]. A previous episode of depression increased the probability of chronicity for younger (OR = 3.60, 95% CI 0.92–14.14) but not older (OR = 0.28, 95% CI 0.05–1.45) patients. They found that among patients with co-morbid anxiety, depressed women had a smaller probability of chronicity than depressed men (OR = 0.13, 95% CI 0.04–0.41) [58]. Limosin et al. reported that relapse from depression was associated with a history of recurrent major depressive disorder at six month follow-up (OR = 1.6, 95% CI 1.08–3.43) [43].
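When reading the odds ratios above, it can help to check which 95% confidence intervals exclude 1, since an interval spanning 1 is conventionally read as not statistically significant at the 5% level. The sketch below applies that check to the intervals quoted in the text. The labels are shorthand introduced here, and the 5% convention is an assumption about how the intervals are to be read, not a statement of how the original authors assessed significance.

```python
# (shorthand label, lower bound, upper bound) of the 95% CIs quoted in the text.
intervals = [
    ("severity moderate or worse", 1.49, 7.65),
    ("pre-baseline duration over one year", 1.67, 7.52),
    ("chronic physical illness", 0.13, 0.73),
    ("social support", 0.19, 1.07),
    ("education", 0.07, 0.47),
    ("previous episode, younger patients", 0.92, 14.14),
    ("previous episode, older patients", 0.05, 1.45),
    ("women versus men with co-morbid anxiety", 0.04, 0.41),
    ("history of recurrent major depressive disorder", 1.08, 3.43),
]


def excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1 or upper < 1


# Collect the estimates whose interval includes the null value of 1.
spanning_one = [label for label, lo, hi in intervals if not excludes_one(lo, hi)]

for label in spanning_one:
    print(f"CI includes 1: {label}")
```

On this reading, the intervals for social support and for a previous episode (in both younger and older patients) include 1, so those associations are the least certain of the ones listed.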
Risk factors for persistence of depression identified in this review were: severity and chronicity of the depressive episode, the presence of suicidal thoughts, antidepressant use, poorer self-reported quality of life, lower self-reported social support, experiencing key life events, lower education level and unemployment.
Treatment and health service use
The proportion of patients receiving antidepressant medication during the study follow-up period ranged from 0% (St Petersburg site in Fleck et al. [52]) to 100% [43] (Table 2). However, the proportion of these patients prescribed antidepressants according to guidelines, in the three studies that reported this, ranged from 27% [38] to 61% [41].
Three studies reported that the likelihood of receiving treatment was associated with severity of illness [51, 54, 71]. In addition, the WHO Mental Disorders in General Health Care Study found that younger age, being male and less time since first onset were associated with not being prescribed psychoactive drugs [56]. Grembowski et al. [51] found that more severe depressive symptoms at baseline, previously attending a mental health specialist, more years of education, younger age and being female were the best predictors of referral and utilization of a mental health specialist and that managed care was not associated with a reduced likelihood of referral to or of visiting a mental health specialist. Another study found major depression (OR 1.83), female gender (OR 2.17), white race (OR 2.34), and higher education (OR 1.21) were associated with higher odds of a mental health visit in the last four months [54]. The US site of the WHO Collaborative Project on Psychological Problems in General Health Care found that participants with higher symptom severity as measured by the GHQ-28 at baseline, and more disability, were more likely to receive antidepressant medication or use any specialty mental health services [71]. This study also reported that patients with anxiety or depressive disorders at baseline had higher health care costs in the six months prior to baseline (US $2,390) than patients with sub-threshold (US $1,098) or no disorders (US $1,397). These cost differences were due to higher use of general medical services rather than higher mental health treatment costs [57]. In the Groningen Primary Care Study, recognition of psychiatric disorder by a GP among new cases resulted in greater likelihood of referral to a mental health specialist (OR 3.0), receiving psychotropic medications (OR 4.5), having a counseling session (OR 12.2) and having any mental health treatment (OR 6.7) [64].