Observational Studies (OSs) and Randomized Controlled Trials (RCTs) are the main types of studies used to evaluate treatments. In the latter, patients are assigned by chance - through randomization - to the active or the control group, in order to reduce error and bias so that differences in outcomes can be attributed to the treatment. Observational studies, by contrast, do not involve randomization: differences in outcomes are simply observed after a particular therapy has been chosen.
Although RCTs are considered more reliable than OSs for evaluating treatment effectiveness, meta-analyses that compared results from the two types of studies across different kinds of interventions did not systematically show significant differences in the effect estimates.[1-3]
For example, for the association between treatment of hypertension and first stroke, the pooled estimates were a relative risk of 0.58 (95% confidence interval, 0.50 to 0.67) from RCTs and an odds ratio of 0.62 (95% confidence interval, 0.60 to 0.65) from OSs.
Observational studies tend, more often than randomized controlled trials, to overestimate treatment effects and to show greater variability in the effect estimates because of residual confounding. In general, estimates of treatment effectiveness from both RCTs and OSs are strongly affected by the quality of the study design. In RCTs, for example, correct randomization is fundamental: when randomization is inadequate, treatment effects are overestimated.[4,5]
Comparison of studies
Although randomized controlled trials are the preferred source of evidence of treatment effectiveness (every regulatory agency requires them for the registration of a new drug), matters become more complex when the risk of adverse effects must be assessed. The scarcity of adverse-event data from RCTs is well known: RCTs often lack a sufficiently large sample, or an adequate follow-up, to detect rare adverse effects (or adverse effects that occur months or years after the intervention), or the quality of their safety data is poor. Moreover, the generalizability of RCT results is limited, because patients at high risk of adverse effects, medically fragile, or with multiple comorbidities are often excluded.[6,7] To overcome these limitations, large amounts of data from observational studies are often taken into consideration when evaluating the safety profile of a particular intervention.
Meta-analyses on safety have also been conducted to compare adverse-event data from RCTs and OSs. The difference is small on average and, particularly for infrequent (or rare) adverse effects, the imprecision of the risk estimates may not reflect a real difference between RCT and OS estimates. For this reason it can be more useful to focus on whether the confidence intervals overlap (when this is done, RCT and OS confidence intervals overlap in more than 90% of cases).[8,9]
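The overlap criterion mentioned above is mechanical to check once two intervals are given. A minimal sketch (the helper name and the reuse of the stroke estimates quoted earlier are illustrative, not from any cited meta-analysis):

```python
def intervals_overlap(a, b):
    """Return True if two confidence intervals (lower, upper) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

# Illustrative example using the stroke estimates quoted earlier:
rct_ci = (0.50, 0.67)  # RCT relative risk, 95% CI
os_ci = (0.60, 0.65)   # OS odds ratio, 95% CI
print(intervals_overlap(rct_ci, os_ci))  # True: the intervals overlap
```

Overlapping intervals do not prove the estimates agree, but clearly disjoint intervals are a stronger signal of real divergence than a difference in point estimates alone.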
There are, however, important cases in which randomized controlled trials and observational studies reach divergent conclusions; the studies must then be analyzed in depth to identify their limitations. One example is hormone replacement therapy, recommended in 2000 on the basis of evidence from observational studies. In 2002 the results of a randomized controlled trial on more than 16,000 menopausal women - assigned to hormone therapy or placebo - changed clinical practice: the treated women showed an increased risk of coronary heart disease, breast cancer, venous thromboembolism and stroke. These discrepancies can be explained by the fact that several confounding factors correlated with the outcome - such as exercise, smoking, education and income - had not been included in the analysis of the OS data. For coronary heart disease, a further explanation lies in the duration of exposure to hormone therapy: in RCTs the coronary risk appears to be highest during the first year of treatment and then decreases. The OSs did not take into account that many patients had started the therapy in the past and, by the time they entered the study, were already in a phase of lower coronary risk. When the OS data are adjusted for exposure time, the risk estimates for coronary heart disease obtained from RCTs and OSs become similar.
A different case shows, conversely, how observational studies can add relevant information that validates the evidence from randomized controlled trials: the cardiovascular risk of NSAIDs. The RCT-based evidence consists of few cardiovascular events and covers only certain NSAIDs and selected patient populations. A systematic review of observational studies, by contrast, provides a more exhaustive profile of NSAID cardiovascular risk, because it covers a wide spectrum of NSAIDs prescribed at different doses to the population seen in clinical practice. Its results agree closely with those of the randomized controlled trials, but also reveal that the risk of cardiovascular events is not lower when NSAIDs are prescribed at low doses.
It is therefore clear that, to interpret the results of any kind of study correctly, the quality of the study and the methods used for data analysis must be evaluated carefully. It is worth remembering that although observational studies run a higher risk of error because of residual confounding, several statistical techniques (such as matching, propensity scores and risk-adjustment factors) make it possible to control for confounding factors and, when used correctly, can provide more accurate risk estimates.
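To make the idea of confounding control concrete, the sketch below implements one of the techniques named above - 1:1 nearest-neighbour propensity-score matching - on invented toy data. Everything here (the data, the single covariate, the function names, the gradient-descent fit) is a simplified illustration, not a recipe for real analyses, where dedicated statistical software and diagnostics are needed:

```python
import math

def fit_propensity(x, t, lr=0.5, steps=3000):
    """Estimate P(treated | x) with a one-covariate logistic model,
    fitted here by plain gradient descent (illustration only)."""
    w = b = 0.0
    n = len(x)
    for _ in range(steps):
        gw = gb = 0.0
        for xi, ti in zip(x, t):
            p = 1.0 / (1.0 + math.exp(-(w * xi + b)))
            gw += (p - ti) * xi
            gb += (p - ti)
        w -= lr * gw / n
        b -= lr * gb / n
    return [1.0 / (1.0 + math.exp(-(w * xi + b))) for xi in x]

def matched_effect(x, t, y):
    """1:1 nearest-neighbour propensity-score matching: pair each treated
    unit with the control whose propensity score is closest, then average
    the within-pair outcome differences (treated minus control)."""
    ps = fit_propensity(x, t)
    treated = [i for i, ti in enumerate(t) if ti == 1]
    control = [i for i, ti in enumerate(t) if ti == 0]
    diffs = [y[i] - y[min(control, key=lambda j: abs(ps[j] - ps[i]))]
             for i in treated]
    return sum(diffs) / len(diffs)

# Hypothetical toy data: the covariate x (say, baseline severity) drives
# both treatment choice and outcome, so the naive comparison of group
# means is confounded. The true treatment effect is 1 (y = 2*x + t).
x = [0, 1, 2, 1, 2, 2]
t = [0, 0, 0, 1, 1, 1]
y = [0, 2, 4, 3, 5, 5]

naive = (sum(y[i] for i in range(6) if t[i]) / 3
         - sum(y[i] for i in range(6) if not t[i]) / 3)
print(round(naive, 2))          # 2.33 - overestimates the true effect
print(matched_effect(x, t, y))  # 1.0  - matching removes the bias
```

The toy example reproduces in miniature the pattern discussed in the text: the unadjusted comparison overstates the effect because treated patients differ systematically from controls, while matching on the propensity score compares like with like.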
So far we have discussed differences (or agreements) between the results obtained from different types of study on a specific treatment. This is obviously possible only when more than one RCT and more than one OS on the same clinical question are available; only then can the differences be identified ex post, the data re-analyzed, and more accurate interpretations provided. The situation is far more complicated when a single RCT or OS is available and a decision must be based on it: when data from different populations and study designs cannot be compared, a certain level of uncertainty is inevitable. To limit this uncertainty and make decisions based on valid evidence, studies need to be replicated, addressing clinical questions on the safety and effectiveness of treatments through more than one study and, where possible, with different designs.
“The opinions expressed herein by the author do not necessarily reflect the official views of the Italian Medicines Agency (Agenzia Italiana del Farmaco, AIFA)”.
1. N Engl J Med 2000;342:1887-92.
2. N Engl J Med 2000;342:1878-86.
3. JAMA 2001;285:437-43.
4. BMJ 1998;317:1185-90.
5. BMJ 2011;343:d7020.
6. BMJ 1999;319:312.1.
7. N Engl J Med 2000;342:1907-9.
8. PLoS Med 2011;8:e1001026.
9. CMAJ 2006;174:635-41.
10. JAMA 2002;288:321-33.
11. Lancet 2009;373:1233-5.
12. PLoS Med 2011;doi:10.1371/journal.pmed.1001098.