General
This randomized, double-blind, parallel-group 12-week study was conducted at 67 sites in 28 countries, including the U.S. (see acknowledgments for list of countries and investigators). Patients were enrolled between November 1999 and June 2000. Each site received the approval of its Ethics Review Committee or Institutional Review Board to perform the study. Written informed consent was obtained from every patient evaluated. Patients who discontinued the study due to lack of efficacy or who completed the 12-week, placebo-controlled trial were offered the opportunity to enter a blinded active comparator-controlled 40-week extension. The data from the 40-week, active comparator-controlled extension will be reported separately.
Patients
Eligible patients were aged ≥18 years and fulfilled diagnostic criteria for RA as specified by the 1987 revised criteria of the American Rheumatism Association [10]. In addition, patients were required to have had an established diagnosis of RA for at least 6 months prior to entering the study, to have a history of a clinical response to NSAID therapy, and to have been taking NSAID therapy on a regular basis (at least 25 of the past 30 days). Patients with a history of angina or congestive heart failure with symptoms that occurred at rest or with minimal activity, and/or with a history of myocardial infarction, coronary angioplasty, or coronary bypass within the past year, were excluded, as were those with a history of stroke, transient ischemic attack, or hepatitis in the previous two years. Patients with uncontrolled hypertension at screening were also excluded. Patients with any medical condition that, in the opinion of the investigator, could have confounded study results or caused undue risk to the patient (e.g., comorbid conditions for which NSAIDs are contraindicated) were also excluded. Three hemoccult screens were performed prior to allocation, and patients with any evidence of active gastrointestinal bleeding were excluded. At randomization, patients could not be taking concomitant warfarin, ticlopidine, clopidogrel, or digoxin. Patients on stable doses of disease-modifying therapy (except TNF inhibitors) and low doses of corticosteroids (prednisone <10 mg daily) were allowed to continue therapy. Patients were permitted to take low-dose aspirin (up to 100 mg/day), but in practice <3% of patients used aspirin during the study.
Procedure
Patients were assessed for disease activity, and those who met entry criteria were asked to discontinue their current NSAID use and return for evaluation when symptoms worsened (disease flare). At re-evaluation for study inclusion, patients were required to have ≥6 tender joints, ≥3 swollen joints, and at least a 20% increase in the number of tender and swollen joints compared with screening visit assessments. In addition, investigators must have rated patients as "fair," "poor," or "very poor" on the investigator global assessment of disease activity, and noted at least one of the following: 1) morning stiffness for ≥45 minutes plus an increase in the duration of morning stiffness of at least 15 minutes since the screening visit evaluation, or 2) a score of >40 mm on the patient global assessment of pain (a 100-mm visual analog scale [VAS]) and at least a 10-mm increase in the patient assessment of pain over that reported at the screening visit evaluation.
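The flare criteria above can be restated as a simple eligibility check. The following Python sketch is illustrative only and is not the study's screening software; the `Assessment` record and its field names are hypothetical. It also assumes that the 20% increase applies to the tender and swollen joint counts separately, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """Hypothetical record of one visit's measurements."""
    tender_joints: int
    swollen_joints: int
    investigator_global: str      # "very well", "well", "fair", "poor", "very poor"
    morning_stiffness_min: float  # duration of morning stiffness, minutes
    pain_vas_mm: float            # patient global assessment of pain, 0-100 mm


def increased_by_20_percent(screening: int, flare: int) -> bool:
    # Integer form of (flare - screening) / screening >= 0.20,
    # which avoids floating-point rounding at the boundary.
    return (flare - screening) * 5 >= screening


def meets_flare_criteria(screening: Assessment, flare: Assessment) -> bool:
    joints_ok = flare.tender_joints >= 6 and flare.swollen_joints >= 3
    # Assumption: the 20% increase is required of each joint count separately.
    increase_ok = (
        increased_by_20_percent(screening.tender_joints, flare.tender_joints)
        and increased_by_20_percent(screening.swollen_joints, flare.swollen_joints)
    )
    global_ok = flare.investigator_global in {"fair", "poor", "very poor"}
    stiffness_ok = (
        flare.morning_stiffness_min >= 45
        and flare.morning_stiffness_min - screening.morning_stiffness_min >= 15
    )
    pain_ok = (
        flare.pain_vas_mm > 40
        and flare.pain_vas_mm - screening.pain_vas_mm >= 10
    )
    return joints_ok and increase_ok and global_ok and (stiffness_ok or pain_ok)
```

If the 20% increase was instead assessed on the combined joint count, only `increase_ok` would need to change.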
Patients meeting the above flare criteria were randomized to placebo, etoricoxib 90 mg once daily, or naproxen 500 mg twice daily in a 2:2:1 allocation ratio. Randomization was stratified by use or non-use of low-dose corticosteroids. Efficacy evaluations were performed at baseline and at weeks 2, 4, 8, and 12. Efficacy assessments included all components of the American College of Rheumatology (ACR) core set of outcome measures: tender joint count, swollen joint count, patient global assessment of disease activity, investigator global assessment of disease activity, Stanford Health Assessment Questionnaire (HAQ) of disability (an assessment of the patient's mobility and ability to carry out activities of daily living), patient global assessment of pain, and C-reactive protein level [2, 10]. Four endpoints were specified as primary: tender joint count (total 68 joints), swollen joint count (total 66 joints), patient global assessment of disease activity (100-mm VAS; 0 = "very well", 100 = "very poor"), and investigator global assessment of disease activity (0 to 4 Likert scale; 0 = "very well", 1 = "well", 2 = "fair", 3 = "poor", 4 = "very poor"). Key secondary measures included patient global assessment of pain (100-mm VAS; 0 = "no pain", 100 = "extreme pain"), HAQ disability score (the average score of 9 disability questions, each graded on a 0 to 3 Likert scale: 0 = "without any difficulty", 1 = "with some difficulty", 2 = "with much difficulty", 3 = "unable to do"), and the proportion of patients who met the ACR20 criteria for a clinically relevant response (a composite criterion requiring 20% improvement in tender and swollen joint counts and 20% improvement in 3 of the 5 remaining ACR core measures) [10] and who completed the study (ACR20-completers). The percentage of patients who discontinued due to lack of efficacy was also measured.
For completeness, we also examined the percentage of patients meeting ACR20 criteria regardless of whether they completed the study.
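The ACR20 and ACR20-completers definitions can be illustrated with a short Python sketch. It is a minimal restatement of the composite criterion described above, not the study's analysis code; the dictionary keys and function names are hypothetical. It relies on the fact that all seven core measures, as scaled in this study, improve when they decrease.

```python
# Hypothetical keys for the seven ACR core measures (lower values are better).
JOINT_MEASURES = ("tender_joints", "swollen_joints")
OTHER_MEASURES = (
    "patient_global",
    "investigator_global",
    "pain",
    "haq",
    "crp",
)


def pct_improvement(baseline: float, value: float) -> float:
    """Percent decrease from baseline; 0 if baseline is not positive."""
    if baseline <= 0:
        return 0.0
    return 100.0 * (baseline - value) / baseline


def is_acr20_responder(baseline: dict, followup: dict, threshold: float = 20.0) -> bool:
    # Both joint counts must improve by at least the threshold...
    joints_ok = all(
        pct_improvement(baseline[k], followup[k]) >= threshold
        for k in JOINT_MEASURES
    )
    # ...together with at least 3 of the 5 remaining core measures.
    n_other = sum(
        pct_improvement(baseline[k], followup[k]) >= threshold
        for k in OTHER_MEASURES
    )
    return joints_ok and n_other >= 3


def is_acr20_completer(baseline: dict, week12: dict, completed_study: bool) -> bool:
    # ACR20-completers: responders who also completed the 12-week study.
    return completed_study and is_acr20_responder(baseline, week12)
```

Under this sketch, a patient who met the ACR20 criteria but discontinued early counts as a non-responder in the ACR20-completers analysis, while still counting as a responder in the supplementary analysis that ignores completion status.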
Laboratory assessments (serum chemistry, complete blood count, urinalysis) were performed at baseline and at weeks 2, 4, 8, and 12. Clinical and laboratory adverse events were recorded throughout the study. Investigators rated the intensity, relation to study drug (possibly, probably, or definitely drug-related; probably not or definitely not drug-related), and seriousness (events that are life-threatening, result in hospitalization, cause permanent incapacity, or constitute another significant event) of adverse events. All potential upper gastrointestinal perforations, ulcers, and bleeds (PUBs) and all potential cardiovascular thrombotic events (including cardiac, peripheral vascular, and cerebrovascular events) were reviewed by independent blinded adjudication committees, which determined whether they were confirmed events according to pre-specified case definitions (confirmed adjudicated events) [6].
Statistical analysis
The primary analytic method for evaluating efficacy was to compare treatment groups using the time-weighted average change from baseline across 12 weeks for the 7 ACR core measures. The rates at 12 weeks for ACR20-completers and the cumulative rates over 12 weeks for discontinuations due to lack of efficacy were also compared between treatment groups. Pair-wise comparisons were based on the difference between mean responses, except for C-reactive protein level, where the mean ratio was analyzed via log transformation. A modified intent-to-treat approach was employed: all patients with a baseline and at least 1 post-baseline measurement were included in the analysis. Analysis of covariance (including terms for baseline covariate, stratum [corticosteroid use], and treatment) was used for all efficacy variables except ACR20-completers and discontinuation rates due to lack of efficacy. The percentages of patients meeting ACR20-completers criteria were compared between treatment groups using the Cochran-Mantel-Haenszel test with corticosteroid use as a stratification factor, and Fisher's exact test was used to make between-treatment comparisons of the discontinuation rates due to lack of efficacy. The analysis of serum C-reactive protein was based on the log of the on-treatment value over the baseline value. Plots of mean changes from baseline at each time point for the 4 primary endpoints were made to assess the maintenance of therapeutic effect for etoricoxib and naproxen. A last-observation-carried-forward method was used for these longitudinal graphs, but not for the time-weighted average changes shown in the table of results.
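The exact formula for the time-weighted average change from baseline is not given here. The Python sketch below shows one common convention, stated as an assumption: each post-baseline change is weighted by the interval since the preceding visit, and the sum is divided by the time of the last available visit. The function name and input layout are hypothetical.

```python
def time_weighted_average_change(baseline: float, visits: list) -> float:
    """Time-weighted average change from baseline for one patient.

    visits: (week, value) pairs for post-baseline measurements.
    Assumption: each change is weighted by the number of weeks elapsed
    since the previous visit (baseline is week 0); no values are imputed.
    """
    ordered = sorted(visits)
    if not ordered:
        raise ValueError("at least one post-baseline measurement is required")
    weighted_sum = 0.0
    previous_week = 0
    for week, value in ordered:
        interval = week - previous_week
        weighted_sum += interval * (value - baseline)
        previous_week = week
    # Divide by the total observed time, i.e. the week of the last visit.
    return weighted_sum / previous_week


# Hypothetical tender joint counts at weeks 2, 4, 8, and 12.
example = time_weighted_average_change(20, [(2, 16), (4, 14), (8, 12), (12, 10)])
```

With the study's visit schedule, this convention gives the week 8 and week 12 values twice the weight of the week 2 and week 4 values. A patient with only some post-baseline visits is averaged over the observed period, consistent with the modified intent-to-treat rule of requiring at least 1 post-baseline measurement.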
Tolerability was evaluated by tabulation of all clinical and laboratory safety parameters, including adverse events. Active treatments were compared with placebo using Fisher's exact test for the percentages of patients with any drug-related clinical adverse event, with any serious clinical adverse event, or who discontinued due to a clinical adverse event. Safety data were also evaluated by examining patients who exceeded predefined limits for laboratory values of interest (e.g., consecutive decreases in hemoglobin and hematocrit, increased aminotransferase values, or increases in serum creatinine), common adverse events associated with NSAIDs or COX-2 inhibitors (e.g., hypertension and lower extremity edema), and the percentages of patients discontinuing due to adverse events.
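For readers unfamiliar with Fisher's exact test as applied to a comparison of two proportions, the following Python sketch computes a two-sided p-value for a 2 x 2 table from the hypergeometric distribution. It is an illustration using only the standard library, not the software used in this study; the function name is hypothetical, and the two-sided rule shown (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows are treatment groups; columns are event / no event.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum tables at least as extreme as the observed one; the small
    # relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical counts for illustration only, not data from this study.
p = fisher_exact_two_sided(3, 1, 1, 3)
```

In the tolerability analysis, each active treatment would be compared with placebo in its own 2 x 2 table, with one table per outcome (drug-related adverse event, serious adverse event, discontinuation due to an adverse event).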