No search strategy was required because this review covered material made available by Pfizer UK Ltd in the form of clinical trial reports used in a marketing authorisation application for sildenafil (Viagra) in September 1997. QUOROM guidelines were otherwise followed. The prior intention was to use studies that were relevant to the use of sildenafil in clinical practice. This required that sildenafil be used at home rather than in the clinic, that it be taken as required rather than on a fixed dosing schedule (such as daily tablets), and that studies be of a minimum duration, which we set arbitrarily at four weeks.
Excluded were studies with laboratory measures of penile tumescence or rigidity after single doses of sildenafil, studies that only investigated erectile function in a clinic setting, studies that used fixed daily dosing rather than dosing as required, and studies that were shorter than four weeks. Included were randomised trials of sildenafil that reported efficacy or safety data, lasted at least four weeks, were conducted in the home setting, and used doses in the licensed range of 25 mg to 100 mg as required, although lower and higher doses would be analysed if there were sufficient information. Clinical trials in men with erectile dysfunction attributable to a single specific cause, such as spinal cord trauma or diabetes, were not included because combining them with the other data would introduce clinical heterogeneity.
Each report was scored for quality using a three-item quality scale with scores from 1 to 5. Points were awarded to studies according to whether they were randomised, were double blind, and mentioned withdrawals or drop-outs from the study. An additional point was awarded if the method of randomisation or double blinding was described and was appropriate.
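The scoring rule described above can be sketched as follows. This is a minimal illustration only: the class and field names are hypothetical, and it assumes that one additional point is available for each of randomisation and blinding when the method is described and appropriate, which gives the maximum of 5.

```python
from dataclasses import dataclass


@dataclass
class TrialReport:
    # Hypothetical field names, chosen for illustration only.
    randomised: bool
    double_blind: bool
    withdrawals_described: bool
    randomisation_method_appropriate: bool = False
    blinding_method_appropriate: bool = False


def quality_score(report: TrialReport) -> int:
    """Score a report on the three items, plus the additional points."""
    # One point for each of the three basic items.
    score = (
        int(report.randomised)
        + int(report.double_blind)
        + int(report.withdrawals_described)
    )
    # Additional point when the method is described and appropriate
    # (assumed here to apply separately to randomisation and blinding).
    if report.randomised and report.randomisation_method_appropriate:
        score += 1
    if report.double_blind and report.blinding_method_appropriate:
        score += 1
    return score
```

Because only randomised trials were eligible, every included report scores at least 1 under this rule.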
From each trial we extracted the number of patients treated per group, dosing regimen, study design, and the number of patients with efficacy and/or safety outcomes. The denominator was the number of patients randomised, so that results were on an intention-to-treat basis. This analysis included all randomised patients regardless of the completion of diaries, protocol concordance or missing data. Patients with missing or illegible diary data were assumed to have a 0% intercourse success rate. In addition, this analysis included sexual intercourse attempts that were unsuccessful for reasons not attributable to sildenafil, i.e. factors other than the erection being insufficiently hard or long-lasting. RAM extracted the data into tables, and these were then read and checked by other authors.
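The intention-to-treat convention described above can be sketched as follows. The function name, the use of `None` for a missing or illegible diary, and the example values are assumptions made for illustration; the threshold is passed in as a parameter rather than fixed.

```python
from typing import Optional, Sequence, Tuple


def itt_responders(
    success_rates: Sequence[Optional[float]], threshold: float
) -> Tuple[int, int]:
    """Count responders among all randomised patients.

    success_rates holds one entry per randomised patient: the proportion
    of successful attempts (0.0 to 1.0), or None when diary data are
    missing or illegible. Returns (responders, number randomised).
    """
    randomised = len(success_rates)  # denominator: everyone randomised
    responders = 0
    for rate in success_rates:
        # Missing diary data are treated as a 0% success rate.
        effective = 0.0 if rate is None else rate
        if effective > threshold:
            responders += 1
    return responders, randomised


# Made-up example: four randomised patients, one with no diary data.
print(itt_responders([0.8, None, 0.5, 0.61], threshold=0.6))
```

Keeping the patient with missing data in the denominator, and counting him as a non-responder, is what makes the estimate conservative.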
For the review, the prior definition of efficacy was a man with a consistent three-part outcome: an erection, sufficiently rigid for penetration, followed by successful intercourse. Other efficacy outcomes of interest were the number of men with the highest two responses on the International Index of Erectile Function (IIEF) questions 3 and 4, and global evaluations of treatment efficacy by patients. The number of grade 3 or 4 erections (at least hard enough for penetration) and successful erections were also noted.
Adverse events were also sought. These were the number of men with any treatment-related adverse event, the total number of men discontinuing, those discontinuing through lack of efficacy or through adverse events, adverse events rated severe or serious, and information on particular adverse events.
Outcomes actually available and chosen were:
• Number of men in whom the proportion of successful attempts at sexual intercourse was more than 60%
• Number of men in whom the proportion of successful attempts at sexual intercourse was more than 40%
• Number of men reporting that their erections had been improved on a global question (global A; "Has the treatment you have been taking over the past four weeks improved your erections?").
• Weighted mean number of weekly erections
• Weighted mean success rate
• Weighted mean weekly number of successful occasions on which intercourse occurred, calculated from these numbers
• Treatment-related adverse events
• Severe adverse events
• Serious adverse events
• Vasodilation (flushing)
• All-cause discontinuations
• Discontinuations due to inefficacy
• Discontinuations due to adverse events
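The weighted means in the list above can be illustrated with a short sketch. The source does not state the weights used; weighting each trial arm by its number of patients is an assumption made here, and the figures in the example are invented.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of per-trial values.

    Assumption: each weight is the number of patients contributing to the
    corresponding value; the source does not specify the weighting.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Invented example: two trial arms with 100 and 300 patients.
print(weighted_mean([2.0, 4.0], [100, 300]))  # 3.5
```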
A prior intention was to analyse effectiveness and harm according to dose. Dosing could be fixed, or could be optimised, where patients took an initial dose of 50 mg and then moved up to 100 mg or down to 25 mg on subsequent occasions depending on their individual judgement of the efficacy of, or adverse events caused by, that dose.
There was no intention of pooling mean data, because the results were not known to have a normal distribution; the aim was rather to find dichotomous data. Relative benefit and relative risk estimates were calculated with 95% confidence intervals using a fixed effects model. No pooling was done unless there were at least two studies or at least 200 men in the comparison. The number needed to treat (NNT) and number needed to harm (NNH), with confidence intervals, were calculated by the method of Cook and Sackett. Confidence intervals (95%) for single samples were calculated for proportions. Heterogeneity tests were not used as they have previously been shown to be unhelpful. Clinical criteria for homogeneity were defined before analysis and examined graphically. Publication bias was not assessed using funnel plots as these tests have been shown to be unhelpful [15, 16], and publication bias was not an issue here.
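As an illustration of these calculations, the sketch below computes a relative benefit (or risk) and an NNT (or NNH) from pooled dichotomous counts. Several points are assumptions rather than the authors' stated method: counts are pooled by simple addition before calculation, the relative risk interval uses the standard log-scale approximation, and the NNT interval is obtained by inverting the limits of the interval for the absolute risk difference. The function names and the example counts are invented.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def relative_risk(e_t: int, n_t: int, e_c: int, n_c: int):
    """Relative benefit or risk with a 95% CI (log-scale approximation).

    e_t/n_t: events and patients on treatment; e_c/n_c: on control.
    Requires at least one event in each group.
    """
    rr = (e_t / n_t) / (e_c / n_c)
    se_log = math.sqrt(1 / e_t - 1 / n_t + 1 / e_c - 1 / n_c)
    lower = math.exp(math.log(rr) - Z95 * se_log)
    upper = math.exp(math.log(rr) + Z95 * se_log)
    return rr, lower, upper


def number_needed(e_t: int, n_t: int, e_c: int, n_c: int):
    """NNT (or NNH) as the reciprocal of the absolute risk difference.

    The CI is the reciprocal of the CI limits of the risk difference,
    so it is only defined when that interval excludes zero.
    """
    p_t, p_c = e_t / n_t, e_c / n_c
    diff = p_t - p_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    lo_diff, hi_diff = diff - Z95 * se, diff + Z95 * se
    if lo_diff <= 0:
        raise ValueError("risk difference CI includes zero; NNT CI undefined")
    return 1 / diff, 1 / hi_diff, 1 / lo_diff


# Invented counts: 60/100 responders on treatment, 20/100 on placebo.
rr, rr_lo, rr_hi = relative_risk(60, 100, 20, 100)
nnt, nnt_lo, nnt_hi = number_needed(60, 100, 20, 100)
print(f"RB {rr:.1f} ({rr_lo:.1f} to {rr_hi:.1f})")
print(f"NNT {nnt:.1f} ({nnt_lo:.1f} to {nnt_hi:.1f})")
```

The same `number_needed` function gives an NNH when the counts are adverse events that occur more often with treatment than with control.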
Relative benefit or risk was considered to be statistically significant when the 95% confidence interval did not include 1. NNT or NNH values were only calculated when the relative risk or benefit was statistically significant, and are reported with the 95% confidence interval. Statistical significance of any difference between numbers needed to treat for different doses was assumed if there was no overlap of the confidence intervals, and additionally tested using the z statistic. Calculations were performed using Microsoft Excel 98 on a Power Macintosh G4.
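The comparison between doses can be illustrated with a z test. The source does not give the formula; the sketch below assumes the test is made on the absolute risk difference (the reciprocal of the NNT) for two independent comparisons, and all names and counts are invented.

```python
import math


def risk_difference(e_t: int, n_t: int, e_c: int, n_c: int):
    """Absolute risk difference (1/NNT) and its standard error."""
    p_t, p_c = e_t / n_t, e_c / n_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return p_t - p_c, se


def z_test_between_doses(dose_a, dose_b):
    """z statistic and two-sided p value for two risk differences.

    dose_a and dose_b are (e_t, n_t, e_c, n_c) tuples. Assumes the two
    comparisons are independent, which may not hold if they share a
    placebo group.
    """
    diff_a, se_a = risk_difference(*dose_a)
    diff_b, se_b = risk_difference(*dose_b)
    z = (diff_a - diff_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p


# Invented counts for a higher and a lower dose, each against placebo.
z, p = z_test_between_doses((60, 100, 20, 100), (40, 100, 20, 100))
print(f"z = {z:.2f}, p = {p:.3f}")
```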