Win Ratio for Hierarchical Composite Endpoints

⚗️ Frontier method: The win ratio (Pocock 2012) has been adopted rapidly in heart-failure, cardiovascular and renal trials in recent years. This tool implements a two-level unmatched comparison (level 1: survival time; level 2: a continuous secondary measure), with confidence intervals by bootstrap. Very small samples may fail to yield a CI, so enough cases are needed. For formal analysis, cross-check the results with the R packages WWR and WinRatio.

In a composite endpoint, "death" is clearly more important than "hospitalization", yet the traditional approach treats them equally. The win ratio ranks the component endpoints by pre-specified clinical importance: each pair is compared on the most important endpoint first, and moves to the secondary measure only if that comparison produces no winner. Every treatment patient is compared with every control patient, and the resulting wins, losses and ties give the Win Ratio, Win Odds and Net Benefit. Everything is computed locally in your browser; data are not uploaded.

① Paste data

One subject per row, with columns in this order: group (1=treatment, 0=control); time (time to the level-1 endpoint event or to censoring); status (1=event, 0=censored); secondary (optional, level-2 continuous value). Separate columns with spaces, tabs or commas.
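As an illustration of the input format above, here is a minimal Python sketch of how such pasted text could be parsed. It is not the tool's own code: the names `Row` and `parse_rows` are hypothetical, and the validation rules (exactly 3 or 4 fields, group and status restricted to 0/1) are assumptions drawn from the column description.

```python
import re
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    group: int                  # 1 = treatment, 0 = control
    time: float                 # level-1 follow-up time
    status: int                 # 1 = event, 0 = censored
    secondary: Optional[float]  # optional level-2 value


def parse_rows(text: str) -> List[Row]:
    """Parse one subject per line; fields split on spaces, tabs or commas."""
    rows: List[Row] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        if len(fields) not in (3, 4):
            raise ValueError(f"line {lineno}: expected 3 or 4 columns")
        group, status = int(fields[0]), int(fields[2])
        if group not in (0, 1) or status not in (0, 1):
            raise ValueError(f"line {lineno}: group and status must be 0 or 1")
        secondary = float(fields[3]) if len(fields) == 4 else None
        rows.append(Row(group, float(fields[1]), status, secondary))
    return rows


rows = parse_rows("1 10 0 5.0\n0,6,1,4.0\n\n1\t4\t1")
```

Rows without a fourth column get `secondary=None`, so those subjects can only be separated at level 1.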

How to use & methodology

How is the Win Ratio better than a traditional composite endpoint?

The traditional "time to first event" analysis treats death and hospitalization equally and looks only at each patient's first event. The win ratio ranks endpoints by clinical importance and compares the most important one first, so it reflects that "death matters more than hospitalization" and uses events that occur after the first one. It has been widely adopted in heart failure and similar fields in recent years.

How to choose among WR, Win Odds, Net Benefit?

WR = wins/losses; it ignores ties and is the most commonly reported. Win Odds = (wins + 0.5×ties)/(losses + 0.5×ties); it counts each tie as half a win and half a loss, which suits data where ties are frequent. Net Benefit = (wins−losses)/total comparisons; it ranges from −1 to 1 and is easy to interpret in absolute terms. All three can be reported together.
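The three summaries follow directly from the win, loss and tie counts. The sketch below is a small Python illustration; the function name `win_statistics` is hypothetical, and returning `None` when a denominator is zero is an assumption about how to handle that case.

```python
from typing import Dict, Optional


def win_statistics(wins: int, losses: int, ties: int) -> Dict[str, Optional[float]]:
    """Compute Win Ratio, Win Odds and Net Benefit from pairwise counts."""
    total = wins + losses + ties
    if total == 0:
        raise ValueError("no pairwise comparisons")
    # Win Ratio ignores ties; undefined when there are no losses.
    win_ratio = wins / losses if losses > 0 else None
    # Win Odds counts each tie as half a win and half a loss.
    wo_denominator = losses + 0.5 * ties
    win_odds = (wins + 0.5 * ties) / wo_denominator if wo_denominator > 0 else None
    # Net Benefit is the difference in win and loss proportions, in [-1, 1].
    net_benefit = (wins - losses) / total
    return {"win_ratio": win_ratio, "win_odds": win_odds, "net_benefit": net_benefit}


stats = win_statistics(wins=50, losses=30, ties=20)
# win_ratio = 50/30 ≈ 1.667, win_odds = 60/40 = 1.5, net_benefit = 20/100 = 0.2
```

With 20% ties, the Win Odds (1.5) sits closer to 1 than the Win Ratio (1.667), which is why it is the more conservative choice when ties are frequent.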

How to set the hierarchy?

Order from most to least clinically important. This tool defaults level 1 to the endpoint event time (e.g. cardiovascular death) and level 2 to one continuous secondary measure (e.g. quality-of-life score, 6-minute walk). If level 1 decides a winner, level 2 is not consulted.
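The two-level rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's actual implementation: at level 1 a subject wins only if the opponent has an observed event and the subject was followed strictly longer (equal times count as undecided), and at level 2 a higher secondary value is assumed to be better. The names `compare_pair` and `count_pairs` are hypothetical.

```python
from typing import List, Optional, Tuple

# (time, status, secondary); status 1 = event, 0 = censored
Subject = Tuple[float, int, Optional[float]]


def compare_pair(t: Subject, c: Subject) -> int:
    """Return +1 if the treatment subject wins, -1 if it loses, 0 if tied."""
    t_time, t_status, t_sec = t
    c_time, c_status, c_sec = c
    # Level 1: the control had the event while the treatment subject
    # was still under follow-up, so treatment wins (and vice versa).
    if c_status == 1 and t_time > c_time:
        return 1
    if t_status == 1 and c_time > t_time:
        return -1
    # Level 2 is consulted only when level 1 is undecided.
    # Assumption: a higher secondary value is better.
    if t_sec is not None and c_sec is not None:
        if t_sec > c_sec:
            return 1
        if t_sec < c_sec:
            return -1
    return 0


def count_pairs(treatment: List[Subject], control: List[Subject]) -> Tuple[int, int, int]:
    """Compare every treatment subject with every control subject."""
    wins = losses = ties = 0
    for t in treatment:
        for c in control:
            result = compare_pair(t, c)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    return wins, losses, ties


treatment = [(10.0, 0, 5.0), (4.0, 1, 3.0)]
control = [(6.0, 1, 4.0), (12.0, 0, 2.0)]
counts = count_pairs(treatment, control)  # (2, 2, 0)
```

In this toy data, the pair (10.0, censored) versus (12.0, censored) is undecided at level 1 and is resolved at level 2 because 5.0 > 2.0. If lower values of the secondary measure are better, the level-2 comparison would need to be reversed.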

Why does a confidence interval sometimes fail to compute?

The CI is computed by bootstrap, which requires each resample to contain both a 'win' and a 'loss'. With very small samples (e.g. 2 per group) most resamples cannot satisfy both conditions, so the CI shows 'needs more data'. A larger sample yields a stable CI.
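The failure mode can be reproduced with a minimal Python sketch of a percentile bootstrap. The source does not say which bootstrap variant the tool uses, so the percentile interval, the resampling within each group, and the `min_valid_fraction` threshold are all assumptions. The names `win_ratio` and `bootstrap_ci` are hypothetical, and the demo uses a simple "higher value wins" comparison in place of the two-level rule.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def win_ratio(treatment: Sequence, control: Sequence,
              compare: Callable) -> Optional[float]:
    """Win ratio over all pairs; None unless both a win and a loss occur."""
    wins = losses = 0
    for t in treatment:
        for c in control:
            result = compare(t, c)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
    if wins == 0 or losses == 0:
        return None
    return wins / losses


def bootstrap_ci(treatment: Sequence, control: Sequence, compare: Callable,
                 n_boot: int = 1000, alpha: float = 0.05,
                 min_valid_fraction: float = 0.9,
                 seed: int = 0) -> Optional[Tuple[float, float]]:
    """Percentile bootstrap CI; None when too many resamples are unusable."""
    rng = random.Random(seed)
    estimates: List[float] = []
    for _ in range(n_boot):
        # Resample each group separately, with replacement.
        t_star = rng.choices(treatment, k=len(treatment))
        c_star = rng.choices(control, k=len(control))
        wr = win_ratio(t_star, c_star, compare)
        if wr is not None:
            estimates.append(wr)
    if len(estimates) < min_valid_fraction * n_boot:
        return None  # corresponds to the 'needs more data' case
    estimates.sort()
    n = len(estimates)
    lower = estimates[int((alpha / 2) * n)]
    upper = estimates[min(n - 1, int((1 - alpha / 2) * n))]
    return lower, upper


def higher_wins(t: float, c: float) -> int:
    return (t > c) - (t < c)


tiny = bootstrap_ci([1.0, 3.0], [2.0, 4.0], higher_wins)
larger = bootstrap_ci([float(i) for i in range(5, 25)],
                      [float(i) for i in range(0, 20)], higher_wins)
```

With 2 subjects per group, only about half of the resamples contain both a win and a loss, so `tiny` is `None`. With 20 per group, almost every resample is usable and `larger` is a (lower, upper) pair.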