DeLong Test: Comparing the AUC of Two ROC Curves
Tests whether the areas under the ROC curve (AUC) of two markers/models, measured on the same subjects, differ significantly. The DeLong method accounts for the correlation between the two curves (paired ROC) and reports each AUC with its 95% CI, the AUC difference, and the test P value. Everything is computed locally in your browser; data are not uploaded. To plot a single ROC, use the ROC tool.
① Paste data
First row = column names, then one subject per row: an outcome column (1=positive/0=negative) and two markers' predicted values/scores. The two markers must come from the same subjects. You can copy-paste from Excel.
How to use & methodology
When should the DeLong test be used?
When you compare the ROC AUCs of two diagnostic markers or two prediction models on the same subjects and want to know which discriminates better and whether the difference is significant. The two ROC curves come from the same people and are correlated; the DeLong method specifically handles this paired correlation.
How does it differ from just looking at the two AUC confidence intervals?
Checking whether the two AUC confidence intervals overlap is not a rigorous test: it ignores the correlation between the two curves and is conservative, so a real difference can be missed. DeLong tests the AUC difference directly, using the paired information, which is more accurate and more powerful.
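As an illustration of what "testing the difference directly" involves, here is a minimal Python sketch of the paired DeLong computation. It is not this tool's own code; the function names (placement_values, delong_paired), the 1e-12 variance tolerance, the fixed 1.96 multiplier and the clipping of the CI to [0, 1] are assumptions made for the example. It needs only NumPy.

```python
import math
import numpy as np

def placement_values(pos, neg):
    """Return V10 (one value per positive) and V01 (one value per negative)."""
    # psi = 1 if the positive scores higher, 0.5 on a tie, 0 otherwise
    psi = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
    return psi.mean(axis=1), psi.mean(axis=0)

def delong_paired(outcome, marker_a, marker_b):
    """Paired DeLong comparison of two markers measured on the same subjects."""
    y = np.asarray(outcome)
    m, n = int((y == 1).sum()), int((y == 0).sum())
    if m < 2 or n < 2:
        raise ValueError("need at least 2 positives and 2 negatives")

    v10, v01 = [], []
    for scores in (marker_a, marker_b):
        s = np.asarray(scores, dtype=float)
        p, q = placement_values(s[y == 1], s[y == 0])
        v10.append(p)
        v01.append(q)

    # The AUC is the mean placement value over the positives
    auc = [float(v.mean()) for v in v10]
    # Variance of each AUC on its own (used for the per-marker CI)
    var = [float(v10[k].var(ddof=1) / m + v01[k].var(ddof=1) / n) for k in range(2)]
    # Variance of the difference; subtracting subject by subject is what
    # carries the correlation between the two curves into the test
    var_diff = float((v10[0] - v10[1]).var(ddof=1) / m
                     + (v01[0] - v01[1]).var(ddof=1) / n)
    if var_diff <= 1e-12:  # tolerance is an assumption of this sketch
        raise ValueError("variance degenerate")

    diff = auc[0] - auc[1]
    z = diff / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    # 95% CI per AUC, clipped to [0, 1] (a choice made for this sketch)
    ci = [(max(0.0, a - 1.96 * math.sqrt(v)), min(1.0, a + 1.96 * math.sqrt(v)))
          for a, v in zip(auc, var)]
    return {"auc": auc, "ci": ci, "diff": diff, "z": z, "p": p_value}
```

Calling delong_paired(outcome, marker_a, marker_b) with three equal-length sequences returns the two AUCs, their intervals, the difference, the z statistic and the two-sided P value. The pairwise comparison matrix grows with positives × negatives, so this form suits moderate sample sizes.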
How is the data prepared?
One subject per row, with the outcome (1/0) and the values of two markers. A predicted probability, a score or a raw measurement are all acceptable, as long as a larger value means "more likely positive". Both markers must be measured on the same subjects.
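The layout described above can be read with a few lines of Python. This is only a sketch of one way to parse such a paste, not the tool's own parser: the function name parse_pasted is hypothetical, and it assumes the first three columns are, in order, the outcome and the two markers, separated by tabs (as in an Excel copy-paste) or by commas.

```python
import csv
import io

def parse_pasted(text):
    """Parse pasted rows into (column_names, outcome, marker_1, marker_2)."""
    # Excel copy-paste is tab-separated; fall back to commas if no tab is found
    delimiter = "\t" if "\t" in text else ","
    reader = csv.reader(io.StringIO(text.strip()), delimiter=delimiter)
    rows = [r for r in reader if any(cell.strip() for cell in r)]
    if len(rows) < 2 or len(rows[0]) < 3:
        raise ValueError("need a header row with 3 columns and at least one data row")

    names = [h.strip() for h in rows[0][:3]]
    outcome, marker_1, marker_2 = [], [], []
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) < 3:
            raise ValueError(f"row {line_no}: expected 3 columns")
        label = row[0].strip()
        if label not in ("0", "1"):
            raise ValueError(f"row {line_no}: outcome must be 1 or 0")
        outcome.append(int(label))
        marker_1.append(float(row[1]))  # raises ValueError on non-numeric input
        marker_2.append(float(row[2]))
    return names, outcome, marker_1, marker_2
```

Because every row carries one outcome and both marker values, the three returned lists always have the same length, which is what keeps the two markers paired.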
What does "variance degenerate" mean?
When a marker perfectly separates positives from negatives (AUC=1) or its values are almost constant, the DeLong variance estimate is 0 (or numerically indistinguishable from 0), so the test statistic cannot be computed. This is usually caused by too small a sample or by unusual data: collect more subjects or check the data.
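To see why these cases break down, the following Python sketch computes the DeLong variance of a single marker's AUC and flags it when it is effectively zero. The function names (auc_and_variance, is_degenerate) and the tolerance tol=1e-12 are assumptions for illustration; the tool's actual check may differ.

```python
import numpy as np

def auc_and_variance(outcome, scores):
    """Return the AUC of one marker and its DeLong variance estimate."""
    y = np.asarray(outcome)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y == 1], s[y == 0]
    # psi = 1 if the positive scores higher, 0.5 on a tie, 0 otherwise
    psi = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
    v10, v01 = psi.mean(axis=1), psi.mean(axis=0)
    variance = v10.var(ddof=1) / len(pos) + v01.var(ddof=1) / len(neg)
    return float(psi.mean()), float(variance)

def is_degenerate(outcome, scores, tol=1e-12):
    """True when the variance estimate is (numerically) zero."""
    _, variance = auc_and_variance(outcome, scores)
    return variance <= tol
```

With perfect separation every positive outranks every negative, so all placement values equal 1 and their sample variance is 0. A constant marker ties every pair, so all placement values equal 0.5 and the variance is again 0.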