
Cluster-Randomized Trials: Design Effect & Sample Size

When randomization is by cluster (department, community, hospital), individuals within a cluster are correlated, and each carries less information than an independent subject. The sample size must therefore be inflated by the design effect DE = 1 + (m−1)·ICC, where m is the average cluster size and ICC is the intraclass correlation. This tool multiplies an individually-randomized sample size by DE to give the number of subjects and clusters a cluster trial needs.
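The calculation can be sketched in a few lines of Python. This is a minimal illustration of the formula above, not the tool's actual implementation: the function name `cluster_sample_size`, its parameter names, and the choice to round clusters up to a whole cluster per arm are assumptions made for the example.

```python
import math

def cluster_sample_size(n_individual_per_arm, m, icc):
    """Scale a per-arm individually-randomized sample size by DE = 1 + (m - 1) * ICC.

    Returns (design effect, inflated subjects per arm, clusters per arm).
    """
    if m < 1 or not 0 <= icc <= 1:
        raise ValueError("need m >= 1 and 0 <= icc <= 1")
    de = 1 + (m - 1) * icc
    # round() before ceil() guards against floating-point noise such as 195.00000000000003
    n_per_arm = math.ceil(round(n_individual_per_arm * de, 9))
    # Assumption: clusters per arm are rounded up to a whole cluster
    clusters_per_arm = math.ceil(round(n_individual_per_arm * de / m, 9))
    return de, n_per_arm, clusters_per_arm

de, n_per_arm, clusters_per_arm = cluster_sample_size(200, 30, 0.02)
print(f"DE = {de:.2f}, subjects per arm = {n_per_arm}, clusters per arm = {clusters_per_arm}")
```

With 200 subjects per arm under individual randomization, m = 30 and ICC = 0.02, the design effect is 1.58, so each arm needs 316 subjects, or 11 clusters of about 30.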

① Source of the individual-randomization sample size

Per-arm individual sample size

② Cluster parameters

Average cluster size m

Intraclass correlation ICC

How to use & methodology

Why must cluster randomization inflate the sample?

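Individuals within the same cluster (department/community) are more alike than individuals from different clusters, so each provides less independent information than a fully independent subject. Ignoring this overstates precision and inflates the false-positive rate. Dividing the actual sample size by the design effect gives the 'effective sample size': the number of independent subjects that would carry the same information.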

What value should be used for the ICC?

The intraclass correlation ICC reflects how similar individuals within a cluster are, typically 0.01–0.10. It is best to cite ICCs reported by previous cluster trials in the same field; when uncertain, run a sensitivity analysis (try several values and watch how the sample size changes).
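A sensitivity analysis of this kind might look like the sketch below. The baseline of 200 subjects per arm, the cluster size of 30, and the candidate ICC values are illustrative assumptions, not recommendations; `icc_sensitivity` is a hypothetical helper name.

```python
import math

def design_effect(m, icc):
    """DE = 1 + (m - 1) * ICC."""
    return 1 + (m - 1) * icc

def icc_sensitivity(n_individual_per_arm, m, icc_values):
    """Return (ICC, DE, inflated per-arm sample size) for each candidate ICC."""
    rows = []
    for icc in icc_values:
        de = design_effect(m, icc)
        # round() before ceil() avoids bumping up a size because of float noise
        n = math.ceil(round(n_individual_per_arm * de, 9))
        rows.append((icc, de, n))
    return rows

rows = icc_sensitivity(200, 30, [0.01, 0.02, 0.05, 0.10])
for icc, de, n in rows:
    print(f"ICC={icc:.2f}  DE={de:.2f}  per-arm n={n}")
```

In this example the per-arm sample size rises from 258 at ICC = 0.01 to 780 at ICC = 0.10, which shows how strongly the result depends on the ICC assumption.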

How is the cluster size m chosen?

m is the average number of individuals enrolled per cluster. For a given ICC, the larger the cluster, the more redundancy within it and the larger the DE. For a fixed total sample size, many small clusters are therefore usually more efficient than a few large ones.
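The trade-off can be illustrated by comparing effective sample sizes (total subjects divided by DE) for two hypothetical designs with the same total of 600 subjects. The cluster counts, sizes, and the ICC of 0.05 are assumptions chosen only for the example.

```python
def effective_sample_size(n_clusters, m, icc):
    """Total subjects divided by the design effect DE = 1 + (m - 1) * ICC."""
    total = n_clusters * m
    return total / (1 + (m - 1) * icc)

# Same total of 600 subjects, split two different ways
many_small = effective_sample_size(n_clusters=60, m=10, icc=0.05)  # DE = 1.45
few_large = effective_sample_size(n_clusters=10, m=60, icc=0.05)   # DE = 3.95

print(f"60 clusters of 10: effective n = {many_small:.1f}")
print(f"10 clusters of 60: effective n = {few_large:.1f}")
```

Under these assumptions, 60 clusters of 10 are worth roughly 414 independent subjects, while 10 clusters of 60 are worth roughly 152.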

Can the result go straight into a protocol?

It can serve as the primary basis for the cluster-trial sample size. When the number of clusters is small (e.g. <10 per arm), the degrees of freedom of the t-test are determined by the number of clusters rather than the number of individuals, so a dedicated method or software is advised; this tool gives only the common approximation.