Design Issues for Partially Clustered Trials

Yelland, Lisa; Sullivan, Thomas; Kasza, Jessica (Monash University); Lange, Kylie Megan
2025-07-29; 2025
https://hdl.handle.net/2440/146401

Background: Many clinical trials involve partially clustered data, where some observational units are independent and others belong to a cluster. Examples occur in neonatology, where participants include infants from both singleton and multiple births, and in ophthalmology, where one or two eyes per participant may need treatment. The statistical implications of partial clustering are not widely recognised and are often ignored in trial design and analysis, which can lead to over- or underpowered trials and incorrect type I error rates. More research into partially clustered trials is needed.

Aims: The aim of the thesis is to improve the design of partially clustered trials. Specific objectives include:
1. Developing standardised definitions, terminology and reporting guidelines.
2. Assessing the performance of existing sample size and analysis methods for designs with a maximum cluster size of 2.
3. Developing new sample size methods for designs with a maximum cluster size greater than 2.
4. Translating sample size methods into online tools and providing guidance on their use.

Methods: Published partially clustered trials and CONSORT guidelines for other clustered trial designs were reviewed to develop terminology and propose reporting guidelines for the range of designs used in practice. Data simulation was used to evaluate existing sample size and analysis methods. Design effects for sample size estimation when the maximum cluster size exceeds 2 were derived algebraically and validated using data simulation. An online sample size tool was developed using R Shiny.

Results: A unified definition of partially clustered trials, terminology to describe the different types of designs, and reporting recommendations are presented.
When clusters have a maximum size of 2, generalised estimating equations (GEEs) with an independence working correlation structure perform well for analysis. GEEs with an exchangeable working correlation structure and mixed effects models also often perform well, though under-coverage and inflated type I error rates can occur in some settings. If a mixed model analysis is planned, the trial can be designed using an existing formula for the exchangeable GEE. New design effects for a maximum cluster size exceeding 2 are presented and shown to depend on the intracluster correlation coefficient, the range of cluster sizes, the proportion of observations belonging to clusters of each size, the randomisation method, the type of outcome (continuous or binary), and the working correlation structure and link function of the analysis model. The design effects are validated across a range of realistic settings and implemented in an online sample size calculator with a step-by-step tutorial.

Conclusions: Partially clustered trials are common; however, there are substantial limitations in how they are currently designed, analysed and reported. This thesis comprehensively evaluates key statistical aspects of partially clustered trials, with a focus on their design. It provides terminology to describe these designs consistently, reporting recommendations, a greater understanding of the performance of available analysis methods, and new theory and practical tools to support sample size planning. It is hoped this work will improve the design of future partially clustered trials.

en
clinical trials; power; sample size; partial clustering; clustered data
Thesis
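
As a rough illustration of how a design effect can depend on the intracluster correlation coefficient and on the proportion of observations in clusters of each size, the sketch below uses a standard textbook approximation. It is not one of the design effects derived in the thesis. It assumes a continuous outcome, cluster randomisation (all members of a cluster receive the same treatment), and an analysis with an independence working correlation structure. The function names and the example inputs are hypothetical.

```python
import math


def design_effect_independence(icc, obs_proportions):
    """Approximate design effect under the assumptions stated above.

    icc: intracluster correlation coefficient.
    obs_proportions: dict mapping cluster size -> proportion of all
        observations that belong to clusters of that size (sums to 1).
    """
    total = sum(obs_proportions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("proportions of observations must sum to 1")
    # A cluster of size m inflates the variance by 1 + (m - 1) * icc;
    # weight each inflation factor by the share of observations it covers.
    return sum(p * (1 + (m - 1) * icc) for m, p in obs_proportions.items())


def inflate_sample_size(n_independent, design_effect):
    """Scale a sample size computed for independent observations."""
    return math.ceil(n_independent * design_effect)


# Hypothetical inputs: half of observations are independent (size 1),
# half come from clusters of size 2, with an ICC of 0.5.
de = design_effect_independence(0.5, {1: 0.5, 2: 0.5})
n_total = inflate_sample_size(200, de)
```

With these inputs the design effect is 1.25, so a trial needing 200 independent observations would need 250 observations in total. Other randomisation methods, binary outcomes, or an exchangeable working correlation structure would require different formulas.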