Power-maximizing sample design for cluster Randomized Control Trials with only an endline measurement, subject to a cost constraint.

This function returns the number of clusters and the number of units (individuals, firms, etc.) within each cluster that maximize power subject to a cost constraint in a cluster Randomized Control Trial (RCT). The function assumes that the outcome variable is continuous and has been measured only at endline. It also assumes that the cost function includes a fixed cost per cluster and a variable cost per unit within a cluster, each of which can differ between treatment and control clusters: \(Costs = k_0 (f_0 + v_0 m_0) + k_1 (f_1 + v_1 m_1)\), where \(k_0\) and \(k_1\) are the numbers of control and treatment clusters, and \(m_0\) and \(m_1\) are the numbers of units per control and treatment cluster. The function provides the optimal number of clusters and units within clusters for three cases: (1) both the number of clusters and the number of units per cluster may differ between treatment and control arms; (2) the number of clusters may differ between arms, but the number of units per cluster is constrained to be the same in both arms; and (3) the number of units per cluster may differ between arms, but the number of clusters is constrained to be the same in both arms.
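As a minimal sketch of the cost function above, the snippet below evaluates the total cost of a given design. The helper name `total_cost` is hypothetical and not part of the package. The numbers are scenario 1 from the second example in the Examples section.

```r
# Hypothetical helper (not part of the package): total cost of a design,
# Costs = k0 * (f0 + v0 * m0) + k1 * (f1 + v1 * m1)
total_cost <- function(k0, m0, k1, m1, v0, v1, f0, f1) {
  k0 * (f0 + v0 * m0) + k1 * (f1 + v1 * m1)
}

# Optimal design reported for scenario 1 of the second example
cost1 <- total_cost(k0 = 87.968898, m0 = 7.958223,
                    k1 = 14.661482, m1 = 12.468141,
                    v0 = 150, v1 = 2200, f0 = 500, f1 = 18000)

budget <- 815065.655          # the value of C for that scenario
abs(cost1 - budget) < 1       # the design uses the budget up to rounding
```

In that scenario the reported design spends essentially the whole budget, which is what one would expect when power increases with sample size.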

Usage

maxpower.opt(
  delta,
  sigma,
  rho,
  alpha,
  C,
  q = 1,
  v0,
  v1,
  f0,
  f1,
  optimal.s = c("CLUST-IND", "CONST-IND", "CONST-CLUST"),
  initial.cond = NULL,
  seed = 210613,
  lb = NULL,
  ub = NULL,
  temp = NULL,
  output = NULL
)

Arguments

delta: Vector. Size of the effect on the outcome variable (effect size measured in same units as the outcome variable).
sigma: Vector. Standard deviation of the outcome variable.
rho: Vector. Intra-cluster correlation.
alpha: Vector. Significance level for the null hypothesis of no effect.
C: Vector. Maximum level of costs of implementing the RCT. It includes data collection costs (baseline and endline) and the costs of implementing the intervention under study.
q: The test of the null hypothesis of no effect uses (K - q) degrees of freedom, where K is the total number of clusters. Default is 1.
v0: Vector. Variable costs per unit in the control clusters. It includes the cost of data collection (baseline and endline) and the cost of implementing the intervention under study.
v1: Vector. Variable costs per unit in the treatment clusters. It includes the cost of data collection (baseline and endline) and the cost of implementing the intervention under study.
f0: Vector. Fixed costs per control cluster. It includes the total fixed cost: baseline and endline.
f1: Vector. Fixed costs per treatment cluster. It includes the total fixed cost: baseline and endline.
optimal.s: Indicates whether the sample design should constrain the number of units per cluster to be the same in treatment and control clusters ("CONST-IND"), constrain the number of treatment and control clusters to be the same ("CONST-CLUST"), or leave the solution fully unconstrained ("CLUST-IND").
initial.cond: Vector. Initial values that the optimization routine will use for the number of sample units per cluster and the number of clusters, in this order: m0, m1, k0, k1. Default is NULL, in which case the function computes these initial conditions.
seed: Integer. Seed for the random number generator that the optimization routine GenSA will use. Default is 210613.
lb: Vector. Minimum possible value for the optimal number of clusters and optimal number of units. Default is 1 for each parameter.
ub: Vector. Maximum possible value for the optimal number of clusters and optimal number of units. Default is 1000 for each parameter.
temp: Numeric. Temperature parameter for the GenSA optimization function. Default is NULL, in which case, the default value in GenSA function will be used.
output: Indicates the name of the xlsx file where you want to save the results. Default is NULL, in which case, the results will be presented in a matrix.

Value

maxpower.opt returns a matrix of size (14 x number of scenarios). Each scenario is a combination of the specified parameters and of the fixed and variable costs. For each scenario, the matrix provides the following components:

scenario: This is a vector of the number of the scenario displayed.
delta: This is a vector of the size of the effect on the outcome variable.
sigma: This is a vector of the standard deviation of the outcome variable.
rho: This is a vector of the intra-cluster correlation.
C: This is a vector of the maximum level of total costs of implementing the RCT. It includes data collection costs (baseline and endline) and the costs of implementing the intervention under study.
v0: This is a vector of the variable cost per control unit.
v1: This is a vector of the variable costs per treatment unit.
f0: This is a vector of the fixed costs per control cluster.
f1: This is a vector of the fixed costs per treatment cluster.
k0: This is a vector of the optimum number of control clusters that maximize power.
k1: This is a vector of the optimum number of treatment clusters that maximize power.
m0: This is a vector of the optimum number of sample units per control cluster that maximize power.
m1: This is a vector of the optimum number of sample units per treatment cluster that maximize power.
power: This is the vector of the power of the RCT with the optimum number of clusters and units provided by this function.
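
The power row can be sanity-checked by hand. The sketch below is an assumption and is not taken from the package source: it uses the usual design-effect variance for a difference in cluster-level means and a t-based approximation with (k0 + k1 - q) degrees of freedom. The helper name `approx_power` is hypothetical.

```r
# Hypothetical helper (not part of the package). The formula is an assumed,
# textbook-style approximation, not necessarily what maxpower.opt computes.
approx_power <- function(delta, sigma, rho, alpha, k0, m0, k1, m1, q = 1) {
  # Variance of each arm's mean, inflated by the design effect 1 + (m - 1) * rho
  var0 <- sigma^2 * (1 + (m0 - 1) * rho) / (k0 * m0)
  var1 <- sigma^2 * (1 + (m1 - 1) * rho) / (k1 * m1)
  se <- sqrt(var0 + var1)     # standard error of the treatment-control difference
  df <- k0 + k1 - q           # degrees of freedom of the two-sided t-test
  pt(delta / se - qt(1 - alpha / 2, df), df)
}

# Scenario 1 of the second example in the Examples section
p1 <- approx_power(delta = 0.25, sigma = 1, rho = 0.05, alpha = 0.05,
                   k0 = 87.968898, m0 = 7.958223,
                   k1 = 14.661482, m1 = 12.468141)
p1  # compare with the reported power of 0.674025
```

If the approximation is appropriate, `p1` should land close to the reported value; a large gap would suggest that the assumed formula differs from the one the function uses.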

References

McConnell and Vera-Hernández (2022). More Powerful Cluster Randomized Control Trials. Mimeo.

Author

Nancy A. Daza-Báez, n.baez@ucl.ac.uk

Brendon McConnell, B.I.Mcconnell@soton.ac.uk

Marcos Vera-Hernández, m.vera@ucl.ac.uk

Examples


## In this example, both the fixed cost per cluster and the variable cost per unit within a cluster differ between treatment and control.
## There are three scenarios, each with a different total cost. The syntax (optimal.s = "CLUST-IND") allows both the optimal number of clusters and the number of units per
## cluster to differ between the treatment and control arms. The results will be saved in the "myresults.xlsx" file.

maxpower.opt(delta = 0.25,
             sigma = 1,
             rho = 0.05,
             alpha = 0.05,
             C = c(815052.294, 974856.169, 1095876.675),
             v0 = 150,
             v1 = 2200,
             f0 = 500,
             f1 = 18000,
             optimal.s = "CLUST-IND",
             output = "myresults")

## If you wish, you can specify initial conditions for the optimization algorithm: m0=20, m1=18, k0=15 and k1=18.

maxpower.opt(delta = 0.25,
             sigma = 1,
             rho = 0.05,
             alpha = 0.05,
             C = c(815065.655, 877717.857, 995811.458),
             v0 = 150,
             v1 = 2200,
             f0 = c(500, 1500, 5000),
             f1 = 18000,
             optimal.s = "CLUST-IND",
             initial.cond = c(20, 18, 15, 18))
#>                      1              2              3
#> scenario      1.000000      2.0000000      3.0000000
#> delta         0.250000      0.2500000      0.2500000
#> sigma         1.000000      1.0000000      1.0000000
#> rho           0.050000      0.0500000      0.0500000
#> C        815065.655000 877717.8570000 995811.4580000
#> v0          150.000000    150.0000000    150.0000000
#> v1         2200.000000   2200.0000000   2200.0000000
#> f0          500.000000   1500.0000000   5000.0000000
#> f1        18000.000000  18000.0000000  18000.0000000
#> k0           87.968898     52.6143724     30.4356543
#> k1           14.661482     15.1884612     16.0409986
#> m0            7.958223     13.7840489     25.1661136
#> m1           12.468141     12.4681412     12.4681407
#> power         0.674025      0.6677639      0.6536381

## This is an example with three scenarios, each with a different value of the fixed cost per cluster in the treatment group (f1).
## The syntax (optimal.s = "CONST-IND") requests that the number of units per cluster is constrained to be the same in treatment as in control.

maxpower.opt(delta = 0.25,
             sigma = 1,
             rho = 0.27,
             alpha = 0.05,
             C = c(75862.836, 145230.184, 204196.756),
             v0 = 25,
             v1 = 100,
             f0 = 381,
             f1 = c(500, 1981, 3500),
             optimal.s = "CONST-IND")
#>                      1              2              3
#> scenario     1.0000000      2.0000000      3.0000000
#> delta        0.2500000      0.2500000      0.2500000
#> sigma        1.0000000      1.0000000      1.0000000
#> rho          0.2700000      0.2700000      0.2700000
#> C        75862.8360000 145230.1840000 204196.7560000
#> v0          25.0000000     25.0000000     25.0000000
#> v1         100.0000000    100.0000000    100.0000000
#> f0         381.0000000    381.0000000    381.0000000
#> f1         500.0000000   1981.0000000   3500.0000000
#> k0          64.1971019     81.6868247     92.6482906
#> k1          46.2134146     37.2016788     34.2375765
#> m0           4.5447816      7.0129476      8.5481776
#> m1           4.5447816      7.0129476      8.5481776
#> power        0.4971883      0.5342804      0.5467766

## This is an example with three scenarios, each with a different value of the variable cost per unit in the treatment group (v1).
## The syntax (optimal.s = "CONST-CLUST") requests that the number of clusters is constrained to be the same in treatment as in control.

maxpower.opt(delta = 0.25,
             sigma = 1,
             rho = 0.05,
             alpha = 0.05,
             C = c(144412.242, 251543.646, 384610.811),
             v0 = 150,
             v1 = c(250, 750, 1500),
             f0 = 500,
             f1 = 18000,
             optimal.s = "CONST-CLUST")
#>                       1              2              3
#> scenario      1.0000000      2.0000000      3.0000000
#> delta         0.2500000      0.2500000      0.2500000
#> sigma         1.0000000      1.0000000      1.0000000
#> rho           0.0500000      0.0500000      0.0500000
#> C        144412.2420000 251543.6460000 384610.8110000
#> v0          150.0000000    150.0000000    150.0000000
#> v1          250.0000000    750.0000000   1500.0000000
#> f0          500.0000000    500.0000000    500.0000000
#> f1        18000.0000000  18000.0000000  18000.0000000
#> k0            4.7719101      7.1633417      9.6463857
#> k1            4.7719101      7.1633417      9.6463857
#> m0           34.2296240     34.2296138     34.2296186
#> m1           26.5141636     15.3079510     10.8243554
#> power         0.1888505      0.2731188      0.3375474
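
The reported optima are generally not integers. One simple way to turn them into an implementable design, shown here only as a hedged sketch and not as a feature of the package, is to round every value down and confirm that the rounded design still fits the budget. The numbers are scenario 3 of the "CONST-CLUST" example above.

```r
# Round the reported optimum for scenario 3 down to whole clusters and units
k  <- floor(9.6463857)    # same number of clusters in both arms
m0 <- floor(34.2296186)   # units per control cluster
m1 <- floor(10.8243554)   # units per treatment cluster

# Cost of the rounded design: k0 * (f0 + v0 * m0) + k1 * (f1 + v1 * m1)
rounded_cost <- k * (500 + 150 * m0) + k * (18000 + 1500 * m1)

budget <- 384610.811      # the value of C for that scenario
rounded_cost <= budget    # rounding down can never exceed the budget
```

Rounding down leaves part of the budget unspent and will generally yield somewhat lower power than the fractional optimum, so it may be worth comparing a few neighboring integer designs that remain within the budget.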