Cost-minimizing sample designs for cluster Randomized Control Trials with only an endline measurement, subject to a power constraint.

This function returns the number of clusters and the number of units (individuals, firms, etc.) per cluster that minimize total cost subject to a power constraint in a cluster Randomized Control Trial (RCT). It assumes that the outcome variable is continuous and has been measured only at endline. It also assumes that the cost function includes a fixed cost per cluster and a variable cost per unit within a cluster, each of which can differ between treatment and control clusters: \(Cost = k_0 (f_0 + v_0 m_0) + k_1 (f_1 + v_1 m_1)\), where \(k_0\) and \(k_1\) are the numbers of control and treatment clusters, \(m_0\) and \(m_1\) are the numbers of units per control and treatment cluster, \(f_0\) and \(f_1\) are the fixed costs per cluster, and \(v_0\) and \(v_1\) are the variable costs per unit. The function provides the optimal number of clusters and units per cluster for three cases: (1) both the number of clusters and the number of units per cluster are allowed to differ between treatment and control arms; (2) the number of clusters is allowed to differ between arms, but the number of units per cluster is constrained to be the same in both arms; and (3) the number of units per cluster is allowed to differ between arms, but the number of clusters is constrained to be the same in both arms.
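The cost function above can be sketched in a few lines of R. This is only an illustration: `total_cost` is a hypothetical helper, not a function exported by the package. The inputs are the rounded values printed for scenario 1 of the `initial.cond` example in the Examples section, so the result should come close to the cost reported there, up to rounding of the printed inputs.

```r
# Total cost of a two-arm cluster RCT, following the cost function above.
# k0, k1: clusters per arm; m0, m1: units per cluster;
# f0, f1: fixed cost per cluster; v0, v1: variable cost per unit.
# NOTE: total_cost is a hypothetical helper, not part of the package.
total_cost <- function(k0, k1, m0, m1, f0, f1, v0, v1) {
  k0 * (f0 + v0 * m0) + k1 * (f1 + v1 * m1)
}

# Design printed for scenario 1 (f0 = 500) of the initial.cond example
cost_s1 <- total_cost(k0 = 118.035743, k1 = 19.672624,
                      m0 = 7.958224,  m1 = 12.468141,
                      f0 = 500, f1 = 18000, v0 = 150, v1 = 2200)
cost_s1
```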

Usage

mincost.opt(
  delta,
  sigma,
  rho,
  alpha,
  beta,
  q = 1,
  v0,
  v1,
  f0,
  f1,
  optimal.s = c("CLUST-IND", "CONST-IND", "CONST-CLUST"),
  initial.cond = NULL,
  seed = 210613,
  lb = NULL,
  ub = NULL,
  temp = NULL,
  output = NULL
)

Arguments

delta: Vector. Size of the effect on the outcome variable, measured in the same units as the outcome variable.
sigma: Vector. Standard deviation of the outcome variable.
rho: Vector. Intra-cluster correlation.
alpha: Vector. Significance level for the null hypothesis of no effect.
beta: Vector. Required statistical power of the RCT (for example, 0.80).
q: The test of the null hypothesis of no effect uses (K - q) degrees of freedom, where K is the total number of clusters. Default is 1.
v0: Vector. Variable costs per unit in the control clusters. It includes the cost of data collection (baseline and endline) and the cost of implementing the intervention under study.
v1: Vector. Variable costs per unit in the treatment clusters. It includes the cost of data collection (baseline and endline) and the cost of implementing the intervention under study.
f0: Vector. Fixed costs per control cluster. It includes the total fixed cost: baseline and endline.
f1: Vector. Fixed costs per treatment cluster. It includes the total fixed cost: baseline and endline.
optimal.s: Indicates whether the sample design should constrain the number of units per cluster to be the same in treatment and control clusters ("CONST-IND"), constrain the number of treatment and control clusters to be the same ("CONST-CLUST"), or leave the solution fully unconstrained ("CLUST-IND").
initial.cond: Vector. Initial values that the optimization routine will use for the number of sample units per cluster and the number of clusters, in the order (m0, m1, k0, k1). Default is NULL, in which case the function computes these initial conditions.
seed: Integer. Seed for the random number generator that the optimization routine GenSA will use. Default is 210613.
lb: Vector. Minimum possible value for the optimal number of clusters and optimal number of units. Default is 1 for each parameter.
ub: Vector. Maximum possible value for the optimal number of clusters and optimal number of units. Default is 1000 for each parameter.
temp: Numeric. Temperature parameter for the GenSA optimization function. Default is NULL, in which case, the default value in GenSA function will be used.
output: Indicates the name of the xlsx file where you want to save the results. Default is NULL, in which case, the results will be presented in a matrix.
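To see how delta, sigma, rho, alpha, beta and q could enter the power constraint, here is a minimal sketch. It assumes the standard t-based minimum detectable effect (MDE) formula for a two-arm cluster design with equal cluster sizes within each arm and K = k0 + k1; the source does not state the exact formula the function uses, and `mde_cluster` is a hypothetical helper. The inputs are the values printed for scenario 1 of the "CONST-CLUST" example, so under this assumption the MDE should be close to delta = 0.25.

```r
# Minimum detectable effect of a two-arm cluster RCT with an endline-only,
# continuous outcome. ASSUMPTION: standard t-based formula with K - q degrees
# of freedom, K = k0 + k1. mde_cluster is a hypothetical helper.
mde_cluster <- function(k0, k1, m0, m1, sigma, rho, alpha, beta, q = 1) {
  df <- k0 + k1 - q
  # Variance of each arm's mean, inflated by the design effect 1 + (m - 1) * rho
  var0 <- sigma^2 * (1 + (m0 - 1) * rho) / (m0 * k0)
  var1 <- sigma^2 * (1 + (m1 - 1) * rho) / (m1 * k1)
  (qt(1 - alpha / 2, df) + qt(beta, df)) * sqrt(var0 + var1)
}

# Design printed for scenario 1 (v1 = 250) of the "CONST-CLUST" example
mde_s1 <- mde_cluster(k0 = 21.53244, k1 = 21.53244,
                      m0 = 34.22962, m1 = 26.51415,
                      sigma = 1, rho = 0.05, alpha = 0.05, beta = 0.80)

# The power constraint holds when the MDE does not exceed delta
# (small tolerance because the printed inputs are rounded)
mde_s1 <= 0.25 + 1e-3
```

A check like this can be useful when a design returned by the optimizer is later adjusted by hand.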

Value

Returns a matrix of size (13 x number of scenarios). Each scenario is a combination of the specified parameters and the fixed and variable costs. For each scenario, the matrix provides the following components:

scenario: This is a vector of the number of the scenario displayed.
delta: This is a vector of the size of the effect on the outcome variable.
sigma: This is a vector of the standard deviation of the outcome variable.
rho: This is a vector of the intra-cluster correlation.
v0: This is a vector of the variable cost per control unit.
v1: This is a vector of the variable costs per treatment unit.
f0: This is a vector of the fixed costs per control cluster.
f1: This is a vector of the fixed costs per treatment cluster.
k0: This is a vector of the optimum number of control clusters that minimize the costs.
k1: This is a vector of the optimum number of treatment clusters that minimize the costs.
m0: This is a vector of the optimum number of sample units per control cluster that minimize the costs.
m1: This is a vector of the optimum number of sample units per treatment cluster that minimize the costs.
cost: This is a vector of the minimum cost of the RCT with the optimum number of clusters and units provided by this function.
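The optimal values are continuous, so they usually need to be converted to integers before fieldwork. The sketch below rebuilds part of the result matrix by hand from the printed "CONST-IND" example, so it runs without the package. It rounds every design quantity up, which is an assumption of this sketch rather than a recommendation from the source: rounding up cannot reduce the sample, but it raises the cost above the reported minimum.

```r
# Rows copied from the documented "CONST-IND" example (three f1 scenarios)
res <- rbind(
  v0 = c(25, 25, 25), v1 = c(150, 150, 150),
  f0 = c(381, 381, 381), f1 = c(500, 1981, 3500),
  k0 = c(144.657565, 164.295732, 179.255650),
  k1 = c(95.924534, 70.516569, 62.841467),
  m0 = c(3.934251, 6.102834, 7.485685),
  m1 = c(3.934251, 6.102834, 7.485685)
)
colnames(res) <- 1:3

# ASSUMPTION: round every design quantity up to the next integer
design <- ceiling(res[c("k0", "k1", "m0", "m1"), ])

# Cost of the rounded design, using the cost function from the description
cost_int <- design["k0", ] * (res["f0", ] + res["v0", ] * design["m0", ]) +
  design["k1", ] * (res["f1", ] + res["v1", ] * design["m1", ])

rbind(design, cost = cost_int)
```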

References

McConnell and Vera-Hernández (2022). More Powerful Cluster Randomized Control Trials. Mimeo.

Author

Nancy A. Daza-Báez, n.baez@ucl.ac.uk

Brendon McConnell, B.I.Mcconnell@soton.ac.uk

Marcos Vera-Hernández, m.vera@ucl.ac.uk

Examples


## In this example, both fixed costs per cluster and variable cost per unit within cluster are different between treatment and control.
## There are three different scenarios, each with a different rho. The syntax (optimal.s = "CLUST-IND") allows both the optimal number of clusters and units per
## cluster to be different between the treatment and control arms. The results will be saved in the "myresults.xlsx" file.

mincost.opt(delta = 0.25,
            sigma = 1,
            rho = c(0.05, 0.10, 0.15),
            alpha = 0.05,
            beta = 0.80,
            v0 = 150,
            v1 = 2200,
            f0 = 500,
            f1 = 18000,
            optimal.s = "CLUST-IND",
            output = "myresults")

## If you wish, you can specify initial conditions for the optimization algorithm: m0=20, m1=18, k0=15 and k1=18.
## Here the three scenarios differ in the fixed cost per control cluster (f0).

mincost.opt(delta = 0.25,
            sigma = 1,
            rho = 0.05,
            alpha = 0.05,
            beta = 0.80,
            v0 = 150,
            v1 = 2200,
            f0 = c(500, 1500, 5000),
            f1 = 18000,
            optimal.s = "CLUST-IND",
            initial.cond = c(20, 18, 15, 18))
#>                       1             2             3
#> scenario       1.000000       2.00000       3.00000
#> delta          0.250000       0.25000       0.25000
#> sigma          1.000000       1.00000       1.00000
#> rho            0.050000       0.05000       0.05000
#> v0           150.000000     150.00000     150.00000
#> v1          2200.000000    2200.00000    2200.00000
#> f0           500.000000    1500.00000    5000.00000
#> f1         18000.000000   18000.00000   18000.00000
#> k0           118.035743      71.35595      42.37807
#> k1            19.672624      20.59869      22.33520
#> m0             7.958224      13.78405      25.16611
#> m1            12.468141      12.46814      12.46814
#> cost     1093646.654018 1190366.55801 1386550.23526

## This is an example with three scenarios, each with a different value of the fixed cost per cluster in the treatment group (f1).
## The syntax (optimal.s = "CONST-IND") requests that the number of units per cluster is constrained to be the same in treatment as in control.

mincost.opt(delta = 0.25,
            sigma = 1,
            rho = 0.27,
            alpha = 0.05,
            beta = 0.80,
            v0 = 25,
            v1 = 150,
            f0 = 381,
            f1 = c(500, 1981, 3500),
            optimal.s = "CONST-IND")
#>                      1             2             3
#> scenario      1.000000      2.000000      3.000000
#> delta         0.250000      0.250000      0.250000
#> sigma         1.000000      1.000000      1.000000
#> rho           0.270000      0.270000      0.270000
#> v0           25.000000     25.000000     25.000000
#> v1          150.000000    150.000000    150.000000
#> f0          381.000000    381.000000    381.000000
#> f1          500.000000   1981.000000   3500.000000
#> k0          144.657565    164.295732    179.255650
#> k1           95.924534     70.516569     62.841467
#> m0            3.934251      6.102834      7.485685
#> m1            3.934251      6.102834      7.485685
#> cost     173913.449533 291909.377808 392349.539144

## This is an example with three scenarios, each with a different value of the variable cost per unit in the treatment group (v1).
## The syntax (optimal.s = "CONST-CLUST") requests that the number of clusters is constrained to be the same in treatment as in control.

mincost.opt(delta = 0.25,
            sigma = 1,
            rho = 0.05,
            alpha = 0.05,
            beta = 0.80,
            v0 = 150,
            v1 = c(250, 750, 1500),
            f0 = 500,
            f1 = 18000,
            optimal.s = "CONST-CLUST")
#>                     1            2             3
#> scenario      1.00000      2.00000       3.00000
#> delta         0.25000      0.25000       0.25000
#> sigma         1.00000      1.00000       1.00000
#> rho           0.05000      0.05000       0.05000
#> v0          150.00000    150.00000     150.00000
#> v1          250.00000    750.00000    1500.00000
#> f0          500.00000    500.00000     500.00000
#> f1        18000.00000  18000.00000   18000.00000
#> k0           21.53244     24.82447      28.05123
#> k1           21.53244     24.82447      28.05123
#> m0           34.22962     34.22962      34.22962
#> m1           26.51415     15.30795      10.82436
#> cost     651635.80233 871721.31688 1118429.91590