* Per cent reduction equals $100(1 - b_m/b_i)$, where $b_m$ and $b_i$ are the post-term minus term differences in covariate means after matching and initially, respectively

post-term group (N = 749) and the matched term group (N = 749 matched term subjects out of the original N = 9241 term subjects). The matched sample had similar means for each of the 13 covariates included in the model. Table III shows the bias reduction for the four covariates with the largest initial bias: the Hobel Intrapartum Risk Score, the child's birthweight, the abnormal labour indicator and the logit of the propensity score. Each of these covariates had over 74 per cent bias reduction after matching.
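The per cent bias reduction reported in Table III is 100(1 - b_m/b_i), where b_i is the initial difference in covariate means and b_m is the difference after matching. The short Python sketch below shows the arithmetic; the function name and the two numerical differences are hypothetical and are not taken from the study.

```python
def percent_bias_reduction(initial_diff, matched_diff):
    """Return 100 * (1 - b_m / b_i) for one covariate.

    initial_diff: difference in covariate means before matching (b_i)
    matched_diff: difference in covariate means after matching (b_m)
    """
    if initial_diff == 0:
        raise ValueError("initial difference is zero; reduction is undefined")
    return 100.0 * (1.0 - matched_diff / initial_diff)


# Hypothetical differences in means (post-term minus term), for illustration only.
b_i = 0.40  # before matching
b_m = 0.10  # after matching
print(percent_bias_reduction(b_i, b_m))  # 75.0
```

A value of 100 means the difference in means was removed entirely; a negative value means matching made the imbalance worse for that covariate.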

The tables describe the results based on choosing the best available term match for each post-term subject. In addition, we generated a list of 15 potential term matches for each of the 749 post-term babies and provided these lists to the investigators. Based on this information, the investigators were able to identify matches for a subset of the post-term babies and then gather data on the matched pairs. The data on the matched pairs have now been collected and analyses have begun to examine the hypothesis of interest, that is, whether post-term birth is associated with neuropsychiatric, social and academic achievements among school-aged children (that is, children 5 to 10 years old).
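One simple way to build such a list of candidate matches is to rank the term subjects by how close they are to each post-term subject on the logit of the estimated propensity score. The sketch below assumes this nearest-neighbour rule; it is not necessarily the matching procedure used in the study, and the function names, identifiers and scores are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def candidate_matches(treated, controls, k=15):
    """List the k closest controls for each treated subject.

    treated, controls: dicts mapping subject id -> estimated propensity score.
    Closeness is the absolute difference in the logit of the propensity score;
    ties are broken by control id so the result is deterministic.
    """
    control_logits = {cid: logit(p) for cid, p in controls.items()}
    result = {}
    for tid, p in treated.items():
        t = logit(p)
        ranked = sorted(
            control_logits,
            key=lambda cid: (abs(control_logits[cid] - t), cid),
        )
        result[tid] = ranked[:k]
    return result


# Hypothetical scores, for illustration only.
post_term = {"T1": 0.60}
term = {"C1": 0.20, "C2": 0.55, "C3": 0.70}
print(candidate_matches(post_term, term, k=2))  # {'T1': ['C2', 'C3']}
```

Supplying several candidates per subject, rather than a single best match, leaves room for the investigators to drop candidates who cannot be located or who decline to take part.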

In this example, the use of propensity scores proved useful. In particular, we were able to assess the fit of the propensity score model and compare the balance of background covariates prior to committing any resources (time or money) to collecting outcome data on the matched controls. Also, it is important to realize that, since these comparisons involve only covariates and not outcome variables, there is no chance of biasing results in favour of one treatment condition

Stratification (sometimes referred to as subclassification) is also commonly used in observational studies to control for systematic differences between the control and treated groups. This technique consists of grouping subjects into strata determined by observed background characteristics. Once the strata are defined, treated and control subjects who are in the same stratum are compared directly. Many of the same problems occur in stratification as with matching when the number of covariates increases. Cochran30 notes that as the number of covariates increases, the number of strata grows exponentially. For instance, if all covariates were dichotomous categorical variables, then there would be $2^k$ subclasses for $k$ covariates. If $k$ is large, then some strata might contain subjects from only the treated group, which would make it impossible to estimate a treatment effect in that stratum. Here again the propensity score is very useful. Because the propensity score is a scalar summary of all the observed background covariates, stratification on it alone can balance the distributions of the covariates in the treated and control groups without
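As a rough illustration of how quickly cross-classified strata multiply, the Python sketch below counts the strata formed by k dichotomous covariates and flags strata that contain subjects from only one group. The function names and the five example subjects are hypothetical.

```python
from collections import defaultdict


def n_strata(k, levels=2):
    """Number of cross-classified strata for k covariates with `levels` levels each."""
    return levels ** k


def strata_missing_a_group(subjects):
    """Return the strata that contain only treated or only control subjects.

    subjects: iterable of (treated, covariates) pairs, where `treated` is a bool
    and `covariates` is a tuple of 0/1 values defining the stratum.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_control, n_treated]
    for treated, covariates in subjects:
        counts[covariates][1 if treated else 0] += 1
    return sorted(c for c, (n_c, n_t) in counts.items() if n_c == 0 or n_t == 0)


print(n_strata(13))  # 8192 strata for 13 dichotomous covariates

# Hypothetical subjects with three dichotomous covariates each.
subjects = [
    (True, (1, 1, 0)),
    (False, (1, 1, 0)),
    (True, (1, 0, 1)),
    (False, (0, 0, 0)),
    (False, (0, 0, 0)),
]
print(strata_missing_a_group(subjects))  # [(0, 0, 0), (1, 0, 1)]
```

No within-stratum comparison of treated and control subjects is possible in the flagged strata.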

Rosenbaum and Rubin22 present theoretical results showing that perfect stratification based on the propensity score will produce strata where the average treatment effect within strata is an unbiased estimate of the true treatment effect. Again they assume that the treatment assignment is strongly ignorable. Rosenbaum and Rubin5 state that Cochran's31 result, which indicated that creating five strata removes 90 per cent of the bias due to the stratifying variable or covariate, holds for stratification based on the propensity score. They state that, in fact, stratification on the propensity score balances all k covariates that are used to estimate the propensity score, and often five strata based on the propensity score will remove over 90 per cent of the bias in each of these

The technique used for determining strata is straightforward. First, the propensity score is estimated by logistic regression or discriminant analysis. The investigator then must decide whether the stratum boundaries should be based on the values of the propensity score for both groups combined or in the treated or control group alone. Typically, in our work, we use the quintiles of the estimated propensity score from the combined group to determine the cut-offs for
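The quintile rule can be sketched in a few lines of Python. The example assumes that the propensity scores have already been estimated and that the cut-offs are the quintiles of the combined sample; the function names and the ten scores are hypothetical, and the quantile interpolation is the default of `statistics.quantiles`, which may differ slightly from other software.

```python
import statistics
from bisect import bisect_right
from collections import Counter


def quintile_strata(scores):
    """Assign each score to a stratum 1..5 using quintiles of the combined sample.

    Returns (strata, cuts), where `cuts` holds the four quintile cut-offs.
    A score equal to a cut-off is placed in the higher stratum.
    """
    cuts = statistics.quantiles(scores, n=5)
    strata = [bisect_right(cuts, s) + 1 for s in scores]
    return strata, cuts


def group_counts(strata, treated):
    """Count subjects by (stratum, treated) so sparse strata are easy to spot."""
    return Counter(zip(strata, treated))


# Hypothetical estimated propensity scores and treatment indicators.
scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
treated = [False, False, False, True, False, True, False, True, True, True]

strata, cuts = quintile_strata(scores)
print(strata)  # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
print(sorted(group_counts(strata, treated).items()))
```

Checking the counts by stratum and group before any outcome analysis shows whether each stratum contains enough treated and control subjects to support a comparison.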

There are many examples in the recent literature of studies that have used propensity scores for stratification.5,9,16,11,19-21 We now describe briefly some of these studies.

In Stone20 investigators wished to compare outcomes for 747 patients with community-acquired pneumonia (CAP) who were either hospitalized (n = 265) or ambulatory (n = 482). Since patients were not randomized to be either hospitalized or ambulatory, propensity scores were estimated using classification tree techniques. Patients were then assigned to one of seven strata based on their estimated propensity score. The investigators found that there were imbalances between the two groups on 29 out of 44 baseline variables, and that after stratification on the propensity score only 13 of these remained significant at p = 0.05. The investigators then estimated treatment effects using direct standardization methods of the stratum-specific means.
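Direct standardization of stratum-specific means can be sketched as a weighted average of the within-stratum differences. The Python example below assumes that each stratum is weighted by its share of the combined sample; the standard population actually used by the investigators is not stated here, and the function name and the numbers are hypothetical.

```python
def standardized_effect(strata):
    """Weighted average of within-stratum treated-minus-control mean differences.

    strata: list of dicts with keys
        "n"            total number of subjects in the stratum (the weight),
        "mean_treated" mean outcome among treated subjects in the stratum,
        "mean_control" mean outcome among control subjects in the stratum.
    Weights are the stratum shares of the combined sample (an assumption).
    """
    total = sum(s["n"] for s in strata)
    return sum(
        s["n"] / total * (s["mean_treated"] - s["mean_control"]) for s in strata
    )


# Hypothetical stratum summaries, for illustration only.
example = [
    {"n": 100, "mean_treated": 12.0, "mean_control": 10.0},
    {"n": 300, "mean_treated": 9.0, "mean_control": 8.0},
]
print(standardized_effect(example))  # 1.25
```

Weighting by a different reference population, for example the treated group alone, would change the estimate whenever the treatment effect varies across strata.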

Fiebach et al.9 used propensity scores to stratify patients who had received one of two possible treatments when they came to a hospital with uncomplicated chest pain. The two treatments were either admittance to a stepdown unit or admittance to a coronary care unit. Covariates used to estimate the propensity score included variables for the actual triage location and independent clinical predictors for an adverse event. These clinical predictors consisted of more than 50 clinical characteristics. A stepwise procedure was used to estimate the propensity score where covariates were entered into the model if they were significant at the 0.50 level in a stepwise

In Rosenbaum and Rubin5 the authors wished to study the properties of the propensity score when used to stratify subjects in different treatment groups. In their example, the propensity score was the probability of receiving either coronary artery bypass surgery or medical therapy given 74 different covariates. These covariates consisted of haemodynamic, angiographic, laboratory and exercise test results. The investigators used a multi-stage procedure to find the best model for the propensity score. They found that using five strata based on the estimated propensity score was able to substantially reduce the bias in all 74 covariates simultaneously.

To illustrate further how to estimate and use the propensity score for stratification, we now present an applied example using data from the Active Management of Labor Trial (ACT).32 The ACT trial is a randomized experiment to study the effects of active management of labour on the

Table IV. Comparison of covariates for subjects with and without epidural before and after propensity score stratification

