### Figure 1. Lifetimes of 12 hypothetical patients.

Figure labels: (1) event status (D = dead, A = alive) and survival age; (2) event status (D = dead, A = alive) and age in 1975, at the start of the observation period.

Analytic approaches that rely on person-years at risk can underestimate hazards and incidence rates. For example, the hazard at age 70 years is the conditional probability of event onset at age 70, estimated as the number of events at age 70 divided by the number of individuals at risk at age 70. Individuals who contribute person-time that accrued before the observation period (in other words, individuals who were already older than 70 when the study began) appear in the denominator but not in the numerator.

This bias in estimating hazards and incidence rates can be removed by left truncation, that is, by excluding follow-up time that occurs before the observation period. The risk set at any age must include only those individuals who were at risk at that age during the observation period.

Consider, for example, a hypothetical population of 12 individuals whose lifetimes are displayed in Figure 1. Suppose the event of interest is death. The population hazard at age 70 is defined as the ratio of the number of deaths at age 70 to the number of subjects in the risk set (the subjects at risk entering age 70) and is equal to 4/12 = 0.33. If an investigation is initiated in 1975 (see Figure 1), then 8 subjects are included in the investigation (Subjects 5-12, who are alive at the beginning of the observation period in 1975). In a standard survival analysis of this 1975 cohort using survival age as the dependent variable, each of the 8 subjects contributes the number of years from birth to survival age. The risk set entering age 70 includes 8 subjects, of whom 2 died at age 70 (Subjects 7 and 10), so the population hazard is underestimated as 2/8 = 0.25. Subjects 5 and 6 are older than 70 years of age at the beginning of the observation period (1975); both were at risk at age 70, but not during the observation period. These subjects should therefore be excluded from the risk set at age 70 for the 1975 cohort. If we restrict the risk set at age 70 to only those who are at risk at age 70 during the observation period, the risk set includes 6 subjects, and we correctly estimate the population hazard as 2/6 = 0.33. This approach reflects a selected risk set strategy and we use this strategy in
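The risk-set arithmetic in this example can be sketched in a few lines of Python. The ages below are hypothetical: they are not read from Figure 1 and are chosen only so that the counts match the text (Subjects 5 and 6 older than 70 in 1975, Subjects 7 and 10 dying at age 70). The function name `hazard_at` and the tuple layout are likewise illustrative assumptions.

```python
# Hypothetical 1975 cohort (Subjects 5-12). Ages are illustrative only and
# are NOT taken from Figure 1; they merely reproduce the counts in the text.
# Each record: (subject, age_in_1975, exit_age, died)
COHORT = [
    (5, 72, 78, True),
    (6, 74, 80, False),
    (7, 66, 70, True),
    (8, 60, 75, False),
    (9, 65, 73, True),
    (10, 68, 70, True),
    (11, 55, 72, False),
    (12, 62, 77, True),
]


def hazard_at(age, cohort, left_truncate):
    """Return (deaths, risk_set_size) at the given age.

    Without left truncation, everyone who survives to `age` is counted as
    at risk. With left truncation, a subject is counted only if already
    under observation at `age` (age_in_1975 <= age).
    """
    risk_set = [
        rec for rec in cohort
        if rec[2] >= age and (not left_truncate or rec[1] <= age)
    ]
    deaths = [rec for rec in risk_set if rec[3] and rec[2] == age]
    return len(deaths), len(risk_set)


naive = hazard_at(70, COHORT, left_truncate=False)
truncated = hazard_at(70, COHORT, left_truncate=True)

print("standard analysis: %d/%d = %.2f" % (naive + (naive[0] / naive[1],)))
print("left-truncated:    %d/%d = %.2f" % (truncated + (truncated[0] / truncated[1],)))
```

Under these assumed ages, the standard analysis gives 2/8 and the left-truncated analysis gives 2/6, because Subjects 5 and 6 enter observation after age 70 and drop out of the risk set at that age.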

The estimation of the cumulative incidence of AD is complicated by a fairly common situation: the development of AD is subject to the competing risk of death. Traditional survival analytic techniques such as the Kaplan-Meier method [21] treat subjects who die during the observation period as censored observations. This is inappropriate because it assumes that failure from the event of interest is still possible beyond the time at which the censoring occurred. For example, a person who dies of cardiovascular disease cannot develop AD and should not contribute to the estimate of development of AD. Gooley [13] shows that the potential contribution of censored observations to the probability of failure from the event of interest is distributed among those subjects remaining at risk. The potential contribution of a subject who has died, however, should be zero, so treating such subjects as censored inflates the estimate of cumulative incidence. Various analytic solutions to the problem of competing risks have been proposed and implemented [12-17], but there is still no software available that addresses this issue.
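The inflation described above can be illustrated with a minimal sketch that compares the two estimates on a toy data set. This is an assumed, simplified version of the standard nonparametric cumulative incidence estimator, not necessarily the method used in the cited references. The data, the cause coding (0 = censored, 1 = event of interest, 2 = competing death), and the function name `cumulative_incidence` are all hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Return (cif, one_minus_km) at the end of follow-up.

    causes: 0 = censored, 1 = event of interest, 2 = competing event.
    cif          : competing deaths remove subjects from further contribution.
    one_minus_km : competing deaths are (inappropriately) treated as censored.
    """
    event_times = sorted({t for t, c in zip(times, causes) if c != 0})
    overall_surv = 1.0  # probability of being free of ANY event so far
    km_surv = 1.0       # Kaplan-Meier survival for the event of interest only
    cif = 0.0
    for t in event_times:
        n_at_risk = sum(1 for u in times if u >= t)
        d_interest = sum(
            1 for u, c in zip(times, causes)
            if u == t and c == cause_of_interest
        )
        d_any = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        # Only subjects still free of every event can fail from the cause
        # of interest at time t.
        cif += overall_surv * d_interest / n_at_risk
        overall_surv *= 1 - d_any / n_at_risk
        km_surv *= 1 - d_interest / n_at_risk
    return cif, 1 - km_surv


# Hypothetical data: six subjects, follow-up times in years.
times = [1, 2, 3, 4, 5, 6]
causes = [1, 2, 1, 2, 1, 0]

cif, naive = cumulative_incidence(times, causes)
print("cumulative incidence (competing risks): %.4f" % cif)
print("1 - Kaplan-Meier (deaths censored):     %.4f" % naive)
```

On this toy data the competing-risks estimate is 0.5, while the Kaplan-Meier complement is 0.6875: the probability mass of the two subjects who died is redistributed to those still at risk, which is the source of the overestimate.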