WinEpi: Working IN EPIdemiology

Risk estimation: Stratified observational studies (1)

When estimating the risk associated with exposure to a given variable, we must take into account that the association between disease and exposure can sometimes vary among groups of the same population, depending on a third variable that is heterogeneously distributed (for example: age, sex, region, strain...). In such cases this variable is called a confounding variable (confounder), and it must show three basic characteristics:

It should be statistically associated with the disease (that is, it should be a risk factor)
It should be associated with the exposure factor
It should not be part of the causal chain between exposure and disease

Confounding factors must be avoided or removed during study design or during data analysis. Sometimes, however, a risk factor modifies the relationship between exposure and disease; in this particular case the confounding factor is called an interaction variable (although there are interaction variables that are not confounding factors).
Stratification is a method that allows controlling for the effect of possible confounding factors in the study and, as a consequence, makes it possible to detect whether interaction exists. Each stratum corresponds to a different level of the confounding factor, which makes it possible to calculate the most adequate risk estimator. The steps are as follows:

Calculation of the crude risk estimator: it is calculated without taking the stratification variable into account (as in a simple observational study).
Calculation of the pooled risk estimator: either the direct pooled method or the Mantel-Haenszel method can be used to weight each stratum in the calculation of the overall risk estimator.
Comparison of crude and pooled estimators: divide the greater value by the smaller one; only if the result is greater than 1.5 do we consider the stratification variable a confounding factor.
Evaluation of homogeneity among strata: we calculate the Breslow-Day Q statistic (based on a Chi-square distribution with degrees of freedom equal to the number of strata minus one) in order to know whether the risk estimator differs significantly among strata. Depending on the result, we can find the following cases:

p_Q(BD) > 0.050 : the risk estimator is homogeneous among strata. If, in addition, the ratio of crude to pooled estimators obtained in the comparison step is greater than 1.5, we can conclude that the stratification variable is a confounding factor, and the most adequate choice is to use the pooled risk estimator (preferably by the Mantel-Haenszel method).
p_Q(BD) < 0.050 : the risk estimator is heterogeneous, that is, it differs significantly in at least one stratum. In this case we can say that the stratification variable acts as an interaction variable (regardless of the value obtained in the comparison step), and we must use the stratum-specific risk estimators as final results.
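
The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not WinEpi's own implementation: it assumes the risk estimator is the odds ratio, that each stratum is given as a 2x2 table (a, b, c, d), and that the Breslow-Day p-value has already been obtained elsewhere and is simply passed in. All function names and the example counts are hypothetical.

```python
from typing import List, Tuple

# Assumed layout of one stratum (a, b, c, d):
#   a = exposed and diseased,     b = exposed and not diseased,
#   c = non exposed and diseased, d = non exposed and not diseased.
Table = Tuple[int, int, int, int]


def crude_odds_ratio(strata: List[Table]) -> float:
    """Odds ratio of the collapsed table, ignoring the stratification variable."""
    a, b, c, d = (sum(column) for column in zip(*strata))
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata: List[Table]) -> float:
    """Mantel-Haenszel pooled odds ratio: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def crude_to_pooled_ratio(crude: float, pooled: float) -> float:
    """Greater estimator divided by the smaller one (always >= 1)."""
    return max(crude, pooled) / min(crude, pooled)


def interpret(crude: float, pooled: float, p_breslow_day: float,
              alpha: float = 0.05, threshold: float = 1.5) -> str:
    """Apply the decision rule from the text; p_breslow_day is supplied by the caller."""
    if p_breslow_day < alpha:
        # Heterogeneous strata: interaction, whatever the crude/pooled ratio is.
        return "interaction: use stratum-specific estimators"
    if crude_to_pooled_ratio(crude, pooled) > threshold:
        # Homogeneous strata and crude differs enough from pooled.
        return "confounding: use pooled estimator"
    # The text does not say which estimator to report in this case.
    return "no confounding detected by the 1.5 rule"


# Made-up counts: within each stratum the odds ratio is 1,
# but collapsing the strata yields a crude odds ratio above 1.
example_strata = [(40, 10, 20, 5), (5, 20, 10, 40)]
crude = crude_odds_ratio(example_strata)
pooled = mantel_haenszel_odds_ratio(example_strata)
decision = interpret(crude, pooled, p_breslow_day=0.80)
```

With these made-up counts the crude odds ratio is 2.25 and the Mantel-Haenszel odds ratio is 1.0, so their ratio (2.25) exceeds 1.5; with a Breslow-Day p-value above 0.05 the rule flags the stratification variable as a confounding factor. For a cohort study the relative risk would take the place of the odds ratio, with the corresponding Mantel-Haenszel weights.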

Before estimating risk in a stratified observational study, you must indicate the type of available data:

Confidence level :

Type of study and result :

Risk variable :

Exposed:

Non exposed:

Disease :

Stratification variable : Name:

Number of categories :

