Relative risk, odds ratio and hazard ratio

Shengping Yang PhD, Gilbert Berdine MD

Corresponding author: Shengping Yang
Contact Information: Shengping.Yang@pbrc.edu
DOI: 10.12746/swrccc.v12i52.1341

I am planning a study to assess the impact of a new intervention on the risk of diabetes compared to a standard treatment. Although Oral Glucose Tolerance Test (OGTT) results are continuous, they are often dichotomized for convenience. Could you please explain how to use odds ratios or relative risks to analyze these data? Furthermore, studies frequently assess survival rates using hazard ratios. I am curious about any relationship between odds ratios and hazard ratios.

In biomedical research studies, outcome measurements are often not continuous variables. Examples include the positive or negative result of a clinical test or the mortality status of a patient. To appropriately analyze data collected from such studies, outcome assessment methods tailored to the distributions of these outcomes have been developed.

1. RELATIVE RISK (RR) AND ODDS RATIO (OR)

1.1 DATA WITH A BINARY OUTCOME

While many measurements in clinical studies are continuous, such as blood pressure and body weight, binary outcomes are also common. The binary nature of an outcome can stem from the inherent nature of the measurement, such as alive vs. deceased in mortality studies. In addition, for simplicity and practicality, continuous measurements are often dichotomized. As mentioned, a diabetes diagnosis can be defined based on an oral glucose tolerance test in which a blood glucose level ≥200 mg/dL two hours after consuming a 75-gram glucose solution indicates diabetes, while a lower level indicates non-diabetes.

To assess the risk of a disease or condition, or to compare risks associated with different treatments, two commonly used measurements are risks and odds. While risk is typically easier to comprehend, the preference for using odds often arises from other considerations.

1.2 RISKS AND RELATIVE RISK (RR)

Risk refers to the probability of an event occurring in a population over a defined period, often expressed as a percentage or a decimal. We will use hypothetical data from Table 1 to demonstrate the calculation of risk and relative risk.

Table 1. Hypothetical Data on the Risk of Diabetes

            Non-diabetic   Diabetic   Total (N)
Treatment   a              b          a + b
Control     c              d          c + d
Total       a + c          b + d      a + b + c + d

Consider a randomized trial comparing the risk of diabetes between a control group and a treatment group. The total number of participants is N = a + b + c + d, with a + b participants in the treatment arm and c + d participants in the control arm.

In Table 1, the risk of developing diabetes in the treatment arm is calculated as $b/(a+b)$, and in the control arm as $d/(c+d)$. The RR, which is the ratio of the risk of an event in one arm (e.g., treatment) versus the risk of the event in the other arm (e.g., control), can be calculated as $RR = \frac{b/(a+b)}{d/(c+d)}$. An RR less than 1 indicates a reduced risk in the treatment arm compared to the control arm, while an RR greater than 1 indicates an increased risk.2,3 Another commonly used method for risk evaluation is odds.
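As a minimal sketch of this calculation, the Python snippet below computes the risk in each arm as the number of diabetic participants divided by the arm total, and then takes their ratio. The counts (100 participants per arm, with 5 and 10 diabetes cases) are hypothetical values chosen for illustration, and the function name `risk` is an assumption of this sketch.

```python
# Hypothetical counts laid out as in Table 1 (illustrative values, not study data).
a, b = 95, 5    # treatment arm: non-diabetic, diabetic
c, d = 90, 10   # control arm:   non-diabetic, diabetic


def risk(events, non_events):
    """Risk = events / (events + non-events)."""
    return events / (events + non_events)


risk_treatment = risk(b, a)   # b / (a + b)
risk_control = risk(d, c)     # d / (c + d)
relative_risk = risk_treatment / risk_control

print(f"risk (treatment) = {risk_treatment:.3f}")
print(f"risk (control)   = {risk_control:.3f}")
print(f"RR               = {relative_risk:.3f}")
```

With these hypothetical counts the risk in the treatment arm is half the risk in the control arm, so the RR is below 1.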

1.3 ODDS AND ODDS RATIO (OR)

Odds represent the ratio of the probability of an event occurring to the probability of its not occurring. In Table 1, the odds of diabetes for the treatment and control arms are $b/a$ and $d/c$, respectively. The OR, which compares the odds of an event in one group to the odds of the event in the other group, can be calculated as $OR = \frac{b/a}{d/c} = \frac{bc}{ad}$. An OR less than 1 indicates decreased odds of the event in the treatment arm compared to the control arm, while an OR greater than 1 indicates increased odds.
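The same kind of sketch can be written for odds, taking the odds in each arm as the number of diabetic participants divided by the number of non-diabetic participants. The counts are again hypothetical (100 participants per arm, with 5 and 10 diabetes cases), and the variable names are assumptions of this sketch.

```python
# Hypothetical counts laid out as in Table 1 (illustrative values, not study data).
a, b = 95, 5    # treatment arm: non-diabetic, diabetic
c, d = 90, 10   # control arm:   non-diabetic, diabetic

odds_treatment = b / a        # diabetic : non-diabetic in the treatment arm
odds_control = d / c          # diabetic : non-diabetic in the control arm
odds_ratio = odds_treatment / odds_control

# The cross-product form (b * c) / (a * d) gives the same value.
cross_product = (b * c) / (a * d)

print(f"odds (treatment) = {odds_treatment:.4f}")
print(f"odds (control)   = {odds_control:.4f}")
print(f"OR               = {odds_ratio:.4f}")
```

For these counts the OR is slightly further from 1 than the corresponding ratio of risks (0.05 / 0.10 = 0.5).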

1.4 THE CHOICE BETWEEN RR AND OR

While RR is often more intuitive and directly interpretable compared to OR, there are several considerations that may influence the choice between using RR and OR in research.

1.4.1 STUDY DESIGN COMPATIBILITY

1.4.2 COMPUTATIONAL CONVENIENCE

ORs can be directly obtained from logistic regression models, which are commonly used for binary outcomes.4 This computational convenience makes ORs widely applicable in statistical modeling.

1.4.3 HISTORICAL PRECEDENT

ORs have been widely used historically and reported in many epidemiological and clinical studies, contributing to their standardization and widespread acceptance in certain fields.

1.4.4 INTERPRETABILITY

While RRs may be more intuitive for many researchers, ORs can also be straightforwardly interpreted as the odds of an event occurring in one group compared to another. This interpretative simplicity can be advantageous depending on the context and audience.

1.4.5 SIMILARITY BETWEEN RR AND OR

A desirable property of the OR is that when the outcome of interest is rare, ORs derived from case-control studies can approximate RRs from cohort studies, providing valid estimates of association with a more efficient study design. Numerical estimates of RR and OR can be quite similar under these conditions.

Considering Table 1 again as an example, we assume that both a + b and c + d equal 100. If b = 5, then the risk of diabetes is 5% for the treatment arm, and if d = 10, the risk is 10% for the control arm. In Figure 1, we set d (the number of diabetes cases in the control arm) to 2, 5, 10, and 50, representing disease rates of 2%, 5%, 10%, and 50%, respectively. We plotted the RR and OR values for different values of b (the number of diabetes cases in the treatment arm), ranging from 1 to 99. When d is small (low event rate for the control group), RR and OR values are close as long as b is smaller than 10 (a rate below 10% for the treatment group). However, as b increases (higher rate for the treatment group), RR and OR values start to differ substantially. When d = 50 (a 50% rate for the control group), RR and OR values differ even when b is small. Numerically, $RR = \frac{b/(a+b)}{d/(c+d)}$ and $OR = \frac{b/a}{d/c}$. Thus, when b and d are small, i.e., events are rare, a + b is close to a and c + d is close to c, and therefore RR is close to OR. It is worth noting that when b and d are equal (bigger circles superimposed on the smaller circles, Figure 1), given a + b = c + d, both RR and OR equal 1.

Figure 1

Figure 1. RR and OR at different event rates. Both the treatment and control arms are assumed to have 100 participants each. The x-axis represents the number of participants with diabetes in the treatment arm. RR and OR values were calculated for four control arm diabetes rates: 2%, 5%, 10%, and 50%, represented by black, red, green, and blue circles, respectively, corresponding to 2, 5, 10, and 50 diabetic participants in the control arm. The closed circles represent RRs, and the open circles represent ORs. The y-axis is displayed on a logarithmic scale.

These results demonstrate that for rare diseases or events, the OR approximates the RR. This is useful in epidemiological studies where the actual risk is low, allowing both RR and OR to provide meaningful and consistent insights. In essence, a more efficient case-control design, as opposed to a cohort design, can effectively approximate the association between an exposure and a binary outcome.
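The pattern described for Figure 1 can be checked numerically with a short Python sketch. It assumes 100 participants per arm, as in the figure, and tabulates RR and OR for a few selected values of b rather than the full range from 1 to 99. The function name `rr_and_or` is an assumption of this sketch. Because the two measures are related by OR = RR × (1 − control risk) / (1 − treatment risk), they agree closely only when both risks are small.

```python
ARM_SIZE = 100  # participants per arm, as assumed for Figure 1


def rr_and_or(b, d, n=ARM_SIZE):
    """Return (RR, OR) for b treatment-arm cases and d control-arm cases."""
    a, c = n - b, n - d                      # non-diabetic counts in each arm
    rr = (b / (a + b)) / (d / (c + d))       # ratio of risks
    odds_ratio = (b / a) / (d / c)           # ratio of odds
    return rr, odds_ratio


print(f"{'d':>3} {'b':>3} {'RR':>8} {'OR':>8}")
for d in (2, 5, 10, 50):                     # control-arm event counts
    for b in (1, 5, 10, 50, 90):             # selected treatment-arm event counts
        rr, odds_ratio = rr_and_or(b, d)
        print(f"{d:>3} {b:>3} {rr:>8.3f} {odds_ratio:>8.3f}")
```

In the printed table, rows where both b and d are small show nearly identical RR and OR, while rows with b = 50 or b = 90 show the OR moving much further from 1 than the RR.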

While RR and OR are useful for assessing the risk of a binary outcome at a specific time point, they do not incorporate timing information regarding event occurrence, thus offering limited resolution. However, there is often significant interest in understanding the duration until an outcome such as disease onset or death in studies like cohort or randomized trials. For example, the mortality of ICU patients at discharge is typically analyzed as a binary outcome (expired or alive), which reflects the short duration of ICU stays. Conversely, if researchers aim to understand the time until patients expire after discharge, they often need to follow patients over an extended period. This approach allows for the collection of two crucial pieces of data: the outcome status at the last observation and the time elapsed from treatment or enrollment to the last observation. This type of data is known as time-to-event data, which can be conveniently analyzed by estimating the hazard ratio (HR).

2. HAZARD AND HAZARD RATIO

The hazard is the instantaneous rate at which an event occurs, given that the individual has not had an event up to that point in time. It represents the probability of the event occurring in a very small time interval, per unit of time, conditional on not having experienced the event up to that time: $h(t) = \lim_{\Delta t \to 0} P(t \le T < t + \Delta t \mid T \ge t) / \Delta t$, where $T$ is the event time. Specifically, the ratio $h_{\text{treatment}}(t) / h_{\text{control}}(t)$ quantifies the instantaneous risk of an event occurring at time t among those in one arm/group compared to those in another arm/group, while accounting for the time it takes for the event to happen.5

The hazard ratio is the ratio of the hazard rates corresponding to the conditions characterized by two distinct levels of a treatment of interest. This ratio serves as an effect size measure for time-to-event data. When the control group is used as the reference, an HR smaller than 1 indicates that the hazard of the event is lower in the treatment group compared to the control group, while an HR greater than 1 indicates that the hazard (or risk) of the event is higher in the treatment group compared to the control group.
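As a rough illustration only, suppose the hazard is constant over time within each arm (an exponential survival model; this is a simplifying assumption of the sketch, not a requirement of the HR). The hazard in each arm can then be estimated as the number of events divided by the total follow-up time, and the HR is the ratio of the two. The follow-up records and the function name `constant_hazard` below are hypothetical.

```python
# Hypothetical follow-up records: (time in months, event indicator),
# where 1 = event observed and 0 = censored at that time.
treatment = [(12, 0), (8, 1), (15, 0), (20, 1), (24, 0)]
control = [(6, 1), (10, 1), (14, 0), (9, 1), (18, 1)]


def constant_hazard(records):
    """Events per unit of follow-up time, assuming a constant hazard."""
    events = sum(event for _, event in records)
    person_time = sum(time for time, _ in records)
    return events / person_time


hazard_treatment = constant_hazard(treatment)
hazard_control = constant_hazard(control)
hazard_ratio = hazard_treatment / hazard_control   # control arm is the reference

print(f"hazard (treatment) = {hazard_treatment:.4f} events per month")
print(f"hazard (control)   = {hazard_control:.4f} events per month")
print(f"HR                 = {hazard_ratio:.3f}")
```

Censored participants contribute follow-up time to the denominator without contributing an event, which is how timing information enters the estimate. In practice the constant-hazard assumption is rarely imposed, and a regression model is used instead.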

The most widely used regression model for estimating HR is Cox regression.

2.1 COX REGRESSION

Cox regression models the hazard function as $h(t \mid X_i) = h_0(t)\, e^{\beta_1 X_{i1} + \dots + \beta_p X_{ip}}$, where $h_0(t)$ is the baseline hazard function, $X_{ip}$ is the observed value of the pth covariate for subject i, and $\beta_p$ is the regression coefficient for the pth covariate.6

One of the fundamental assumptions in Cox regression is the proportional hazards assumption, which states that the hazard for any two levels of a covariate remains proportional over time. This assumption enables the Cox proportional hazards model to estimate how covariates affect the hazard of experiencing an event, while adjusting for varying follow-up times and censoring. Evaluating this assumption is crucial to ensure the validity and reliability of a Cox model.7
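To make the link between a Cox regression coefficient and the HR explicit, consider a simplified case that is an assumption of this sketch: a single binary covariate $X$ (1 = treatment, 0 = control) with coefficient $\beta$.

```latex
\[
\mathrm{HR}
  = \frac{h(t \mid X = 1)}{h(t \mid X = 0)}
  = \frac{h_0(t)\, e^{\beta \cdot 1}}{h_0(t)\, e^{\beta \cdot 0}}
  = e^{\beta}
\]
```

The baseline hazard $h_0(t)$ cancels, so the ratio does not depend on $t$. This is the proportional hazards assumption in its simplest form, and it is why the exponentiated coefficient is reported as the HR.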

3. SOFTWARE FOR DATA ANALYSIS

RR, OR and HR can be estimated by using various statistical software packages, such as SAS, Stata, S-Plus/R, etc. Below are examples of using SAS in data analysis.

3.1 RELATIVE RISK

In SAS, RR can be estimated using a log-binomial regression model (below) or a Poisson regression model with a log link function.

proc genmod data=mydata descending;  /* descending: model the probability that outcome=1 */
  class treatment;
  model outcome=treatment / dist=binomial link=log;  /* log-binomial model */
run;

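Here, outcome is the binary outcome variable (1=event, 0=no event) and treatment is the treatment (or exposure) variable. Because the model uses a log link, the coefficient for treatment is on the log scale, and exponentiating it gives the estimated RR.8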

3.2 ODDS RATIO

OR can be estimated using logistic regression.9

proc logistic data=mydata descending;  /* descending: model the probability that outcome=1 */
  class treatment;
  model outcome=treatment;
run;

3.3 HAZARD RATIO

HR can be estimated using the PHREG procedure for Cox proportional hazards regression.10

proc phreg data=mydata;
  class treatment;
  /* censor(0): records with censor=0 are treated as censored */
  model time_to_event*censor(0)=treatment / ties=efron;
run;

Here, time_to_event is the time variable, censor indicates censoring (1=event, 0=censored), and ties=efron specifies the method for handling ties in the data.

4. OTHER CONSIDERATIONS

4.1 LIMITATIONS OF RATIOS

Expressing results as ratios can be misleading when events, risks, or hazards are rare. For example, the original Pfizer data on the COVID-19 vaccine claimed 95% efficacy. This statement, in isolation, suggests that the risk of getting COVID without the vaccine was 100% and the risk of getting COVID with the vaccine was 5%. The data, however, showed that the risk for being hospitalized with COVID was near zero for the control group and 5% of that near zero number for the vaccine group. There were no deaths reported in either group. To gain a more complete understanding of the benefit of a treatment, one should also consider the number needed to treat (NNT) in order to prevent a single event.
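The point can be illustrated with a small Python sketch that contrasts a relative measure with the absolute risk reduction (ARR) and the number needed to treat, taken here as 1 / ARR. The two scenarios below use entirely hypothetical counts (they are not data from any trial) and share the same RR; the function name `summarize` is an assumption of this sketch.

```python
import math


def summarize(events_treatment, events_control, n_per_arm):
    """Relative and absolute effect measures for two equally sized arms."""
    risk_t = events_treatment / n_per_arm
    risk_c = events_control / n_per_arm
    rr = risk_t / risk_c
    arr = risk_c - risk_t                    # absolute risk reduction
    return {"RR": rr, "RRR": 1 - rr, "ARR": arr, "NNT": 1 / arr}


# Hypothetical scenarios with the same relative effect (RR = 0.05).
rare = summarize(events_treatment=10, events_control=200, n_per_arm=20_000)
common = summarize(events_treatment=500, events_control=10_000, n_per_arm=20_000)

for label, s in (("rare event", rare), ("common event", common)):
    print(f"{label:>12}: RR={s['RR']:.2f}  relative reduction={s['RRR']:.0%}  "
          f"ARR={s['ARR']:.4f}  NNT={math.ceil(s['NNT'])}")
```

Both scenarios show a 95% relative reduction, yet when the event is rare the number needed to treat is far larger than when the event is common. The NNT is rounded up to a whole person by convention.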

4.2 STATISTICAL POWER

One important consideration for studies with binary or time-to-event data, compared to those with continuous outcomes, is the sample size requirement. Studies with binary outcomes often require substantially larger numbers of participants. This is because binary outcomes, including dichotomized continuous outcomes, carry less information per participant, so a larger sample size is needed to achieve the same level of statistical power. For time-to-event studies, statistical power depends mainly on the number of observed events rather than on the number of participants enrolled. The incorporation of information on the timing of events is expected to contribute positively to the study’s statistical power.
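For a sense of scale, the Python sketch below applies two standard textbook approximations that are not taken from this article: the normal-approximation sample size for comparing two proportions, and Schoenfeld's approximation for the number of events needed to detect a given HR with 1:1 allocation. The event rates (5% vs. 10%), the HR of 0.5, the two-sided alpha of 0.05, and the 80% power are all assumed values for illustration, as are the function names.

```python
import math
from statistics import NormalDist


def z_sum(alpha, power):
    """z-value for a two-sided alpha plus z-value for the desired power."""
    return NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)


def n_per_arm_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Participants per arm to compare two proportions (normal approximation)."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(z_sum(alpha, power) ** 2 * variance / (p1 - p2) ** 2)


def events_needed(hazard_ratio, alpha=0.05, power=0.80):
    """Total events for a 1:1 two-arm comparison (Schoenfeld approximation)."""
    return math.ceil(4 * z_sum(alpha, power) ** 2 / math.log(hazard_ratio) ** 2)


n_binary = n_per_arm_two_proportions(0.05, 0.10)
n_events = events_needed(0.5)

print(f"participants per arm (5% vs. 10% risk): {n_binary}")
print(f"total events needed (HR = 0.5):         {n_events}")
```

The first result is a number of participants and the second is a number of events, so a time-to-event study still needs enough participants and follow-up time to observe that many events. Dedicated sample size software should be used for an actual study design.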

In summary, many biomedical research studies focus on binary outcomes. Estimating Relative Risk (RR) is often preferred for its interpretability and direct relevance to clinical understanding, while Odds Ratios (ORs) are useful in specific study designs, such as case-control studies. The selection between RR and OR depends on factors such as study design, nature of data, and specific research interests. In addition, for study designs such as cohort studies and randomized trials, time-to-event analysis provides insights into how treatment or exposure affects the timing of an event, offering a dynamic view of risk over the study period. The Cox regression model is commonly employed to estimate Hazard Ratios (HRs) in such analyses. Overall, the choice of study design and outcome measurement is tailored to the specific study context and requires careful consideration. Nevertheless, RR, OR, and HR are complementary and are best used in various scenarios to understand different aspects of the relationship between treatment or exposure and outcome.


REFERENCES

  1. Sedgwick P. Relative risks versus odds ratios. BMJ 2014;348:g1407.
  2. Ranganathan P, Aggarwal R, Pramesh CS. Common pitfalls in statistical analysis: Odds versus risk. Perspect Clin Res 2015 Oct–Dec; 6(4):222–4.
  3. Monaghan TF, Rahman SN, Agudelo CW, et al. Foundational Statistical Principles in Medical Research: a tutorial on odds ratios, relative risk, absolute risk, and number needed to treat. Int J Environ Res Public Health 2021 May 25;18(11):5669.
  4. Yang S, Berdine G. Categorical Data Analysis – Logistic Regression. The Southwest Respiratory and Critical Care Chronicles 2014;2(7):51–54. https://pulmonarychronicles.com/index.php/pulmonarychronicles/article/view/148
  5. Cox DR, Oakes D. Analysis of survival data. Chapman and Hall, New York.
  6. Cox DR. Regression models and life-tables. J Royal Statistical Society, Series B 34(2):187–220.
  7. Kuitunen I, Ponkilainen VT, Uimonen MM, et al. Testing the proportional hazards assumption in Cox regression and dealing with possible non-proportionality in total joint arthroplasty research: methodological perspectives and review. BMC Musculoskelet Disord 2021;22:489.
  8. UCLA Statistical methods and Data Analytics. How can I estimate relative risk in SAS using proc genmod for common outcomes in cohort studies? https://stats.oarc.ucla.edu/sas/faq/how-can-i-estimate-relative-risk-in-sas-using-proc-genmod-for-common-outcomes-in-cohort-studies/. Accessed July 7, 2024.
  9. SAS Institute. Syntax: LOGISTIC Procedure. https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_logistic_sect003.htm. Accessed July 7, 2024.
  10. SAS Institute. Syntax: PHREG Procedure. https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_phreg_sect005.htm. Accessed July 7, 2024.


Article citation: Yang S, Berdine G. Relative risk, odds ratio and hazard ratio. The Southwest Respiratory and Critical Care Chronicles 2024;12(52):44–48
From: Department of Biostatistics (SY), Pennington Biomedical Research Center, Baton Rouge, LA; Department of Internal Medicine (GB) Texas Tech University Health Sciences Center, Lubbock, Texas
Submitted: 7/9/2024
Accepted: 7/12/2024
Conflicts of interest: none
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.