Research methods: Clinical studies based on routine laboratory tests

Clinical research using routine laboratory tests can provide important opportunities to investigators, especially those with limited resources, and can improve patient care, especially if the result improves clinical decision making without the use of more sophisticated or expensive tests. Laboratory analysis of biological parameters can be used for screening, diagnostic testing, predicting prognosis, and measuring treatment responses. Often the same parameter can be used for several purposes, depending on the clinical scenario and the patient population. For example, several studies have suggested the mean platelet volume (MPV) is different in patients with acute coronary syndrome compared to patients with coronary disease but no acute syndrome. Given this information it might seem relatively easy to start studies using this laboratory test. However, multiple questions need to be considered before starting any research using MPV measurements. We will discuss some of these considerations in this review article. This approach applies to most research projects based on laboratory tests.

Introduction
Clinical research using routine laboratory tests can provide important opportunities to investigators, especially to those with limited resources such as trainees. However, the use of routine laboratory data in clinical studies requires several important considerations. These include the prospective utility of the laboratory test and its interpretation given the usual range of normal values. Project design will require understanding the underlying physiology or the pathophysiology of the laboratory test under consideration, a careful literature review to determine background information, decisions about data management, decisions about the proposed utility of the test in the patient group under question, and understanding normal laboratory variability in test results. We recently considered [...] [2][3] Our initial efforts at this project identified multiple areas of uncertainty, and we will discuss them in this article. These ideas are applicable to almost all laboratory based projects and would be considered, for example, in projects involving red blood cell distribution width, neutrophil and lymphocyte ratios, and platelet size and function characteristics. Figure 1 outlines our approach:
1. The clinician asks questions about how the test might be used in these patients and formulates a preliminary study design.
2. A literature review identifies articles with basic information about the test and prior studies using the test.
3. Relevant articles are identified and reviewed. Numerical information about the test is summarized into simple tables or into a meta-analysis.
4. Based on this information, the study is redesigned to determine whether or not the test is abnormal in the study population and to determine whether or not it can be used in screening, diagnosis, treatment, or outcome prediction.

We do not consider the studies needed to introduce a new lab test in this discussion.

Background information
Several studies have suggested that the MPV is associated with adverse outcomes in patients with cardiovascular disease, such as acute coronary syndrome, atrial fibrillation, and stroke. Changes in the MPV may also reflect ongoing inflammation and have associations in nonvascular disorders such as sepsis. Mean platelet volumes are routinely reported in CBC studies. However, the MPV does not currently form part of any diagnostic algorithm or prognostic stratification for vascular disease.

Clinical questions
Some studies have suggested that both a low MPV and a high MPV have clinical implications.[4] This might make this laboratory test difficult to use in routine clinical work unless the pathophysiological basis for the effect of extremes in platelet size can be established. Important questions about this effect are:
1. Can the MPV be used to identify subgroups in patients with a particular disease?
2. Is the MPV a reliable test in terms of reproducibility in a subject, between subjects, and between populations?
3. Can the MPV be considered a gold standard? What is the current gold standard used to compare the MPV to?
4. What are the optimal cut-off values for normal MPVs?[5]
5. What are the sensitivity and specificity of this test in predicting a specific disease or an outcome?[5]
6. Can there be clinically significant or detectable variation of the MPV that is related to disease activity?[6] Is this variation within or outside the "normal" range for the MPV?

The researcher has several databases available to identify pertinent literature for his or her project. The most commonly used databases are PubMed and Google Scholar. Other choices include EMBASE, SCOPUS, and Web of Science. PubMed searches can use MeSH terms, text terms, and combinations of both. Filters include the type of study, age, language, and dates. Mean platelet volume is a MeSH term, and most clinical disorders, including, for example, acute coronary syndrome, are MeSH terms. Google Scholar creates a list of possible studies using a proprietary algorithm which includes the number of citations for a particular article. Consequently, Google Scholar likely recovers more important articles, at least based on citation history. Many clinicians do quick searches using Google and then Google Scholar. After getting some idea about the frequency of publications on the topic of interest, they can then use PubMed to do more structured searches.
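A structured PubMed search of the kind described above can be scripted so that it is easy to repeat and report. The following is a minimal sketch, assuming the public NCBI E-utilities `esearch` endpoint; the helper names `mesh_query` and `esearch_url` are hypothetical and chosen only for illustration. The code builds the query string and the request URL but does not contact the server.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (no request is sent by this sketch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def mesh_query(*terms):
    """Combine MeSH headings with AND, tagging each with the [MeSH Terms] field."""
    return " AND ".join(f'"{term}"[MeSH Terms]' for term in terms)


def esearch_url(query, retmax=20):
    """Build an esearch URL for PubMed; retmax caps the number of returned IDs."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical helper names; the two MeSH terms are the ones used in this article.
query = mesh_query("mean platelet volume", "acute coronary syndrome")
url = esearch_url(query)
print(query)
print(url)
```

Saving the exact query string alongside the search date makes it easier to report the search strategy and to rerun it later.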
Article selection: How will we select articles to use in our literature review?
We initially searched the MEDLINE database using PubMed with the MeSH term mean platelet volume to determine if there were many clinical studies with this laboratory test. This search recovered 2,282 articles. The initial 50 articles included 43 studies with original data (14 were prospective studies and 29 were retrospective or cross sectional studies). The MPV was studied in at least 22 different organ systems; the most frequent association studied was with cardiovascular disease (12 studies). The number of different organ systems studied and the seemingly discrepant and variable MPV results across studies suggest that the MPV is a very nonspecific marker. The range and spectrum of MPV values associated with disease and outcomes were not uniform. Our initial effort suggested that the investigator needs to structure literature searches very carefully.
We then searched Google with the terms mean platelet volume and acute coronary syndrome. All ten references listed on the first page covered this topic; the dates ranged from 2009 to 2014. One reference had 114 citations. We repeated this search using Google Scholar. Two references were retrieved; they were published in 2010 and 2013 with 10 citations and one citation, respectively. Finally, we repeated the PubMed search using MeSH terms mean platelet volume AND acute coronary syndrome. This search recovered 11 articles. Four of these studies investigated the utility of MPV in the prediction of outcomes in patients with acute coronary syndrome. These searches indicate that the clinician or investigator cannot depend on one search strategy and cannot close out a literature review with one or two search strategies.

The primary measurements in studies involving laboratory tests such as the MPV could include the mean MPV, the median MPV, the normal range for MPV, and the quantiles of MPV in various populations. The outcomes of interest could reflect any of the following questions. Is there a difference in MPV in patient groups, e.g., patients with acute coronary syndrome and patients without acute coronary syndrome? How frequently are low MPVs present in the study population? How frequently are high MPVs present in the study population? Does the MPV change over time in each subject? Does the MPV change with treatment? Can the MPV be used as a risk factor in prediction equations for outcomes? These questions will influence the search strategy and its success.

Two important ways to combine study results are with systematic reviews and meta-analyses. Performing one or both on a particular topic may help to filter, condense, and give structure to sometimes diverse information obtained in literature searches. A systematic review is usually a narrative summary of the publications identified with a pre-specified search strategy without much combination of data or data handling. A meta-analysis requires the combination of results from different studies to derive a pooled estimate.

The Southwest Respiratory and Critical Care Chronicles 2018;6(23):33-39

Unlike studies with a binary outcome, studies with a continuous outcome can have heterogeneous and inconsistent types of data presentation. For example, although a large number of studies report the MPV sample mean and standard deviation, some studies report the median and the minimum and maximum values (and/or the first and the third quartiles). Therefore, it is important to accurately estimate the sample mean and standard deviation for studies reporting other types of sample statistics, so that results can be pooled using a consistent format. By applying simple inequalities, Hozo, et al. showed that when the sample size is larger than 25, the sample mean can be approximated by
$$\bar{x} \approx \frac{a + 2m + b}{4},$$
where $a$ is the minimum value, $b$ is the maximum value, and $m$ is the median.[7] Similarly, the sample standard deviation can be approximated by
$$S \approx \sqrt{\frac{1}{12}\left[\frac{(a - 2m + b)^2}{4} + (b - a)^2\right]}.$$
To estimate the sample mean and standard deviation for studies that report only the median and the first and the third quartiles, Wan proposed that
$$\bar{x} \approx \frac{q_1 + m + q_3}{3},$$
where $q_1$ and $q_3$ are the first and the third quartiles, respectively.
Similarly, the sample standard deviation can be derived as
$$S \approx \frac{q_3 - q_1}{2\,\Phi^{-1}\!\left(\dfrac{0.75n - 0.125}{n + 0.25}\right)},$$
where $n$ is the sample size, and $\Phi^{-1}$ is the inverse of the standard normal cumulative distribution function.[8]

Although the above conversion could greatly facilitate the incorporation of data from different studies in a meta-analysis for analysis by t tests, a major issue with t test based comparisons is that the test result cannot make predictions. Therefore, for many studies with one binary and one continuous variable, instead of performing a t test, a logistic regression model is used for predicting the binary outcome. An example prediction would be "for every unit increase in MPV, there is an X% increase in the odds of having a positive outcome."

Due to the nature of meta-analysis, there is a large degree of inconsistency in what results are presented in the included studies; some studies present unadjusted results only, some present adjusted results only, and others present both. For randomized clinical studies, since randomization can remove most of the variation caused by confounders and effect modifiers, the unadjusted results are less prone to bias and are usually presented (very likely, together with the adjusted results). This is also more or less true for case-control studies with matched risk factors. On the other hand, for cohort studies, since there is no control over confounders and effect modifiers, the unadjusted results have high potential for bias, and thus very often only the adjusted results are presented. In fact, if there is an interaction between the factor of interest and another risk factor, then conclusions based on results from unadjusted and adjusted analyses can be entirely different. As far as we know, there is no consensus on how to combine adjusted and unadjusted findings in a meta-analysis. A simple approach is to include studies with unadjusted results only, for example, from controlled clinical trials. Another option is to perform meta-analyses using adjusted and unadjusted results separately; the difference between the two analyses provides an indication of the degree of heterogeneity due to adjustment.

Meta-regression is another approach for combining results from heterogeneous studies. This is a regression model with effect size as the outcome and study characteristics as the covariates. The effect of the covariates can be tested to examine the heterogeneity across studies.
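The conversions attributed to Hozo, et al. and Wan in this section, followed by a simple pooling step, can be sketched as follows. This is an illustration under stated assumptions rather than the method used in any particular study: the function names are hypothetical, the numbers are invented, and the pooling step uses fixed-effect inverse-variance weighting of mean differences, which is only one of several possible choices and is not specified in the text.

```python
from math import sqrt
from statistics import NormalDist


def hozo_mean_sd(a, m, b):
    """Approximate mean and SD from minimum a, median m, and maximum b."""
    mean = (a + 2 * m + b) / 4
    sd = sqrt(((a - 2 * m + b) ** 2 / 4 + (b - a) ** 2) / 12)
    return mean, sd


def wan_mean_sd(q1, m, q3, n):
    """Approximate mean and SD from quartiles q1, q3, median m, sample size n."""
    mean = (q1 + m + q3) / 3
    # Inverse of the standard normal cumulative distribution function.
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


def pooled_mean_difference(studies):
    """Fixed-effect inverse-variance pooling (an assumed, simple choice).

    Each study is (mean1, sd1, n1, mean2, sd2, n2) for two patient groups.
    Returns the pooled mean difference and its standard error.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for m1, s1, n1, m2, s2, n2 in studies:
        diff = m1 - m2
        variance = s1 ** 2 / n1 + s2 ** 2 / n2
        weight = 1 / variance
        weighted_sum += weight * diff
        weight_total += weight
    return weighted_sum / weight_total, sqrt(1 / weight_total)


# Invented MPV values in fL, for illustration only.
hozo_mean, hozo_sd = hozo_mean_sd(a=7.0, m=9.0, b=13.0)
wan_mean, wan_sd = wan_mean_sd(q1=8.0, m=9.0, q3=10.5, n=100)
studies = [(10.0, 1.0, 50, 9.0, 1.0, 50), (10.0, 1.0, 50, 9.0, 1.0, 50)]
pooled, pooled_se = pooled_mean_difference(studies)
print(hozo_mean, hozo_sd, wan_mean, wan_sd, pooled, pooled_se)
```

A random-effects model would usually be preferred when the included studies differ in sample handling or assay method; the fixed-effect version is shown only because it is the shortest correct example of pooling.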

Using a lab value to establish a diagnosis or predict an outcome: How can we use a routine lab value in clinical decision making?
Clinician researchers generally want to develop new information or confirm old information. If they are working with projects which involve laboratory testing, they likely want a number or measurement which is easy to remember and clearly classifies patients into one group or another. Mean platelet volumes represent a continuous measurement. The threshold for redefining the MPV as a categorical variable can be obtained using a ROC curve. For an ideal predictor, a perfect classification can be achieved if, for example, 100% of positive outcomes are above the threshold and 100% of negative outcomes are below it. Otherwise, the threshold can be chosen by a method that best serves the study goal, such as a median value or a cost-benefit method.[9] Alternatively, thresholds can be chosen based on clinical evidence. For example, patients can be grouped into high, low, or within normal range categories based on the clinical experience of the researchers.

There is another type of data presentation; many studies on MPV have reported that the experimental group's value is within the normal range but statistically different from the control group. This makes it difficult to interpret results in the individual patient if this patient's result is in the normal range. Consequently, the researcher would need to decide whether or not the location in the normal range is critical. For example, if the patient is in the upper quartile or the lower quartile, does that change this patient's risk for some particular outcome? An alternative approach is to classify the laboratory result into particular categories and assign a score. This score would then be entered into a multivariable scoring system to project risk for particular outcomes.
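Choosing a threshold from a ROC curve can be sketched in a few lines. The example below assumes that the cut-off is chosen by maximizing Youden's index (sensitivity + specificity - 1), which is one common criterion and is not named in the text; the function names and the MPV values are hypothetical and invented for illustration.

```python
def roc_points(values, labels):
    """Return (threshold, sensitivity, specificity) for each candidate cut-off.

    A subject is classified as positive when value >= threshold.
    labels are 1 for a positive outcome and 0 for a negative outcome.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(values)):
        true_pos = sum(1 for v, y in zip(values, labels) if v >= threshold and y == 1)
        true_neg = sum(1 for v, y in zip(values, labels) if v < threshold and y == 0)
        points.append((threshold, true_pos / positives, true_neg / negatives))
    return points


def youden_threshold(values, labels):
    """Pick the cut-off with the largest sensitivity + specificity - 1."""
    return max(roc_points(values, labels), key=lambda p: p[1] + p[2] - 1)


# Invented MPV values (fL) and outcomes, for illustration only.
mpv = [7.8, 8.1, 8.4, 8.9, 9.2, 9.6, 10.1, 10.4, 10.9, 11.3]
outcome = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
best = youden_threshold(mpv, outcome)
print(best)
```

Youden's index weights false positives and false negatives equally; a cost-benefit method would replace the `key` function with one that reflects the relative cost of each type of error.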
The MPV might not be the only predictor that is potentially associated with patient outcomes; other predictors are often associated with the outcome. Multiple linear regression models the relationship between two or more predictors and an outcome by fitting a linear equation to observed data.[10] To effectively use all the information to predict patient outcomes, the relationship among the predictors should be examined. In general, we expect the correlations among the predictors to be weak, so that the inclusion of one predictor will not change the effect of another predictor in a multivariable regression model. However, in reality, this is not always the case, since factors potentially associated with a disease are often related. If two or more factors are strongly correlated (i.e., have collinearity[11]), including them in a multiple regression model produces unstable coefficient estimates with inflated standard errors. Therefore, collinearity is commonly checked in multivariable regression models by calculating the variance inflation factor (VIF):
$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2},$$
where $R_j^2$ is the coefficient of determination of a regression of predictor $j$ on all the other predictors. A VIF above 5 or 10 in general indicates a collinearity problem.
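The VIF calculation can be illustrated with a short, dependency-free sketch. It regresses each predictor on the remaining predictors by ordinary least squares and applies VIF = 1 / (1 - R^2). The function names and the simulated predictors (an MPV-like variable, a second platelet index constructed to be strongly correlated with it, and age) are assumptions made for illustration; in practice a statistical package would normally be used.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an ordinary least squares fit of y on predictors plus an intercept."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def vif(columns):
    """Return {name: VIF} where VIF_j = 1 / (1 - R_j^2)."""
    result = {}
    for name, col in columns.items():
        others = [c for other, c in columns.items() if other != name]
        result[name] = 1 / (1 - r_squared(col, others))
    return result


# Simulated data, for illustration only: pdw is built to track mpv closely.
rng = random.Random(0)
mpv = [rng.gauss(9, 1) for _ in range(200)]
pdw = [1.5 * v + rng.gauss(0, 0.2) for v in mpv]
age = [rng.gauss(60, 10) for _ in range(200)]
vifs = vif({"mpv": mpv, "pdw": pdw, "age": age})
print(vifs)
```

In this simulated example the two correlated platelet indices receive large VIFs while the independent predictor stays close to 1, which is the pattern that would prompt removal or combination of one of the correlated variables.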
Several strategies have been proposed to deal with collinearity, including fitting a robust regression model, modeling with latent variables, and removing collinearity prior to analysis. Several considerations need to be taken into account when removing a correlated variable: 1) the difficulty, feasibility, and cost of data collection; 2) its clinical relevance; and 3) how close it is to the underlying mechanisms. In fact, a deep understanding of the disease and the potential risk factors is needed in order to appropriately remove correlated variables.
Sometimes, interaction among factors (the effect of one factor in a statistical model depends in some way on the presence or absence of another factor) is another problem which complicates analysis. The study of interaction in general requires a meaningful rationale specified beforehand, and such studies usually need a larger sample size than those studying main effects. One way to avoid having to deal with interaction is to use study inclusion and exclusion criteria to include only patients belonging to a certain subgroup or subgroups. However, the researcher must bear in mind that results from such studies can be applied only to patient populations similar to the study subgroup(s).

Routine laboratory studies on platelets include the total platelet count and measurement of platelet volume indices.[12] The latter indices include the MPV; the platelet distribution width (PDW), the relative width of the platelet volume distribution, which reflects the heterogeneity of platelet volumes; and the platelet large cell ratio (P-LCR), which is calculated as the percentage of platelets larger than 12 fL. These measurements can be made using electrical impedance (Coulter counter), optical methods based on laser light scatter, and flow cytometry using fluorescently tagged platelet specific antibodies. Latger-Cannard, et al. measured MPV using impedance and optical equipment.[13] There were significant differences in the MPV by these two methods, and the analyzers using impedance did not recognize platelets above 12 fL. In addition, platelet storage influences the volume measurement, and there is time dependent swelling when the blood samples are anticoagulated with ethylenediaminetetraacetic acid (EDTA). This swelling does not occur when citrate is used as an anticoagulant or if the platelets are processed within 1 hour when EDTA is used as the anticoagulant. Consequently, the researcher will need to consider these technical factors when designing a study and collecting samples.
The distribution of MPV is slightly skewed to the right, largely because MPV cannot have a negative value. If the skewness is severe, a logarithm transformation is generally recommended, so that the distribution of the transformed values approximately follows a normal distribution. However, if this is the case, we might have to re-evaluate how to present the basic statistics of this variable. In other words, if the MPV has a highly skewed distribution, then it might be more appropriate to present its median and quantiles rather than its mean and standard deviation. The heterogeneity in sample storage, assay method, instrument calibration, etc., across studies introduces another layer of difficulty in data combination, and an in-depth assessment of the inclusion and exclusion of individual studies should be performed on a case by case basis.
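The choice between reporting the mean and standard deviation or the median and quartiles can be made explicit in code. The sketch below is an illustration only: the skewness cut-off of 1.0 is an arbitrary assumption, the function names are hypothetical, and the MPV values are invented to include one large value that produces right skew.

```python
from math import log
from statistics import mean, median, quantiles, stdev


def skewness(xs):
    """Moment-based sample skewness: third central moment over variance^1.5."""
    m = mean(xs)
    n = len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def summarize(xs, skew_limit=1.0):
    """Report mean and SD if roughly symmetric, else median and quartiles.

    skew_limit is an assumed, arbitrary cut-off for illustration.
    """
    if abs(skewness(xs)) <= skew_limit:
        return {"mean": mean(xs), "sd": stdev(xs)}
    q1, q2, q3 = quantiles(xs, n=4)
    return {"median": q2, "q1": q1, "q3": q3}


# Invented MPV values (fL) with one large value, for illustration only.
values = [7.5, 7.8, 8.0, 8.1, 8.3, 8.4, 8.6, 8.9, 9.5, 16.0]
logged = [log(v) for v in values]
raw_skew = skewness(values)
log_skew = skewness(logged)
summary = summarize(values)
print(raw_skew, log_skew, summary)
```

In this small example the logarithm reduces the skewness only modestly because a single extreme value dominates, which is a reminder to inspect the transformed distribution before assuming it is approximately normal.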

Conclusions
The use of a routine laboratory test in clinical studies has significant advantages. The laboratory test is likely easily available, widely understood and used, and pertinent to the overall health of the patient. However, because these tests are easily available, investigators may initiate studies without significant planning and study design. The investigator must formulate a question(s) using a study design which has a high likelihood of completion and an outcome which is easily translated into routine patient care. The researcher must define an abnormal result and a normal result. This distinction is particularly important with continuous variables. The investigator will need to collect background information from the medical literature. Search strategies are important to understand the current information available on the particular lab test or clinical syndrome. In some situations it will be important to combine results from the medical literature. This becomes more difficult when numerical results vary and might include mean values, median values, percentiles, and categorical definitions based on outcomes from prior studies. The researcher will have to decide whether or not to consider the laboratory test a characteristic of the clinical syndrome which might be used to classify patients, monitor treatment, and predict outcomes. Finally, the investigator will need to determine whether or not technical details involved in the laboratory analysis present problems which might make results from one study difficult to repeat in another laboratory or to use by other laboratories.

Figure 1. Organization of study using routine laboratory tests.
Data summary: How will we summarize data from different studies to get better estimates of abnormal MPVs in clinical studies?
Yang et al.