TY - JOUR
T1 - Data-Driven Cutoff Selection for the Patient Health Questionnaire-9 Depression Screening Tool
AU - Levis, Brooke
AU - Bhandari, Parash Mani
AU - Neupane, Dipika
AU - Fan, Suiqiong
AU - Sun, Ying
AU - He, Chen
AU - Wu, Yin
AU - Krishnan, Ankur
AU - Negeri, Zelalem
AU - Imran, Mahrukh
AU - Rice, Danielle B
AU - Riehm, Kira E
AU - Azar, Marleine
AU - Levis, Alexander W
AU - Boruff, Jill
AU - Cuijpers, Pim
AU - Gilbody, Simon
AU - Ioannidis, John P A
AU - Kloda, Lorie A
AU - Patten, Scott B
AU - Ziegelstein, Roy C
AU - Harel, Daphna
AU - Takwoingi, Yemisi
AU - Markham, Sarah
AU - Alamri, Sultan H
AU - Amtmann, Dagmar
AU - Arroll, Bruce
AU - Ayalon, Liat
AU - Baradaran, Hamid R
AU - Beraldi, Anna
AU - Bernstein, Charles N
AU - Bhana, Arvin
AU - Bombardier, Charles H
AU - Buji, Ryna Imma
AU - Butterworth, Peter
AU - Carter, Gregory
AU - Chagas, Marcos H
AU - Chan, Juliana C N
AU - Chan, Lai Fong
AU - Chibanda, Dixon
AU - Clover, Kerrie
AU - Conway, Aaron
AU - Conwell, Yeates
AU - Daray, Federico M
AU - de Man-van Ginkel, Janneke M
AU - Fann, Jesse R
AU - Fischer, Felix H
AU - Field, Sally
AU - Fisher, Jane R W
AU - Benedetti, Andrea
AU - Rancans, Elmars
AU - Depression Screening Data (DEPRESSD) PHQ Group
PY - 2024/11/4
Y1 - 2024/11/4
N2 - IMPORTANCE: Test accuracy studies often use small datasets to simultaneously select an optimal cutoff score that maximizes test accuracy and generate accuracy estimates. OBJECTIVE: To evaluate the degree to which using data-driven methods to simultaneously select an optimal Patient Health Questionnaire-9 (PHQ-9) cutoff score and estimate accuracy yields (1) optimal cutoff scores that differ from the population-level optimal cutoff score and (2) biased accuracy estimates. DESIGN, SETTING, AND PARTICIPANTS: This study used cross-sectional data from an existing individual participant data meta-analysis (IPDMA) database on PHQ-9 screening accuracy to represent a hypothetical population. Studies in the IPDMA database compared participant PHQ-9 scores with a major depression classification. From the IPDMA population, 1000 studies of 100, 200, 500, and 1000 participants each were resampled. MAIN OUTCOMES AND MEASURES: For the full IPDMA population and each simulated study, an optimal cutoff score was selected by maximizing the Youden index. Accuracy estimates for optimal cutoff scores in simulated studies were compared with accuracy in the full population. RESULTS: The IPDMA database included 100 primary studies with 44 503 participants (4541 [10%] cases of major depression). The population-level optimal cutoff score was 8 or higher. Optimal cutoff scores in simulated studies ranged from 2 or higher to 21 or higher in samples of 100 participants and 5 or higher to 11 or higher in samples of 1000 participants. The percentage of simulated studies that identified the true optimal cutoff score of 8 or higher was 17% for samples of 100 participants and 33% for samples of 1000 participants. Compared with estimates for a cutoff score of 8 or higher in the population, sensitivity was overestimated by 6.4 (95% CI, 5.7-7.1) percentage points in samples of 100 participants, 4.9 (95% CI, 4.3-5.5) percentage points in samples of 200 participants, 2.2 (95% CI, 1.8-2.6) percentage points in samples of 500 participants, and 1.8 (95% CI, 1.5-2.1) percentage points in samples of 1000 participants. Specificity was within 1 percentage point across sample sizes. CONCLUSIONS AND RELEVANCE: This study of cross-sectional data found that optimal cutoff scores and accuracy estimates differed substantially from population values when data-driven methods were used to simultaneously identify an optimal cutoff score and estimate accuracy. Users of diagnostic accuracy evidence should evaluate studies of accuracy with caution and ensure that cutoff score recommendations are based on adequately powered research or well-conducted meta-analyses.
KW - Humans
KW - Patient Health Questionnaire
KW - Cross-Sectional Studies
KW - Depression/diagnosis
KW - Mass Screening/methods
KW - Sensitivity and Specificity
KW - Depressive Disorder, Major/diagnosis
KW - Female
KW - Male
UR - http://www.scopus.com/inward/record.url?scp=85210463122&partnerID=8YFLogxK
U2 - 10.1001/jamanetworkopen.2024.29630
DO - 10.1001/jamanetworkopen.2024.29630
M3 - Article
C2 - 39576645
SN - 2574-3805
VL - 7
SP - e2429630
JO - JAMA Network Open
JF - JAMA Network Open
IS - 11
ER -