TY - JOUR
T1 - Estimation of Genetic Correlation via Linkage Disequilibrium Score Regression and Genomic Restricted Maximum Likelihood
AU - Schizophrenia Working Group of the Psychiatric Genomics Consortium
AU - SWE-SCZ Consortium
AU - Psychosis Endophenotypes International Consortium
AU - Wellcome Trust Case Control Consortium
AU - Ni, Guiyan
AU - Moser, Gerhard
A2 - Ripke, Stephan
A2 - Neale, Benjamin M.
A2 - Corvin, Aiden
A2 - Walters, James T.R.
A2 - Farh, Kai How
A2 - Holmans, Peter A.
A2 - Lee, Phil
A2 - Bulik-Sullivan, Brendan
A2 - Collier, David A.
A2 - Huang, Hailiang
A2 - Pers, Tune H.
A2 - Agartz, Ingrid
A2 - Agerbo, Esben
A2 - Albus, Margot
A2 - Alexander, Madeline
A2 - Amin, Farooq
A2 - Bacanu, Silviu A.
A2 - Begemann, Martin
A2 - Belliveau, Richard A.
A2 - Bene, Judit
A2 - Bergen, Sarah E.
A2 - Bevilacqua, Elizabeth
A2 - Bigdeli, Tim B.
A2 - Black, Donald W.
A2 - Bruggeman, Richard
A2 - Buccola, Nancy G.
A2 - Buckner, Randy L.
A2 - Byerley, William
A2 - Cahn, Wiepke
A2 - Cai, Guiqing
A2 - Campion, Dominique
A2 - Cantor, Rita M.
A2 - Carr, Vaughan J.
A2 - Carrera, Noa
A2 - Catts, Stanley V.
A2 - Chambert, Kimberly D.
A2 - Chan, Raymond C.K.
A2 - Chen, Ronald Y.L.
A2 - Chen, Eric Y.H.
A2 - Cheng, Wei
A2 - Cheung, Eric F.C.
A2 - Chong, Siow Ann
A2 - Cloninger, C. Robert
A2 - Cohen, David
A2 - Cohen, Nadine
A2 - Cormican, Paul
A2 - Craddock, Nick
A2 - Nikitina-Zake, Liene
N1 - Funding Information:
This research is supported by the Australian National Health and Medical Research Council (1080157, 1087889) and the Australian Research Council (DP160102126, FT160100229). This research has been conducted using the UK Biobank Resource. UK Biobank Research Ethics Committee (REC) approval number is 11/NW/0382. Our reference number approved by UK Biobank is 14575. GERA data came from a grant, the Resource for Genetic Epidemiology Research in Adult Health and Aging (RC2 AG033067; Schaefer and Risch, PIs) awarded to the Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) and the UCSF Institute for Human Genetics. The RPGEH was supported by grants from the Robert Wood Johnson Foundation, the Wayne and Gladys Valley Foundation, the Ellison Medical Foundation, Kaiser Permanente Northern California, and the Kaiser Permanente National and Northern California Community Benefit Programs. The RPGEH and the Resource for Genetic Epidemiology Research in Adult Health and Aging are described in the GERA website (see Web Resources). This study makes use of data generated by the Wellcome Trust Case-Control Consortium. A full list of the investigators who contributed to the generation of the WTCCC data is available online. Funding for the WTCCC project was provided by the Wellcome Trust under awards 076113, 085475, and 090355.
Publisher Copyright:
© 2018 American Society of Human Genetics
PY - 2018/6/7
Y1 - 2018/6/7
N2 - Genetic correlation is a key population parameter that describes the shared genetic architecture of complex traits and diseases. It can be estimated by current state-of-the-art methods, i.e., linkage disequilibrium score regression (LDSC) and genomic restricted maximum likelihood (GREML). The massively reduced computing burden of LDSC compared to GREML makes it an attractive tool, although the accuracy (i.e., magnitude of standard errors) of LDSC estimates has not been thoroughly studied. In simulation, we show that the accuracy of GREML is generally higher than that of LDSC. When there is genetic heterogeneity between the actual sample and reference data from which LD scores are estimated, the accuracy of LDSC decreases further. In real data analyses estimating the genetic correlation between schizophrenia (SCZ) and body mass index, we show that GREML estimates based on ∼150,000 individuals give a higher accuracy than LDSC estimates based on ∼400,000 individuals (from combined meta-data). A GREML genomic partitioning analysis reveals that the genetic correlation between SCZ and height is significantly negative for regulatory regions, an effect that whole-genome or LDSC approaches have less power to detect. We conclude that LDSC estimates should be interpreted carefully, as there can be uncertainty about homogeneity among combined meta-datasets. We suggest that any interesting findings from massive LDSC analysis for a large number of complex traits should be followed up, where possible, with more detailed analyses with GREML methods, even if sample sizes are smaller.
AB - Genetic correlation is a key population parameter that describes the shared genetic architecture of complex traits and diseases. It can be estimated by current state-of-the-art methods, i.e., linkage disequilibrium score regression (LDSC) and genomic restricted maximum likelihood (GREML). The massively reduced computing burden of LDSC compared to GREML makes it an attractive tool, although the accuracy (i.e., magnitude of standard errors) of LDSC estimates has not been thoroughly studied. In simulation, we show that the accuracy of GREML is generally higher than that of LDSC. When there is genetic heterogeneity between the actual sample and reference data from which LD scores are estimated, the accuracy of LDSC decreases further. In real data analyses estimating the genetic correlation between schizophrenia (SCZ) and body mass index, we show that GREML estimates based on ∼150,000 individuals give a higher accuracy than LDSC estimates based on ∼400,000 individuals (from combined meta-data). A GREML genomic partitioning analysis reveals that the genetic correlation between SCZ and height is significantly negative for regulatory regions, an effect that whole-genome or LDSC approaches have less power to detect. We conclude that LDSC estimates should be interpreted carefully, as there can be uncertainty about homogeneity among combined meta-datasets. We suggest that any interesting findings from massive LDSC analysis for a large number of complex traits should be followed up, where possible, with more detailed analyses with GREML methods, even if sample sizes are smaller.
KW - accuracy
KW - biasedness
KW - body mass index
KW - genetic correlation
KW - genome-wide SNPs
KW - genomic restricted maximum likelihood
KW - height
KW - linkage disequilibrium score regression
KW - schizophrenia
KW - SNP heritability
UR - http://www.scopus.com/inward/record.url?scp=85046126170&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2018.03.021
DO - 10.1016/j.ajhg.2018.03.021
M3 - Article
C2 - 29754766
AN - SCOPUS:85046126170
SN - 0002-9297
VL - 102
SP - 1185
EP - 1194
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 6
ER -