
Evaluation of power in alternate simulations

I implemented the MTAG method using test code written in Python because the MTAG software treated the input as defective when the median z-score was far from zero, as is the case for the input used in the power simulation.

I evaluated the power of PLEIO, MTAG-U, MTAG-C, ASSET, and METAL under various simulation settings. For each setting, I defined a specific genetic correlation structure (𝑪𝒈), heritability (𝒉²), phenotypic unit (𝑼), and trait type (quantitative, 𝑄, or binary, 𝐵). In the power simulation, the 𝑪𝒈 and 𝒉² given to PLEIO and MTAG are not estimates but the true genetic correlation and heritability. I assumed seven traits (𝑇 = 7) and repeated the simulation 10,000 times. For each method, the statistical power was estimated as the proportion of simulations with 𝑃 < 5 × 10⁻⁸. In this power simulation, instead of directly sampling the effect sizes from a multivariate distribution, I generated the actual genotypes (see "Generation of effect sizes used in power simulation").
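The power estimate described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function name `estimate_power` is hypothetical, and the normal-approximation (Wald) 95% interval is an assumption, since the text does not say how the confidence intervals were computed.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # significance threshold stated in the text


def estimate_power(p_values, alpha=GENOME_WIDE_ALPHA):
    """Return (power, ci_low, ci_high) from one p-value per simulation replicate.

    Power is the fraction of replicates with p < alpha. The interval is a
    normal-approximation (Wald) 95% interval, assumed here for illustration.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one replicate")
    hits = sum(1 for p in p_values if p < alpha)
    power = hits / n
    half_width = 1.96 * math.sqrt(power * (1.0 - power) / n)
    return power, max(0.0, power - half_width), min(1.0, power + half_width)
```

With 10,000 replicates, as in the simulations here, the half-width of such an interval is at most about one percentage point.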

First, I assumed a fixed heritability of 0.4 and perfect correlation (𝑟² = 1.0) across the seven traits. This represents a scenario in which multiple GWAS of the same trait are collected. In this situation, PLEIO, METAL, and MTAG-U performed better than MTAG-C and ASSET (Figure 9a). With a sample size of 𝑁 = 50,000, the power of PLEIO, METAL, MTAG-U, MTAG-C, and ASSET was 63.79%, 63.81%, 63.81%, 63.81%, and 61.67%, respectively. As expected, METAL performed well because it is optimized to aggregate multiple GWAS of the same trait. MTAG-U and METAL are analytically identical[19]; therefore, MTAG-U performed the same as METAL. PLEIO attained power similar to (or slightly lower than) that of METAL and MTAG-U, as it can account for the genetic correlations. In this scenario with one trait, the Bonferroni correction for multiple testing is not necessary for MTAG-U. Because of this, MTAG-C was overly conservative.

Second, I varied the heritability across the seven traits from 0.005 to 0.7. I assumed a uniform genetic correlation of 𝑟 = 0.5 for all trait pairs. In this scenario, PLEIO outperformed the other methods (Figure 9b). With a sample size of 𝑁 = 50,000, PLEIO attained a power of 77.6%, while the second-best method (MTAG-U) attained 67.2%, and the third-best method (MTAG-C) attained 62.7%. The result indicates that PLEIO is well suited to a joint analysis of multiple traits with different heritabilities.
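As a rough sketch of how such a setting can be parameterized, the genetic correlation matrix can be scaled by the per-trait heritabilities to obtain a genetic covariance matrix. The helper `genetic_covariance` is a hypothetical name, and the geometric spacing of the seven heritabilities is an assumption: the text gives only the range (0.005 to 0.7), not the individual values.

```python
import numpy as np


def genetic_covariance(h2, corr):
    """Scale a genetic correlation matrix by per-trait heritabilities.

    Entry (i, j) of the result is corr[i, j] * sqrt(h2[i] * h2[j]).
    """
    h2 = np.asarray(h2, dtype=float)
    corr = np.asarray(corr, dtype=float)
    scale = np.sqrt(h2)
    return corr * np.outer(scale, scale)


# Hypothetical heritabilities spanning the stated range (0.005 to 0.7);
# the actual per-trait values are not given in the text.
h2 = np.geomspace(0.005, 0.7, num=7)

# Uniform genetic correlation of 0.5 between every pair of traits.
corr = np.full((7, 7), 0.5)
np.fill_diagonal(corr, 1.0)

cov = genetic_covariance(h2, corr)
```

The diagonal of `cov` then holds the heritabilities, and each off-diagonal entry is the genetic covariance implied by 𝑟 = 0.5.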

Third, I simulated a complex genetic correlation pattern with both positive and negative correlations. I divided the seven traits into two groups (three traits and four traits). I set the within-group correlation of the first group to 0.95 and that of the second group to 0.9, and I set the correlation between groups to −0.9. I assumed a uniform heritability of 0.4 for all traits. PLEIO outperformed the other methods (Figure 9c). With a sample size of 𝑁 = 50,000, PLEIO attained a power of 78.6%, while the second-best method (MTAG-U) attained 66.3%, and the third-best method (MTAG-C) attained 62.6%. The result indicates that PLEIO is well suited to a joint analysis of multiple traits with a complex correlation pattern.
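The two-group correlation structure described above can be written down explicitly and checked for validity. The following is a minimal sketch; `block_correlation` is a hypothetical helper, and only the group sizes and correlation values come from the text.

```python
import numpy as np


def block_correlation(sizes, within, between):
    """Build a correlation matrix with one block per group of traits.

    sizes:   number of traits in each group
    within:  correlation between traits of the same group, one value per group
    between: correlation between traits of different groups
    """
    total = sum(sizes)
    corr = np.full((total, total), between, dtype=float)
    start = 0
    for size, r in zip(sizes, within):
        corr[start:start + size, start:start + size] = r
        start += size
    np.fill_diagonal(corr, 1.0)
    return corr


# Three traits at 0.95, four traits at 0.9, and -0.9 between the groups.
corr = block_correlation(sizes=(3, 4), within=(0.95, 0.9), between=-0.9)

# A valid correlation matrix must have no negative eigenvalues.
min_eig = np.linalg.eigvalsh(corr).min()
```

For these values the smallest eigenvalue is positive, so the matrix is a valid (positive definite) correlation matrix even though the between-group correlation is strongly negative.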

Fourth, I simulated a mixture of quantitative and binary traits. I assumed four quantitative traits and three binary traits. For the quantitative traits, I assumed that phenotypic units could differ between traits. With 𝑈 denoting the standard phenotypic unit, I varied the units of the four quantitative traits from 0.1𝑈 to 10𝑈. I assumed a uniform heritability of 0.4 and a uniform genetic correlation of 0.5. Again, PLEIO outperformed the other methods (Figure 9d). With a sample size of 𝑁 = 50,000, PLEIO attained a power of 80.1%, while the second-best method (MTAG-U) attained 63.4%, and the third-best method (MTAG-C) attained 57.3%. The result indicates that PLEIO systematically combines heterogeneous traits by standardizing the effect sizes.
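To see why the phenotypic unit matters for raw effect sizes but not for standardized statistics, consider the toy regression below. It is not PLEIO's standardization procedure, only an illustration of the underlying point: rescaling a phenotype multiplies both the effect size and its standard error by the same factor, so their ratio is unchanged. The genotype, allele frequency, effect size, and sample size are all hypothetical.

```python
import numpy as np


def ols_slope_and_se(x, y):
    """Simple linear regression of y on x (with intercept); returns (beta, se)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = np.sum(xc ** 2)
    beta = np.sum(xc * yc) / sxx
    resid = yc - beta * xc
    sigma2 = np.sum(resid ** 2) / (len(x) - 2)
    return beta, np.sqrt(sigma2 / sxx)


rng = np.random.default_rng(0)
genotype = rng.binomial(2, 0.3, size=2000).astype(float)  # hypothetical SNP
phenotype = 0.1 * genotype + rng.normal(size=2000)         # measured in unit U

betas = {}
z_scores = {}
for unit_scale in (0.1, 1.0, 10.0):  # the same trait reported in 0.1U, U, 10U
    beta, se = ols_slope_and_se(genotype, unit_scale * phenotype)
    betas[unit_scale] = beta
    z_scores[unit_scale] = beta / se
```

The entries of `betas` differ by the unit factor, while the entries of `z_scores` agree up to floating-point rounding, which is why a method that combines raw effect sizes needs some form of standardization when units differ.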

So far, I tested the power by changing one factor per simulation: different heritabilities, a complex genetic correlation pattern, or different phenotypic units. In a real data analysis, all three can occur together. I therefore tested such a combined situation (Figure 9e). With a sample size of 𝑁 = 50,000, PLEIO attained a power of 49.2%, while the second-best method (MTAG-U) attained 59.3%.

Next, I ran a power simulation using parameters based on real data. In this simulation of seven studies, I assumed one focal trait and six non-focal traits, where the focal trait shows strong genetic correlations with the non-focal traits. Here, I assumed that MTAG could selectively take the p-values of the focal trait only, which I call MTAG-F.

Based on the information provided by LD-HUB[32], I chose LDL as the focal trait and selected six traits that are strongly correlated with LDL (0.35 ≥ |𝑟𝑔| ≥ 0.17): triglyceride (TG), coronary artery disease (CAD), age at smoking (Age_Smo), childhood IQ (cIQ), hemoglobin A1c (HbA1C), and waist-hip ratio (WHR). For simplicity, I assumed that all seven traits share 1000 causal variants. Unlike MTAG-F, PLEIO and MTAG-U can yield strong associations driven by one or a few non-focal traits with large 𝒉² if the same sample size is assumed for all traits. To compensate for this difference in heritability, the sample sizes were adjusted so that 𝑵𝒉² is constant for all traits. Then, I doubled the sample size of the focal trait.
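The sample-size adjustment can be sketched as follows. The function name and the heritability values are hypothetical, chosen only for illustration (the text does not list the per-trait heritabilities); the rule itself, 𝑁 = target / 𝒉² with the focal trait doubled, follows the description above.

```python
def sample_sizes_for_constant_nh2(h2_by_trait, target_nh2, focal_trait,
                                  focal_multiplier=2):
    """Choose N per trait so that N * h2 equals target_nh2.

    The focal trait's sample size is then multiplied by focal_multiplier.
    """
    sizes = {}
    for trait, h2 in h2_by_trait.items():
        n = target_nh2 / h2
        if trait == focal_trait:
            n *= focal_multiplier
        sizes[trait] = round(n)
    return sizes


# Hypothetical heritabilities, for illustration only.
example_h2 = {"LDL": 0.2, "TG": 0.1, "CAD": 0.05}

sizes = sample_sizes_for_constant_nh2(example_h2, target_nh2=10_000,
                                      focal_trait="LDL")
```

Under these assumed values, a non-focal trait with a heritability of 0.1 receives 100,000 samples at 𝑵𝒉² = 10,000, and the focal trait receives twice the number implied by its own heritability.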

Figure 10 shows the result of the power simulation above. Again, PLEIO outperformed the other methods. With sample sizes that satisfy 𝑵𝒉² = 10,000, PLEIO attained a power of 72.6%, while the second-best method (MTAG-U) attained 52.8%, and the third-best method (ASSET) attained 37.3%. Note that MTAG-F is a trait-specific method, so its results are interpreted differently from those of the other methods. Therefore, results from the other methods must be interpreted carefully before concluding that the focal trait drives the association.

Figure 9. The results of the power tests. I performed a total of five power tests. Each line shows the statistical power of a method in an association test using seven summary statistics: PLEIO (red), MTAG-U (blue), MTAG-C (light blue), METAL (green), and ASSET (yellow). At the bottom of the figure, I visualized the simulation setting of each test. The box plot shows the genetic correlation. 𝑄 and 𝐵 indicate whether the phenotype is quantitative or binary. The heritability values of the traits are shown on the left side of the box plot. The phenotypic units of the traits are shown at the bottom of the box plot. The line thickness indicates the 95% confidence interval.

Figure 10. Power test results assuming LDL as the focal trait. Each line shows the statistical power of a method in an association test using seven summary statistics: PLEIO (red), MTAG-U (blue), MTAG-F (light blue), METAL (green), and ASSET (yellow). Note that the x-axis is the product of the sample size and the heritability. For example, the number of samples of a trait with a heritability of 0.1 is 100,000 for 𝑵𝒉² = 10,000 and 40,000 for 𝑵𝒉² = 4,000. At the bottom of the figure, I visualized the simulation setting of each test. The box plot shows the genetic correlation. The color of the trait name indicates whether the phenotype is quantitative (green) or binary (purple). Since the focal trait is the main interest of this analysis, I assumed that the focal trait collected twice as many samples as a non-focal trait. In other words, at a given point on the x-axis, the focal trait has twice the 𝑵𝒉² shown.