The reason for pre-defined breaks (-logP)

Regarding the log-transformed P-value breaks (-log10 P) used to generate the joint distribution in conjFDR, the default settings are:
trait 1: breaks over [0, 30] in increments of 0.01, i.e. seq(0,30,0.01)
trait 2: breaks over [0, 3] in increments of 0.1, i.e. seq(0,3,0.1)

Is there a reason for these particular breaks? Would it be beneficial to adjust the breaks for trait 2?

I tried a different set of breaks: trait 1 over [0, 20] in increments of 0.01 and trait 2 over [0, 10] in increments of 0.1. On the same dataset, these breaks identify more SNPs with lower conjFDR. The distribution of conjFDR also changed slightly, because the joint P-value space is divided into smaller cells. I assume that finer breaks better capture local peaks and reduce dilution artifacts.
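To make the comparison concrete, here is a minimal Python sketch that counts the cells in each grid. The helper names `n_bins` and `grid_cells` are made up for illustration and are not part of any conjFDR tool; the only inputs are the break settings quoted above.

```python
def n_bins(lo, hi, step):
    """Number of intervals between evenly spaced breaks, as in R's seq(lo, hi, step)."""
    # round() guards against floating-point noise such as 3 / 0.1 = 30.000000000000004
    return round((hi - lo) / step)


def grid_cells(trait1, trait2):
    """Total number of cells in the joint (trait 1 x trait 2) grid."""
    return n_bins(*trait1) * n_bins(*trait2)


# Default breaks: trait 1 over [0, 30] by 0.01, trait 2 over [0, 3] by 0.1
default_cells = grid_cells((0, 30, 0.01), (0, 3, 0.1))

# Alternative breaks: trait 1 over [0, 20] by 0.01, trait 2 over [0, 10] by 0.1
alt_cells = grid_cells((0, 20, 0.01), (0, 10, 0.1))

print(default_cells, alt_cells)
```

The default grid has 3000 × 30 = 90,000 cells and the alternative has 2000 × 100 = 200,000. The step sizes are the same in both settings, so the extra resolution comes from extending the trait 2 range from -log10 P = 3 to 10.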

For example, a SNP with (P1 = 10^-5, P2 = 5×10^-6) might fall into a bin with many null SNPs under coarse breaks (e.g., P2 = 0–0.01) and fail the conjFDR threshold. With finer breaks, it could dominate its new bin (P2 = 0–10^-5) and pass.
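A small Python sketch of where this SNP's trait 2 value lands under each setting. It assumes that values beyond the last break are clamped into the top bin; I have not checked how the tool actually handles them, and `bin_index` is a hypothetical helper, not a function from the package.

```python
import math


def bin_index(neg_log_p, lo, hi, step):
    """Index of the interval containing neg_log_p.

    Assumption: values outside [lo, hi) are clamped into the first or last bin.
    """
    n = round((hi - lo) / step)
    i = math.floor((neg_log_p - lo) / step)
    return min(max(i, 0), n - 1)


x2 = -math.log10(5e-6)  # trait 2 value on the -log10 scale, about 5.30

default_bin = bin_index(x2, 0, 3, 0.1)   # top bin of the default trait 2 breaks
alt_bin = bin_index(x2, 0, 10, 0.1)      # interior bin of the extended breaks

# Under the clamping assumption, a SNP with P2 = 1e-3 shares the default top bin
shared_bin = bin_index(-math.log10(1e-3), 0, 3, 0.1)

print(default_bin, alt_bin, shared_bin)
```

Under that assumption, every SNP with P2 ≤ 10^-3 shares the top bin of the default trait 2 breaks, while the [0, 10] breaks place this SNP in its own interval around -log10 P2 = 5.3.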

However, finer breaks may also increase the variance of the FDR estimate, since finer bins reduce the number of SNPs per cell, and the empirical FDR adjustment may become unstable or inflated, especially in regions with low SNP density. Is there a way to obtain a more robust estimate, or a complementary method (e.g., clumping)?
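As a rough illustration of the variance concern, the sketch below uses the binomial standard error of a proportion estimated from n SNPs. This is a simplification and not a description of the conjFDR estimator itself: it assumes independent SNPs (which LD violates), and the value 0.05 is an arbitrary example proportion.

```python
import math


def proportion_se(p, n):
    """Binomial standard error of a proportion p estimated from n independent SNPs."""
    return math.sqrt(p * (1 - p) / n)


# The same underlying proportion, estimated from progressively fewer SNPs per cell
for n in (10000, 100, 10):
    print(n, proportion_se(0.05, n))

# Cutting the SNP count by a factor of 100 inflates the standard error tenfold
ratio = proportion_se(0.05, 100) / proportion_se(0.05, 10000)
```

The standard error scales as 1/sqrt(n), so sparsely populated cells in the tail of the grid are where an empirical estimate would be least stable.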
