chi-raag/pyihw

pyIHW

Python implementation of Independent Hypothesis Weighting (IHW), the method introduced by Ignatiadis et al. (2016, Nature Methods).

IHW improves power in large-scale multiple testing by learning data-driven weights from an independent covariate while controlling FDR at a user-specified level.
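
To make "weights" concrete: in a weighted Benjamini-Hochberg procedure, each p-value is divided by its weight before the usual BH step-up, and the weights average to one so the overall FDR budget is unchanged. The following is a minimal NumPy sketch of that step alone, using hand-picked weights. `weighted_bh` is a hypothetical helper, not part of pyIHW's API, and it leaves out what IHW actually contributes, namely learning the weights from the covariate.

```python
import numpy as np

def weighted_bh(pvalues, weights, alpha):
    """Illustrative weighted BH: apply the BH step-up to p_i / w_i.

    Hypothetical helper for exposition; not pyIHW's implementation.
    Returns a boolean mask of rejected hypotheses.
    """
    p = np.asarray(pvalues, dtype=float)
    w = np.asarray(weights, dtype=float)
    if p.shape != w.shape:
        raise ValueError("pvalues and weights must have the same shape")
    if np.any(w < 0) or not np.isclose(w.mean(), 1.0):
        raise ValueError("weights must be non-negative and average to 1")

    m = p.size
    # A zero weight means the hypothesis can never be rejected.
    q = np.full(m, np.inf)
    positive = w > 0
    q[positive] = p[positive] / w[positive]

    # Standard BH step-up on the weighted p-values.
    order = np.argsort(q)
    below = q[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

# Toy input: shifting weight toward the third hypothesis changes the outcome.
toy_p = np.array([0.01, 0.04, 0.09, 0.6])
uniform_rejections = weighted_bh(toy_p, np.ones(4), alpha=0.1)
tilted_rejections = weighted_bh(toy_p, np.array([1.0, 1.0, 1.8, 0.2]), alpha=0.1)
```

With uniform weights this reduces to plain BH. On the toy input, the tilted weights yield one more rejection than the uniform ones.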

Installation

pip install pyihw

Quick start

pyIHW ships with DESeq2 results from the airway RNA-seq dataset (Himes et al. 2014):

import numpy as np
from pyihw import ihw, load_airway, bh_threshold

pvalues, basemean = load_airway()
print(f"{len(pvalues)} hypotheses")
# prints: 33469 hypotheses

Run IHW with baseMean as the covariate and compare to standard Benjamini-Hochberg:

result = ihw(pvalues, basemean, alpha=0.1, rng=np.random.default_rng(42))

t_bh = bh_threshold(pvalues, alpha=0.1)
bh_rejections = int(np.sum(pvalues <= t_bh))

print(f"BH rejections:  {bh_rejections}")
print(f"IHW rejections: {result.n_rejections}")
print(f"Improvement:    +{result.n_rejections - bh_rejections} discoveries")
# prints:
# BH rejections:  4099
# IHW rejections: 4876
# Improvement:    +777 discoveries
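
For readers unfamiliar with the BH baseline, here is a rough sketch of how a Benjamini-Hochberg p-value threshold can be computed with NumPy. `bh_threshold_sketch` is a hypothetical name used for illustration. Whether pyIHW's `bh_threshold` returns the largest rejected p-value (as here) or the corresponding critical value `alpha * k / m` is not stated above, so treat this as an assumption. Both conventions give the same rejections under a `pvalues <= threshold` comparison.

```python
import numpy as np

def bh_threshold_sketch(pvalues, alpha):
    """Illustrative BH threshold: the largest sorted p-value p_(k)
    satisfying p_(k) <= alpha * k / m, or 0.0 if none qualifies.

    Hypothetical helper; not necessarily how pyIHW computes it.
    """
    p = np.sort(np.asarray(pvalues, dtype=float))
    m = p.size
    if m == 0:
        return 0.0
    passing = np.nonzero(p <= alpha * np.arange(1, m + 1) / m)[0]
    if passing.size == 0:
        return 0.0
    return float(p[passing[-1]])

toy_p = np.array([0.01, 0.04, 0.09, 0.6])
toy_threshold = bh_threshold_sketch(toy_p, alpha=0.1)
toy_rejections = int(np.sum(toy_p <= toy_threshold))
```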

Parameters

ihw(
    pvalues,
    covariates,
    alpha,
    *,
    covariate_type="ordinal",   # "ordinal" or "nominal"
    nbins="auto",               # number of covariate strata
    nfolds=5,                   # cross-validation folds
    adjustment_type="bh",       # "bh" (FDR) or "bonferroni" (FWER)
    null_proportion=False,      # Storey's pi0 estimation
    rng=None,                   # numpy.random.Generator for reproducibility
)
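
To illustrate what "covariate strata" means for an ordinal covariate, the sketch below splits hypotheses into `nbins` groups of near-equal size by covariate rank. This is only one plausible scheme: `stratify_ordinal` is a hypothetical helper, and how pyIHW bins the covariate internally (including what `nbins="auto"` resolves to) is not described here.

```python
import numpy as np

def stratify_ordinal(covariate, nbins):
    """Illustrative rank-based binning into `nbins` near-equal strata.

    Hypothetical helper; pyIHW's own binning may differ.
    Ties are broken by input order because the sort is stable.
    """
    x = np.asarray(covariate, dtype=float)
    # Double argsort yields each element's rank in 0..m-1.
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable")
    # Map ranks onto stratum labels 0..nbins-1.
    return (ranks * nbins) // x.size

# A skewed synthetic covariate, loosely mimicking mean expression values.
demo_rng = np.random.default_rng(0)
demo_covariate = demo_rng.lognormal(size=1000)
demo_strata = stratify_ordinal(demo_covariate, nbins=4)
demo_sizes = np.bincount(demo_strata)
```

Rank-based binning keeps the strata balanced even when the covariate is heavily skewed, which equal-width bins would not.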

Reproducibility

Pass a seeded numpy.random.Generator as the rng argument to get deterministic results; with the default rng=None, results may differ between runs:

result = ihw(pvalues, covariates, alpha=0.1, rng=np.random.default_rng(42))
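
Why a seeded generator matters can be shown with NumPy alone. One place randomness plausibly enters a cross-validated procedure is the assignment of hypotheses to folds. That is an assumption about the general technique, not a description of pyIHW's internals, and `assign_folds` below is a hypothetical helper.

```python
import numpy as np

def assign_folds(m, nfolds, rng=None):
    """Illustrative random fold assignment for m hypotheses.

    Hypothetical helper. `rng` may be a numpy.random.Generator, an
    integer seed, or None (fresh, non-reproducible entropy).
    """
    # default_rng returns an existing Generator unchanged.
    rng = np.random.default_rng(rng)
    return rng.integers(0, nfolds, size=m)

# Two generators built from the same seed produce identical assignments.
folds_a = assign_folds(1000, 5, rng=np.random.default_rng(42))
folds_b = assign_folds(1000, 5, rng=np.random.default_rng(42))
same_folds = bool(np.array_equal(folds_a, folds_b))
```

Note that a Generator is stateful: reusing the same Generator object for a second call advances its stream, so create a fresh `np.random.default_rng(42)` for each run that should match.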

Dependencies

NumPy and SciPy only.

Acknowledgments

pyIHW is a Python reimplementation of the IHW R/Bioconductor package by Nikolaos Ignatiadis and Wolfgang Huber. The method is described in:

Ignatiadis, N., Klaus, B., Zaugg, J.B. et al. Data-driven hypothesis weighting increases detection power in genome-scale multiple testing. Nature Methods 13, 577–580 (2016). doi:10.1038/nmeth.3885

The bundled airway dataset is from Himes et al. (2014), PLoS ONE 9(6): e99625 (GEO GSE52778), processed through DESeq2.
