4. eQTL Mapping

FastGxC estimates genetic effects on the context-shared and each context-specific component using ultra-fast linear regression models.

eQTL Mapping with MatrixEQTL

FastGxC uses the MatrixEQTL R package to perform cis- (or trans-) eQTL mapping using the eQTL_mapping_step() function.

eQTL_mapping_step()

This function takes input genotypes, expression, and gene/SNP location files (see simulate data and decomposition sections for format of input files), and produces summary statistics in MatrixEQTL format. Below is an example code demonstration how to run eQTL mapping on the shared and C-specific expression components from the simulated data.

input_dir <- "simulated_example/"
out_dir <- input_dir
expr_files <- list.files(input_dir, pattern = "_specific_expression.txt$")

### NOTE: this is just an example of how to set context names with simulated data. 
### Please change these lines to be specific to your dataset
nC <- length(expr_files)
context_names <- paste0("context", seq(1, nC))

all_contexts <- c(context_names, "shared")
SNP_file_name <- file.path(input_dir, "SNPs.txt")
snps_location_file_name <- file.path(input_dir, "snpsloc.txt")
gene_location_file_name <- file.path(input_dir, "geneloc.txt")

for (i in seq_along(all_contexts)) {
  context <- all_contexts[i]

  if (context == "shared") {
    shared_specific <- "shared"
    expression_file_name <- file.path(input_dir, "context_shared_expression.txt")
  } else {
    shared_specific <- "specific"
    expression_file_name <- file.path(input_dir, paste0(context, "_specific_expression.txt"))
  }
  
  eQTL_mapping_step(
    SNP_file_name = SNP_file_name,
    snps_location_file_name = snps_location_file_name,
    expression_file_name = expression_file_name,
    gene_location_file_name = gene_location_file_name,
    context = context,
    out_dir = out_dir,
    output_file_name_cis = file.path(out_dir, paste0(context, "_", shared_specific, ".cis_pairs.txt")),
    output_file_name_tra = file.path(out_dir, paste0(context, "_", shared_specific, ".trans_pairs.txt")),
    use_model = modelLINEAR,
    cis_dist = 1e6,
    pv_threshold_cis = 1,
    pv_threshold_tra = 0,
    error_covariance = numeric()
  )
}

Accelerated eQTL Mapping with tensorqtl (Optional)

FastGxC also supports GPU-accelerated cis-eQTL mapping using tensorqtl, integrated via the reticulate R package. To enable this, simply pass method = "tensorqtl" to the eQTL_mapping_step() function. The inputs remain unchanged. This method is particularly useful for large datasets and provides substantial speedups. Note: TensorQTL requires Python 3.7+ and appropriate libraries (tensorqtl, numpy, pandas, pyarrow, torch). Ensure your environment is properly configured.

Output:

FastGxC will save C+1 summary statistics files in .txt format (see the output file example below) for compatibility with downstream FastGxC multiple testing adjustment steps.

contextX_specific.cis_pairs.txt

SNP	        gene	beta	                p.value	                FDR
snp64928	gene65	0.418661087751389	8.4807042218939e-07	0.084807042218939
snp42566	gene43	-0.355810463428497	7.53491371859939e-06	0.306773379105091
snp22672	gene23	-0.394770801067352	9.20320137315272e-06	0.306773379105091
snp73350	gene74	0.344478219747543	1.59910672991937e-05	0.399776682479843
snp86064	gene87	-0.371489137411118	3.10237595856035e-05	0.550542627746407

Getting eQTL effect sizes in each context post FastGxC

FastGxC decomposes eQTL effects into a shared component and context-specific deviations. By construction, specific effects represent deviations from the mean effect across contexts. Positive values indicate stronger-than-average effects; negative values indicate weaker-than-average effects. The full eQTL effect size in each context $β_c$ can always be recovered by adding the shared $β^{sh}$ and specific effects $β_c^{sp}$, i.e. $β_c = β^{sh} + β_c^{sp}$.

The code below reads the shared and context-specific summary statistics produced by the eQTL mapping step and recovers the total effect size in each context. The merge performs an inner join on SNP-gene pairs, so only pairs present in both files are combined.

input_dir <- "simulated_example/"

# Load shared effects
shared <- read.table(file.path(input_dir, "shared_shared.cis_pairs.txt"),
                     header = TRUE, stringsAsFactors = FALSE)

# Find all context-specific output files
specific_files <- list.files(input_dir, pattern = "_specific\\.cis_pairs\\.txt$",
                             full.names = TRUE)

# For each context, merge with shared effects and compute total effect
total_effects <- lapply(specific_files, function(f) {
  context <- gsub("_specific\\.cis_pairs\\.txt$", "", basename(f))
  specific <- read.table(f, header = TRUE, stringsAsFactors = FALSE)
  
  # Inner join on SNP-gene pair ensures only matching pairs are combined
  merged <- merge(shared, specific, by = c("SNP", "gene"), suffixes = c(".sh", ".sp"))
  merged$beta.total <- merged$beta.sh + merged$beta.sp
  merged$context <- context
  merged[, c("SNP", "gene", "context", "beta.sh", "beta.sp", "beta.total")]
})

total_effects <- do.call(rbind, total_effects)
head(total_effects)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

4. eQTL Mapping

eQTL Mapping with MatrixEQTL

Accelerated eQTL Mapping with tensorqtl (Optional)

Output:

Getting eQTL effect sizes in each context post FastGxC

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally