If you have followed the steps in the README.md Markdown document and have installed the peakHiC pipeline and have run the commands to create V4C profiles and loops for the example data, the resulting files should now be available for further analysis. Here we will briefly describe how to visualize V4C plots from viewpoints of interest in R and how to export loops and V4C BigWig tracks for further analysis using e.g. the UCSC (https://genome.ucsc.edu/) or IGV (https://igv.org/) genome browsers or the Juicebox (https://github.com/aidenlab/Juicebox) tool developed by the lab of Erez Lieberman Aiden.
The first step is to open the R console and load the peakHiC interactive functions from this repository. Again make sure that you set the baseFolder variable in the code below correctly to match your local installation.
baseFolder <- "/home/geert/localdev/github/peakHiC/"
sourceFile <- paste0(baseFolder,"R/peakHiC_interactive.R")
source(sourceFile)
Anytime we want to analyze data with peakHiC, we also need to load a peakHiCObj that contains all configuration data for a specific peakHiC dataset and analysis run. Here, we will use the peakHiC example data, analyzing Hi-C from GM12878 cells limited to a ~4Mb section (181 viewpoints) on chr1 (hg38).
objFile <- paste0(baseFolder,"DATA/example_data/hg38_4DN_Rao_GM12878_peakHiC_example_peakHiCObj.rds")
peakHiCObj <- readRDS(objFile)
Viewpoints can be defined from any locus of interest in the genome. The example data contains 181 viewpoints from gene promoters (TSS), CTCF bound site in GM12878 (ENCODE ChIP-Seq data) and sites with enrichment for H3K27ac in GM12878. For the example data, these viewpoints are already stored in a GenomicRanges object. You can view them by typing
peakHiCObj$vpsGR
in the R console. To add your own viewpoints, you need to create a txt file which defines them.
- Viewpoint file
- peakHiC viewpoints are organized in a viewpoint file. Columns in this file specify the genomic coordinates of each viewpoint (row in the file). Additionally we need to specify a unique vpID, a name and a type (e.g. TSS, CTCF site) :
| chr | vpPos | vpID | name | type |
|---|---|---|---|---|
| chr1 | 42846618 | CTCF_ENCODE_14013 | CTCF_ENCODE_14013 | CTCF |
| chr1 | 42931204 | CTCF_ENCODE_25928 | CTCF_ENCODE_25928 | CTCF |
Table 1. Example of a peakHiC viewpoint file which defines genomic loci from which V4C profiles will be created.
The R code below will read this file and replace the vpsGR object in peakHiCObj, so that these viewpoints can be analyzed. Based on the genomic partition and restriction fragments defined in the peakHiC object, the creatVPs function will assign a partID and fragID to each viewpoint.
exampleVPs <- paste0(baseFolder,"DATA/example_data/hg38_4DN_Rao_GM12878_peakHiC_example_VPs.txt")
vpData <- read.table(exampleVPs,header=TRUE,stringsAsFactors=FALSE)
peakHiCObj$vpsGR <- createVPs(vpData=vpData,peakHiCObj=peakHiCObj)
Below is a screenshot where we loaded the exported BigWig tracks for 2 example VPs into the IGV browser.
Figure 1. Visualization of peakHiC BigWig tracks containing V4C profiles from 2 example viewpoints
