SnapHiC: a computational pipeline to identify chromatin loops from single-cell Hi-C data

Miao Yu1,2, Armen Abnousi3, Yanxiao Zhang2, Guoqiang Li2, Lindsay Lee3, Ziyin Chen1, Rongxin Fang2,4, Taylor M Lagler5, Yuchen Yang6,7, Jia Wen8, Quan Sun5, Yun Li5,8,9, Bing Ren10,11, Ming Hu12

  1. State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China.
  2. Ludwig Institute for Cancer Research, La Jolla, CA, USA.
  3. Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, USA.
  4. Howard Hughes Medical Institute, Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA.
  5. Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA.
  6. Department of Pathology and Laboratory Medicine, University of North Carolina, Chapel Hill, NC, USA.
  7. McAllister Heart Institute, University of North Carolina, Chapel Hill, NC, USA.
  8. Department of Genetics, University of North Carolina, Chapel Hill, NC, USA.
  9. Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA.
  10. Ludwig Institute for Cancer Research, La Jolla, CA, USA. biren@health.ucsd.edu.
  11. Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA. biren@health.ucsd.edu.
  12. Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, USA. hum@ccf.org.

Abstract

Single-cell Hi-C (scHi-C) analysis has been increasingly used to map chromatin architecture in diverse tissue contexts, but computational tools to define chromatin loops at high resolution from scHi-C data are still lacking. Here, we describe Single-Nucleus Analysis Pipeline for Hi-C (SnapHiC), a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. Using scHi-C data from 742 mouse embryonic stem cells, we benchmark SnapHiC against a number of computational tools developed for mapping chromatin loops and interactions from bulk Hi-C. We further demonstrate its use by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells, which uncovers cell type-specific chromatin loops and predicts putative target genes for noncoding sequence variants associated with neuropsychiatric disorders. Our results indicate that SnapHiC could facilitate the analysis of cell type-specific chromatin architecture and gene regulatory programs in complex tissues.

Presented by Miao Yu