Cohesin-mediated loop anchors confine the locations of human replication origins

Daniel J Emerson, Peiyao A Zhao, Ashley L Cook, R Jordan Barnett, Kyle N Klein, Dalila Saulebekova, Chunmin Ge, Linda Zhou, Zoltan Simandi, Miriam K Minsk, Katelyn R Titus, Weitao Wang, Wanfeng Gong, Di Zhang, Liyan Yang, Sergey V Venev, Johan H Gibcus, Hongbo Yang, Takayo Sasaki, Masato T Kanemaki, Feng Yue, Job Dekker, Chun-Long Chen, David M Gilbert, Jennifer E Phillips-Cremins.
Nature. 2022-06-08;606(7915):812-819.
Abstract
DNA replication occurs through an intricately regulated series of molecular events and is fundamental for genome stability [1,2]. At present, it is unknown how the locations of replication origins are determined in the human genome. Here we dissect the role of topologically associating domains (TADs) [3-6], subTADs [7] and loops [8] in the positioning of replication initiation zones (IZs). We stratify TADs and subTADs by the presence of corner-dots indicative of loops and the orientation of CTCF motifs. We find that high-efficiency, early replicating IZs localize to boundaries between adjacent corner-dot TADs anchored by high-density arrays of divergently and convergently oriented CTCF motifs. By contrast, low-efficiency IZs localize to weaker dotless boundaries. Following ablation of cohesin-mediated loop extrusion during G1, high-efficiency IZs become diffuse and delocalized at boundaries with complex CTCF motif orientations. Moreover, G1 knockdown of the cohesin unloading factor WAPL results in gained long-range loops and narrowed localization of IZs at the same boundaries. Finally, targeted deletion or insertion of specific boundaries causes local replication timing shifts consistent with IZ loss or gain, respectively. Our data support a model in which cohesin-mediated loop extrusion and stalling at a subset of genetically encoded TAD and subTAD boundaries is an essential determinant of the locations of replication origins in human S phase.

Related data

Data summary
All new raw data created in this manuscript have been uploaded to the 4D Nucleome portal and will be freely released for full distribution to the public (see specific details below). Processed data files for all figures and extended data figures are provided as Supplementary Tables 1–19. ORM data have been uploaded to the National Center for Biotechnology Information, BioProject database accession number PRJNA788726 (http://genome.ucsc.edu/s/dsaulebe/ORM%20data%20HCT116). Two-fraction Repli-seq data for Blobel engineered lines (raw data and processed log2[early/late] from three conditions) were obtained from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE190117.
Group 1 data (16-fraction Repli-seq data for H1 human ES cells) are available from the 4D Nucleome portal as follows: H1 human ES raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESXRBILXJ/; H1 human ES read-depth-normalized array for visualization, https://data.4dnucleome.org/files-processed/4DNFIEEYFQ7C/; H1 human ES scaled, read-depth-normalized array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI3N8GHKR/; H1 human ES early, early–mid and late IZs on read-depth-normalized array, https://data.4dnucleome.org/files-processed/4DNFIRF7WZ3H/.
Group 2 data (16-fraction Repli-seq data for wild-type HCT116 cells) are available from the 4D Nucleome portal as follows: wild-type HCT116 raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESNGZM5FG/; wild-type HCT116 mitochondria-normalized array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFIPIQTMJ9/; wild-type HCT116 early, early–mid and late IZs on mitochondria-normalized array, https://data.4dnucleome.org/files-processed/4DNFI95K53YS/.
Group 3 data (16-fraction Repli-seq data for wild-type and cohesin-knockdown HCT116 pairing) are available from the 4D Nucleome portal as follows: RAD21-knockdown HCT116 raw, https://data.4dnucleome.org/experiment-sets/4DNES92AU9JR/; RAD21-knockdown HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI3ZMWG5T/; RAD21-knockdown HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFIGOMS9G7/; wild-type HCT116 raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESNGZM5FG/; wild-type HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI6NGWNOG/; wild-type HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFIYO3H24N/.
Group 4 data (16-fraction Repli-seq data for wild-type and WAPL-knockdown HCT116 pairing) are available from the 4D Nucleome portal as follows: WAPL-knockdown HCT116 raw, https://data.4dnucleome.org/experiment-sets/4DNES72NE7SL/; WAPL-knockdown HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI7MI88QR/; WAPL-knockdown HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFIDI1QJVA/; wild-type HCT116 raw fastq, https://data.4dnucleome.org/experiment-sets/4DNESNGZM5FG/; wild-type HCT116 read-depth-normalized downsampled array for IZ calls, https://data.4dnucleome.org/files-processed/4DNFI6NGWNOG/; wild-type HCT116 early, early–mid and late IZs called on the read-depth-normalized downsampled array, https://data.4dnucleome.org/files-processed/4DNFILNNSFMD/.
Group 5 data (16-fraction Repli-seq data visualization) are available from the 4D Nucleome portal as follows: wild-type HCT116 read-depth-normalized downsampled array for visualization, https://data.4dnucleome.org/files-processed/4DNFI6NGWNOG/; RAD21-knockdown HCT116 read-depth-normalized downsampled array for visualization, https://data.4dnucleome.org/files-processed/4DNFI3ZMWG5T/; WAPL-knockdown HCT116 read-depth-normalized downsampled array for visualization, https://data.4dnucleome.org/files-processed/4DNFI7MI88QR/.
Hi-C data for wild-type and WAPL-knockdown HCT116 pairing are available from the 4D Nucleome portal as follows: WAPL-knockdown HCT116 raw Hi-C, https://data.4dnucleome.org/experiment-set-replicates/4DNES1JP4KZ1/; WAPL-knockdown HCT116 normalized balanced Hi-C matrices, https://data.4dnucleome.org/files-processed/4DNFIY5939F3/; WAPL-knockdown HCT116 loops, https://data.4dnucleome.org/files-processed/4DNFILP7BD5H/; wild-type HCT116 raw Hi-C, https://data.4dnucleome.org/experiment-set-replicates/4DNESNSTBMBY/; wild-type HCT116 normalized balanced Hi-C matrices, https://data.4dnucleome.org/files-processed/4DNFI5MR78O6/; wild-type HCT116 loops, https://data.4dnucleome.org/files-processed/4DNFIOQLL854/.
Two-fraction Repli-seq data for human iPS wild-type and two CRISPR-engineered lines (raw data and processed log2[early/late] from three conditions) are available from the 4D Nucleome portal as follows: wild-type human iPS line raw data, https://data.4dnucleome.org/experiment-sets/4DNESDYES9QD/; wild-type human iPS line log2[early/late], https://data.4dnucleome.org/files-processed/4DNFI5WEY784/; human engineered clone 1 80-kb-IZ-deletion iPS line raw data, https://data.4dnucleome.org/experiment-sets/4DNESE3WCUAQ/; human engineered clone 1 80-kb-IZ-deletion iPS line log2[early/late], https://data.4dnucleome.org/files-processed/4DNFIZMB415V/; human engineered clone 2 30-kb-control-deletion iPS line raw data, https://data.4dnucleome.org/experiment-sets/4DNES66YWJU7/; human engineered clone 2 30-kb-control-deletion iPS line log2[early/late], https://data.4dnucleome.org/files-processed/4DNFIWDMF7HW/.
5C data for human iPS wild-type and two engineered lines (primer bed file, raw heatmaps and processed heatmaps from three conditions) are available from the 4D Nucleome portal as follows: wild-type human iPS line raw data, https://data.4dnucleome.org/experiment-set-replicates/4DNESLRDUPZ6/; wild-type human iPS line balanced 5C data: replicate 1, https://data.4dnucleome.org/files-processed/4DNFIXM8V3ZB/, replicate 2, https://data.4dnucleome.org/files-processed/4DNFIDB6M1ZN/; wild-type human engineered clone 1 80-kb-boundary-deletion iPS line raw data, https://data.4dnucleome.org/experiment-set-replicates/4DNES39F1QWU/; wild-type human engineered clone 1 80-kb-boundary-deletion iPS line balanced 5C data, https://data.4dnucleome.org/files-processed/4DNFIA8P94BX/; wild-type human engineered clone 2 30-kb-control-deletion iPS line raw data, https://data.4dnucleome.org/experiment-set-replicates/4DNES3PDMUHG/; wild-type human engineered clone 2 30-kb-control-deletion iPS line balanced 5C data: replicate 1, https://data.4dnucleome.org/files-processed/4DNFI7WZYRHP/, replicate 2, https://data.4dnucleome.org/files-processed/4DNFI7V4VXAQ/.
We freely release all custom code for loop, TAD and subTAD detection at the following bitbucket links: TAD/subTAD detection, https://bitbucket.org/creminslab/cremins_lab_tadsubtad_calling_pipeline_11_6_2021; loop detection, https://bitbucket.org/creminslab/cremins_lab_loop_calling_pipeline_11_6_2021/src/initial/.