The genome-wide multi-layered architecture of chromosome pairing in early drosophila embryos

Nature

The genome-wide multi-layered architecture of chromosome pairing in early drosophila embryos"


Play all audios:

Loading...

ABSTRACT Genome organization involves _cis_ and _trans_ chromosomal interactions, both implicated in gene regulation, development, and disease. Here, we focus on _trans_ interactions in


_Drosophila_, where homologous chromosomes are paired in somatic cells from embryogenesis through adulthood. We first address long-standing questions regarding the structure of embryonic


homolog pairing and, to this end, develop a haplotype-resolved Hi-C approach to minimize homolog misassignment and thus robustly distinguish _trans_-homolog from _cis_ contacts. This


computational approach, which we call Ohm, reveals pairing to be surprisingly structured genome-wide, with _trans_-homolog domains, compartments, and interaction peaks, many coinciding with


analogous _cis_ features. We also find a significant genome-wide correlation between pairing, transcription during zygotic genome activation, and binding of the pioneer factor Zelda. Our


findings reveal a complex, highly structured organization underlying homolog pairing, first discovered a century ago in _Drosophila_. Finally, we demonstrate the versatility of our


haplotype-resolved approach by applying it to mammalian embryos. SIMILAR CONTENT BEING VIEWED BY OTHERS STAGE-RESOLVED HI-C ANALYSES REVEAL MEIOTIC CHROMOSOME ORGANIZATIONAL FEATURES


INFLUENCING HOMOLOG ALIGNMENT Article Open access 08 October 2021 THE IMPACT OF CHROMOSOMAL FUSIONS ON 3D GENOME FOLDING AND RECOMBINATION IN THE GERM LINE Article Open access 20 May 2021


ORDER AND STOCHASTICITY IN THE FOLDING OF INDIVIDUAL _DROSOPHILA_ GENOMES Article Open access 04 January 2021 INTRODUCTION Although chromosomes are organized within the nucleus into distinct


territories, they nevertheless come into contact _in trans_. In diploid organisms, including mammals, such _trans_ contacts may involve specific interactions, such as pairing, between the


homologous maternal and paternal chromosomes. Although best known in meiotic cells, homolog pairing can also occur in somatic cells and influence gene expression via phenomena such as


transvection (reviewed by1,2,3). In _Drosophila_, somatic pairing increases extensively from early embryogenesis to adulthood, making this organism an ideal system for studying _trans_


interactions (reviewed by1,2,3). What, for example, is the structure of homolog pairing in early embryos? Pairing may encompass many forms, ranging from a well-aligned juxtaposition of


homologs (railroad track), to a loose, laissez-faire association of homologous regions, or even an apposition of highly disordered structures; for instance, pairing in mammalian systems can


manifest as a nonrandom, approximate co-localization of homologous chromosomal regions (reviewed by1,2). Recently, high- and super-resolution4,5,6,7 imaging of several genomic loci has


suggested that pairing can involve distinct chromosomal domains as well as more closely merged regions, while simulations of pairing8 via integration of the lamina-DamID data and the Hi-C


data have predicted correlations between pairing and epigenetic domains. Furthermore, a transgene-based study of transvection has proposed a domain-based mode for pairing9. These studies are


consistent with long-standing discussions about the establishment, maintenance, and stability of pairing as well as the relationship between pairing and transcription (reviewed


by1,2,3,10,11). Nevertheless, questions regarding the extent and structure of pairing remain. More recently, Hi-C-derived ligation products with overlapping sequences have been used to infer


short-range interactions between homologs or sister chromatids in a tetraploid Kc167 cell line and thus a correlation between short-range interactions and active regions12. Here, we asked


how pairing initiates in early _Drosophila_ embryos, when maternal and paternal genomes first meet. Although the imaging of individual loci has documented the initiation of pairing during


early embryogenesis13,14,15,16,17, the global structure of pairing and its genome-wide relationship to fundamental developmental programs has remained elusive. To that end, we took advantage


of single nucleotide variants (SNVs) to enable haplotype-resolved Hi-C. Haplotype-resolved Hi-C has previously been used to study _cis_ interactions in mammalian


systems18,19,20,21,22,23,24,25,26,27,28,29 as well as _trans_-homolog interactions in yeast30 and, here, we developed a method (Ohm) to much more confidently identify those _trans_


interactions that occur between homologous chromosomes. Importantly, our approach allowed us to detect both short- and long-range interactions, that is, interactions spanning genomic


distances as small as a few kilobases to several megabases. Excitingly, haplotype-resolved Hi-C as well as Ohm can be applied to any organism whose genome sequence has been


haplotype-resolved, and we demonstrate their generality by applying them also to studies25,26 of early mammalian embryos. Strikingly, our Ohm approach reveals that homolog pairing is


extensive along entire lengths of chromosomes at short- as well as long-range distances and is highly structured, with _trans_-homolog domains, domain boundaries, compartments, and


interaction peaks. We reveal the coincidence between _trans_-homolog features and the positions and sizes of analogous _cis_ features and then provide a three-axis framework of precision,


proximity, and continuity to accommodate all types of inter-chromosomal interactions, including homolog pairing. These findings complement those of our companion study (AlHaj Abed, Erceg,


Goloborodko et al.31), which, by generating and then using a _Drosophila_ hybrid cell line with a high degree of pairing, enabled a much more in-depth analysis of the structure of pairing.


In particular, this other study observed at least two forms of pairing and correlations between these forms and the active or repressed state of chromatin. In our current study, we also


explore the relationship between pairing and the developmental programs in play in the early embryo during the initial stages of pairing. We find that pairing correlates with the opening of


chromatin mediated by the pioneer factor Zelda during zygotic genome activation. RESULTS HAPLOTYPE-RESOLVED HI-C FOR PROFILING EMBRYONIC PAIRING We focused our Hi-C analysis on the window of


_Drosophila_ embryogenesis extending from 2–4 h after egg laying (AEL), when zygotic gene transcription is activated and the embryo transitions from a syncytial blastoderm into a


multicellular state32. It is during this window that maternal and paternal homologs come together for the first time13,14,15, and the rapid early cleavage cycles give way to the longer 13th


and 14th cycles, thus minimizing the contribution of mitotic genome organization to our Hi-C signal33. To generate the embryos for our study, we mated two divergent _Drosophila_ Genetic


Reference Panel lines, 057 and 439 (Fig. 1a, b), as a high density of SNVs is required for assaying homolog pairing via genomic methods. Indeed, strains 057 and 439 differ with an average


SNV frequency of ~5.5 per kilobase (kb), except on chromosome 4 (1.0 SNV/kb) (see Supplementary Note 1). This frequency was adjusted to ~5.1 SNVs/kb (0.01 for chr4) when we re-sequenced both


lines and reconstructed an F1 diploid genome using only high-confidence homozygous SNVs (Supplementary Fig. 1 and Supplementary Table 1, see Methods); this diploid genome assembly ensured


the most stringent haplotype-resolved mapping of Hi-C products. Importantly, we confirmed that hybrid embryos achieved levels of homolog pairing consistent with those observed in other


studies13,14,15,16,17 using fluorescent in situ hybridization (FISH) to assess pairing (Fig. 1c) at heterochromatic (16.0% ± 5.7–62.4% ± 3.6) and euchromatic (2.0% ± 1.5–22.0% ± 2.4) loci


across different chromosomes (Fig. 1d, e). In addition, pairing increased to expected levels in 057/439 embryos as they aged (Supplementary Fig. 2a). We performed Hi-C and recovered 513


million Hi-C products, which after mapping to the reference genome, yielded ~56% unique read pairs (Supplementary Table 2). When assembled into a map and analyzed at 1 kb resolution without


regard to parental origin of the mapped fragments, these reads revealed features that are routinely seen in non-haplotype-resolved Hi-C maps33,34—a central _cis_ diagonal representing


short-distance interactions in _cis_ (_cis_ read pairs) as well as signatures for compartments, domains (also called contact domains19 and topologically associated domains, or TADs35,36,


hereafter referred to as domains), and interaction peaks—thus confirming the quality of our Hi-C data (Supplementary Fig. 2b). Presence of these interphase hallmarks indicates that the vast


majority of the cells are in interphase rather than mitosis, where such features are absent33. OHM-DIRECTED DISTINCTION OF _TRANS_-HOMOLOG FROM _CIS_ CONTACTS Using our newly generated


haplotype-resolved F1 diploid genome to determine the parental origin of our reads, we found that 5.8% of all mappable read pairs appeared to represent _trans_ (_trans_-homolog as well as


_trans_-heterolog) contacts and that, of these, 36.1% indicated contacts between homologous chromosomes (_trans_-homolog read pairs) (Supplementary Table 3). Encouraged by these numbers, we


next determined the quality of the purported _trans_-homolog reads. For example, errors in reconstruction of the F1 diploid genome or sequencing of the Hi-C products can lead to


misassignment of _cis_ read pairs as _trans_-homolog read pairs (Fig. 2a) and, given the greater number of _cis_ as versus _trans_-homolog read pairs, even a small rate of misassignment of


_cis_ read pairs could be confounding. Thus, to assess the potential magnitude of homolog misassignment, we sought a signature for it in our Hi-C dataset, focusing on read pairs where the


separation along the genome (genomic separation) of the two reads of a read pair is <1 kb (Fig. 2b, shaded area). We chose to examine these read pairs because they are enriched in


non-informative byproducts of the Hi-C protocol37, the most abundant being unligated pieces of DNA, known as dangling ends (Fig. 2b, hatched area and Supplementary Fig. 3, see Methods).


Dangling ends have genomic separations of <1 kb, are “inwardly” orientated, and 100–1,000 fold more abundant than read pairs of other orientations at the same separations (Supplementary


Fig. 4a) and, since they are exclusively _cis_ read pairs, their enrichment among _trans_-homolog read pairs can only arise via homolog misassignment (HM in figures and Methods). As such,


the enrichment of short-distance (<1 kb) _trans_-homolog pairs can serve as a signature of homolog misassignment and be used to assess the overall quality of our homolog assignment


(Methods). Consistent with this hypothesis, this signature of homolog misassignment almost disappeared when we increased the stringency of our mapping by requiring at least two SNVs per read


(Fig. 2c and Supplementary Fig. 4b, c, probability of homolog misassignment PHM = 0.01%). In contrast, homolog misassignment increased when we used the less accurate DGRP SNV annotations


(Supplementary Fig. 4d, PHM = 2.40%). This approach, which we call Oversight of homolog misassignment (Ohm), allowed us to estimate the probability of homolog misassignment as only ~0.17%


when requiring at least 1 SNV per read (Fig. 2d, see Methods). Finally, this analysis indicated that homolog misassignment introduced less than 5% of erroneous _trans_-homolog contacts at


separations above 1 kb (Fig. 2e). Having gained confidence that contamination contributes only a minor portion of purported _trans_-homolog read pairs, we observed that _trans_-homolog


contacts at genomic separations of 1 kb are >100-fold more frequent than those at genomic separations of >1 Mb, and >500 fold more frequent than contacts between regions on


different chromosomes (_trans_-heterolog, Fig. 2b). Thus, _trans_-homolog contacts constitute a major fraction of _trans_-chromosomal interactions. HOMOLOG PAIRING IN _DROSOPHILA_ EMBRYOS IS


HIGHLY STRUCTURED We next generated a Hi-C map using ≥ 1 SNV per read and filtering our data to exclude contacts below 3 kb of genomic separation in order to substantially reduce


contamination by Hi-C byproducts. The resulting map was striking in a number of ways (Supplementary Fig. 2b). In addition to displaying the expected central _cis_ diagonal, it revealed


prominent _“trans_-homolog_”_ diagonals that extended across the entire length of each chromosome (Fig. 3a, black arrows). This observation demonstrated that interactions between homologs,


including those within a few kilobases and up to several megabases, extend genome-wide in early embryos at the time when pairing is being initiated. The _trans_-homolog diagonals suggest


that homologs are relatively well-aligned when coming into contact, consistent with a railroad track structure (Fig. 3b). This indication, however, does not preclude less aligned forms of


pairing, such as the paired domains observed at a number of genomic regions4,5,6,7. For example, Fig. 2b also shows that extensive _trans_-homolog interactions may occur between non-allelic


regions corresponding to genomic separations of hundreds of kilobases or more, albeit at much reduced frequencies. We would expect such interactions to appear as signals at varying distances


off the _trans_-homolog diagonals, leaving open the possibility of other structures, such as laissez-faire (Fig. 3c) and highly disordered pairing (Fig. 3d). That pairing may assume


different forms in _Drosophila_ has long been considered (reviewed by1,2,3), with recent studies suggesting that the state of pairing is a fine balance between pairing and anti-pairing


factors (reviewed by2). Here, we propose a framework in which the different aspects of pairing can be placed along the continuum of three axes representing (i) the precision with which


homologous regions are aligned (_x_ axis), (ii) the proximity with which homologs are held together (_y_ axis), and (iii) the continuity or degree to which a particular state of pairing


extends uninterrupted (_z_ axis), with the continuum along any axis potentially reflecting different forms of pairing and/or the maturation (or degradation) of pairing over time (Fig. 3e).


In this context, high values along all three axes would produce railroad track pairing, intermediate values would approximate a laissez-faire mode of pairing, and minimum values would


encompass interactions ranging from extreme laissez-faire to disorder. This framework can also be used to describe _trans_ interactions between heterologous chromosomes, particularly


prominent examples being those arising from the Rabl polarization of centromeres and telomeres, appearing in some Hi-C maps, including ours, as contacts along non-homologous arms (e.g. 2L


and 2R, as well as 3L and 3R)33,34; the Rabl configuration may facilitate homolog pairing by reducing search space within the nucleus (Fig. 3f13,14,15). Note, however, that _trans_-homolog


contacts were tighter and/or more frequent than _trans_-heterolog contacts (Fig. 3g, h), emphasizing the prevalence of homolog pairing in _Drosophila_. We next asked whether our


haplotype-resolved maps could further clarify the molecular structure of paired homologs as well as elucidate how _trans_-homolog interactions are integrated with the _cis_ interactions that


shape the 3D organization of the genome. Remarkably, we observed _trans_-homolog domains, domain boundaries, compartments, and interaction peaks resembling analogous features in _cis_ maps


(Fig. 3i–k and Supplementary Fig. 5a–e). In fact, 54% of the boundaries of _trans_-homolog domains overlapped a boundary of a _cis_ domain (averaged over two homologs, Supplementary Fig. 


5f). While this value is lower than the upper bound of 72% for _trans_-homolog boundaries shared between replicate haplotype-resolved Hi-C datasets, it is nevertheless significantly above


the 35% overlap expected at random (_P_ < 10−10, _t_-test). Note that, as we obtained 92.3% overlap of domain boundaries between replicates of non-haplotype-resolved Hi-C datasets, it is


likely that the overlap values for haplotype-resolved datasets reflect the relatively low number of haplotype-resolved reads we obtained. The similarity between _trans_-homolog and _cis_


contact maps is further highlighted through comparisons with maternal- and paternal-specific Hi-C domains, which have been observed in other systems to be concordant21; in our dataset, we


observed 73.9% concordance between _cis_ domain boundaries of homologous chromosomes (Supplementary Fig. 5f; also Fig. 3l and Supplementary Fig. 5g, h), wherein concordance between two


replicates was 73.2% (see Methods). _Trans-_homolog maps do, however, differ from some _cis_-defined features (Fig. 3m and Supplementary Fig. 5i, dark blue; see also31). Taken together, our


findings indicate that homolog pairing goes far beyond simple genome-wide alignments of homologous pairs and includes surprisingly structured _trans_-homolog domains, boundaries, and


specific interaction peaks that frequently correspond to _cis_ features (Fig. 3n). HOMOLOG PAIRING IS ASSOCIATED WITH ZYGOTIC GENOME ACTIVATION Spurred on by the rich history of transvection


(reviewed by1,2,3) we asked how broadly pairing might be correlated genome-wide with fundamental developmental events such as the binding of major transcription factors in the early


_Drosophila_ embryo. We focused our attention on arguably the four most prominently studied maternally-contributed factors, involved in zygotic genome activation, in particular key


developmental events such as the opening of chromatin, initiation of transcription, and embryonic pattern formation: the pioneer factor Zelda (Zld), which mediates early chromatin


accessibility38,39; Bicoid (Bcd)40 and Dorsal (Dl)41, which are involved in patterning the anterior-posterior (AP) and dorsal-ventral (DV) axis, respectively; and GAGA factor (GAF)42, which


may facilitate transcription in later stages. We correlated the binding of these factors, as revealed by ChIP-seq datasets40,41,43, to local variations in the degree of pairing as


characterized by the pairing score (PS), defined as the log2 _trans_-homolog contact frequency within a 28 kb window along the diagonal of Hi-C maps (Methods). Excitingly, regions associated


with elevated PS values were significantly enriched for the binding of Zld and Bcd (_P_ < 10−10, Spearman; Fig. 4a) as compared to controls in which we randomized 200 kb chunks of PS


values (Fig. 4b) or 200 kb chunks of binding profiles (Supplementary Fig. 6a). No correlation was observed, however, between PS values and the binding of GAF or Dl (_P_ = 0.139, _P_ = 0.029,


Spearman, respectively; Fig. 4b). Then, using partial correlation analyses, we examined whether these correlations would be affected upon removal of the contribution of a third feature. For


instance, how might the correlation between the binding of Bcd and PS values be affected by the binding of Zld? We found that Zld binding contributes significantly, as Spearman’s


correlation coefficient dropped from 0.155 (_P_ = 2.1 × 10−65; Fig. 4b) to 0.044 (_P_ = 1.3 × 10−6; Supplementary Fig. 6b), when controlled for Zld binding. Interestingly, Spearman’s


correlation coefficient between the binding of Dl and PS values decreased from −0.020 (_P_ = 0.029; Fig. 4b) to −0.157 (_P_ = 3.0 × 10−67; Supplementary Fig. 6b) when controlled for the


contribution of Zld. These findings are consistent with a role of Zld in Bcd- as well as Dl-dependent binding and gene activation40,41. Interestingly, we observed that genes associated with


both AP and DV patterning44,45 are similarly associated with higher PS values (Supplementary Fig. 6c). Moreover, loss of Bcd binding in Zld-depleted embryos40 was significantly enriched at


regions otherwise associated with high PS values (_P_ < 1.21 × 10−68, Mood’s median test; Supplementary Fig. 6d, e). In sum, our data suggest that Zld-mediated early opening of chromatin,


including the binding of Bcd, is strongly correlated with pairing in embryos. As previous studies suggested that regions with high Zld occupancy were more dependent on Zld for establishing


domain boundaries33, we asked whether Zld may play a role in domains that are established via _trans_-homolog interactions. Several observations supported our hypothesis. First, visual


inspection of our Hi-C revealed that _trans_-homolog domain boundaries are associated with high Zld occupancy and, moreover, are associated with values of PS that are higher than those of


the interior of domains (Fig. 4a and Supplementary Fig. 7a, _P_ < 10−10, Mood’s median test). Interestingly, 25% of the top Zld-occupied regions have high PS values (_P_ < 10−10,


Mood’s median test, Fig. 4c). Second, using Hi-C data from Zld-depleted embryos33, we observed a decrease of boundary strength at regions that would otherwise be associated with high PS


values and strong Zld binding (Fig. 4d and Supplementary Fig. 7b, _P_ < 10−10, Spearman). Third, we found that Zld depletion trended strongly toward reduced pairing, with the reduction


being significant at four out of six loci assayed (_P_ < 2.56 × 10−3, Fisher’s two-tailed exact; Supplementary Fig. 7c, d). Finally, we also observed strong correlations between PS values


and features known to be enriched at domain boundaries, such as RNA Pol II33, housekeeping genes33,46, and nascent zygotic gene products47 (Supplementary Fig. 6c, 7e). Thus, Zld-mediated


chromatin accessibility and transcription appear correlated with the establishment of homolog pairing in the early embryo (Fig. 4e), reinforcing the notion that the establishment of homolog


pairing may bear a relationship to genome function during key developmental events. For example, Zld-mediated opening of genomic regions may facilitate homologous sequences finding each


other when pairing is initiated in early embryogenesis. Whether Zelda plays a direct or indirect role in initiating pairing, however, will require additional study, as will the interplay


between Zld and other factors involved in pairing and other forms of chromosome organization. HAPLOTYPE-RESOLVED APPLICATION OF OHM TO MAMMALIAN EMBRYOS One of the most exciting developments


in recent years has been the growing indication that homolog pairing is not special to _Drosophila_, that it may be a general property in many organisms, including mammals (reviewed by1,2).


Thus, in order to both demonstrate the applicability of Ohm as well as assess the state of pairing in mammals, we applied our haplotype-resolved Ohm approach to mammalian embryos. As


pairing in mammals is generally transient and localized, for example, during DNA repair, V(D)J recombination, imprinting, or X-inactivation (reviewed by1,2), these analyses would, at the


least, also serve as a control for observations in _Drosophila_, where pairing is far more extensive. In particular, we considered two deeply sequenced Hi-C datasets from Du et al.25 and Ke


et al.26 representing gametes and F1 hybrid mouse embryos bearing a high frequency of SNVs (1 autosomal SNV per ~130 bp or ~490 bp, respectively). Reasoning that the haploid nature of sperm


and oocytes precludes homolog pairing, we focused first on these cell types to determine the level of stringency that would be necessary to remove any apparent _trans_-homolog read pairs


that would necessarily have resulted from homolog misassignment (Supplementary Fig. 8a). That level of stringency was ≥2 SNVs per read (Supplementary Fig. 8b and Supplementary Data 1, see


Methods). Requiring ≥2 SNVs per read, we turned to the datasets representing embryos ranging in age from the single, 2-, 4-, and 8-cell stage through the earliest days of embryogenesis and


beyond. We observed no conclusive signal for _trans_-homolog contacts (Supplementary Figs. 9 and 10), consistent especially for the very earliest stages with previous indications that the


maternal and paternal genomes are spatially segregated25,26,48,49,50; (Supplementary Fig. 9). More specifically, although the signal for _trans-_homolog contacts for some of the earlier


stages seemed to hover above that expected from homolog misassignment (Supplementary Fig. 9) and thus raise the possibility of homolog pairing, this discordance between observed and expected


_trans_-homolog contacts could also reflect artifacts arising from a non-uniform genomic density of SNVs and/or Hi-C biases (see37); we were unable to resolve this issue due to the sparsity


of data (Supplementary Data 1). The data also raise the possibility of a higher order clustering of homologous as well as heterologous subtelomeric regions (Supplementary Fig. 10),


reminiscent of the clustering of pericentromeric and subtelomeric regions seen previously48. While, again, the sparsity of data precluded our ability to determine such clustering reflected


any degree of homolog pairing, it is worth noting that telomere as well as centromere clustering has been reported in embryos across several species34,48,51 and may constitute a mechanism


for initiating pairing13,14,15,52. In brief, although we found no definitive evidence of homolog pairing in early mouse embryos, our data nevertheless leave this issue open for future


studies. Also, as repetitive sequences were excluded from our analyses, the potential of _trans_-homolog contacts at repeats remains to be explored. DISCUSSION To conclude, our study


addressed the fundamental nature of chromosome pairing and, to this end, developed a robust method, Ohm, for conducting haplotype-resolved Hi-C studies wherein interactions between


homologous chromosomes are carefully vetted for homolog misassignment. This approach ensured the highest quality assignments of parental origin to Hi-C reads when distinguishing _cis_ from


_trans_ contacts, and can be extended to any number of other methods53. Using Ohm, we obtained, for the first time, a genome-wide high-resolution view of homolog pairing during early


_Drosophila_ development. Our data revealed genome-wide juxtaposition of homologs along their entire lengths as well as a multi-layered organization of _trans_-homolog domains, domain


boundaries, compartments, and interaction peaks. We also observed concordance of _trans_-homolog and _cis-_ features, arguing that _cis_ and _trans_ interactions are structurally coordinated


(see also31). This striking correspondence of _trans_-homolog and _cis_ domain boundaries and interaction peaks may suggest similar mechanisms of their formation. For instance, SMC


complexes implicated in loop extrusion54, formation of _cis_ domains, and cohesion of sister chromatids in mammals may play a role in pairing and formation of _trans_-homolog domains in


_Drosophila_. We pursue this line of thought in our companion paper31. Moreover, by layering _trans_-homolog contacts on models of genome organization that have relied primarily, if not


solely, on _cis_ contacts, our findings highlight how haplotype-resolved Hi-C can shed light on paradigms of genome organization and function that would otherwise remain hidden. For example,


pairing may underlie some of the structural and regulatory differences between the 3D architecture of the _Drosophila_ genome and that of organisms lacking pairing; indeed, the close


proximity of homologs may influence the manner in which loops and domains are formed in _Drosophila_ in addition to playing a role in gene regulation2,31,55. Our observations have further


revealed a correlation between pairing and Zelda-mediated chromatin accessibility during zygotic genome activation. In this way, they champion a relationship between _trans_-homolog genome


organization and key developmental decisions, aligning well with the growing recognition that pairing can play a potent role in gene regulation, even at some loci in mammals (reviewed


by1,2). As such, it may be of significance that homologs are unpaired during the earliest moments of _Drosophila_ embryogenesis56 and then initiate pairing at a time coinciding with


activation of the zygotic genome. This in mind, we examined data pertaining to early mouse embryo; as in _Drosophila_, murine maternal and paternal genomes are segregated in the earliest


cell cycles25,26,48,49,50, but little is known about the genome-wide status of pairing thereafter. Our analyses were further fueled by studies suggesting that pairing may be a general


mechanism by which organisms detect structural heterozygosity in their genomes early in development, culling those that are deleteriously rearranged57. Although we observed some intriguing


signals, ultimately, the sparsity of data did not allow us to draw strong conclusions regarding pairing in the mouse embryos, pointing to a need for additional haplotype-resolved Hi-C


analyses and/or the resolution provided by imaging. Our studies also suggest how pairing in _Drosophila_ could shape early nuclear organization of the zygotic genome. For instance, as


pairing brings similar loci together, it may enhance the compartmentalization of heterochromatic and euchromatic structures during phase separation in early _Drosophila_ embryos58.


Alternatively, the segregation of heterochromatin from euchromatin could facilitate homology searches during the initiation of pairing. Indeed, how homologs identify each other remains one


of the most intriguing questions. For instance, do homologs become structurally similar because of pairing, or do they have comparable conformations prior to pairing, their similarities


contributing to homolog recognition? As our results do not reveal striking differences between the _cis_ maps of homologs, they are consistent with the latter scenario. However, because our


_cis_ maps likely represent a mix of paired and unpaired homologs, potential differences between the _cis_ maps of homologs may only become apparent in younger embryos prior to pairing.


Finally, we note that the sensitivity of Ohm has potential usefulness also for the analysis of _trans_ contacts between heterologous chromosomes, which would fall in the plane of _x_ = 0 in


our framework of precision (x), proximity (y), and continuity (z). Intriguingly, it may be that, in some instances, only one of two alleles participates in heterologous interactions


(reviewed by59), and thus, here, our haplotype-resolved approach could be applied to dissect the parental origin of the interacting alleles. Of course, when our framework is extended to


encompass the dimension of time, it should then be able to capture the dynamics of _trans_-homolog and _trans_-heterolog contacts through cell division, development, aging, and disease.


METHODS SELECTION OF PARENTAL _DROSOPHILA_ LINES To be able to distinguish homologs in sequencing data, we selected two parental lines with a high number of SNVs from _Drosophila


melanogaster_ Genetic Reference Panel (DGRP), which contains _Drosophila_ lines inbred over 20 generations (see Supplementary Note 1). GENOMIC DNA ISOLATION AND LIBRARY PREPARATION 20 adult


flies from each parental line (the Bloomington stock numbers 29652 and 29658) were homogenized with motor (Kimble-Kontes Pellet Pestle Cordeless Motor) and pestle (Kimble-Kontes) on ice.


Genomic DNA was isolated using DNeasy Blood & Tissue Kit (Qiagen) according to manufacturers’ recommendations. The integrity of the isolated genomic DNA (i.e. no degradation present) was


verified by running a 1% agarose gel. Genomic DNA libraries were generated using Illumina TruSeq Nano DNA Library Preparation kit, and were 150 bp paired-end sequenced at the TUCF Genomics


Facility using Illumina HiSeq2500. COLLECTION AND FIXATION OF THE F1 HYBRID EMBRYOS The _Drosophila_ handling was in accordance with the Harvard Medical School guidelines. The inbred


DGRP-057 (maternal) and 439 (paternal) lines were mated to obtain the F1 hybrid embryos. To ensure that accurate parental genotypes were crossed, around 28,000 males and 28,000 virgin


females were manually sorted. The F1 hybrid embryos were collected and fixed after three pre-lays followed by aging for 2–4 h time-point at 25 °C. To validate that the embryos were of the


correct stage, and did not contain contaminants from older embryos, a small aliquot (~100 embryos) per collection was set aside, devitellinised, and stored in methanol at −20 °C to verify


developmental stages of collection under microscope. If even a single older embryo was noted, collection was discarded, as that embryo may contribute more nuclei than 2–4 h embryo, depending


on exact developmental difference between them. The remaining embryos from collections were snap-frozen in liquid nitrogen and stored at −80 °C. IN SITU HI-C ON _DROSOPHILA_ F1 HYBRID


EMBRYOS Hi-C protocol from Rao et al.19 was adapted for the _Drosophila_ embryos. 30 mg of snap-frozen fixed embryos were homogenized with pestle (Kimble-Kontes) in 500 μl of ice-cold lysis


buffer, and incubated 30 min on ice. Upon cell lysis, the chromatin was digested using 500 U of DpnII restriction enzyme overnight at 37 °C. The DpnII overhangs were filled in with biotin


mix during 1.5 h at 37 °C, and the chromatin was ligated for 4 h at 18 °C. RNA was degraded by adding 5 μl of 10 mg/ml RNase A (ThermoFisher Scientific) followed by incubation for 30 min at


37 °C. After degradation of proteins and crosslink reversal overnight at 68 °C, DNA was ethanol precipitated. DNA shearing was performed to obtain around 400–600 bp fragments with Qsonica


Instrument (Q800R) using the following parameters: 30 s on/off, 15x cycles, 70% ampl. After size-selection to around 500 bp, biotin pull-down was performed with a minor modification. Namely,


the size-selected DNA was eluted in 200 μl of 10 mM Tris-HCl pH 8.0, and was mixed with equal volume of 2x Binding Buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 2 M NaCl) containing pre-washed


Dynabeads MyOne Streptavidin T1 beads (Life Technologies). The library preparation19 was performed with the following modifications. The adapter ligation was in 46 μl of 1x Quick ligation


reaction buffer (NEB), 2 μl of DNA Quick ligase (NEB), and 2 μl of NEXTflex DNA Barcode Adapter (25 μM, NEXTflex DNA Barcodes −6, Bioo Scientific #514101). PCR reaction contained 23 μl of


adaptor ligated DNA, 2 μl of NEXTflex Primer Mix (12.5 μM), and 25 μl NEBNext Q5 Hot Start HiFi PCR Master Mix (NEB), and was run using the program: 98 °C, 30 s, (98 °C, 10 s, 65 °C, 75 s)


repeated 7 times, 65 °C, 5 min, hold at 4 °C. Purification of PCR products was performed by incubation for 15 min instead of 5 min after each addition of Agencourt AMPure beads. The final


libraries were eluted in 15 μl of 10 mM Tris-HCl pH 8.0 after 15 min incubation at room temperature, and 15 min incubation on a magnet. The library quality was assessed using the High


Sensitivity DNA assay on a 2100 Bioanalyzer system (Agilent Technologies). The libraries corresponding to two independent biological replicates were 150 bp paired-end sequenced at the TUCF


Genomics Facility using Illumina HiSeq2500. FISH PROBES Oligo probes for heterochromatin repeats at chr2 (AACAC), chr3 (dodeca) and chrX (359)60 were synthesized by Integrated DNA


Technologies (IDT) with the following sequences and a fluorescent dye: AACAC (Cy3-AACACAACACAACACAACACAACACAACACAACAC), dodeca (FAM488-ACGGGACCAGTACGG), and 359


(Cy5-GGGATCGTTAGCACTGGTAATTAGCTGC). The libraries for euchromatin probes 89D-89E/BX-C (315 kb), 2R16 (904 kb), 3R7 (844 kb), 3R19 (700 kb), 69 C (674 kb), and 28B (680 kb) were designed


using Oligopaint technology (Supplementary Table 4)17,61,62, and were amplified using T7 amplification (see section Oligopaint probe synthesis) with forward primers containing a site for


secondary oligo annealing, and reverse primers containing a T7 promoter sequence (sequences are provided in Supplementary Table 5). DNA secondary oligos61, which were used together with


primary probes, had both 5′ and 3′ conjugated fluorophores (Supplementary Table 5). OLIGOPAINT PROBE SYNTHESIS Oligopaints probes were synthesized using T7 amplification followed by reverse


transcription63. Oligopaints libraries were first amplified with Kapa Taq enzyme (Kapa Biosystems, 5 U/μl), and corresponding forward and reverse primers (without the site for secondary


oligo annealing and the T7 promoter, i.e. without the underlined sequences from Supplementary Table 5) using linear PCR program as follows: 95 °C, 5 min, (95 °C, 30 s, 58 °C, 30 s, 72 °C, 15


 s) repeated 25 times, 72 °C 5 min, hold at 4 °C. Upon clean up with DNA Clean & Concentrator-5 (DCC-5) kit (Zymo Research), this PCR was followed by another bulk up PCR using the same


program, but this time to introduce the site for secondary oligo annealing to the forward strand, and the T7 promoter to the reverse strand (primers used with the underlined sequences from


Supplementary Table 5). The PCR product with added T7 promoter was cleaned up using DNA Clean & Concentrator-5 (DCC-5) kit (Zymo Research), and became a template in T7 reaction to


produce excess RNA with HiScribe T7 High Yield RNA Synthesis Kit (NEB) and RNAseOUT (ThermoFisher Scientific) overnight at 37 °C. RNA was then reverse transcribed into DNA with Maxima H


Minus RT Transcriptase (ThermoFisher Scientific) in a 2 h reaction at 50 °C, followed by the inactivation of the RT enzyme at 85 °C for 5 min. Subsequently, RNA was degraded via alkaline


hydrolysis (0.5 M EDTA and 1 M NaOH in 1:1) for 10 min at 95 °C, and ssDNA was purified with DNA Clean & Concentrator-5 (DCC-5) kit (Zymo Research), where DNA binding buffer was replaced


with Oligo binding buffer (Zymo). FISH IN EMBRYOS FISH in _Drosophila_ embryos17,61 was performed with modifications. Embryos were dechorionated with 50% bleach for 2.5 min, washed with


0.1% Triton X-100 in PBS, and fixed for 30 min in 500 μl of fix (PBS containing 4% formaldehyde, 0.5% Nonidet P-40, and 50 mM EGTA), plus 500 μl of heptane. Upon replacing the aqueous phase


with methanol, embryos were vigorously shaken for 2 min. Finally, embryos were triple washed with 100% methanol, and stored at −20 °C in methanol. Prior to FISH, embryos were gradually


rehydrated in 2x SSCT (0.3 M NaCl, 0.03 M NaCitrate, 0.1% Tween-20). Subsequently, embryos were incubated for 10 min in 2x SSCT/20% formamide, followed by another 10 min in 2x SSCT/50%


formamide, and then hybridized in a hybridization solution (2x SSCT, 50% formamide, 10% dextran sulphate, RNase) with primary Oligopaint probe set (200 pmol for heterochromatin and 300 pmol


for euchromatin probes) for 30 min at 80 °C, and then at 37 °C overnight. After primary probe hybridization, two 30 min washes in 2x SSCT/50% formamide were performed at 37 °C. In a case


secondary probe was used, secondary hybridization was performed between those two washes for 30 min in 2x SSCT/50% formamide with 300 pmol of secondary probe at 37 °C. The washes were


continued in 2x SSCT/20% formamide for 10 min at RT, and with three rinses in 2x SSCT. The embryos were stained with Hoechst 33342 (Invitrogen), washed in 2x SSCT for 10 min at RT, quickly


rinsed in 2X SSC, and mounted in SlowFade Gold antifade mountant (Invitrogen). Images were taken using a Zeiss LSM780 laser scanning confocal microscope with a ×63 oil NA 1.40 lens at 1024 ×


 1024 resolution. ANALYSIS OF FISH SIGNALS The analysis was performed by manually examining each section of Z-stack with the Zeiss ZEN Software. Distance in 3D space between two FISH signals


was measured using the Ortho-distance function in the Zeiss ZEN Software. Two homologs were considered paired if 3D distance between centers of two FISH signals was ≤0.8 μm or there was


only one FISH signal. GENERATION OF ZLD-DEPLETED EMBRYOS The _UAS-shRNA-zld_41 virgin females were crossed to the maternal triple driver Gal4 (_MTD-Gal4_, the Bloomington stock number 31777)


males. The F1 heterozygous females _UAS-shRNA-zld_ /_MTD-Gal4_ were then mated to their sibling males, and the F2 Zld-depleted embryos were collected, and fixed as described in the section


FISH in embryos. The control embryos were obtained using the same crossing scheme just by replacing the _MTD-Gal4_ with the wild-type Canton S males. To validate that embryos were of the


correct genotype, a small aliquot (~500 embryos) was inspected under microscope for both Zld-depleted and control samples. THE CONSTRUCTION OF F1 DIPLOID _DROSOPHILA_ GENOME We sequenced the


two inbred parental _Drosophila_ lines (DGRP-057 and DGRP-439) together with the hybrid PnM cell line31 at the average coverage of 118, 117, and 396 reads per base pair, respectively64. We


then detected the sequence variation of these three libraries using _bcftools_. In summary, we obtained high-quality normalized sequence variants as following: * 1. Trimmed low-quality


sequences with seqtk trimfq, aligned whole-genome paired-end reads against the reference dm3 genome using BWA mem, and removed aligned PCR duplicates with samtools –rmdup. * 2. Piled


alignments up along the reference genome with bcftools pileup –min-MQ 20 –min-BQ 20. * 3. Called raw sequence variants from the pileups with bcftools call. * 4. Normalized raw sequence


variants with bcftools norm. * 5. Selected only high-coverage high-quality normalized sequence variants using bcftools filter INFO/DP > 80 & QUAL > 200 & (TYPE = “SNV”|IDV >


 1). We then phased heterozygous PnM variants using bcftools isec. We picked high-confidence variants on the maternal autosomes by selecting heterozygous PnM variants that were present among


maternal DGRP-057 variants and absent among paternal DGRP-439 sequence variants (both homo- and heterozygous); the high-confidence paternal variants phasing was selected in an opposite


manner. Since PnM is a male line, for the maternal copy of chrX in F1 embryos, we considered only homozygous high-quality variants detected in the PnM cell line. To reconstruct the consensus


sequence of the paternal copy of chrX in F1 embryos, we kept only homozygous variants detected in the paternal DGRP-439 _Drosophila_ line. Finally, we reconstructed the sequence of the F1


embryos with samtools consensus, using (a) the homozygous autosomal PnM SNVs, (b) the heterozygous phased autosomal PnM SNVs, and (c) the homozygous maternal chrX PnM SNVs as well as the


homozygous SNVs detected in the paternal DGRP-439 _Drosophila_ line. MAPPING AND PARSING We started with sequences of Hi-C molecules and trimmed low-quality base pairs at both ends of each


side using the standard mode of _seqtk trimfq_ v.1.2-r94 (https://github.com/lh3/seqtk). Then we mapped the trimmed sequences to the reference dm3 genome or the constructed dm3-based F1


diploid genome using _bwa mem_ v.0.7.1565 with flags -SP. We then extracted the coordinates of Hi-C contacts using the _pairtools parse_ command line tool


(https://github.com/mirnylab/pairtools). We only kept read pairs that mapped uniquely to one of the two homologous chromosomes and removed PCR duplicates using the standard mode of the


_pairtools dedup_ command line tool. CONTACT FREQUENCY (P(S)) CURVES We used unique Hi-C pairs to calculate the functions of contact frequency P(s) _vs_ genomic separations. We grouped


genomic distances between 10 bp and 10 Mb into ranges of exponentially increasing widths, with 8 ranges per order of magnitude. For every range of separations, we found the number of


observed _cis_- or _trans_-homolog interactions within this range of separations and divided it by the total number of all loci pairs separated by such distances. We used a similar technique


to quantify the degree of coalignment of chromosomal arms due to Rabl configuration (Fig. 3h). Specifically, the Rabl configuration increased the contact frequency between pairs of loci


located at the same relative positions along different chromosomal arms. We characterized this tendency by calculating a P(s) on _trans_-heterolog maps, where we calculated the position


mismatch as _s_ = _x_1 − _x_2/_L_2 ×_L_1, where _x_1 and x2 were the distances from each locus to the centromere and L1 and L2 were the lengths of the two chromosomal arms. ESTIMATING THE


RATE OF HI-C BYPRODUCTS Each Hi-C dataset contains non-informative byproducts, i.e. sequenced DNA molecules that originate from a single DNA fragment, and thus do not carry any information


on chromatin conformation (unlike informative Hi-C molecules that form via a ligation of two spatially proximal DNA fragments and capture a spatial contact) (Supplementary Fig. 3a, b). The


two well-known types of Hi-C byproducts are: (i) unligated DNA fragments (termed “dangling ends”) and (ii) self-ligated DNA fragments (“self-circles”)37. “Dangling ends” are small pieces of


intact unligated DNA that passed through all selection steps of the Hi-C protocol and ended up in the library of sequenced molecules (Supplementary Fig. 3b). “Dangling ends” appear as _cis_


read pairs at s ~100–1000 bp (such molecules are left in the library after the size selection step of the Hi-C protocol), with the direction of the two reads of a read pair pointing towards


each other along the reference genome. “Self-circles” are pieces of DNA whose ends were ligated to each other and the resulting circular DNA had a break in another location, producing a


linear piece of DNA (Supplementary Fig. 3b). “Self-circles” also appear in _cis_, but at longer separations, up to a few kb (this distance depends on the frequency of the restrictions sites


along the genome) and the resulting paired read orientations point away from each other along the reference genome37. Because both of these types of byproducts were characterized by their


narrow range of genomic separations and a specific mutual orientation of the paired end reads (Supplementary Fig. 3b), their abundance was estimated using P(s) curves for four possible


combinations of the directionality of paired read ends (Supplementary Figs 3c and 4a). In such plots, “dangling ends” produced a characteristic 100 × −1000x enrichment of inward read pairs


below 1 kb separation and “self-circles” produced a weaker enrichment of outward read pairs at separations up to 10 kb, though their amount and typical genomic separation seemed to vary


greatly between different experimental protocols (Supplementary Figs. 3c and 4a)37. During the work on this manuscript, we also discovered a third, previously undescribed type of a Hi-C


byproduct. We found that the read pairs that aligned to the same genome strand (i.e. same-strand read pairs) were enriched at _s_ < 500 bp. Upon closer examination, we found that such


short-distance same-strand read pairs shared a similar unique pattern: while one of the two reads were typically fully aligned to some locus on the genome, the other read would split into


two fractions, which would align to different locations. The inner, 3′ fraction mapped in _cis_, downstream and opposite to the read on the first side (as in a “dangling end”); while the


outer, 5′ fraction overlapped the inner fraction, but went into the opposite direction (Supplementary Fig. 3d). This non-trivial alignment pattern suggested a scenario, where, during some


stage of the Hi-C protocol, a DNA molecule formed a hairpin and was extended using itself as a template (Supplementary Fig. 3e). During the work on this manuscript, this type of a Hi-C


byproduct was independently reported and thoroughly characterized in66, who dubbed it as “hairpin loops”. Another, older study reported formation of similar hairpin loops in whole-genome


sequencing of ancient DNA67. The mechanism by which such “hairpin loops” form remains to be firmly established. Golloshi et al.66 proposed that these “hairpin loops” were formed during the


end repair stage of the Hi-C protocol via self-annealing of a pre-existing microhomology region followed by an elongation by the T4 polymerase. If this hypothesis was true, then “hairpin


loops” should form with equal probability on all Hi-C molecules, regardless of the orientation and relative location of the ligated fragments, as well as on “dangling ends” and


“self-circles”. The enrichment in short distance same-strand read pairs would then be most likely due to formation of “hairpin loops” on “dangling ends”, which is a highly enriched


substrate. Importantly, we can still identify the ligated fragments even in Hi-C molecules with “hairpin loops”. First, the extra base pairs introduced by a “hairpin loop” map to the same


location as the substrate (albeit at the opposite direction) and their alignment can serve as a proxy for the location of the substrate. Second, provided a sufficient long read length, we


can simply ignore these extra pairs and instead align the inner fraction of the Hi-C molecule, which served as a substrate for the “hairpin loop”. Thus, “hairpin loops” represent the most


benign type of a Hi-C byproduct and their presence does not significantly reduce the number and identity of contacts extracted from a Hi-C library. We produced an upper estimate of the


amount of byproduct types in our sample, assuming that the true P(s) followed that of the same-strand read pairs at _s_ > 1 kb, and was constant at s<1 kb. Using this assumption, we


found that “dangling ends” constituted 39.5% of mapped and phased reads in our Hi-C librarys, “self-circles” were 0.3%, and “hairpin loops” formed from “dangling ends” made up 2.9% of the


library. BINNING AND BALANCING OF HI-C DATA We aggregated unique Hi-C pairs into genomic bins of 1 kb and larger, using the cooler package (https://github.com/mirnylab/cooler). Low signal


bins were excluded prior to balancing using the MADmax filter: we removed all bins, whose coverage was 7 genome-wide median deviations below the median bin coverage. Additionally, we removed


all _cis_- and _trans_-homolog contacts at separations below 3 kb, since these reads pairs were dominated by non-informative Hi-C artifacts, unligated DNA fragments and ligation sites


formed via self-circularization of DNA fragments68. We then balanced the obtained contact matrices via iterative correction (IC), i.e. equalized the sum of contacts in every row/column to


1.0. We calculate observed/expected contact frequency maps (often abbreviated as O/E CF) in _cis_ and _trans_-homolog by dividing each diagonal of an IC contact map by its chromosome-wide


average value over non-filtered genomic bins. THE DEFINITION OF HOMOLOG MISASSIGNMENT When we aligned a read to a diploid genome, we assigned to it three bits of information: which pair of


homologous chromosomes this read originated from; the allele, _i.e_. which of the two homologs it originated from, and its location along the chromosome. Importantly, we could assign the


allele (or, the homolog) to a read only if it overlapped some sequence that is unique to one homolog, i.e. if it overlapped a SNV; otherwise, it was impossible to tell which of the two


homologs the read originated from. Even in the _Drosophila_ and mouse models with the highest density of SNVs, the sequences of the two homologous chromosomes were still highly similar and


contained only one SNV per 100–200 bp. Because of that, the majority of read pairs (150 bp) lacked allele assignment on one or both sides; and the allele assignment for the rest of pairs was


based on one or a few SNVs, and thus was prone to errors. SOURCES OF HOMOLOG MISASSIGNMENT The errors in allele assignment (below, homolog misassignment, HM) occur for two reasons: (A) due


to an error in the sequence of the read, or (B) due to an error in the sequence of the genome. Below we discussed how such errors lead to HM. Also, note that we only discussed the case where


diploid genome sequences only contained SNVs and did not analyze the effect of insertions, deletions and duplications. * (A) The type and rate of sequencing errors varies highly among


different sequencing platforms69. In the case of the most popular short-read platform, Illumina, sequencing errors typically lead to substitutions at the rate around 0.1% per base pair69.


Such a substitution could occasionally lead to a HM, if it occurs at a SNV site and the erroneously called base pair by chance matches the other SNV allele. * (B) Some types of errors in the


sequence of a diploid genome may lead to HM. The possible cases are: * (B1) an error at a site without a SNV (e.g. the true sequence is A in both alleles, which we wrote as A-A, but in the


genome database it is T-T)—in this case, reads coming from either homolog would contain a mismatch with the database. Such error does not lead to HM. * (B2) a missing SNV (e.g. true A-T, but


in the database it is A-A). Such an error leads to mapping imbalance, _i.e_. reads from the homolog with the correct sequence would get mapped, while the reads from a homolog with a


sequence error will be dropped. A missing SNV does not lead to HM. * (B3) a false positive SNV (e.g. true A-A, database A-T). In this case, reads from both homologs will be assigned to the


same allele (the one matching the true sequence). A false positive SNV error leads to both HM and mapping imbalance. * (B4) an erroneous SNV (e.g. true A-T, database A-C). In this case,


reads from the homolog containing an error will not be mapped, which will produce mapping imbalance. An erroneous SNV does not lead to HM. * (B5) missing heterozygosity of an SNV allele.


Samples derived from populations of organisms can contain heterozygous SNVs, _i.e_. different organisms within the population may contain different base pairs at the same location on the


same homolog. The issue, however, is that genomic databases can store only one allele per homolog (below, referred to as a “known” allele), while the other alleles become ignored (referred


to as “unknown” alleles) (note that this issue can be solved by using graph genomes, which can store multiple overlapping variants per position70). The effects of heterozygosity on mapping


depends on the nature of “known” and “unknown” alleles on the two homologs: * (B5.1) non-overlapping “known” and “unknown” alleles, when all alleles of one homolog are different from the


allele on the other homolog (e.g., true A/C-T, database A-T). This scenario leads to a partial mapping imbalance, because the reads containing the “unknown” allele do not get mapped. This


does not lead to HM. * (B5.2) overlapping “known” and “unknown” alleles, when the “unknown” allele on one homolog is the same as the “known” allele on the other homolog (e.g. true A/T-T,


database A-T). This scenario leads to HM, because all reads containing the “unknown” allele will be interpreted as containing the “known” allele from the other homolog. Also, this scenario


leads to a mapping imbalance. MISASSIGNED _CIS_ PAIRS CAN CONTAMINATE _TRANS_-HOMOLOG SIGNAL HM is particularly problematic for studying the _trans_-homolog contact patterns. If we would


assign a wrong homolog to one side of a _cis_ Hi-C pair, we would misinterpret this pair as a _trans_-homolog contact. Given that _trans_-homolog contacts in our sample were much less


frequent than _cis_ ones, even a small probability of HM could lead to a significant contamination of the _trans_-homolog contact map. Note, that a cross-contamination between the _cis_ maps


of the two homologs was much less likely, since it required simultaneous HM on both sides of a Hi-C pair. ESTIMATING HOMOLOG MISASSIGNMENT USING HI-C BYPRODUCTS The observed _trans_-homolog


read pairs thus was a mixture of “true” _trans_-homolog contacts and _cis_ pairs with a misassigned homolog on one side. For P(s) curves, this could be expressed as: $${\mathrm{P}}_{trans -


{\mathrm{homolog}}}\left( s \right)^{{\mathrm{obs}}} = \left( {1 - \mathrm{P}_{{\mathrm{HM}}}} \right) \ast {\mathrm{P}}_{trans - {\mathrm{homolog}}}\left( s \right)^{{\mathrm{true}}} +


\mathrm{P}_{{\mathrm{HM}}} \ast \mathrm{P}_{{\mathrm{cis}}}\left( \mathrm{s} \right),$$ (1) where P_trans_-homolog(s)obs was the P(s) curve for the observed _trans_-homolog contacts,


P_trans_-homolog(s)true was that for the _true trans_-homolog contacts (_i.e_. Hi-C _trans_-homolog pairs that represented a true _trans_-homolog ligation) and P_cis_(s) was P(s) for _cis_


contacts, and PHM was the probability of HM. Here, the probability of HM was defined per paired read (_i.e_. the probability per side is a PHM/2); we assumed that each Hi-C pair had the same


chance to get an erroneously assigned homolog, regardless of its genomic separation or orientation, _i.e_. that HM probability was the same for all Hi-C pairs. We also accounted for HM in


_trans_-homolog read pairs, which were falsely interpreted as _cis_ ones. Estimating the probability of HM in a mapped Hi-C library was a challenging task. A priori, for any given


_trans_-homolog read pair, we cannot tell if it was produced by a true _trans_-homolog ligation, or it resulted from HM of a _cis_ pair. However, we could reduce the relative amount of


homolog-misassigned pairs by increasing the stringency of homolog assignment. For that purpose, we selected pairs where both sides overlapped at least two or three SNVs and where read


alignments had no mismatches with respect to the reference genome. We reasoned that, for such pairs, HM was less likely, since it required simultaneous errors at multiple SNVs. For pairs


with 2 + SNVs on each side, the shape of P_trans_-homolog(s) changed at separations <1 kb: the ~10-fold enrichment of _trans_-homolog inward pairs at s < 1 kb that was present in


unfiltered data (Supplementary Fig. 4a), was greatly reduced after increasing the stringency of homolog assignment (Fig. 2c and Supplementary Fig. 4b). We tried increasing the stringency of


homolog assignment further by requiring 3 + SNVs on each side, but this did not change the shape of P_trans_-homolog(s) (Supplementary Fig. 4c). This observation suggested that the


homolog-misassigned _cis_ pairs did not significantly contribute to P_trans_-homolog(s) for pairs with 2 + SNVs. Conversely, decreasing the accuracy of homolog assignment by remapping our


data to less accurate DGRP SNV annotations (see Supplementary Note 1), increased the abundance of observed short-distance inward _trans_-homolog pairs (Supplementary Fig. 4d). We concluded


that the enrichment of short-distance inward _trans_-homolog pairs was due to homolog-misassigned inward _cis_ pairs. Inward _trans_-homolog pairs at _s_ < 1 kb were particularly


sensitive to contamination by homolog-misassigned _cis_ pairs, because at these separations and directionalities _cis_ pairs were the most abundant (>104 more abundant that


_trans_-homolog pairs of other directionalities) due to “dangling ends”. Thus, even rare HM at a probability as low as ~10−4 would generate enough false _trans_-homolog pairs to match the


true signal, and thus change the shape of inward P_trans_-homolog(s) at _s_ < 1 kb. Thus, the ratio of P_trans_-homolog(s)/P_cis_(s) for inward pairs at short separations could be used as


an upper estimate for the probability of HM: $${\mathrm{P}}_{{\mathrm{HM}}} \le < {\mathrm{P}}_{trans - {\mathrm{homolog}}}\left( \mathrm{s}


\right)^{{\mathrm{inward}}}/{\mathrm{P}}_{cis}\left( \mathrm{s} \right)^{{\mathrm{inward}}} > _{300{\mathrm{bp}} < {\mathrm{s}} < 600{\mathrm{bp}}},$$ (2) where


<…>300bp<s<600bp was the geometric mean around the range of separations between 300 bp and 600 bp, corresponding to the “dangling end” peak of P_cis_(s). Using this approach, we


estimated the HM probability in our data at ~0.17% (Fig. 2d). Importantly, this was an _upper_ estimate, since not all of the _trans_-homolog inward pairs at s < 1 kb were


homolog-misassigned _cis_ pairs. Finally, given this probability, misassigned _cis_ contacts (the second term in Eq. (1)) made up only 5% of detected _trans_-homolog contacts at s~1 kb


separations (Fig. 2e) and even less at larger separations. ANALYSIS OF PUBLISHED HAPLOTYPE-RESOLVED MOUSE HI-C DATA We tested our Ohm approach to haplotype-resolved Hi-C using data from two


published haplotype-resolved Hi-C studies on mice25,26. For mapping, we reconstructed two sets of diploid genomes (Black6 x DBA/2J and Black6 x PWK/PhJ) using SNVs from the Mouse Genome


Project71. The genome of Black6 parental mouse line did not require reconstruction, since it served as the basis for the reference mm10/GRCm38 genome. We reconstructed consensus genomes with


bcftools consensus, using only homozygous SNVs that passed the quality control. We then mapped the publicly available sequenced Hi-C libraries to the diploid genomes (Black6 x DBA/2J for26


and Black6 x PWK/PhJ for25) and extracted the positions of Hi-C contacts using the same approach as for our _Drosophila_ data. For contact maps, we only used Hi-C pairs overlapping 2 + SNVs


on each side. We did not perform IC on the produced contacts maps due to their sparsity. Non-ICed raw contact maps have to be interpreted with a degree of caution, since they are affected by


the variation of sequencing visibility of loci due to an uneven distribution of SNVs, GC content, amount of open chromatin, etc. In a naive interpretation, a decay of _trans_-homolog


contact frequency P_trans_-homolog(s) with distance s is indicative of homolog pairing, since it means that pairs of loci in homologous positions make more contacts than loci in


non-homologous positions. However, such decay can also be observed in samples without pairing, but due to severe contamination of _trans_-homolog contacts by homolog-misassigned _cis_ pairs.


To interpret the curves of _trans_-homolog contact frequency P_trans_-homolog(s), we compared them to a negative control, no-pairing curves P_trans_-homolog(s)no-pairing, which described


the expected amount of observed _trans_-homolog contacts in samples without pairing, but in presence of HM. P_trans_-homolog(s)no-pairing could be derived from Eq.(1) assuming


P_trans_-homolog(s)true = const: $${\mathrm{P}}_{trans - {\mathrm{homolog}}}\left( s \right)^{{\mathrm{no}} - {\mathrm{pairing}}}\sim {\mathrm{avg}}.{\mathrm{cross}} -


{\mathrm{parent}}\,trans - {\mathrm{heterolog}}\,{\mathrm{CF}} + \mathrm{P}_{{\mathrm{HM}}} \ast \mathrm{P}_{cis}\left( s \right),$$ (3) here, we could estimate PHM using Eq. (2). Thus, in


order to claim that a sample had homolog pairing, we had to observe significantly more contacts than that predicted by P_trans_-homolog(s)no-pairing, at least in some range of separations.


ANALYSIS OF PUBLISHED HI-C DATA FROM _DROSOPHILA_ EMBRYOS To compare our data with other studies on the structure of chromosomes in _Drosophila_ embryos, we re-analysed the publicly


available Hi-C dataset from33 using the same methods as we used for our own data. PAIRING SCORE To characterize the degree of pairing between homologous loci across the whole genome, we


introduced a genome-wide statistics track called _pairing score (PS)_. The PS of a genomic bin is log2 of average _trans_-homolog IC contact frequency between all pairs of bins within a


window of ±W bins. For each genomic bin i, its pairing score with window size W was defined as: $${\mathrm{PS}}^{\mathrm{W}}\left( {\mathrm{i}} \right) = {\mathrm{log}}_2 <


{\mathrm{CF}}_{{\mathrm{m}},{\mathrm{n}}} > ,$$ (4) averaged over bins _m_ and _n_ between i-W-th and i + W-th genomic bins on different homologs of the same chromosome. Importantly,


under this definition, the PS quantified only contacts between homologous loci and their close neighbors and did not quantify pairing between non-homologous loci on homologous chromosomes.


The choice of the window size W was guided by the balance between specificity and sensitivity. Using a bigger window increased sensitivity, accumulating contacts across more loci pairs,


while smaller windows increased specificity, allowing to see smaller-scale variation of homolog pairing; empirically, we found that, for our contact maps binned at 4 kb resolution, using a 7


 × 7 bin window (_W_ = 3) provided the optimal balance between specificity and sensitivity. A close examination of the PS track revealed two important features: (i) the _Drosophila_ genome


was divided into regions that demonstrate consistently high, relatively similar, values of PS, followed by extended regions where PS dipped into lower values, (ii) switching between high-


and low-PS regions seemed to occur around insulating boundaries. INSULATION SCORES For every bin of a contact map binned at 4 kb resolution, we calculated the insulation score as the total


number of normalized and filtered contacts formed across that bin by pairs of bins located on the either side, up to 5 bins (20 kb) away for Hi-C mapped to reference dm3 maps, and up to 10


bins (40 kb) away for haplotype-resolved Hi-C maps (see Supplementary Note 1). We then normalized the score by its genome-wide median. To find insulating boundaries, we detected all local


minima and maxima in the log2-transformed and then characterized them by their prominence (Billauer E. peakdet: Peak detection using MATLAB, http://billauer.co.il/peakdet.html). The detected


minima in the insulation score corresponded to a local depletion of contacts across the genomic bin, were then called as insulating boundaries. We found empirically that the distribution of


log-prominence of boundaries had a bimodal shape, and we selected all boundaries in the high-prominence mode above a prominence cutoff of 0.1 for Hi-C mapped to the reference dm3 map, and a


cutoff of 0.3 for haplotype-resolved Hi-C maps. We called the insulating boundaries in _trans_-homolog contact maps using the same technique, requiring a minimal prominence of 0.3. Finally,


we removed boundaries that were adjacent to the genomic bins that were masked out during IC. To estimate the similarity of insulating boundaries detected in the _cis_ and _trans_-homolog


contact maps, we calculated the number of overlapping boundaries. We allowed for a mismatch up to four genomic 4 kb bins (16 kb total) between overlapping boundaries to account for the drift


caused by the stochasticity of contact maps. We reported average percentage of overlapping boundaries. For instance, the average fraction (73.9%) that overlapping boundaries between _cis_


maternal and _cis_ paternal boundaries occupied within _cis_ maternal only (74.2%), and _cis_ paternal only boundaries (73.7%). Similarly, for replicates, we calculated the average


percentage of overlapping boundaries for _cis_ maternal replicates (75.4%), and also for _cis_ paternal replicates (71.1%), and then provided a mean of the two values (73.2%). In reference


dm3 Hi-C contact maps, the percentage of overlapping boundaries between the two replicates reached 92.3%, suggesting that the lower reproducibility of haplotype-resolved boundaries is due to


technical and not biological variation between the two replicates. We estimated the significance of an overlap between two given boundary sets by comparing it to the overlap expected by


chance alone, given the sizes of the two sets. We estimated the latter by calculating an overlap between two similarly-sized random subsets of genomic bins visible in our Hi-C maps. To


estimate how significantly the observed overlap deviates from the random expectation, we repeated randomization 10 times and compared the random overlaps to the observed overlap using a


t-test. CHIP-SEQ We mapped the publicly available raw ChIP-seq data following the same procedure as used by the ENCODE consortium72, (https://github.com/ENCODE-DCC/chip-seq-pipeline). To


calculate the number of overlapping ChIP-seq peaks, we re-sized all peaks to 1 kb around their centers and used pyBedTools to calculate peak intersections73. To generate the tracks of Zld


binding, we used ChIP datasets GSM1596215 and GSM1596219 and input datasets GSM1596216 and GSM1596220 from study GSE6544141. For Dl, we used ChIP datasets GSM1596223 and GSM1596227 and input


datasets GSM1596224 and GSM1596228 from the same study GSE6544141. For GAF, we used ChIP datasets GSM614652 and input datasets GSM614653 and GSM614654 from study GSE2353743. For wild-type


Bcd, we used ChIP datasets GSM1332670 and GSM1332671 and input datasets GSM1332672 and GSM1332673 from GSE5525640. For Bcd in _zld_ mutants, we used ChIP datasets GSM1332674 and GSM1332675


and input datasets GSM1332676 and GSM1332677 from GSE5525640. Finally, for RNA Pol II, we used ChIP datasets GSM1596231 and GSM1596235 and input datasets GSM1596232 and GSM1596236 from


GSE6544141. GRO-SEQ We mapped the publicly available GRO-seq datasets (GSM1020093 and GSM1020094 from GSE4161147) using STAR74 following the same procedure as used by the ENCODE consortium,


(https://github.com/ENCODE-DCC/long-rna-seq-pipeline/tree/master/dnanexus). CORRELATION ANALYSES BETWEEN THE PS AND CHIP OR GRO-SEQ The correlation analyses75 were performed by first


dividing the genome into 10 kb bins. In each bin, the fraction of sequence occupied by the PS, the respective ChIP or GRO-seq signal was determined. Then, for each bin genome-wide


correlations were calculated between the PS values and a ChIP or GRO-seq signal of interest. The strength of correlation was reported using Spearman correlation coefficients and


corresponding P-values in two flavors, either for a pairwise comparison between two features, or as a part of analyses that examined whether the correlation between two features was affected


by co-correlation with a third feature. The Spearman correlation coefficients were also displayed using a heatmap. Control regions consisted of the randomized PS or the binding profiles for


Zld, Bcd, GAF, and Dl, where each respective feature was chunked into blocks of 200 kb size, that were randomly moved around. CORRELATION BETWEEN THE PS AND ZLD BINDING To visualize the


relationship between PS and Zld binding, we plotted the distribution of PS for genomic bins from each of the four quartiles of the genome-wide distribution of Zld ChIP-seq signal. We then


tested if the effects of Zld binding on chromatin conformation were correlated with the pairing status of loci. We re-analyzed the published Hi-C datasets on Zld depletion33 with the same


computational methods as above and obtained normalized Hi-C maps at 4 kb resolution for WT and Zld-depleted embryos at nuclear cycle 14. Then, we calculated the change of the 20 kb-window


insulation score upon Zld depletion, ΔinsZld, and compared it to PS. For bins from each quartile of Zld ChIP-seq binding, we fitted the resulting 2D-distribution of ΔinsZld vs PS with a


bivariate Gaussian. We then plotted the resulting Gaussians as ellipses corresponding to ±2 standard deviations from the mean along each of the independent axes of the distribution (i.e.


Mahalanobis distance of 2). SETS OF GENES We obtained target gene lists for AP and DV factors from MacArthur et al. (1% FDR dataset)45 and Zeitlinger et al.44. We retrieved locations of


housekeeping genes from the Flybase by selecting genes with at least moderate expression level (RPKM > 11) across all assessed stages and tissues33,46. REPORTING SUMMARY Further


information on research design is available in the Nature Research Reporting Summary linked to this article. DATA AVAILABILITY All raw sequencing data and extracted Hi-C contacts have been


deposited in the Gene Expression Omnibus (GEO) repository under accession number “GSE121255 [https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE121255]”. The publically available ChIP-seq


and GRO-seq datasets were obtained using accession numbers provided in the Methods section. All other relevant data is available upon request. The source data underlying Fig. 1e and


Supplementary Figs. 2a and 7d are provided as a Source Data file. CODE AVAILABILITY We performed all custom data analyses in Jupyter Notebooks76, using matplotlib77, numpy78, and pandas79


packages. We automated data analyses in command line interface using GNU Parallel80. The software used in this study is available at https://github.com/mirnylab/. REFERENCES * Apte, M. S.


& Meller, V. H. Homologue pairing in flies and mammals: gene regulation when two are involved. _Genet. Res. Int._ 2012, 430587 (2012). PubMed  Google Scholar  * Joyce, E. F., Erceg, J.


& Wu, C. T. Pairing and anti-pairing: a balancing act in the diploid genome. _Curr. Opin. Genet. Dev._ 37, 119–128 (2016). Article  CAS  PubMed  PubMed Central  Google Scholar  * Fukaya,


T. & Levine, M. Transvection. _Curr. Biol._ 27, R1047–R1049 (2017). Article  CAS  PubMed  PubMed Central  Google Scholar  * Cattoni, D. I. et al. Single-cell absolute contact


probability detection reveals chromosomes are organized by multiple low-frequency yet specific interactions. _Nat. Commun._ 8, 1753 (2017). Article  ADS  PubMed  PubMed Central  Google


Scholar  * Szabo, Q. et al. TADs are 3D structural units of higher-order chromosome organization in Drosophila. _Sci. Adv._ 4, eaar8082 (2018). Article  ADS  PubMed  PubMed Central  Google


Scholar  * Cardozo Gizzi, A. M. et al. Microscopy-based chromosome conformation capture enables simultaneous visualization of genome organization and transcription in intact organisms. _Mol.


Cell_ 74, 212-222 e5 (2019). Article  PubMed  Google Scholar  * Mateo, L. J. et al. Visualizing DNA folding and RNA in embryos at single-cell resolution. _Nature_ 568, 49–54 (2019). Article


  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Li, Q. et al. The three-dimensional genome organization of Drosophila melanogaster through data integration. _Genome Biol._ 18, 145


(2017). Article  PubMed  PubMed Central  Google Scholar  * Viets, K. et al. Homologous chromosomes button together to promote interchromosomal gene regulation. _Dev. Cell_ (2019, In press).


* Jorgensen, R. Altered gene expression in plants due to trans interactions between homologous genes. _Trends Biotechnol._ 8, 340–344 (1990). Article  CAS  PubMed  Google Scholar  * Cook, P.


R. The transcriptional basis of chromosome pairing. _J. Cell. Sci._ 110, 1033–1040 (1997). CAS  PubMed  Google Scholar  * Rowley, M. J. et al. Condensin II counteracts cohesin and RNA


polymerase II in the establishment of 3D chromatin organization. _Cell Rep._ 26, 2890–2903 e3 (2019). Article  CAS  PubMed  PubMed Central  Google Scholar  * Hiraoka, Y. et al. The onset of


homologous chromosome pairing during Drosophila melanogaster embryogenesis. _J. Cell. Biol._ 120, 591–600 (1993). Article  CAS  PubMed  Google Scholar  * Fung, J. C., Marshall, W. F.,


Dernburg, A., Agard, D. A. & Sedat, J. W. Homologous chromosome pairing in Drosophila melanogaster proceeds through multiple independent initiations. _J. Cell. Biol._ 141, 5–20 (1998).


Article  CAS  PubMed  PubMed Central  Google Scholar  * Gemkow, M. J., Verveer, P. J. & Arndt-Jovin, D. J. Homologous association of the Bithorax-Complex during embryogenesis:


consequences for transvection in Drosophila melanogaster. _Development_ 125, 4541–4552 (1998). CAS  PubMed  Google Scholar  * Bateman, J. R. & Wu, C. T. A genomewide survey argues that


every zygotic gene product is dispensable for the initiation of somatic homolog pairing in Drosophila. _Genetics_ 180, 1329–1342 (2008). Article  CAS  PubMed  PubMed Central  Google Scholar


  * Joyce, E. F., Apostolopoulos, N., Beliveau, B. J. & Wu, C. T. Germline progenitors escape the widespread phenomenon of homolog pairing during Drosophila development. _PLoS Genet._ 9,


e1004013 (2013). Article  PubMed  PubMed Central  Google Scholar  * Selvaraj, S., J. R. Dixon, Bansal, V. & Ren, B. Whole-genome haplotype reconstruction using proximity-ligation and


shotgun sequencing. _Nat. Biotechnol_. 31, 1111–1118 (2013). * Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. _Cell_ 159,


1665–1680 (2014). Article  CAS  PubMed  PubMed Central  Google Scholar  * Deng, X. et al. Bipartite structure of the inactive mouse X chromosome. _Genome Biol._ 16, 152 (2015). Article 


PubMed  PubMed Central  Google Scholar  * Dixon, J. R. et al. Chromatin architecture reorganization during stem cell differentiation. _Nature_ 518, 331–336 (2015). Article  ADS  CAS  PubMed


  PubMed Central  Google Scholar  * Minajigi, A. et al. Chromosomes. A comprehensive Xist interactome reveals cohesin repulsion and an RNA-directed chromosome conformation. _Science_ 349,


aab2276 (2015). Article  Google Scholar  * Darrow, E. M. et al. Deletion of DXZ4 on the human inactive X chromosome alters higher-order genome architecture. _Proc. Natl Acad. Sci. USA_ 113,


E4504–E4512 (2016). Article  CAS  PubMed  PubMed Central  Google Scholar  * Giorgetti, L. et al. Structural organization of the inactive X chromosome in the mouse. _Nature_ 535, 575–579


(2016). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Du, Z. et al. Allelic reprogramming of 3D chromatin architecture during early mammalian development. _Nature_ 547,


232–235 (2017). Article  ADS  CAS  PubMed  Google Scholar  * Ke, Y. et al. 3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis. _Cell_ 170,


367–381 e20 (2017). Article  CAS  PubMed  Google Scholar  * Barutcu, A. R., Maass, P. G., Lewandowski, J. P., Weiner, C. L. & Rinn, J. L. A TAD boundary is preserved upon deletion of the


CTCF-rich Firre locus. _Nat. Commun._ 9, 1444 (2018). Article  ADS  PubMed  PubMed Central  Google Scholar  * Bonora, G. et al. Orientation-dependent Dxz4 contacts shape the 3D structure of


the inactive X chromosome. _Nat. Commun._ 9, 1445 (2018). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Tan, L., Xing, D., Chang, C. H., Li, H. & Xie, X. S.


Three-dimensional genome structures of single diploid human cells. _Science_ 361, 924–928 (2018). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Kim, S. et al. The dynamic


three-dimensional organization of the diploid yeast genome. _Elife_ 6, e23623 (2017). * AlHaj Abed, J. et al. Highly structured homolog pairing reflects functional organization of the


_Drosophila_ genome. _Nat. Commun_. (2019, In press). * Harrison, M. M. & Eisen, M. B. Transcriptional activation of the zygotic genome in Drosophila. _Curr. Top. Dev. Biol._ 113, 85–112


(2015). Article  CAS  PubMed  Google Scholar  * Hug, C. B., Grimaldi, A. G., Kruse, K. & Vaquerizas, J. M. Chromatin architecture emerges during zygotic genome activation independent of


transcription. _Cell_ 169, 216–228 e19 (2017). Article  CAS  PubMed  Google Scholar  * Stadler, M. R., Haines, J. E. & Eisen, M. Convergence of topological domain boundaries,


insulators, and polytene interbands revealed by high-resolution mapping of chromatin contacts in the early Drosophila melanogaster embryo. _Elife_ 6, e29550 (2017). * Dixon, J. R. et al.


Topological domains in mammalian genomes identified by analysis of chromatin interactions. _Nature_ 485, 376–380 (2012). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Nora, E.


P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. _Nature_ 485, 381–385 (2012). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Imakaev,


M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. _Nat. Methods_ 9, 999–1003 (2012). Article  CAS  PubMed  PubMed Central  Google Scholar  * Liang, H.


L. et al. The zinc-finger protein Zelda is a key activator of the early zygotic genome in Drosophila. _Nature_ 456, 400–403 (2008). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar


  * Harrison, M. M., Li, X. Y., Kaplan, T., Botchan, M. R. & Eisen, M. B. Zelda binding in the early Drosophila melanogaster embryo marks regions subsequently activated at the


maternal-to-zygotic transition. _PLoS Genet._ 7, e1002266 (2011). Article  CAS  PubMed  PubMed Central  Google Scholar  * Xu, Z. et al. Impacts of the ubiquitous factor Zelda on


Bicoid-dependent DNA binding and transcription in Drosophila. _Genes Dev._ 28, 608–621 (2014). Article  CAS  PubMed  PubMed Central  Google Scholar  * Sun, Y. et al. Zelda overcomes the high


intrinsic nucleosome barrier at enhancers during Drosophila zygotic genome activation. _Genome Res._ 25, 1703–1714 (2015). Article  CAS  PubMed  PubMed Central  Google Scholar  * Chen, K.


et al. A global change in RNA polymerase II pausing during the Drosophila midblastula transition. _eLife_ 2, e00861 (2013). Article  PubMed  PubMed Central  Google Scholar  * Negre, N. et


al. A cis-regulatory map of the Drosophila genome. _Nature_ 471, 527–531 (2011). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  * Zeitlinger, J. et al. Whole-genome ChIP-chip


analysis of Dorsal, Twist, and Snail suggests integration of diverse patterning processes in the Drosophila embryo. _Genes Dev._ 21, 385–390 (2007). Article  CAS  PubMed  PubMed Central 


Google Scholar  * MacArthur, S. et al. Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of


genomic regions. _Genome Biol._ 10, R80 (2009). Article  PubMed  PubMed Central  Google Scholar  * Graveley, B. R. et al. The developmental transcriptome of Drosophila melanogaster. _Nature_


471, 473–479 (2011). Article  ADS  CAS  PubMed  Google Scholar  * Saunders, A., Core, L. J., Sutcliffe, C., Lis, J. T. & Ashe, H. L. Extensive polymerase pausing during Drosophila axis


patterning enables high-level and pliable transcription. _Genes Dev._ 27, 1146–1158 (2013). Article  CAS  PubMed  PubMed Central  Google Scholar  * Mayer, W., Smith, A., Fundele, R. &


Haaf, T. Spatial separation of parental genomes in preimplantation mouse embryos. _J. Cell. Biol._ 148, 629–634 (2000). Article  CAS  PubMed  PubMed Central  Google Scholar  * Flyamer, I. M.


et al. Single-nucleus Hi-C reveals unique chromatin reorganization at oocyte-to-zygote transition. _Nature_ 544, 110–114 (2017). Article  ADS  CAS  PubMed  PubMed Central  Google Scholar  *


Reichmann, J. et al. Dual-spindle formation in zygotes keeps parental genomes apart in early mammalian embryos. _Science_ 361, 189–193 (2018). Article  ADS  CAS  PubMed  Google Scholar  *


Sexton, T. et al. Three-dimensional folding and functional organization principles of the Drosophila genome. _Cell_ 148, 458–472 (2012). Article  CAS  PubMed  Google Scholar  * Lewis, E. B.


The theory and application of a new method of detecting chromosomal rearrangements in Drosophila melanogaster. _Am. Nat._ 88, 225–239 (1954). Article  Google Scholar  * Davies, J. O.,


Oudelaar, A. M., Higgs, D. R. & Hughes, J. R. How best to identify chromosomal interactions: a comparison of approaches. _Nat. Methods_ 14, 125–134 (2017). Article  CAS  PubMed  Google


Scholar  * Fudenberg, G., Abdennur, N., Imakaev, M., Goloborodko, A. & Mirny, L. A. Emerging evidence of chromosome folding by loop extrusion. _Cold Spring Harb. Symp. Quant. Biol._ 82,


45–55 (2017). Article  PubMed  Google Scholar  * Rowley, M. J. et al. Evolutionarily conserved principles predict 3D chromatin organization. _Mol. Cell_ 67, 837–852 e7 (2017). Article  CAS 


PubMed  PubMed Central  Google Scholar  * Huettner, A. F. Maturation and fertilization of Drosophila melanogaster. _J. Morphol. Physiol._ 39, 249–265 (1924). Article  Google Scholar  *


Derti, A., Roth, F. P., Church, G. M. & Wu, C. T. Mammalian ultraconserved elements are strongly depleted among segmental duplications and copy number variants. _Nat. Genet._ 38,


1216–1220 (2006). Article  CAS  PubMed  Google Scholar  * Strom, A. R. et al. Phase separation drives heterochromatin domain formation. _Nature_ 547, 241–245 (2017). Article  ADS  CAS 


PubMed  PubMed Central  Google Scholar  * Maass, P.G., Barutcu, A.R. & Rinn, J.L. Interchromosomal interactions: a genomic love story of kissing chromosomes. _J. Cell Biol_. 218, 27–38


(2018). * Joyce, E. F., Williams, B. R., Xie, T. & Wu, C. T. Identification of genes that promote or antagonize somatic homolog pairing using a high-throughput FISH-based screen. _PLoS


Genet._ 8, e1002667 (2012). Article  CAS  PubMed  PubMed Central  Google Scholar  * Beliveau, B. J. et al. Single-molecule super-resolution imaging of chromosomes and in situ haplotype


visualization using Oligopaint FISH probes. _Nat. Commun._ 6, 7147 (2015). Article  ADS  CAS  PubMed  Google Scholar  * Senaratne, T. N., Joyce, E. F., Nguyen, S. C. & Wu, C. T.


Investigating the interplay between sister chromatid cohesion and homolog pairing in Drosophila nuclei. _PLoS Genet._ 12, e1006169 (2016). Article  PubMed  PubMed Central  Google Scholar  *


Beliveau, B. J. et al. In situ super-resolution imaging of genomic DNA with OligoSTORM and OligoDNA-PAINT. _Methods Mol. Biol._ 1663, 231–252 (2017). Article  CAS  PubMed  PubMed Central 


Google Scholar  * Pedersen, B. S. & Quinlan, A. R. Mosdepth: quick coverage calculation for genomes and exomes. _Bioinformatics_ 34, 867–868 (2018). Article  CAS  PubMed  Google Scholar


  * Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. _Bioinformatics_ 25, 1754–1760 (2009). Article  CAS  PubMed  PubMed Central  Google Scholar


  * Golloshi, R., Sanders, J. T. & McCord, R. P. Iteratively improving Hi-C experiments one step at a time. _Methods_ 142, 47–58 (2018). Article  CAS  PubMed  Google Scholar  * Star, B.


et al. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA. _PLoS. ONE._ 9, e89676 (2014). Article  ADS  PubMed 


PubMed Central  Google Scholar  * Lajoie, B. R., Dekker, J. & Kaplan, N. The Hitchhiker’s guide to Hi-C analysis: practical guidelines. _Methods_ 72, 65–75 (2015). Article  CAS  PubMed 


Google Scholar  * Glenn, T. C. Field guide to next-generation DNA sequencers. _Mol. Ecol. Resour._ 11, 759–769 (2011). Article  CAS  PubMed  Google Scholar  * Paten, B., Novak, A. M.,


Eizenga, J. M. & Garrison, E. Genome graphs and the evolution of genome inference. _Genome Res._ 27, 665–676 (2017). Article  CAS  PubMed  PubMed Central  Google Scholar  * Lilue, J. et


al. Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci. _Nat. Genet._ 50, 1574–1583 (2018). Article  CAS  PubMed  PubMed Central 


Google Scholar  * Landt, S. G. et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. _Genome Res._ 22, 1813–1831 (2012). Article  CAS  PubMed  PubMed Central 


Google Scholar  * Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. _Bioinformatics_ 26, 841–842 (2010). Article  CAS  PubMed  PubMed


Central  Google Scholar  * Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. _Bioinformatics_ 29, 15–21 (2013). Article  CAS  PubMed  Google Scholar  * McCole, R. B., Erceg, J.,


Saylor, W. & Wu, C. T. Ultraconserved elements occupy specific arenas of three-dimensional mammalian genome organization. _Cell Rep._ 24, 479–488 (2018). Article  CAS  PubMed  PubMed


Central  Google Scholar  * Kluyver, T. et al. Jupyter Notebooks—a publishing format for reproducible computational workflows. in _Positioning and Power in Academic Publishing: Players,


Agents and Agendas_ (eds Loizides, F. & Schmidt, B.) 87–90 (IOS Press, 2016). * Hunter, J.D. Matplotlib: A 2D graphics environment. in _Computing in Science & Engineering_ Vol. 9


(eds Thiruvathukal, G.K. & Läufer, K.) 90–95 (IEEE CS and the AIP, 2007). * van der Walt, S., Colbert, S.C. & Varoquaux, G. The NumPy array: a structure for efficient numerical


computation. in _Computing in Science & Engineering_ Vol. 13 22–30 (IEEE CS, 2011). * McKinney, W. Data Structures for Statistical Computing in Python. in _Proceedings of the 9th Python


in Science Conference_(eds van der Walt, S. & Millman, J.) 51–56 (2010). * Tange, O. GNU parallel: the command-line power tool. _USENIX Mag._ 36, 42–47 (2011). Google Scholar  Download


references ACKNOWLEDGEMENTS We are grateful to the Wu and Mirny laboratories, participants of the Annual Northeast Regional Chromosome Pairing Conferences, the Lieberman Aiden laboratory,


4DN DCIC, in particular Giancarlo Bonora from the Noble laboratory, Brian J. Beliveau, Guillaume J. Filion, Mirko Francesconi, Ben Lehner, M. Jordan Rowley, and the Cavalli laboratory,


especially Frédéric Bantignies, for discussions, the Furlong laboratory for Oregon-R fly collection, the Rushlow laboratory for _UAS-shRNA-zld_ line, the TUCF Genomics Sequencing Core


Facility, the Microscopy Resources on North Quad (MicRoN), and the Bloomington _Drosophila_ Stock Center for _Drosophila_ stocks. We apologize to the authors whose work we could not cite due


to constrains on referencing. This work was supported by an EMBO Long-Term Fellowship (ALTF 186-2014) to J.E., a William Randolph Hearst Award to R.B.M., and awards from NIH/NCI (Ruth L.


Kirschstein NRSA, F32CA157188) to E.F.J., NIH/NHGRI (R01 HG003143) to J.D. (Howard Hughes Medical Institute investigator), NIH/NIGMS (R01 GM114190) to L.A.M., and NIH/NIGMS (RO1GM123289,


DP1GM106412, R01HD091797) and HMS to C-t.W. J.D., and L.A.M. acknowledge support from the National Institutes of Health Common Fund 4D Nucleome Program (U54 DK107980). AUTHOR INFORMATION


Author notes * Bryan R. Lajoie Present address: Illumina, San Diego, CA, USA * Geoffrey Fudenberg Present address: Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA,


94158, USA * Son C. Nguyen & Eric F. Joyce Present address: Department of Genetics, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA,


19104-6145, USA * T. Niroshini Senaratne Present address: Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, 90095,


USA * These authors contributed equally: Jelena Erceg, Jumana AlHaj Abed, Anton Goloborodko. AUTHORS AND AFFILIATIONS * Department of Genetics, Harvard Medical School, Boston, MA, 02115,


USA Jelena Erceg, Jumana AlHaj Abed, Ruth B. McCole, Son C. Nguyen, Wren Saylor, Eric F. Joyce, T. Niroshini Senaratne, Mohammed A. Hannan, Guy Nir & C.-ting Wu * Institute for Medical


Engineering and Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA Anton Goloborodko, Geoffrey Fudenberg, Nezar Abdennur, Maxim Imakaev & Leonid A. Mirny *


Howard Hughes Medical Institute and Program in Systems Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA, 01605-0103,


USA Bryan R. Lajoie & Job Dekker * Department of Physics, Massachusetts Institute of Technology (MIT), Cambridge, MA, 02139, USA Leonid A. Mirny * Wyss Institute for Biologically


Inspired Engineering, Harvard University, Boston, MA, 02115, USA C.-ting Wu Authors * Jelena Erceg View author publications You can also search for this author inPubMed Google Scholar *


Jumana AlHaj Abed View author publications You can also search for this author inPubMed Google Scholar * Anton Goloborodko View author publications You can also search for this author


inPubMed Google Scholar * Bryan R. Lajoie View author publications You can also search for this author inPubMed Google Scholar * Geoffrey Fudenberg View author publications You can also


search for this author inPubMed Google Scholar * Nezar Abdennur View author publications You can also search for this author inPubMed Google Scholar * Maxim Imakaev View author publications


You can also search for this author inPubMed Google Scholar * Ruth B. McCole View author publications You can also search for this author inPubMed Google Scholar * Son C. Nguyen View author


publications You can also search for this author inPubMed Google Scholar * Wren Saylor View author publications You can also search for this author inPubMed Google Scholar * Eric F. Joyce


View author publications You can also search for this author inPubMed Google Scholar * T. Niroshini Senaratne View author publications You can also search for this author inPubMed Google


Scholar * Mohammed A. Hannan View author publications You can also search for this author inPubMed Google Scholar * Guy Nir View author publications You can also search for this author


inPubMed Google Scholar * Job Dekker View author publications You can also search for this author inPubMed Google Scholar * Leonid A. Mirny View author publications You can also search for


this author inPubMed Google Scholar * C.-ting Wu View author publications You can also search for this author inPubMed Google Scholar CONTRIBUTIONS J.E., J.A.A., and C-t.W. designed the


experiments. A.G. designed Hi-C computational analyses with input from J.E., J.A.A., L.A.M., and C-t.W. J.D. and B.R.L. provided input in experimental design and data analysis. Experimental


data were generated by J.E. and J.A.A., with the help of S.C.N., E.F.J., T.N.S., M.A.H. for sorting males and virgin females, and G.N. for Oligopaints design. J.E. and R.B.M. selected the


parental lines. J.E., J.A.A., A.G., B.R.L., G.F., N.A., M.I., R.B.M., and W.S. performed computational analyses of the Hi-C data. J.E. performed FISH data analysis and Zld depletion


experiments. J.E., J.A.A., A.G., G.F., L.A.M., and C.-t.W. interpreted the data. J.E., J.A.A., A.G., L.A.M., and C.-t.W. wrote the paper with input from the other authors. CORRESPONDING


AUTHORS Correspondence to Leonid A. Mirny or C.-ting Wu. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION PEER REVIEW INFORMATION


_Nature Communications_ thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. PUBLISHER’S NOTE Springer Nature remains


neutral with regard to jurisdictional claims in published maps and institutional affiliations. SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION PEER REVIEW FILE REPORTING SUMMARY


DESCRIPTION OF ADDITIONAL SUPPLEMENTARY FILES SUPPLEMENTARY DATA 1 SOURCE DATA SOURCE DATA RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative Commons Attribution


4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and


the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s


Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not


permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit


http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT THIS ARTICLE CITE THIS ARTICLE Erceg, J., AlHaj Abed, J., Goloborodko, A. _et al._ The genome-wide multi-layered


architecture of chromosome pairing in early _Drosophila_ embryos. _Nat Commun_ 10, 4486 (2019). https://doi.org/10.1038/s41467-019-12211-8 Download citation * Received: 25 January 2019 *


Accepted: 27 August 2019 * Published: 03 October 2019 * DOI: https://doi.org/10.1038/s41467-019-12211-8 SHARE THIS ARTICLE Anyone you share the following link with will be able to read this


content: Get shareable link Sorry, a shareable link is not currently available for this article. Copy to clipboard Provided by the Springer Nature SharedIt content-sharing initiative


Trending News

Tom schwartz reveals whether he’d marry again after katie maloney divorce

EXPLORE MORE Tom Schwartz may be off the market — forever. The “Vanderpump Rules” star confessed to Andy Cohen on “Watch...

The love he left behind: all about david bowie and iman's unbreakable bond and how she's 'holding up', says source

They were a beautiful, fashionable and staggeringly talented couple, but David Bowie and Iman were even more enviable fo...

Shopping: vs opens 1st pink store; california market center sales

_This article was originally on a blog post platform and may be missing photos, graphics or links. See About archive blo...

Don rickles: his most memorable comedic moments

For over six decades, Don Rickles was a living comedy highlight reel. Whether he was roasting his own audience during a ...

Top 10 schools and educational buildings of 2021

WHETHER IT’S A KINDERGARTEN OR A UNIVERSITY CAMPUS, THE SPACES IN WHICH WE LEARN DIRECTLY INFLUENCE HOW KNOWLEDGE IS ABS...

Latests News

The genome-wide multi-layered architecture of chromosome pairing in early drosophila embryos

ABSTRACT Genome organization involves _cis_ and _trans_ chromosomal interactions, both implicated in gene regulation, de...

Veteran producer tom mcnulty to head the exchange’s new production division

Veteran producer Tom McNulty, whose credits include _The Spectacular Now _and _Date Night_, has been hired by The Exchan...

We are 'underweight' on indian equities; see risk of earnings downgrades: hsbc

According to the global brokerage firm, the country has seen a progress of structural reforms across sectors, equities a...

Terry crews recreates white chicks dance for america's got talent audience: 'always ready'

Terry Crews has still got the moves! On Tuesday, the 53-year-old actor shared a video of himself on Instagram of a behin...

Nhs workers join anti-vax facebook that claims covid vaccine is 'poison'

HUNDREDS of NHS workers have joined an anti-vax Facebook group claiming the coronavirus vaccine is a "poison" ...

Top