Notably, at sites where CLAMP is required for chromatin accessibility (Type II, CLAMP-DA, n=23), ZLD occupancy is entirely ablated in
clamp-i
embryos (
Figure 5A
). CLAMP occupancy levels are also reduced after maternal
zld
RNAi at sites where ZLD is required for chromatin accessibility (Type III, ZLD-DA, n=123). Overall, we observed that CLAMP and/or ZLD occupancy is reduced at most of their co-bound regions when either one of the TFs is depleted, which is consistent with their inter-dependent binding relationship. Moreover,
clamp-i
has a stronger impact on ZLD occupancy than
zld-i
has on CLAMP occupancy.
To assess how CLAMP/ZLD-modulated chromatin accessibility impacts transcription, we examined the effect of maternal
clamp
(
Rieder et al., 2017
) or
zld
(
Schulz et al., 2015
) depletion on expression (RNA-seq data) of genes that fall into the four types of CLAMP/ZLD co-occupied sites (
Figure 5B
). We found that the expression levels of genes (Type II, CLAMP-DA, n=23) that require CLAMP for chromatin accessibility are significantly (p<0.05, Mann-Whitney U-test) downregulated in embryos lacking CLAMP compared to the Type IV (both non-DA) CLAMP and ZLD-independent group (n=374) (
Figure 5B
). Genes (Type III, ZLD-DA, n=123) dependent on ZLD for their accessibility also show a significant (p<0.001, Mann-Whitney U-test) reduction in expression upon maternal CLAMP depletion, suggesting CLAMP also might contribute to the regulation of genes at which ZLD regulates chromatin accessibility, likely by increasing ZLD binding.
Motif analysis demonstrates that CLAMP and ZLD motifs are enriched at genomic loci that are regulated by each factor as well as independent sites (Type IV), in addition to the motif for another GA-binding protein, GAF (
Figure 5—figure supplement 1B
). We next determined whether GAF alters chromatin accessibility at loci that are bound by all three factors and at which depletion of CLAMP or ZLD individually does not alter accessibility (Type IV). Indeed, we found that approximately 10% of loci that require GAF for their chromatin accessibility (n=104) (
Gaskill et al., 2021
) overlap with regions where depleting CLAMP or ZLD individually does not alter accessibility (CLAMP non-DA and/or ZLD non-DA) (
Figure 5C
, upper panel). When we do not require occupancy of ZLD and CLAMP at their non-DA sites, the overlap with the GAF-dependent regions is approximately 97% (
Figure 5C
, lower panel). These results suggest GAF might function at these CLAMP/ZLD independent sites, supporting a model in which multiple TFs coordinately regulate early zygotic chromatin accessibility during ZGA (
Hamm and Harrison, 2018
).
Together, our results reveal that CLAMP and ZLD regulate chromatin accessibility, which alters the occupancy of both factors and regulates zygotic transcription. Furthermore, GAF and/or other TFs might function at sites that are not altered by depleting CLAMP or ZLD individually, suggesting that multiple TFs promote chromatin accessibility during ZGA. It is also possible that CLAMP and ZLD are functionally redundant at the subset of genomic loci at which they regulate each other’s occupancy but at which depleting either factor individually is not sufficient to alter chromatin accessibility and expression.
Two questions central to early embryogenesis of all metazoans are how and where early TFs work together to drive chromatin changes and ZGA. Here, we defined a novel function of CLAMP as a pioneer TF that has a targeted yet essential function in early embryonic development. We found that CLAMP directly binds to nucleosomal DNA (
Figure 1
), establishes and/or maintains chromatin accessibility at promoters of genes that often encode other TFs (
Figure 2
), and facilitates the binding of ZLD to promoters (
Figure 3
) to regulate activation of zygotic gene transcription (
Figure 4
). We discovered that CLAMP and ZLD regulate each other’s binding by mediating chromatin accessibility, which further regulates their target gene expression (
Figure 5
). Overall, we provide new insight into how CLAMP and ZLD function together to enhance each other’s occupancy and increase chromatin accessibility, which drives ZGA.
We defined multiple classes of CLAMP-dependent and ZLD-dependent genomic loci in early embryos, which provides insight into how CLAMP and ZLD regulate chromatin accessibility and zygotic transcription during ZGA (
Figure 6
): (1) CLAMP promotes ZLD enrichment at sites where CLAMP increases chromatin accessibility and further regulates ZLD target gene expression. These loci remain open and transcriptionally active even upon ZLD depletion. (2) ZLD facilitates CLAMP occupancy at sites where ZLD regulates chromatin accessibility and promotes CLAMP target gene expression. When maternal CLAMP is depleted, these loci remain accessible and genes are actively transcribed. (3) GAF and/or other TFs could play major roles in opening chromatin at locations that are co-bound by CLAMP and ZLD but are not altered in accessibility after depleting CLAMP or ZLD individually. CLAMP and ZLD could also function redundantly at some of these loci because they alter each other’s occupancy at these loci but do not change accessibility or expression after depletion of either maternal CLAMP or ZLD individually. Overall, our data suggest that CLAMP functions with ZLD to regulate chromatin accessibility and gene expression of the early zygotic genome.
CLAMP and ZLD function together at promoters to regulate each other’s occupancy and gene expression of genes encoding other key TFs. We defined CLAMP and ZLD co-bound peaks in early embryos, which revealed roles for CLAMP and ZLD in defining chromatin accessibility and activating zygotic transcription at a subset of the zygotic genome. CLAMP-dependent regions: CLAMP promotes ZLD enrichment at these sites where CLAMP binding increases chromatin accessibility and regulates target gene expression. These sites are closed and lack binding of ZLD when maternal
clamp
is depleted, and they remain open and transcription is activated when maternal
zld
is depleted. ZLD-dependent regions: ZLD modulates chromatin opening and transcription at these sites that are bound by CLAMP but do not depend on CLAMP for chromatin accessibility. These sites are closed and lack binding of CLAMP when maternal
zld
is depleted, and they remain open and active when maternal
clamp
is depleted. CLAMP/ZLD-independent regions: GAF or other TFs open chromatin at locations co-bound by CLAMP and ZLD where chromatin accessibility is not altered when each factor is depleted individually. CLAMP and ZLD could also function redundantly at some of these loci. These sites remain accessible and transcriptionally active upon either maternal
zld
or
clamp
depletion. CLAMP, chromatin-linked adaptor for male-specific lethal (MSL) proteins; TF, transcription factor.
Although we have demonstrated an instrumental role for CLAMP in defining a subset of the open chromatin landscape in early embryos, our data show that CLAMP does not increase chromatin accessibility at promoters of all zygotic genes independent of ZLD. Consistent with our results in the early embryo, CLAMP regulates chromatin accessibility at only a few hundred genomic loci in male S2 (258 sites) and female Kc (102 sites) cell lines. Unlike ZLD, which plays a global role in regulating chromatin accessibility at promoters throughout the genome, depletion of CLAMP alone mainly drives changes at promoters of specific genes that often encode TFs that are important for early development, consistent with phenotypic data. These findings indicate that CLAMP and ZLD regulate ZGA in different ways: ZLD mediates chromatin opening globally, while CLAMP functions in a more targeted way at certain essential early TF genes. However, both proteins are critical to ZGA, and loss of either is catastrophic for overall embryonic development.
Moreover, ZLD binding and/or chromatin accessibility is not regulated by maternal depletion of CLAMP at all GA-rich sites in the genome. GAF is also enriched at these same ZLD-bound regions where ZLD is not required for chromatin accessibility (
Schulz et al., 2015
;
Gaskill et al., 2021
). Both CLAMP and GAF are deposited maternally (
Rieder et al., 2017
;
Hamm et al., 2017
) and bind to similar GA-rich motifs (
Kaye et al., 2018
). To test whether GAF compensates for the depletion of CLAMP or ZLD, we attempted GAF RNAi in the current study. However, we and other laboratories could not achieve depletion of GAF in early embryos by RNAi, likely due to autoregulation of its own promoter and its prion-like self-perpetuating function (
Tariq et al., 2013
).
We previously demonstrated that competition between CLAMP and GAF at GA-rich binding sites is essential for MSL complex recruitment in S2 cells (
Kaye et al., 2018
). Furthermore, CLAMP excludes GAF at the histone locus which co-regulates genes that encode the histone proteins (
Rieder et al., 2017
). However, we also observed synergistic binding between CLAMP and GAF at many additional binding sites (
Kaye et al., 2018
). The relationship between CLAMP and GAF in early embryos remains unclear. It is very possible that the competitive relationship has not been established in early embryos, since dosage compensation has not yet been initiated (
Prayitno et al., 2019
). Using GAF-dependent loci defined by
Gaskill et al., 2021
, we found that genomic loci where GAF functions largely overlap with regions where depletion of CLAMP or ZLD alone does not alter chromatin accessibility, indicating that GAF may function independently of CLAMP or ZLD or is functionally redundant. Future studies are required to distinguish between these models by examining how GAF and CLAMP affect each other’s binding to co-bound loci and simultaneously eliminating both factors.
The GA-rich sequences targeted by CLAMP and GAF are distinct from each other in vivo and in vitro. GAF binding sites typically have 3.5 GA repeats; however, GAF is able to bind to as few as three bases (GAG) within the
hsp70
promoter and in vitro (
Wilkins and Lis, 1999
). In contrast, CLAMP binding sites contain an 8-bp core with a less well-conserved second GA dinucleotide within the core (GA__GAGA) (
Alekseyenko et al., 2008
). CLAMP binding sites also include a GAGAG pentamer at a lower frequency than GAF binding sites, and flanking bases surrounding the 8-bp core are critical for CLAMP binding (
Kaye et al., 2018
). Therefore, GAF and CLAMP may have overlapping and non-overlapping functions at different loci, tissues, or developmental stages. Moreover, another TF, Pipsqueak (Psq), also binds to sites containing the GAGAG motif. Psq has multiple functions during oogenesis and embryonic pattern formation and functions with Polycomb in three-dimensional genome organization (
Lehmann et al., 1998
;
Gutierrez-Perez et al., 2019
). In the future, an optogenetic inactivation approach could be used to remove CLAMP, GAF, and/or Psq simultaneously in a spatial and temporal manner (
McDaniel et al., 2019
).
ZLD is an essential TF that regulates activation of the first set of zygotic genes during the minor wave of ZGA and thousands of genes transcribed during the major wave of ZGA at nuclear cycle 14 (
Liang et al., 2008
;
Harrison et al., 2011
). ZLD also establishes and maintains chromatin accessibility of specific regions and facilitates TF binding and early gene expression (
Sun et al., 2015
;
Schulz et al., 2015
). CLAMP regulates histone gene expression (
Rieder et al., 2017
), X chromosome dosage compensation (
Soruco et al., 2013
), and establishes/maintains chromatin accessibility (
Urban et al., 2017b
). Nonetheless, it remained unclear whether and how CLAMP and ZLD functionally interact during ZGA. Here, we demonstrate that CLAMP and ZLD function together at a subset of promoters of genes that often encode other transcriptional regulators.
ZLD regulates CLAMP occupancy earlier than CLAMP regulates ZLD occupancy. Genomic loci at which CLAMP is dependent on ZLD early (0–2 hr) in development often become independent from ZLD later (2–4 hr), with the caveat that ZLD depletion is not as effective later in development. Therefore, CLAMP may require the pioneering activity of ZLD to access specific loci before ZGA, but ZLD may no longer be necessary once CLAMP binding is established. Also, our results suggest that CLAMP is a potent regulator of ZLD binding, especially in 2–4 hr embryos. ZLD can bind to many more promoter regions at 0–2 hr, while CLAMP mainly binds to introns early in development but occupies promoters later at 2–4 hr. Therefore, CLAMP may require ZLD to increase chromatin accessibility of these promoter regions (
Schulz et al., 2015
).
In addition to its role in embryonic development, CLAMP also plays an essential role in targeting the MSL male dosage compensation complex to the X chromosome (
Soruco et al., 2013
).
Drosophila
embryos initiate X chromosome counting in nuclear cycle 12 and start the sex determination cascade prior to the major wave of ZGA at nuclear cycle 14 (
Gergen, 1987
;
ten Bosch et al., 2006
). However, most dosage compensation is initiated much later in embryonic development (
Prayitno et al., 2019
). Our data support a model in which CLAMP functions early in the embryo prior to MSL complex assembly to open up specific chromatin regions for MSL complex recruitment (
Urban et al., 2017b
;
Rieder et al., 2019
). Moreover, ZLD likely functions primarily as an early pioneer factor, whereas CLAMP has pioneer functions in both early and late-ZGA embryos. Consistent with this hypothesis, CLAMP binding is enriched at both early and late zygotic genes. In contrast, ZLD binds more frequently to early genes, suggesting that there may be a sequential relationship between occupancy of these two TFs at some loci during early embryogenesis.
The different characteristics of dependent and independent CLAMP and ZLD binding sites also provide insight into how early TFs work together to regulate ZGA. At dependent sites, there are often relatively broad peaks of CLAMP and ZLD that are significantly enriched for clusters of motifs for the required protein. Our CLAMP gel shift assays and those previously reported (
Kaye et al., 2018
) also show multiple shifted bands consistent with possible multimerization. CLAMP contains two central disordered prion-like glutamine-rich regions (
Kaye et al., 2018
), a domain that is critical for transcriptional activation and multimerization in vivo in several TFs, including GAF (
Wilkins and Lis, 1999
). Moreover, glutamine-rich repeats alone can be sufficient to mediate stable protein multimerization in vitro (
Stott et al., 1995
). Therefore, it is reasonable to hypothesize that the CLAMP glutamine-rich domain also functions in CLAMP multimerization.
In contrast, ZLD fails to form dimers or multimers (
Hamm et al., 2015
;
Hamm et al., 2017
), indicating that ZLD most likely binds as a monomer. There is no evidence that CLAMP and ZLD have any direct protein-protein interaction at sites where they depend on each other to bind. For example, mass spectrometry results that identified dozens of CLAMP-associated proteins did not identify ZLD (
Urban et al., 2017b
). No data have validated any protein-protein interaction of ZLD with itself as a multimer or between ZLD and any other TF (
Hamm et al., 2017
). In the future, simultaneous ablation of maternal CLAMP and ZLD will allow the analysis of potential functional redundancy at a subgroup of genomic loci. Our study suggests that regulating the chromatin landscape in early embryos to drive ZGA requires the function of multiple pioneer TFs.
MBP-tagged CLAMP DBD was expressed and purified as described previously (
Kaye et al., 2018
). MBP-tagged (pTHMT,
Peti and Page, 2007
) FL CLAMP protein was expressed in
Escherichia coli
BL21 Star (DE3) cells (Life Technologies). Bacterial cultures were grown to an optical density of 0.7–0.9 before induction with 1 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG) for 4 hr at 37°C.
Cell pellets were harvested by centrifugation and stored at −80°C. Cell pellets were resuspended in 20 mM Tris, 1 M NaCl, 0.1 mM ZnCl
2
, and 10 mM imidazole pH 8.0 with one EDTA-free protease inhibitor tablet (Roche) and lysed using an Emulsiflex C3 (Avestin). The lysate was cleared by centrifugation at 20,000 rpm for 50 min at 4°C, filtered using a 0.2 μm syringe filter, and loaded onto a HisTrap HP 5 ml column. The protein was eluted with a gradient from 10 to 300 mM imidazole in 20 mM Tris, 1.0 M NaCl pH 8.0, and 0.1 mM ZnCl
2
. Fractions containing MBP-CLAMP FL were loaded onto a HiLoad 26/600 Superdex 200 pg column equilibrated in 20 mM Tris, 1.0 M NaCl, pH 8.0. Fractions containing FL CLAMP were identified by SDS-PAGE and concentrated using a centrifugation filter with a 10-kDa cutoff (Amicon, Millipore) and frozen as aliquots.
The 240 bp 5C2 DNA fragment used for nucleosome in vitro assembly was amplified from 276 bp 5C2 fragments (50 ng/µl, IDT gBlocks gene fragments) by PCR (see 276 bp 5C2 and primer sequences below) using OneTaq Hot Start 2× Master Mix (New England Biolabs). The DNA was purified using the PCR Clean-Up Kit (Qiagen) and concentrated to 1 µg/µl by SpeedVac Vacuum (Eppendorf). The nucleosomes were assembled using the EpiMark Nucleosome Assembly Kit (New England Biolabs) following the kit’s protocol.
5C2 (276 bp),
bold
sequences are CLAMP-binding motifs, underlined sequences are primer binding sequences:
TCGACGACTAGTTTAAAGTTATTGTAGTTCTTAGAGCAGAATGTATTTTAAATATCAATGTTTCGATGTAGAAATTGAATGGTTTAAATCACGTTCACACAACTTA
GAAAGAGATAG
CGATGGCGGTGT
GAAAGAGAGCGAGATAG
TTGGAAGCTTCATG
GAAATGAAAGAGAGGTAG
TTTTTGGAAATGAAAGTTGTACTAGAAATAAGTATTTTATGTATATAGAATATCGAAGTACAGAAATTCGAAGCGATCTCAACTTGAATATTATATCG
Primers for 5C2 region (product is 240 bp):
Forward:
TTGTAGTTCTTAGAGCAGAATGT
Reverse:
GTTGAGATCGCTTCGAATTT
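The stated product size can be checked in silico. The following is a minimal sketch, assuming that the seven sequence lines listed above concatenate, in order, into the 276 bp 5C2 template; the helper names (`reverse_complement`, `amplicon`) are illustrative and are not part of the original protocol.

```python
# Template assembled from the seven sequence lines above (assumed to be contiguous).
TEMPLATE_5C2 = (
    "TCGACGACTAGTTTAAAGTTATTGTAGTTCTTAGAGCAGAATGTATTTTAAATATCAATGTTTCGATGTAGAAATTGAATGGTTTAAATCACGTTCACACAACTTA"
    "GAAAGAGATAG"
    "CGATGGCGGTGT"
    "GAAAGAGAGCGAGATAG"
    "TTGGAAGCTTCATG"
    "GAAATGAAAGAGAGGTAG"
    "TTTTTGGAAATGAAAGTTGTACTAGAAATAAGTATTTTATGTATATAGAATATCGAAGTACAGAAATTCGAAGCGATCTCAACTTGAATATTATATCG"
)
FORWARD = "TTGTAGTTCTTAGAGCAGAATGT"
REVERSE = "GTTGAGATCGCTTCGAATTT"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template, forward, reverse):
    """Return the region spanning the forward primer through the reverse primer site."""
    start = template.find(forward)
    if start < 0:
        raise ValueError("forward primer not found in template")
    reverse_site = reverse_complement(reverse)
    site_start = template.find(reverse_site, start)
    if site_start < 0:
        raise ValueError("reverse primer site not found downstream of forward primer")
    return template[start:site_start + len(reverse_site)]


product = amplicon(TEMPLATE_5C2, FORWARD, REVERSE)
print(len(TEMPLATE_5C2), len(product))
```

Under this assumption, the template is 276 bp long and the primer pair delimits a 240 bp product, in agreement with the sizes given above.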
DNA or nucleosome probes at 35 nM (700 fmol/reaction) were incubated with MBP-tagged CLAMP DBD protein or MBP-tagged FL CLAMP protein in a binding buffer. The binding reaction buffer conditions are similar to conditions previously used to test ZLD nucleosome binding (
McDaniel et al., 2019
) in 20 µl total volume: 7.5 µl BSA/HEGK buffer (12.5 mM HEPES, pH 7.0, 0.5 mM EDTA, 0.5 mM EGTA, 5% glycerol, 50 mM KCl, 0.05 mg/ml BSA, 0.2 mM PMSF, 1 mM DTT, 0.25 mM ZnCl
2
, and 0.006% NP-40), 10 µl probe mix (5 ng poly[d-(IC)], 5 mM MgCl
2
, 700 fmol probe), and 2.5 µl protein dilution (0.5 µM, 1 µM, and 2.5 µM) at room temperature for 60 min. Reactions were loaded onto 6% DNA retardation gels (Thermo Fisher Scientific) and run in 0.5× Tris–borate–EDTA buffer for 2 hr. Gels were post-stained with GelRed Nucleic Acid Stain (Thermo Fisher Scientific) for 30 min and visualized using the ChemiDoc MP imaging system (Bio-Rad).
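The reaction arithmetic can be verified with a short calculation. This is a sketch only: it assumes that the listed protein concentrations (0.5, 1, and 2.5 µM) refer to the 2.5 µl dilutions added to each reaction rather than to final concentrations, which the text does not state explicitly. The function names are illustrative.

```python
# Component volumes per 20 µl binding reaction, as listed in the text.
COMPONENT_VOLUMES_UL = {
    "BSA/HEGK buffer": 7.5,
    "probe mix": 10.0,
    "protein dilution": 2.5,
}
PROBE_FMOL = 700.0
# Assumption: these are the concentrations of the added dilutions, not final values.
PROTEIN_DILUTIONS_UM = [0.5, 1.0, 2.5]


def probe_concentration_nm(probe_fmol, volume_ul):
    """Probe concentration in nM (1 fmol/µl is equal to 1 nM)."""
    return probe_fmol / volume_ul


def final_protein_nm(dilution_um, added_ul, total_ul):
    """Final protein concentration in nM after dilution into the reaction."""
    return dilution_um * 1000.0 * added_ul / total_ul


total_ul = sum(COMPONENT_VOLUMES_UL.values())
probe_nm = probe_concentration_nm(PROBE_FMOL, total_ul)
protein_nm = [
    final_protein_nm(c, COMPONENT_VOLUMES_UL["protein dilution"], total_ul)
    for c in PROTEIN_DILUTIONS_UM
]
print(total_ul, probe_nm, protein_nm)
```

The component volumes sum to the stated 20 µl, and 700 fmol in 20 µl corresponds to the stated 35 nM probe. Under the stated assumption, the final protein concentrations would be 62.5, 125, and 312.5 nM.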
To deplete maternally deposited
clamp
or
zld
mRNA throughout oogenesis, we crossed a maternal triple driver (MTD-GAL4, Bloomington, #31777) line (
Ni et al., 2011
) with a Transgenic RNAi Project (TRiP)
clamp
RNAi line (Bloomington, #57008), a TRiP
zld
RNAi line (from C. Rushlow lab) or
egfp
RNAi line (Bloomington, #41552). The
egfp
RNAi line was used as control in smFISH immunostaining and imaging experiments. The MTD-GAL4 line alone was used as the control line in ATAC-seq and ChIP-seq experiments.
Briefly, the MTD-GAL4 virgin females (5–7 days old) were mated with TRiP UAS-RNAi males to obtain MTD-Gal4/UAS-RNAi line daughters. The MTD drives RNAi during oogenesis in these daughters. Therefore, the targeted mRNA is depleted in their eggs. Then MTD-Gal4/UAS-RNAi daughters were mated with males to produce embryos with depleted maternal
clamp
or
zld
mRNA and used for ATAC-seq and ChIP-seq experiments. The embryonic phenotypes of the maternal
zld
−
TRiP RNAi line were confirmed previously (
Sun et al., 2015
). Maternal
clamp
−
embryonic phenotypes of the TRiP
clamp
RNAi line were confirmed by immunofluorescent staining in our study. Moreover, we validated CLAMP or ZLD protein knockdown in early embryos by Western blotting using the Western Breeze Kit (Invitrogen) and measured
clamp
and
zld
mRNA levels by qRT-PCR (
Figure 1—figure supplement 1B,C
and
Figure 1—source data 1
).
To optimize egg collections, young (5–7 days old) females and males were mated. To ensure mothers do not lay older embryos during collections, we first starved flies for 2 hr in the empty cages and discarded the first 2 hr grape agar plates with yeast paste (Plate set #0). When we collected eggs for the experiments, we put flies in the cages with grape agar plates (Plate set #1) with yeast paste for egg laying for 2 hr. Then, we replaced Plate set #1 with a new set of plates (Plate set #2) at the 2 hr time point. We kept Plate set #1 embryos (without any adult flies) to further develop for another 2 hr to obtain 2–4 hr embryos. At the same time, we obtained newly laid 0–2 hr embryos from Plate set #2. Therefore, this strategy successfully prevented cross-contamination between 0–2 hr (Plate set #2) and 2–4 hr embryos (Plate set #1).
For whole embryo single-molecule fluorescence in situ hybridization (smFISH) and immunostaining and subsequent imaging, standard protocols were used (
Little and Gregor, 2018
). smFISH probes complementary to
run
were a gift from Thomas Gregor, and those complementary to
eve
were a gift from Shawn Little. The concentrations of the different dyes and antibodies were as follows: Hoechst (Invitrogen, 3 µg/ml), anti-NRT (Developmental Studies Hybridoma Bank BP106, 1:10), AlexaFluor secondary antibodies (Invitrogen Molecular Probes, 1:1000). Imaging was done using a Nikon A1 point-scanning confocal microscope with a 40× oil objective. Image processing and intensity measurements were done using ImageJ software (NIH). Figures were assembled using Adobe Photoshop CS4.
We conducted ATAC-seq following the protocol from
Blythe and Wieschaus, 2016
. 0–2 hr or 2–4 hr embryos were laid on grape agar plates, dechorionated by 1 min exposure to 6% bleach (Clorox) and then washed three times in deionized water. We homogenized 10 embryos and lysed them in 50 µl lysis buffer (10 mM Tris pH 7.5, 10 mM NaCl, 3 mM MgCl
2
, and 0.1% NP-40). We collected nuclei by centrifuging at 500
g
at 4°C and resuspended nuclei in 5 µl TD buffer with 2.5 µl Tn5 enzyme (Illumina Tagment DNA TDE1 Enzyme and Buffer Kits). We incubated samples at 37°C for 30 min at 800 rpm (Eppendorf Thermomixer) for fragmentation, and then purified samples with Qiagen MinElute columns before PCR amplification. We amplified libraries by adding 10 µl DNA to 25 µl NEBNext HiFi 2× PCR mix (New England Biolabs) and 2.5 µl of a 25 µM solution of each of the Ad1 and Ad2 primers. We used 13 PCR cycles to amplify samples from 0–2 hr embryos and 12 PCR cycles to amplify samples from 2–4 hr embryos. Next, we purified libraries with 1.2× Ampure SPRI beads. We performed three biological replicates for each genotype (n=2) and time point (n=2). We measured the concentrations of the 12 ATAC-seq libraries by Qubit and determined library quality by Bioanalyzer. We sequenced libraries on an Illumina HiSeq 4000 sequencer at GeneWiz (South Plainfield, NJ) using the 2 × 150 bp mode. ATAC-seq data are deposited at NCBI GEO under accession number GSE152596.
We performed ChIP-seq as previously described (
Blythe and Wieschaus, 2015
). We collected and fixed ~100 embryos from each MTD-GAL4 and RNAi cross 0–2 hr or 2–4 hr after egg lay. We used 3 µl of rabbit anti-CLAMP (
Soruco et al., 2013
) and 2 µl rat anti-ZLD (from C. Rushlow lab) per sample. We performed three biological ChIP replicates for each protein (n=2), genotype (n=3), and time point (n=2). In total, we prepared 36 libraries using the NEBNext Ultra ChIP-seq Kit (New England Biolabs) and sequenced libraries on the Illumina HiSeq 2500 sequencer using the 2 × 150 bp mode. ChIP-seq data are deposited at NCBI GEO under accession number GSE152598.
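The library counts follow from fully crossing the experimental factors. The sketch below enumerates both designs. The genotype and stage labels are assumptions chosen for illustration (MTD control, `clamp-i`, and `zld-i` for ChIP-seq; MTD control and `clamp-i` for ATAC-seq), and `library_design` is a hypothetical helper.

```python
from itertools import product


def library_design(factors, replicates):
    """Enumerate one library per factor combination and biological replicate."""
    names = list(factors)
    libraries = []
    for combo in product(*(factors[name] for name in names)):
        for rep in range(1, replicates + 1):
            libraries.append(dict(zip(names, combo), replicate=rep))
    return libraries


# Labels below are illustrative assumptions, not sample names from the study.
CHIP_FACTORS = {
    "antibody": ["CLAMP", "ZLD"],
    "genotype": ["MTD", "clamp-i", "zld-i"],
    "stage": ["0-2 hr", "2-4 hr"],
}
ATAC_FACTORS = {
    "genotype": ["MTD", "clamp-i"],
    "stage": ["0-2 hr", "2-4 hr"],
}

chip_libraries = library_design(CHIP_FACTORS, replicates=3)
atac_libraries = library_design(ATAC_FACTORS, replicates=3)
print(len(chip_libraries), len(atac_libraries))
```

With three replicates per condition, this enumeration gives 36 ChIP-seq libraries (2 × 3 × 2 × 3) and 12 ATAC-seq libraries (2 × 2 × 3), matching the totals reported in the text.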
Prior to sequencing, the Fragment Analyzer showed the library top peaks were in the 180–190 bp range, which is comparable to the previously established embryo ATAC-seq protocol (
Haines, 2017
). Demultiplexed reads were trimmed of adapters using TrimGalore (
Krueger, 2017
) and mapped to the
Drosophila
genome (dm6) using Bowtie2 (v. 2.3.0) with options
--very-sensitive
,
--no-mixed
,
--no-discordant
,
--dovetail
-X 2000 -k 2. We used Picard tools (v. 2.9.2) and SAMtools (v. 1.9,
Li et al., 2009
) to remove reads that were unmapped, had an unmapped mate, were not primary alignments, failed platform quality checks, or were marked as duplicates (-F 1804), and to retain properly paired reads (-f 2) with MAPQ >30. After quality trimming and mapping, Picard reported mean fragment sizes between 125 and 161 bp for all mapped ATAC-seq reads. As expected, we observed three classes of peaks: (1) a sharp peak at <100 bp (open chromatin); (2) a peak at ~200 bp (mono-nucleosome); and (3) other larger peaks (multi-nucleosomes).
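For readers unfamiliar with SAM bitwise flags, the following sketch decodes the exclusion mask 1804 and applies the same keep/discard logic to a single alignment. It illustrates the filter semantics only and is not the command used in the study; the function names are illustrative, and the strict `MAPQ > 30` comparison follows the wording above.

```python
# SAM flag bits as defined in the SAM format specification.
SAM_FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(value):
    """List the names of all flag bits set in an integer SAM flag."""
    return [name for bit, name in SAM_FLAG_BITS.items() if value & bit]


def keep_alignment(flag, mapq, exclude=1804, require=2, min_mapq=30):
    """Apply -F (exclude any of these bits), -f (require all of these bits), and a MAPQ cutoff."""
    return (flag & exclude) == 0 and (flag & require) == require and mapq > min_mapq


print(decode_flag(1804))
print(keep_alignment(flag=99, mapq=42))    # properly paired, first in pair
print(keep_alignment(flag=1123, mapq=42))  # same read marked as duplicate
```

The mask 1804 is the sum 4 + 8 + 256 + 512 + 1024, so any alignment carrying at least one of those five bits is discarded.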
After mapping, we used Samtools to select a fragment size ≤100 bp within the bam files to focus on open chromatin. Peak regions for open chromatin regions were called using MACS2 (v. 2.1.1,
Zhang et al., 2008
) with parameters -f BAMPE -g dm
--call-summits
. ENCODE blacklist was used to filter out problematic regions in dm6 (
Amemiya et al., 2019
). Bam files and peak bed files were used in DiffBind v.3.12 (
Stark and Brown, 2019
) to count reads (dba.count), normalize library sizes (dba.normalize), and call DA regions (dba.contrast) with the DESeq2 method. Peak regions (201 bp) were centered on peak summits and extended 100 bp on each side. Sites were defined as DA with statistically significant differences between conditions using absolute cutoffs of FC>0.5 and FDR<0.1. We report all accessible peaks from DiffBind in
Figure 2—source data 1
.
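The fragment-size selection, summit-centered windows, and DA thresholds described above can be expressed compactly. This is a minimal sketch and not the pipeline itself: it assumes that the FC cutoff applies to the absolute log2 fold change reported by DiffBind, that windows use 0-based half-open coordinates, and all function names are illustrative.

```python
def is_short_fragment(template_length, max_length=100):
    """True if a paired-end fragment (SAM TLEN, sign ignored) is within the open-chromatin size range."""
    return 0 < abs(template_length) <= max_length


def summit_window(summit, flank=100):
    """Half-open (start, end) window centered on a summit: flank + summit base + flank."""
    return summit - flank, summit + flank + 1


def is_differential(log2_fold_change, fdr, fc_cutoff=0.5, fdr_cutoff=0.1):
    """Call a region differential when both the effect-size and FDR thresholds are met."""
    return abs(log2_fold_change) > fc_cutoff and fdr < fdr_cutoff


start, end = summit_window(1000)
print(end - start)                      # width of an ATAC-seq peak region
print(is_differential(-0.8, 0.05))      # reduced accessibility, significant
print(is_differential(0.4, 0.01))       # significant but below the effect-size cutoff
```

The same `summit_window` helper with `flank=250` gives a 501 bp window, and `is_differential` with `fdr_cutoff=0.05` corresponds to a stricter FDR threshold.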
We used DeepTools (v. 3.1.0,
Ramírez et al., 2014
) to generate enrichment heatmaps (CPM normalization), and average profiles were generated in DeepStats (
Gautier, 2020
). We used 1× depth (reads per genome coverage, RPGC) normalization in Deeptools bamCoverage for making the coverage Bigwig files and uploaded to IGV (
Robinson et al., 2011
) for genomic track visualizations. Homer (v. 4.11,
Givler and Lilienthal, 2005
) was used for de novo motif searches. Visualizations and statistical tests were conducted in R (
R Development Core Team, 2014
). Specifically, we annotated peaks to their genomic regions using the R package ChIPseeker (
Yu et al., 2015
) and we performed gene ontology enrichment analysis using clusterProfiler (
Yu et al., 2012
). Boxplots and violin plots were generated using the ggplot2 (
Wickham, 2009
) package.
Briefly, we trimmed ChIP-seq raw reads with TrimGalore (
Krueger, 2017
) with a minimal phred score of 20, 36 bp minimal read length, and Illumina adaptor removal. We then mapped cleaned reads to the
D. melanogaster
genome (UCSC dm6) with Bowtie2 (v. 2.3.0) with the --very-sensitive-local flag. We used Picard tools (v. 2.9.2) and SAMtools (v. 1.9,
Li et al., 2009
) to remove the PCR duplicates. We used MACS2 (v. 2.1.1,
Zhang et al., 2008
) to identify peaks with default parameters and MSPC (v. 4.0.0,
Jalili et al., 2015
) to obtain consensus peaks from three replicates. The peak number for each sample was summarized in
Table 1
. ENCODE blacklist was used to filter out problematic regions in dm6 (
Amemiya et al., 2019
). We identified DB and non-DB peaks between MTD and RNAi samples using DiffBind (v. 3.10,
Stark and Brown, 2019
) with the DESeq2 method. Peak regions (501 bp) were centered by peak summits and extended 250 bp on each side. The DB and non-DB peak numbers are summarized in
Table 1
. DB was defined with absolute FC>0.5 and FDR<0.05 (
Table 1—source data 1
).
We used DeepTools (v. 3.1.0,
Ramírez et al., 2014
) to generate enrichment heatmaps and average profiles. Bigwig files were generated with DeepTools bamCompare (scale factor method: SES; Normalization: log
2
) and uploaded to IGV (
Robinson et al., 2011
) for genomic track visualization. We used Homer (v. 4.11,
Givler and Lilienthal, 2005
) for de novo motif searches and genomic annotation. Intervene (
Khan and Mathelier, 2017
) was used for intersection and visualization of multiple peak region sets. Visualizations and statistical tests were conducted in R (
R Development Core Team, 2014
). Specifically, we annotated peaks to their genomic regions using the R package ChIPseeker (
Yu et al., 2015
) and we did gene ontology enrichment analysis using clusterProfiler (
Yu et al., 2012
). Boxplots and violin plots were generated using the ggplot2 (
Wickham, 2009
) package.
We used Bedtools (
Quinlan and Hall, 2010
) intersection tool to intersect CLAMP ChIP-seq peaks with CLAMP DA or non-DA peaks. Based on the intersection of the peaks, we defined four types of CLAMP-related peaks: (1) DA with CLAMP, (2) DA without CLAMP, (3) non-DA with CLAMP, and (4) non-DA without CLAMP. Similarly, we defined ZLD-related peaks by intersecting ZLD DA or non-DA peaks and ATAC-seq data sets (
Hannon et al., 2017
;
Soluri et al., 2020
) from wt and
zld
germline clone (
zld-
) embryos at the NC14 +12 min stage. Specifically, we defined four classes of genomic loci for ZLD-related classes: (1) DA with ZLD, (2) DA without ZLD, (3) non-DA with ZLD, and (4) non-DA without ZLD. We used DeepTools (v. 3.1.0,
Ramírez et al., 2014
) to generate enrichment heatmaps for each subclass of peaks. Peaks locations in each CLAMP or ZLD-related category were summarized in
Table 2—source data 1
.
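The four-way classification above reduces to two binary properties per accessibility peak: whether it is DA and whether it overlaps a ChIP-seq peak. The sketch below illustrates that logic on toy intervals. It is not the Bedtools command used in the study; it assumes that any overlap of at least 1 bp counts as an intersection, and the `Peak` type, function names, and coordinates are hypothetical.

```python
from collections import namedtuple

# Half-open genomic interval, as in BED files.
Peak = namedtuple("Peak", ["chrom", "start", "end"])


def overlaps(a, b):
    """True if two half-open intervals on the same chromosome share at least 1 bp."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify_peak(atac_peak, is_da, chip_peaks, factor="CLAMP"):
    """Assign an accessibility peak to one of the four classes for a given factor."""
    bound = any(overlaps(atac_peak, chip_peak) for chip_peak in chip_peaks)
    accessibility = "DA" if is_da else "non-DA"
    occupancy = "with" if bound else "without"
    return f"{accessibility} {occupancy} {factor}"


# Toy coordinates for illustration only.
chip_peaks = [Peak("chrX", 1000, 1501), Peak("chr2L", 5000, 5501)]
atac_peaks = [
    (Peak("chrX", 1400, 1601), True),
    (Peak("chrX", 3000, 3201), True),
    (Peak("chr2L", 5100, 5301), False),
    (Peak("chr3R", 5100, 5301), False),
]
classes = [classify_peak(peak, is_da, chip_peaks) for peak, is_da in atac_peaks]
print(classes)
```

For large peak sets, an interval tree or a sorted sweep would replace the quadratic scan used here for clarity.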
To define strong, weak, and unbound genes close to peaks in CLAMP or ZLD ChIP-seq data, we used a MACS2 peak binding score (-log10[p-value]) of 100 as the cutoff value. We defined the following categories: (1) strong binding peaks: score greater than 100; (2) weak binding peaks: score less than 100; (3) unbound peaks: the remaining peaks that are neither strong nor weak. Then, we annotated all peaks using Homer annotatePeaks (v. 4.11,
Givler and Lilienthal, 2005
). We then obtained the log
2
fold change (
clamp-i/
MTD or
zld-i/yw
) of gene expression in the RNA-seq data set for each protein binding group: CLAMP (
Rieder et al., 2017
) or ZLD (
Schulz et al., 2015
). Boxplots and violin plots were generated using the ggplot2 (
Wickham, 2009
) package.
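The grouping step can be sketched as follows. This is an illustration under stated assumptions: a gene with no associated peak is represented by a score of `None` and treated as unbound, a score exactly equal to the cutoff is placed in the weak group (the text does not specify this case), and the gene names and values are hypothetical toy data rather than results.

```python
from collections import defaultdict
from statistics import median


def binding_class(peak_score, cutoff=100.0):
    """Classify a gene by the MACS2 score of its nearest peak (None means no peak)."""
    if peak_score is None:
        return "unbound"
    return "strong" if peak_score > cutoff else "weak"


def median_log2fc_by_class(genes):
    """Group genes by binding class and return the median log2 fold change per class."""
    grouped = defaultdict(list)
    for gene in genes:
        grouped[binding_class(gene["peak_score"])].append(gene["log2fc"])
    return {cls: median(values) for cls, values in grouped.items()}


# Hypothetical toy values for illustration only.
toy_genes = [
    {"gene": "geneA", "peak_score": 250.0, "log2fc": -1.5},
    {"gene": "geneB", "peak_score": 130.0, "log2fc": -0.5},
    {"gene": "geneC", "peak_score": 40.0, "log2fc": -0.25},
    {"gene": "geneD", "peak_score": None, "log2fc": 0.0},
]
summary = median_log2fc_by_class(toy_genes)
print(summary)
```

The per-class lists collected in `median_log2fc_by_class` are the same values that a boxplot or violin plot of each binding group would display.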
RNA-seq data sets from wt and maternal
clamp
depletion by RNAi were from GSE102922 (
Rieder et al., 2017
). RNA-seq data sets from
yw
wt and
zld
maternal RNAi were from GSE65837 (
Schulz et al., 2015
). ATAC-seq data from wt and
zld
germline clones were from GSE86966 (
Hannon et al., 2017
). Processed ATAC-seq data identifying differential peaks between wt and
zld
germline mutations were from
Soluri et al., 2020
.
The GAGA factor is required in the early Drosophila embryo not only for transcriptional regulation but also for nuclear division
Development
122
:1113–1124.
Early even-skipped stripes act as morphogenetic gradients at the single cell level to establish engrailed expression
Development
121
:4371–4382.
Dosage compensation in Drosophila: evidence that daughterless and Sex-lethal control X chromosome activity at the blastoderm stage of embryogenesis
Genetics
117
:477–485.
Using HOMER software, NREL’s Micropower Optimization Model, to Explore the Role of Gen-Sets in Small Solar Power Systems; Case Study: Sri Lanka (No: NREL/TP-710-36774)
National Renewable Energy Lab.
Drosophila neurotactin, a surface glycoprotein with homology to serine esterases, is dynamically expressed during embryogenesis
Development
110
:1327–1340.
An improvement of the 2ˆ(-delta delta CT) method for quantitative real-time polymerase chain reaction data analysis
Biostatistics, Bioinformatics and Biomathematics
3
:71–85.
Department of Molecular Biology, Cellular Biology, and Biochemistry, Brown University, Providence, United States
Present address
Department of Animal Science, College of Agriculture and Life Sciences, Cornell University, Ithaca, United States
Contribution
Conceptualization, Data curation, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing
Contributed equally with
Leila Rieder
Competing interests
No competing interests declared
Contribution
Conceptualization, Data curation, Software, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing - review and editing
Contributed equally with
Jingyue Duan
Competing interests
No competing interests declared
Department of Molecular Biology, Cellular Biology, and Biochemistry, Brown University, Providence, United States
Contribution
Validation
Competing interests
No competing interests declared
Department of Molecular Biology, Cellular Biology, and Biochemistry, Brown University, Providence, United States
Contribution
Validation
Competing interests
No competing interests declared
Department of Molecular Pharmacology, Physiology and Biotechnology, Brown University, Providence, United States
Contribution
Supervision, Validation, Visualization
Competing interests
No competing interests declared
Department of Molecular Biology, Cellular Biology, and Biochemistry, Brown University, Providence, United States
Department of Molecular Biology, Princeton University, Princeton, United States
Contribution
Supervision, Validation, Visualization, Writing - review and editing
Competing interests
No competing interests declared
Department of Molecular Biology, Cellular Biology, and Biochemistry, Brown University, Providence, United States
Contribution
Resources, Methodology
Competing interests
No competing interests declared
Department of Molecular Pharmacology, Physiology and Biotechnology, Brown University, Providence, United States
Contribution
Supervision, Funding acquisition, Methodology
Competing interests
No competing interests declared
Department of Molecular Biology, Cellular Biology, and Biochemistry, Brown University, Providence, United States
Contribution
Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
Competing interests
No competing interests declared
The authors thank Dr. Melissa Harrison, Tyler Gibson, and Marissa Gaskill for sending the GAF-dependent region bed file and for helpful discussions. The authors thank members of the Larschan lab for feedback and discussions. This work was supported by NIH Grants F32GM109663, K99HD092625, and R00HD092625 to Dr. Leila Rieder and R35GM126994 to Dr. Erica Larschan, and in part by NSF Grant 1845734 and NIH Grant R01GM118530 to Dr. Nicolas L Fawzi.
Preprint posted: July 15, 2020
Received: April 30, 2021
Accepted: August 2, 2021
Accepted Manuscript published: August 3, 2021 (version 1)
Version of Record published: August 16, 2021 (version 2)