Satellite DNA genomic structure and heterochromatin organization
Amanda Larracuente
Department of Biology
Satellite DNA
• Chromosome segregation
• Heterochromatin formation
• Nuclear organization
http://www.chrombios.com
Satellite DNA: misregulation
• Genomic instability and cancer (Zhu et al. 2011, Ting et al. 2011, Shamas 2011)
• Senescence and aging (e.g. Swanson et al. 2013)
• Chromosome mis-segregation (Rosic et al. 2014)
Satellite DNA: rapid evolution
• Rapid turnover between species
• Genetic incompatibilities
Ferree and Prasad 2011
Why don’t we know more about satellite DNA?
• Low recombination
• Difficult genomics
  1. Underrepresented among Sanger reads
  2. Assembly difficult/impossible
• Mutation dynamics not well understood
Outline
I. Heterochromatin organization
II. Detailed satDNA structure
Drosophila genomes
[Figure: Drosophila phylogeny with the simulans clade marked; time labels 2 Mya and 0.24 Mya. Photo: A. Karwath]
PacBio coverage labels: P5C3 ~95X (Kim et al. 2014); P6C4 ~115X, ~120X, ~85X
simulans clade assemblies
I. Canu + Hybrid DBG2OLC -> quickmerge -> Merged-1
II. Merged-1 + MHAP -> quickmerge -> Merged-2
III. Quiver x2, Pilon -> Final
MHAP: Berlin et al. 2015; Canu: Koren et al. 2016
Chakraborty et al. 2016 NAR
Mahul Chakraborty, Ching-Ho Chang
[Figure: assembly contiguity by contig/scaffold rank (x-axis, 0 to 150). Data source: Genbank contig, Genbank scaffold, PacBio contig, Illumina contig]
Mahul Chakraborty
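A rank plot of this kind is commonly built by sorting contigs from longest to shortest and tracking their lengths against rank. The sketch below illustrates that idea only; it makes no claim about how the original figure was produced. The contig lengths are invented, the function names are hypothetical, and N50 is included as a widely used contiguity summary that the slide itself does not name.

```python
from itertools import accumulate


def rank_curve(contig_lengths):
    """Return (rank, length, cumulative length) tuples, longest contig first."""
    ordered = sorted(contig_lengths, reverse=True)
    return [
        (rank, length, total)
        for rank, (length, total) in enumerate(
            zip(ordered, accumulate(ordered)), start=1
        )
    ]


def n50(contig_lengths):
    """Length of the contig at which the running total first reaches half the assembly size."""
    if not contig_lengths:
        raise ValueError("empty assembly")
    half = sum(contig_lengths) / 2
    for _, length, total in rank_curve(contig_lengths):
        if total >= half:
            return length


# Hypothetical contig lengths (bp), not taken from the talk.
toy_assembly = [40, 10, 30, 20]
curve = rank_curve(toy_assembly)
assembly_n50 = n50(toy_assembly)
```

A more contiguous assembly reaches a given cumulative length at a lower rank, which is why curves for different data sources can be compared on the same axes.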
Genome organization
Few structural rearrangements
[Figure: comparison of Dmau, Dsim, and Dmel for chromosome arms X, 2L, 2R, 3L, 3R, and 4]
Ching-Ho Chang, Emerson Khost
Heterochromatin organization
Pericentric gene-containing regions rearranged
[Figure: comparison of Dmau, Dsim, and Dmel for chromosome arms X, 2L, 2R, 3L, 3R, and 4]
Ching-Ho Chang, Emerson Khost
Heterochromatin organization
Cytogenetic maps: FISH
[Figure: FISH-based cytogenetic maps of chromosomes X, Y, 2, 3, and 4 in D. melanogaster, D. sechellia, D. simulans, and D. mauritiana, shown against a time axis (Mya); labeled probes include Rsp and AAGAG]
Complex satellite DNA families
• Rsp
• Rsp-like
• 1.688
Ching-Ho Chang, Emerson Khost
Underrepresented heterochromatin
Dmau, sequenced males
Expected: ~100X for autosomes, ~50X for X, ~50X for Y
Observed: ~100X for autosomes, ~50X for X, ~31X for Y, ~29X for U
*P < 10^-16, MWU
[Figure: histograms of coverage (x-axis, 0 to 150) against count (y-axis, 0 to 600) by region: A, X, Y, U]
Ching-Ho Chang
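The expected depths on this slide follow from copy number in a male: two copies of each autosome but one X and one Y, so X and Y should sit at about half the autosomal depth. A minimal sketch of that arithmetic, using the slide's rounded values; the function and variable names are hypothetical, and region U is left out because the slide states no expected depth for it.

```python
# Copies of each region per male genome (assumption: XY males, diploid autosomes).
COPIES_IN_MALE = {"A": 2, "X": 1, "Y": 1}


def expected_depth(autosomal_depth, copies=COPIES_IN_MALE):
    """Scale the autosomal depth by each region's copy number relative to autosomes."""
    per_copy = autosomal_depth / copies["A"]
    return {region: per_copy * n for region, n in copies.items()}


def recovery_ratio(observed, expected):
    """Observed depth as a fraction of expected depth, per region."""
    return {r: observed[r] / expected[r] for r in observed if r in expected}


expected = expected_depth(100)              # slide: ~100X for autosomes
observed = {"A": 100, "X": 50, "Y": 31}     # slide: observed depths
ratios = recovery_ratio(observed, expected)
```

With these inputs the Y chromosome is recovered at roughly 0.62 of its expected depth, while autosomes and X are recovered at about 1.0.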
Outline
I. Heterochromatin organization
– Genes in pericentric regions reorganized
– satDNA reorganization
– Biased heterochromatic read recovery
II. Detailed satDNA structure
Responder (Rsp) satellite
D. melanogaster (Larracuente 2014)
Dimeric structure: Left (120 bp) and Right (120 bp) repeats (Wu et al. 1988)
Dynamic evolution
[Figure: Rsp chromosomal locations; labels: Ch 2; Ch 2, Ch 3; Ch X; Lost?]
Larracuente 2014
Rapid evolution of satellite DNA
• Natural selection?
• Neutral? Unequal crossing-over and gene conversion (expansion and contraction)
Smith 1976; Dover 1982; Charlesworth et al. 1994; Stephan 1986
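Expansion and contraction by unequal crossing-over can be illustrated with a toy model: when two arrays of equal size pair out of register by some number of repeat units, an exchange yields one product that gains those units and one that loses them. The sketch below is a deliberately simplified illustration, not a model from the cited papers; the `max_shift` parameter, the uniform choice of misalignment, and the function names are all assumptions.

```python
import random


def unequal_exchange(copies, max_shift, rng):
    """Exchange between two misaligned arrays of `copies` repeats.

    Returns (expanded, contracted); the total number of repeats is conserved.
    """
    shift = rng.randint(0, min(max_shift, copies))  # misalignment in repeat units
    return copies + shift, copies - shift


def simulate_lineage(start_copies, generations, max_shift=10, seed=0):
    """Follow one array through repeated exchanges, keeping a random product each time."""
    rng = random.Random(seed)
    copies = start_copies
    history = [copies]
    for _ in range(generations):
        copies = rng.choice(unequal_exchange(copies, max_shift, rng))
        history.append(copies)
    return history


history = simulate_lineage(start_copies=1000, generations=50)
```

Even without selection, the copy number in such a lineage drifts up and down over time, which is the neutral expectation the slide contrasts with natural selection.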
Satellite DNA assembly
D. melanogaster, ~95X PacBio (Kim et al. 2014)
Ø PBcR-BLASR (Phillippy, Koren): Cel8.1
Ø MHAP (Berlin et al. 2015): Cel8.2, Cel8.3
Ø Canu (Koren et al. 2016)
• Run over a grid of parameter values
• Quiver + Pilon
Photo: A. Karwath
Detailed organization of Rsp satellite
Concerted evolution: Unequal exchange and gene conversion
[Figure: counts (y-axis, 0 to 30) of repeat elements by position (x-axis, 0 to 300 kb) across the Rsp satellite locus; labeled features include (AAGAG)n. Legend: Bari1, Helitron, Jockey, LTR Retrotransposon, Mariner.Tc1, Non-LTR Retrotransposon, Rsp Left, Rsp Right, Rsp Trunc, Rsp Variant, Simple repeat, Transib]
Khost, Eickbush, Larracuente BioRxiv 2016
Summary
– Heterochromatin organization
– Genomics of satellite structure
– Detailed evolutionary history of satellites
Evolutionary dynamics within populations
Sequence diversity and abundance variation across populations
[Figure: two panels of count by population: Beijing, Ithaca, Netherlands, Raleigh, Tasmania, Zimbabwe]
Data: DGRP & GDL Illumina genomes from Mackay et al. 2012, Grenier et al. 2015
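Satellite abundance in short-read data is often summarized as a copy-number estimate normalized by the depth of single-copy sequence. The talk does not state how its counts were derived, so the sketch below is one generic approach under stated assumptions: the function name and the input values are hypothetical, and the 120 bp monomer length is borrowed from the Rsp repeat size given earlier in the deck.

```python
def estimate_copy_number(satellite_bases, monomer_length, single_copy_depth):
    """Estimate repeat copies per genome from read depth.

    satellite_bases:   total sequenced bases assigned to the satellite
    monomer_length:    length of one repeat unit in bp
    single_copy_depth: mean depth over single-copy sequence in the same library
    """
    if monomer_length <= 0 or single_copy_depth <= 0:
        raise ValueError("monomer_length and single_copy_depth must be positive")
    monomers_sequenced = satellite_bases / monomer_length
    return monomers_sequenced / single_copy_depth


# Hypothetical library: 3.6 Mb of satellite-derived bases at 30X single-copy depth.
copies = estimate_copy_number(
    satellite_bases=3_600_000, monomer_length=120, single_copy_depth=30
)
```

Normalizing by single-copy depth makes estimates comparable across lines sequenced to different depths, although it cannot correct for the biased recovery of heterochromatic reads noted earlier in the talk.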
Summary
– Heterochromatin organization
– Genomics of satellite structure
– Detailed evolutionary history of satellites
– Functional genomics
Satellite expression and regulation
[Figure: ovary small RNAs on the + strand and - strand of the Rsp satellite. Data from: Pane et al. 2011]
Emerson Khost
Summary
– Heterochromatin organization
– Genomics of satellite structure
– Detailed evolutionary history of satellites
– Functional genomics
– Experimental manipulation
Acknowledgments
sim clade genomes: Ching-Ho Chang (U. Rochester), Mahul Chakraborty (UC Irvine), J.J. Emerson (UC Irvine), Kristi Montooth (U Nebraska), Colin Meiklejohn (U Nebraska)
Complex satellites: Emerson Khost (U. Rochester), Danna Eickbush (U. Rochester)
UR Center for Integrated Research Computing
Jeffrey Vedenayagam (NYU)
Funding
Join our group! Positions open in evolutionary genomics
Ø NIH-funded postdoc
Ø Graduate students
http://blogs.rochester.edu/Larracuente
260-bp satellite organization
[Figure: counts (y-axis, 0 to 15) of repeat elements by position (x-axis, 600 to 900 kb). Legend: 1.688 family satellite, Helitron, Jockey, LTR Retrotransposon, Loa, Mariner.Tc1, Non-LTR Retrotransposon, Simple repeat]
Khost, Eickbush, Larracuente BioRxiv 2016
Y chromosome
40 Mb
D. melanogaster reference: ~4 Mb
D. simulans: ~30 Mb
Ching-Ho Chang
Rapid evolution of satellite DNA: Neutral?
• Intrinsic mutational properties
• Unequal crossing-over and gene conversion (expansion and contraction)
Smith 1976; Dover 1982; Charlesworth et al. 1994; Stephan 1986
Rapid evolution of satellite DNA: Selection?
• Intragenomic conflict over germline transmission
• Target of male drive: 95 % (Sandler et al. 1959)
• Female meiotic drive (centromere drive): >50 % vs. <50 %
Walker 1971; Henikoff et al. 2001; Malik and Henikoff 2001
What can we learn from this assembly?
• Variation
• Recombination: Sequence diversity and abundance
What contributes to differences between individuals?
Unequal exchange in array center: expansion and contraction
[Figure: copy # (y-axis, 500 to 3000) for lines Ral_208, Ral_380, Ral_379, Ral_391, Ral_362, Ral_313, Ral_350, Ral_427, Ral_399, Ral_358, Ral_40, Ral_437, Ral_375, Ral_357]
[Figure: counts (y-axis, 0 to 30) of repeat elements by position (x-axis, 0 to 300 kb) across the Rsp locus; labeled features include (AAGAG)n. Legend: Bari1, Helitron, Jockey, LTR Retrotransposon, Mariner.Tc1, Non-LTR Retrotransposon, Rsp Left, Rsp Right, Rsp Trunc, Rsp Variant, Simple repeat, Transib]
D. melanogaster complex satellites
Rsp, 1.688
Emerson Khost
Unusual locus composition in D. melanogaster
D. sechellia: Psec = 0.15
D. simulans: Psim = 0.18
D. mauritiana: Pmau = 0.45
Larracuente 2014; Anthony Geneva; methods: Blomberg et al. 2003
Rsp abundance variation in D. melanogaster
[Figure: Rsp copy # (y-axis, 500 to 3000) for lines Ral_208, Ral_380, Ral_379, Ral_391, Ral_362, Ral_313, Ral_350, Ral_427, Ral_399, Ral_358, Ral_40, Ral_437, Ral_375, Ral_357; R=0.93]
Genome data from Mackay et al. 2012
Copy number variation in global populations: Rsp