Isoform sequencing PacBio RSII
Anna Bratus PacBio User Meeting, Barcelona, November 10, 2015
SCHEDULE
I. CASE STUDY
II. LIBRARY PREPARATION
III. SEQUENCING
IV. DATA OUTCOME
V. CONCLUSIONS
··· 2
CASE STUDY
E. coli diarrhea (ETEC, enterotoxigenic E. coli) in the pig
How does a pig get diarrhea?
··· 3
CASE STUDY
E. coli diarrhea of the pig: Phenotyping
• Microscopic E. coli adhesion test
  – Obtain enterocytes from slaughtered pigs
  – Add F4 bacteria
  – Bacteria adhere to the brush border of enterocytes from pigs with a susceptible phenotype
  – Bacteria do not adhere to the brush border of enterocytes from resistant pigs
[Figure: micrograph labeled enterocytes, brush borders, bacteria. Python, 2003]
··· 4
New boundaries of the F4bcR locus
[Figure: map of SSC13 (Sscrofa 10.2 assembly), 131 to 188 Mbp, with the 143 to 148 Mbp region expanded. Positions in Mbp.]
Markers: SW207 132.5, MUC4GT 143.8, S0283 145.0, S0075 146.6, SW1876 150.6, SW698 186.5
Genes: ZDHHC19 143.3, TNK2 143.6, MUC4 143.8, KIAA0226 143.9, LRCH3 144.1, LMLN 144.2, ZNF148 144.5, SLC12A8 144.8, HEG1 144.9, MUC13 145.0, ITGB5 145.1, KALRN 145.4, MYLK 146.2
··· 5
CASE STUDY
Muc13
[Figure: panels from Zhang, 2008 and Ren, 2012; size label 3-5 kb]
··· 6
CASE STUDY
Facts and questions:
1. DNA sequences of Muc13A and Muc13B cloned in BACs showed differences in tandem repeat (TR) regions: PacBio sequencing
2. Are those TRs transcribed?
3. Are there any other candidates in the critical region on pig chromosome 13?
4. Are there any other E. coli receptor candidates in the pig genome?
5. Are we able to improve pig reference genome annotation?
··· 7
LIBRARY PREPARATION
[Figure: workflow diagram, steps 1 to 9]
“Isoform Sequencing (Iso-Seq™) Using the Clontech® SMARTer® PCR cDNA Synthesis Kit and BluePippin™ Size-Selection System”
· · · 8
LIBRARY PREPARATION
STEP 1: INPUT MATERIAL
• SAMPLE SOURCE: PIG EPITHELIAL CELLS ISOLATED FROM SMALL INTESTINE
• NO. OF SAMPLES IN EXPERIMENT: 2 (FROM RESISTANT AND SUSCEPTIBLE ANIMALS)
• SAMPLE QUANTITY: 1 µg OF TOTAL RNA
• SAMPLE QUALITY: RINs 9.0 AND 9.2
··· 9
LIBRARY PREPARATION
STEP 2: GENERATION OF FULL-LENGTH cDNA
Clontech® SMARTer® PCR cDNA Synthesis Kit
Single step
· · · 10
LIBRARY PREPARATION
STEP 3: PCR CYCLE OPTIMIZATION
[Figure: gel image of SAMPLE 1 and SAMPLE 2, each at CYCLE NUMBER 12, 14, 16, 18, 20]
· · · 11
LIBRARY PREPARATION
STEP 3: PCR CYCLE OPTIMIZATION
[Figure: SAMPLE 1]
· · · 12
LIBRARY PREPARATION
STEP 4: LARGE SCALE PCR FOR SIZE SELECTION ON THE BluePippin™ SYSTEM (KAPA HiFi)
[Figure: panels for SAMPLE 1 and SAMPLE 2]
· · · 13
LIBRARY PREPARATION
STEP 5: SIZE SELECTION ON THE BluePippin™ SYSTEM
cDNA FRACTIONS:
I. 1-2 kb
II. 2-3 kb
III. 3-6 kb
IV. 5-10 kb
[Figure: lanes I, II, III, IV]
· · · 14
LIBRARY PREPARATION
STEP 6: LARGE SCALE PCR FOR SMRTbell™ LIBRARY PREPARATION (KAPA HiFi)
cDNA FRACTIONS:
I. 1-2 kb
II. 2-3 kb
III. 3-6 kb
IV. 5-10 kb
[Figure: lanes I, II, III, IV]
· · · 15
LIBRARY PREPARATION
STEP 7: cDNA SMRTbell™ TEMPLATE PREPARATION
cDNA FRACTIONS:
I. 1-2 kb
II. 2-3 kb
III. 3-6 kb
IV. 5-10 kb
[Figure: lanes I, II, III, IV]
· · · 16
LIBRARY PREPARATION
STEP 8: OPTIONAL 2nd BluePippin™ SIZE SELECTION FOR LONG INSERT LIBRARIES
cDNA fraction IV: BEFORE size-selection
cDNA fraction IV: AFTER size-selection
· · · 17
SEQUENCING
cDNA fraction | Sample | No of SMRT cells/fraction | No of reads (*1000)/cell | Mean polymerase read length (*1000) | Throughput (Gbp)
1-2 kb        | 1      | 1                         | 92.7                     | 16.7                                | 1.55
1-2 kb        | 2      | 1                         | 89.9                     | 17.1                                | 1.5
2-3 kb        | 1      | 1                         | 97.6                     | 16.5                                | 1.6
2-3 kb        | 2      | 1                         | 92.9                     | 16.7                                | 1.5
3-6 kb        | 1      | 2                         | 92.1                     | 16.0                                | 2.9
3-6 kb        | 2      | 2                         | 92.4                     | 15.5                                | 2.8
5-10 kb       | 1      | 2                         | 81.3                     | 12.3                                | 2.0
5-10 kb       | 2      | 4                         | 51.1                     | 11.3                                | 2.3
CHEMISTRY: P6/C4, 1 x 240 min movie
· · · 18
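The throughput figures in the sequencing table are consistent with a simple product: number of SMRT cells x reads per cell x mean polymerase read length. A minimal Python sketch of that arithmetic follows, using the table values; the function name `throughput_gbp` and the `RUNS` list are illustrative only and are not part of any PacBio tool.

```python
# Rows from the sequencing table:
# (fraction, sample, SMRT cells, reads per cell in thousands,
#  mean polymerase read length in kb, reported throughput in Gbp)
RUNS = [
    ("1-2 kb", 1, 1, 92.7, 16.7, 1.55),
    ("1-2 kb", 2, 1, 89.9, 17.1, 1.5),
    ("2-3 kb", 1, 1, 97.6, 16.5, 1.6),
    ("2-3 kb", 2, 1, 92.9, 16.7, 1.5),
    ("3-6 kb", 1, 2, 92.1, 16.0, 2.9),
    ("3-6 kb", 2, 2, 92.4, 15.5, 2.8),
    ("5-10 kb", 1, 2, 81.3, 12.3, 2.0),
    ("5-10 kb", 2, 4, 51.1, 11.3, 2.3),
]


def throughput_gbp(cells, kreads_per_cell, mean_len_kb):
    """Estimated throughput in Gbp: cells x reads x mean read length."""
    bases = cells * (kreads_per_cell * 1e3) * (mean_len_kb * 1e3)
    return bases / 1e9


for fraction, sample, cells, kreads, length_kb, reported in RUNS:
    estimate = throughput_gbp(cells, kreads, length_kb)
    print(f"{fraction:8s} sample {sample}: "
          f"estimated {estimate:.2f} Gbp, reported {reported} Gbp")
```

The estimates agree with the reported values to within rounding, which suggests the reported throughput is the total over all SMRT cells of a fraction rather than a per-cell value.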
SEQUENCING
SMRTbell libraries
[Figure: panels for Fraction III, NON SIZE SELECTED and Fraction IV, SIZE SELECTED]
· · · 19
DATA OUTPUT
“Iso-Seq™: Full-Length Transcript Analysis Using SMRT® Analysis V2.3”, Philip Lobb MSc, 18th March 2015
· · · 20
DATA OUTPUT
Read classification
Reads are grouped as full-length non-chimeric (flnc) and non-full-length (nfl)
cDNA fraction | Sample | No of reads of insert (*1000) | No of flnc reads (*1000) | Flnc reads (%) | Average flnc length (*1000)
1-2 kb        | 1      | 89.0                          | 43.4                     | 48             | 1.2
1-2 kb        | 2      | 86.2                          | 43.4                     | 50             | 1.2
2-3 kb        | 1      | 93.2                          | 42.1                     | 45             | 2.3
2-3 kb        | 2      | 88.3                          | 39.2                     | 44             | 2.3
3-6 kb        | 1      | 171.0                         | 55.8                     | 32             | 2.9
3-6 kb        | 2      | 170.4                         | 54.1                     | 31             | 3.1
5-10 kb       | 1      | 158.9                         | 45.1                     | 28             | 5.7
5-10 kb       | 2      | 135.5                         | 40.8                     | 30             | 5.9
· · · 21
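The flnc percentage in the read classification table is the ratio of flnc reads to reads of insert. The short Python sketch below recomputes it from the table values; `flnc_percent` and `ROWS` are hypothetical names chosen for illustration. The slide values appear to be truncated to whole percent rather than rounded, which is an observation from the numbers and not something the slides state.

```python
# Rows from the read classification table:
# (fraction, sample, reads of insert in thousands,
#  flnc reads in thousands, reported flnc %)
ROWS = [
    ("1-2 kb", 1, 89.0, 43.4, 48),
    ("1-2 kb", 2, 86.2, 43.4, 50),
    ("2-3 kb", 1, 93.2, 42.1, 45),
    ("2-3 kb", 2, 88.3, 39.2, 44),
    ("3-6 kb", 1, 171.0, 55.8, 32),
    ("3-6 kb", 2, 170.4, 54.1, 31),
    ("5-10 kb", 1, 158.9, 45.1, 28),
    ("5-10 kb", 2, 135.5, 40.8, 30),
]


def flnc_percent(reads_of_insert_k, flnc_k):
    """Share of reads of insert classified as full-length non-chimeric."""
    return 100.0 * flnc_k / reads_of_insert_k


for fraction, sample, roi_k, flnc_k, reported in ROWS:
    pct = flnc_percent(roi_k, flnc_k)
    print(f"{fraction:8s} sample {sample}: "
          f"computed {pct:.1f}%, reported {reported}%")
```

The falling flnc share in the longer fractions is visible directly in the computed values: roughly half of the reads of insert in the 1-2 kb fraction are flnc, against under a third in the 5-10 kb fraction.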
DATA OUTPUT
Reads clustering, consensus calling and quality filtering
Sample | No of consensus isoforms (*1000) | Average consensus isoforms read length | No of polished high-quality isoforms (*1000) | No of polished low-quality isoforms (*1000)
1      | 106.7                            | 3.3                                    | 38.4                                         | 68.2
2      | 110.2                            | 3.6                                    | 35.6                                         | 74.5
High-quality clusters: ≥99% accuracy after Quiver polishing
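Polishing splits the consensus isoforms into a high-quality and a low-quality set, so the two counts should add up to the consensus total, within rounding of the reported values. A minimal Python sketch of that check and of the high-quality share is given here; the names `CLUSTERS` and `hq_share` are illustrative assumptions, not part of SMRT Analysis.

```python
# Values from the clustering table, all in thousands of isoforms:
# sample -> (consensus isoforms, polished high-quality, polished low-quality)
CLUSTERS = {
    1: (106.7, 38.4, 68.2),
    2: (110.2, 35.6, 74.5),
}


def hq_share(total_k, hq_k):
    """Percentage of consensus isoforms that passed as high-quality."""
    return 100.0 * hq_k / total_k


for sample, (total_k, hq_k, lq_k) in CLUSTERS.items():
    # The table values are rounded to 0.1 thousand, so allow a small gap.
    gap = abs(total_k - (hq_k + lq_k))
    print(f"sample {sample}: hq + lq = {hq_k + lq_k:.1f}k "
          f"(total {total_k}k, gap {gap:.1f}k), "
          f"hq share {hq_share(total_k, hq_k):.1f}%")
```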
· · · 22
DATA OUTPUT
Coverage of the transcriptomes
All annotated junctions in the transcriptomes were covered by the reads. Coverage by the hq isoforms is close to reaching a plateau.
[Figure: SAMPLE 1, curves for reads and hq clusters]
· · · 23
Conclusions
1. DNA sequences of Muc13A and Muc13B cloned in BACs showed differences in tandem repeat (TR) regions: PacBio sequencing
2. Are those TRs transcribed? – Yes.
3. Are there any other candidates in the critical region on pig chromosome 13?
4. Are there any other E. coli receptor candidates in the pig genome?
Phenotype-specific (res. vs. sus.) transcripts were detected; they will be confirmed with deep sequencing data.
5. Are we able to improve pig reference genome annotation? – Yes, novel junctions detected.
· · · 24
ACKNOWLEDGEMENTS:
Weihong Qi, FGCZ
Andrea Patrignani, FGCZ
Stefan Neuenschwander, ETHZ
· · · 25