
A Next Generation Sequencing Panel for DNA Typing of Challenging Samples

Magdalena Bus, Dept. of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Sweden

Challenging samples

Bones/teeth:
• Low Copy Number
• Highly degraded
• Contaminated
• Limited samples

Sample sources:
• Viking Age boat graves
• Vasa warship
• Viking Age mass graves
• Historical persons

Old Uppsala (Sweden), a Viking Age boat grave from the 9th or 10th century.

Why Massively Parallel Sequencing?

Sanger sequencing of mtDNA (SNPs)
• low throughput
• high cost

Fluorescent-based CE-STR typing
• detection of DNA fragment size – SNP variants cannot be detected
• loss of larger size loci

Analyses of mixed DNA samples – challenge!


Why Massively Parallel Sequencing?
• Thousands to millions of DNA targets can be sequenced in parallel
• Simultaneous analysis of multiple loci on autosomes, sex chromosomes, and the entire mtGenome
• Different types of markers: SNPs (single or microhaplotypes), STRs and InDels using the same technology
• Many markers: enough partial data can be obtained even for highly degraded samples

Commercially available NGS kits
• HID-Ion AmpliSeq™ Ancestry Panel – Ion PGM™ System, Ion Torrent
  – 165 autosomal Ancestry Informative Markers (AIMs); amplicon size range 120-130 bp
• ForenSeq™ DNA Signature Prep Kit – Illumina MiSeq
  – 27 global autosomal STRs, 7 X-STRs, 24 Y-STRs, 94 identity SNPs, 22 phenotypic SNPs, biogeographical SNPs; amplicon size range 61-462 bp

The aims:
• Simultaneous analysis of multiple loci in a single panel – SNPs, STRs, InDels
• Analysis of nuclear DNA and mitochondrial DNA in the same panel with correction for copy number differences
• Development of an MPS panel for DNA extracted from historical, limited and highly degraded samples


Panel design

Target capture approach
• Web-based tool for custom design of probe panels
• Not PCR-based targeting – less bias

Design parameters:
- 150 bp paired-end reads
- Illumina MiSeq
- for FFPE samples (degraded DNA) down to 50 bp fragments

www.agilent.com

Description of panels

Panel 1: nuclear DNA markers

Most SNPs from the ALFRED database
• 34-plex SNP for ID
• 52-plex SNP for ID
• 86-plex IISNPs
• 40 X- and Y-SNPs
• 39 eye- and hair color SNPs
• 135 SNPs from HID-Ion AmpliSeq™ panel

The ALlele FREquency Database

http://alfred.med.yale.edu/alfred/snpSets.asp

>300 SNPs

30 InDels

– Individual identification
– Ancestry information
– Eye- and hair color prediction

13 autosomal short STR targets

Panel 2: entire mitochondrial genome

DNA extraction
• Control samples: high quality and quantity of DNA for evaluation
• Aged samples: bones, teeth
• DNA concentration: 0.233 – 4.680 ng/μL

Sequencing (Illumina) MiSeq

Template preparation

1. Digestion of genomic DNA with restriction enzymes

2. HaloPlex Target Enrichment:
• Incorporation of indexes and Illumina seq motifs and gDNA fragment circularization
• Capture target DNA-probe hybrids

4. PCR amplification

[Figure: structure of the amplified product: Primer 1, Seq motif, Target DNA, Seq motif, Index, Primer 2]


Control samples, two HaloPlex panels mixed in different ratios

Sample   mtDNA   nDNA   ROI (plot)
A11      0       1      nDNA
B8       0       1      nDNA
C10      0       1      nDNA
E14      0       1      nDNA
F15      0       1      nDNA
F20      0       1      nDNA
G6       1       10     mtDNA+nDNA
I8       1       5      mtDNA+nDNA
J1       1       3      mtDNA+nDNA
J5       1       0      mtDNA
K20      1       0      mtDNA

Compensate for more abundant mtDNA
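
The mixing ratios in the control-sample table can be turned into the share of each panel in the combined probe mix. This is a minimal sketch, assuming the ratios are simple parts (the slides do not state the unit); the function name `panel_fractions` is hypothetical.

```python
from fractions import Fraction

def panel_fractions(mt_parts, n_parts):
    """Return (mtDNA fraction, nDNA fraction) of the combined probe mix."""
    total = mt_parts + n_parts
    if total == 0:
        raise ValueError("at least one panel must be present")
    return Fraction(mt_parts, total), Fraction(n_parts, total)

# (mtDNA parts, nDNA parts) as listed in the control-sample table.
mixes = {"G6": (1, 10), "I8": (1, 5), "J1": (1, 3), "J5": (1, 0)}

for sample, (mt, n) in mixes.items():
    mt_frac, n_frac = panel_fractions(mt, n)
    print(f"{sample}: mtDNA {float(mt_frac):.1%}, nDNA {float(n_frac):.1%}")
```

Diluting the mtDNA panel (for example 1:10 in sample G6) lowers its share of the mix, which is one way to offset the higher copy number of mtDNA relative to nDNA.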

Coverage mitochondrial DNA (16024-16579)

[Figure: coverage plots per sample]
• K20: mtDNA
• J5: mtDNA
• J1: mtDNA:nDNA 1:3
• I8: mtDNA:nDNA 1:5
• G6: mtDNA:nDNA 1:10

STR analysis

D3S1358

[Figure: sequence readouts with genotypes 16/18, 15/16, 18/18, 15/16 and 16/16; annotations: 4 bp deletion, 8 bp insertion]

• More complex readouts
• Genotyping is possible on the longer reads
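
STR alleles such as 15, 16 and 18 are counts of repeat units, so a sequenced read that spans the whole repeat region can be genotyped by counting units directly. The sketch below is illustrative only: the motifs `TCTA`/`TCTG` are an assumption about the D3S1358 repeat structure, the flanking bases in the example read are invented, and `count_repeat_units` is a hypothetical helper, not the pipeline used in the talk.

```python
import re

def count_repeat_units(read, motifs=("TCTA", "TCTG")):
    """Number of units in the longest uninterrupted run of the given motifs."""
    pattern = re.compile("(?:" + "|".join(motifs) + ")+")
    runs = [m.group(0) for m in pattern.finditer(read)]
    if not runs:
        return 0
    # All motifs are assumed to have the same length (4 bp here).
    return max(len(run) for run in runs) // len(motifs[0])

# Toy read: invented flanks around 1 + 3 + 12 = 16 repeat units.
read = "GGAC" + "TCTA" + "TCTG" * 3 + "TCTA" * 12 + "TGAG"
print(count_repeat_units(read))
```

A read that ends inside the repeat region would undercount the allele, which is why reads long enough to cover both flanks are needed.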


SNP analysis

[Figure: SNP genotypes in six samples: A/G, A/A, A/A, G/G, G/G, A/G]

Multiple SNPs within a target
• Increased information: microhaplotypes
• 6 samples, 5 haplotypes
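
When several SNPs fall within one target, a single read covering all of them reports the alleles in phase, which is what makes a microhaplotype more informative than the SNPs taken one by one. A minimal sketch, assuming reads are already aligned so that the same offset means the same position; the reads, offsets and the `min_reads` threshold are invented for illustration.

```python
from collections import Counter

def call_microhaplotypes(reads, snp_offsets, min_reads=2):
    """Count allele combinations observed on single reads spanning all SNP offsets."""
    counts = Counter()
    for read in reads:
        if len(read) > max(snp_offsets):  # read must cover every SNP position
            counts["".join(read[i] for i in snp_offsets)] += 1
    # Drop combinations seen on too few reads (likely sequencing errors).
    return {hap: n for hap, n in counts.items() if n >= min_reads}

# Toy aligned reads with SNPs at offsets 0 and 4.
reads = ["ACGTA", "ACGTA", "GCGTC", "GCGTC", "GCGTA", "ACG"]
print(call_microhaplotypes(reads, snp_offsets=(0, 4)))
```

In this toy example both SNPs would be called heterozygous (A/G and A/C), and only the per-read view shows that the alleles pair up as `AA` and `GC`.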

Mixture analysis – epithelial cells and sperm cells in an unknown ratio

Epithelial cells (XX) Sperm cells (XY)

Mixture (XX/XY)

Average coverage: 130 reads


HVII in three samples

3 samples, 3 different mtDNA haplotypes

Positions 150/152/153

73 A/G

Challenging Viking-age samples (updated panel with more than 900 nSNPs)

Sample ID   DNA concentration             Maximum coverage mtDNA   # nDNA SNPs   Range of coverage   Average coverage
P           0.233 ng/µL (total 14 ng)     31                       158           10 – 433            42.7
S           2.19 ng/µL (total 131 ng)     395                      99            10 – 236            56.3
I           0.982 ng/µL (total 59 ng)     28                       92            10 – 313            39.3

Required concentration: 5 ng/µL (250 ng in total)
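
To put the input amounts in perspective, each sample's total DNA can be expressed as a share of the required 250 ng. This is simple arithmetic on the figures listed for samples P, S and I; the function name `input_fraction` is hypothetical.

```python
REQUIRED_NG = 250.0  # required total input stated on the slide

# Total DNA per sample in ng, as listed for the Viking-age samples.
samples = {"P": 14.0, "S": 131.0, "I": 59.0}

def input_fraction(total_ng, required_ng=REQUIRED_NG):
    """Share of the required input that was actually available."""
    return total_ng / required_ng

for name, total in samples.items():
    print(f"Sample {name}: {input_fraction(total):.1%} of the required input")
```

Sample P, for instance, had 14 of the required 250 ng, which is about 5.6 % of the recommended input.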

Sample P
• At least 90 nDNA SNPs
• STRs – very low coverage
• Predicted to have had blue eyes, light hair

Damage of single bases: Sanger sequencing vs. MPS of mtDNA


Conclusions
• MPS using HaloPlex and MiSeq is promising for challenging sample analysis
• Test on high quality DNA, 5 ng/µL (225 ng): coverage of > 200 reads for most targets
• Over 90 nDNA SNPs detected for suboptimal input and highly degraded DNA – "only" 10 %, "only" 14 ng
• Nuclear and mitochondrial DNA in the same panel for limited samples – a promising, flexible strategy to save material
• More work on the methodology is needed to get higher coverage
