RESOLVING COMPLEX CASES
OF NGS-BASED HLA TYPING WITH HLA TWIN
FOR RESEARCH USE ONLY
EFI 30th Annual Meeting Omixon Limited
Table of Contents
EXERCISE 1 – OVERVIEW OF A HIGH QUALITY SAMPLE
EXERCISE 2 – INVESTIGATION OF A NOVEL ALLELE
EXERCISE 3 – INVESTIGATION OF A NULL ALLELE
EXERCISE 4 – RESOLUTION OF A CIS/TRANS AMBIGUITY
EXERCISE 5 – INVESTIGATION OF AN UNUSUAL SAMPLE
EXERCISE 6 – INVESTIGATION OF SAMPLES WITH TECHNICAL ERRORS
EXERCISE 7 – INVESTIGATION OF A RARE ALLELE
Resolving Complex Cases of NGS-based HLA Typing with HLA Twin. Copyright © 2016, Omixon Ltd. Confidential & Proprietary
Exercise 1 – Overview of a High Quality Sample
1. Launch Omixon HLA Twin from the laptop provided to you.
   Username: EFI2016
   Password: EFI2016
   The HLA Typing dashboard should look similar to the following screenshot:
Here you can see the HLA Typing dashboard. There are several Sample and Analysis files displayed.
Question 1: Just by looking at this screen, can you identify which sample has the highest quality results? Which one and why?
_____________________________________________________________________________
2. Select the analysis for the sample called OMX6 to view the results. (Hint: there are multiple tabs of results.)
3. Review the following:
   a. Quality control table
   b. Mappability
   c. Fragment Size Distribution
   d. Allele Imbalance
   e. Genome Browser
Question 2: Is there anything out of the ordinary that would require further review?
_____________________________________________________________________________
Exercise 2 – Investigation of a Novel Allele
1. Select the analysis file for sample OMX30. This sample contains a novel allele. (Hint: there is a special icon that indicates the presence of a novel allele.)
Question 1: Which locus indicates a novel allele and which allele is that?
_____________________________________________________________________________
2. Go into the Sample details to review the Quality Control table.
Question 2: What is the overall quality of all of the loci, and specifically of the locus with the novelty?
_____________________________________________________________________________
Question 3: Which Quality Metrics on HLA-B have not passed, and could they be affecting the genotype result?
_____________________________________________________________________________
3. Go to the Data Statistics tab to review the overall quality of the run.
Question 4: Is there anything to cause concern about the quality of the data?
_____________________________________________________________________________
4. Go into the Genome Browser.
Question 5: What are 3 things that you would investigate to determine whether this is a real novel allele?
_____________________________________________________________________________
5. Display only the chromosome with the novel allele.
6. Toggle the reference mask to display only the differences between the consensus sequence and the re-alignment. (Hint: right click anywhere in the genome browser or press CTRL+D.)
Question 6: In which genomic region is the novelty found and at what IMGT position?
_____________________________________________________________________________
Question 7: What is the relative reference and the novel nucleotide identified?
_____________________________________________________________________________
Question 8: What is the consensus coverage at the novelty position?
_____________________________________________________________________________
Question 9: What are the base statistics at the novelty position?
_____________________________________________________________________________
Question 10: How can you ensure that the consensus sequence generation was correct and that no reads were “borrowed” from the other chromosome at that position? (Hint: display both chromosomes together and check for reads that span across neighboring heterozygous positions.)
_____________________________________________________________________________
Question 11: Do you believe this is a true novel allele and would you accept the result?
_____________________________________________________________________________
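As a rough illustration of what the coverage and base statistics in Questions 8 and 9 describe, the sketch below tallies the bases observed at a single position. It is not how HLA Twin computes these values; the function name `base_statistics` and the example pileup column are hypothetical and only show the idea that a well-supported nucleotide should account for the large majority of reads at that position.

```python
from collections import Counter


def base_statistics(pileup_bases):
    """Summarize the bases observed at one alignment position.

    pileup_bases: iterable of single-character base calls (hypothetical input,
    one entry per read covering the position).
    Returns (coverage, counts per base, fraction per base).
    """
    counts = Counter(base.upper() for base in pileup_bases)
    coverage = sum(counts.values())
    # Avoid dividing by zero when no read covers the position.
    fractions = {base: n / coverage for base, n in counts.items()} if coverage else {}
    return coverage, counts, fractions


# Hypothetical column: 96 reads show G, 3 show A, 1 shows T.
column = "G" * 96 + "A" * 3 + "T"
coverage, counts, fractions = base_statistics(column)
print(coverage, counts.most_common(), fractions["G"])
```

A position where the leading base holds only a modest share of the reads would deserve a closer look before the novelty is accepted.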
Exercise 3 – Investigation of a Null Allele
1. Select the analysis file for sample OMX40. This sample contains a null allele.
Question 1: Which locus is identified with a null allele and what is the genotype of this allele?
_____________________________________________________________________________
Question 2: Have both algorithms identified the null allele? How can you confirm this?
_____________________________________________________________________________
2. Click on View Results.
3. Go to sample details and inspect the Quality Control tab.
Question 3: Is there anything to cause concern about the quality of the null allele?
_____________________________________________________________________________
4. Go into the Genome Browser.
Question 4: If you compare the two alleles, is there any difference in the number of exons and introns?
_____________________________________________________________________________
5. Display only the chromosome with the null allele.
Question 5: What would you do to determine what variation causes it to be a null allele?
_____________________________________________________________________________
6. Add HLA-A*24:02:01:01 to the chromosome. (Hint: right click anywhere in the genome browser and click on Add custom allele(s) to 1st chromosome.)
Question 6: What is the IMGT position of the variation that makes the null allele different from the “parent” allele? (Hint: use the reference mask again.)
_____________________________________________________________________________
Question 7: What is the type of this variation (SNP, insertion, deletion)?
_____________________________________________________________________________
7. Inspect the sequence surrounding the difference as well as the short read alignment in that region.
Question 8: What observations do you make regarding the short read alignment when comparing the null allele with HLA-A*24:02:01:01?
_____________________________________________________________________________
_____________________________________________________________________________
Question 9: Would you feel confident that this is a true null allele based on the data and your observations?
_____________________________________________________________________________
Exercise 4 – Resolution of a Cis/Trans Ambiguity
1. Select the initial analysis file for sample OMX50. (Hint: look for the file with the earlier date.)
Question 1: Do all loci have unambiguous results? Which ones do not?
_____________________________________________________________________________
2. Go into the sample details and select the ambiguous locus.
Question 2: What are the possible allele combinations?
_____________________________________________________________________________
Question 3: Have all quality control metrics passed? If not, which one(s) have not?
_____________________________________________________________________________
Question 4: Do you think that the quality control metric warnings are related to this ambiguity?
_____________________________________________________________________________
Question 5: What are the most common causes of cis/trans ambiguities?
_____________________________________________________________________________
3. Select the first allele pair and go into the Genome Browser to review the alignment.
4. Right click anywhere in the Genome Browser and select “Toggle Reference Masked” to hide all homozygous positions.
Question 6: How many phase breaks are there and in what regions?
_____________________________________________________________________________
Question 7: What is the coverage depth at each phase break?
_____________________________________________________________________________
Question 8: What is the distance between the 2 heterozygous positions harboring the phase breaks?
_____________________________________________________________________________
5. Let’s compare the displayed allele combination with the second identified allele combination.
6. Right click anywhere in the Genome Browser and click on “Add allele of other result pair to 1st chromosome”.
7. Repeat for the 2nd chromosome.
8. Now you should be able to see 2 consensus sequence alignments and 4 more re-alignments for 4 different alleles.
Question 9: Why are only the exons displayed for the alleles of the other result pair?
_____________________________________________________________________________
9. For ease, you may display each chromosome separately. (Hint: focus only on exons.)
Question 10: How do the 2 allele candidates for chromosome 1 differ?
_____________________________________________________________________________
Question 11: How do the 2 allele candidates for chromosome 2 differ?
_____________________________________________________________________________
Question 12: What do you think is the reason for these differences?
_____________________________________________________________________________
10. A phase break is likely due to this cis/trans ambiguity.
Question 13: Given the position of the difference between the alleles and the position of the phase breaks, which phase break is most likely to be the cause?
_____________________________________________________________________________
Right underneath the “Phasing track”, there is a “Variants track” that says “straight” and “cross” with numbers. This shows the number of read pairs that connect 2 neighboring heterozygous positions and their relationship regarding the 2 chromosomes. See below for a schematic explanation, from another sample. The light green annotation track reads: Straight: 383/246 Cross: 0/1.
11. In our sample, inspect the 2 heterozygous positions harboring the second phase break. Make sure you are displaying both alleles now.
Question 14: How many read pairs support the “straight” combination and how many the “cross” combination?
_____________________________________________________________________________
Question 15: Based on this observation, which combination would you think is correct?
_____________________________________________________________________________
Another way to try to resolve this is by reanalysing the sample with more reads. In this case, the sample was re-analysed with 10000 reads. (Hint: right click and view protocol on the re-analysed file.)
12. Go back to the HLA Typing Dashboard and select the re-analysed file. (Hint: check for the most recent analysis date.)
13. Go into the Genome Browser and navigate to the region where the second phase break was.
Question 16: Have the phase break and ambiguity been resolved?
_____________________________________________________________________________
Question 17: Based on the information you have gathered from your investigation, what alleles would you assign for this locus?
_____________________________________________________________________________
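The straight/cross counts used in this exercise can be pictured with a small sketch. It assumes, as an interpretation of the track rather than a documented definition, that the two “straight” numbers count read pairs matching chromosome 1 at both positions and chromosome 2 at both positions, while the two “cross” numbers count pairs that mix the chromosomes. The function name `count_phase_support` and the example bases are hypothetical; this is not HLA Twin's implementation.

```python
def count_phase_support(pair_observations, chrom1, chrom2):
    """Count read pairs supporting 'straight' vs 'cross' linkage.

    pair_observations: iterable of (base_at_pos_a, base_at_pos_b), one per
        read pair spanning both heterozygous positions (hypothetical input).
    chrom1, chrom2: (base_at_pos_a, base_at_pos_b) in each consensus sequence.
        The two chromosomes are assumed to differ at both positions.
    """
    straight = [0, 0]  # [matches chrom1 at both, matches chrom2 at both]
    cross = [0, 0]     # [chrom1 at A with chrom2 at B, chrom2 at A with chrom1 at B]
    for a, b in pair_observations:
        if (a, b) == chrom1:
            straight[0] += 1
        elif (a, b) == chrom2:
            straight[1] += 1
        elif (a, b) == (chrom1[0], chrom2[1]):
            cross[0] += 1
        elif (a, b) == (chrom2[0], chrom1[1]):
            cross[1] += 1
        # Any other combination is treated as sequencing noise and ignored.
    return tuple(straight), tuple(cross)


# Hypothetical read pairs reproducing the schematic's "Straight: 383/246 Cross: 0/1".
chrom1, chrom2 = ("A", "C"), ("G", "T")
pairs = [("A", "C")] * 383 + [("G", "T")] * 246 + [("G", "C")]
straight, cross = count_phase_support(pairs, chrom1, chrom2)
print(f"Straight: {straight[0]}/{straight[1]} Cross: {cross[0]}/{cross[1]}")
```

When one combination is backed by hundreds of read pairs and the other by almost none, the phasing is well supported; counts that are close to each other or very low are what leave a phase break unresolved.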
Exercise 5 – Investigation of an Unusual Sample
1. Select the analysis file for sample OMX22. This is an unusual sample.
2. Go into the Sample details to review the Quality Control table.
Question 1: Which loci indicate investigation may be required?
_____________________________________________________________________________
Question 2: Which locus indicates a quality control failure, and which QC metric has failed?
_____________________________________________________________________________
“Exon spot noise ratio” will detect if there is unexpected “noise” in the exons for a particular locus. We will walk through an example of what has been detected here and use other QC metrics to help us pinpoint the cause.
Question 3: What is the color of the “Noise ratio” QC metric?
_____________________________________________________________________________
One of the things that “Noise ratio” will detect is contamination, i.e. two samples have been mixed. If this QC metric is green, we can eliminate contamination as the source of the noise in the exon.
Question 4: How about the “Non-exon spot noise ratio”? Is that also red?
_____________________________________________________________________________
If the “Non-exon spot noise ratio” is green, this indicates that the noise we are dealing with is very localized, possibly even to a single exon.
3. Go to the Genotype tab, select HLA-B and click on the “Best Matches Only” button to turn it off.
Question 5: Are there any other allele pairs that don’t share the same two-field call as the best matching pair?
_____________________________________________________________________________
There is currently no way of detecting what we are looking for in HLA Twin (this feature is ‘coming soon’!). It can however be detected by analyzing the sample in Omixon’s research software, called Omixon HLA Explore.
4. Click on the “Best Matches Only” button again to turn it back on.
5. Go into the Genome Browser for HLA-B, and view both chromosomes simultaneously.
6. Toggle the reference mask (right click, or CTRL+D).
7. Add a “custom allele” to chromosome 1: add HLA-B*35:03:01.
Question 6: Inspect exon 3. At what position is the difference between the HLA-B*35:01:01:01 allele and the HLA-B*35:03:01 allele?
_____________________________________________________________________________
Question 7: What is the coverage for the HLA-B*35:01:01:01 allele and the HLA-B*35:03:01 allele at this position?
_____________________________________________________________________________
Question 8: Are there any other variants nearby?
_____________________________________________________________________________
In this case the “exon spot noise ratio” has detected a third allele present for HLA-B. The sample is triploid at HLA-B (only). The third allele detected is extremely similar to one of the other alleles detected, which is why only the “exon spot noise ratio” has shown a failure. A more diverse third allele would probably cause both the exon and non-exon spot noise QC measures to fail. Contamination would cause the main “noise ratio” QC measure to fail. This triploid allele has been confirmed with other techniques.
At the moment, there is no way of isolating the third allele within Twin (this feature is in development). It can however be seen in the Omixon HLA Explore software, which is how it was isolated for the purpose of this tutorial. Omixon HLA Explore is free of charge for all labs that use Omixon Holotype HLA routinely.
If you see an “exon spot noise ratio” failure in Twin, you should retest the sample with other techniques to confirm a triploid call.
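The reasoning used in this exercise can be summarized as a small decision sketch. It only encodes the rules of thumb stated above (contamination is expected to fail the main noise ratio, a divergent extra allele is expected to fail both spot noise measures, and a very similar extra allele may fail the exon measure alone). The function name `interpret_noise_qc`, its boolean inputs, and the returned labels are hypothetical and are not part of HLA Twin.

```python
def interpret_noise_qc(noise_ratio_ok, exon_spot_ok, non_exon_spot_ok):
    """Map pass/fail states of the three noise QC metrics to a working hypothesis.

    Each argument is True when the metric passed (green) and False when it failed.
    The labels are illustrative wording, not HLA Twin output.
    """
    if not noise_ratio_ok:
        # The main noise ratio is the metric expected to catch mixed samples.
        return "possible contamination"
    if not exon_spot_ok and not non_exon_spot_ok:
        # Noise spread over exons and non-exons suggests a more divergent extra allele.
        return "possible divergent extra allele"
    if not exon_spot_ok:
        # Localized exon noise: an extra allele very similar to a called allele.
        return "possible extra allele similar to a called allele"
    if not non_exon_spot_ok:
        # This combination is not discussed in the exercise.
        return "non-exon noise only (not covered here)"
    return "no noise-related finding"


# The pattern seen at HLA-B in this exercise: only the exon spot noise ratio fails.
print(interpret_noise_qc(noise_ratio_ok=True, exon_spot_ok=False, non_exon_spot_ok=True))
```

Whatever hypothesis the pattern suggests, it remains a prompt for confirmation with other techniques, not a result in itself.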
Exercise 6 – Investigation of Samples with Technical Errors
1. Select the analysis file for sample OMX54.
Question 1: What is the overall quality state of all loci?
_____________________________________________________________________________
2. Go into the sample details to review the Quality Control table.
Question 2: Which QC metrics indicate that inspection is required across all loci?
_____________________________________________________________________________
Question 3: What is the potential relationship between the QC metrics that are yellow? Can one affect the other?
_____________________________________________________________________________
3. Go to the Data Statistics tab to review the Fragment Size distribution and Read Quality graphs.
In Holotype HLA there is a size selection step for fragments between 650 and 1200 bp. This range includes the adaptor and index sequences. The fragment sizes counted on this graph are measured after removal of the adaptors and indexes, as well as QC trimming. Thus, these are the “pure” HLA sequences that are used for the consensus sequence generation, and they range between 300 and 750 bp.
Question 4: What do you think is the importance of using a wide range of fragment sizes for the consensus sequence generation?
_____________________________________________________________________________
_____________________________________________________________________________
Question 5: What is the average fragment size indicated for this sample?
_____________________________________________________________________________
Question 6: Does this value seem to fall within the range of fragment sizes we should expect?
_____________________________________________________________________________
Question 7: What do you think is a potential cause for the observed value here? (Hint: think of the steps in the library preparation.)
_____________________________________________________________________________
4. Inspect the QC metrics for Crossmapping and Ambiguous Layout.
Question 8: Do you think that the above QC metrics are affected by the shorter fragment sizes and why?
_____________________________________________________________________________
Question 9: Which other QC metrics may be affected by the shorter fragments? (Hint: think of the importance of the long fragments.)
_____________________________________________________________________________
Question 10: In this sample, are the genotypes of all loci affected by the shorter fragments? Would you accept the calls and why?
_____________________________________________________________________________
Let’s work through a different sample now.
5. Go back to the HLA Typing dashboard and select the analysis file for sample OMX52. This sample was typed for HLA-A, B, C, DRB1, DPB1, DQA1 and DQB1.
6. Click on View results.
Question 11: Which loci have a genotype result on the HLA Typing Analysis Result dashboard? Are any loci missing and which ones?
_____________________________________________________________________________
7. Go into the sample details to review the Data Statistics tab.
8. Scroll down to the Mappability table. This table shows how many reads were found to map to every locus analysed. The Best Mapped reads are the high quality ones that were used for consensus sequence generation.
Question 12: How many Mapped Reads are there for the locus that did not display a result?
_____________________________________________________________________________
Question 13: What are the possible causes for the lack of data for this locus? (Hint: think of lab-related processes.)
_____________________________________________________________________________
Question 14: How can you pinpoint whether it was a technical issue and at which step in the protocol?
_____________________________________________________________________________
_____________________________________________________________________________
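The fragment size comparison from the first half of this exercise (Questions 5 and 6) can be sketched as a simple range check. The 300 to 750 bp bounds come from the text above; the function name `fragment_size_check` and the example fragment lengths are hypothetical and do not come from sample OMX54.

```python
from statistics import mean

# Range of "pure" HLA fragment sizes quoted in this exercise, after removal of
# adaptors and indexes and after QC trimming.
EXPECTED_MIN_BP = 300
EXPECTED_MAX_BP = 750


def fragment_size_check(fragment_lengths):
    """Return the average fragment length and whether it lies in the expected range."""
    average = mean(fragment_lengths)
    return average, EXPECTED_MIN_BP <= average <= EXPECTED_MAX_BP


# Hypothetical lengths (bp) from a library whose fragments came out too short.
average, within_range = fragment_size_check([180, 220, 250, 310])
print(average, within_range)
```

An average below the lower bound points toward the library preparation steps that determine fragment length, which is the line of thought Question 7 asks for.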
Exercise 7 – Investigation of a Rare Allele
1. Go to the HLA Typing Dashboard.
2. Select the analysis file for sample OMX24. This sample was typed for 5 loci.
Question 1: Which symbol indicates the presence of a rare allele in HLA Twin?
_____________________________________________________________________________
Question 2: Which locus has a rare allele and what is that allele?
_____________________________________________________________________________
3. Go into the Sample Details to review the Quality Control table.
Question 3: What is the overall quality of the locus with the rare allele? Is there anything in particular that may cause concern for the validity of the results?
_____________________________________________________________________________
4. Select the locus in question from the left side and go into the Genome Browser to review the alignment.
Note: The big gap you see in the middle of intron 2 of chromosome 1 is a visualization artifact and is not a concern.
5. Click on the Displayed Allele(s) button to display them separately and toggle to allele 1 only.
Question 4: One of the QC warnings was low coverage on an intron. Can you point out its approximate location by visual inspection?
_____________________________________________________________________________
6. Zoom in on that region and inspect the consensus sequence. There is a continuous repeat here of GT followed by GA, which makes the region more difficult to sequence and then align. Relatively lower coverage is occasionally observed in this region.
Question 5: Does this seem to be affecting the genotype results?
_____________________________________________________________________________
7. Let’s add a closely related common allele to compare and determine whether what we are seeing is correct.
8. Add HLA-DRB1*04:01:01 as a custom candidate.
9. Toggle the reference mask (right click, or CTRL+D) and zoom out completely to display the entire length of the allele.
Question 6: Are there any differences between the 2 alleles and in what region?
_____________________________________________________________________________
Question 7: Relative to the rare allele, how is the coverage of the common allele in that region?
_____________________________________________________________________________
Question 8: And how about the short read alignment? (Hint: click on Display Short Reads to visualize them.)
_____________________________________________________________________________
Question 9: Is there sufficient data supporting the common allele over the rare one to make you concerned that the wrong allele has been identified? Finally, would you assign and accept this rare allele for reporting?
_____________________________________________________________________________
Congratulations! You have finished the prepared material for the workshop. Please feel free to explore HLA Twin and ask the instructors any additional questions. Thank you for attending our workshop.