PhD Studentship: Structural and copy number variation analysis using adaptive long read sequencing

University of Nottingham

Long-read sequencing (LRS) is a critical tool for understanding disease and variation in populations of samples ranging from humans to critical food crops. LRS is particularly suited to identifying structural variation because of the information gained within a single long read molecule. However, Oxford Nanopore Technologies sequencers do not have to sequence long reads. Adaptive sampling using direct base calling, first demonstrated in our laboratory in Nottingham, enables dynamic selection of individual molecules during sequencing (Nature Biotech, 2020; Nature Methods, 2016). A sequenced read will span the entire length of the molecule, whereas a rejected read will become a short read. This results in a mixed population of long and short reads being sequenced from the sample. As a consequence, regions for which long reads are generated end up with higher coverage than unwanted regions, enabling enrichment of, say, a set of cancer gene panels. We have previously shown that far more data can be obtained from this type of mixed experiment, including the capture of structural variants (SVs) of clinical importance and changes in copy number throughout genomes of interest. For example, binned read counts can be used to infer copy number variation (CNV) regardless of read length, whilst long reads can be used to resolve complex structural variation. A critical application of this approach is in medicine, where rapid identification of CNVs and SVs can be crucial in aiding the diagnosis of a variety of tumour types and potential disease states. Currently these analyses take weeks or months.
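As a rough illustration of the binned read count idea described above, the following minimal Python sketch counts reads per fixed-width genomic bin using only each read's start coordinate, so a rejected short read contributes the same as a full-length one. It is not the method used in minoTour or in the cited work. The function names, bin size, pseudocount and gain/loss thresholds are all hypothetical, and the sketch ignores the enrichment of targeted regions, GC bias and mappability, all of which a real analysis would need to correct for.

```python
import math
from collections import Counter
from statistics import median


def bin_counts(read_starts, bin_size, n_bins):
    """Count reads per fixed-width bin from start coordinates only.

    Read length is never used, so short (rejected) and long (accepted)
    reads are counted identically.
    """
    counts = Counter(pos // bin_size for pos in read_starts)
    return [counts.get(i, 0) for i in range(n_bins)]


def log2_ratios(counts, pseudocount=0.5):
    """Log2 ratio of each bin count to the median bin count.

    The pseudocount (an illustrative choice) avoids log2(0) in empty bins.
    """
    baseline = median(counts)
    return [math.log2((c + pseudocount) / (baseline + pseudocount)) for c in counts]


def call_bins(ratios, gain=0.4, loss=-0.6):
    """Label bins with hypothetical, uncalibrated thresholds."""
    return ["gain" if r >= gain else "loss" if r <= loss else "neutral" for r in ratios]


# Toy data: five 1 kb bins with 10, 5, 10, 20 and 10 reads respectively.
reads_per_bin = [10, 5, 10, 20, 10]
starts = [b * 1000 + k for b, n in enumerate(reads_per_bin) for k in range(n)]

counts = bin_counts(starts, bin_size=1000, n_bins=5)
calls = call_bins(log2_ratios(counts))
print(counts)
print(calls)
```

In this toy example the bin with half the median count is labelled a loss and the bin with double the median count a gain. In practice the baseline would more likely come from a matched control or a panel of normals than from the sample's own median.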

Here we aim to reduce the total time taken by implementing real-time analysis using our minoTour platform (see Munro bioRxiv 2021a). We have recently demonstrated integration of complex analysis pipelines within minoTour (see Munro bioRxiv 2021b), which allows us to dynamically update adaptive sampling targets (manuscript in prep). We will develop new adaptive algorithms that dynamically update the regions selected for sequencing, based on real-time analysis of the likelihood of an SV/CNV within a given region of the genome (see De Maio bioRxiv 2020). Combined with our existing targeting strategies, we anticipate being able to provide a medical colleague with a single report capturing SNP, SV and CNV data within 72 hours of sample receipt (see Patel bioRxiv 2021).
To test and develop this approach we will work closely with Mike Hubank (Scientific Director, NHS England North London Genomic Laboratory Hub). Alongside medical applications, this approach would bring significant benefits to the study of SVs in non-human populations. We will therefore develop methods to detect potential SVs/CNVs in a model plant population, Arabidopsis arenosa, with Prof Levi Yant. A. arenosa has undergone recent whole genome duplication, and there is evidence of significant structural variation within the population.

This opportunity is open to students who are classed as Home for fees. International students are not eligible for this round of recruitment. Funding is available for four years from late September 2023. The award covers tuition fees at the Home rate (£4,712) plus an annual stipend, set at £18,622 for 2023/24. The stipend rate is set by the Research Councils.

Please contact me directly should you have any questions regarding this matter.
