Project properties

Title Improving the quality of metagenome assemblies.
Group Systems and Synthetic Biology
Project type thesis
Credits 36
Supervisor(s) Dr Jasper Koehorst; Mike Loomans
Examiner(s) Dr Jasper Koehorst; Assoc. Prof. Dr Peter Schaap
Contact info rober1.smith@wur.nl
Begin date 2024/04/18
End date
Description To reconstruct genomes from sequencing reads, assemblers combine the reads into genomes, usually by one of two methods. De Bruijn graph (k-mer-based) methods, such as the SPAdes assembler, are good at accounting for technical sequencing errors and are regarded as the de facto standard for assembly. Alternatively, reads can be assembled using Overlap-Layout-Consensus (OLC) algorithms. These methods work well with reads that contain few technical sequencing errors and generally separate genomes well, but they are outperformed by de Bruijn graph assemblers in most real-world scenarios [1]. The problem we want to tackle stems from the relaxed approach of de Bruijn graph assemblers, which may inadvertently merge the genomes of two or more different strains into a single genome during metagenomic assembly. We want to separate the genomes in these edge cases using OLC assemblers and parameter fine-tuning.
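
The strain-merging problem can be illustrated with a minimal sketch. It is a toy example under stated assumptions, not the project's pipeline and not a description of how SPAdes works internally: the two sequences are hypothetical "strains" that differ by a single base, the function names build_de_bruijn and branching_nodes are made up for illustration, and error-free full-length sequences stand in for reads. In a de Bruijn graph, both strains share most k-mers, so they collapse into one graph with a single point of divergence instead of remaining two separate genomes.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in any read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor, i.e. where paths diverge."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Hypothetical toy sequences: two "strains" that differ by a single base.
strain_a = "ATGGCGTACGTTAGC"
strain_b = "ATGGCGTCCGTTAGC"

# Assumption: the full sequences are used as error-free "reads" for simplicity.
graph = build_de_bruijn([strain_a, strain_b], k=5)
branches = branching_nodes(graph)
print(branches)
```

With these inputs the only branching node is GCGT, directly before the differing base. Everything before and after that position is shared by both strains, so an assembler that resolves the branch by picking one path would report a single consensus sequence rather than two strain genomes.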

References:
[1] Espinosa, E., Bautista, R., Fernandez, I., Larrosa, R., Zapata, E. L., & Plata, O. (2023). Comparing assembly strategies for third-generation sequencing technologies across different genomes. Genomics, 115(5), 110700. https://doi.org/10.1016/j.ygeno.2023.110700
Used skills Python or R; Common Workflow Language
Requirements Basic bioinformatics knowledge (SSB20306 or similar)