News from this website: On April 5, the authoritative international journal Nature Communications published online "De novo diploid genome assembly using long noisy reads", the latest collaborative work by the team of Professor Wang Jianxin of the School of Computer Science at Central South University, Professor Luo Feng of Clemson University in the United States, and Associate Researcher Xiao Chuanle of the Zhongshan Ophthalmic Center, Sun Yat-sen University. The paper proposes a new diploid assembly method based on third-generation sequencing data and presents the corresponding software, PECAT. Nie Fan and Ni Peng of the School of Computer Science, Central South University, are co-first authors of the paper, Professor Wang Jianxin is a co-corresponding author, and Central South University is the first affiliation. The research was supported by the National Key Research and Development Program, the National Natural Science Foundation of China, the Xiangjiang Laboratory open-competition project, and other projects.
Rapid advances in third-generation sequencing technologies (Oxford Nanopore sequencing and PacBio single-molecule real-time sequencing) produce longer and more accurate reads, bringing new opportunities and challenges to genome assembly research. For diploid assembly, however, third-generation reads still contain a relatively high rate of sequencing errors, and assembly algorithms have difficulty distinguishing sequencing errors from genuine haplotype differences. The result is an assembly in which the haplotypes are mixed, which contains a large number of haplotype switch errors and loses a large portion of the genetic information.
To address this limitation, Professor Wang Jianxin's team and collaborators, in the paper published in Nature Communications, analyzed in depth how sequencing errors differ from haplotype differences in third-generation long reads and proposed a long-read error correction algorithm that preserves haplotype difference information. The algorithm prevents haplotype differences from being removed as sequencing errors and keeps that information consistent: the haplotype consistency of the corrected reads can reach 99.4%. On this basis, the team designed a diploid assembly algorithm based on local haplotype clustering. The first round of assembly produces a haplotype-mixed assembly. In the second round, reads are aligned to the mixed assembly, the single nucleotide polymorphisms (SNPs) carried by each read are identified, overlaps between reads from inconsistent haplotypes are detected through local clustering, and those inconsistent overlaps are filtered out before assembling again to obtain a haplotype-resolved assembly. On multiple test datasets, the PECAT method proposed in the paper produced more contiguous haplotype assemblies. On Nanopore R9 bull data, PECAT achieved a nearly haplotype-resolved assembly. On Nanopore R10 data from the human HG002 sample, PECAT achieved an assembly with a haplotype contiguity metric (phase block NG50) of 59.4/58.0 Mb.
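The overlap-filtering idea in the second round can be illustrated with a minimal sketch. This is not PECAT's implementation: it replaces the paper's local clustering with a simple pairwise comparison of SNP alleles, and the data layout, the function names (`haplotype_consistent`, `filter_overlaps`), and the thresholds (`min_shared`, `max_mismatch_rate`) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: for each read, the allele observed at each SNP
# position on the mixed (first-round) assembly.
ReadAlleles = Dict[int, str]


def haplotype_consistent(a: ReadAlleles, b: ReadAlleles,
                         min_shared: int = 2,
                         max_mismatch_rate: float = 0.2) -> bool:
    """Return True if two reads plausibly come from the same haplotype.

    Simplified pairwise test (an assumption, not the paper's clustering):
    compare alleles at SNP positions covered by both reads.
    """
    shared = a.keys() & b.keys()
    if len(shared) < min_shared:
        # Too little evidence to call the overlap inconsistent; keep it.
        return True
    mismatches = sum(1 for pos in shared if a[pos] != b[pos])
    return mismatches / len(shared) <= max_mismatch_rate


def filter_overlaps(overlaps: List[Tuple[str, str]],
                    alleles: Dict[str, ReadAlleles]) -> List[Tuple[str, str]]:
    """Keep only overlaps whose two reads agree at shared SNP sites."""
    return [
        (r1, r2) for r1, r2 in overlaps
        if haplotype_consistent(alleles.get(r1, {}), alleles.get(r2, {}))
    ]


# Toy example: r1 and r2 share a haplotype, r3 carries the other one,
# and r4 shares no SNP sites with r1.
alleles = {
    "r1": {100: "A", 200: "C", 300: "G"},
    "r2": {100: "A", 200: "C", 300: "G"},
    "r3": {100: "T", 200: "G", 300: "G"},
    "r4": {900: "A"},
}
overlaps = [("r1", "r2"), ("r1", "r3"), ("r2", "r3"), ("r1", "r4")]
kept = filter_overlaps(overlaps, alleles)
print(kept)
```

In this toy input the overlaps between r3 and the other two reads are dropped because the reads disagree at two of three shared SNP sites, while the r1/r4 overlap is kept because there is no shared SNP evidence against it. A real assembler would also have to tolerate residual sequencing errors at SNP sites, which is one reason a clustering approach is more robust than this pairwise test.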
Framework diagram of the PECAT assembly algorithm
First review: Yu Tao. Second review: Deng Haodi. Third review: Li Yin.