Information Resource
Record information and status
Record ID
115946
Status
Published
Date of creation
2021-03-23 19:56 UTC (austein.mcloughlin@cbd.int)
Date of publication
2021-03-23 19:56 UTC (austein.mcloughlin@cbd.int)

General Information
Title
DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples
Author
Julie Hurel, Sophie Schbath, Stéphanie Bougeard, Mathieu Rolland, Mauro Petrillo, Fabrice Touzain
Author’s contact information
Fabrice Touzain
ANSES, Laboratoire de Ploufragan, GVB unit, 22440 Ploufragan, France

Kindly refer to attached article for contact's email address
Language(s)
  • English
Publication date
2020-07-06
Subject
Summary, abstract or table of contents
Background
The European Community has adopted policies regarding the dissemination and use of genetically modified organisms (GMOs). In particular, a maximum threshold of 0.9% of contaminating GMOs is tolerated for a "GMO-free" label. In recent years, imports of undescribed GMOs have been detected. Because their sequences are not described, they cannot be detected by conventional approaches such as PCR.

Results
We developed DUGMO, a bioinformatics pipeline for the detection of genetically modified (GM) bacteria, including unknown GM bacteria, based on Illumina paired-end sequencing data. The method is currently focused on the detection of GM bacteria with (possibly partial) transgenes in pure bacterial samples. In the preliminary steps, coding sequences (CDSs) are aligned through two successive BLASTN searches against the host pangenome, with suitably tuned parameters, to discriminate CDSs belonging to the wild type genome (wgCDS) from potential GM coding sequences (pgmCDSs). Then, Bray-Curtis distances are calculated between the wgCDS and each pgmCDS, based on the difference of genomic vocabulary. Finally, two machine learning methods, namely Random Forest and a Generalized Linear Model, are applied to identify true GM CDS(s), based on six variables including Bray-Curtis distances and GC content. Tests carried out on a GM Bacillus subtilis showed 25 positive CDSs corresponding to the chloramphenicol resistance gene and CDSs of the inserted plasmids. On a wild type B. subtilis, no false positive sequences were detected.
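
As an illustration only, two of the variables named in the Results (a Bray-Curtis distance on "genomic vocabulary" and GC content) could be computed as in the minimal Python sketch below. It is not taken from DUGMO: the function names are hypothetical, "genomic vocabulary" is assumed here to mean overlapping k-mer counts, and the word length k=4 is an arbitrary choice, since the abstract does not state the value used.

```python
from collections import Counter


def kmer_counts(seq, k=4):
    """Count overlapping k-mers in a DNA sequence (k=4 is an assumption)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two k-mer count profiles.

    Defined as sum(|a_i - b_i|) / sum(a_i + b_i); it is 0 for identical
    profiles and 1 for profiles that share no k-mer.
    """
    keys = set(a) | set(b)
    numerator = sum(abs(a[key] - b[key]) for key in keys)
    denominator = sum(a.values()) + sum(b.values())
    return numerator / denominator if denominator else 0.0


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0
```

In such a setup, the pooled k-mer profile of the wild type CDSs would be compared with the profile of each candidate CDS, and the resulting distance, together with GC content and the other variables, would be passed to the classifiers. How DUGMO actually builds and combines these variables is described in the attached article.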

Conclusion
DUGMO detects exogenous CDSs as well as truncated, fused or highly mutated wild type CDSs in high-throughput sequencing data. It was shown to be efficient at detecting GM sequences, and it might also be employed for the identification of recent horizontal gene transfers.
Thematic areas
Additional Information
Type of resource
  • Article (journal / magazine / newspaper)
Identifier
https://doi.org/10.1186/s12859-020-03611-5
Publisher and its location
BMC Bioinformatics
Springer Nature
Rights
Open Access
Format
PDF - 0.98 MB (17 pages)
Keywords and any other relevant information
Detection; Unknown LMO; Bacteria; Illumina sequencing data; Machine learning; Bioinformatics