Unsupervised encoding selection through ensemble pruning for biomedical classification
Main authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Philipps-Universität Marburg, 2023 |
Subjects: | |
Online access: | Full text in PDF format |
Summary: | Background: Owing to the rising levels of multi-resistant pathogens, antimicrobial
peptides, an alternative strategy to classic antibiotics, have received more attention. A crucial
but costly step is their identification and validation. With the ever-growing number
of annotated peptides, researchers leverage artificial intelligence to circumvent the
cumbersome, wet-lab-based identification and automate the detection of promising
candidates. However, the prediction of a peptide’s function is not limited to antimicrobial
efficiency. To date, multiple studies successfully classified additional properties,
e.g., antiviral or cell-penetrating effects. In this light, ensemble classifiers are employed
to further improve the prediction. Although we recently presented a workflow
that significantly narrows the initial encoding choice, a fully unsupervised encoding
selection, considering various machine learning models, is still lacking.
Results: We developed a workflow that automatically selects encodings and generates
classifier ensembles by employing sophisticated pruning methods. We observed that
Pareto frontier pruning is a good method to create encoding ensembles for the
datasets at hand. In addition, encodings combined with the Decision Tree classifier as
the base model are often superior. However, our results also demonstrate that none of
the ensemble building techniques is outstanding for all datasets.
Conclusion: The workflow applies multiple pruning methods to evaluate ensemble
classifiers composed of a wide range of peptide encodings and base models. Consequently,
researchers can use the workflow for unsupervised encoding selection and
ensemble creation. Ultimately, the extensible workflow can be used as a plugin for the
PEPTIDE REACToR, further establishing it as a versatile tool in the domain. |
---|---|
Item description: | Funded by the Open Access Publication Fund of UB Marburg. |
DOI: | 10.1186/s13040-022-00317-7 |
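
The summary names Pareto frontier pruning as a way to build encoding ensembles but does not describe it. As a rough illustration only, the sketch below keeps every candidate that no other candidate beats on two criteria at once. The choice of criteria (error rate and a kappa agreement score, both minimized), the candidate names, and the numbers are assumptions made up for this example; none of them are taken from the article or its workflow.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, error rate, kappa agreement); lower is better for both.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[str]:
    """Return the names of candidates that are not dominated by any other."""
    front = []
    for name, err, kappa in candidates:
        # A candidate is dominated if another one is at least as good on both
        # criteria and strictly better on at least one of them.
        dominated = any(
            (e <= err and k <= kappa) and (e < err or k < kappa)
            for _, e, k in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Made-up values for illustration only.
candidates = [
    ("enc_a+enc_b", 0.10, 0.60),
    ("enc_a+enc_c", 0.12, 0.40),
    ("enc_b+enc_c", 0.15, 0.45),
    ("enc_c+enc_d", 0.20, 0.20),
]
selected = pareto_front(candidates)
print(selected)
```

In this toy input, "enc_b+enc_c" is dropped because "enc_a+enc_c" has both a lower error and a lower kappa; the remaining three each represent a different trade-off and are kept.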