Discovery of Novel Biomarkers by Text Mining: A New Avenue for Drug Research?

Carlo A Trugenberger; David Peregrim

Abstract

Data are paramount to modern targeted drug design. Applying data mining and computational chemistry to large
molecular databases, once innovative, is now an everyday procedure for therapy identification. However, an even
larger source of valuable information remains that could be tapped for discoveries: repositories of research
documents.
While numerical methods for the analysis of structured data, such as those in genomics and proteomics databases,
are well developed and standard toolboxes are readily available, knowledge discovery from unstructured data in text
documents is still considered the “Holy Grail” of text mining, and no stable methodology has yet emerged from the
few known attempts.
Here we review a recent pilot experiment to discover novel biomarkers and phenotypes for diabetes and obesity
through self-organized text mining of about 120,000 PubMed abstracts, public clinical trial summaries, and internal
Merck research documents with the InfoCodex semantic engine. The experiment demonstrated retrieval of known entities
missed by traditional approaches, and the engine was shown to discover new diabetes and obesity biomarkers and
phenotypes, although it also generated noticeable noise (uninteresting or obvious terms).
The reported text mining approach to biomarker discovery shows much promise and has the potential to be
developed into a new avenue for pharmaceutical research, especially to shorten the time-to-market of novel drugs or
to speed up early recognition of dead ends and adverse reactions.

Journal of Molecular Biomarkers & Diagnosis