Posted on 2014-08-25, 00:00. Authored by Paolo Frasconi, Francesco Gabbrielli, Marco Lippi, Simone Marinai
Optical chemical structure recognition is the problem of converting
a bitmap image containing a chemical structure formula into a standard
structured representation of the molecule. We introduce a novel approach
to this problem based on the pipelined integration of pattern recognition
techniques with probabilistic knowledge representation and reasoning.
Basic entities and relations (such as textual elements, points, and
lines) are first extracted by a low-level processing module. A probabilistic
reasoning engine based on Markov logic, embodying chemical and graphical
knowledge, is subsequently used to refine these pieces of information.
An annotated connection table of atoms and bonds is finally assembled
and converted into a standard chemical exchange format. We report
a successful evaluation on two large image data sets, showing that
the method compares favorably with the current state of the art, especially
on degraded low-resolution images. The system is available as a web
server at http://mlocsr.dinfo.unifi.it.
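
To illustrate the kind of reasoning a Markov logic engine performs in the refinement step, here is a minimal toy sketch in Python. It is not the authors' system: the predicates (`touches_atom`, `short`, `is_bond`), the two rules, their weights, and the evidence are all hypothetical and chosen only for illustration. It relies on the standard Markov logic definition, in which a world x has probability proportional to exp(sum_i w_i n_i(x)), where n_i(x) counts the satisfied groundings of formula i, and it computes marginals by brute-force enumeration, which is only feasible for tiny domains.

```python
import itertools
import math

# Hypothetical evidence, as a low-level extraction step might produce it:
# two detected line segments with boolean observations.
EVIDENCE = {
    "l1": {"touches_atom": True, "short": False},
    "l2": {"touches_atom": False, "short": True},
}

# Hypothetical weighted rules: (weight, test for "ground formula is satisfied").
RULES = [
    # touches_atom(l) => is_bond(l)
    (1.5, lambda ev, is_bond: (not ev["touches_atom"]) or is_bond),
    # short(l) => not is_bond(l)
    (0.8, lambda ev, is_bond: (not ev["short"]) or (not is_bond)),
]


def world_score(world):
    """Return sum_i w_i * n_i(x): total weight of satisfied ground formulas."""
    total = 0.0
    for line, is_bond in world.items():
        for weight, satisfied in RULES:
            if satisfied(EVIDENCE[line], is_bond):
                total += weight
    return total


def marginals():
    """Return P(is_bond(l) = True | evidence) for each line, by enumeration."""
    lines = sorted(EVIDENCE)
    # Every truth assignment to the query atoms is one possible world.
    worlds = [
        dict(zip(lines, values))
        for values in itertools.product([False, True], repeat=len(lines))
    ]
    weights = [math.exp(world_score(w)) for w in worlds]
    z = sum(weights)  # partition function
    return {
        line: sum(wt for w, wt in zip(worlds, weights) if w[line]) / z
        for line in lines
    }


if __name__ == "__main__":
    for line, p in marginals().items():
        print(f"P(is_bond({line})) = {p:.3f}")
```

Because these toy rules do not couple different segments, each marginal reduces to a logistic function of the weights. A realistic rule set would relate neighboring entities to one another and would need an approximate inference method rather than enumeration.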