Electrical identification of single
DNA nucleotides with solid-state
nanopore/nanogap and quantum transport technology offers a new paradigm
in the field of DNA sequencing and has the potential to supersede
existing techniques. However, the fingerprint electrical conductance
signals of the nucleotides overlap because of their similar sizes and
comparable frontier orbital energy levels, which has been a major impediment to
identifying them with high accuracy. Herein, a synergistic approach
combining the quantum transport method and machine learning (ML) algorithms
has been devised to achieve the high-precision identification of DNA
nucleotides. A model germanene nanogap is investigated for single-nucleotide-based
DNA sequencing by calculating the transmission function and current–voltage
characteristics. Using the transmission function data sets, the Random
Forest Classifier algorithm identified all four nucleotides and
classified binary, ternary, and quaternary combinations of
nucleotides with a high degree of precision,
F1 score, and accuracy. The interelectrode distance analysis illustrates
that transmission functions are sensitive to the electrode-nucleotide
coupling effect and that the ML classifier can extrapolate the information
during classification. Our findings provide a guide for applying ML
to nanogap devices to achieve fast, cost-effective, single-shot
nucleotide identification.
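
As an illustration of how current–voltage characteristics follow from a transmission function, the sketch below evaluates the standard Landauer formula, I(V) = (2e/h) ∫ T(E) [f_L(E) − f_R(E)] dE, for a toy single-level Lorentzian T(E). It is a minimal sketch under stated assumptions, not the authors' workflow: the abstract does not say which formula or code was used, the function names (`fermi`, `lorentzian_transmission`, `landauer_current`) are hypothetical, the transmission is taken to be bias-independent, and the level positions and broadening are illustrative values rather than results for germanene or any nucleotide.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C (also J per eV)
PLANCK = 6.62607015e-34      # Planck constant in J s
K_B_EV = 8.617333262e-5      # Boltzmann constant in eV/K


def fermi(energy_ev, mu_ev, temp_k=300.0):
    """Fermi-Dirac occupation at energy_ev for chemical potential mu_ev."""
    x = (energy_ev - mu_ev) / (K_B_EV * temp_k)
    if x > 500.0:    # avoid overflow in exp for far-above-mu energies
        return 0.0
    if x < -500.0:
        return 1.0
    return 1.0 / (1.0 + math.exp(x))


def lorentzian_transmission(energy_ev, level_ev, gamma_ev):
    """Toy single-level transmission: peak value 1 at level_ev, half-width gamma_ev.

    Assumption: stands in for a molecular frontier orbital; not computed from DFT.
    """
    return gamma_ev ** 2 / ((energy_ev - level_ev) ** 2 + gamma_ev ** 2)


def landauer_current(bias_v, transmission, temp_k=300.0,
                     e_min=-2.0, e_max=2.0, n_points=4001):
    """Current in amperes from a bias-independent T(E), with E in eV.

    The bias is split symmetrically: mu_L = +V/2, mu_R = -V/2 (in eV).
    The energy integral uses the trapezoidal rule on a uniform grid.
    """
    mu_left, mu_right = 0.5 * bias_v, -0.5 * bias_v
    step = (e_max - e_min) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        energy = e_min + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        window = fermi(energy, mu_left, temp_k) - fermi(energy, mu_right, temp_k)
        total += weight * transmission(energy) * window
    integral_joule = total * step * E_CHARGE   # convert eV to J
    return (2.0 * E_CHARGE / PLANCK) * integral_joule


if __name__ == "__main__":
    # Two hypothetical level positions (eV relative to the Fermi energy).
    for label, level in (("level A", 0.2), ("level B", 0.6)):
        t_func = lambda e, lv=level: lorentzian_transmission(e, lv, 0.05)
        print(label, landauer_current(0.5, t_func))
```

In this toy picture, two species whose levels sit at different energies produce different currents at the same bias, while species with nearly equal levels produce nearly equal currents, which is the kind of signal overlap the abstract describes.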