Posted on 2018-09-07, 00:00. Authored by Paula Duek, Alain Gateau, Amos Bairoch, Lydie Lane
20,230 protein-coding genes have been predicted from the analysis
of the human genome (neXtProt release 2018-01-17), and about 10% of
them are still lacking functional annotation, either predicted by
bioinformatics tools or captured from experimental reports. A systematic
exploration of the available literature on uncharacterized human genes/proteins
led to the proposal of functional annotations for 113 proteins and to
the consolidation of a list of 1,862 uncharacterized human proteins. The
advanced search functionality of neXtProt was used extensively in
order to examine the landscape of the uncharacterized human proteome
in terms of subcellular locations, protein–protein interactions,
tissue expression, association with diseases, and 3D structure. Finally,
deep data mining of various publicly available resources allowed us
to build functional hypotheses for 26 uncharacterized human proteins
validated at the protein level (uPE1). These hypotheses cover the fields
of cilia biology, male reproduction, metabolism, nervous system, immunity,
inflammation, RNA metabolism, and chromatin biology. They will require
experimental validation before they can be considered for annotation.
Despite technological progress, the pace of human protein characterization
studies is still slow. It could be accelerated by a better integration
of existing knowledge resources and by initiating large collaborative
projects involving specialists from different fields of biology. We hope
that our analysis will help lay the groundwork for such collaborative
approaches and will be exploited by the HUPO Human Proteome Project
teams committed to characterizing uPE1 proteins.