DigiMOF: A Database of Metal–Organic Framework
Synthesis Information Generated via Text Mining
Posted on 2023-05-18 - 04:15
The vastness of materials
space, particularly for
metal–organic frameworks (MOFs), makes it difficult to efficiently
identify promising materials
for specific applications. Although high-throughput computational
approaches, including the use of machine learning, have been useful
in rapid screening and rational design of MOFs, they tend to neglect
descriptors related to their synthesis. One way to improve the efficiency
of MOF discovery is to data-mine published MOF papers and extract the
materials informatics knowledge they contain.
Here, by adapting the chemistry-aware natural language processing
tool, ChemDataExtractor (CDE), we generated an open-source database
of MOFs focused on their synthetic properties: the DigiMOF database.
Using the CDE web scraping package alongside the Cambridge Structural
Database (CSD) MOF subset, we automatically downloaded 43,281 unique
MOF journal articles, extracted 15,501 unique MOF materials, and text-mined
over 52,680 associated properties including the synthesis method,
solvent, organic linker, metal precursor, and topology. Additionally,
we developed an alternative data extraction technique to obtain and
transform the chemical names assigned to each CSD entry in order to
determine linker types for each structure in the CSD MOF subset. These
data enabled us to match MOFs to a list of known linkers provided
by Tokyo Chemical Industry UK Ltd. (TCI) and analyze the cost of these
important chemicals. This centralized, structured database reveals
the MOF synthetic data embedded within thousands of MOF publications
and contains further topology, metal type, accessible surface area,
largest cavity diameter, pore limiting diameter, open metal sites,
and density calculations for all 3D MOFs in the CSD MOF subset. The
DigiMOF database and associated software are publicly available for
other researchers to rapidly search for MOFs with specific properties,
conduct further analysis of alternative MOF production pathways, and
create new parsers to search for additional desirable properties.
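
The linker-matching step described above (deriving linker types from CSD chemical names and comparing them with a supplier's linker list) can be illustrated with a minimal Python sketch. This is not the DigiMOF code and does not use ChemDataExtractor or the CSD API; the fragment table, catalogue, function names, and prices are all hypothetical placeholders chosen only to show the idea of substring matching followed by a price lookup.

```python
from typing import Dict, List

# Placeholder catalogue: linker name -> price per gram.
# Values are arbitrary and for illustration only, NOT real supplier prices.
CATALOGUE_PRICE: Dict[str, float] = {
    "terephthalic acid": 1.0,
    "trimesic acid": 2.0,
}

# Hypothetical lookup from ligand fragments, as they might appear in a
# database chemical name, to the corresponding catalogue linker name.
FRAGMENT_TO_LINKER: Dict[str, str] = {
    "terephthalato": "terephthalic acid",
    "benzene-1,4-dicarboxylato": "terephthalic acid",
    "benzene-1,3,5-tricarboxylato": "trimesic acid",
}


def match_linkers(chemical_name: str) -> List[str]:
    """Return the catalogue linkers whose fragment occurs in the chemical name."""
    name = chemical_name.lower()
    found = {
        linker
        for fragment, linker in FRAGMENT_TO_LINKER.items()
        if fragment in name
    }
    return sorted(found)


def linker_cost(chemical_name: str) -> Dict[str, float]:
    """Map each matched linker to its placeholder catalogue price."""
    return {linker: CATALOGUE_PRICE[linker] for linker in match_linkers(chemical_name)}


if __name__ == "__main__":
    # Made-up chemical name in the style of a database entry.
    example = "catena-(bis(mu-Benzene-1,3,5-tricarboxylato)-tri-copper)"
    print(match_linkers(example))
    print(linker_cost(example))
```

A real pipeline would need name normalization far beyond lowercase substring matching (synonyms, charge states, substituted linkers), so the sketch should be read only as an outline of the matching and costing logic.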
CITE THIS COLLECTION
Glasby, Lawson T.; Gubsch, Kristian; Bence, Rosalee; Oktavian, Rama; Isoko, Kesler; Moosavi, Seyed Mohamad; et al. (2023). DigiMOF: A Database of Metal–Organic Framework Synthesis Information Generated via Text Mining. ACS Publications. Collection. https://doi.org/10.1021/acs.chemmater.3c00788