Text Mining Metal-Organic Framework Papers

Cited 40 times in Web of Science; cited 0 times in Scopus
We have developed a simple text mining algorithm that identifies the surface areas and pore volumes of metal-organic frameworks (MOFs) using manuscript HTML files as inputs. The algorithm searches for the common units associated with these two quantities (e.g., m²/g and cm³/g) to locate candidate values. On a sample set of over 200 MOFs, the algorithm correctly identified 90% of the surface area values and 88.8% of the pore volume values. Further application to a test set of randomly chosen MOF HTML files yielded accuracies of 73.2% and 85.1% for the two respective quantities. Most of the errors stem from unorthodox sentence structures that made it difficult to identify the correct data, and from bolded compound labels for MOFs (e.g., 1a) that made it difficult to identify the MOF's real name. Tools of this type will become useful for discovering structure-property relationships among MOFs and for collecting large reference data sets.
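The unit-based search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_values`, the regular expressions, and the tag-stripping step are all assumptions made for illustration. It only shows the general idea of locating numbers that directly precede surface area units (m²/g) or pore volume units (cm³/g) in HTML text, and it does not attempt the step of linking each value to a MOF name.

```python
import re
from html import unescape

# Hypothetical helper patterns (assumptions, not taken from the paper).
TAG_RE = re.compile(r"<[^>]+>")          # crude HTML tag stripper
NUMBER = r"(\d[\d,]*(?:\.\d+)?)"         # e.g. 1250, 1,250 or 0.45
PER_GRAM = r"(?:/\s*g|g\s*(?:-1|−1|⁻¹))"  # "/g", "g-1", "g−1" or "g⁻¹"

# A number followed by m2 (or m²) per gram is treated as a surface area.
SURFACE_AREA_RE = re.compile(NUMBER + r"\s*m\s*(?:2|²)\s*" + PER_GRAM)
# A number followed by cm3 (or cm³) per gram is treated as a pore volume.
PORE_VOLUME_RE = re.compile(NUMBER + r"\s*cm\s*(?:3|³)\s*" + PER_GRAM)


def extract_values(html_text):
    """Return candidate surface areas and pore volumes found in html_text."""
    # Dropping tags turns "m<sup>2</sup>/g" into "m2/g".
    text = unescape(TAG_RE.sub("", html_text))

    def to_float(raw):
        # Remove thousands separators before converting.
        return float(raw.replace(",", ""))

    return {
        "surface_area_m2_per_g": [to_float(v) for v in SURFACE_AREA_RE.findall(text)],
        "pore_volume_cm3_per_g": [to_float(v) for v in PORE_VOLUME_RE.findall(text)],
    }


if __name__ == "__main__":
    sample = (
        "<p>The BET surface area of the MOF is 1,250 m<sup>2</sup>/g and "
        "the pore volume is 0.45 cm<sup>3</sup>/g.</p>"
    )
    print(extract_values(sample))
```

A pattern this simple would also pick up other quantities reported in the same units, such as gas uptake in cm³/g, so a practical tool would need additional context checks around each match.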
Publisher
AMER CHEMICAL SOC
Issue Date
2018-02
Language
English
Article Type
Article
Keywords

HYDROGEN STORAGE; SURFACE-AREA; ADSORPTION; CRYSTALS; POROSITY; DESIGN; TOOL

Citation

JOURNAL OF CHEMICAL INFORMATION AND MODELING, v.58, no.2, pp.244 - 251

ISSN
1549-9596
DOI
10.1021/acs.jcim.7b00608
URI
http://hdl.handle.net/10203/241107
Appears in Collection
CBE-Journal Papers