Rule identification from Web pages by the XRML approach

Cited 16 times in Web of Science; cited 18 times in Scopus
DC Field | Value | Language
dc.contributor.author | Kang, J | ko
dc.contributor.author | Lee, Jae Kyu | ko
dc.date.accessioned | 2008-04-30T09:49:33Z | -
dc.date.available | 2008-04-30T09:49:33Z | -
dc.date.created | 2012-02-06 | -
dc.date.issued | 2005-11 | -
dc.identifier.citation | DECISION SUPPORT SYSTEMS, v.41, no.1, pp.205 - 227 | -
dc.identifier.issn | 0167-9236 | -
dc.identifier.uri | http://hdl.handle.net/10203/4304 | -
dc.description.abstract | In the world of Web pages, there are oceans of documents in natural language texts and tables. To extract rules from Web pages and maintain consistency between them, we have developed the framework of XRML (eXtensible Rule Markup Language). XRML allows the identification of rules on Web pages and generates the identified rules automatically. For this purpose, we have designed the Rule Identification Markup Language (RIML), which is similar to the formal Rule Structure Markup Language (RSML), both as parts of XRML. RIML 2.0 is designed to identify rules not only from texts, but also from tables on Web pages, and to transform to the formal rules in RSML syntax automatically. While designing RIML 2.0, we considered the features of sharing variables and values, omitted terms, and synonyms. We have conducted an experiment to evaluate the potential benefit of the XRML approach with real world Web pages of Amazon.com, BarnesandNoble.com, and Powells.com. We found that 100.0% of the rules and 99.7% of the rule components could be identified and automatically generated if we do not count the statements for linkages, which generically do not exist on the Web pages. Since the linkage components occupy 11.2% of all components in the rule base, the overall limitation of automatic rule generation is 88.8%. In this setting, 88.5% of the overall rule components could be generated from the identified rules from the Web pages. The result provides solid proof that XRML can facilitate the extraction and maintenance of rules from Web pages while building expert systems in the Semantic Web environment. (c) 2005 Elsevier B.V. All rights reserved. | -
dc.language | English | -
dc.language.iso | en_US | en
dc.publisher | ELSEVIER SCIENCE BV | -
dc.subject | KNOWLEDGE ACQUISITION | -
dc.subject | INFORMATION EXTRACTION | -
dc.subject | NATURAL-LANGUAGE | -
dc.subject | MARKUP LANGUAGE | -
dc.subject | TEXT | -
dc.subject | ONTOLOGIES | -
dc.subject | DOCUMENTS | -
dc.subject | SUPPORT | -
dc.subject | SYSTEM | -
dc.title | Rule identification from Web pages by the XRML approach | -
dc.type | Article | -
dc.identifier.wosid | 000232712000012 | -
dc.identifier.scopusid | 2-s2.0-25444513368 | -
dc.type.rims | ART | -
dc.citation.volume | 41 | -
dc.citation.issue | 1 | -
dc.citation.beginningpage | 205 | -
dc.citation.endingpage | 227 | -
dc.citation.publicationname | DECISION SUPPORT SYSTEMS | -
dc.identifier.doi | 10.1016/j.dss.2005.01.004 | -
dc.embargo.liftdate | 9999-12-31 | -
dc.embargo.terms | 9999-12-31 | -
dc.contributor.localauthor | Lee, Jae Kyu | -
dc.contributor.nonIdAuthor | Kang, J | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | rule identification | -
dc.subject.keywordAuthor | rule acquisition | -
dc.subject.keywordAuthor | knowledge engineering | -
dc.subject.keywordAuthor | knowledge acquisition | -
dc.subject.keywordAuthor | XRML | -
dc.subject.keywordAuthor | RuleML | -
dc.subject.keywordAuthor | XML | -
dc.subject.keywordPlus | KNOWLEDGE ACQUISITION | -
dc.subject.keywordPlus | INFORMATION EXTRACTION | -
dc.subject.keywordPlus | NATURAL-LANGUAGE | -
dc.subject.keywordPlus | MARKUP LANGUAGE | -
dc.subject.keywordPlus | TEXT | -
dc.subject.keywordPlus | ONTOLOGIES | -
dc.subject.keywordPlus | DOCUMENTS | -
dc.subject.keywordPlus | SUPPORT | -
dc.subject.keywordPlus | SYSTEM | -
Appears in Collection
MT-Journal Papers(저널논문)