Guiding Practical Text Classification Framework to Optimal State in Multiple Domains

This paper introduces DICE, a Domain-Independent text Classification Engine. DICE is robust, efficient, and domain-independent in both its software and its architecture. Each module of the system is cleanly encapsulated for extensibility. This modular architecture allows simple, continuous verification and makes the system easy to change over multiple cycles, even after its major development period is complete. Users can implement their own ideas on this test bed and optimize it for a particular domain simply by adjusting the configuration file. Unlike other publicly available toolkits and development environments, which target general-purpose classification models, DICE specializes in text classification and provides a number of functions specific to that task. This paper focuses on how to locate the optimal state of a practical text classification framework using the adaptation methods the system provides, such as feature selection, lemmatization, and the choice of classification model.
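To make the idea of configuration-driven adaptation concrete, the following is a minimal sketch of a text classifier whose behavior is controlled by a small configuration dictionary. It is not DICE's code or interface: the configuration keys (`lowercase`, `max_features`, `smoothing`), the function names, the document-frequency feature selection, and the multinomial Naive Bayes model are all assumptions chosen for illustration, and lemmatization is left out to keep the example short.

```python
import math
from collections import Counter, defaultdict

# Hypothetical configuration; these keys are illustrative, not DICE's options.
CONFIG = {"lowercase": True, "max_features": 50, "smoothing": 1.0}


def tokenize(text, config):
    """Split on whitespace, optionally lowercasing first."""
    if config["lowercase"]:
        text = text.lower()
    return text.split()


def select_features(docs, config):
    """Keep the max_features terms with the highest document frequency."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    # Ties are broken alphabetically so the selection is deterministic.
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return {term for term, _ in ranked[: config["max_features"]]}


def train(texts, labels, config):
    """Fit a multinomial Naive Bayes model over the selected features."""
    docs = [tokenize(t, config) for t in texts]
    vocab = select_features(docs, config)
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in vocab)
    return {
        "vocab": vocab,
        "class_counts": class_counts,
        "term_counts": term_counts,
        "config": config,
    }


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    config = model["config"]
    tokens = [t for t in tokenize(text, config) if t in model["vocab"]]
    total_docs = sum(model["class_counts"].values())
    alpha = config["smoothing"]
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["term_counts"][label]
        total = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for t in tokens:
            score += math.log((counts[t] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In a sketch like this, adapting to a new domain amounts to editing `CONFIG` (for example, lowering `max_features`) and retraining, which is the kind of workflow the abstract describes at a much larger scale.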
Publisher
KSII-KOR SOC INTERNET INFORMATION
Issue Date
2009-06
Language
English
Article Type
Article
Keywords

FEATURE-SELECTION; CATEGORIZATION

Citation

KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, v.3, no.3, pp.285 - 307

ISSN
1976-7277
DOI
10.3837/tiis.2009.03.005
URI
http://hdl.handle.net/10203/96931
Appears in Collection
CS-Journal Papers