Understanding multilingualism in Switzerland using text mining algorithms

DC Field | Value | Language
dc.contributor.advisor | Oh, Hae-Yun | -
dc.contributor.advisor | 오혜연 | -
dc.contributor.author | Kim, Jae-Won | -
dc.contributor.author | 김재원 | -
dc.date.accessioned | 2015-04-23T06:16:05Z | -
dc.date.available | 2015-04-23T06:16:05Z | -
dc.date.issued | 2014 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=592443&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/196857 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2014.8, [ iv, 23 p. ] | -
dc.description.abstract | Many of today’s societies are made up of multiple language groups, including groups of monolingual speakers and multilingual speakers of several different languages. We can ask many interesting questions about those societies, including how widely each language is used, what topics are communicated in each language, whether information reaches each language group at different times, and whether and how members of one language group communicate with members of another. We tackle these questions by looking at Switzerland, a highly multilingual society, with a large corpus of geotagged Twitter data. Specifically, we crawled 47 million tweets from 97,577 users, identified the language of each tweet, and analyzed the tweets using topic and language analysis tools. Using the hierarchical Dirichlet scaling process, a nonparametric topic model for labeled data, we discover which topics are most popular among English, German, and French monolinguals, as well as among English-German, English-French, and German-French bilingual users. We analyze hashtags for major world events to understand whether certain groups have earlier access to information. We also examine general language use to compare the language variety of monolingual and bilingual users. By applying these computational methods to a large corpus of tweets from Switzerland, we show that many interesting linguistic and sociolinguistic phenomena can be uncovered. | eng
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Text Mining | -
dc.subject | Multilingualism | -
dc.subject | Social Media | -
dc.subject | Twitter | -
dc.subject | Topic Modelling | -
dc.title | Understanding multilingualism in Switzerland using text mining algorithms | -
dc.title.alternative | 텍스트 마이닝 알고리즘을 이용한 다중 언어 사회 스위스에 대한 이해 (Understanding Switzerland as a multilingual society using text mining algorithms) | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 592443/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science | -
dc.identifier.uid | 020113144 | -
dc.contributor.localauthor | Oh, Hae-Yun | -
dc.contributor.localauthor | 오혜연 | -
Appears in Collection
CS-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
