DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Oh, Hae-Yun | - |
dc.contributor.advisor | 오혜연 | - |
dc.contributor.author | Kim, Jae-Won | - |
dc.contributor.author | 김재원 | - |
dc.date.accessioned | 2015-04-23T06:16:05Z | - |
dc.date.available | 2015-04-23T06:16:05Z | - |
dc.date.issued | 2014 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=592443&flag=dissertation | - |
dc.identifier.uri | http://hdl.handle.net/10203/196857 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science, 2014.8, [ iv, 23 p. ] | - |
dc.description.abstract | Many of today’s societies are made up of multiple language groups, including monolingual speakers and multilingual speakers of several different languages. These societies raise many interesting questions: how widely each language is used, what topics are discussed in each language, whether information reaches each language group at different times, and whether and how members of one language group communicate with members of another. We tackle these questions by studying Switzerland, a highly multilingual society, through a large corpus of geotagged Twitter data. Specifically, we crawled 47 million tweets from 97,577 users, identified the language of each tweet, and analyzed the tweets using topic and language analysis tools. Using the hierarchical Dirichlet scaling process, a nonparametric topic model for labeled data, we discover which topics are most popular among English, German, and French monolingual users, as well as English-German, English-French, and German-French bilingual users. We analyze hashtags for major world events to understand whether certain groups have earlier access to information. We also examine general language use to compare the language variety of monolingual and bilingual users. By applying these computational methods to a large corpus of tweets from Switzerland, we show that many interesting linguistic and sociolinguistic phenomena can be uncovered. | eng |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Text Mining | - |
dc.subject | Topic Modeling (토픽 모델링) | - |
dc.subject | Twitter (트위터) | - |
dc.subject | Social Media (소셜미디어) | - |
dc.subject | Multilingual (다중 언어) | - |
dc.subject | Text Mining (텍스트 마이닝) | - |
dc.subject | Multilingualism | - |
dc.subject | Social Media | - |
dc.subject | Topic Modelling | - |
dc.title | Understanding multilingualism in Switzerland using text mining algorithms | - |
dc.title.alternative | Understanding the multilingual society of Switzerland using text mining algorithms (텍스트 마이닝 알고리즘을 이용한 다중 언어 사회 스위스에 대한 이해) | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 592443/325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Department of Computer Science | - |
dc.identifier.uid | 020113144 | - |
dc.contributor.localauthor | Oh, Hae-Yun | - |
dc.contributor.localauthor | 오혜연 | - |