DC Field | Value | Language |
---|---|---|
dc.contributor.author | Whang, Steven Euijong | ko |
dc.contributor.author | Lofgren, Peter | ko |
dc.contributor.author | Garcia-Molina, Hector | ko |
dc.date.accessioned | 2019-04-15T19:31:19Z | - |
dc.date.available | 2019-04-15T19:31:19Z | - |
dc.date.created | 2018-03-29 | - |
dc.date.issued | 2013-08 | - |
dc.identifier.citation | 39th International Conference on Very Large Data Bases, VLDB 2013, pp.349 - 360 | - |
dc.identifier.issn | 2150-8097 | - |
dc.identifier.uri | http://hdl.handle.net/10203/257646 | - |
dc.description.abstract | We study the problem of enhancing Entity Resolution (ER) with the help of crowdsourcing. ER is the problem of clustering records that refer to the same real-world entity and can be an extremely difficult process for computer algorithms alone. For example, figuring out which images refer to the same person can be a hard task for computers, but an easy one for humans. We study the problem of resolving records with crowdsourcing where we ask questions to humans in order to guide ER into producing accurate results. Since human work is costly, our goal is to ask as few questions as possible. We propose a probabilistic framework for ER that can be used to estimate how much ER accuracy we obtain by asking each question and select the best question with the highest expected accuracy. Computing the expected accuracy is #P-hard, so we propose approximation techniques for efficient computation. We evaluate our best question algorithms on real and synthetic datasets and demonstrate how we can obtain high ER accuracy while significantly reducing the number of questions asked to humans. © 2013 VLDB Endowment. | - |
dc.language | English | - |
dc.publisher | The VLDB Endowment | - |
dc.title | Question selection for crowd entity resolution | - |
dc.type | Conference | - |
dc.identifier.scopusid | 2-s2.0-84881231558 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 349 | - |
dc.citation.endingpage | 360 | - |
dc.citation.publicationname | 39th International Conference on Very Large Data Bases, VLDB 2013 | - |
dc.identifier.conferencecountry | IT | - |
dc.identifier.conferencelocation | Riva del Garda, Trento | - |
dc.identifier.doi | 10.14778/2536336.2536337 | - |
dc.contributor.localauthor | Whang, Steven Euijong | - |
dc.contributor.nonIdAuthor | Lofgren, Peter | - |
dc.contributor.nonIdAuthor | Garcia-Molina, Hector | - |