DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Geon | ko |
dc.contributor.author | Park, Chanyoung | ko |
dc.contributor.author | Shin, Kijung | ko |
dc.date.accessioned | 2022-12-07T11:00:21Z | - |
dc.date.available | 2022-12-07T11:00:21Z | - |
dc.date.created | 2022-12-02 | - |
dc.date.issued | 2022-11-29 | - |
dc.identifier.citation | The 22nd IEEE International Conference on Data Mining, ICDM 2022, pp. 1023-1028 | - |
dc.identifier.issn | 1550-4786 | - |
dc.identifier.uri | http://hdl.handle.net/10203/301996 | - |
dc.description.abstract | Sets have been used for modeling various types of objects, and measuring similarity between them has been a key building block of a wide range of applications. However, as sets have grown in numbers and sizes, the computational cost and storage required for set similarity computation have become substantial. In this work, we propose Set2Box, which represents sets as boxes to precisely capture overlaps of sets and thus accurately estimate various similarity measures. Additionally, based on the proposed box quantization scheme, we design Set2Box+, which yields more concise but more accurate box representations of sets. Through extensive experiments on 8 real-world datasets, we show that, compared to baseline approaches, Set2Box+ is (a) Accurate: achieving up to 40.8× smaller estimation error while requiring 60% fewer bits to encode sets, (b) Concise: yielding up to 96.8× more concise representations with similar estimation error, and (c) Versatile: enabling the estimation of four set-similarity measures from a single representation of each set. For reproducibility, the source code and datasets used in the paper are available at https://github.com/geon0325/Set2Box. © 2022 IEEE. | - |
dc.language | English | - |
dc.publisher | IEEE Computer Society | - |
dc.title | Set2Box: Similarity Preserving Representation Learning for Sets | - |
dc.type | Conference | - |
dc.identifier.wosid | 000965045700115 | - |
dc.identifier.scopusid | 2-s2.0-85147729678 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 1023 | - |
dc.citation.endingpage | 1028 | - |
dc.citation.publicationname | The 22nd IEEE International Conference on Data Mining, ICDM 2022 | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | Orlando, FL | - |
dc.identifier.doi | 10.1109/ICDM54844.2022.00125 | - |
dc.contributor.localauthor | Park, Chanyoung | - |
dc.contributor.localauthor | Shin, Kijung | - |
dc.contributor.nonIdAuthor | Lee, Geon | - |
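
The abstract describes representing each set as a box so that overlaps between boxes stand in for overlaps between sets. The sketch below illustrates that general idea only and is not the authors' implementation: it assumes a box is an axis-aligned pair of `(lower, upper)` corner lists, treats box volume as a proxy for set size, and derives four common similarity measures from the volumes. The function names and the particular choice of measures (overlap coefficient, cosine, Jaccard, Dice) are assumptions made for illustration; the abstract does not name the four measures, and learning and quantizing the boxes is not covered here.

```python
from math import prod, sqrt


def box_volume(lower, upper):
    """Volume of an axis-aligned box; zero if it is empty along any axis."""
    return prod(max(u - l, 0.0) for l, u in zip(lower, upper))


def box_intersection(box_a, box_b):
    """Corners of the intersection of two axis-aligned boxes."""
    (lower_a, upper_a), (lower_b, upper_b) = box_a, box_b
    lower = [max(a, b) for a, b in zip(lower_a, lower_b)]
    upper = [min(a, b) for a, b in zip(upper_a, upper_b)]
    return lower, upper


def _ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (degenerate boxes)."""
    return numerator / denominator if denominator > 0 else 0.0


def estimate_similarities(box_a, box_b):
    """Estimate set similarities using box volumes as stand-ins for set sizes."""
    vol_a = box_volume(*box_a)
    vol_b = box_volume(*box_b)
    vol_ab = box_volume(*box_intersection(box_a, box_b))
    return {
        "overlap": _ratio(vol_ab, min(vol_a, vol_b)),
        "cosine": _ratio(vol_ab, sqrt(vol_a * vol_b)),
        "jaccard": _ratio(vol_ab, vol_a + vol_b - vol_ab),
        "dice": _ratio(2 * vol_ab, vol_a + vol_b),
    }


# Hypothetical hand-picked 2-D boxes, not learned from data.
box_a = ([0.0, 0.0], [2.0, 2.0])
box_b = ([1.0, 0.0], [3.0, 2.0])
scores = estimate_similarities(box_a, box_b)
```

All four scores come from the same pair of boxes, which mirrors the abstract's point that one representation per set can serve several similarity measures.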