DSpace at KOASAS: Data Collection and Quality Challenges for Deep Learning


DC Field	Value	Language
dc.contributor.author	Whang, Steven Euijong	ko
dc.contributor.author	Lee, Jae-Gil	ko
dc.date.accessioned	2021-01-28T05:57:28Z	-
dc.date.available	2021-01-28T05:57:28Z	-
dc.date.created	2021-01-21	-
dc.date.issued	2020-08	-
dc.identifier.citation	PROCEEDINGS OF THE VLDB ENDOWMENT, v.13, no.12, pp.3429 - 3432	-
dc.identifier.issn	2150-8097	-
dc.identifier.uri	http://hdl.handle.net/10203/280093	-
dc.description.abstract	Software 2.0 refers to the fundamental shift in software engineering where using machine learning becomes the new norm in software with the availability of big data and computing infrastructure. As a result, many software engineering practices need to be rethought from scratch where data becomes a first-class citizen, on par with code. It is well known that 80-90% of the time for machine learning development is spent on data preparation. Also, even the best machine learning algorithms cannot perform well without good data or at least handling biased and dirty data during model training. In this tutorial, we focus on data collection and quality challenges that frequently occur in deep learning applications. Compared to traditional machine learning, there is less need for feature engineering, but more need for significant amounts of data. We thus go through state-of-the-art data collection techniques for machine learning. Then, we cover data validation and cleaning techniques for improving data quality. Even if the data is still problematic, hope is not lost, and we cover fair and robust training techniques for handling data bias and errors. We believe that the data management community is well poised to lead the research in these directions. The presenters have extensive experience in developing machine learning platforms and publishing papers in top-tier database, data mining, and machine learning venues.	-
dc.language	English	-
dc.publisher	ASSOC COMPUTING MACHINERY	-
dc.title	Data Collection and Quality Challenges for Deep Learning	-
dc.type	Article	-
dc.identifier.wosid	000597303100084	-
dc.type.rims	ART	-
dc.citation.volume	13	-
dc.citation.issue	12	-
dc.citation.beginningpage	3429	-
dc.citation.endingpage	3432	-
dc.citation.publicationname	PROCEEDINGS OF THE VLDB ENDOWMENT	-
dc.identifier.doi	10.14778/3415478.3415562	-
dc.contributor.localauthor	Whang, Steven Euijong	-
dc.contributor.localauthor	Lee, Jae-Gil	-
dc.description.isOpenAccess	N	-
dc.type.journalArticle	Article	-

Appears in Collection: EE-Journal Papers(저널논문), CS-Journal Papers(저널논문)

Files in This Item: There are no files associated with this item.

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.