Why not to use zero imputation? Correcting sparsity bias in training neural networks

Handling missing data is one of the most fundamental problems in machine learning. Among many approaches, the simplest and most intuitive way is zero imputation, which treats the value of a missing entry simply as zero. However, many studies have experimentally confirmed that zero imputation results in suboptimal performance in training neural networks. Yet, none of the existing work has explained what brings about such performance degradation. In this paper, we introduce the variable sparsity problem (VSP), which describes a phenomenon where the output of a predictive model largely varies with respect to the rate of missingness in the given input, and show that it adversely affects the model performance. We first theoretically analyze this phenomenon and propose a simple yet effective technique to handle missingness, which we refer to as Sparsity Normalization (SN), that directly targets and resolves the VSP. We further experimentally validate SN on diverse benchmark datasets, to show that debiasing the effect of input-level sparsity improves the performance and stabilizes the training of neural networks.
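The abstract does not spell out the normalization formula, so the sketch below is only an illustration of the general idea: after zero imputation, rescale each input so that its overall magnitude no longer depends on how many entries happen to be observed. The function name sparsity_normalize and the choice of the constant k (taken here to be the input dimension) are assumptions made for this example, not the thesis's exact formulation.

```python
import numpy as np

def zero_impute(x, mask):
    """Replace missing entries (mask == 0) with zero."""
    return x * mask

def sparsity_normalize(x, mask, k=None):
    """Rescale a zero-imputed input so its scale does not depend on the
    number of observed entries (an illustrative take on the idea of SN).

    x    : (batch, d) array; values at missing positions are ignored
    mask : (batch, d) binary array, 1 where the entry is observed
    k    : normalization constant; assumed here to be the input dimension d
    """
    d = x.shape[1]
    if k is None:
        k = d
    n_observed = mask.sum(axis=1, keepdims=True)  # observed entries per row
    n_observed = np.maximum(n_observed, 1)        # guard against all-missing rows
    return zero_impute(x, mask) * (k / n_observed)

# Toy check: rows with different missing rates get comparable input scales
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
mask = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 0, 0, 0, 0]], dtype=float)

print(zero_impute(x, mask).sum(axis=1))      # scale shrinks with more missingness
print(sparsity_normalize(x, mask).sum(axis=1))  # rescaling compensates for it
```

The toy check mirrors the VSP described above: with plain zero imputation the network's input (and hence its output) systematically shrinks as more entries are missing, while the rescaled version keeps inputs on a comparable scale regardless of the missing rate.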
Advisors
Yang, Eunho (양은호)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2020
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology: School of Computing, 2020.2, [iv, 33 p.]

Keywords

Missing Data; Variable Sparsity Problem; Imputation; Sparsity Normalization; Collaborative Filtering; Health Care; Deep Learning

URI
http://hdl.handle.net/10203/284674
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=911005&flag=dissertation
Appears in Collection
CS-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
