Distance-based outlier detection for high dimension, low sample size data

Despite the popularity of high dimension, low sample size data analysis, little attention has been paid to the issue of sample integrity, in particular the possibility of outliers in the data. A new outlier detection procedure is presented for data whose dimensionality is much larger than the sample size. The proposed method is motivated by asymptotic properties of high-dimensional distance measures. Empirical studies suggest that high-dimensional outlier detection is more likely to suffer from a swamping effect than from a masking effect, and thus yields more false positives than false negatives. We compare the proposed approaches with existing methods using simulated data from various population settings. A real data example is presented, with consideration of the implications of the detected outliers.
Publisher
TAYLOR & FRANCIS LTD
Issue Date
2019-01
Language
English
Article Type
Article
Citation
JOURNAL OF APPLIED STATISTICS, v.46, no.1, pp.13-29
ISSN
0266-4763
DOI
10.1080/02664763.2018.1452901
URI
http://hdl.handle.net/10203/285422
Appears in Collection
IE-Journal Papers