Double data piling leads to perfect classification

Data piling refers to the phenomenon that training data vectors from each class project to a single point for classification. While this interesting phenomenon has been a key to understanding many distinctive properties of high-dimensional discrimination, the theoretical underpinning of data piling is far from properly established. In this work, high-dimensional asymptotics of data piling is investigated under a spiked covariance model, which reveals its close connection to the well-known ridged linear classifier. In particular, by projecting the ridge discriminant vector onto the subspace spanned by the leading sample principal component directions and the maximal data piling vector, we show that a negatively ridged discriminant vector can asymptotically achieve data piling of independent test data, essentially yielding a perfect classification. The second data piling direction is obtained purely from training data and shown to have a maximal property. Furthermore, asymptotic perfect classification occurs only along the second data piling direction.
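The first kind of data piling described above, where training vectors from each class project to a single point, can be reproduced numerically. The sketch below is an illustration only and is not taken from the paper. It assumes Gaussian toy data with more dimensions than samples, and it assumes that a maximal data piling direction can be computed as the minimum-norm least-squares solution mapping the centered training data to the centered class labels. The function name `mdp_direction` and all parameter values are hypothetical.

```python
import numpy as np


def mdp_direction(X, y):
    """Return a unit vector along which each training class piles to one point.

    Assumption for this sketch: X has shape (n, d) with d >= n and rows in
    general position, and y holds labels coded as +1 / -1.
    """
    Xc = X - X.mean(axis=0)          # center the features
    yc = y - y.mean()                # center the labels
    # Minimum-norm solution of Xc @ w = yc (exact when rank(Xc) = n - 1).
    w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return w / np.linalg.norm(w)


rng = np.random.default_rng(0)
n_per_class, d = 10, 200             # toy sizes, chosen only so that d >> n

X = np.vstack([
    rng.normal(+0.5, 1.0, size=(n_per_class, d)),   # class +1
    rng.normal(-0.5, 1.0, size=(n_per_class, d)),   # class -1
])
y = np.repeat([1.0, -1.0], n_per_class)

w = mdp_direction(X, y)
proj = X @ w

# Within-class spread of the projected training data (numerically zero).
spread_pos = proj[y > 0].max() - proj[y > 0].min()
spread_neg = proj[y < 0].max() - proj[y < 0].min()
# Distance between the two piles.
gap = proj[y > 0].mean() - proj[y < 0].mean()

# Independent test data from class +1 do not pile along this direction.
X_test = rng.normal(+0.5, 1.0, size=(n_per_class, d))
proj_test = X_test @ w
test_spread = proj_test.max() - proj_test.min()

print(f"train spread (+1): {spread_pos:.2e}")
print(f"train spread (-1): {spread_neg:.2e}")
print(f"gap between piles: {gap:.3f}")
print(f"test spread  (+1): {test_spread:.3f}")
```

In this toy setting the training projections collapse to two points while independent test points stay spread out, which is why piling of test data along a second direction, as claimed in the abstract, is a substantially stronger property. The sketch does not implement the negatively ridged discriminant or the projection onto the leading sample principal component directions.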
Publisher: INST MATHEMATICAL STATISTICS-IMS
Issue Date: 2021
Language: English
Article Type: Article
Citation: ELECTRONIC JOURNAL OF STATISTICS, v.15, no.2, pp.6382 - 6428
ISSN: 1935-7524
DOI: 10.1214/21-EJS1945
URI: http://hdl.handle.net/10203/292453
Appears in Collection: IE-Journal Papers