DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yoon, Jaehong | ko |
dc.contributor.author | Hwang, Sung Ju | ko |
dc.contributor.author | Cao, Yue | ko |
dc.date.accessioned | 2023-12-11T03:04:05Z | - |
dc.date.available | 2023-12-11T03:04:05Z | - |
dc.date.created | 2023-12-09 | - |
dc.date.issued | 2023-07-26 | - |
dc.identifier.citation | 40th International Conference on Machine Learning (ICML 2023) | - |
dc.identifier.uri | http://hdl.handle.net/10203/316204 | - |
dc.description.abstract | Motivated by the efficiency and rapid convergence of pre-trained models for solving downstream tasks, this paper extensively studies the impact of Continual Learning (CL) models as pre-trainers. We find that, in both supervised and unsupervised CL, the fine-tuning performance of transferred representations shows no noticeable degradation but rather improves gradually. This is because CL models can learn improved task-general features while easily forgetting task-specific knowledge. Based on this observation, we suggest a new unsupervised CL framework with masked modeling, which aims to capture fluent task-generic representations during training. Furthermore, we propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representations while solving downstream tasks. The model fine-tuned with GLAD achieves competitive performance and can also be used as a good pre-trained model itself. We believe this paper breaks the barriers between the pre-training and fine-tuning steps and leads to a sustainable learning framework in which the continual learner incrementally improves model generalization, yielding better transfer to unseen tasks. | - |
dc.language | English | - |
dc.publisher | International Machine Learning Society | - |
dc.title | Continual Learners are Incremental Model Generalizers | - |
dc.type | Conference | - |
dc.identifier.scopusid | 2-s2.0-85174425945 | - |
dc.type.rims | CONF | - |
dc.citation.publicationname | 40th International Conference on Machine Learning (ICML 2023) | - |
dc.identifier.conferencecountry | US | - |
dc.identifier.conferencelocation | Honolulu, HI | - |
dc.contributor.localauthor | Hwang, Sung Ju | - |
dc.contributor.nonIdAuthor | Cao, Yue | - |