Open Problem: Is There a First-Order Method that Only Converges to Local Minimax Optima?

Can we effectively train a generative adversarial network (GAN) (or, equivalently, optimize a minimax problem), similar to how we successfully learn a classification neural network (or, equivalently, minimize a function) by gradient methods? At the moment, the answer to this question is “No”. Despite extensive study over the past ten years, training GANs remains challenging, and as a result diffusion-based generative models are largely replacing GANs. When training GANs, we not only struggle to find stationary points but also, from a practical point of view, suffer from the so-called mode-collapse phenomenon, in which the generated samples lack diversity compared to the training data. Due to the nature of GANs, mode collapse is likely to occur when we accidentally find an optimal point of the maximin problem rather than of the original minimax problem (Goodfellow, 2016). This suggests that resolving the long-standing open question of whether there exists a first-order method that only converges to (local) optima of minimax problems could address the aforementioned shortcomings. None of the existing methods appears to possess such a property, either theoretically or practically, in contrast to the fact that standard gradient descent successfully finds (local) minima (Lee et al., 2016). Somewhat surprisingly, an appropriate notion of local optimality for nonconvex-nonconcave minimax optimization, one that takes into account the order of minimization and maximization, was only recently proposed by Jin et al. (2020). They also gave a partial answer to the above open question by showing that, under a certain condition, two-timescale gradient descent ascent converges only to strict local minimax optima. However, convergence to general local minimax optima was left mostly unexplored, even though such non-strict local minimax optima are prevalent in practice. Our recent findings (Chae et al., 2023) show that it is indeed possible to find certain non-strict local minimax optima with a two-timescale variant of the extragradient method. This positive result brings renewed attention to the aforementioned open question, which we detail in this paper in light of our recent findings. Furthermore, we aim to revive discussion of the appropriate notion of local minimax optimum, which was initially examined by Jin et al. (2020) but received little attention thereafter, and which we believe is an important piece of answering the open question.
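
As an illustration of the two-timescale idea referenced in the abstract, the following minimal Python sketch runs two-timescale gradient descent ascent on a toy quadratic f(x, y) = x^2 + 2xy - y^2, whose strict local minimax point is (0, 0). The objective, step sizes, and function names are illustrative assumptions of ours; they are not taken from the paper or from (Chae et al., 2023).

# Illustrative sketch (not the paper's algorithm or experiments): two-timescale
# gradient descent ascent (GDA) on a toy quadratic minimax problem
#   f(x, y) = x^2 + 2xy - y^2,
# whose strict local minimax point (in the sense of Jin et al., 2020) is (0, 0).
# "Two-timescale" refers to the maximizer's step size being much larger than
# the minimizer's.

def grad_x(x, y):
    return 2 * x + 2 * y      # df/dx

def grad_y(x, y):
    return 2 * x - 2 * y      # df/dy

def two_timescale_gda(x0, y0, eta_x=0.01, eta_y=0.1, steps=2000):
    """Run GDA with a slow descent step (eta_x) and a fast ascent step (eta_y)."""
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)  # simultaneous gradient evaluation
        x -= eta_x * gx                      # slow descent step for the min player
        y += eta_y * gy                      # fast ascent step for the max player
    return x, y

if __name__ == "__main__":
    print(two_timescale_gda(1.0, -1.0))  # approaches (0, 0) on this toy problem

For this toy problem, taking the ascent step size an order of magnitude larger than the descent step size lets the inner maximization approximately track its optimum before the outer variable moves, which is the intuition behind the two-timescale analysis discussed above.
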
Publisher
ML Research Press
Issue Date
2023-07-15
Language
English
Citation
36th Annual Conference on Learning Theory, COLT 2023, pp. 5957-5964
URI
http://hdl.handle.net/10203/317214
Appears in Collection
MA-Conference Papers (Conference Papers)