StyleDrop: Text-to-Image Generation in Any Style

DC Field | Value | Language
dc.contributor.author | Sohn, Kihyuk | ko
dc.contributor.author | Ruiz, Nataniel | ko
dc.contributor.author | Lee, Kimin | ko
dc.contributor.author | Chin, Daniel Castro | ko
dc.contributor.author | Blok, Irina | ko
dc.contributor.author | Chang, Huiwen | ko
dc.contributor.author | Barber, Jarred | ko
dc.contributor.author | Jiang, Lu | ko
dc.contributor.author | Entis, Glenn | ko
dc.contributor.author | Li, Yuanzhen | ko
dc.contributor.author | Hao, Yuan | ko
dc.contributor.author | Essa, Irfan | ko
dc.contributor.author | Rubinstein, Michael | ko
dc.contributor.author | Krishnan, Dilip | ko
dc.date.accessioned | 2023-12-08T01:03:47Z | -
dc.date.available | 2023-12-08T01:03:47Z | -
dc.date.created | 2023-12-07 | -
dc.date.issued | 2023-12-13 | -
dc.identifier.citation | 37th Conference on Neural Information Processing Systems (NeurIPS) | -
dc.identifier.uri | http://hdl.handle.net/10203/316030 | -
dc.description.abstract | Pre-trained large text-to-image models synthesize impressive images with an appropriate use of text prompts. However, ambiguities inherent in natural language and out-of-distribution effects make it hard to synthesize image styles that leverage a specific design pattern, texture or material. In this paper, we introduce StyleDrop, a method that enables the synthesis of images that faithfully follow a specific style using a text-to-image model. The proposed method is extremely versatile and captures nuances and details of a user-provided style, such as color schemes, shading, design patterns, and local and global effects. It efficiently learns a new style by fine-tuning very few trainable parameters (less than 1% of total model parameters) and improving the quality via iterative training with either human or automated feedback. Better yet, StyleDrop is able to deliver impressive results even when the user supplies only a single image that specifies the desired style. An extensive study shows that, for the task of style tuning text-to-image models, StyleDrop implemented on Muse convincingly outperforms other methods, including DreamBooth and textual inversion on Imagen or Stable Diffusion. | -
dc.language | English | -
dc.publisher | Neural Information Processing Systems Foundation | -
dc.title | StyleDrop: Text-to-Image Generation in Any Style | -
dc.type | Conference | -
dc.type.rims | CONF | -
dc.citation.publicationname | 37th Conference on Neural Information Processing Systems (NeurIPS) | -
dc.identifier.conferencecountry | US | -
dc.identifier.conferencelocation | New Orleans Ernest N. Morial Convention Center | -
dc.contributor.localauthor | Lee, Kimin | -
dc.contributor.nonIdAuthor | Sohn, Kihyuk | -
dc.contributor.nonIdAuthor | Ruiz, Nataniel | -
dc.contributor.nonIdAuthor | Chin, Daniel Castro | -
dc.contributor.nonIdAuthor | Blok, Irina | -
dc.contributor.nonIdAuthor | Chang, Huiwen | -
dc.contributor.nonIdAuthor | Barber, Jarred | -
dc.contributor.nonIdAuthor | Jiang, Lu | -
dc.contributor.nonIdAuthor | Entis, Glenn | -
dc.contributor.nonIdAuthor | Li, Yuanzhen | -
dc.contributor.nonIdAuthor | Hao, Yuan | -
dc.contributor.nonIdAuthor | Essa, Irfan | -
dc.contributor.nonIdAuthor | Rubinstein, Michael | -
dc.contributor.nonIdAuthor | Krishnan, Dilip | -
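
The abstract above notes that StyleDrop learns a new style by fine-tuning fewer than 1% of the model's parameters. The sketch below illustrates that parameter-efficient idea in generic PyTorch: a frozen transformer backbone plus a small trainable adapter. The AdapterLayer class, the toy backbone, and the placeholder objective are illustrative assumptions, not the Muse architecture or the authors' implementation.

```python
# Minimal sketch (not the authors' code) of adapter-style fine-tuning:
# freeze a pre-trained backbone and train only a small bottleneck adapter,
# so well under 1% of the parameters are updated.
import torch
import torch.nn as nn

class AdapterLayer(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual add."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 8):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        nn.init.zeros_(self.up.weight)   # start as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))

# Toy frozen backbone standing in for a pre-trained text-to-image transformer.
hidden_dim = 512
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8, batch_first=True),
    num_layers=6,
)
for p in backbone.parameters():
    p.requires_grad = False              # keep the pre-trained weights fixed

adapter = AdapterLayer(hidden_dim)       # the only trainable module

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in backbone.parameters())
print(f"trainable fraction: {trainable / total:.4%}")  # well below 1%

# Training would update only the adapter on images of the target style, e.g.:
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)
tokens = torch.randn(2, 16, hidden_dim)            # placeholder token embeddings
loss = adapter(backbone(tokens)).pow(2).mean()     # placeholder objective
loss.backward()
optimizer.step()
```

With these toy sizes the trainable fraction is far below 1%; the actual adapter placement, size, and training objective used by StyleDrop are described in the paper, not here.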
Appears in Collection: AI-Conference Papers (Conference Papers)
Files in This Item: There are no files associated with this item.
