R-GAN: Exploring Human-like Way for Reasonable Text-to-Image Synthesis via Generative Adversarial Networks
Date
2021
Authors
Qiao, Y.
Chen, Q.
Deng, C.
Ding, N.
Qi, Y.
Tan, M.
Ren, X.
Wu, Q.
Type:
Conference paper
Citation
Proceedings of the 29th ACM International Conference on Multimedia, 2021, pp.2085-2093
Statement of Responsibility
Yanyuan Qiao, Qi Chen, Chaorui Deng, Ning Ding, Yuankai Qi, Mingkui Tan, Xincheng Ren, Qi Wu
Conference Name
ACM Multimedia Conference (MM) (20 Oct 2021 - 24 Oct 2021 : Virtual online China)
Abstract
Despite recent significant progress on generative models, context-rich text-to-image synthesis depicting multiple complex objects is still non-trivial. The main challenges lie in the ambiguous semantics of a complex description and the intricate scene of an image with various objects, different positional relationships, and diverse appearances. To address these challenges, we propose R-GAN, which can generate reasonable images according to the given text in a human-like way. Specifically, just as humans first find and settle the essential elements to create a simple sketch, we first capture a monolithic-structural text representation by building a scene graph to find the essential semantic elements. Then, based on this representation, we design a bounding box generator to estimate the layout, with the position and size of target objects, and a subsequent shape generator that draws a fine-detailed shape for each object. Different from previous work that only generates coarse shapes blindly, we introduce a coarse-to-fine shape generator based on a shape knowledge base. Finally, to complete the image synthesis, we propose a multi-modal geometry-aware spatially-adaptive generator conditioned on the monolithic-structural text representation and the geometry-aware map of the shapes. Extensive experiments on the real-world dataset MSCOCO show the superiority of our method in terms of both quantitative and qualitative metrics.
Rights
© 2021 Association for Computing Machinery