Learning Object Consistency and Interaction in Image Generation from Scene Graphs

Yangkang Zhang, Chenye Meng, Zejian Li, Pei Chen, Guang Yang, Changyuan Yang, Lingyun Sun

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 1731-1739. https://doi.org/10.24963/ijcai.2023/192

This paper addresses image synthesis conditioned on a scene graph (SG), a set of object nodes connected by edges that encode their interactive relations. We divide existing works into image-oriented and code-oriented methods. Our analysis shows that image-oriented methods do not consider object interaction in the spatial hidden features. Our empirical study, on the other hand, shows that code-oriented methods lose object consistency: their generated images omit certain objects from the input scene graph. To alleviate these two issues, we propose Learning Object Consistency and Interaction (LOCI). To preserve object consistency, we design a consistency module with a weighted augmentation strategy for objects that are easily ignored and a matching loss between scene graphs and image codes. To learn object interaction, we design an interaction module consisting of three kinds of message propagation between the input scene graph and the learned image code. Experiments on the COCO-Stuff and Visual Genome datasets show that our proposed method reduces the omission of objects and outperforms the state of the art in visual fidelity of generated images and objects.
Keywords:
Computer Vision: CV: Neural generative models, auto encoders, GANs  
Computer Vision: CV: Scene analysis and understanding