External Knowledge Enhanced 3D Scene Generation from Sketch
Date
2025
Authors
Wu, Z.
Feng, M.
Wang, Y.
Xie, H.
Dong, W.
Miao, B.
Mian, A.
Editors
Leonardis, A.
Ricci, E.
Roth, S.
Russakovsky, O.
Sattler, T.
Varol, G.
Type
Conference paper
Citation
Proceedings, Part VI of the 18th European Conference on Computer Vision (ECCV 2024), as published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2025 / Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (ed./s), vol.15064, pp.286-304
Statement of Responsibility
Zijie Wu, Mingtao Feng, Yaonan Wang, He Xie, Weisheng Dong, Bo Miao, and Ajmal Mian
Conference Name
18th European Conference on Computer Vision (ECCV) (29 Sep 2024 - 4 Oct 2024 : Milan, Italy)
Abstract
Generating realistic 3D scenes is challenging due to the complexity of room layouts and object geometries. We propose a sketch-based, knowledge-enhanced diffusion architecture (SEK) for generating customized, diverse, and plausible 3D scenes. SEK conditions the denoising process on a hand-drawn sketch of the target scene and on cues from an object relationship knowledge base. We first construct an external knowledge base containing object relationships and then leverage knowledge-enhanced graph reasoning to help our model understand hand-drawn sketches. A scene is represented as a combination of 3D objects and their relationships, and is then incrementally diffused to reach a Gaussian distribution. We propose a 3D denoising scene transformer that learns to reverse the diffusion process, conditioned on a hand-drawn sketch along with knowledge cues, to regressively generate the scene, including the 3D object instances as well as their layout. Experiments on the 3D-FRONT dataset show that, compared to the nearest competitor DiffuScene, our model improves FID and CKL by 17.41% and 37.18%, respectively, in 3D scene generation, and FID and KID by 19.12% and 20.06%, respectively, in 3D scene completion.
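The abstract describes a scene being incrementally diffused until it reaches a Gaussian distribution. As a rough illustration of that forward process only, the following is a minimal sketch assuming a standard DDPM-style linear noise schedule. The schedule, the function names (`linear_alpha_bars`, `diffuse`), and the flattened toy scene vector are all assumptions made for this example and are not taken from the paper; sketch conditioning, knowledge cues, and the denoising transformer are not modeled here.

```python
import math
import random


def linear_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def diffuse(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; returns (x_t, noise)."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(x0, noise)]
    return xt, noise


# Hypothetical flattened scene: two objects, each (x, y, z, size); toy values only.
scene = [0.5, 0.0, -0.3, 1.2, -0.8, 0.0, 0.4, 0.6]
alpha_bars = linear_alpha_bars(1000)
rng = random.Random(0)

x_early, _ = diffuse(scene, 10, alpha_bars, rng)       # still close to the scene
x_late, noise_late = diffuse(scene, 999, alpha_bars, rng)  # close to pure noise
```

Under this assumed schedule, the cumulative signal weight in `alpha_bars` shrinks toward zero at the final step, so `x_late` is dominated by Gaussian noise. A learned denoiser would be trained to undo these steps, which is the part the paper's conditioned transformer addresses.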
Rights
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025