External Knowledge Enhanced 3D Scene Generation from Sketch

dc.contributor.author: Wu, Z.
dc.contributor.author: Feng, M.
dc.contributor.author: Wang, Y.
dc.contributor.author: Xie, H.
dc.contributor.author: Dong, W.
dc.contributor.author: Miao, B.
dc.contributor.author: Mian, A.
dc.contributor.conference: 18th European Conference on Computer Vision (ECCV) (29 Sep 2024 - 4 Oct 2024 : Milan, Italy)
dc.contributor.editor: Leonardis, A.
dc.contributor.editor: Ricci, E.
dc.contributor.editor: Roth, S.
dc.contributor.editor: Russakovsky, O.
dc.contributor.editor: Sattler, T.
dc.contributor.editor: Varol, G.
dc.date.issued: 2025
dc.description.abstract: Generating realistic 3D scenes is challenging due to the complexity of room layouts and object geometries. We propose a sketch-based, knowledge-enhanced diffusion architecture (SEK) for generating customized, diverse, and plausible 3D scenes. SEK conditions the denoising process on a hand-drawn sketch of the target scene and cues from an object relationship knowledge base. We first construct an external knowledge base containing object relationships and then leverage knowledge-enhanced graph reasoning to assist our model in understanding hand-drawn sketches. A scene is represented as a combination of 3D objects and their relationships, and then incrementally diffused to reach a Gaussian distribution. We propose a 3D denoising scene transformer that learns to reverse the diffusion process, conditioned on a hand-drawn sketch along with knowledge cues, to regressively generate the scene including the 3D object instances as well as their layout. Experiments on the 3D-FRONT dataset show that our model improves FID and CKL by 17.41% and 37.18%, respectively, in 3D scene generation, and FID and KID by 19.12% and 20.06%, respectively, in 3D scene completion, compared to the nearest competitor, DiffuScene.
dc.description.statementofresponsibility: Zijie Wu, Mingtao Feng, Yaonan Wang, He Xie, Weisheng Dong, Bo Miao, and Ajmal Mian
dc.identifier.citation: Proceedings, Part VI of the 18th European Conference on Computer Vision (ECCV 2024), as published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2025 / Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (ed./s), vol.15064, pp.286-304
dc.identifier.doi: 10.1007/978-3-031-72658-3_17
dc.identifier.isbn: 9783031726576
dc.identifier.issn: 0302-9743
dc.identifier.issn: 1611-3349
dc.identifier.orcid: Miao, B. [0000-0002-3025-4429]
dc.identifier.uri: https://hdl.handle.net/2440/148014
dc.language.iso: en
dc.publisher: Springer Nature
dc.publisher.place: Cham, Switzerland
dc.relation.grant: http://purl.org/au-research/grants/arc/DP240101926
dc.relation.grant: http://purl.org/au-research/grants/arc/FT210100268
dc.relation.ispartofseries: Lecture Notes in Computer Science; 15064
dc.rights: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025
dc.source.uri: https://doi.org/10.1007/978-3-031-72658-3_17
dc.subject: Scene Generation; Knowledge Enhanced System; Diffusion
dc.title: External Knowledge Enhanced 3D Scene Generation from Sketch
dc.type: Conference paper
pubs.publication-status: Published
