Sketch2Shape: Single-View Sketch-Based 3D Reconstruction via Multi-View Differentiable Rendering

Technical University of Munich

Abstract

In recent years, the creation and manipulation of 3D objects have gained prominence in numerous sectors, from entertainment and interior design to manufacturing. Building on the universal nature of sketches, we present a pipeline that generates 3D objects from 2D sketches. (1) Our approach uses a view-agnostic encoder to map a single-view hand-drawn sketch into the DeepSDF latent space. (2) We use the DeepSDF auto-decoder and differentiable sphere tracing to optimize the latent vector against multi-view silhouettes. (3) Once the final latent vector is found, we obtain the shape by evaluating SDF values on a regular grid and running the marching cubes algorithm. Our method outperforms a retrieval baseline and allows for interactive editing of the shape during the generation process.
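As a minimal sketch of step (3), the snippet below assumes a DeepSDF-style decoder that maps a concatenated latent code and 3D query point to an SDF value; mesh extraction then amounts to sampling the SDF on a regular grid and running marching cubes (here via scikit-image). Names and shapes are illustrative, not the released interface.

import numpy as np
import torch
from skimage import measure

def extract_mesh(decoder, latent, resolution=128, bound=1.0, chunk=65536):
    """Evaluate the SDF on a dense grid and extract the zero level set.

    `latent` is assumed to have shape (1, latent_dim); `decoder` is assumed
    to take (latent, xyz) concatenated along the last dimension, as in DeepSDF.
    """
    # Build a regular grid of query points covering [-bound, bound]^3.
    axis = torch.linspace(-bound, bound, resolution)
    grid = torch.stack(torch.meshgrid(axis, axis, axis, indexing="ij"), dim=-1)
    points = grid.reshape(-1, 3)

    # Query the decoder in chunks to keep memory bounded.
    sdf = []
    with torch.no_grad():
        for pts in torch.split(points, chunk):
            z = latent.expand(pts.shape[0], -1)
            sdf.append(decoder(torch.cat([z, pts], dim=-1)).squeeze(-1))
    volume = torch.cat(sdf).reshape(resolution, resolution, resolution).numpy()

    # Marching cubes on the zero level set; `spacing` maps voxels to world units.
    spacing = 2 * bound / (resolution - 1)
    verts, faces, normals, _ = measure.marching_cubes(
        volume, level=0.0, spacing=(spacing,) * 3)
    return verts - bound, faces  # shift vertices back into [-bound, bound]^3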

Real-Time Sketch Application

The view-agnostic encoder frees artists from having to sketch from a specific view or to provide exact view information, so their creativity is not constrained. Additionally, our approach facilitates an interactive workflow, enabling artists to generate content in real-time and making the creative process more immediate and responsive.
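Interactivity comes from the fact that each canvas update costs only one encoder forward pass. The sketch below illustrates this under stated assumptions: `SketchEncoder` and the checkpoint path are hypothetical placeholders, not a released API.

import torch
from PIL import Image
from torchvision import transforms

# Hypothetical: stands in for the trained view-agnostic encoder.
from sketch2shape import SketchEncoder

encoder = SketchEncoder()
encoder.load_state_dict(torch.load("encoder.ckpt"))
encoder.eval()

preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])

@torch.no_grad()
def sketch_to_latent(canvas: Image.Image) -> torch.Tensor:
    # One forward pass per canvas update keeps the loop real-time.
    sketch = preprocess(canvas).unsqueeze(0)  # (1, 1, 256, 256)
    return encoder(sketch)                    # latent code in the DeepSDF space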

Live-Editing via Differentiable Rendering

Our approach allows for live editing by manipulating the silhouettes after the initial shape has been generated. To this end, we developed an interactive painting tool that lets artists paint over the silhouette to add or remove geometry, or to introduce features not originally present. The modified silhouette then serves as the target for differentiable rendering. This editing functionality applies to both single and multiple views.
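A minimal sketch of this optimization step follows, assuming a differentiable silhouette renderer. Here `render_silhouette` is a hypothetical stand-in for the differentiable sphere tracer: given the decoder, a latent code, and a camera, it returns a soft silhouette in [0, 1].

import torch

def optimize_latent(latent, decoder, cameras, target_silhouettes,
                    steps=200, lr=5e-3):
    """Fit the latent code to painted silhouettes via differentiable rendering.

    `render_silhouette` is a hypothetical helper (see lead-in); its name and
    signature are assumptions for this sketch.
    """
    latent = latent.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([latent], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        loss = 0.0
        # One painted view suffices; additional views constrain the shape further.
        for camera, target in zip(cameras, target_silhouettes):
            rendered = render_silhouette(decoder, latent, camera)
            loss = loss + torch.nn.functional.binary_cross_entropy(
                rendered.clamp(1e-6, 1 - 1e-6), target)
        loss.backward()
        optimizer.step()
    return latent.detach()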

Qualitative Live-Editing Results

At the top are rendered normal images, showing the initial latent code predicted by the encoder and rendered with our differentiable sphere tracer. These images serve as the starting point for our interactive workflow. Next to the normal images are the single- or multi-view edits. Below the edited images are the reconstructed 3D meshes, optimized with differentiable rendering on the silhouettes extracted from the edits. Our approach accurately recovers shapes from the edited images, demonstrating fidelity and robustness in capturing and refining details. Notably, when the initial shape is incorrect, as for the chair on the right, the optimization struggles due to constraints within the latent space. Despite these limitations, our method demonstrates strong capability in shape recovery and creative expression, underscoring its potential for diverse applications in art and design.

Multi-View Robustness

A crucial property of our encoder is that it is trained to be view-agnostic. While sketches of the same object from different views map to approximately similar shapes, perfect consistency across all views is not achieved. Given the inherent ambiguity of sketches, it is unclear whether perfect view-robustness is attainable at all.
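One way to quantify view-robustness, sketched below assuming a generic encoder callable, is to encode sketches of the same shape from several views and measure the pairwise cosine similarity of the predicted latent codes; a value of 1.0 would indicate perfectly view-consistent predictions.

import itertools
import torch

@torch.no_grad()
def view_consistency(encoder, sketches):
    """Mean pairwise cosine similarity of latents predicted per view.

    `sketches` holds drawings of the same object from different views,
    each preprocessed to a (1, C, H, W) tensor.
    """
    latents = [encoder(s).flatten() for s in sketches]
    sims = [torch.nn.functional.cosine_similarity(a, b, dim=0)
            for a, b in itertools.combinations(latents, 2)]
    return torch.stack(sims).mean().item()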

BibTeX

@article{borth2024sketch2shape,
  author    = {Robin Borth and Daniel Korth},
  title     = {Sketch2Shape: Single-View Sketch-Based 3D Reconstruction via Multi-View Differentiable Rendering},
  year      = {2024},
}