Sketching has been available to humans as a visual thinking tool for several decades. With the advent of modern sketching technologies, artists use sketches to express and iterate on their ideas. To accelerate sketch-based ideation and illustration workflows, we propose a novel framework, SketchBuddy, which retrieves diverse fine-grained object suggestions to enrich a sketch and coherently inserts them into the scene. SketchBuddy detects objects in the input sketch to estimate the scene context, which is then used for both recommendation and insertion. We propose a novel multi-modal transformer-based framework for obtaining context-aware fine-grained object recommendations. We train a CNN-based bounding box classifier that uses information from the input scene and the recommended objects to infer plausible locations for insertion. While prior works focus on sketches at the object level only, SketchBuddy is the first work in the direction of scene-level sketching assistance. Our extensive evaluations, which compare SketchBuddy against competing baselines across several metrics and assess agreement with human preferences, demonstrate its value in several respects.