ECCV 2024
University of Cambridge¹, XR Labs, Qualcomm Technologies, Inc.²
Unfortunately, we are not able to release the code for FastCAD. However, the first author, Florian Langer, is happy to answer any questions via email at fml35@cam.ac.uk or by setting up a short call. Florian Langer is also offering consultancy services in computer vision, so if that is of interest, please feel free to reach out or visit his website: florian.langer.co.uk.
The input to FastCAD is a point cloud, which may be derived from an RGB-D scan or a noisy scene reconstruction obtained by applying a neural reconstruction method to a video. For each detected object we predict its class p, oriented bounding box parameters b, front-facing side classification f and shape embedding vector w. The front-facing side prediction allows us to choose between the four possible orientations when aligning the CAD model within the oriented bounding box.
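The role of the front-facing side prediction can be illustrated with a small sketch. The code for FastCAD is not released, so everything here is an assumption made for illustration: the function name `cad_yaw_from_box`, the encoding of f as an integer in {0, 1, 2, 3}, and the convention that each class corresponds to one 90-degree step about the vertical axis of the box.

```python
import math


def cad_yaw_from_box(box_yaw: float, front_side: int) -> float:
    """Pick one of the four CAD orientations that fit an oriented box.

    box_yaw    -- yaw of the predicted oriented bounding box, in radians
    front_side -- predicted front-facing side, assumed to be encoded as
                  0, 1, 2 or 3 (hypothetical encoding, not from the paper)
    """
    if front_side not in (0, 1, 2, 3):
        raise ValueError("front_side must be one of 0, 1, 2, 3")

    # Each side of the box differs from the next by a quarter turn.
    yaw = box_yaw + front_side * (math.pi / 2)

    # Wrap the result into [-pi, pi) so angles stay comparable.
    return (yaw + math.pi) % (2 * math.pi) - math.pi


if __name__ == "__main__":
    for side in range(4):
        print(side, round(math.degrees(cad_yaw_from_box(0.0, side)), 1))
```

Without such a prediction, a CAD model could be placed inside the box in any of the four quarter-turn orientations, which is the ambiguity that f resolves.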
We learn a shape embedding space using a contrastive learning setup with two new auxiliary tasks. We use cropped object point clouds from the RGB-D scan as the anchors and associate the point cloud from the annotated CAD model as the positive example and a point cloud from a randomly sampled CAD model of the same category as the negative example. These three point clouds are passed through an encoder to produce embeddings A (anchor), P (positive) and N (negative) to which we apply a triplet loss. Further, we introduce two auxiliary tasks, performing foreground-background segmentation and estimating the shape similarity between the positive and the negative CAD model, which improve the quality of the embeddings.
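For readers unfamiliar with triplet losses, the following is a minimal sketch of the standard formulation applied to the embeddings A, P and N. It is not the authors' implementation: the Euclidean distance, the margin value of 0.5, and the helper names `euclidean`, `triplet_loss` and `retrieve` are assumptions chosen for illustration, and the two auxiliary tasks are not modelled. The `retrieve` helper shows one plausible way an embedding could be used at inference, by nearest-neighbour lookup among CAD model embeddings.

```python
import math
from typing import Dict, Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 0.5) -> float:
    """Standard triplet loss: pull A towards P, push A away from N.

    The margin of 0.5 is an assumed value, not taken from the paper.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    # Zero once the negative is at least `margin` farther than the positive.
    return max(0.0, d_pos - d_neg + margin)


def retrieve(query: Sequence[float],
             cad_embeddings: Dict[str, Sequence[float]]) -> str:
    """Return the id of the CAD embedding closest to the query (assumed usage)."""
    return min(cad_embeddings, key=lambda k: euclidean(query, cad_embeddings[k]))


if __name__ == "__main__":
    A, P, N = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
    print(triplet_loss(A, P, N))  # negative is already far away
    print(triplet_loss(A, N, P))  # roles swapped, so the loss is large
    print(retrieve(A, {"chair_a": P, "chair_b": N}))
```

In this toy example the first triplet already satisfies the margin and contributes no loss, while the swapped triplet is penalised.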
We train and evaluate FastCAD on the Scan2CAD dataset, which annotates ScanNet with CAD models. Visually inspecting our predictions, we find that FastCAD produces accurate CAD model retrievals and alignments from both point clouds and videos.
Quantitatively, we find that FastCAD outperforms competing methods, particularly in the video setting, while being significantly faster.
@inproceedings{FastCAD,
  author    = {Florian Langer and Jihong Ju and Georgi Dikov and Gerhard Reitmayr and Mohsen Ghafoorian},
  title     = {FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos},
  booktitle = {Proc. European Conference on Computer Vision (ECCV)},
  month     = {October},
  year      = {2024},
  address   = {Milan}
}