Part of SplatOverflow — Published at ACM CHI 2025
SplatOverflow is a paper about using 3D Gaussian splats linked to a CAD model as shared artifacts that let remote maintainers debug hardware devices (e.g., 3D printers).
Real-time localization in Gaussian splat scenes and SfM models.
The GIF on the left shows real-time camera tracking within a 3D scene using a standard phone camera.
These ideas were not used in the final paper, but I still think they are worth sharing. They are potential applications of localization combined with Gaussian splats.
Overlaying a 3D hand skeleton on the user's workspace to register precise manual gestures and interactions with the device, and updating the object's CAD model (e.g., printhead movement) based on those physical interactions.
A mobile interface that tracks physical objects in 3D and connects them with the virtual Gaussian splat + CAD model, providing real-time bounding boxes and part labels (e.g., "PART #") dynamically positioned on screen.
Tracking multiple camera angles in one 3D Gaussian splat scene. This could be useful for social AR around a Gaussian-splatted object; e.g., someone remote could watch a 3D scene of multiple people interacting with an object in real time.
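For the part-label idea, once a frame is localized, placing a label comes down to projecting the part's 3D anchor point into the image. Below is a minimal sketch, assuming a pinhole camera with no lens distortion and a world-to-camera pose (R, t). The function name, parameters, and values are hypothetical and not taken from the SplatOverflow code.

```python
def project_label(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates (u, v).

    R is a 3x3 world-to-camera rotation (list of rows), t a 3-vector.
    fx, fy, cx, cy are pinhole intrinsics in pixels.
    Returns None when the point lies behind the camera.
    """
    # Transform the point from world to camera coordinates: p_c = R * p_w + t
    pc = [sum(R[r][c] * point_w[c] for c in range(3)) + t[r] for r in range(3)]
    if pc[2] <= 0:
        return None  # behind the image plane, so no label is drawn
    # Perspective division followed by the intrinsic scaling and offset
    u = fx * pc[0] / pc[2] + cx
    v = fy * pc[1] / pc[2] + cy
    return (u, v)


# Hypothetical example: camera at the world origin looking down +z
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
pixel = project_label((1.0, 0.0, 2.0), identity, [0, 0, 0], 500, 500, 320, 240)
```

The same projection applied to the eight corners of a part's 3D box would give the on-screen bounding box.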
Standard pipelines take video inputs, construct a Structure-from-Motion (SfM) sparse point cloud model via COLMAP, and use that geometry to optimize the final Gaussian Splat.
My system took advantage of this underlying SfM model to compute real-time camera localization, mapping query frames directly to the corresponding 3D splat.
I was able to do this because the overall SplatOverflow system already computes a COLMAP model for the created 3D artifact of the machine.
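Because the splat is optimized in the same coordinate frame as the COLMAP model, a pose estimated against the SfM model can be placed directly in the splat scene. Below is a minimal sketch of that last step, assuming the pose follows COLMAP's convention: a world-to-camera quaternion in (qw, qx, qy, qz) order plus a translation vector, so the camera center in world coordinates is C = -R^T t. The function names are illustrative and not from the SplatOverflow code.

```python
import math


def qvec_to_rotmat(qw, qx, qy, qz):
    """Rotation matrix (list of rows) for a quaternion in (w, x, y, z) order."""
    # Normalize first so slightly non-unit quaternions still give a rotation
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def camera_center(qvec, tvec):
    """Camera position in world coordinates for a world-to-camera pose.

    From x_cam = R * x_world + t, the camera center satisfies
    R * C + t = 0, hence C = -R^T * t.
    """
    R = qvec_to_rotmat(*qvec)
    # (R^T t)[c] = sum over rows r of R[r][c] * t[r]
    return [-sum(R[r][c] * tvec[r] for r in range(3)) for c in range(3)]


# Hypothetical pose: 90 degree rotation about z, translation along x
s = math.sqrt(0.5)
center = camera_center((s, 0.0, 0.0, s), (1.0, 0.0, 0.0))
```

The transpose of R gives the camera's orientation in the world frame, which together with the center is enough to position a virtual camera in a splat viewer.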
High-level overview of my pipeline hooking into standard SfM-based splatting.
To achieve real-time camera tracking inside pre-built Gaussian splat environments, I used Hierarchical Localization (hloc), a coarse-to-fine visual localization approach:
The offline (feature indexing) and online (query image retrieval, 2D-3D matching, and 6-DoF pose estimation) stages of the pipeline.
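The retrieval step of the online stage can be sketched as follows. This is a simplified illustration and not hloc's API: it assumes each mapped image already has a global descriptor, stored in a hypothetical `db_descs` dictionary during the offline stage, and ranks those images by cosine similarity to the query frame's descriptor.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_top_k(query_desc, db_descs, k=2):
    """Return the names of the k database images most similar to the query.

    db_descs maps image name -> global descriptor (built offline).
    """
    ranked = sorted(
        db_descs.items(),
        key=lambda item: cosine(query_desc, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical toy descriptors; real global descriptors have thousands of dimensions
db_descs = {
    "front.jpg": [1.0, 0.0],
    "side.jpg": [0.0, 1.0],
    "front_left.jpg": [0.9, 0.1],
}
candidates = retrieve_top_k([1.0, 0.0], db_descs, k=2)
```

Only the retrieved candidates are then matched against the query with local features. Since those database keypoints are tied to 3D points in the SfM model, the matches become 2D-3D correspondences that a PnP solver with RANSAC can turn into a 6-DoF pose.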
My implementation connected user-captured mobile videos with the static 3D Gaussian splat scene, supporting asynchronous troubleshooting with real-time camera localization.
Floating video screen placement