Gaussian Splat Localization

Part of SplatOverflow — Published at ACM CHI 2025

SplatOverflow is a paper on using 3D Gaussian Splats linked to a CAD model as shared artifacts that let remote maintainers debug hardware devices (e.g., 3D printers).

Real-time localization in Gaussian splat scenes and SfM models.

The GIF on the left shows real-time camera tracking within a 3D scene using a standard phone camera.

3D Embodied Interaction Concepts

These ideas did not make it into the final paper, but I still think they are worth sharing. They are potential applications of combining localization with Gaussian splats.

Hand Tracking & Interaction

Overlaying a 3D hand skeleton on the user's workspace to register precise manual gestures and interactions with the device, and updating the object's CAD model (e.g., printhead movement) based on those physical interactions.

Mobile AR Annotation

A mobile interface that tracks physical objects in 3D and connects them with the virtual Gaussian splat + CAD model, providing real-time bounding boxes and part labels (e.g., "PART #") dynamically positioned on screen.

Multi-Camera Localization

Tracking multiple cameras, each at a different angle, in one 3D Gaussian Splat scene. This could be useful for social AR around a Gaussian-splatted object; e.g., a remote viewer could watch a 3D scene of multiple people interacting with an object in real time.

Concept interaction sketches

Overview

Gaussian Splatting Reconstruction Pipeline

Standard pipelines take video inputs, construct a Structure-from-Motion (SfM) sparse point cloud model via COLMAP, and use that geometry to optimize the final Gaussian Splat.

My system took advantage of this underlying SfM model to compute real-time camera localization, mapping query frames directly to the corresponding 3D splat.

I was able to do this because the overall SplatOverflow system already computes a COLMAP model for the created 3D artifact of the machine.
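Because the splat is optimized from the SfM model, a pose recovered in SfM coordinates can be reused directly in the splat scene. As a minimal sketch of that step, the snippet below converts a COLMAP-style world-to-camera pose (quaternion in w, x, y, z order plus a translation) into a camera position in world space. The function names are illustrative and are not taken from the SplatOverflow code.

```python
import math


def quat_to_rotmat(qw, qx, qy, qz):
    """Convert a quaternion (w, x, y, z order) to a 3x3 rotation matrix."""
    # Normalize first so slightly non-unit quaternions still give a rotation.
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]


def camera_center(qvec, tvec):
    """World-space camera position C = -R^T t for a world-to-camera pose (R, t)."""
    R = quat_to_rotmat(*qvec)
    # Column c of R dotted with t gives component c of R^T t.
    return [-sum(R[r][c] * tvec[r] for r in range(3)) for c in range(3)]


if __name__ == "__main__":
    # Identity rotation: the camera sits at -t in world coordinates.
    print(camera_center((1.0, 0.0, 0.0, 0.0), (1.0, 2.0, 3.0)))
```

The resulting position (together with the transposed rotation) is what a viewer would need to place a virtual camera in the same coordinate frame as the splat.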

High-Level Gaussian Splat Pipeline

High-level overview of my pipeline hooking into standard SfM-based splatting.

Hierarchical Localization (hloc)

To achieve real-time camera tracking inside pre-built Gaussian splat environments, I used the Hierarchical Localization (hloc) approach, which first retrieves candidate database images and then matches local features against the SfM model:

  • Global Database Speedup: Relies on a global database of image descriptors from the SfM model to rapidly search and narrow down matching regions.
  • Integration: Since the image database is already generated during the initial SplatOverflow reconstruction process, I could use it with no extra prep.
  • Optimization: I optimized PyTorch model preloading and the inference steps around the known global database to reduce localization latency.
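The retrieval idea in the list above can be sketched as a small in-memory index: database descriptors are normalized once at load time, and each query is a cosine-similarity search over them. This is only an illustration in plain Python. The class name, the descriptor format, and the toy 3-dimensional vectors are assumptions, and the actual system relies on hloc and PyTorch models rather than this code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


class GlobalDescriptorIndex:
    """Hypothetical preloaded index of one global descriptor per database image."""

    def __init__(self, descriptors):
        # descriptors: dict mapping image name -> list of floats.
        # Normalizing here means the per-query work is only dot products.
        self.names = list(descriptors)
        self.vectors = [l2_normalize(descriptors[name]) for name in self.names]

    def top_k(self, query, k=2):
        """Return the k database images most similar to the query descriptor."""
        q = l2_normalize(query)
        scores = [sum(a * b for a, b in zip(q, v)) for v in self.vectors]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        return [(self.names[i], scores[i]) for i in order[:k]]


if __name__ == "__main__":
    index = GlobalDescriptorIndex({
        "front.jpg": [1.0, 0.0, 0.0],
        "side.jpg": [0.0, 1.0, 0.0],
        "top.jpg": [0.0, 0.0, 1.0],
    })
    print(index.top_k([0.9, 0.1, 0.0], k=2))
```

Only the retrieved images then need local feature matching, which is where the speedup over matching against every database image comes from.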

Hierarchical Localization Pipeline

The offline (feature indexing) and online (query image retrieval, 2D-3D matching, and 6-DoF pose estimation) stages of the pipeline.
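To make the last online stage more concrete, here is a hedged sketch of how a candidate 6-DoF pose can be scored against 2D-3D matches: each 3D point is projected with a pinhole model, and matches whose reprojection error is below a pixel threshold count as inliers. Robust pose solvers typically use this kind of check internally. The function names, the `(fx, fy, cx, cy)` intrinsics tuple, and the 4-pixel threshold are assumptions for illustration.

```python
import math


def project(R, t, K, point):
    """Project a world point to pixels using a world-to-camera pose (R, t).

    K is (fx, fy, cx, cy). Returns None for points behind the camera.
    """
    xc = [sum(R[r][c] * point[c] for c in range(3)) + t[r] for r in range(3)]
    if xc[2] <= 0:
        return None
    fx, fy, cx, cy = K
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def count_inliers(R, t, K, matches, max_error_px=4.0):
    """Count 2D-3D matches whose reprojection error is within max_error_px."""
    inliers = 0
    for (u, v), point in matches:
        pixel = project(R, t, K, point)
        if pixel is None:
            continue  # a point behind the camera cannot support this pose
        if math.hypot(pixel[0] - u, pixel[1] - v) <= max_error_px:
            inliers += 1
    return inliers


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    K = (500.0, 500.0, 320.0, 240.0)
    matches = [
        ((320.0, 240.0), (0.0, 0.0, 2.0)),  # projects to the image center
        ((570.0, 240.0), (1.0, 0.0, 2.0)),  # consistent with the pose
        ((100.0, 100.0), (1.0, 0.0, 2.0)),  # wrong match, large error
    ]
    print(count_inliers(identity, [0.0, 0.0, 0.0], K, matches))
```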

Spatial Video-to-Splat Alignment

My implementation connects user-captured mobile videos to the static 3D Gaussian Splat scene, localizing frames in real time to support asynchronous troubleshooting.

  • Floating Video Perspective: The local user's video is projected as a floating virtual screen inside the SplatOverflow viewport, aligned to the exact spatial perspective it was captured from.
  • Collaborative Inspection: Remote maintainers can inspect hardware issues (e.g., verifying a 3D printer's axis alignment) in sync with the live splat and the overlaid CAD models.
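One way to place such a floating screen, given a localized pose, is to back-project the four image corners to a fixed depth in front of the camera and transform them into world coordinates. The sketch below assumes a pinhole camera with intrinsics `(fx, fy, cx, cy)`, a world-to-camera pose `(R, t)`, and a hypothetical `depth` parameter for how far the screen floats from the camera. It is not taken from the SplatOverflow viewer.

```python
def screen_corners(R, t, K, width, height, depth=1.0):
    """World-space corners of a video quad placed `depth` units ahead of the camera.

    (R, t) is a world-to-camera pose, K is (fx, fy, cx, cy).
    Corners are returned in top-left, top-right, bottom-right, bottom-left order.
    """
    fx, fy, cx, cy = K
    corners = []
    for u, v in [(0, 0), (width, 0), (width, height), (0, height)]:
        # Back-project the pixel to a point at the chosen depth in camera space.
        x_cam = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
        # Invert x_cam = R X + t, i.e. X = R^T (x_cam - t).
        d = [x_cam[r] - t[r] for r in range(3)]
        corners.append([sum(R[r][c] * d[r] for r in range(3)) for c in range(3)])
    return corners


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    K = (500.0, 500.0, 320.0, 240.0)
    for corner in screen_corners(identity, [0.0, 0.0, 0.0], K, 640, 480, depth=2.0):
        print(corner)
```

Texturing the resulting quad with the video frame makes the footage line up with the splat when viewed from the recovered camera position.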

Video to Splat Connection

Floating video screen placement