📐 Projection-to-Geometry Learning for Industrial Vision Measurement
Overview
This project focuses on recovering real-world geometric dimensions from monocular visual observations under perspective distortion.
It aims to bridge the gap between projective geometry and true physical measurements, enabling accurate and robust dimension estimation using low-cost vision sensors.
🔍 Problem Statement
In industrial scenarios, objects captured by cameras suffer from:
- Perspective distortion (foreshortening effect)
- View-dependent geometric deformation
- Nonlinear mapping between image features and real dimensions
Traditional methods based on explicit geometric modeling are often:
- Sensitive to camera pose
- Difficult to generalize across setups
- Limited under real-world noise and depth uncertainty
⚙️ Method
We propose a geometry-constrained learning framework that integrates classical vision geometry with data-driven regression:
1. Planar Reconstruction

- Extract reference plane using RANSAC
- Establish a normalized plane coordinate system
- Reduce 3D problem to structured 2D geometry
2. Perspective Normalization
- Apply homography-based transformation
- Align image observations to a canonical top-down view
3. Feature Extraction

- Use object detection (e.g., YOLO) to obtain:
- Projected position
- Projected orientation (yaw)
- Projected size (length & width)
4. Learning-based Inverse Mapping

We learn a nonlinear mapping:
- Input: projected geometric features
- Output: true physical dimensions and pose
- Model: lightweight regression network (e.g., ResMLP)
🧠 Key Insight
Instead of explicitly modeling the full projection pipeline,
we treat the problem as an inverse geometry learning task, where:
- Geometry provides structural constraints
- Learning handles nonlinear residual errors
This significantly improves robustness under real-world conditions.
📊 Results
- Achieved millimeter-level accuracy (~2 mm error)
- Strong generalization across varying camera poses
- Stable performance under noise and partial observation
🚀 Contributions
- A unified pipeline combining:
- Planar geometry reconstruction
- Perspective normalization
- Learning-based error compensation
- A data-driven alternative to traditional calibration-heavy methods
- Demonstration of geometry + learning hybrid paradigm for measurement tasks
📌 Keywords
Computer Vision · Geometric Reconstruction · Error Compensation · Industrial Measurement · Learning-based Inverse Mapping