Computer Vision

Deep learning for visual understanding, object detection, segmentation, and pose estimation.

The Problem

Computer vision systems must interpret the visual world, a problem that remains far from solved in many real-world scenarios:

  • Occlusion and clutter — Real environments are messy. Objects are partially hidden, stacked, and viewed from awkward angles. Handling occlusion robustly is a persistent challenge.
  • Domain shift — Models trained on clean datasets degrade when deployed in new environments with different lighting, camera angles, or object appearances.
  • Fine-grained recognition — Distinguishing between visually similar categories (e.g., defect types on a circuit board) requires models that capture subtle visual differences.
  • 3D understanding — Inferring 3D structure, pose, and spatial relationships from 2D images is essential for robotics and augmented reality, but it is inherently ambiguous because a single image does not preserve depth.

What We're Working On

  • Defect detection — Deep-learning-based systems for automated visual inspection, particularly quality control in printed circuit board (PCB) manufacturing.
  • Pose estimation — Multi-modal approaches to hand-object and pallet pose estimation, enabling precise robotic manipulation in warehouse and industrial settings.
  • Semantic segmentation — Lightweight neural network architectures for real-time scene understanding on mobile robots and embedded devices.

Related Publications

2 papers in Computer Vision
