Computer Vision Project
Computer Vision Programming Project: Urban Traffic Sign Recognition and Analysis System
Project Overview
This programming project challenges students to develop a comprehensive computer vision system for analyzing urban traffic signs from real-world street scene imagery. Students will apply classical computer vision techniques, feature extraction methods, traditional machine learning classifiers, and modern deep learning approaches to build a robust traffic sign detection and recognition pipeline. The project uses the Mapillary Traffic Sign Dataset (https://www.mapillary.com/dataset/trafficsign), which contains over 100,000 high-resolution images from diverse global locations with varying weather conditions, lighting, viewpoints, and camera sensors, making it an ideal testbed for advanced computer vision methods.
Dataset Description
Mapillary Traffic Sign Dataset specifications:
- Size: 100,000+ high-resolution street scene images
- Annotations: 320,000+ labeled traffic signs with bounding boxes
- Classes: 300+ distinct traffic sign categories (students will focus on a subset)
- Diversity: Global coverage with images from multiple countries, weather conditions, times of day, and camera types
- Annotation quality: Fully annotated bounding boxes with class labels
- Access: Publicly available for academic use
The image dataset is available on ARC in /data/project/MSA8395/mapillary_traffic_sign_dataset
Project Requirements
Phase 1: Dataset Preparation, Preprocessing & Region Proposal
Timeline: Week 1
Part A: Dataset Curation and Exploration
- Dataset Sampling: From the full Mapillary dataset, create a focused subset:
- Identify the 5 most frequent traffic sign classes
- Extract 100 images per class containing these signs
- Ensure balanced class distribution (or document imbalance strategy)
- Split data: 70% training, 15% validation, 15% test
- Exploratory Analysis:
- Analyze class distributions and sign size variations
- Visualize sample images showing different conditions (weather, lighting, occlusion)
- Document challenges: scale variation, multiple signs per image, background complexity
- Examine bounding box annotations and prepare ground truth data
Deliverable: Dataset preparation script and exploratory analysis notebook with visualizations.
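The sampling and splitting steps above can be sketched as follows. This is a minimal illustration, assuming annotations have already been loaded into a simple dict mapping image IDs to lists of sign class names (a hypothetical structure; adapt it to the Mapillary annotation JSON format):

```python
import random
from collections import Counter

def split_by_class(image_labels, top_k=5, per_class=100, seed=0):
    """Pick the top_k most frequent classes and split images 70/15/15.

    image_labels: dict mapping image id -> list of sign class names
    (hypothetical structure; adapt to the actual annotation files).
    """
    counts = Counter(c for labels in image_labels.values() for c in labels)
    top = {c for c, _ in counts.most_common(top_k)}

    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for cls in top:
        # Images containing multiple top classes will appear under each;
        # document how you handle this overlap in your imbalance strategy.
        imgs = [i for i, labels in image_labels.items() if cls in labels]
        rng.shuffle(imgs)
        imgs = imgs[:per_class]
        n_train, n_val = int(0.7 * len(imgs)), int(0.15 * len(imgs))
        splits["train"] += [(i, cls) for i in imgs[:n_train]]
        splits["val"] += [(i, cls) for i in imgs[n_train:n_train + n_val]]
        splits["test"] += [(i, cls) for i in imgs[n_train + n_val:]]
    return top, splits
```

Splitting per class (rather than over the pooled set) keeps the 70/15/15 ratio approximately balanced within each class.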
Part B: Preprocessing Pipeline
- Color Space Analysis: Convert images to RGB, HSV, and Lab color spaces. Analyze which color space best isolates traffic signs from complex urban backgrounds (sky, buildings, vegetation).
- Image Enhancement:
- Implement adaptive histogram equalization (CLAHE) for lighting normalization
- Apply bilateral filtering for noise reduction while preserving edges
- Test preprocessing on challenging images (nighttime, shadows, rain)
- Sign Region Extraction: Use bounding box annotations to extract sign regions. Implement padding strategy to include context around signs.
- Standardization: Resize extracted signs to uniform dimensions (e.g., 64x64 or 128x128 pixels), and account for aspect ratio (e.g., direct resize vs. letterbox padding).
Deliverable: Preprocessing pipeline that outputs enhanced, standardized sign images.
Part C: Classical Region Proposal
Develop a classical computer vision pipeline to detect sign candidates without using ground truth bounding boxes:
- Color-Based Detection:
- Use HSV color space to detect red, blue, and yellow regions (common sign colors)
- Apply morphological operations (opening, closing) to clean detections
- Generate candidate regions using connected components
- Edge and Shape Detection:
- Apply Canny edge detection to find sign boundaries
- Use Hough Circle Transform to detect circular signs
- Use Hough Line Transform to detect rectangular/triangular sign boundaries
- Combine edge and shape information to propose sign candidates
- Region Filtering:
- Filter candidates by size, aspect ratio, and location
- Score candidates based on edge strength and color consistency
- Implement Non-Maximum Suppression to eliminate overlapping proposals
- Evaluation: Compare proposed regions against ground truth using Intersection over Union (IoU). Calculate precision, recall, and F1-score for region proposal at IoU thresholds of 0.3, 0.5, and 0.7.
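The IoU-based evaluation can be implemented with plain Python; the sketch below uses greedy one-to-one matching of proposals to ground truth (other matching schemes, such as sorting proposals by score first, are also reasonable and worth documenting):

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def proposal_scores(proposals, ground_truth, threshold=0.5):
    """Precision, recall, F1 at one IoU threshold (greedy 1-to-1 matching)."""
    matched = set()
    tp = 0
    for p in proposals:
        best, best_iou = None, threshold
        for gi, g in enumerate(ground_truth):
            if gi in matched:
                continue  # each ground-truth box matches at most once
            v = iou(p, g)
            if v >= best_iou:
                best, best_iou = gi, v
        if best is not None:
            matched.add(best)
            tp += 1
    precision = tp / len(proposals) if proposals else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```

Run `proposal_scores` three times with `threshold` set to 0.3, 0.5, and 0.7 to produce the required table.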
Expected Outcomes
By completing this project, students will produce:
Technical Deliverables
- Complete Python Codebase:
- Dataset preparation and sampling scripts
- Preprocessing and enhancement pipeline
- Classical region proposal implementation
- Feature extraction module (HOG, color, ORB)
- Classical ML training and evaluation scripts
- Well-organized, documented, modular code
- Trained Models:
- Classical models: SVM and Random Forest with engineered features
- Documentation:
- Jupyter notebooks for each project phase
- Code comments and markdown explanations
- Hyperparameter choices with justifications
- Experimental observations and insights
- Visual Results:
- Dataset exploration visualizations
- Preprocessing pipeline effects
- Region proposal results with IoU analysis
- HOG visualizations and feature space plots (t-SNE, PCA)
- Confusion matrices for all classifiers
- Example detection and classification results, including misclassification examples
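As a starting point for the feature extraction module, HOG descriptors can be computed with scikit-image as below. The HOG parameters shown are common defaults, not values tuned for this dataset, and the commented training step is a hypothetical usage assuming `X_train`/`y_train` come from your preprocessing pipeline:

```python
import numpy as np
from skimage.feature import hog

def hog_features(gray_images):
    """Stack one HOG descriptor per 64x64 grayscale sign crop."""
    return np.array([
        hog(img, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2), block_norm="L2-Hys")
        for img in gray_images
    ])

# Hypothetical training step on top of the features:
#   from sklearn.svm import SVC
#   from sklearn.ensemble import RandomForestClassifier
#   feats = hog_features(X_train)
#   svm = SVC(kernel="rbf", C=10).fit(feats, y_train)
#   rf = RandomForestClassifier(n_estimators=200).fit(feats, y_train)
```

The same feature matrix feeds both classifiers, which makes the SVM vs. Random Forest comparison a clean one.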
Grading Rubric
- Phase 1 (Data Preparation & Region Proposal): 35%
- Dataset curation and exploration: 10%
- Preprocessing pipeline: 10%
- Classical region proposal and evaluation: 15%
- Phase 2 (Feature Extraction & Models): 40%
- Classical features and ML classification: 40%
- Phase 3 (Analysis & Report): 25%
- End-to-end system comparison: 10%
- Failure analysis with visualizations: 8%
- Insights and recommendations: 7%
- Code Quality & Documentation: 10%
- Code organization and modularity
- Documentation clarity and completeness
- Reproducibility with clear instructions
- Bonus (up to 10%):
- Ensemble methods combining approaches
- Exceptional visualizations and analysis depth
- Testing on additional challenging scenarios
Resources and Tools
Required Python Libraries:
- OpenCV (cv2): Core image processing and computer vision
- scikit-image (skimage): Advanced image processing algorithms
- scikit-learn (sklearn): Classical ML, metrics, dimensionality reduction
- NumPy/Pandas: Numerical operations and data management
- Matplotlib/Seaborn: Visualization and plotting
- TensorFlow or PyTorch: Deep learning framework
- Albumentations: Advanced image augmentation
- Ultralytics (optional): For YOLO implementation
Computing Resources:
- GPU Access: Google Colab Pro or university GPU cluster recommended
- RAM: Minimum 16GB for handling high-resolution images
- Storage: 50-100GB for full dataset (10-20GB for subset)
Dataset Access:
- Mapillary Traffic Sign Dataset: Available at https://www.mapillary.com/dataset/trafficsign or via https://labelbox.com/datasets/mapillary-traffic-sign-dataset/
- The image dataset is available on ARC in /data/project/MSA8395/mapillary_traffic_sign_dataset
Submission Requirements
- Code Repository: Organized directory structure:
- Create a project on the GitLab server https://git.insight.gsu.edu
- Add the instructor to your project (as developer)
- Continuously update the repo as you work on your project
project/
├── notebooks/
│ ├── 01_data_preparation.ipynb
│ ├── 02_preprocessing_region_proposal.ipynb
│ ├── 03_classical_ml.ipynb
│ ├── 04_deep_learning.ipynb
│ └── 05_analysis.ipynb
├── src/
│ ├── preprocessing.py
│ ├── region_proposal.py
│ ├── features.py
│ ├── models.py
│ └── evaluation.py
├── models/
│ ├── svm_model.pkl
│ ├── rf_model.pkl
│ └── deep_model.pth
├── results/
│ ├── figures/
│ └── metrics/
├── README.md
└── requirements.txt
- Final Report: PDF document with:
- Abstract/Executive Summary
- Introduction and dataset description
- Methodology sections for each approach
- Results with comprehensive tables and figures
- Comparative analysis and discussion
- Failure analysis with examples
- Conclusions and recommendations
- References
- README: Clear instructions for:
- Environment setup and dependencies
- Dataset download and preparation
- Running each phase of the project
- Reproducing reported results