GSoC 2024 Report - PictoPy
Project Overview
PictoPy was developed during Google Summer of Code 2024 under AOSSIE. The goal was to build a backend for image processing, object detection, and face recognition. Full documentation is available on the project's documentation site.
Phase 1 (Mid-Phase)
Setup and Initial Development
Backend Setup
- Initialized a FastAPI backend with a standard directory structure
- Ran into very large dependency sizes for the object detection models
- Switched to ONNX format, shrinking the environment from about 5 GB (with GPU-enabled PyTorch) to roughly 400 MB
Routing Logic with Parallel Processing
- Implemented non-blocking request handling with FastAPI
- Explored several options for parallel processing (threading, multiprocessing)
- Settled on asyncio for processing multiple images concurrently
- Switched from Uvicorn to Hypercorn for better cross-platform compatibility
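The asyncio approach above can be sketched roughly as follows. This is a minimal illustration, not the project's actual routing code: `detect_objects`, `process_images`, and the `max_concurrent` limit are hypothetical names, and offloading the blocking call with `asyncio.to_thread` is an assumption about how a blocking model call might be kept off the event loop.

```python
import asyncio
import time


def detect_objects(image_path: str) -> dict:
    """Hypothetical stand-in for a blocking model call (e.g. ONNX inference)."""
    time.sleep(0.05)  # simulate blocking work
    return {"path": image_path, "objects": []}


async def process_images(image_paths: list[str], max_concurrent: int = 4) -> list[dict]:
    """Run the blocking detector for every image without blocking the event loop."""
    # Cap the number of images processed at once (assumed limit, tune as needed).
    semaphore = asyncio.Semaphore(max_concurrent)

    async def process_one(path: str) -> dict:
        async with semaphore:
            # asyncio.to_thread runs the blocking function on a worker thread.
            return await asyncio.to_thread(detect_objects, path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(process_one(p) for p in image_paths))


if __name__ == "__main__":
    results = asyncio.run(process_images(["a.jpg", "b.jpg", "c.jpg"]))
    print([r["path"] for r in results])
```

Inside a FastAPI route, the same `process_images` coroutine could simply be awaited, so other requests keep being served while a batch of images is processed.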
Database Design
- Developed schemas for storing image and album information
- Image schema includes file path, object detection results, metadata, and potential face embeddings
- Album schema contains multiple images and album-specific information
- Planned to refine the schemas with more concrete mappings and better edge-case handling
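To make the image and album schemas more concrete, here is one possible layout. It is only a sketch under stated assumptions: the report does not specify the storage engine or column names, so SQLite, the table names, the JSON-encoded columns, and the helpers `open_db` and `add_image` are all hypothetical.

```python
import json
import sqlite3

# Hypothetical schema: one row per image, one row per album, and a link table
# so that an album can contain many images.
SCHEMA = """
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE,
    detections TEXT NOT NULL DEFAULT '[]',
    metadata TEXT NOT NULL DEFAULT '{}',
    face_embeddings TEXT
);
CREATE TABLE albums (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    description TEXT
);
CREATE TABLE album_images (
    album_id INTEGER NOT NULL REFERENCES albums(id) ON DELETE CASCADE,
    image_id INTEGER NOT NULL REFERENCES images(id) ON DELETE CASCADE,
    PRIMARY KEY (album_id, image_id)
);
"""


def open_db(location: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the tables."""
    conn = sqlite3.connect(location)
    # SQLite does not enforce foreign keys unless this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_image(conn: sqlite3.Connection, path: str, detections=None, metadata=None) -> int:
    """Insert an image row; detections and metadata are stored as JSON text."""
    cur = conn.execute(
        "INSERT INTO images (path, detections, metadata) VALUES (?, ?, ?)",
        (path, json.dumps(detections or []), json.dumps(metadata or {})),
    )
    return cur.lastrowid


if __name__ == "__main__":
    conn = open_db()
    image_id = add_image(conn, "photos/cat.jpg", detections=["cat"])
    album_id = conn.execute("INSERT INTO albums (name) VALUES (?)", ("Pets",)).lastrowid
    conn.execute("INSERT INTO album_images VALUES (?, ?)", (album_id, image_id))
    print(conn.execute("SELECT path, detections FROM images").fetchall())
```

The `ON DELETE CASCADE` clauses are one way to handle the edge case where an image is deleted while it still belongs to an album: the link row disappears with it, and the album itself is left intact.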
Phase 2 (Final Phase)
Feature Implementation
Face Embeddings
- Integrated the object detection model
- Implemented face detection and embedding generation
- Evaluated several embedding models (ArcFace, VGG, etc.)
- Selected FaceNet for its balance of model size and performance
- Used an Ultralytics model for object and face detection
- Generated face embeddings with FaceNet
- Converted all models to ONNX format for efficiency
Related PR
PR-34 (Merged)
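Face embeddings are useful because two embeddings of the same person tend to lie close together. The sketch below shows one common way to compare them, cosine similarity on L2-normalized vectors. It is an illustration only: the helper names are hypothetical, the short vectors stand in for real FaceNet outputs (which are much higher-dimensional), and the report does not state which distance measure the project uses.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length so that only its direction matters."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return a value in [-1, 1]; values near 1 mean the embeddings point the same way."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real embeddings.
    face_a = [0.9, 0.1, 0.2]
    face_b = [0.8, 0.2, 0.1]
    face_c = [-0.1, 0.9, -0.4]
    print(cosine_similarity(face_a, face_b) > cosine_similarity(face_a, face_c))
```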
Face Recognition and Schema Updates
- Implemented face clustering using DBSCAN algorithm
- Updated schemas to accommodate clustering results
- Ensured proper handling of image operations (add/delete) affecting clusters
- Completed core project requirements (image database, album database, object detection, face detection, and recognition)
- Tuned the DBSCAN parameters to work well with the ONNX models
Related PR
PR-36 (Merged)
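To illustrate how DBSCAN groups face embeddings into per-person clusters, here is a small from-scratch version of the algorithm. It is a sketch for explanation, not the project's code: the report does not say which DBSCAN implementation or distance metric is used, so Euclidean distance, the `dbscan` function, and the `eps` and `min_samples` values below are all assumptions.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def dbscan(points: list[tuple[float, ...]], eps: float, min_samples: int) -> list[int]:
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    n = len(points)
    labels: list = [None] * n  # None means "not visited yet"

    def neighbors(i: int) -> list[int]:
        # A point counts as its own neighbor, as in the usual definition.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


if __name__ == "__main__":
    # Toy 2-D points standing in for face embeddings.
    embeddings = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (5.0, 5.0), (5.0, 5.1), (10.0, 10.0)]
    print(dbscan(embeddings, eps=0.5, min_samples=2))
```

Because DBSCAN does not need the number of clusters up front, it fits a photo library where the number of distinct people is unknown. Faces that match nobody else end up labeled as noise instead of being forced into a cluster.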
Documentation and API Collection
- Created comprehensive documentation on setup, directory structure, and model architecture
- Developed a Postman Collection for API testing and development
- Used mkdocs-material to build the hosted documentation site
- Provided a Dockerfile for backend containerization
Project Completion Status
- Status: Completed
- Significant Changes: None. The project adhered to the original plan with minor adjustments (e.g., using ONNX Runtime instead of exploring OpenVINO)
Future Plans
- Potential development of an Electron-based frontend
- Continued contribution to the project
- Promotion of PictoPy to leverage its potential and encourage further development