yuvraajnarula/Nano-Vision

🚀 Project Nano-Vision

Cross-Architecture Knowledge Distillation & 4-Bit Quantization for Real-Time Edge Inference

📌 Overview

In the era of "Bigger is Better," Project Nano-Vision takes the opposite approach. We tackle the challenge of deploying state-of-the-art Deep Learning models on resource-constrained hardware ("Potatoes").

By implementing Knowledge Distillation (KD), we transfer the "dark knowledge" of a heavy, scratch-built ResNet-50 (Teacher) into a lightweight MobileNetV3-Small (Student). To further bridge the gap between research and production, we apply 4-bit Quantization and ONNX Graph Optimization, achieving real-time inference on standard consumer CPUs.


🧠 The Architecture

1. The Teacher: Custom ResNet-50

Designed specifically for 32×32 input (CIFAR-100), avoiding the spatial resolution loss that standard ImageNet-centric architectures incur through early downsampling.

  • Key Feature: Bottleneck blocks with Identity Shortcuts.
  • Accuracy Target: ~78% on CIFAR-100.
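To see why the stem matters at this input size, here is a rough sketch that traces feature-map side lengths through a ResNet-style network. The standard ImageNet stem (7x7 stride-2 convolution followed by a 3x3 stride-2 max-pool) is well known; the CIFAR stem used here (a single 3x3 stride-1 convolution with no pooling) is an assumption about the custom teacher, not something confirmed by this README. The helper names are hypothetical.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output side length of a square conv/pool layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resolution(size: int, stem: list[tuple[int, int, int]]) -> list[int]:
    """Apply the stem layers, then four ResNet stages with strides 1, 2, 2, 2."""
    sizes = [size]
    for kernel, stride, padding in stem:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    for stage_stride in (1, 2, 2, 2):
        # The 3x3 conv in the first block of each stage sets that stage's resolution.
        size = conv_out(size, 3, stage_stride, 1)
        sizes.append(size)
    return sizes


# Each stem layer is (kernel, stride, padding).
IMAGENET_STEM = [(7, 2, 3), (3, 2, 1)]  # 7x7/2 conv + 3x3/2 max-pool
CIFAR_STEM = [(3, 1, 1)]                # ASSUMED: single 3x3/1 conv, no pooling

imagenet_trace = trace_resolution(32, IMAGENET_STEM)
cifar_trace = trace_resolution(32, CIFAR_STEM)

print("ImageNet stem:", imagenet_trace)
print("CIFAR stem:   ", cifar_trace)
```

With the ImageNet stem, a 32×32 image is already down to 8×8 before the first residual stage and collapses to a single pixel by the last one; under the assumed CIFAR stem the final stage still sees a 4×4 map.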

2. The Student: MobileNetV3-Small

A high-efficiency model utilizing Depthwise Separable Convolutions to minimize FLOPs.

  • Key Feature: Low-latency architecture optimized for CPU-bound environments.
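As a back-of-the-envelope illustration of where the savings come from, the sketch below compares the weight count of a standard convolution with a depthwise separable one (a depthwise k×k convolution followed by a 1×1 pointwise convolution). The channel sizes are arbitrary example values, not layer sizes from this project, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (no bias)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Example layer: 64 -> 128 channels, 3 x 3 kernel (illustrative values only).
standard = standard_conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"reduction: {standard / separable:.1f}x")
```

The ratio of separable to standard cost works out to 1/c_out + 1/k², so for 3×3 kernels the saving approaches 9x as the channel count grows. At stride 1 with "same" padding, multiply-accumulate counts scale by the same factor, since each weight is applied once per output position.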

3. The Knowledge Transfer (The "Secret Sauce")

We don't just train on hard labels. We use Temperature-Scaled KL Divergence to capture the inter-class relationships learned by the teacher.
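To make the idea concrete, here is a minimal pure-Python sketch of temperature scaling and the resulting KL term. It assumes the common Hinton-style convention of multiplying the KL divergence by T² so gradient magnitudes stay comparable across temperatures; the function names and example logits are illustrative and not taken from the project code.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, temperature):
    """T^2-scaled KL between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p_teacher, q_student)


teacher_logits = [8.0, 4.0, 1.0]  # illustrative values
student_logits = [5.0, 3.0, 2.0]

hard = softmax(teacher_logits, temperature=1.0)
soft = softmax(teacher_logits, temperature=4.0)
print("T=1:", [round(p, 3) for p in hard])
print("T=4:", [round(p, 3) for p in soft])
print("KD loss:", distillation_loss(student_logits, teacher_logits, 4.0))
```

At T=1 nearly all of the teacher's probability mass sits on the top class; raising T spreads it across the runner-up classes, which is the inter-class signal the student learns from.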

The Loss Function:
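One widely used way to write such a loss is the Hinton-style objective below. It is given as an assumption about the general form, not as the project's exact definition; in particular the weighting factor α and the T² scaling are conventions that may differ here.

```latex
\mathcal{L}
  = \alpha \, T^{2} \,
    \mathrm{KL}\!\left( \sigma\!\left(\frac{z_t}{T}\right) \,\Big\|\, \sigma\!\left(\frac{z_s}{T}\right) \right)
  + (1 - \alpha) \, \mathrm{CE}\!\left( y, \, \sigma(z_s) \right)
```

Here z_t and z_s are the teacher and student logits, σ is the softmax, T is the temperature, y is the hard label, CE is the cross-entropy, and α in [0, 1] balances the soft and hard targets.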


🛠️ Optimization Pipeline

  1. Logit-based Distillation: Training the student to mimic the teacher's temperature-softened probability distribution.
  2. Quantization-Aware Training (QAT): Simulating low-precision math during training to maintain accuracy after "crushing" weights to INT8/INT4.
  3. ONNX Runtime: Exporting to the hardware-agnostic ONNX format and running it with ONNX Runtime to leverage SIMD instructions on local CPUs.
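The "simulated low-precision math" in step 2 is usually implemented as a quantize-then-dequantize ("fake quantization") pass over the weights. The sketch below shows that round trip in plain Python, assuming symmetric per-tensor quantization; the project's actual scheme (per-channel scales, asymmetric ranges, and so on) is not specified here, and the function names are hypothetical.

```python
def quantize(weights, num_bits=4):
    """Symmetric per-tensor quantization: returns integer codes and the scale."""
    qmax = 2 ** (num_bits - 1) - 1   # 7 for INT4, 127 for INT8
    qmin = -qmax - 1                 # -8 for INT4, -128 for INT8
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0 for _ in weights], 1.0
    scale = max_abs / qmax
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to floats; this is what the forward pass sees in QAT."""
    return [c * scale for c in codes]


def max_error(weights, num_bits):
    """Largest absolute round-trip error at the given bit width."""
    codes, scale = quantize(weights, num_bits)
    restored = dequantize(codes, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


weights = [0.7, -0.3, 0.12, 0.0]  # illustrative values
codes, scale = quantize(weights, num_bits=4)
print("INT4 codes:", codes)
print("INT4 max error:", max_error(weights, 4))
print("INT8 max error:", max_error(weights, 8))
```

During QAT the forward pass uses the dequantized values, so the loss already reflects the rounding error, while gradients are typically passed through the rounding step unchanged (the straight-through estimator).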

📊 Benchmarks (Projected Targets)

| Metric | Teacher (ResNet-50) | Student (Quantized) | Improvement |
| --- | --- | --- | --- |
| Model Size | ~95 MB | ~2.8 MB | ~33x smaller |
| Inference (CPU) | 120 ms / image | 8 ms / image | 15x faster |
| FPS | ~8 FPS | ~120 FPS | ~15x higher (fluid motion) |
| Accuracy | 78.4% | 76.1% | 2.3-point drop |
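As a quick sanity check, the improvement figures can be re-derived from the projected numbers. The inputs below are the rounded targets quoted in this section, so the ratios are approximate; the FPS values assume one image is processed at a time with no batching or pipeline overhead.

```python
# Projected targets, copied from the benchmark figures in this section.
teacher = {"size_mb": 95.0, "latency_ms": 120.0, "accuracy": 78.4}
student = {"size_mb": 2.8, "latency_ms": 8.0, "accuracy": 76.1}

size_ratio = teacher["size_mb"] / student["size_mb"]
speedup = teacher["latency_ms"] / student["latency_ms"]
teacher_fps = 1000.0 / teacher["latency_ms"]   # frames per second = 1000 ms / latency
student_fps = 1000.0 / student["latency_ms"]
accuracy_drop = teacher["accuracy"] - student["accuracy"]  # percentage points

print(f"size ratio:    {size_ratio:.1f}x")
print(f"speedup:       {speedup:.1f}x")
print(f"teacher FPS:   {teacher_fps:.1f}")
print(f"student FPS:   {student_fps:.1f}")
print(f"accuracy drop: {accuracy_drop:.1f} points")
```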

Setup

conda create -n <env_name> python=3.12 -y
conda activate <env_name>
pip install uv
uv pip install -e ".[train,deploy,dev]"
