Summer Millers Ltd × ThirdEye Data

AI Warehouse Ops: Stock Counting

From manual counting and audit cycles to audit-grade AI stock counting.
Image capture, video capture — one engine for both.

≥93% · Accuracy target · Computer vision counting
<2s · Per-count latency · End-to-end per stack
1 · AI engine · For both images & video
15w · PoC duration
2 · Capture modes
7yr · Audit retention
01
Executive Summary
What we propose, at a glance
Proposal v1.0
What this section is
Think of this proposal as your map for the AI counting project. Each section that follows answers one specific question — what problem we're solving, what the system does, how long it takes, what you receive, and what's not in scope.
Three pillars
Image-based
Front + Top photo per stack. Fastest to ship.
Video-based
Short pan + multi-frame voting. Higher robustness.
One AI engine
Same backend, dashboard, audit trail. Both modes share every layer except the camera.
02
Problem Statement
Where manual counting breaks today
6 Pain Points
Why this proposal exists
Manual stock counting is slow, easy to get wrong, and leaves no proof. Below are the six pain points this system is designed to address.
Manual dependency
Every audit cycle requires physical counting. Bottlenecks at peak.
5–15% error rate
Dense stacking + occlusion + tall heights drive routine miscounts.
Audit discrepancies
Reconciliation delays + financial & compliance exposure.
Further constraints
No visual evidence
Manual counts leave no photographic proof to defend.
Doesn't scale
Manual effort scales linearly with warehouse count. Not viable.
Access constraints
Narrow gaps between stacks make side-face photography impractical.
03
Five-layer stack, one engine
Image & video share every layer except the camera
Architecture
Five-stage assembly line
The system runs every time an operator counts a stack. Whether you choose image or video capture, only the first stage changes — the other four are identical.
1
Capture
PWA · no app install
2
Quality Gate
Blur · light · framing
3
AI Counting
YOLO + SAHI
4
Review
Deviation check
5
Record
Audit trail · 7yr
04
Capture — Image vs Video
Pick per warehouse. Same AI engine. Two ways to feed it.
Choice
Option A · Image
Two photos per stack (Front + Top). Fastest to ship. Best for stable lighting and accessible stacks. ~30s per stack for the full operator session, capture included.
Step 1
Open URL · pick stack
Step 2
Capture FRONT photo
Step 3
Capture TOP photo
Steps 4–5
Quality gate → Upload → AI count
Option B · Video
Short 10s pan across stack face. ~300 frames → top 5 → voted count. Higher robustness for tall stacks, partial occlusion, low light.
Step 1
Open URL · pick stack
Step 2
Slow pan across stack face
Step 3
Live overlay guides operator
Steps 4–5
Top 5 frames auto-selected → multi-frame voted count
Hybrid is allowed. Default a warehouse to image, switch to video for tall stacks or monthly audits. Mode is logged per session. Different warehouses can use different modes — no need to standardise across the network.
05/06
Image & Video Pipelines
Front + Top pipeline · Pan, filter, vote
Detail
05 · Image Flow — Front + Top
Two photos in, audit-grade count out. The AI multiplies front (rows × cols) by top (depth). Most sessions complete in well under a minute.
Counting Formula
TOTAL = FRONT (rows × cols) × TOP (depth)
Edge cases
Top not feasible
Operator enters depth manually. Logged as "manual depth entry".
Front blocked
Two partial fronts captured; system stitches and re-runs inference.
Fallen units
Operator enters adjustment with reason code. Both stacks reconcile to net zero.
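The counting formula and the manual-depth edge case above can be sketched in a few lines. This is an illustrative sketch only: the names `StackCount` and `count_stack` are hypothetical, and it assumes the front and top detections have already been reduced to whole-number rows, columns and depth.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StackCount:
    total: int
    depth_source: str  # "top_photo" or "manual depth entry"


def count_stack(rows: int, cols: int,
                top_depth: Optional[int] = None,
                manual_depth: Optional[int] = None) -> StackCount:
    """TOTAL = FRONT (rows x cols) x TOP (depth)."""
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    if top_depth is not None:
        # Normal path: depth comes from the TOP photo.
        depth, source = top_depth, "top_photo"
    elif manual_depth is not None:
        # Edge case "Top not feasible": operator-entered depth, logged as such.
        depth, source = manual_depth, "manual depth entry"
    else:
        raise ValueError("depth required: top photo or manual entry")
    if depth <= 0:
        raise ValueError("depth must be positive")
    return StackCount(total=rows * cols * depth, depth_source=source)


# 8 rows x 5 columns on the front face, 4 units deep -> 160 units.
print(count_stack(rows=8, cols=5, top_depth=4).total)
```

Keeping the depth source on the result is one way to carry the "manual depth entry" flag through to the audit log.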
06 · Video Flow — Frame Funnel
Video replaces two photos with a short slow pan. ~300 raw frames → on-device scoring → top 5 frames → independent AI counts → median voting → 1 final number.
300 raw → 60 scored → 5 picked → 1 count
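The last two funnel steps (pick the best frames, then vote) can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, each frame is assumed to carry a single combined quality score where higher is better, and how the individual signals are weighted into that score is not specified in this proposal.

```python
from statistics import median
from typing import List, Tuple


def pick_top_frames(scored: List[Tuple[int, float]], k: int = 5) -> List[int]:
    """Return the ids of the k highest-scoring frames."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    return [frame_id for frame_id, _ in ranked[:k]]


def vote_count(per_frame_counts: List[int]) -> int:
    """Median of the independent per-frame counts, as a whole number."""
    if not per_frame_counts:
        raise ValueError("need at least one per-frame count")
    return round(median(per_frame_counts))


# Five independent counts; the outlier (140) does not move the median.
print(vote_count([118, 120, 120, 121, 140]))  # 120
```

With an odd number of frames the median is always one of the observed counts, which is why a single bad frame cannot shift the final number.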
Frame scoring signals
Blur · 5 ms · Laplacian variance
Brightness · 2 ms · HSV V-channel mean
Stability · 10 ms · DeviceMotion magnitude
In-frame · 80 ms · ONNX.js small detector
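Two of the signals above, blur and brightness, are simple enough to show directly. The sketch below is pure Python on a small grayscale grid for readability; a production gate would run on-device in the browser. The thresholds are placeholder assumptions, not calibrated values, and the stability and in-frame checks are not covered here.

```python
from statistics import pvariance
from typing import List, Tuple

Gray = List[List[float]]  # grayscale image, values 0-255, row-major


def laplacian_variance(img: Gray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Sharp frames have strong edges, so the variance is high; blurred
    frames give a low value.
    """
    h, w = len(img), len(img[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
        - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def mean_brightness(img: Gray) -> float:
    """Mean pixel value; for a grayscale frame this equals the HSV V mean."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def passes_quality_gate(img: Gray,
                        min_sharpness: float = 50.0,           # assumed value
                        v_range: Tuple[float, float] = (40.0, 220.0)  # assumed
                        ) -> bool:
    """Reject frames that are too blurry, too dark or over-exposed."""
    low, high = v_range
    return (laplacian_variance(img) >= min_sharpness
            and low <= mean_brightness(img) <= high)


flat = [[128] * 4 for _ in range(4)]  # no edges at all
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
print(passes_quality_gate(flat), passes_quality_gate(checker))
```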
07
AI Model — YOLO + SAHI
Why fine-tuning is mandatory (zero-shot = 0.5% detection)
CV Engine
The brain of the system
We use a fine-tuned object-detection model (YOLO family). SAHI slices each image into overlapping tiles so the model can count dense, repetitive stacks reliably. The model is trained on photos of your own stocks.
YOLO + SAHI
YOLOv11-L default. SAHI slices dense stacks into tiles. 50–100 ms / image on CPU.
"stacking_unit" label
Every detected object labelled generically. Same model extends to new stocks.
Fine-tune mandatory
Pre-trained models detect <1% of the objects in dense stacks. Tested and confirmed.
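To make the tiling idea concrete, the sketch below computes the tile windows for an image. It illustrates the geometry only and is not the SAHI library's API: `slice_windows`, the 640-pixel tile size and the 128-pixel overlap are assumptions. In a sliced-inference setup the detector runs on each tile, and the per-tile boxes are then shifted back to full-image coordinates and merged so objects in the overlap are not counted twice.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def _starts(length: int, tile: int, step: int) -> List[int]:
    """Tile start offsets along one axis, covering the full length."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile sits flush with the edge
    return starts


def slice_windows(width: int, height: int,
                  tile: int = 640, overlap_px: int = 128) -> List[Box]:
    """Overlapping tile windows that together cover the whole image."""
    if not 0 <= overlap_px < tile:
        raise ValueError("overlap_px must be in [0, tile)")
    step = tile - overlap_px
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, step)
        for x in _starts(width, tile, step)
    ]


# A 1280x640 photo becomes three overlapping 640x640 tiles.
for window in slice_windows(1280, 640):
    print(window)
```

Smaller tiles make each stacking unit larger relative to the model input, which is the reason tiling helps on dense stacks, at the cost of more inference calls per photo.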
ThirdEye capabilities
Computer vision · Fine-tuning pipelines · SAHI tiling · Edge inference · MLOps (drift detection)
Empirical test
Pre-trained YOLO-World (3.8M general images) ran zero-shot on warehouse stacks — detected 0.5% of objects. Fine-tuning on Summer Millers' own data is not optional.
Accuracy potential
Target accuracy · 93%
Zero-shot baseline · <1%
08
Architecture & Stack Options
All viable options on the table — recommendations marked. Final stack decided together.
Options-First
Logical architecture (cloud-agnostic)
OPERATOR · phone browser · PWA
ADMIN · dashboard · review
LOAD BALANCER · HTTPS · WAF · TLS 1.2+ · rate-limited · JWT auth
API GATEWAY · routing · auth · async · stateless
INFERENCE FLEET · CV model · multi-frame voting · auto-scaling
RELATIONAL DB · sessions · users · audit · HA · read replicas
OBJECT STORE · images · videos · frames · immutable · 7yr retention
MODEL REGISTRY · versioned weights · A/B canary 10% → 100%
Nothing is locked in. Every layer below has multiple viable options. Our badges mark our default recommendation — but the final selection is decided together with Summer Millers based on existing infra, budget, and compliance.
Deployment Target
AWS
Mumbai region · widest service catalog · mature ML tooling.
★ Recommended
Azure
India regions · strong enterprise integration · GPU availability.
Google Cloud
Mumbai region · Vertex AI for managed ML pipelines.
On-Premise
Full data sovereignty · ideal if Summer Millers has existing infra.
Inference Compute
★ Recommended
CPU Instance
~$150–300/mo. Sufficient with ONNX + SAHI for Summer Millers' scale.
GPU Instance
~$400–900/mo. Required only at >100 warehouse scale or sub-second SLA.
Serverless
Pay-per-inference. Good for bursty audit-cycle traffic; cold starts are a trade-off.
Detection Model
★ Recommended
YOLOv11 + SAHI
Best dense-detection balance · open weights · 50–100 ms / image on CPU.
RT-DETR
Transformer-based · stronger on occluded objects · slightly slower.
YOLOv8 / EfficientDet
Mature alternatives · evaluated in PoC bench if needed.
Capture Frontend
★ Recommended
PWA (React / Next.js)
Zero install · works on Android & iOS · easiest rollout.
Native iOS
Required only if iOS Safari APIs limit the PWA video flow.
Native Android
Optional · camera API parity with PWA — rarely needed.
09
Three-tier gate — trust but verify
AI count is never silently changed
Deviation Logic
Trust but verify
Every AI count is sanity-checked against your book stock from the ERP. The AI count is never silently overwritten — even when it disagrees with book stock, the original AI number is what goes into the audit log.
Deviation Formula
| AI Count − Reference | ÷ Reference
< 5%
Auto-Approve
Count locked. No human touch.
5–15%
Admin Review
Flagged · admin approves or rejects.
> 15%
Mandatory Recount
Operator must retake · supervisor notified.
Thresholds calibrated per warehouse during pilot, based on real (AI count, reference) pair distribution. Adjustable in admin config without redeploy.
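The three-tier gate can be expressed as a small function. This is a sketch, not the delivered implementation: the names are hypothetical, the defaults mirror the 5% and 15% bands above, and the treatment of values exactly on a boundary (5% and 15% both go to admin review) is an assumption to be confirmed during pilot calibration.

```python
from enum import Enum


class Gate(Enum):
    AUTO_APPROVE = "auto-approve"
    ADMIN_REVIEW = "admin review"
    MANDATORY_RECOUNT = "mandatory recount"


def deviation(ai_count: int, reference: int) -> float:
    """|AI count - reference| / reference."""
    if reference <= 0:
        raise ValueError("reference (book stock) must be positive")
    return abs(ai_count - reference) / reference


def gate(ai_count: int, reference: int,
         auto_below: float = 0.05, recount_above: float = 0.15) -> Gate:
    """Map a deviation to one of the three tiers.

    Thresholds are parameters so they can be set per warehouse.
    """
    d = deviation(ai_count, reference)
    if d < auto_below:
        return Gate.AUTO_APPROVE
    if d <= recount_above:
        return Gate.ADMIN_REVIEW
    return Gate.MANDATORY_RECOUNT


print(gate(98, 100).value)   # 2% deviation
print(gate(110, 100).value)  # 10% deviation
print(gate(120, 100).value)  # 20% deviation
```

The function only classifies; it never alters `ai_count`, which matches the rule that the original AI number is what goes into the audit log.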
10
PoC → MVP — 15 weeks
5 phases · image first, video added in P3
Timeline
PoC phases
Five phases across 15 weeks of PoC. Every phase has clear sign-off criteria before the next starts — no skip-aheads.
P0
SETUP
Wks 1–2
Warehouse visit · annotation pipeline · cloud + dev env
P1
FIRST COMMODITY
Wks 3–6
500–1,000 images · train YOLO · image flow live · 20 field sessions
P2
EXPAND
Wks 7–10
10+ stocks · model reuse · Dashboard v1
P3
VIDEO + TAG
Wks 11–13
Video PWA · frame funnel · deviation gate live
P4
UAT → MVP
Wks 14–15
UAT with Summer Millers · sign-off + handover
Post-PoC rollout
Pilot · Months 4–6
~10 warehouses · sub-2s latency · accuracy ≥ 93% maintained.
Regional · Months 7–12
~50 warehouses · multi-AZ · read replicas for analytics.
Full Scale · Year 2+
100+ warehouses · auto-scale fleet · optional build-operate-transfer (BOT) handover.
11
What this proposal does NOT cover
Boundary honesty — avoid surprises later
Boundary
Explicitly out of scope
Six things we explicitly do NOT cover in this PoC. If any turn out to be necessary, they can be scoped as separate work — but they're not bundled into the price.
✕ Out of scope
Single front image counting. Not audit-grade. Front + Top (or Front + manual depth) required.
✕ Out of scope
Zero-shot on new stocks. Each new stock needs training data and a fine-tuning cycle.
✕ Out of scope
Irregular / loose / bulk storage. Universal formula assumes repeating stacking units.
✕ Out of scope
Replacing ERP / WMS. Records counts + evidence. Doesn't trigger stock movement or procurement.
✕ Out of scope
Hardware procurement. Phones, tripods, lighting upgrades, connectivity — not in scope.
✕ Out of scope
Long-term model governance. Covered by a separate annual maintenance contract (AMC), not this PoC SOW.
12
What you receive
Per-stock · platform · documentation · IP transfer
Deliverables
Per stock + platform
Per Stock
Trained YOLO model + metrics · annotated dataset · accuracy benchmarking report · weights + config + scripts
Platform
PWA (image + video flows) · Backend API + inference service · Admin dashboard + audit trail · Docker deployment config · Source code + runbook
Documentation + IP
Documentation
System architecture · API reference · annotation standards · MLOps pipeline · operator quick-reference
IP Transfer: Upon final payment, all custom code · AI/ML models · training datasets · prompts · configurations transfer to Summer Millers Ltd. 24-month non-compete in milling / grain processing / stock warehousing.
13
Risk & Mitigation
Each known risk has a defined response
Risk
No surprises
Every project has risks. Below are the ones we already see today, with a specific mitigation paired against each. If new risks surface during PoC, they're added to this list with a mitigation designed before they become problems.
⚠ Risk
Lighting variability across warehouses degrades accuracy. → Collect training data under all observed lighting in PoC; on-device gate rejects under/over-exposed captures.
⚠ Risk
Unreliable network at remote warehouses. → Offline-first PWA queues locally + syncs on reconnect. Pre-test connectivity at commissioning.
⚠ Risk
iOS Safari gaps in video mode (focus lock, MediaRecorder format, background-tab kills). → Profile phone fleet at PoC Week 2. Plan thin native iOS wrapper if non-trivial iOS share.
⚠ Risk
Model drift over time (new suppliers, packaging changes, seasons). → Automated drift detection on rolling 7-day metrics. Targeted fine-tune on drifted data only.
⚠ Risk
Operator resistance. → Show-not-tell: side-by-side AI vs manual for first 20 sessions per warehouse.
⚠ Risk
Cloud lock-in concern. → Architecture is provider-agnostic. Every component maps to AWS / Azure / GCP / on-prem. Migration is weeks, not months.
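For the model-drift mitigation above, one simple form of a rolling 7-day check is sketched below. It is an assumption-heavy illustration: the metric (daily mean deviation between AI count and reference), the baseline and the tolerance are all hypothetical choices, and the actual drift signals would be agreed during the PoC.

```python
from statistics import mean
from typing import Sequence


def drift_flag(daily_deviation: Sequence[float], baseline: float,
               tolerance: float = 0.02, window: int = 7) -> bool:
    """Flag drift when the rolling-window mean exceeds baseline + tolerance."""
    if len(daily_deviation) < window:
        return False  # not enough history yet
    recent = mean(daily_deviation[-window:])
    return recent > baseline + tolerance


# Three normal days followed by a week at 8% mean deviation.
history = [0.03] * 3 + [0.08] * 7
print(drift_flag(history, baseline=0.03))
```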
14
Team Composition
Indicative — refined per final scope
Team
Who delivers this
Indicative composition for the PoC — sizes flex slightly based on final scope and how soon the MVP rollout begins running in parallel with the back half of PoC.
EM / Scrum Master
Part-time across engagement
AI/ML Lead
Model design + accuracy owner
AI Engineers × 2
Annotation · training · eval
Backend Engineer
FastAPI · deviation · audit
Frontend Engineer
PWA + dashboard
DevOps Engineer
Cloud · CI/CD · monitoring
QA Engineer
Field validation · UAT
Embedded with you
Twice-weekly standups
15
Charges
Pricing & commercial terms
Commercial
Pricing
Full and final charges will be available after complete project specifications have been determined. All pricing will be provided in a formal quotation following the scoping exercise.
Next step: Once Summer Millers Ltd confirms the final project scope — warehouse count, stock types, capture mode, deployment target, and rollout timeline — ThirdEye Data will issue a detailed, itemised cost proposal covering PoC, MVP rollout, and ongoing support options.
16
Post-delivery support
Free window + tickets-based
Support
Free support window
2 weeks of free bug-fix support immediately after delivery. Excludes new development, integrations, or feature additions.
2 weeks free · Bug fixes only
Ongoing tickets-based
Post free window. Per-ticket basis with agreed SLAs. New feature work and integrations are separately scoped.
SLA-bound · Per request
17
Live Demo — See the product in action
Simulated end-to-end flow · intermediate stages shown for transparency
Interactive
Interactive walkthrough
A live simulation of what an operator and admin would experience. Numbers and visuals are placeholders. Intermediate model views (quality scoring, SAHI tile slicing, multi-frame voting) are shown here to demonstrate the engine's thinking — in production, the operator sees CAPTURE → COUNT → APPROVE only. Click any step or hit Auto-play.
Working demo available on request. A deployed version at a similar reference site can be walked through live on a call. The simulation above is a faithful representation of the operator + admin experience — same screens, same flow, placeholder values.
18
Closing note
Ready to start immediately on contract approval
Closing
ThirdEye Data × Summer Millers Ltd
Thank you,
Summer Millers.

We can mobilise the team within a week of sign-off and have first stock data collection underway in Week 1. Every architecture decision in this proposal is open for discussion — we recommend, you decide.

Talk to ThirdEye Data →