USE CASE

AI Camera Systems for Broadcast & Cinema

Autonomous camera tracking powered by edge AI — replacing manual operation with intelligent object detection, predictive motion algorithms, and cinematic-quality PTZ control for live production.

THE PROBLEM

Live Broadcast Requires 4-8 Camera Operators per Studio, Costing Networks $2M+ per Year

A typical multi-camera live broadcast studio operates 4-8 cameras, each requiring a dedicated operator. At $40-60K per operator in base salary, plus benefits and shift-scheduling overhead, a single studio faces $250K-500K per year in camera operation costs alone. Networks operating 10-20 studios spend $2M-10M per year on camera personnel, a cost that scales linearly with production volume.

The labor shortage compounds the problem. Experienced camera operators are aging out, and younger talent gravitates toward software and digital production roles. During live events (sports, news, concerts), demand for skilled operators spikes beyond supply, forcing networks to accept lower production quality or pay premium freelance rates.

Existing robotic camera systems (Vinten, Shotoku, Ross) provide motorized PTZ control but require a human operator at the control panel. They automate the physical movement but not the creative decision-making: framing, subject tracking, shot selection, and smooth transitions. True autonomous camera operation requires computer vision, predictive motion algorithms, and broadcast-grade reliability — a combination that demands both FPGA video processing expertise and AI/ML engineering.

4-8 camera operators per studio
$2-10M annual labor cost (multi-studio)
18-22% AI camera market CAGR
$8.2B broadcast equipment market by 2030
THE SOLUTION

AI-Powered Autonomous Camera Tracking System

Promwad delivers an end-to-end autonomous camera tracking system that combines FPGA-based real-time video processing with AI object detection and predictive motion planning. The system mounts on existing PTZ camera heads, converting manual or remote-operated cameras into AI-autonomous units.

The architecture separates real-time control (FPGA) from AI inference (GPU/NPU), ensuring that camera movements remain smooth and broadcast-grade even during complex multi-subject scenes. A director-level API allows human operators to provide high-level instructions ("follow the speaker," "wide shot of panel") while the AI handles framing, tracking, and transitions.
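
As a rough illustration of how that director-level API could look, the sketch below posts high-level shot instructions to a hypothetical REST endpoint. The URL, route, and payload fields are illustrative assumptions, not the shipping interface.

```python
# Hypothetical sketch of a director-level API: high-level instructions in,
# AI handles framing, tracking, and transitions. Endpoint and payload
# schema are assumptions for illustration.
import json
import urllib.request

BASE_URL = "http://tracking-controller.local:8080/api/v1"  # assumed address

def send_instruction(camera_id: str, instruction: dict) -> dict:
    """POST one shot instruction to a camera and return the controller's reply."""
    req = urllib.request.Request(
        f"{BASE_URL}/cameras/{camera_id}/instruction",
        data=json.dumps(instruction).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# "Follow the speaker" on camera 2, medium close-up framing:
send_instruction("cam-2", {"action": "follow", "subject": "active_speaker",
                           "framing": "medium_close_up"})

# "Wide shot of panel" on camera 1:
send_instruction("cam-1", {"action": "static_shot", "preset": "panel_wide"})
```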

Layer 1: Camera Sensor Interface
SDI/HDMI input capture via Lattice CrossLink-NX FPGA. 4K60 frame grabbing with zero-copy DMA to the processing pipeline. Genlock synchronization for multi-camera setups. Metadata extraction (timecode, tally, iris).

Layer 2: Edge FPGA Processing
Real-time video pre-processing: ROI extraction, downscaling for AI inference, motion vector estimation. Lattice or AMD FPGA with hardware-accelerated color space conversion and deinterlacing. Sub-frame latency (<8 ms) for smooth tracking.

Layer 3: AI Tracking Engine
NVIDIA Jetson Orin or Ambarella CV25 for object detection (YOLOv8), pose estimation, and face recognition. Predictive motion model (Kalman filter + LSTM) for anticipatory camera movement. Multi-subject priority scoring based on audio activity and scene context. A minimal tracking sketch follows this list.

Layer 4: Control API & Director Interface
VISCA/IP and NDI PTZ control output. RESTful API for shot programming and scene presets. WebSocket real-time status feed. Integration with production switchers (Ross, Blackmagic, Grass Valley) via GPI and NMOS IS-07.
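
The sketch below illustrates the anticipatory-tracking idea from Layer 3 with a minimal constant-velocity Kalman filter over detector box centers. The production system pairs this with an LSTM motion model; the noise parameters and frame rate here are illustrative assumptions.

```python
# Minimal anticipatory-tracking sketch: a constant-velocity Kalman filter
# over image-plane subject positions. Matrix values and noise levels are
# illustrative assumptions, not tuned production settings.
import numpy as np

class SubjectTracker:
    def __init__(self, dt: float = 1 / 60):   # frame interval at 60 fps
        self.dt = dt
        self.x = np.zeros(4)                   # state: [px, py, vx, vy]
        self.P = np.eye(4) * 100.0             # state covariance
        self.F = np.eye(4)                     # constant-velocity transition
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))              # measurement picks out position
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * 0.01              # process noise (assumed)
        self.R = np.eye(2) * 4.0               # detector noise in px^2 (assumed)

    def update(self, detection_xy) -> None:
        """Fuse one detector measurement (pixel coordinates) into the state."""
        z = np.asarray(detection_xy, dtype=float)
        self.x = self.F @ self.x                          # predict one frame
        self.P = self.F @ self.P @ self.F.T + self.Q
        y = z - self.H @ self.x                           # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)          # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

    def predict_ahead(self, seconds: float):
        """Extrapolate the subject position so PTZ moves lead the action."""
        px, py, vx, vy = self.x                # velocities are in px/second
        return (px + vx * seconds, py + vy * seconds)

tracker = SubjectTracker()
for cx, cy in [(960.0, 540.0), (972.0, 541.0), (985.0, 543.0)]:
    tracker.update((cx, cy))                   # detector box centers, px
print(tracker.predict_ahead(0.05))             # lead the ~50 ms control latency
```
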
BEFORE vs. AFTER

Before vs. After: Studio Camera Operations

Dimension              | Before                                  | After
Personnel Required     | 1 operator per camera (4-8 per studio)  | 1 director overseeing 4-8 AI cameras
Tracking Quality       | Operator-dependent, fatigue-affected    | Consistent AI tracking with predictive motion
Production Scalability | Linear cost increase with cameras       | Marginal cost per additional AI camera
Setup Time             | 2-4 hours for multi-camera rehearsal    | 15-30 minutes for AI scene programming
Revenue Model          | One-time camera system sale             | Hardware + SaaS license + AI model updates
IMPLEMENTATION

Implementation Roadmap

Phase 1: Single-Camera Prototype (4 months)
- FPGA video capture module (SDI input, 1080p60)
- Object detection pipeline (YOLOv8 on Jetson Orin)
- Basic PTZ tracking with Kalman filter smoothing
- VISCA/IP control interface for standard PTZ heads

Phase 2: Multi-Camera MVP (8 months)
- 4K60 support with CrossLink-NX FPGA pipeline
- Multi-camera coordination with genlock sync
- Director API with shot presets and scene programming
- Production switcher integration (GPI + NMOS IS-07)
- Face and pose recognition for subject identification

Phase 3: AI Director Platform (14 months)
- Autonomous shot selection based on scene analysis
- Audio-driven camera switching (speech activity detection; see the sketch after this roadmap)
- Cloud-based analytics and production replay
- SaaS licensing model with per-camera subscription
- NDI and ST 2110 output for IP broadcast workflows
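
As a rough sketch of the audio-driven switching item above, the snippet below scores speech activity per microphone channel with a simple RMS-energy gate and maps the loudest active channel to a camera. The threshold and mic-to-camera wiring are illustrative assumptions; a production system would use a trained voice activity detector.

```python
# Energy-based speech activity scoring per mic channel, used to pick which
# camera's subject is speaking. Threshold and channel mapping are assumed.
from typing import Optional
import numpy as np

MIC_TO_CAMERA = {0: "cam-1", 1: "cam-2", 2: "cam-3"}  # assumed wiring
SPEECH_RMS_THRESHOLD = 0.02                            # assumed, full-scale units

def active_camera(frame: np.ndarray) -> Optional[str]:
    """frame: float32 audio block, shape (channels, samples), range -1..1."""
    rms = np.sqrt(np.mean(frame ** 2, axis=1))         # per-channel energy
    loudest = int(np.argmax(rms))
    if rms[loudest] < SPEECH_RMS_THRESHOLD:
        return None                                    # silence: hold the shot
    return MIC_TO_CAMERA.get(loudest)

# 20 ms block at 48 kHz across three lavalier mics (synthetic test signal):
block = np.random.default_rng(0).normal(0, 0.05, size=(3, 960)).astype(np.float32)
print(active_camera(block))
```
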
EXPECTED OUTCOMES

40-50% personnel cost reduction
75% faster production setup
$5-15K new SaaS revenue per camera per year
95-98% tracking accuracy (vs. manual operation)
<50 ms end-to-end system latency
4 months to a single-camera demo
FREQUENTLY ASKED QUESTIONS

Does the AI fully replace camera operators?

Not entirely. The system replaces per-camera operators with a single director who provides high-level creative instructions to multiple AI cameras simultaneously. For premium live events (sports finals, concerts), a hybrid model — AI tracking with human override — delivers the best results. The value proposition is reducing a 6-person camera crew to 1-2 people, not eliminating human creative judgment.

What camera brands and PTZ heads are supported?

Any PTZ head supporting VISCA/IP or NDI control protocols — including Panasonic, Sony, Canon, and Ross robotics. The FPGA capture module accepts SDI (BNC) and HDMI inputs. The system is designed as a retrofit module, not a replacement for existing camera infrastructure.
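
For a concrete feel of the control path, here is a minimal sketch of driving a pan/tilt move over Sony-style VISCA-over-IP (UDP port 52381). The camera address and speed values are example settings that should be verified against the PTZ head's manual.

```python
# Sketch: pan/tilt drive over VISCA-over-IP. Header = payload type 0x0100
# (VISCA command), payload length, sequence number; then the VISCA bytes.
# Camera IP and speeds below are example values.
import socket
import struct

def visca_packet(seq: int, payload: bytes) -> bytes:
    """Wrap a VISCA command in Sony's VISCA-over-IP UDP framing."""
    return struct.pack(">HHI", 0x0100, len(payload), seq) + payload

def pan_tilt_drive(pan_speed: int, tilt_speed: int,
                   pan_dir: int, tilt_dir: int) -> bytes:
    """8x 01 06 01 VV WW 0p 0q FF; direction 1=left/up, 2=right/down, 3=stop."""
    return bytes([0x81, 0x01, 0x06, 0x01,
                  pan_speed, tilt_speed, pan_dir, tilt_dir, 0xFF])

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
camera = ("192.168.1.50", 52381)   # assumed camera address

# Pan right at moderate speed while holding tilt, then stop both axes.
sock.sendto(visca_packet(1, pan_tilt_drive(0x08, 0x08, 0x02, 0x03)), camera)
sock.sendto(visca_packet(2, pan_tilt_drive(0x08, 0x08, 0x03, 0x03)), camera)
```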

How does this handle fast-moving subjects like sports?

The predictive motion model combines Kalman filtering for trajectory estimation with LSTM neural networks trained on sport-specific movement patterns. The FPGA pre-processing stage provides motion vectors at frame rate, enabling the tracking engine to anticipate movement rather than react to it. End-to-end latency under 50ms ensures broadcast-grade smoothness.
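
A quick back-of-envelope shows why anticipation matters: at 50 ms latency, a purely reactive system aims where a fast-moving subject was, not where it will be. The subject speed and pixel-scale figures below are illustrative assumptions.

```python
# Framing error of a reactive (non-predictive) tracker for a sideline sprint.
LATENCY_S = 0.050          # end-to-end budget stated above
SUBJECT_SPEED_MPS = 8.0    # fast run, ~29 km/h (assumed)
PIXELS_PER_METER = 60.0    # depends on lens and distance (assumed)

lag_px = SUBJECT_SPEED_MPS * LATENCY_S * PIXELS_PER_METER
print(f"Reactive framing error: {lag_px:.0f} px")  # ~24 px behind the subject

# A predictive tracker removes this bias by aiming at the position
# extrapolated LATENCY_S into the future (see predict_ahead earlier).
```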

What about privacy and face recognition regulations?

Face recognition is used only for subject identification within the production context (identifying speakers, panelists, performers). No biometric data leaves the local system. The architecture supports GDPR-compliant modes where face recognition is replaced by clothing/position-based tracking for privacy-sensitive deployments.
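
A privacy-sensitive deployment might express that mode as a configuration toggle along these lines; the key names and values are illustrative assumptions, not the product's actual schema.

```python
# Illustrative privacy-mode configuration for GDPR-sensitive deployments.
TRACKING_CONFIG = {
    "face_recognition": False,          # disable biometric identification
    "subject_matching": "appearance",   # clothing/position-based tracking instead
    "biometric_export": "disabled",     # nothing leaves the local system
    "embedding_retention_s": 0,         # discard match data after the production
}
```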
