Your shortcut from cloud AI to working MCU demos
From ASR to multimodal fusion — technologies packaged as evaluation kits, ready for PoC
Technology Guides
Technology | What it is | Why OEMs need it | Applications | PoC path |
---|---|---|---|---|
ASR (Automatic Speech Recognition) | Offline speech-to-text optimized for MCUs. | Natural voice input without cloud. | Smart appliances, toys & EdTech, IoT. | VoxControl Kit (ESP32-S3) — includes ASR demos (commands, AI Teacher, VoicePIN). |
TTS (Text-to-Speech) | Offline speech synthesis generating natural voice. | Devices “talk back”: status, feedback, narration. | Appliances, toys, accessibility. | TTS Demo (EN) — standalone offline TTS model for MCUs. |
TinyLM (Lightweight Language Model) | Compact model for dialog, intent, storytelling. | Brings offline interaction closer to cloud assistants. | EdTech, assistants, toys. | TinyStories demo or request a custom TinyLM. |
Critical Sound Detection (SoundGuard) | Offline detection of sirens, gunshots, glass break, screams. | Always-on safety without cloud costs. | Smart city, industrial, home security. | SoundGuard mini demo. |
VoicePIN | Recognition of spoken numeric PINs offline. | Secure access without biometrics. | Gates, doors, IoT, backup auth. | VoicePIN demo (in VoxControl Kit). |
VWW (Visual Wake Word) | Vision model detecting presence. | Wakes devices only when needed, saving energy. | Smart cameras, robots, access. | When Your Camera Thinks Before It Shoots (Hackster). |
SoundWW (Sound Wake Word) | Always-on audio trigger. | Low-power audio gating and wake-up. | Smart speakers, toys, industrial. | Multimodal Kit. |
Fusion Logic | Integration of audio + vision channels. | Multimodality reduces false alarms, enables Pro SKUs. | Pro cameras, robots, secure access. | Multimodal PoC. |
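
To illustrate how combining audio and vision channels can reduce false alarms, here is a minimal sketch of one possible fusion rule: an audio trigger counts only if a vision detection confirms it within a short time window. The `Event` type, the `fuse` function, the 500 ms window, and the 0.6 threshold are all hypothetical choices for illustration, not the Fusion Logic API shipped in the kits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A single detector hit (hypothetical structure, for illustration only)."""
    t_ms: int      # timestamp in milliseconds
    score: float   # detector confidence in [0, 1]


def fuse(audio, vision, window_ms=500, threshold=0.6):
    """Return timestamps of audio events confirmed by a vision event.

    An audio event is confirmed when both detectors exceed `threshold`
    and their timestamps differ by at most `window_ms`.
    """
    confirmed = []
    for a in audio:
        if a.score < threshold:
            continue  # weak audio hit: ignore
        for v in vision:
            if v.score >= threshold and abs(a.t_ms - v.t_ms) <= window_ms:
                confirmed.append(a.t_ms)
                break  # one confirmation is enough
    return confirmed


# Example with made-up events: only the first audio hit has a nearby vision hit.
audio_events = [Event(1000, 0.9), Event(5000, 0.8), Event(9000, 0.3)]
vision_events = [Event(1200, 0.7), Event(7000, 0.9)]
alarms = fuse(audio_events, vision_events)
```

With this rule, a loud sound with no matching visual presence is dropped rather than raised as an alarm. A real product would tune the window and thresholds per use case.
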
Evaluation Kits — Fast Track to PoC
What you receive (always):
- Hardware board (reference SoC)
- Pre-installed, working SW IP on the board (demos ready out of the box)
- Documentation pack (datasheet, quick-start, integration notes)
- API description (functions, I/O, timing, limits)
- Metrics sheet (latency, RAM/Flash, power on real silicon)
Voice Kit
- Inside: ASR (EN), TTS (EN), TinyLM.
- For OEMs: Validate dialogs, command flows, educational scenarios.
- Deliverables: Board + running SW IP, docs, API; sample content; demo video.
Security Kit
- Inside: Critical Sound Detection + VoicePIN.
- For OEMs: Add safety + secure offline access.
- Deliverables: Board + running SW IP, docs, API; tuned models; config tool; demo video.
Multimodal Kit
- Inside: VWW + SoundWW + Fusion Logic.
- For OEMs: Reduce false alarms, save energy, create Pro SKUs.
- Deliverables: Board + running SW IP, docs, API; fusion templates; metrics pack; demo video.
Each kit boots to a working demo on first power-up; no SDK deep-dive required. Binaries for re-flash and configuration files are included in the docs pack.
For Engineers — what you need to know
- What is it? Clear description of each technology (ASR, TTS, TinyLM, Critical Sound Detection, VoicePIN, wake words, fusion).
- How do I integrate it with my hardware? Supported SoCs (Arm® Cortex®-M55 + Arm® Ethos™-U55, ESP32-S3 for PoC), delivery format (ready binaries + simple APIs), defined memory and power budgets.
- Do I need to spend weeks learning? No — delivered as drop-in blocks with guides, not raw SDKs.
- Is it real and proven? Validated on Himax HX6538, Alif Ensemble, plus Hackster demos and evaluation kits.
- Has anyone else tried it? Yes — community projects and open demos are already public.
- Can I trust the numbers? Latency, RAM/Flash, and power usage are measured on real silicon and are reproducible.
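
As a rough sketch of how an engineering team might use a metrics sheet, the snippet below compares measured figures against a product budget and lists any metric that exceeds it. The `Metrics` fields, the `over_budget` helper, and all numbers are assumptions made for illustration; they are not taken from an actual metrics sheet.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Metrics:
    """Figures of the kind a metrics sheet reports (hypothetical layout)."""
    latency_ms: float
    ram_kb: float
    flash_kb: float
    power_mw: float


def over_budget(measured, budget):
    """Return the names of metrics where the measured value exceeds the budget."""
    return [
        f.name
        for f in fields(Metrics)
        if getattr(measured, f.name) > getattr(budget, f.name)
    ]


# Placeholder values only: replace with numbers from the kit's metrics sheet.
measured = Metrics(latency_ms=120, ram_kb=300, flash_kb=900, power_mw=40)
budget = Metrics(latency_ms=150, ram_kb=256, flash_kb=1024, power_mw=50)
violations = over_budget(measured, budget)
```

Here only RAM exceeds the budget, so the check points directly at the constraint to revisit before committing to a SoC.
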
For Product Managers — what you need to know
- Time-to-PoC: From idea to demo in weeks, not months.
- Reduced risk: No need to build ML teams — functions are packaged and validated.
- Clear ROI: Early demos de-risk bigger product investments and impress stakeholders.
- Future roadmap: What works today on U55 scales tomorrow on U85 — same flow, bigger tasks.
- Proof that builds trust: Makers and engineers have already tested these blocks in real-world demos.
Why It Matters
- Reduced time-to-PoC: skip raw model adaptation, start from working kits.
- Predictable performance: latency, memory, and power budgets are measured, not estimated.
- OEM focus: you validate complete functions, not SDK toolchains.