Your shortcut from cloud AI
to working MCU demos

From ASR to multimodal fusion — technologies packaged as evaluation kits, ready for PoC

Technology Guides

| Technology | What it is | Why OEMs need it | Applications | PoC path |
|---|---|---|---|---|
| ASR (Automatic Speech Recognition) | Offline speech-to-text optimized for MCUs. | Natural voice input without cloud. | Smart appliances, toys & EdTech, IoT. | VoxControl Kit (ESP32-S3); includes ASR demos (commands, AI Teacher, VoicePIN). |
| TTS (Text-to-Speech) | Offline speech synthesis generating natural voice. | Devices “talk back”: status, feedback, narration. | Appliances, toys, accessibility. | TTS Demo (EN): standalone offline TTS model for MCUs. |
| TinyLM (Lightweight Language Model) | Compact model for dialog, intent, storytelling. | Brings offline interaction closer to cloud assistants. | EdTech, assistants, toys. | TinyStories demo or request a custom TinyLM. |
| Critical Sound Detection (SoundGuard) | Offline detection of sirens, gunshots, glass break, screams. | Always-on safety without cloud costs. | Smart city, industrial, home security. | SoundGuard mini demo. |
| VoicePIN | Recognition of spoken numeric PINs offline. | Secure access without biometrics. | Gates, doors, IoT, backup auth. | VoicePIN demo (in VoxControl Kit). |
| VWW (Visual Wake Word) | Vision model detecting presence. | Wakes devices only when needed, saving energy. | Smart cameras, robots, access. | When Your Camera Thinks Before It Shoots (Hackster). |
| SoundWW (Sound Wake Word) | Always-on audio trigger. | Low-power audio gating and wake-up. | Smart speakers, toys, industrial. | Multimodal Kit. |
| Fusion Logic | Integration of audio + vision channels. | Multimodality reduces false alarms, enables Pro SKUs. | Pro cameras, robots, secure access. | Multimodal PoC. |

Evaluation Kits — Fast Track to PoC

What you receive (always):
  • Hardware board (reference SoC)
  • Pre-installed, working SW IP on the board (demos ready out of the box)
  • Documentation pack (datasheet, quick-start, integration notes)
  • API description (functions, I/O, timing, limits)
  • Metrics sheet (latency, RAM/Flash, power on real silicon)
Voice Kit
  • Inside: ASR (EN), TTS (EN), TinyLM.
  • For OEMs: Validate dialogs, command flows, educational scenarios.
  • Deliverables: Board + running SW IP, docs, API; sample content; demo video.
Security Kit
  • Inside: Critical Sound Detection + VoicePIN.
  • For OEMs: Add safety + secure offline access.
  • Deliverables: Board + running SW IP, docs, API; tuned models; config tool; demo video.
Multimodal Kit
  • Inside: VWW + SoundWW + Fusion Logic.
  • For OEMs: Reduce false alarms, save energy, create Pro SKUs.
  • Deliverables: Board + running SW IP, docs, API; fusion templates; metrics pack; demo video.
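
To make the "reduce false alarms" idea behind Fusion Logic concrete, here is a minimal host-side sketch in Python. It is not the kit's implementation or API: the `FusionConfig` fields, the `fuse` function, the thresholds, and the time window are all hypothetical. It assumes each channel emits timestamped confidence scores and confirms an audio trigger only when a sufficiently confident vision detection occurs close to it in time.

```python
from dataclasses import dataclass
from typing import List, Tuple

Event = Tuple[int, float]  # (timestamp in ms, confidence score in 0..1)


@dataclass
class FusionConfig:
    # All values are illustrative placeholders, not kit defaults.
    audio_threshold: float = 0.6
    vision_threshold: float = 0.6
    window_ms: int = 1500  # max time gap between audio and vision events


def fuse(audio: List[Event], vision: List[Event], cfg: FusionConfig) -> List[int]:
    """Return timestamps of audio triggers confirmed by a nearby vision detection."""
    # Keep only vision detections that clear the vision threshold.
    strong_vision = [t for t, score in vision if score >= cfg.vision_threshold]
    confirmed = []
    for t_audio, score in audio:
        if score < cfg.audio_threshold:
            continue  # weak audio trigger: ignore
        # Confirm only if some strong vision event falls inside the window.
        if any(abs(t_audio - t_v) <= cfg.window_ms for t_v in strong_vision):
            confirmed.append(t_audio)
    return confirmed


# Hypothetical event streams for illustration only.
audio_events = [(1000, 0.9), (5000, 0.8), (9000, 0.3)]
vision_events = [(1200, 0.7), (8000, 0.9)]
print(fuse(audio_events, vision_events, FusionConfig()))  # [1000]
```

In this toy run, the audio trigger at 5000 ms is dropped because no vision detection falls within the window, which is the kind of single-channel false alarm that an AND-style fusion rule is meant to suppress. A real product would tune the thresholds and window per use case.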

Each kit boots to a working demo on first power-up; no SDK deep-dive required. Binaries for re-flash and configuration files are included in the docs pack.

For Engineers — what you need to know

  • What is it? Clear description of each technology (ASR, TTS, TinyLM, Critical Sound Detection, VoicePIN, wake-words, fusion).
  • How do I integrate it with my hardware? Supported SoCs (Arm® Cortex™-M55 + Arm® Ethos™-U55, ESP32-S3 for PoC), delivery format (ready binaries + simple APIs), defined memory and power budgets.
  • Do I need to spend weeks learning? No — delivered as drop-in blocks with guides, not raw SDKs.
  • Is it real and proven? Yes: validated on Himax HX6538 and Alif Ensemble, and demonstrated in Hackster projects and evaluation kits.
  • Has anyone else tried it? Yes — community projects and open demos are already public.
  • Can I trust the numbers? Latency, RAM/Flash, and power usage are measured on real silicon, and the measurements are reproducible.
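
One way an engineer might use a metrics sheet is to check the measured figures against the product's own budget. The sketch below is a hedged, host-side Python example: the metric names (`latency_ms`, `ram_kb`, `flash_kb`, `power_mw`), the `check_budget` helper, and every number are hypothetical placeholders, not values taken from any kit.

```python
from typing import Dict, List


def check_budget(measured: Dict[str, float], budget: Dict[str, float]) -> List[str]:
    """Return a list of human-readable budget violations (empty if all fit)."""
    violations = []
    for key, limit in budget.items():
        value = measured.get(key)
        if value is None:
            violations.append(f"{key}: missing from metrics sheet")
        elif value > limit:
            violations.append(f"{key}: {value} exceeds budget {limit}")
    return violations


# Placeholder numbers for illustration only; not real kit measurements.
measured = {"latency_ms": 120, "ram_kb": 310, "flash_kb": 1800, "power_mw": 45}
budget = {"latency_ms": 200, "ram_kb": 256, "flash_kb": 2048, "power_mw": 60}

for line in check_budget(measured, budget):
    print(line)  # prints: ram_kb: 310 exceeds budget 256
```

A check like this can run against each new firmware build during a PoC, so a model that no longer fits the target's RAM or power envelope is flagged early.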

For Product Managers — what you need to know

  • Time-to-PoC: From idea to demo in weeks, not months.
  • Reduced risk: No need to build ML teams — functions are packaged and validated.
  • Clear ROI: Early demos de-risk bigger product investments and impress stakeholders.
  • Future roadmap: What works today on U55 scales tomorrow on U85 — same flow, bigger tasks.
  • Proof that builds trust: Makers and engineers have already tested these blocks in real-world demos.

Why It Matters

  • Reduced time-to-PoC: skip raw model adaptation, start from working kits.
  • Predictable performance: latency, memory, and power budgets are measured, not estimated.
  • OEM focus: you validate complete functions, not SDK toolchains.