untether.ai

untether.ai delivers energy-efficient, high-performance AI inference acceleration for edge and data center applications, built on an at-memory architecture and a powerful developer SDK.

Category: AI Detection

Price Model: Paid

Audience: Business

Trustpilot Score: N/A

Trustpilot Reviews: N/A

Our Review

untether.ai: High-Performance, Energy-Efficient AI Inference Acceleration

untether.ai is an AI inference acceleration company based in Toronto, Canada, delivering hardware and software for high-performance, low-latency AI deployment across edge and cloud environments. The company specializes in energy-centric AI: its at-memory computing architecture drastically reduces data movement and power consumption, enabling superior compute density and efficiency. Designed for OEMs and on-premise data centers, untether.ai's product line spans the runAI®200 IC, the speedAI®240 and speedAI®240 Slim accelerator cards, the tsunAImi® tsn200 and tsn800 systems, and the imAIgine® SDK, all of which support TensorFlow, PyTorch, and ONNX models through a push-button inference workflow. The platform serves industries such as AgTech, Vision AI, Automotive, Government, Financial Services, and Datacenters, and has posted record-breaking results in MLPerf® Inference benchmarks.

Key Features:

  • At-Memory Computing Architecture: Moves computation next to data storage, reducing latency and cutting power consumption by up to six-fold.
  • High-Performance AI Accelerators: Second-generation ICs and PCIe cards deliver up to 2 PetaFlops of inference performance and 8 TOPS per watt.
  • Support for Multiple Frameworks: Fully compatible with TensorFlow, PyTorch, and ONNX models for broad developer flexibility.
  • imAIgine® SDK: A comprehensive development environment with three core components—Compiler, Toolkit, and Runtime—for automated model deployment and optimization.
  • imAIgine Compiler: Enables model import, quantization, and performance tuning with configurable targets.
  • imAIgine Toolkit: Offers profiling and simulation tools to evaluate model behavior and performance.
  • imAIgine Runtime: Provides a C-based API for integration and real-time monitoring of device health and temperature.
  • Model Garden: A curated library of pre-optimized models for faster deployment.
  • Custom Kernel Development Flow: Empowers developers to build high-performance, application-specific kernels.
  • Multi-Chip Partitioning: Allows large neural networks to be split across multiple accelerator devices for scalable deployment.
  • Generative Compiler Technology (2025): Supports four times more AI models and reduces implementation time by orders of magnitude.
  • Industry-Leading Benchmark Results: Verified by MLPerf® Inference v4.1 in both Datacenter and Edge categories.
  • Strategic Partnerships: Collaborations with AMD, Arm, General Motors, and Ola-Krutrim for automotive, data center, and edge applications.
  • Scalable Solutions: From low-profile 75-watt PCIe cards (speedAI®240 Slim) to high-density server-grade systems (tsunAImi® tsn800).

Pricing: untether.ai operates under a paid model, offering enterprise-grade hardware and software tailored to high-performance computing needs. Pricing is not publicly disclosed; the company targets professional and industrial clients with custom deployments, which points to premium, deployment-specific pricing rather than a published price list.

Conclusion: untether.ai stands at the forefront of AI inference innovation with its energy-efficient, at-memory architecture and powerful hardware-software ecosystem. Its advanced accelerator cards and imAIgine® SDK empower developers and enterprises to deploy complex AI models with unmatched speed, efficiency, and scalability—making it a transformative force in AI acceleration for mission-critical applications.

You might also like...

Positron.ai delivers ultra-efficient, high-performance LLM inference hardware that outperforms GPUs in cost and energy use.

Oneinfer.ai: A unified AI infrastructure platform for LLMs and GPU computing with instant deployment and enterprise-grade security.

Axelera.ai delivers high-efficiency, low-power AI acceleration hardware and software for real-time edge inference across industries.
