TEN (Transformative Extensions Network) is an open-source framework for building real-time multimodal AI agents that interact through voice, video, text, images, and data streams with ultra-low latency. Its ecosystem includes TEN Turn Detection, TEN Agent, and TMAN Designer, which let developers rapidly assemble human-like, responsive agents that can see, speak, hear, and interact. Extensions can be written in Python, C++, or Go, and agents can be deployed on the edge or in the cloud. With graph-based workflow design, a drag-and-drop UI (TMAN Designer), and reusable extensions such as real-time avatars, RAG (Retrieval-Augmented Generation), and image generation, TEN supports customizable, scalable agent development with minimal code.
Features
- Enables ultra‑low‑latency real‑time multimodal interactions across voice, video, text, and images
- Modular extension architecture supporting reusable components in C++, Go, and Python (Node.js support is planned)
- Edge‑to‑cloud integration for balancing performance, privacy, and scalability
- Real‑time agent state management for dynamic responsiveness and adaptive behavior
- Drag‑and‑drop workflow UI (TMAN Designer) for visual composition of conversational flows
- Built‑in ecosystem including turn detection, voice activity detection, agent examples, and an extension marketplace (TEN Cloud Store)
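
To make the idea of a graph-based workflow of reusable extensions more concrete, here is a minimal, framework-independent sketch in Python. It does not use TEN's actual API: the `Graph` class, its methods, and the `fake_stt`, `fake_llm`, and `fake_tts` handlers are all hypothetical stand-ins invented for illustration, and the sketch assumes the graph has no cycles.

```python
from collections import defaultdict, deque


class Graph:
    """Hypothetical illustration only; this is not TEN's API."""

    def __init__(self):
        self.nodes = {}                 # node name -> handler function
        self.edges = defaultdict(list)  # node name -> downstream node names

    def add_node(self, name, handler):
        self.nodes[name] = handler

    def connect(self, src, dst):
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both nodes must be added before connecting them")
        self.edges[src].append(dst)

    def run(self, start, message):
        """Pass a message through the graph breadth-first.

        Returns (node name, output) pairs for nodes with no downstream
        connections. Assumes the graph is acyclic.
        """
        results = []
        queue = deque([(start, message)])
        while queue:
            name, msg = queue.popleft()
            out = self.nodes[name](msg)
            downstream = self.edges.get(name, [])
            if not downstream:
                results.append((name, out))
            for nxt in downstream:
                queue.append((nxt, out))
        return results


# Stand-in "extensions": each one takes a dict and returns a dict.
def fake_stt(msg):
    return {"text": msg["audio"].strip().lower()}


def fake_llm(msg):
    return {"text": f"You said: {msg['text']}"}


def fake_tts(msg):
    return {"audio": f"<speech>{msg['text']}</speech>"}


graph = Graph()
graph.add_node("stt", fake_stt)
graph.add_node("llm", fake_llm)
graph.add_node("tts", fake_tts)
graph.connect("stt", "llm")
graph.connect("llm", "tts")

result = graph.run("stt", {"audio": "  Hello TEN "})
print(result)
```

Because each node only sees the message passed to it, a node can be swapped or reused in another graph without touching its neighbors, which is the property a modular extension architecture aims for.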