The AI-native (edge and LLM) proxy for agents
Fast inference engine for Transformer models
Pure C++ implementation of several models for real-time chatting
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Sparsity-aware deep learning inference runtime for CPUs
Connect home devices into a powerful cluster to accelerate LLM
Python Package for ML-Based Heterogeneous Treatment Effects Estimation
Implementation of model parallel autoregressive transformers on GPUs
Easiest and laziest way to build multi-agent LLM applications
A scalable inference server for models optimized with OpenVINO
State-of-the-art Parameter-Efficient Fine-Tuning
An RWKV management and startup tool, full automation, only 8MB
A lightweight vision library for performing large object detection
Single-cell analysis in Python
Uniform deep learning inference framework for mobile
Data manipulation and transformation for audio signal processing
PyTorch domain library for recommendation systems
Framework that is dedicated to making neural data processing
Training & Implementation of chatbots leveraging GPT-like architecture
AIMET is a library that provides advanced quantization and compression
Uncover insights, surface problems, monitor, and fine-tune your LLM
Unified Model Serving Framework
Deep learning optimization library: makes distributed training easy
Build Production-ready Agentic Workflow with Natural Language
Framework which allows you to transform your Vector Database