INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
LiteRT is the new name for TensorFlow Lite (TFLite)
FlashMLA: Efficient Multi-head Latent Attention Kernels
Towards Human-Sounding Speech
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Flower: A Friendly Federated Learning Framework
The Compute Library is a set of computer vision and machine learning functions optimised for Arm CPUs and GPUs
Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Text and image to video generation: CogVideoX (2024) and CogVideo
ChatGLM2-6B: An Open Bilingual Chat LLM
C++ library for high performance inference on NVIDIA GPUs
Fast inference engine for Transformer models
MNN is a blazing fast, lightweight deep learning framework
A library for accelerating Transformer models on NVIDIA GPUs
CUDA Templates for Linear Algebra Subroutines
Powerful development framework for creating virtually anything
SIMD macro assembler unified for ARM, MIPS, PPC and x86
TF2 Deep FloorPlan Recognition using a Multi-task Network
State-of-the-art faster Transformer with TensorFlow 2.0
Accelerated deep learning R&D
C++ library based on TensorRT integration
Deep learning for text to speech