[go: up one dir, main page]

28 projects for "deepseek" with 1 filter applied:

  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Hightouch is a data and AI platform for marketing and personalization. Icon
    Hightouch is a data and AI platform for marketing and personalization.

    Marketing needs data and AI. Give them Hightouch.

    Find insights, run real-time campaigns, and build AI agents with all your data.
    Learn More
  • 1
    DeepSeek R1

    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens.
    Downloads: 86 This Week
    Last Update:
    See Project
  • 2
    DeepSeek-V3

    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance.
    Downloads: 66 This Week
    Last Update:
    See Project
  • 3
    DeepSeek V2

    DeepSeek V2

    Strong, Economical, and Efficient Mixture-of-Experts Language Model

    DeepSeek-V2 is the second major iteration of DeepSeek’s foundation language model (LLM) series. This version likely includes architectural improvements, training enhancements, and expanded dataset coverage compared to V1. The repository includes model weight artifacts, evaluation benchmarks across a broad suite (e.g. reasoning, math, multilingual), configuration files, and possibly tokenization / inference scripts.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 4
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition.
    Downloads: 11 This Week
    Last Update:
    See Project
  • Pylon is an All-in-one B2B Support Platform for modern B2B businesses. Icon
    Pylon is an All-in-one B2B Support Platform for modern B2B businesses.

    Pylon is a modern support system that integrates with all B2B channels like Slack and Team.

    We bring together everything a post-sales teams team needs including a ticketing system, B2B omnichannel integrations (Slack Connect, Microsoft Teams), modern chat widget, knowledge base, AI support bot, account management, customer marketing, and more.
    Learn More
  • 5
    DeepSeek VL2

    DeepSeek VL2

    Mixture-of-Experts Vision-Language Models for Advanced Multimodal

    DeepSeek-VL2 is DeepSeek’s vision + language multimodal model—essentially the next-gen successor to their first vision-language models. It combines image and text inputs into a unified embedding / reasoning space so that you can query with text and image jointly (e.g. “What’s going on in this scene?” or “Generate a caption appropriate to context”).
    Downloads: 5 This Week
    Last Update:
    See Project
  • 6
    DeepSeek Coder

    DeepSeek Coder

    DeepSeek Coder: Let the Code Write Itself

    DeepSeek-Coder is a series of code-specialized language models designed to generate, complete, and infill code (and mixed code + natural language) with high fluency in both English and Chinese. The models are trained from scratch on a massive corpus (~2 trillion tokens), of which about 87% is code and 13% is natural language. This dataset covers project-level code structure (not just line-by-line snippets), using a large context window (e.g. 16K) and a secondary fill-in-the-blank objective to encourage better contextual completions and infilling. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    DeepSeek Math

    DeepSeek Math

    Pushing the Limits of Mathematical Reasoning in Open Language Models

    ...The repo may also include modules that integrate external computational tools (e.g. a CAS / computer algebra system) or calculator assistance backends to enhance correctness. Because math reasoning is a high bar for LLMs, DeepSeek-Math aims to showcase their model’s ability not just in natural text but in precise formal reasoning.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    DeepSeek VL

    DeepSeek VL

    Towards Real-World Vision-Language Understanding

    DeepSeek-VL is DeepSeek’s initial vision-language model that anchors their multimodal stack. It enables understanding and generation across visual and textual modalities—meaning it can process an image + a prompt, answer questions about images, caption, classify, or reason about visuals in context. The model is likely used internally as the visual encoder backbone for agent use cases, to ground perception in downstream tasks (e.g. answering questions about a screenshot).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    DeepSeek LLM

    DeepSeek LLM

    DeepSeek LLM: Let there be answers

    The DeepSeek-LLM repository hosts the code, model files, evaluations, and documentation for DeepSeek’s LLM series (notably the 67B Chat variant). Its tagline is “Let there be answers.” The repo includes an “evaluation” folder (with results like math benchmark scores) and code artifacts (e.g. pre-commit config) that support model development and deployment.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Infor M3 ERP Icon
    Infor M3 ERP

    Enterprise manufacturers and distributors requiring a solution to manage and execute complex processes

    Efficiently executing the complex processes of enterprise manufacturers and distributors. Infor M3 is a cloud-based, manufacturing and distribution ERP system that leverages the latest technologies to provide an exceptional user experience and powerful analytics in a multicompany, multicountry, and multisite platform. Infor M3 and related CloudSuite™ industry solutions include industry-leading functionality for the chemical, distribution, equipment, fashion, food and beverage, and industrial manufacturing industries. Staying ahead of the competition means staying agile. Our new capabilities bring improved data-driven insights and streamlined workflows to help you make informed decisions and take quick action.
    Learn More
  • 10
    DeepSeek-V3.2-Exp

    DeepSeek-V3.2-Exp

    An experimental version of DeepSeek model

    DeepSeek-V3.2-Exp is an experimental release of the DeepSeek model family, intended as a stepping stone toward the next generation architecture. The key innovation in this version is DeepSeek Sparse Attention (DSA), a sparse attention mechanism that aims to optimize training and inference efficiency in long-context settings without degrading output quality.
    Downloads: 28 This Week
    Last Update:
    See Project
  • 11
    DeepSeek Coder V2

    DeepSeek Coder V2

    DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models

    DeepSeek-Coder-V2 is the version-2 iteration of DeepSeek’s code generation models, refining the original DeepSeek-Coder line with improved architecture, training strategies, and benchmark performance. While the V1 models already targeted strong code understanding and generation, V2 appears to push further in both multilingual support and reasoning in code, likely via architectural enhancements or additional training objectives.
    Downloads: 27 This Week
    Last Update:
    See Project
  • 12
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 13
    DeepSeek Prover V2

    DeepSeek Prover V2

    Advancing Formal Mathematical Reasoning via Reinforcement Learning

    DeepSeek-Prover-V2 is DeepSeek’s specialized model for formal theorem proving, particularly targeting proof in Lean 4. The repository describes how they use recursive proof decomposition by prompting DeepSeek-V3 to break complex theorems into subgoals, synthesize proof sketches, and then combine them to bootstrap training data. They then fine-tune via reinforcement learning with binary correct/incorrect feedback to integrate informal reasoning with formal proof behavior. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    DeepSeek MoE

    DeepSeek MoE

    Towards Ultimate Expert Specialization in Mixture-of-Experts Language

    DeepSeek-MoE (“DeepSeek MoE”) is the DeepSeek open implementation of a Mixture-of-Experts (MoE) model architecture meant to increase parameter efficiency by activating only a subset of “expert” submodules per input. The repository introduces fine-grained expert segmentation and shared expert isolation to improve specialization while controlling compute cost.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    GLM-5

    GLM-5

    From Vibe Coding to Agentic Engineering

    ...Building on earlier GLM series models, GLM-5 dramatically scales the parameter count (to roughly 744 billion) and expands pre-training data to significantly improve performance on complex tasks such as multi-step reasoning, software engineering workflows, and agent orchestration compared to its predecessors like GLM-4.5. It incorporates innovations like DeepSeek Sparse Attention (DSA) to preserve massive context windows while reducing deployment costs and supporting long context processing, which is crucial for detailed plans and agent tasks.
    Downloads: 29 This Week
    Last Update:
    See Project
  • 16
    Profile Data

    Profile Data

    Analyze computation-communication overlap in V3/R1

    profile-data is a repository that publishes profiling traces and metrics from DeepSeek’s training and inference infrastructure (especially during DeepSeek-V3 / R1 experiments). The profiling data targets insights into computation-communication overlap, pipeline scheduling (e.g. DualPipe), and how MoE / EP / parallelism strategies interact in real systems. The repository contains JSON trace files like train.json, prefill.json, decode.json, and associated assets. Users can load them into tools like Chrome tracing to inspect GPU idle times, overlapping operations, and scheduling alignment. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    DualPipe

    DualPipe

    A bidirectional pipeline parallelism algorithm

    DualPipe is a bidirectional pipeline parallelism algorithm open-sourced by DeepSeek, introduced in their DeepSeek-V3 technical framework. The main goal of DualPipe is to maximize overlap between computation and communication phases during distributed training, thus reducing idle GPU time (i.e. “pipeline bubbles”) and improving cluster efficiency. Traditional pipeline parallelism methods (e.g. 1F1B or staggered pipelining) leave gaps because forward and backward phases can’t fully overlap with communication. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    3FS

    3FS

    A high-performance distributed file system

    ...By handling caching and batching at a system level, 3FS helps reduce overhead when many features or modules must be evaluated per input (e.g. in an LLM agent pipeline). The repository includes example integration with models like DeepSeek-V2 / V3, showing how 3FS can be plugged into pipelines for operations like plugin processing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    GLM-4.6

    GLM-4.6

    Agentic, Reasoning, and Coding (ARC) foundation models

    ...GLM-4.6 also enhances writing quality, producing outputs that better align with human preferences and role-playing scenarios. Benchmark evaluations demonstrate that it not only outperforms GLM-4.5 but also rivals leading global models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.
    Downloads: 109 This Week
    Last Update:
    See Project
  • 20
    BrowserAI

    BrowserAI

    Run local LLMs like llama, deepseek, kokoro etc. inside your browser

    BrowserAI is a cutting-edge platform that allows users to run large language models (LLMs) directly in their web browser without the need for a server. It leverages WebGPU for accelerated performance and supports offline functionality, making it a highly efficient and privacy-conscious solution. The platform provides a developer-friendly SDK with pre-configured popular models, and it allows for seamless switching between MLC and Transformer engines. Additionally, it supports features such as...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 21
    Open Infra Index

    Open Infra Index

    Production-tested AI infrastructure tools

    open-infra-index is a central “infrastructure index” repository maintained by DeepSeek AI that acts as a catalog and hub for a collection of production-tested AI infrastructure tools and internal building blocks they have open-sourced. Instead of a single monolithic codebase, it functions more like an index or launching point: linking and documenting a set of library repos (e.g. FlashMLA, DeepEP, DeepGEMM, 3FS, etc.) that together form DeepSeek’s infrastructure stack.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    DeepSeekMath-V2

    DeepSeekMath-V2

    Towards self-verifiable mathematical reasoning

    DeepSeekMath-V2 is a large-scale open-source AI model designed specifically for advanced mathematical reasoning, theorem proving, and rigorous proof verification. It’s built by DeepSeek as a successor to their earlier math-specialist models. Unlike general-purpose LLMs that might generate plausible-looking math but sometimes hallucinate or mishandle rigorous logic, Math-V2 is engineered to not only generate solutions but also self-verify them, meaning it examines the derivations, checks logical consistency, and flags or corrects mistakes, producing proofs + verification rather than just a final answer. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 23
    Janus

    Janus

    Unified Multimodal Understanding and Generation Models

    Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Smallpond

    Smallpond

    A lightweight data processing framework built on DuckDB and 3FS

    smallpond is a lightweight distributed data processing framework built by DeepSeek, designed to scale DuckDB workloads over clusters using their 3FS (Fire-Flyer File System) backend. The idea is to preserve DuckDB’s fast analytics engine but lift it from single-node to multi-node settings, giving you the ability to operate on large datasets (e.g. petabyte scale) without moving to a heavyweight system like Spark.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    DeepSeek-V3.2

    DeepSeek-V3.2

    High-efficiency reasoning and agentic intelligence model

    DeepSeek-V3.2 is a cutting-edge large language model developed by DeepSeek-AI, focused on achieving high reasoning accuracy and computational efficiency for agentic tasks. It introduces DeepSeek Sparse Attention (DSA), a new attention mechanism that dramatically reduces computational overhead while maintaining strong long-context performance.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next