
Showing 176 open source projects for "text"

  • 1
    text-dedup

    All-in-one text de-duplication

    text-dedup is a Python library that enables efficient deduplication of large text corpora by using MinHash and other probabilistic techniques to detect near-duplicate content. This is especially useful for NLP tasks where duplicated training data can skew model performance. text-dedup scales to billions of documents and offers tools for chunking, hashing, and comparing text efficiently with low memory usage. It supports Jaccard similarity thresholding, parallel execution, and flexible...
    Downloads: 0 This Week
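The MinHash and Jaccard-threshold idea mentioned in the text-dedup description can be illustrated with a small standard-library sketch. This is not text-dedup's API: the function names (`shingles`, `minhash_signature`, `near_duplicates`), the word 3-gram shingling, and the default threshold are all assumptions chosen for illustration.

```python
import hashlib
import re


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(item, seed):
    """64-bit hash of a string; the seed selects one hash function."""
    salt = seed.to_bytes(8, "big")
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt)
    return int.from_bytes(digest.digest(), "big")


def minhash_signature(items, num_perm=128):
    """Keep the minimum hash per seeded function (seeds stand in for permutations)."""
    return [
        min((_hash(item, seed) for item in items), default=0)
        for seed in range(num_perm)
    ]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


def near_duplicates(docs, threshold=0.5, num_perm=128):
    """Return index pairs whose estimated similarity meets the threshold."""
    sigs = [minhash_signature(shingles(d), num_perm) for d in docs]
    pairs = []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if estimate_jaccard(sigs[i], sigs[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

The all-pairs comparison here is quadratic in the number of documents, so it only suits small inputs. Tools aimed at very large corpora typically add a locality-sensitive hashing step so that only likely candidates are compared.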
  • 2
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 16 This Week
  • 3
    Data Formulator

    Create rich visualizations with AI

    ... barriers via LLMs' code generation ability. However, these systems do not work well for iterative visualization authoring, because they often require analysts to provide, in a single turn, a text-only prompt that fully describes the complex visualization task to be performed, which is unrealistic to both users and models in many cases. In this paper, we present Data Formulator 2, an LLM-powered visualization system to address these challenges.
    Downloads: 9 This Week
  • 4
    htop

    An interactive process viewer

    ... Running htop requires ncurses libraries, typically named libncurses(w). Since version 2.0, htop is cross-platform! Check out the video and slides of Hisham's presentation at FOSDEM 2016 about how this came to be. The current releases support Linux, FreeBSD, OpenBSD, DragonFly BSD, Mac OS X and Solaris. This is htop, a cross-platform interactive process viewer. It is a text-mode application (for console or X terminals) and requires ncurses.
    Downloads: 7 This Week
  • 5
    Apexcharts.js

    Interactive JavaScript Charts built on SVG

    ... look with unlimited possibilities. Below is an example of synchronized charts with GitHub style. Zoom, Pan, Scroll through data. Make selections and load other charts using those selections. An example showing some interactivity. Another approach to Drill down charts where one selection updates the data of other charts. Annotations allow you to write custom text on specific values or on axes values. Valuable to expand the visual appeal of your chart and make it more informative.
    Downloads: 5 This Week
  • 6
    npm-pdfreader

    Parse text and tables from PDF files.

    npm-pdfreader is a Node.js library for reading text and parsing tables from PDF files. It supports tabular data with automatic column detection and rule-based parsing, making it useful for extracting structured data from PDFs.
    Downloads: 2 This Week
  • 7
    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do...
    Downloads: 2 This Week
  • 8
    AutoGluon

    AutoGluon: AutoML for Image, Text, and Tabular Data

    AutoGluon enables easy-to-use and easy-to-extend AutoML with a focus on automated stack ensembling, deep learning, and real-world applications spanning image, text, and tabular data. Intended for both ML beginners and experts, AutoGluon enables you to quickly prototype deep learning and classical ML solutions for your raw data with a few lines of code. Automatically utilize state-of-the-art techniques (where appropriate) without expert knowledge. Leverage automatic hyperparameter tuning, model...
    Downloads: 1 This Week
  • 9
    CSV

    Utility library for working with CSV and other delimited files

    Welcome to CSV.jl! A pure-Julia package for handling delimited text data, be it comma-delimited (csv), tab-delimited (tsv), or otherwise. A fast, flexible delimited file reader/writer for Julia.
    Downloads: 1 This Week
  • 10
    Searchkick

    Intelligent search made easy

    Searchkick brings powerful, production-ready search to Rails by mapping Active Record models into Elasticsearch with sensible defaults and easy customization. It supports language analyzers, stemming, synonyms, misspelling tolerance, and highlighting so search results feel natural to end users. Indexing is model-centric: you declare what fields to index, add computed fields, and trigger reindexing via callbacks or background jobs, with options for zero-downtime rolling reindexes. On the...
    Downloads: 1 This Week
  • 11
    PipeRider

    Code review for data in dbt

    PipeRider automatically compares your data to highlight the difference in impacted downstream dbt models so you can merge your Pull Requests with confidence. PipeRider can profile your dbt models and obtain information such as basic data composition, quantiles, histograms, text length, top categories, and more. PipeRider can integrate with dbt metrics and present the time-series data of metrics in the report. PipeRider generates a static HTML report each time it runs, which can be viewed...
    Downloads: 1 This Week
  • 12
    Sweetviz

    Visualize and compare datasets, target values and associations

    ... dataset) relates to other features. Sweetviz integrates associations for numerical (Pearson's correlation), categorical (uncertainty coefficient) and categorical-numerical (correlation ratio) datatypes seamlessly, to provide maximum information for all data types. Automatically detects numerical, categorical and text features, with optional manual overrides. min/max/range, quartiles, mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness.
    Downloads: 1 This Week
  • 13
    nvim-hlslens

    Hlsearch Lens for Neovim

    nvim-hlslens helps you better glance at matched information, and seamlessly jump between matched instances.
    Downloads: 0 This Week
  • 14
    ggrepel

    Repel overlapping text labels away from each other in your ggplot2

    ggrepel is an R package that provides “smart” repulsion for text and label geoms in ggplot2. When placing text labels on a plot (e.g. labeling points), the labels can often overlap; ggrepel ensures labels don’t overlap (or overlap less) by repelling labels / pushing them away, adding connecting lines or nudges, etc. It improves the readability of plots, especially when many labels are present. Support for point and segment geoms (so labels can be connected by lines when moved). Supports both...
    Downloads: 0 This Week
  • 15
    Automa.jl

    A Julia code generator for regular expressions

    Automa is a regex-to-Julia compiler. By compiling regex to Julia code in the form of Expr objects, Automa provides facilities to create efficient and robust regex-based lexers, tokenizers and parsers using Julia's metaprogramming capabilities. You can view Automa as a regex engine that can insert arbitrary Julia code into its input-matching process, which will be executed when certain parts of the regex match an input.
    Downloads: 0 This Week
  • 16
    HyperTools

    A Python toolbox for gaining geometric insights

    ..., text or (mixed) lists. Applying topic models and other text vectorization methods to text data. HyperTools is designed to facilitate dimensionality reduction-based visual explorations of high-dimensional data. The basic pipeline is to feed in a high-dimensional dataset (or a series of high-dimensional datasets) and, in a single function call, reduce the dimensionality of the dataset(s) and create a plot.
    Downloads: 0 This Week
  • 17
    CommonMark.jl

    A CommonMark-compliant parser for Julia

    A CommonMark-compliant parser for Julia.
    Downloads: 0 This Week
  • 18
    VimBindings.jl

    Vim bindings for the Julia REPL

    Vim bindings for the Julia REPL. VimBindings.jl is a Julia package which brings vim emulation directly to the Julia REPL.
    Downloads: 0 This Week
  • 19
    LanguageServer.jl

    An implementation of the Microsoft Language Server Protocol

    This package implements the Microsoft Language Server Protocol for the Julia programming language. Text editors with a client for the Language Server Protocol are able to make use of the Julia Language Server for various code editing features.
    Downloads: 0 This Week
  • 20
    pprof

    pprof is a tool for visualization and analysis of profiling data

    pprof is a profiling visualization and analysis tool that ingests profiles in the profile.proto format and generates human-readable and graph-based reports. It supports multiple profile types (CPU, heap, allocations, contention, etc.) and can present data as text tables, call graphs (via Graphviz/dot), flame graphs, and interactive web UIs. The tool helps developers find hot paths, quantify resource usage, and compare profiles across runs to validate performance changes. It is widely used in Go...
    Downloads: 0 This Week
  • 21
    NBInclude.jl

    Import code from IJulia Jupyter notebooks into Julia programs

    NBInclude is a package for the Julia language that allows you to include and execute IJulia (Julia-language Jupyter) notebook files just as you would include an ordinary Julia file. The goal of this package is to make notebook files just as easy to incorporate into Julia programs as ordinary Julia (.jl) files, giving you the advantages of a notebook (integrated code, formatted text, equations, graphics, and other results) while retaining the modularity and re-usability of .jl files.
    Downloads: 0 This Week
  • 22
    patat

    Terminal-based presentations using Pandoc

    patat (Presentations Atop The ANSI Terminal) is a small tool that allows you to show presentations using only an ANSI terminal. It does not require ncurses. Leverages the great Pandoc library to support many input formats including Literate Haskell. Supports smart slide splitting. Slides can be split up into multiple fragments. There is a live reload mode. Theming support including 24-bit RGB. Auto advancing with configurable delay. Optionally re-wrapping text to terminal width with proper...
    Downloads: 0 This Week
  • 23
    OrientDB

    DBMS supporting graph, document, full-text and geospatial models

    OrientDB is an Open Source Multi-Model NoSQL DBMS with the support of Native Graphs, Documents, Full-Text search, Reactivity, Geo-Spatial and Object Oriented concepts. It's written in Java and it's amazingly fast. No expensive run-time JOINs, connections are managed as persistent pointers between records. You can traverse thousands of records in no time. Supports schema-less, schema-full and schema-mixed modes. Has a strong security profiling system based on user, roles and predicate security...
    Downloads: 0 This Week
  • 24
    TensorBoardX

    TensorBoard for PyTorch (and Chainer, MXNet, NumPy, etc.)

    The SummaryWriter class provides a high-level API to create an event file in a given directory and add summaries and events to it. The class updates the file contents asynchronously. This allows a training program to call methods to add data to the file directly from the training loop, without slowing down training. TensorBoardX now supports logging directly to Comet. Comet is a free cloud-based solution that allows you to automatically track, compare and explain your experiments. It adds a...
    Downloads: 0 This Week
  • 25
    Querybook

    Big Data Querying UI, combining collocated table metadata

    Querybook is Pinterest’s open-source big data IDE via a notebook interface. Querybook’s core focus is to make composing queries, creating analyses, and collaborating with others as simple as possible. Organize rich text, queries, and charts into a notebook to easily document your analyses. Work collaboratively with others in a DataDoc and get real-time updates. The Query Editor is aware of your tables and their columns; as such, it provides autocompletion, syntax highlighting, and the ability...
    Downloads: 0 This Week