WikiSQL

A large crowd-sourced dataset for developing natural language interfaces for relational databases. WikiSQL is the dataset released along with our work Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning.

A note on tokenization and Stanza: when WikiSQL was written three years ago, it relied on Stanza, a CoreNLP Python wrapper that has since been deprecated. If you would still like to use the tokenizer, please use the Docker image. We do not anticipate switching to the current Stanza, because changes to the tokenizer would make the previous results impossible to reproduce.

Features

Both the evaluation script and the dataset are stored in the repo
Only Python 3 is supported at the moment
Inside the data folder you will find the files in jsonl and db format
We supply a sample predictions file for the dev set
In addition to the raw data dump, we also release an optional annotation script that annotates WikiSQL
Develop natural language interfaces for relational databases
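
As a rough illustration of working with the two data formats mentioned above, the following sketch reads a jsonl file record by record and lists the tables in a db file. It assumes the db files are SQLite databases, which this page does not state. The helper names read_jsonl and list_tables are hypothetical and are not part of the WikiSQL code.

```python
import json
import sqlite3
from typing import Iterator, List


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a jsonl file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)


def list_tables(db_path: str) -> List[str]:
    """Return table names from a db file (assumption: the file is SQLite)."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

With placeholder paths such as data/dev.jsonl and data/dev.db (the actual file names may differ), next(read_jsonl("data/dev.jsonl")) would return the first record as a dictionary, and list_tables("data/dev.db") would return the table names stored in the database.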

License

BSD License

Additional Project Details

Programming Language

Python

Related Categories

Python HTML XHTML, Python Database Software, Python Reinforcement Learning Frameworks, Python Reinforcement Learning Libraries, Python Reinforcement Learning Algorithms

Registered

2022-07-26
