Queryparser is Uber’s open-source Haskell library for parsing and analyzing SQL queries written in several dialects (Vertica, Hive, and Presto). It produces a unified AST and supports name resolution using catalog metadata, which enables large-scale query analysis for lineage detection, analytics, and tooling.

The parsing logic produces an AST whose table and column identifiers are "raw", that is, optionally qualified. It is often desirable to convert this AST over raw names into an AST over resolved names, in which every identifier is fully qualified. This transformation is called "name resolution", or simply "resolution". It requires catalog information as input: the full list of columns in every table and the full list of tables in every schema.
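
To make the idea of resolution concrete, here is a minimal, self-contained Haskell sketch. It does not use queryparser's actual API: every type and function name in it (`Catalog`, `RawTable`, `resolveTable`, `resolveColumn`, `demoCatalog`, and so on) is hypothetical, and the search-path behaviour for unqualified tables is an assumption made for illustration. It only shows how catalog information lets optionally qualified names be turned into fully qualified ones.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical toy types for illustration; not queryparser's real API.
type SchemaName = String
type TableName  = String
type ColumnName = String

-- Catalog information: the tables in every schema and the columns in every table.
type Catalog = [(SchemaName, [(TableName, [ColumnName])])]

-- Raw names carry optional qualifiers; resolved names are fully qualified.
data RawTable       = RawTable (Maybe SchemaName) TableName    deriving (Eq, Show)
data ResolvedTable  = ResolvedTable SchemaName TableName       deriving (Eq, Show)
data RawColumn      = RawColumn (Maybe TableName) ColumnName   deriving (Eq, Show)
data ResolvedColumn = ResolvedColumn ResolvedTable ColumnName  deriving (Eq, Show)

data ResolutionError
  = UnknownTable RawTable
  | UnknownColumn RawColumn
  | AmbiguousColumn RawColumn [ResolvedTable]
  deriving (Eq, Show)

-- Resolve a table. A qualified name is checked against its own schema only;
-- an unqualified name is tried against each schema in the search path, in order
-- (an assumed policy for this sketch).
resolveTable :: Catalog -> [SchemaName] -> RawTable -> Either ResolutionError ResolvedTable
resolveTable catalog searchPath raw@(RawTable mSchema table) =
  case filter hasTable candidates of
    (schema : _) -> Right (ResolvedTable schema table)
    []           -> Left (UnknownTable raw)
  where
    candidates      = maybe searchPath (: []) mSchema
    hasTable schema = maybe False (any ((== table) . fst)) (lookup schema catalog)

-- Look up the columns of a resolved table; empty if the catalog lacks it.
columnsOf :: Catalog -> ResolvedTable -> [ColumnName]
columnsOf catalog (ResolvedTable schema table) =
  fromMaybe [] (lookup schema catalog >>= lookup table)

-- Resolve a column against the tables in scope (for example, those in FROM).
-- Exactly one in-scope table must match the qualifier (if any) and own the column.
resolveColumn :: Catalog -> [ResolvedTable] -> RawColumn -> Either ResolutionError ResolvedColumn
resolveColumn catalog inScope raw@(RawColumn mTable column) =
  case matches of
    [t] -> Right (ResolvedColumn t column)
    []  -> Left (UnknownColumn raw)
    ts  -> Left (AmbiguousColumn raw ts)
  where
    matches =
      [ t
      | t@(ResolvedTable _ name) <- inScope
      , maybe True (== name) mTable
      , column `elem` columnsOf catalog t
      ]

-- A small made-up catalog to experiment with.
demoCatalog :: Catalog
demoCatalog =
  [ ("public",  [("users", ["id", "name"]), ("orders", ["id", "user_id", "total"])])
  , ("staging", [("users", ["id", "email"])])
  ]
```

Loaded into GHCi, `resolveTable demoCatalog ["public"] (RawTable Nothing "users")` resolves the bare name `users` to the `public` schema, while an unqualified `id` column with both `public.users` and `public.orders` in scope is reported as ambiguous rather than guessed. Returning `Either` keeps missing or ambiguous names explicit, which matters because resolution is only as good as the catalog information supplied.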
Features
- Parses multiple SQL dialects: Vertica, Hive, Presto
- Produces a normalized AST for cross-dialect query representation
- Supports name resolution via schema/catalog information to fully qualify identifiers
- Designed for large-scale streaming query analysis pipelines
- Used to uncover foreign-key relationships and metadata across data warehouses
- MIT-licensed and built for integration into analytics ecosystems