Spark / SPARK-52462

Enforce type coercion before children output deduplication in Union


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.1.0
    • Fix Version/s: 4.1.0
    • Component/s: SQL

    Description

      Currently, the following query produces plans that are inconsistent across different underlying table providers:

      SELECT col1, col2, col3, NULLIF('','') AS col4
      FROM table
      UNION ALL
      SELECT col2, col2, null AS col3, col4
      FROM table;

      This happens because of rule ordering:

      • Sometimes: ... -> WidenSetOperationTypes -> ... -> ResolveReferences (deduplication of Union children outputs) -> ...
      • Sometimes: ... -> ResolveReferences (deduplication of Union children outputs) -> ... -> WidenSetOperationTypes -> ...

      In this issue I propose that we align these two orderings by enforcing that type coercion happens before deduplication.
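      To illustrate why the ordering matters, here is a minimal toy model in Python. It is not Spark's implementation: `widen_types` and `dedup_outputs` are hypothetical stand-ins for WidenSetOperationTypes and the Union output deduplication in ResolveReferences, the column types (`int`, `string`, `void`) and attribute ids are assumptions made up for the example, and each Union child is reduced to its output columns plus the list of projections wrapped around it (outermost first).

```python
from itertools import count

fresh_id = count(100)  # toy source of new attribute ids (assumption)


def wider(a, b):
    """Toy widening: equal types stay, 'void' (untyped NULL) adopts the
    other type, any other mismatch falls back to 'string'."""
    if a == b:
        return a
    if a == "void":
        return b
    if b == "void":
        return a
    return "string"


def widen_types(children):
    """Stand-in for type coercion: wrap a child in a 'cast' projection
    when any of its columns differs from the widened type."""
    left, right = children
    targets = [wider(l[2], r[2]) for l, r in zip(left["output"], right["output"])]
    result = []
    for child in children:
        if all(col[2] == t for col, t in zip(child["output"], targets)):
            result.append(child)
            continue
        # A cast column gets a new id; untouched columns keep theirs.
        out = [col if col[2] == t else (col[0], next(fresh_id), t)
               for col, t in zip(child["output"], targets)]
        result.append({"output": out, "wrappers": ["cast"] + child["wrappers"]})
    return result


def dedup_outputs(children):
    """Stand-in for output deduplication: wrap a child in a 'dedup'
    projection when the same attribute id appears twice in its output."""
    result = []
    for child in children:
        seen, out, changed = set(), [], False
        for name, expr_id, dtype in child["output"]:
            if expr_id in seen:
                expr_id, changed = next(fresh_id), True
            seen.add(expr_id)
            out.append((name, expr_id, dtype))
        if changed:
            child = {"output": out, "wrappers": ["dedup"] + child["wrappers"]}
        result.append(child)
    return result


def union_children():
    """The two Union branches of the query, with assumed types and ids."""
    left = {"output": [("col1", 1, "int"), ("col2", 2, "string"),
                       ("col3", 3, "string"), ("col4", 5, "string")],
            "wrappers": []}
    right = {"output": [("col2", 12, "string"), ("col2", 12, "string"),
                        ("col3", 20, "void"), ("col4", 14, "string")],
             "wrappers": []}
    return [left, right]


def shape(children):
    """Projection wrappers per child, outermost first."""
    return [child["wrappers"] for child in children]


coerce_first = shape(dedup_outputs(widen_types(union_children())))
dedup_first = shape(widen_types(dedup_outputs(union_children())))
print("coercion first:", coerce_first)
print("dedup first:   ", dedup_first)
```

      In this sketch the second branch ends up with the two projections nested in opposite orders depending on which rule runs first, so the two plans differ in shape. Fixing one order, as proposed here, removes that source of variation.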

            People

              Assignee: Mihailo Aleksic (mihailoale-db)
              Reporter: Mihailo Aleksic (mihailoale-db)