SPARK-51691

SerializationDebugger should swallow exceptions when trying to find the cause of a serialization problem


Details

    Description

I made a serialization mistake while developing a feature for our production environment.

However, the resulting exception and stack trace are confusing.

We only get the root serialization cause, but the `Serialization stack` is not shown because SerializationDebugger itself throws an exception while inspecting the object graph, so it is not easy to find the real problem.

       

      ```

      13:38:31.443 WARN org.apache.spark.serializer.SerializationDebugger: Exception in serialization debugger
      org.apache.spark.SparkRuntimeException: Cannot get SQLConf inside scheduler event loop thread.
          at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotGetSQLConfInSchedulerEventLoopThreadError(QueryExecutionErrors.scala:2002)
          at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:225)
          at org.apache.spark.sql.execution.ScalarSubquery.toString(subquery.scala:69)
          at java.lang.String.valueOf(String.java:2994)
          at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:203)
          at scala.collection.immutable.Stream.addString(Stream.scala:701)
          at scala.collection.TraversableOnce.mkString(TraversableOnce.scala:377)
      [info] - SPARK-35874: AQE Shuffle should wait for its subqueries to finish before materializing *** FAILED *** (1 second, 660 milliseconds)
      [info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: org.apache.spark.SimpleFutureAction
      [info]   at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2865)
      [info]   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2800)
      [info]   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2799)
      [info]   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
      [info]   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
      [info]   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)

      ```
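Below is a minimal sketch of the behavior being requested, not Spark's actual SerializationDebugger code: exceptions raised while rendering a single object (here a failing `toString`, analogous to `ScalarSubquery.toString` calling `SQLConf.get` on the scheduler event loop thread) are swallowed so the rest of the serialization stack can still be reported. The `describe` helper and `SerializationStackSketch` object are hypothetical names used only for illustration.

```scala
import scala.util.control.NonFatal

// Hypothetical sketch, not Spark's SerializationDebugger: swallow exceptions thrown
// while rendering a single object so the rest of the serialization stack survives.
object SerializationStackSketch {

  // Render one element of the serialization stack; if toString throws, fall back to
  // the class name instead of aborting the whole walk.
  private def describe(obj: Any): String =
    try {
      obj.toString
    } catch {
      case NonFatal(e) =>
        s"${obj.getClass.getName} (toString failed: ${e.getClass.getName})"
    }

  def main(args: Array[String]): Unit = {
    // Stand-in for an object whose toString cannot be evaluated on the current thread,
    // like ScalarSubquery.toString without a SQLConf context.
    class Misbehaving {
      override def toString: String =
        throw new IllegalStateException("Cannot get SQLConf inside scheduler event loop thread")
    }

    val stack = Seq[Any]("object (class some.user.Closure)", new Misbehaving).map(describe)
    println(stack.mkString("Serialization stack:\n\t- ", "\n\t- ", ""))
  }
}
```

With a guard like this, the `NotSerializableException` for `SimpleFutureAction` would still come with a usable serialization stack even though one element's `toString` fails.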

      Attachments

        Activity


People

    Assignee: summaryzb zhoubin
    Reporter: summaryzb zhoubin
    Votes: 0
    Watchers: 2

