添加链接
link管理
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接

Problem

You are getting intermittent job failures with a NoSuchElementException error.

Example stack trace

Py4JJavaError: An error occurred while calling o2843.count.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 17 in stage 868.0 failed 4 times, most recent failure: Lost task 17.3 in stage 868.0 (TID 3065) (10.249.38.86 executor 6): java.util.NoSuchElementException
at org.apache.spark.sql.vectorized.ColumnarBatch$1.next(ColumnarBatch.java:69)
at org.apache.spark.sql.vectorized.ColumnarBatch$1.next(ColumnarBatch.java:58)
at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:44)
at org.apache.spark.sql.execution.arrow.ArrowConverters$$anon$4.next(ArrowConverters.scala:401)
at org.apache.spark.sql.execution.arrow.ArrowConverters$$anon$4.next(ArrowConverters.scala:382)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage9.processNext(Unknown Source)
...

Cause

The NoSuchElementException error is the result of an issue in Apache Arrow optimization. Apache Arrow is an in-memory columnar data format that is used in spark to efficiently transfer data between the JVM and Python.

When Arrow optimization is enabled in the Py4J interface, there is a possibility it can call iterator.next() without checking iterator.hasNext() . This can result in a NoSuchElementException error.

Solution

Set spark.databricks.pyspark.emptyArrowBatchCheck to true in the cluster's Spark config ( AWS | Azure | GCP ).

spark.databricks.pyspark.emptyArrowBatchCheck=true

Enabling spark.databricks.pyspark.emptyArrowBatchCheck prevents a NoSuchElementException error from occurring when the Arrow batch size is 0.

Alternatively, you can disable Arrow optimization by setting the following properties in your cluster's Spark config .

spark.sql.execution.arrow.pyspark.enabled=false
spark.sql.execution.arrow.enabled=false

Disabling Arrow optimization may have performance implications.

Contact Us

If you still have questions or prefer to get help directly from an agent, please submit a request. We’ll get back to you as soon as possible.

Please enter the details of your request. A member of our support staff will respond as soon as possible.