Describe the problem

I get a NoClassDefFoundError (class not found) while trying to write a DataFrame in Delta format to S3.

Steps to reproduce

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val s3_conf = new SparkConf()
  .set("fs.s3a.access.key", "xxxx")
  .set("fs.s3a.secret.key", "xxx")
  // I tested with and without this option; nothing changed
  .set("spark.delta.logStore.class", "org.apache.spark.sql.delta.storage.S3SingleDriverLogStore")

val spark = SparkSession
  .builder()
  .master("local")
  .appName("delta-writer")
  .config(s3_conf)
  .getOrCreate()

import spark.implicits._

// Create a DataFrame (placeholder data; the original report omits the actual input)
val df = Seq(("2022-05-01", 1), ("2022-05-02", 2)).toDF("date", "value")

df.write
  .partitionBy("date")
  .mode("overwrite")
  .format("delta")
  .save("s3a://mybucket/myObject")

Observed results

The job fails with the following error:

Exception in thread "main" com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: io/delta/storage/LogStore
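
One way to confirm that the failure is a missing jar rather than, say, an S3 access problem is to probe the driver classpath for the class named in the stack trace. The snippet below is an illustrative diagnostic, not part of the original report; the class name io.delta.storage.LogStore is taken from the error above.

// Illustrative diagnostic (not from the original report): check whether the
// delta-storage classes are visible on the driver classpath and, if so,
// which jar they were loaded from.
try {
  val cls = Class.forName("io.delta.storage.LogStore")
  println("io.delta.storage.LogStore loaded from: " +
    cls.getProtectionDomain.getCodeSource.getLocation)
} catch {
  case _: ClassNotFoundException =>
    println("io.delta.storage.LogStore is not on the classpath; the delta-storage jar is missing")
}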

Further details

AWS client jars that are on the classpath:

  • hadoop-aws-3.2.2.jar
  • aws-java-sdk-bundle-1.11.563.jar
I am using the spark-submit command to launch the program.

Environment information

  • Delta Lake version: 1.2.0
  • Spark version: 3.2.1
  • Scala version: 2.12.15

Willingness to contribute

The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?

  • Yes. I can contribute a fix for this bug independently.
  • Yes. I would be willing to contribute a fix for this bug with guidance from the Delta Lake community.
  • No. I cannot contribute a bug fix at this time.

@a-haroun Looking into the issue. The LogStore interface and its implementations were moved to a separate jar, delta-storage, as part of the 1.2.0 release (the migration guide for this is here). Could you check whether the delta-storage jar is downloaded as part of the spark-submit command startup?

Also, please share the spark-submit command you used.
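
For reference, here is a minimal sbt sketch of how the Delta dependencies could be declared so that the delta-storage jar ends up on the application classpath. This is an assumption about the build, not taken from the reporter's project; delta-core 1.2.0 depends on delta-storage transitively, and it is listed explicitly here only to make the new artifact visible.

// build.sbt (illustrative sketch, not the reporter's actual build): as of
// Delta Lake 1.2.0 the LogStore implementations live in the separate
// io.delta:delta-storage artifact, which delta-core pulls in transitively.
libraryDependencies ++= Seq(
  "io.delta" %% "delta-core"    % "1.2.0",
  "io.delta" %  "delta-storage" % "1.2.0"
)

When launching with spark-submit, --packages io.delta:delta-core_2.12:1.2.0 resolves delta-storage transitively, whereas passing only the delta-core jar via --jars does not pull in its dependencies.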

@vkorukanti I am also getting a similar issue: java.lang.NoClassDefFoundError: io/delta/storage/LogStore when using:
  • PySpark v3.2.1
  • Delta Lake v1.2.0

On the releases page https://github.com/delta-io/delta/releases/tag/v1.2.0 only a single Python artifact is listed. You mention that the LogStore has been moved into the delta-storage jar. Does that also get included in the Python artifact?

[Feature Request] Include a better error message for NoClassDefFoundError: io/delta/storage/LogStore #1199