The code below calls the S3 'get' API to retrieve objects from S3 and write them to the data lake.

The problem arises when I use dbutils.secrets.get to fetch the keys required to establish the connection to S3.

my_dataframe.rdd.foreachPartition(partition => {
  val AccessKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-ID")
  val SecretKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-Secret")
  val creds = new BasicAWSCredentials(AccessKey, SecretKey)
  val clientRegion: Regions = Regions.US_EAST_1
  val s3client = AmazonS3ClientBuilder.standard()
    .withRegion(clientRegion)
    .withCredentials(new AWSStaticCredentialsProvider(creds))
    .build()
  partition.foreach(x => {
    val objectKey = x.getString(0)
    val i = s3client.getObject(s3bucketName, objectKey).getObjectContent
    val inputS3String = IOUtils.toString(i, "UTF-8")
    val filePath = s"${data_lake_get_path}"
    val file = new File(filePath)
    val fileWriter = new FileWriter(file)
    val bw = new BufferedWriter(fileWriter)
    bw.write(inputS3String)
    bw.close()
    fileWriter.close()
  })
})

The above results in the error:

Caused by: java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:529)
at scala.None$.get(Option.scala:527)
at com.databricks.dbutils_v1.impl.SecretUtilsImpl.sc$lzycompute(SecretUtilsImpl.scala:24)
at com.databricks.dbutils_v1.impl.SecretUtilsImpl.sc(SecretUtilsImpl.scala:24)
at com.databricks.dbutils_v1.impl.SecretUtilsImpl.getSecretManagerClient(SecretUtilsImpl.scala:36)
at com.databricks.dbutils_v1.impl.SecretUtilsImpl.getBytesInternal(SecretUtilsImpl.scala:46)
at com.databricks.dbutils_v1.impl.SecretUtilsImpl.get(SecretUtilsImpl.scala:61)

When the actual AccessKey and SecretKey values are passed in directly instead of being read with dbutils.secrets.get, the above code works fine.

How can this work using dbutils.secrets.get so that keys are not exposed in the code?

Hi @Sandesh Puligundla, you just need to move the following two lines:

val AccessKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-ID")
val SecretKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-Secret")

outside of the foreachPartition block, so that they are executed in the context of the driver and only the resulting values are sent to the worker nodes.
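
For reference, here is a rough sketch of what the rearranged notebook cell might look like. It assumes my_dataframe, s3bucketName and data_lake_get_path are already defined as in the question, that the AWS SDK v1 and commons-io are on the cluster classpath, and that the IOUtils in the question is the commons-io one. writeText is a hypothetical helper added here only to make sure the writer gets closed; it is not part of the original code. Treat it as an illustration, not a tested drop-in.

```scala
// Sketch for a Databricks Scala notebook cell. my_dataframe, s3bucketName and
// data_lake_get_path come from the question and are assumed to exist already.
import java.io.{BufferedWriter, FileWriter}
import com.amazonaws.auth.{AWSStaticCredentialsProvider, BasicAWSCredentials}
import com.amazonaws.regions.Regions
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import org.apache.commons.io.IOUtils

// Hypothetical helper (not in the original post): write a string to a file
// and close the writer even if the write throws.
def writeText(path: String, content: String): Unit = {
  val bw = new BufferedWriter(new FileWriter(path))
  try bw.write(content) finally bw.close()
}

// Resolved once on the driver, before the closure is created.
val accessKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-ID")
val secretKey = dbutils.secrets.get(scope = "ADB_Scope", key = "AccessKey-Secret")

my_dataframe.rdd.foreachPartition { partition =>
  // Runs on an executor: the closure carries the already-resolved key strings
  // instead of calling dbutils.secrets there.
  val creds = new BasicAWSCredentials(accessKey, secretKey)
  val s3client = AmazonS3ClientBuilder.standard()
    .withRegion(Regions.US_EAST_1)
    .withCredentials(new AWSStaticCredentialsProvider(creds))
    .build()

  partition.foreach { row =>
    val objectKey = row.getString(0)
    val in = s3client.getObject(s3bucketName, objectKey).getObjectContent
    // Read the whole object as UTF-8 text, then release the S3 stream.
    val body = try IOUtils.toString(in, "UTF-8") finally in.close()
    // As in the question, every object is written to the same path.
    writeText(data_lake_get_path, body)
  }
}
```

The S3 client is still built inside foreachPartition, once per partition, as in the question; only the two secret lookups move to the driver. Note that the key values now travel to the executors inside the serialized closure, which is what the suggestion above implies.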
