Databricks Community

I have a large Delta table partitioned by an identifier column, and I have now discovered that some of the identifiers contain blank spaces, e.g. one partition can be defined by "Identifier=first identifier". Most partitions do not have blank spaces in their identifiers, and it hasn't been a problem until now, when I want to use

REORG TABLE table_name APPLY (PURGE)

to rewrite the files and get rid of some recently deleted columns.

When running REORG, I get

Error in SQL statement: SparkException: Job aborted due to stage failure: ... java.net.URISyntaxException: Illegal character in path at index ...

pointing to that blank space in the path "dbfs:/mnt/container/table_name/Identifier=first identifier/part-01347-8a9a157b-6d0d-75dd-b1b7-2aed12e057db.c000.snappy.parquet".

Note that this has not been an issue when running OPTIMIZE on the same partition.

Does anyone know how I can solve this? The only thing I can think of to move forward is to exclude the problematic partitions from the REORG, but that's a workaround, not a solution. Any tips on an actual solution would be much appreciated 🙏

FYI, there is a similar issue with partitions that have "%" in the identifier. I used the filter clause of the REORG to exclude partitions containing " " or "%" so that I could move forward with my work, but I will continue looking for a solution.
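For anyone who wants to reproduce the workaround, here is a minimal sketch of how such a filtered statement could be built. It assumes the filter is written as a WHERE predicate on the partition column and that REORG accepts an instr() condition there; the thread does not confirm either. The helper name build_reorg_statement is hypothetical, while table_name and Identifier are taken from the post.

```python
from typing import Iterable


def build_reorg_statement(table_name: str, partition_column: str,
                          bad_chars: Iterable[str]) -> str:
    """Build a REORG ... APPLY (PURGE) statement that skips partitions whose
    value contains any of the given characters.

    Hypothetical helper: it does no quoting, so the characters passed in
    must not include a single quote.
    """
    # instr() returns 0 when the substring is absent, so each condition
    # keeps only partition values that do NOT contain the character.
    conditions = [f"instr({partition_column}, '{ch}') = 0" for ch in bad_chars]
    predicate = " AND ".join(conditions)
    return f"REORG TABLE {table_name} WHERE {predicate} APPLY (PURGE)"


statement = build_reorg_statement("table_name", "Identifier", [" ", "%"])
print(statement)

# In a Databricks notebook, where `spark` is predefined, the statement
# would then be submitted with: spark.sql(statement)
```

The skipped partitions are left untouched, so their files still contain the deleted columns until the underlying path problem is solved.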

I've never seen any guidance against using strings with blank spaces or percent signs as partition values. Might this issue be a bug?

Hi @bearys , The error message suggests an illegal character in the path at a specific index.

The error is pointing to a blank space in the path "dbfs:/mnt/container/table_name/Identifier=first identifier/part-01347-8a9a157b-6d0d-75dd-b1b7-2aed12e057db.c000.snappy.parquet".

This error can occur due to special characters in the path. To resolve this issue, you can try replacing the blank space in the path with an underscore or removing the special characters from the path. Alternatively, you can try URL encoding the path to replace special characters with their corresponding escape sequences.
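To illustrate what URL encoding means here, the sketch below percent-encodes the partition directory name from the post using Python's standard library. It only shows the transformation itself; whether REORG would accept an encoded path, or how the table should be rewritten to use one, is not established in this thread. The value "50%" is a made-up example of an identifier containing a percent sign.

```python
from urllib.parse import quote, unquote

# Partition directory name taken from the post.
partition_dir = "Identifier=first identifier"

# Percent-encode everything that is not allowed unescaped in a URI path
# segment, keeping "=" readable.
encoded = quote(partition_dir, safe="=")
print(encoded)  # Identifier=first%20identifier

# Decoding restores the original value.
assert unquote(encoded) == partition_dir

# "%" is itself the escape character, so a literal percent sign has to be
# encoded as "%25" (hypothetical identifier value).
encoded_percent = quote("50%", safe="=")
print(encoded_percent)  # 50%25
```

A blank space is not a legal character in a URI, which matches the "Illegal character in path" message, and a raw "%" is only legal when it starts an escape sequence.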
