I'm trying to run my Spark program with the spark-submit command (I'm working with Scala). I specified the master address, the class name, the jar file with all dependencies, the input file and the output file, but I'm getting an error:
Exception in thread "main" org.apache.spark.sql.AnalysisException:
Multiple sources found for csv
(org.apache.spark.sql.execution.datasources.v2.csv.CSVDataSourceV2,
org.apache.spark.sql.execution.datasources.csv.CSVFileFormat), please
specify the fully qualified class name.;
Here is a screenshot of the error. What does it mean, and how can I fix it?
Thank you
You also got some warnings here. If you run your fat-jar file with the correct permissions, ./spark-submit should run normally.
Check whether the environment variables for Spark are set correctly (in ~/.bashrc). Also check the permissions of the source CSV file, as they may be the problem.
If you are running on Linux, set the permissions on the source CSV folder as follows:
sudo chmod -R 777 /source_folder
After that, try running ./spark-submit with your fat-jar file again.
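Before changing permissions, it may be worth confirming that they are actually the issue, since chmod -R 777 opens the folder to every user. The following is only a rough sketch of such a check: check_readable and check_spark_home are hypothetical helper names, and SPARK_HOME is assumed to be the variable your ~/.bashrc sets for Spark, so adjust both to your setup.

```shell
#!/bin/sh

# Hypothetical helper: report whether the current user can read a path.
check_readable() {
    if [ -r "$1" ]; then
        echo "readable"
    else
        echo "not readable"
    fi
}

# Hypothetical helper: report whether SPARK_HOME (assumed variable name)
# is set and points to an existing directory.
check_spark_home() {
    if [ -n "${SPARK_HOME:-}" ] && [ -d "$SPARK_HOME" ]; then
        echo "SPARK_HOME ok"
    else
        echo "SPARK_HOME missing"
    fi
}
```

For example, check_readable /source_folder prints "readable" if the current user can already read the folder, in which case permissions are probably not what is stopping the job.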