I am getting the following error when I run SQL in Scala. The error message is not clear, so can someone please help me solve it? Also, how do I set up debug mode in spark-shell?
scala> val df = spark.sql("select * from table1 limit 10");
df: org.apache.spark.sql.DataFrame = [itm_nbr: int, overall_e_coefficient: decimal(15,3) ... 16 more fields]
scala> df.show(10)
java.lang.RuntimeException: serious problem
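A rough sketch for getting more detail out of the same df.show(10) call (this is only an assumption, not specific to this error): catch the exception and walk Throwable.getCause to print the full cause chain, since the root error is often nested below the top-level RuntimeException.
// sketch: print the full cause chain of the exception thrown by df.show(10)
try {
  df.show(10)
} catch {
  case e: Throwable =>
    var cause: Throwable = e
    while (cause != null) {        // walk getCause until the root error
      println(cause)
      cause = cause.getCause
    }
}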
@Saurabh
You can supply your own "log4j.properties" file to control log messages and pass it to your spark-shell command.
Example:
# spark-shell --master yarn --deploy-mode client --files /your/path/to/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/your/path/to/log4j.properties"
@Saurabh
For example, if you create a "/tmp/log4j.properties" like the following:
# cat /tmp/log4j.properties
log4j.rootCategory=debug,console
log4j.logger.com.demo.package=debug,console
log4j.additivity.com.demo.package=false
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.out
log4j.appender.console.immediateFlush=true
log4j.appender.console.encoding=UTF-8
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.conversionPattern=%d [%t] %-5p %c - %m%n
Then run spark-shell as follows, and you should see DEBUG messages:
# su - spark
# spark-shell --master yarn --deploy-mode client --files /tmp/log4j.properties --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" --driver-java-options "-Dlog4j.configuration=file:/tmp/log4j.properties"
Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
Spark1 will be picked by default
2018-09-17 07:52:29,343 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,388 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)])
2018-09-17 07:52:29,389 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[GetGroups])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeLong org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since startup])
2018-09-17 07:52:29,390 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, sampleName=Ops, always=false, type=DEFAULT, valueName=Time, value=[Renewal failures since last successful login])
2018-09-17 07:52:29,392 [main] DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
2018-09-17 07:52:29,845 [main] DEBUG org.apache.hadoop.security.SecurityUtil - Setting hadoop.security.token.service.use_ip to true
2018-09-17 07:52:30,386 [main] DEBUG org.apache.hadoop.util.Shell - setsid exited with exit code 0
2018-09-17 07:52:30,501 [main] DEBUG org.apache.hadoop.security.Groups - Creating new Groups object
2018-09-17 07:52:30,523 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
2018-09-17 07:52:30,534 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
2018-09-17 07:52:30,535 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2018-09-17 07:52:30,535 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-09-17 07:52:30,536 [main] DEBUG org.apache.hadoop.util.PerformanceAdvisory - Falling back to shell based
2018-09-17 07:52:30,537 [main] DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
2018-09-17 07:52:30,709 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
2018-09-17 07:52:30,751 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login
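If the DEBUG output above is too noisy, another option is to adjust individual loggers from inside the running shell. This is only a sketch using the log4j 1.x API that these Spark versions bundle, and the package names below are just examples:
scala> import org.apache.log4j.{Level, Logger}
scala> Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN)      // quieten the Hadoop internals shown above
scala> Logger.getLogger("org.apache.spark.sql").setLevel(Level.DEBUG)  // keep Spark SQL at DEBUG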
@Saurabh
In case of Spark2, you can enable DEBUG logging by invoking sc.setLogLevel("DEBUG"), as follows:
$ export SPARK_MAJOR_VERSION=2
$ spark-shell --master yarn --deploy-mode client
SPARK_MAJOR_VERSION is set to 2, using Spark2
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://newhwx1.example.com:4040
Spark context available as 'sc' (master = yarn, app id = application_1536125228953_0007).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.3.0.2.6.5.0-292
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.
scala> sc.setLogLevel("DEBUG")
scala> 18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark sending #69
18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark got value #69
18/09/17 07:58:57 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 8ms
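To keep the DEBUG output manageable, you can also enable it only around the failing action and then switch back. This is just a sketch reusing the query from the question:
scala> sc.setLogLevel("DEBUG")                                   // turn on verbose logging
scala> val df = spark.sql("select * from table1 limit 10")
scala> df.show(10)                                               // re-run the failing action with DEBUG output
scala> sc.setLogLevel("WARN")                                    // restore the quieter default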