datasource - Quartz scheduler does not renew DB connections on AWS RDS failover

We're using the Java Quartz scheduler with an AWS RDS Aurora cluster as the underlying datastore. RDS is configured as a cluster with one primary read/write database and one read replica.

When I click "Instance Actions > Failover" in the AWS RDS Console, the current writer becomes the reader and the read replica becomes the writer.

However, in that scenario the Quartz JDBC DataSource/connection pool does not seem to handle the failover, and the scheduler dies with the errors below:

2018-08-22 13:10:21.106 ERROR 14824 --- [_ClusterManager] org.quartz.impl.jdbcjobstore.JobStoreTX  : ClusterManager: Error managing cluster: Failure updating scheduler state when checking-in: The MySQL server is running with the --read-only option so it cannot execute this statement
org.quartz.JobPersistenceException: Failure updating scheduler state when checking-in: The MySQL server is running with the --read-only option so it cannot execute this statement
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.clusterCheckIn(JobStoreSupport.java:3468)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.doCheckin(JobStoreSupport.java:3315)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.manage(JobStoreSupport.java:3920)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.run(JobStoreSupport.java:3957)
Caused by: java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:127)
        at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:95)
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:960)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1116)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdateInternal(ClientPreparedStatement.java:1066)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeLargeUpdate(ClientPreparedStatement.java:1396)
        at com.mysql.cj.jdbc.ClientPreparedStatement.executeUpdate(ClientPreparedStatement.java:1051)
        at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:384)
        at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.updateSchedulerState(StdJDBCDelegate.java:2975)
        at org.quartz.impl.jdbcjobstore.JobStoreSupport.clusterCheckIn(JobStoreSupport.java:3462)
        ... 3 common frames omitted
The quartz.properties configuration is as follows:
org.quartz.dataSource.quartzDataSource.driver=com.mysql.cj.jdbc.Driver
org.quartz.dataSource.quartzDataSource.URL=jdbc:mysql://sqldbprd...:3306/quartz?useSSL=false
org.quartz.dataSource.quartzDataSource.user=quartz  
org.quartz.dataSource.quartzDataSource.password=...
org.quartz.dataSource.quartzDataSource.maxConnections=5
org.quartz.dataSource.quartzDataSource.validationQuery=SELECT 1
org.quartz.dataSource.quartzDataSource.TestConnectionOnCheckin=false
org.quartz.dataSource.quartzDataSource.TestConnectionOnCheckout=true
In contrast, our main Spring Boot API, which is configured with the default HikariCP, continues to work and seems to pick up the writer/reader switch.
Has anybody else encountered this scenario? Can Quartz be configured to use an existing HikariCP pool instead of creating its own DataSource/pool? Any suggestions are appreciated!
I had the same issue with a Spring Boot app connecting to Aurora PostgreSQL.
The JVM caches DNS lookups, so after a failover it still resolves the endpoint to the old instance (now the reader).
You can change the cache TTL by adding the following line to your application's startup code:
java.security.Security.setProperty("networkaddress.cache.ttl", "60");
The other issue we had was that we had accidentally configured the instance URL of the database instead of the cluster URL.
For more info: https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html
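As a minimal sketch of where that line could live, the class below sets the TTL before anything else runs. The class name `DnsCacheConfig` and the method `apply` are hypothetical names used only for illustration; the property name and the value of 60 seconds come from the line above. The JVM reads its address cache policy once, so the property should be set before the application performs its first host name lookup.

```java
import java.security.Security;

// Hypothetical helper (not part of Quartz or Spring): sets the JVM-wide
// DNS cache TTL so that a changed cluster endpoint is re-resolved.
final class DnsCacheConfig {
    static final String TTL_PROPERTY = "networkaddress.cache.ttl";

    // Sets the TTL (in seconds) for successful DNS lookups.
    static void apply(int ttlSeconds) {
        Security.setProperty(TTL_PROPERTY, Integer.toString(ttlSeconds));
    }

    public static void main(String[] args) {
        // Call this first, before any data source or scheduler is created.
        apply(60);
        System.out.println(TTL_PROPERTY + "=" + Security.getProperty(TTL_PROPERTY));
        // ... start the application / Quartz scheduler here ...
    }
}
```

A lower TTL means a failover is picked up sooner, at the cost of more frequent DNS lookups.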
Best practices by AWS about the TTL cache settings: docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/…
– HDCase, Dec 13, 2018 at 13:28
Thanks for your reply! The issue is indeed related to DNS caching. I also found that it affects not just the Quartz datasource but every other application datasource as well. In my case the problem was amplified by a custom Route53 DNS name on top of the RDS DNS name (e.g. proddb.company.com), which has its own TTL. I ended up reducing the TTL in Route53 and also implemented a read-only check in the pool test query. You can find more detailed information at stackoverflow.com/questions/52629074/…
– Bernie Lenz, Dec 26, 2018 at 22:15
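The read-only check mentioned in the last comment is not shown in this thread; the actual test query is in the linked question. As a rough programmatic variant, and not the commenter's implementation, the sketch below asks the server whether it is read-only. It assumes Aurora MySQL, where `@@innodb_read_only` is 1 on a reader and 0 on the writer. The class `WriterCheck` and its methods are hypothetical names.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: detects whether a pooled connection still points
// at the writer after a failover (assumes Aurora MySQL semantics).
final class WriterCheck {
    static final String QUERY = "SELECT @@innodb_read_only";

    // 0 means the instance accepts writes; any other value means read-only.
    static boolean isWriterFlag(int innodbReadOnly) {
        return innodbReadOnly == 0;
    }

    // Runs the query on the given connection and interprets the result.
    static boolean isWriter(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(QUERY)) {
            return resultSet.next() && isWriterFlag(resultSet.getInt(1));
        }
    }
}
```

A connection for which `isWriter` returns false could then be discarded, so that the pool reconnects and resolves the cluster endpoint again.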