概览 | Apache Flink CDC

link管理

链接快照平台

输入网页链接，自动生成快照
标签化管理网页链接

相关文章推荐

冷冷的草稿本 · txt文件转数组_python读取txt为数组· 1 月前 ·

淡定的盒饭 · ASP.NET 核心 Blazor ...· 1 月前 ·

傻傻的馒头 · STRING_SPLIT ...· 1 月前 ·

坚韧的啤酒 · 表达式和函数 - Azure Data ...· 1 月前 ·

犯傻的黄豆 · Branches API | GitLab ...· 1 周前 ·

大方的酸菜鱼 · Search Results | ...· 1 年前 ·

沉稳的草稿本 · 1-SIM卡复位ATR解析-CSDN博客· 1 年前 ·

傲视众生的碗 · 图片加载失败如何用默认图片代替 - 掘金· 2 年前 ·

正直的凉茶 · 如何在 PL/pgSQL IF 语句中运行 ...· 2 年前 ·

刚毅的草稿纸 · Qt开发笔迹：QGLWidget、QOpen ...· 2 年前 ·

Flink CDC sources is a set of source connectors for Apache Flink ^® , ingesting changes from different databases using change data capture (CDC). Some CDC sources integrate Debezium as the engine to capture data changes. So it can fully leverage the ability of Debezium. See more about what is Debezium .

You can also read tutorials about how to use these sources.

Supported Connectors mysql-cdc

MySQL : 5.6, 5.7, 8.0.x

RDS MySQL : 5.6, 5.7, 8.0.x

PolarDB MySQL : 5.6, 5.7, 8.0.x

Aurora MySQL : 5.6, 5.7, 8.0.x

MariaDB : 10.x

PolarDB X : 2.0.1

JDBC Driver: 8.0.28 oceanbase-cdc

OceanBase CE : 3.1.x, 4.x

OceanBase EE : 2.x, 3.x, 4.x

OceanBase Driver: 2.4.x oracle-cdc

Oracle : 11, 12, 19, 21

Oracle Driver: 19.3.0.0 postgres-cdc

PostgreSQL : 9.6, 10, 11, 12, 13, 14

JDBC Driver: 42.5.1 sqlserver-cdc

Sqlserver : 2012, 2014, 2016, 2017, 2019

JDBC Driver: 9.4.1.jre8 tidb-cdc

TiDB : 5.1.x, 5.2.x, 5.3.x, 5.4.x, 6.0.0

JDBC Driver: 8.0.27 db2-cdc

Db2 : 11.5

Db2 Driver: 11.5.0.0 vitess-cdc

Vitess : 8.0.x, 9.0.x

MySql JDBC Driver: 8.0.26

Supports reading database snapshot and continues to read binlogs with exactly-once processing even failures happen.

CDC connectors for DataStream API, users can consume changes on multiple databases and tables in a single job without Debezium and Kafka deployed.

CDC connectors for Table/SQL API, users can use SQL DDL to create a CDC source to monitor changes on a single table.

The following table shows the current features of the connector:

Connector No-lock Read Parallel Read Exactly-once Read Incremental Snapshot Read

We need several steps to setup a Flink cluster with the provided connector.

Setup a Flink cluster with version 1.12+ and Java 8+ installed.

Download the connector SQL jars from the Downloads page (or build yourself ).

Put the downloaded jars under


    FLINK_HOME/lib/

Restart the Flink cluster.

The example shows how to create a MySQL CDC source in Flink SQL Client and execute queries on it.

-- creates a mysql cdc table source
CREATE TABLE mysql_binlog (
 id INT NOT NULL,
 name STRING,
 description STRING,
 weight DECIMAL(10,3),
 PRIMARY KEY(id) NOT ENFORCED
) WITH (
 'connector' = 'mysql-cdc',
 'hostname' = 'localhost',
 'port' = '3306',
 'username' = 'flinkuser',
 'password' = 'flinkpw',
 'database-name' = 'inventory',
 'table-name' = 'products'
-- read snapshot and binlog data from mysql, and do some transformation, and show on the client
SELECT id, UPPER(name), description, weight FROM mysql_binlog;
  Usage for DataStream API
Include following Maven dependency (available through Maven Central):
<dependency>
  <groupId>org.apache.flink</groupId>
  <!-- add the dependency matching your database -->
  <artifactId>flink-connector-mysql-cdc</artifactId>
  <!-- The dependency is available only for stable releases, SNAPSHOT dependencies need to be built based on master or release branches by yourself. -->
  <version>3.0-SNAPSHOT</version>
</dependency>
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.apache.flink.cdc.connectors.mysql.source.MySqlSource;
public class MySqlBinlogSourceExample {
  public static void main(String[] args) throws Exception {
    MySqlSource<String> mySqlSource = MySqlSource.<String>builder()
            .hostname("yourHostname")
            .port(yourPort)
            .databaseList("yourDatabaseName") // set captured database
            .tableList("yourDatabaseName.yourTableName") // set captured table
            .username("yourUsername")
            .password("yourPassword")
            .deserializer(new JsonDebeziumDeserializationSchema()) // converts SourceRecord to JSON String
            .build();
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // enable checkpoint
    env.enableCheckpointing(3000);
      .fromSource(mySqlSource, WatermarkStrategy.noWatermarks(), "MySQL Source")
      // set 4 parallel source tasks
      .setParallelism(4)
      .print().setParallelism(1); // use parallelism 1 for sink to keep message ordering
    env.execute("Print MySQL Snapshot + Binlog");
  Deserialization
The following JSON data show the change event in JSON format.
  "before": {
    "id": 111,
    "name": "scooter",
    "description": "Big 2-wheel scooter",
    "weight": 5.18
  "after": {
    "id": 111,
    "name": "scooter",
    "description": "Big 2-wheel scooter",
    "weight": 5.15
  "source": {...},
  "op": "u",  // the operation type, "u" means this this is an update event 
  "ts_ms": 1589362330904,  // the time at which the connector processed the event
  "transaction": null
Note: Please refer Debezium documentation  to know the meaning of each field.
In some cases, users can use the JsonDebeziumDeserializationSchema(true) Constructor to enabled include schema in the message. Then the Debezium JSON message may look like this:
  "schema": {
    "type": "struct",
    "fields": [
        "type": "struct",
        "fields": [
            "type": "int32",
            "optional": false,
            "field": "id"
            "type": "string",
            "optional": false,
            "default": "flink",
            "field": "name"
            "type": "string",
            "optional": true,
            "field": "description"
            "type": "double",
            "optional": true,
            "field": "weight"
        "optional": true,
        "name": "mysql_binlog_source.inventory_1pzxhca.products.Value",
        "field": "before"
        "type": "struct",
        "fields": [
            "type": "int32",
            "optional": false,
            "field": "id"
            "type": "string",
            "optional": false,
            "default": "flink",
            "field": "name"
            "type": "string",
            "optional": true,
            "field": "description"
            "type": "double",
            "optional": true,
            "field": "weight"
        "optional": true,
        "name": "mysql_binlog_source.inventory_1pzxhca.products.Value",
        "field": "after"
        "type": "struct",
        "fields": {...}, 
        "optional": false,
        "name": "io.debezium.connector.mysql.Source",
        "field": "source"
        "type": "string",
        "optional": false,
        "field": "op"
        "type": "int64",
        "optional": true,
        "field": "ts_ms"
    "optional": false,
    "name": "mysql_binlog_source.inventory_1pzxhca.products.Envelope"
  "payload": {
    "before": {
      "id": 111,
      "name": "scooter",
      "description": "Big 2-wheel scooter",
      "weight": 5.18
    "after": {
      "id": 111,
      "name": "scooter",
      "description": "Big 2-wheel scooter",
      "weight": 5.15
    "source": {...},
    "op": "u",  // the operation type, "u" means this this is an update event
    "ts_ms": 1589362330904,  // the time at which the connector processed the event
    "transaction": null

Usually, it is recommended to exclude schema because schema fields makes the messages very verbose which reduces parsing performance.

The JsonDebeziumDeserializationSchema can also accept custom configuration of JsonConverter , for example if you want to obtain numeric output for decimal data, you can construct JsonDebeziumDeserializationSchema as following:

 Map<String, Object> customConverterConfigs = new HashMap<>();
 customConverterConfigs.put(JsonConverterConfig.DECIMAL_FORMAT_CONFIG, "numeric");
 JsonDebeziumDeserializationSchema schema = 
      new JsonDebeziumDeserializationSchema(true, customConverterConfigs);
  Building from source
Prerequisites:
Maven
At least Java 8
git clone https://github.com/apache/flink-cdc.git
cd flink-cdc
mvn clean install -DskipTests
The dependencies are now available in your local .m2 repository.
 Back to top