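Attention Keep in mind that physical hints are only required when extending the API. They are not needed when using predefined sources, sinks, and Flink functions. When writing a program with the Table API, Flink ignores physical hints (for example, field.cast(TIMESTAMP(3).bridgedTo(Timestamp.class))).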
List of Data Types
This section lists all pre-defined data types.
For the JVM-based Table API those types are also available in org.apache.flink.table.api.DataTypes.
For the Python Table API, those types are available in pyflink.table.types.DataTypes.
The default planner supports the following set of SQL types:
Character Strings
CHAR
Data type of a fixed-length character string.
Declaration
The type can be declared using CHAR(n) where n is the number of code points. n must have a value between 1
and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1.
VARCHAR / STRING
Data type of a variable-length character string.
Declaration
DataTypes.STRING()
Attention The specified maximum number of code points n in DataTypes.VARCHAR(n) must be 2,147,483,647 currently.
The type can be declared using VARCHAR(n) where n is the maximum number of code points. n must have a value
between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1.
STRING is a synonym for VARCHAR(2147483647).
Binary Strings
BINARY
Data type of a fixed-length binary string (=a sequence of bytes).
Declaration
The type can be declared using BINARY(n) where n is the number of bytes. n must have a value
between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is equal to 1.
VARBINARY / BYTES
Data type of a variable-length binary string (=a sequence of bytes).
Declaration
DataTypes.BYTES()
Attention The specified maximum number of bytes n in DataTypes.VARBINARY(n) must be 2,147,483,647 currently.
The type can be declared using VARBINARY(n) where n is the maximum number of bytes. n must
have a value between 1 and 2,147,483,647 (both inclusive). If no length is specified, n is
equal to 1.
BYTES is a synonym for VARBINARY(2147483647).
Exact Numerics
DECIMAL
Data type of a decimal number with fixed precision and scale.
Declaration
DataTypes.DECIMAL(p,s)
Attention The precision and scale specified in DataTypes.DECIMAL(p, s) must currently be 38 and 18, respectively.
The type can be declared using DECIMAL(p, s) where p is the number of digits in a
number (precision) and s is the number of digits to the right of the decimal point
in a number (scale). p must have a value between 1 and 38 (both inclusive). s
must have a value between 0 and p (both inclusive). The default value for p is 10.
The default value for s is 0.
NUMERIC(p, s) and DEC(p, s) are synonyms for this type.
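To make precision and scale concrete, the following sketch uses java.math.BigDecimal, the JVM class that this document maps to DECIMAL elsewhere. The helper fitsDecimal is hypothetical and not part of Flink; it only checks, under the definitions above, whether a value could be held by DECIMAL(p, s) without rounding.

```java
import java.math.BigDecimal;

public class DecimalSketch {

    // Hypothetical helper (not a Flink API): true if the value fits DECIMAL(p, s) without rounding.
    static boolean fitsDecimal(BigDecimal value, int p, int s) {
        if (value.signum() == 0) {
            return true; // zero fits any valid DECIMAL(p, s)
        }
        BigDecimal v = value.stripTrailingZeros();
        int scale = Math.max(v.scale(), 0);            // digits right of the decimal point
        int integerDigits = v.precision() - v.scale(); // digits left of the decimal point
        return scale <= s && integerDigits <= p - s;
    }

    public static void main(String[] args) {
        BigDecimal balance = new BigDecimal("12345678.91");
        System.out.println(balance.precision()); // total number of digits (p)
        System.out.println(balance.scale());     // digits right of the decimal point (s)
        System.out.println(fitsDecimal(balance, 10, 2));
        System.out.println(fitsDecimal(balance, 10, 4));
    }
}
```

With p = 10 and s = 4 only six digits remain for the integer part, which is why the same value no longer fits.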
TINYINT
Data type of a 1-byte signed integer with values from -128 to 127.
Declaration
BIGINT
Data type of an 8-byte signed integer with values from -9,223,372,036,854,775,808 to
9,223,372,036,854,775,807.
Declaration
Approximate Numerics
FLOAT
Data type of a 4-byte single precision floating point number.
Compared to the SQL standard, the type does not take parameters.
Declaration
Date and Time
DATE
Data type of a date consisting of year-month-day with values ranging from 0000-01-01
to 9999-12-31.
Compared to the SQL standard, the range starts at year 0000.
Declaration
TIME
Data type of a time without time zone consisting of hour:minute:second[.fractional] with
up to nanosecond precision and values ranging from 00:00:00.000000000 to
23:59:59.999999999.
Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as
the semantics are closer to java.time.LocalTime. A time with time zone is not provided.
Declaration
Java Type | Remarks
int | Describes the number of milliseconds of the day. Output only if type is not nullable.
java.lang.Long | Describes the number of nanoseconds of the day.
long | Describes the number of nanoseconds of the day. Output only if type is not nullable.
DataTypes.TIME(p)
Attention The precision specified in DataTypes.TIME(p) must be 0 currently.
The type can be declared using TIME(p) where p is the number of digits of fractional
seconds (precision). p must have a value between 0 and 9 (both inclusive). If no
precision is specified, p is equal to 0.
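The millisecond-of-day and nanosecond-of-day representations mentioned in the bridging remarks can be illustrated with java.time.LocalTime, whose semantics this type follows. This is a plain-Java sketch, not Flink code; the class name TimeOfDaySketch and the helper millisOfDay are made up for illustration.

```java
import java.time.LocalTime;

public class TimeOfDaySketch {

    // Hypothetical helper: milliseconds since midnight, derived from the nanosecond-of-day value.
    static long millisOfDay(LocalTime t) {
        return t.toNanoOfDay() / 1_000_000L;
    }

    public static void main(String[] args) {
        // Largest value of the TIME range described above.
        LocalTime max = LocalTime.of(23, 59, 59, 999_999_999);
        System.out.println(max.toNanoOfDay()); // nanoseconds of the day
        System.out.println(millisOfDay(max));  // milliseconds of the day
        // A precision of 0 keeps no fractional seconds.
        System.out.println(max.withNano(0));
    }
}
```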
TIMESTAMP
Data type of a timestamp without time zone consisting of year-month-day hour:minute:second[.fractional]
with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 to
9999-12-31 23:59:59.999999999.
Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as
the semantics are closer to java.time.LocalDateTime.
A conversion from and to BIGINT (a JVM long type) is not supported as this would imply a time
zone. However, this type is time zone free. For more java.time.Instant-like semantics use
TIMESTAMP_LTZ.
Declaration
DataTypes.TIMESTAMP(p)
Attention The precision specified in DataTypes.TIMESTAMP(p) must be 3 currently.
The type can be declared using TIMESTAMP(p) where p is the number of digits of fractional
seconds (precision). p must have a value between 0 and 9 (both inclusive). If no precision
is specified, p is equal to 6.
TIMESTAMP(p) WITHOUT TIME ZONE is a synonym for this type.
TIMESTAMP WITH TIME ZONE
Data type of a timestamp with time zone consisting of year-month-day hour:minute:second[.fractional] zone
with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to
9999-12-31 23:59:59.999999999 -14:59.
Compared to the SQL standard, leap seconds (23:59:60 and 23:59:61) are not supported as the semantics
are closer to java.time.OffsetDateTime.
Compared to TIMESTAMP_LTZ, the time zone offset information is physically
stored in every datum. It is used individually for every computation, visualization, or communication
to external systems.
Declaration
The type can be declared using TIMESTAMP(p) WITH TIME ZONE where p is the number of digits of
fractional seconds (precision). p must have a value between 0 and 9 (both inclusive). If no
precision is specified, p is equal to 6.
TIMESTAMP_LTZ
Data type of a timestamp with local time zone consisting of year-month-day hour:minute:second[.fractional] zone
with up to nanosecond precision and values ranging from 0000-01-01 00:00:00.000000000 +14:59 to
9999-12-31 23:59:59.999999999 -14:59.
Leap seconds (23:59:60 and 23:59:61) are not supported as the semantics are closer to java.time.OffsetDateTime.
Compared to TIMESTAMP WITH TIME ZONE, the time zone offset information is not stored physically
in every datum. Instead, the type assumes java.time.Instant semantics in UTC time zone at
the edges of the table ecosystem. Every datum is interpreted in the local time zone configured in
the current session for computation and visualization.
This type fills the gap between time zone free and time zone mandatory timestamp types by allowing
the interpretation of UTC timestamps according to the configured session time zone.
Java Type | Remarks
long | Describes the number of milliseconds since epoch. Output only if type is not nullable.
java.sql.Timestamp | Describes the number of milliseconds since epoch.
org.apache.flink.table.data.TimestampData | Internal data structure.
Attention The precision specified in DataTypes.TIMESTAMP_LTZ(p) must be 3 currently.
The type can be declared using TIMESTAMP_LTZ(p) where p is the number
of digits of fractional seconds (precision). p must have a value between 0 and 9
(both inclusive). If no precision is specified, p is equal to 6.
TIMESTAMP(p) WITH LOCAL TIME ZONE is a synonym for this type.
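The difference between storing a UTC-based instant and interpreting it in a session time zone can be sketched with java.time alone. This is not Flink code: the class LocalZoneSketch and the helper interpret are hypothetical, and the zone string merely stands in for whatever time zone a session is configured with.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class LocalZoneSketch {

    // Hypothetical helper: interprets a UTC-based instant in the given zone,
    // similar in spirit to how a session time zone is applied for computation and display.
    static LocalDateTime interpret(Instant instant, String zoneId) {
        return LocalDateTime.ofInstant(instant, ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        Instant epoch = Instant.ofEpochMilli(0L); // the stored value carries no zone offset
        System.out.println(interpret(epoch, "UTC"));
        System.out.println(interpret(epoch, "Asia/Shanghai"));
    }
}
```

The same stored instant yields different wall-clock values depending on the zone used for interpretation, while a TIMESTAMP value would stay the same in every session.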
INTERVAL YEAR TO MONTH
Data type for a group of year-month interval types.
The type must be parameterized to one of the following resolutions:
interval of years,
interval of years to months,
or interval of months.
An interval of year-month consists of +years-months with values ranging from -9999-11 to
+9999-11.
The value representation is the same for all types of resolutions. For example, an interval
of months of 50 is always represented in an interval-of-years-to-months format (with default
year precision): +04-02.
The type can be declared using the above combinations where p is the number of digits of years
(year precision). p must have a value between 1 and 4 (both inclusive). If no year precision
is specified, p is equal to 2.
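The value representation described above (an interval of 50 months rendered as +04-02) can be reproduced with a few lines of plain Java. The formatter below is a hypothetical illustration, not a Flink API, and assumes the default year precision of 2.

```java
public class YearMonthIntervalSketch {

    // Hypothetical formatter (not a Flink API): renders a month count as +years-months.
    static String format(int totalMonths) {
        String sign = totalMonths < 0 ? "-" : "+";
        int abs = Math.abs(totalMonths);
        // Years are zero-padded to the default year precision of 2 digits.
        return String.format("%s%02d-%02d", sign, abs / 12, abs % 12);
    }

    public static void main(String[] args) {
        System.out.println(format(50));
        System.out.println(format(-119_999));
    }
}
```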
INTERVAL DAY TO SECOND
Data type for a group of day-time interval types.
The type must be parameterized to one of the following resolutions with up to nanosecond precision:
interval of days,
interval of days to hours,
interval of days to minutes,
interval of days to seconds,
interval of hours,
interval of hours to minutes,
interval of hours to seconds,
interval of minutes,
interval of minutes to seconds,
or interval of seconds.
An interval of day-time consists of +days hours:minutes:seconds.fractional with values ranging from
-999999 23:59:59.999999999 to +999999 23:59:59.999999999. The value representation is the same
for all types of resolutions. For example, an interval of seconds of 70 is always represented in
an interval-of-days-to-seconds format (with default precisions): +00 00:01:10.000000.
The type can be declared using the above combinations where p1 is the number of digits of days
(day precision) and p2 is the number of digits of fractional seconds (fractional precision).
p1 must have a value between 1 and 6 (both inclusive). p2 must have a value between 0
and 9 (both inclusive). If no p1 is specified, it is equal to 2 by default. If no p2 is
specified, it is equal to 6 by default.
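As an illustration of the days-to-seconds representation, the sketch below formats a java.time.Duration with the default precisions (2 digits of days, 6 fractional digits). The formatter is hypothetical and not part of Flink; it only mirrors the textual format described above.

```java
import java.time.Duration;

public class DayTimeIntervalSketch {

    // Hypothetical formatter (not a Flink API): renders a duration as +days hours:minutes:seconds.fractional.
    static String format(Duration d) {
        String sign = d.isNegative() ? "-" : "+";
        Duration abs = d.abs();
        long seconds = abs.getSeconds();
        long days = seconds / 86_400;
        long hours = (seconds % 86_400) / 3_600;
        long minutes = (seconds % 3_600) / 60;
        long secs = seconds % 60;
        long micros = abs.getNano() / 1_000; // default fractional precision of 6 digits
        return String.format("%s%02d %02d:%02d:%02d.%06d", sign, days, hours, minutes, secs, micros);
    }

    public static void main(String[] args) {
        System.out.println(format(Duration.ofSeconds(70)));
        System.out.println(format(Duration.ofDays(1).plusHours(2)));
    }
}
```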
Constructed Data Types
ARRAY
Data type of an array of elements with the same subtype.
Compared to the SQL standard, the maximum cardinality of an array cannot be specified but is
fixed at 2,147,483,647. Also, any valid type is supported as a subtype.
Declaration
The type can be declared using ARRAY<t> where t is the data type of the contained
elements.
t ARRAY is a synonym that is closer to the SQL standard. For example, INT ARRAY is
equivalent to ARRAY<INT>.
MAP
Data type of an associative array that maps keys (including NULL) to values (including NULL). A map
cannot contain duplicate keys; each key can map to at most one value.
There is no restriction on element types; it is the responsibility of the user to ensure uniqueness.
The map type is an extension to the SQL standard.
Declaration
The type can be declared using MAP<kt, vt> where kt is the data type of the key elements
and vt is the data type of the value elements.
MULTISET
Data type of a multiset (=bag). Unlike a set, it allows for multiple instances for each of its
elements with a common subtype. Each unique value (including NULL) is mapped to some multiplicity.
There is no restriction on element types; it is the responsibility of the user to ensure uniqueness.
Declaration
The type can be declared using MULTISET<t> where t is the data type
of the contained elements.
t MULTISET is a synonym that is closer to the SQL standard. For example, INT MULTISET is
equivalent to MULTISET<INT>.
ROW
Data type of a sequence of fields.
A field consists of a field name, field type, and an optional description. The most specific type
of a row of a table is a row type. In this case, each column of the row corresponds to the field
of the row type that has the same ordinal position as the column.
Compared to the SQL standard, an optional field description simplifies the handling of complex
structures.
A row type is similar to the STRUCT type known from other non-standard-compliant frameworks.
The type can be declared using ROW<n0 t0 'd0', n1 t1 'd1', ...> where n is the unique name of
a field, t is the logical type of a field, d is the description of a field.
ROW(...) is a synonym that is closer to the SQL standard. For example, ROW(myField INT, myOtherField BOOLEAN) is
equivalent to ROW<myField INT, myOtherField BOOLEAN>.
User-Defined Data Types
Attention User-defined data types are not fully supported yet. They are
currently (as of Flink 1.11) only exposed as unregistered structured types in parameters and return types of functions.
A structured type is similar to an object in an object-oriented programming language. It contains
zero, one or more attributes. Each attribute consists of a name and a type.
There are two kinds of structured types:
Types that are stored in a catalog and are identified by a catalog identifier (like cat.db.MyType). Those
are equal to the SQL standard definition of structured types.
Anonymously defined, unregistered types (usually reflectively extracted) that are identified by
an implementation class (like com.myorg.model.MyType). Those are useful when programmatically
defining a table program. They enable reusing existing JVM classes without manually defining the
schema of a data type again.
Registered Structured Types
Currently, registered structured types are not supported. Thus, they cannot be stored in a catalog
or referenced in a CREATE TABLE DDL.
Unregistered Structured Types
Unregistered structured types can be created from regular POJOs (Plain Old Java Objects) using automatic reflective extraction.
The implementation class of a structured type must meet the following requirements:
The class must be globally accessible which means it must be declared public, static, and not abstract.
The class must offer a default constructor with zero arguments or a full constructor that assigns all
fields.
All fields of the class must be readable by either public declaration or a getter that follows common
coding style such as getField(), isField(), field().
All fields of the class must be writable by either public declaration, fully assigning constructor,
or a setter that follows common coding style such as setField(...), field(...).
All fields must be mapped to a data type either implicitly via reflective extraction or explicitly
using @DataTypeHint annotations.
Fields that are declared static or transient are ignored.
The reflective extraction supports arbitrary nesting of fields as long as a field type does not
(transitively) refer to itself.
The declared field class (e.g. public int age;) must be contained in the list of supported JVM
bridging classes defined for every data type in this document (e.g. java.lang.Integer or int for INT).
For some classes an annotation is required in order to map the class to a data type (e.g. @DataTypeHint("DECIMAL(10, 2)")
to assign a fixed precision and scale for java.math.BigDecimal).
Declaration
class User {
    // enrich the extraction with precision information
    public @DataTypeHint("DECIMAL(10, 2)") BigDecimal totalBalance;

    // enrich the extraction by forcing the use of a RAW type
    public @DataTypeHint("RAW") Class<?> modelClass;
}

DataTypes.of(User.class);
Bridging to JVM Types
Java Type
Input
Output
Remarks
case class User(
    // enrich the extraction with precision information
    @DataTypeHint("DECIMAL(10, 2)") totalBalance: java.math.BigDecimal,
    // enrich the extraction by forcing the use of a RAW type
    @DataTypeHint("RAW") modelClass: Class[_]
)

DataTypes.of(classOf[User])
Other Data Types
BOOLEAN
Data type of a boolean with a (possibly) three-valued logic of TRUE, FALSE, and UNKNOWN.
Declaration
RAW
Data type of an arbitrary serialized type. This type is a black box within the table ecosystem
and is only deserialized at the edges.
The raw type is an extension to the SQL standard.
Declaration
The type can be declared using RAW('class', 'snapshot') where class is the originating class and
snapshot is the serialized TypeSerializerSnapshot in Base64 encoding. Usually, the type string is not
declared directly but is generated while persisting the type.
In the API, the RAW type can be declared either by directly supplying a Class + TypeSerializer or
by passing Class and letting the framework extract Class + TypeSerializer from there.
NULL
Data type for representing untyped NULL values.
The null type is an extension to the SQL standard. A null type has no other value
except NULL, thus, it can be cast to any nullable type similar to JVM semantics.
This type helps in representing unknown types in API calls that use a NULL literal
as well as bridging to formats such as JSON or Avro that define such a type as well.
This type is not very useful in practice and is just mentioned here for completeness.
Declaration
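Flink Table API and Flink SQL support casting from an input data type to a target data type. Some casts
always succeed regardless of the input value, while others can fail at runtime (that is, when no value of the target data type corresponds to the input).
For example, casting a value of the INT data type to STRING always succeeds, but casting a STRING value to INT is not guaranteed to succeed.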
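The asymmetry between always-successful and potentially failing conversions can be sketched in plain Java. This is only an analogy, not Flink's CAST implementation; the class CastSketch and both helpers are hypothetical names introduced for illustration.

```java
import java.util.Optional;

public class CastSketch {

    // Converting an int to a string succeeds for every input value.
    static String intToString(int value) {
        return Integer.toString(value);
    }

    // Converting a string to an int can fail at runtime; failure is modeled as an empty Optional.
    static Optional<Integer> stringToInt(String value) {
        try {
            return Optional.of(Integer.parseInt(value.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(intToString(42));
        System.out.println(stringToInt("42"));
        System.out.println(stringToInt("not a number"));
    }
}
```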