添加链接
link管理
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接

Data Types #

Presto has a set of built-in data types, described below. Additional types can be provided by plugins.

Connectors are not required to support all types. See connector documentation for details on supported types.

  • Boolean

  • Integer

  • Floating-Point

  • Fixed-Precision

  • String

  • Date and Time

  • Structural

  • Network Address

  • HyperLogLog

  • KHyperLogLog

  • Quantile Digest

  • T-Digest

  • Boolean #

    BOOLEAN #

    This type captures boolean values true and false .

    A 32-bit signed two’s complement integer with a minimum value of -2^31 and a maximum value of 2^31 - 1 . The name INT is also available for this type.

    BIGINT #

    A 64-bit signed two’s complement integer with a minimum value of -2^63 and a maximum value of 2^63 - 1 .

    A fixed precision decimal number. Precision up to 38 digits is supported but performance is best up to 18 digits.

    The decimal type takes two literal parameters:

  • precision - total number of digits

  • scale - number of digits in fractional part. Scale is optional and defaults to 0.

  • Example type definitions: DECIMAL(10,3) , DECIMAL(20)

    Example literals: DECIMAL '10.3' , DECIMAL '1234567890' , 1.1

    For compatibility reasons decimal literals without explicit type specifier (e.g. 1.2 ) are treated as values of the DOUBLE type by default up to version 0.198. After 0.198 they are parsed as DECIMAL.

  • System wide property: parse-decimal-literals-as-double

  • Session wide property: parse_decimal_literals_as_double

  • CHAR #

    Fixed length character data. A CHAR type without length specified has a default length of 1. A CHAR(x) value always has x characters. For instance, casting dog to CHAR(7) adds 4 implicit trailing spaces. Leading and trailing spaces are included in comparisons of CHAR values. As a result, two character values with different lengths ( CHAR(x) and CHAR(y) where x != y ) will never be equal.

    Example type definitions: char , char(20)

    VARBINARY #

    Variable length binary data.

    Binary strings with length are not yet supported: varbinary(n)

    JSON #

    JSON value type, which can be a JSON object, a JSON array, a JSON number, a JSON string, true , false or null .

    Time of day (hour, minute, second, millisecond) without a time zone. Values of this type are parsed and rendered in the session time zone.

    Example: TIME '01:02:03.456'

    TIME WITH TIME ZONE #

    Time of day (hour, minute, second, millisecond) with a time zone. Values of this type are rendered using the time zone from the value.

    Example: TIME '01:02:03.456 America/Los_Angeles'

    TIMESTAMP #

    Instant in time that includes the date and time of day without a time zone. Values of this type are parsed and rendered in the session time zone.

    Example: TIMESTAMP '2001-08-22 03:04:05.321'

    TIMESTAMP WITH TIME ZONE #

    Instant in time that includes the date and time of day with a time zone. Values of this type are rendered using the time zone from the value.

    Example: TIMESTAMP '2001-08-22 03:04:05.321 America/Los_Angeles'

    INTERVAL YEAR TO MONTH #

    Span of years and months.

    Example: INTERVAL '3' MONTH

    INTERVAL DAY TO SECOND #

    Span of days, hours, minutes, seconds and milliseconds.

    Example: INTERVAL '2' DAY

    A structure made up of named fields. The fields may be of any SQL type, and are accessed with field reference operator .

    Example: CAST(ROW(1, 2.0) AS ROW(x BIGINT, y DOUBLE))

    An IP address that can represent either an IPv4 or IPv6 address.

    Internally, the type is a pure IPv6 address. Support for IPv4 is handled using the IPv4-mapped IPv6 address range ( RFC 4291#section-2.5.5.2 ). When creating an IPADDRESS , IPv4 addresses will be mapped into that range.

    When formatting an IPADDRESS , any address within the mapped range will be formatted as an IPv4 address. Other addresses will be formatted as IPv6 using the canonical format defined in RFC 5952 .

    Examples: IPADDRESS '10.0.0.1' , IPADDRESS '2001:db8::1'

    IPPREFIX #

    An IP routing prefix that can represent either an IPv4 or IPv6 address.

    Internally, an address is a pure IPv6 address. Support for IPv4 is handled using the IPv4-mapped IPv6 address range ( RFC 4291#section-2.5.5.2 ). When creating an IPPREFIX , IPv4 addresses will be mapped into that range. Additionally, addresses will be reduced to the first address of a network.

    IPPREFIX values will be formatted in CIDR notation, written as an IP address, a slash (‘/’) character, and the bit-length of the prefix. Any address within the IPv4-mapped IPv6 address range will be formatted as an IPv4 address. Other addresses will be formatted as IPv6 using the canonical format defined in RFC 5952 .

    Examples: IPPREFIX '10.0.1.0/24' , IPPREFIX '2001:db8::/48'

    This type represents a UUID (Universally Unique IDentifier), also known as a GUID (Globally Unique IDentifier), using the format defined in RFC 4122 .

    Example: UUID '12151fd2-7586-11e9-8f9e-2a86e4085a59'

    HyperLogLog #

    Calculating the approximate distinct count can be done much more cheaply than an exact count using the HyperLogLog data sketch. See HyperLogLog Functions .

    HyperLogLog #

    A HyperLogLog sketch allows efficient computation of approx_distinct() . It starts as a sparse representation, switching to a dense representation when it becomes more efficient.

    P4HyperLogLog #

    A P4HyperLogLog sketch is similar to HyperLogLog , but it starts (and remains) in the dense representation.

    KHyperLogLog #

    A KHyperLogLog is a data sketch that can be used to compactly represents the association of two columns. See KHyperLogLog Functions .

    QDigest #

    A quantile digest (qdigest) is a summary structure which captures the approximate distribution of data for a given input set, and can be queried to retrieve approximate quantile values from the distribution. The level of accuracy for a qdigest is tunable, allowing for more precise results at the expense of space.

    A qdigest can be used to give approximate answer to queries asking for what value belongs at a certain quantile. A useful property of qdigests is that they are additive, meaning they can be merged together without losing precision.

    A qdigest may be helpful whenever the partial results of approx_percentile can be reused. For example, one may be interested in a daily reading of the 99th percentile values that are read over the course of a week. Instead of calculating the past week of data with approx_percentile , qdigest s could be stored daily, and quickly merged to retrieve the 99th percentile value.

    See Quantile Digest Functions .

    TDigest #

    A t-digest is similar to qdigest , but it uses a different algorithm to represent the approximate distribution of a set of numbers. T-digest has better performance than quantile digests but only supports the DOUBLE type. See T-Digest Functions .