JSON Schema Overview

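  1. A JSON Schema is itself a JSON document;
  2. Each schema describes one JSON instance, and every node in that instance can in turn be described by a schema. A schema is therefore hierarchical, just like JSON itself: one schema may nest several further levels of schemas;
  3. The rules a JSON Schema defines are mainly about data format (whether a field exists, what type it has). It also supports some simple correctness checks (numeric ranges, string patterns, and so on), but it cannot express complex logical validation (for example, that the purchase price must be lower than the selling price);
  4. JSON Schema format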

    A JSON Schema follows the JSON specification and is itself a JSON string. Let's start with an example:

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Product",
      "description": "A product from Acme's catalog",
      "type": "object",
      "properties": {
        "id": {
          "description": "The unique identifier for a product",
          "type": "integer"
        },
        "name": {
          "description": "Name of the product",
          "type": "string"
        },
        "price": {
          "type": "number",
          "minimum": 0,
          "exclusiveMinimum": true
        }
      },
      "required": ["id", "name", "price"]
    }

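    The top level of this JSON Schema contains the following fields:

    $schema      declares which version of the specification the schema is written against; here, draft-04
    title        a title describing the structure
    description  a longer description of the structure
    properties   defines the fields of the object
    required     lists the fields that must be present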

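    This is only a simple example, but it shows that a JSON Schema is itself a JSON string, expressed as key-value pairs.
    type and properties define the types of the JSON properties, and required constrains which fields of the object are mandatory. In fact, JSON Schema defines the types that JSON supports, and each type has zero or more constraints of its own. The next section goes through them in detail.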
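
    To make the meaning of type, required, and minimum concrete, here is a hand-written sketch of the checks that the Product schema asks a validator to perform. It is not how any schema library is implemented; the class name ProductCheck and the choice to model a JSON object as a Map are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ProductCheck {
    // Returns the list of violations; an empty list means the instance passed.
    static List<String> check(Map<String, Object> product) {
        List<String> errors = new ArrayList<>();
        // "required": ["id", "name", "price"]
        for (String field : List.of("id", "name", "price")) {
            if (!product.containsKey(field)) {
                errors.add("missing required field: " + field);
            }
        }
        // "id": {"type": "integer"}
        if (product.containsKey("id")) {
            Object id = product.get("id");
            if (!(id instanceof Integer || id instanceof Long)) {
                errors.add("id must be an integer");
            }
        }
        // "name": {"type": "string"}
        if (product.containsKey("name") && !(product.get("name") instanceof String)) {
            errors.add("name must be a string");
        }
        // "price": {"type": "number", "minimum": 0, "exclusiveMinimum": true}
        if (product.containsKey("price")) {
            Object price = product.get("price");
            if (!(price instanceof Number)) {
                errors.add("price must be a number");
            } else if (((Number) price).doubleValue() <= 0) {
                errors.add("price must be greater than 0");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // A valid product: no violations are reported.
        System.out.println(check(Map.of("id", 1, "name", "Widget", "price", 9.5)));
        // "name" is missing and the price is not strictly positive: two violations.
        System.out.println(check(Map.of("id", 1, "price", 0)));
    }
}
```

    A rule such as "purchase price must be lower than selling price" has no counterpart in this sketch, because it relates two fields to each other rather than constraining one field on its own.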

    JSON Schema types

    Object

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Product",
      "description": "A product from Acme's catalog",
      "type": "object",
      "properties": {
        "id": {
          "description": "The unique identifier for a product",
          "type": "integer"
        },
        "name": {
          "description": "Name of the product",
          "type": "string"
        },
        "price": {
          "type": "number",
          "minimum": 0,
          "exclusiveMinimum": true
        }
      },
      "required": ["id", "name", "price"]
    }

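    The example above uses three keywords for the object type: type (restricts the type), properties (defines the fields of the object), and required (lists the mandatory fields). The object-specific keywords are:

    properties            defines each field
    required              lists the mandatory fields
    maxProperties         maximum number of properties
    minProperties         minimum number of properties
    additionalProperties  true, false, or an object (a schema)

    properties defines the name and type of each property, as in the example above.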

    array

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Product",
      "description": "A product from Acme's catalog",
      "type": "array",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "uniqueItems": true
    }

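    The example above uses three array-specific keywords: items, minItems, and uniqueItems. The array keywords are:

    items              the type of each array element
    minItems           constraint: minimum number of elements
    maxItems           constraint: maximum number of elements
    uniqueItems        constraint: all elements must be distinct
    additionalItems    constrains the type of elements beyond those covered by items; not recommended
    dependencies       (applies to objects, not arrays)
    patternProperties  (applies to objects, not arrays)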
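
    The three constraints used in the array example can be read as three plain checks. The sketch below spells them out by hand for a list of values; the class name ArrayCheck is hypothetical, and a real validator would of course read the limits from the schema instead of hard-coding them.

```java
import java.util.HashSet;
import java.util.List;

class ArrayCheck {
    // Mirrors: "items": {"type": "string"}, "minItems": 1, "uniqueItems": true
    static boolean isValid(List<?> items) {
        // minItems: the array needs at least one element.
        if (items.size() < 1) {
            return false;
        }
        // items: every element must itself satisfy the element schema.
        for (Object item : items) {
            if (!(item instanceof String)) {
                return false;
            }
        }
        // uniqueItems: a set drops duplicates, so the sizes differ if any exist.
        return new HashSet<Object>(items).size() == items.size();
    }

    public static void main(String[] args) {
        System.out.println(isValid(List.of("red", "green"))); // true
        System.out.println(isValid(List.of()));               // false, violates minItems
        System.out.println(isValid(List.of("red", "red")));   // false, violates uniqueItems
    }
}
```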

    string

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Product",
      "description": "A product from Acme's catalog",
      "type": "object",
      "properties": {
        "mail": {
          "type": "string",
          "pattern": "\\w+([-+.]\\w+)*@\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*"
        },
        "phoneNumber": {
          "type": "string",
          "pattern": "(\\(\\d{3,4}\\)|\\d{3,4}-)?\\d{7,8}(-\\d{3})*"
        }
      },
      "required": ["mail", "phoneNumber"]
    }
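
    The pattern keyword takes a regular expression. As a rough illustration, the sketch below applies the two expressions from the schema above with java.util.regex, assuming they are meant to be the usual e-mail and phone-number expressions written with the \w and \d character classes. The class and constant names are hypothetical. Because pattern is not implicitly anchored in JSON Schema, find() is used rather than matches().

```java
import java.util.regex.Pattern;

class PatternCheck {
    // Assumed reading of the e-mail pattern, with \w character classes.
    static final Pattern MAIL =
            Pattern.compile("\\w+([-+.]\\w+)*@\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*");
    // Assumed reading of the phone pattern: optional area code, then 7 or 8 digits.
    static final Pattern PHONE =
            Pattern.compile("(\\(\\d{3,4}\\)|\\d{3,4}-)?\\d{7,8}(-\\d{3})*");

    // "pattern" succeeds if the expression matches anywhere in the string.
    static boolean satisfies(Pattern pattern, String value) {
        return pattern.matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(satisfies(MAIL, "user@example.com")); // true
        System.out.println(satisfies(MAIL, "not-an-email"));     // false
        System.out.println(satisfies(PHONE, "010-12345678"));    // true
        System.out.println(satisfies(PHONE, "12345"));           // false
    }
}
```

    If a whole-string match is wanted, the expression itself has to carry the ^ and $ anchors.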

    integer

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Product",
      "description": "A product from Acme's catalog",
      "type": "object",
      "properties": {
        "name": {
          "description": "Name of the product",
          "type": "string"
        },
        "price": {
          "type": "integer",
          "minimum": 0,
          "exclusiveMinimum": true
        }
      },
      "required": ["id", "name", "price"]
    }

    number

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Product",
      "description": "A product from Acme's catalog",
      "type": "object",
      "properties": {
        "name": {
          "description": "Name of the product",
          "type": "string"
        },
        "price": {
          "type": "number",
          "minimum": 0,
          "exclusiveMinimum": true
        }
      },
      "required": ["id", "name", "price"]
    }

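    The number type can describe numbers of any size and with any number of decimal places. Its constraints are:

    minimum           constraint: the minimum value
    exclusiveMinimum  if present with the boolean value true, the instance is valid only if it is strictly greater than minimum
    maximum           constraint: the maximum value
    exclusiveMaximum  if present with the boolean value true, the instance is valid only if it is strictly less than maximum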
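
    In draft-04 the exclusive keywords are boolean switches that change how minimum and maximum are compared. A minimal sketch of that comparison, with hypothetical method names:

```java
class RangeCheck {
    // draft-04: exclusiveMinimum turns ">=" into ">".
    static boolean satisfiesMinimum(double value, double minimum, boolean exclusiveMinimum) {
        return exclusiveMinimum ? value > minimum : value >= minimum;
    }

    // draft-04: exclusiveMaximum turns "<=" into "<".
    static boolean satisfiesMaximum(double value, double maximum, boolean exclusiveMaximum) {
        return exclusiveMaximum ? value < maximum : value <= maximum;
    }

    public static void main(String[] args) {
        // "minimum": 0, "exclusiveMinimum": true means the price must be strictly positive.
        System.out.println(satisfiesMinimum(0.0, 0.0, true));   // false
        System.out.println(satisfiesMinimum(0.0, 0.0, false));  // true
        System.out.println(satisfiesMinimum(9.99, 0.0, true));  // true
    }
}
```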

    boolean

    {
      "type": "object",
      "properties": {
        "number": { "type": "boolean" },
        "street_name": { "type": "string" },
        "street_type": {
          "type": "string",
          "enum": ["Street", "Avenue", "Boulevard"]
        }
      }
    }

    true or false

    enum

    {
      "type": "object",
      "properties": {
        "number": { "type": "number" },
        "street_name": { "type": "string" },
        "street_type": {
          "type": "string",
          "enum": ["Street", "Avenue", "Boulevard"]
        }
      }
    }

    enum can also be used on its own, without type:

    {
      "type": "object",
      "properties": {
        "number": { "type": "number" },
        "street_name": { "type": "string" },
        "street_type": { "enum": ["Street", "Avenue", "Boulevard"] }
      }
    }

    null

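    Advanced JSON Schema

    The type definitions and constraints covered above are enough for most situations. To write better schemas, though, a few more keywords are worth learning.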

    $ref

    $ref is used to reference another schema.
    Example:

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Product set",
      "type": "array",
      "items": {
        "title": "Product",
        "type": "object",
        "properties": {
          "id": {
            "description": "The unique identifier for a product",
            "type": "number"
          },
          "name": {
            "type": "string"
          },
          "price": {
            "type": "number",
            "minimum": 0,
            "exclusiveMinimum": true
          },
          "tags": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "minItems": 1,
            "uniqueItems": true
          },
          "dimensions": {
            "type": "object",
            "properties": {
              "length": {"type": "number"},
              "width": {"type": "number"},
              "height": {"type": "number"}
            },
            "required": ["length", "width", "height"]
          },
          "warehouseLocation": {
            "description": "Coordinates of the warehouse with the product",
            "$ref": "http://json-schema.org/geo"
          }
        },
        "required": ["id", "name", "price"]
      }
    }

    definitions

    When a schema grows large, it can help to define internal building blocks and reference them with $ref. Example:

    {
      "type": "array",
      "items": { "$ref": "#/definitions/positiveInteger" },
      "definitions": {
        "positiveInteger": {
          "type": "integer",
          "minimum": 0,
          "exclusiveMinimum": true
        }
      }
    }
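
    A reference such as "#/definitions/positiveInteger" is a JSON Pointer into the same document: each segment after "#/" names a key to descend into. The sketch below resolves such a local reference over nested maps. The class name LocalRef is hypothetical, the schema is modelled as a Map purely for illustration, and remote references (such as the http://json-schema.org/geo URL used earlier) are deliberately not handled.

```java
import java.util.Map;

class LocalRef {
    // Resolves a reference of the form "#/a/b" by walking nested maps from the root.
    static Object resolve(Map<String, Object> root, String ref) {
        if (!ref.startsWith("#/")) {
            throw new IllegalArgumentException("only local references are handled: " + ref);
        }
        Object current = root;
        for (String token : ref.substring(2).split("/")) {
            if (!(current instanceof Map)) {
                throw new IllegalArgumentException("cannot resolve " + ref);
            }
            // JSON Pointer escapes: "~1" stands for "/", "~0" stands for "~".
            String key = token.replace("~1", "/").replace("~0", "~");
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> positiveInteger =
                Map.of("type", "integer", "minimum", 0, "exclusiveMinimum", true);
        Map<String, Object> root = Map.of(
                "type", "array",
                "items", Map.of("$ref", "#/definitions/positiveInteger"),
                "definitions", Map.of("positiveInteger", positiveInteger));
        // Prints the positiveInteger subschema that "items" points at.
        System.out.println(resolve(root, "#/definitions/positiveInteger"));
    }
}
```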

    allOf

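    allOf means the instance must be valid against every one of the listed subschemas. Where it suffices, required is the suggested alternative.

    Not recommended. Example: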

    {
      "definitions": {
        "address": {
          "type": "object",
          "properties": {
            "street_address": { "type": "string" },
            "city": { "type": "string" },
            "state": { "type": "string" }
          },
          "required": ["street_address", "city", "state"]
        }
      },
      "allOf": [
        { "$ref": "#/definitions/address" },
        {
          "properties": {
            "type": { "enum": [ "residential", "business" ] }
          }
        }
      ]
    }

    anyOf

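    anyOf means the instance must be valid against at least one of the listed subschemas. Where they suffice, required and minProperties are the suggested alternatives. Example: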

    {
      "anyOf": [
        { "type": "string" },
        { "type": "number" }
      ]
    }

    oneOf

    oneOf means the instance must be valid against exactly one of the listed subschemas.

    {
      "oneOf": [
        { "type": "number", "multipleOf": 5 },
        { "type": "number", "multipleOf": 3 }
      ]
    }
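
    The difference between allOf, anyOf, and oneOf comes down to how many subschemas the instance satisfies. The sketch below models each subschema as a java.util.function.Predicate, which is an illustrative simplification and not how a schema library represents schemas, and replays the multipleOf example above. The class name Combinators is hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;

class Combinators {
    // Counts how many of the subschemas accept the value.
    static <T> long matching(List<Predicate<T>> schemas, T value) {
        return schemas.stream().filter(schema -> schema.test(value)).count();
    }

    // allOf: every subschema must accept the value.
    static <T> boolean allOf(List<Predicate<T>> schemas, T value) {
        return matching(schemas, value) == schemas.size();
    }

    // anyOf: at least one subschema must accept the value.
    static <T> boolean anyOf(List<Predicate<T>> schemas, T value) {
        return matching(schemas, value) >= 1;
    }

    // oneOf: exactly one subschema must accept the value.
    static <T> boolean oneOf(List<Predicate<T>> schemas, T value) {
        return matching(schemas, value) == 1;
    }

    public static void main(String[] args) {
        // Stand-ins for {"multipleOf": 5} and {"multipleOf": 3}.
        List<Predicate<Integer>> schemas = List.of(n -> n % 5 == 0, n -> n % 3 == 0);
        System.out.println(oneOf(schemas, 10)); // true: multiple of 5 only
        System.out.println(oneOf(schemas, 9));  // true: multiple of 3 only
        System.out.println(oneOf(schemas, 15)); // false: satisfies both
        System.out.println(anyOf(schemas, 15)); // true
        System.out.println(oneOf(schemas, 2));  // false: satisfies neither
    }
}
```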

    not

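    not means the instance must not be valid against the given schema. In the example below, any value that is not a string is accepted: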

    { "not": { "type": "string" } }

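    Java JSON Schema libraries

    The table below lists two JSON Schema libraries for Java.

    Library  URL                                                   Supported drafts
    fge      https://github.com/daveclayton/json-schema-validator  draft-04, draft-03
    everit   https://github.com/everit-org/json-schema             draft-04
  5. If the project already uses Jackson, fge is a good choice, because fge itself is built on Jackson.

  6. If the project uses the org.json API, everit is the better fit.

  7. If the project uses some other library, choose everit, because everit performs about two times better than fge.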

    Using fge:

    Maven configuration

    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-core</artifactId>
      <version>2.3.0</version>
    </dependency>

    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.3.0</version>
    </dependency>

    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-annotations</artifactId>
      <version>2.3.0</version>
    </dependency>

    <dependency>
      <groupId>com.github.fge</groupId>
      <artifactId>json-schema-validator</artifactId>
      <version>2.2.6</version>
    </dependency>

    Test code:

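    import com.fasterxml.jackson.databind.JsonNode;
    import com.github.fge.jackson.JsonNodeReader;
    import com.github.fge.jsonschema.core.report.ProcessingReport;
    import com.github.fge.jsonschema.main.JsonSchemaFactory;
    import org.junit.Assert;
    import org.junit.Test;
    import java.io.FileReader;
    import java.io.IOException;

    @Test
    public void testJsonSchema1() {
        JsonNode schema = readJsonFile("src/main/resources/Schema.json");
        JsonNode data = readJsonFile("src/main/resources/failure.json");
        // The validation itself is this single call: rules first, then data.
        ProcessingReport report = JsonSchemaFactory.byDefault().getValidator().validateUnchecked(schema, data);
        // failure.json violates Schema.json, so this assertion fails as written.
        Assert.assertTrue(report.isSuccess());
    }

    // Reads a JSON file into a Jackson tree; returns null if the file cannot be read.
    private JsonNode readJsonFile(String filePath) {
        JsonNode instance = null;
        try (FileReader reader = new FileReader(filePath)) {
            instance = new JsonNodeReader().fromReader(reader);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return instance;
    }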

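    The actual validation takes only one line of code, which is passed the validation rules and the data. There are two methods, validate and validateUnchecked; the difference is that validateUnchecked does not throw the checked ProcessingException.

    JSON can also be read from a string, as follows: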

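    import com.fasterxml.jackson.databind.JsonNode;
    import com.github.fge.jackson.JsonLoader;
    import com.github.fge.jsonschema.core.report.ProcessingMessage;
    import com.github.fge.jsonschema.core.report.ProcessingReport;
    import com.github.fge.jsonschema.main.JsonSchemaFactory;
    import org.junit.Test;
    import java.io.IOException;
    import java.util.Iterator;

    @Test
    public void testJsonSchema2() throws IOException {
        String failure = "{\"foo\":1234}";
        String schemaText = "{\"type\": \"object\", \"properties\" : {\"foo\" : {\"type\" : \"string\"}}}";
        // Parse both strings into Jackson trees; a parse error now fails the test
        // instead of leaving the report null.
        JsonNode data = JsonLoader.fromString(failure);
        JsonNode schema = JsonLoader.fromString(schemaText);
        ProcessingReport report = JsonSchemaFactory.byDefault().getValidator().validateUnchecked(schema, data);
        //Assert.assertTrue(report.isSuccess());
        // Print every message the report collected.
        Iterator<ProcessingMessage> it = report.iterator();
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }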

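    The ProcessingReport object exposes an iterator. When validation fails it provides detailed failure information (a successful run produces no messages). Each error may contain the following attributes: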

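    level: the error level (presumably always error here)
    schema: the URI location of the schema that caused the failure
    instance: the part of the instance that is in error
    domain: the validation domain
    keyword: the constraint keyword that caused the error
    found: the type that was found
    expected: the type that was expected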

    The JSON used by the code above is:

    failure.json :

    {"foo" : 1234}

    Schema.json :

    {
      "type": "object",
      "properties": {
        "foo": {
          "type": "string"
        }
      }
    }

    The resulting error message is:

    error: instance type (integer) does not match any allowed primitive type (allowed: ["string"])
    level: "error"
    schema: {"loadingURI":"#","pointer":"/properties/foo"}
    instance: {"pointer":"/foo"}
    domain: "validation"
    keyword: "type"
    found: "integer"
    expected: ["string"]

    Using everit:

    Maven configuration

    <dependency>
      <groupId>org.everit.json</groupId>
      <artifactId>org.everit.json.schema</artifactId>
      <version>1.3.0</version>
    </dependency>

    Test code:

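    import org.everit.json.schema.Schema;
    import org.everit.json.schema.ValidationException;
    import org.everit.json.schema.loader.SchemaLoader;
    import org.json.JSONObject;
    import org.json.JSONTokener;
    import org.junit.Test;
    import java.io.InputStream;

    @Test
    public void testJsonSchema3() {
        InputStream inputStream = getClass().getResourceAsStream("/Schema.json");
        // Renamed from "Schema" so the variable no longer shadows the Schema class.
        JSONObject rawSchema = new JSONObject(new JSONTokener(inputStream));
        JSONObject data = new JSONObject("{\"foo\" : 1234}");
        Schema schema = SchemaLoader.load(rawSchema);
        try {
            schema.validate(data);
        } catch (ValidationException e) {
            // Validation failures surface as an exception rather than a report.
            System.out.println(e.getMessage());
        }
    }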

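    If validation fails, a ValidationException is thrown, and the catch block prints the error message. Compared with fge, everit's error messages are simpler. For the same JSON test files, the output is: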

    #/foo: expected type: String, found: Integer

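    In addition, everit supports the format keyword, for which a custom validator can be defined to check more complex data in the JSON, such as IP addresses or phone numbers. See the official documentation for details.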

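    Performance test:

    (1) 1000 runs in total, with the success and failure cases run separately and 250 runs for each case. The elapsed time of each batch was recorded; the batch was repeated 10 times and the results averaged.

    Time per 1000 runs for fge (ms): 1158, 1122, 1120, 1042, 1180, 1254, 1198, 1126, 1177, 1192
    Time per 1000 runs for everit (ms): 33, 49, 54, 57, 51, 47, 48, 52, 53, 44

    (2) 10000 runs in total, with the success and failure cases run separately and 2500 runs for each case.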

    Method / scenario     Time per run (ms)
    fge / scenario 1      1.1569
    fge / scenario 2      0.3407
    everit / scenario 1   0.0488
    everit / scenario 2
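
    The per-run figures for scenario 1 follow directly from the ten batch totals listed earlier: average the totals, then divide by the 1000 runs in each batch. The sketch below redoes that arithmetic; the class and field names are hypothetical, and the input numbers are copied from the measurements above.

```java
import java.util.Arrays;

class BenchmarkAverage {
    // Batch totals in milliseconds, each covering 1000 validation runs.
    static final int[] FGE_MS = {1158, 1122, 1120, 1042, 1180, 1254, 1198, 1126, 1177, 1192};
    static final int[] EVERIT_MS = {33, 49, 54, 57, 51, 47, 48, 52, 53, 44};

    // Mean batch total divided by the number of runs per batch.
    static double perRunMillis(int[] batchTotals) {
        double meanTotal = Arrays.stream(batchTotals).average().orElse(Double.NaN);
        return meanTotal / 1000.0;
    }

    public static void main(String[] args) {
        double fge = perRunMillis(FGE_MS);
        double everit = perRunMillis(EVERIT_MS);
        System.out.println("fge per run (ms): " + fge);
        System.out.println("everit per run (ms): " + everit);
        System.out.println("fge / everit: " + fge / everit);
    }
}
```

    The results reproduce the 1.1569 ms and 0.0488 ms entries for scenario 1, a ratio of roughly 24 to 1 on these measurements.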