Compare the performance, advantages, and disadvantages of fastjson, gjson, and jsonparser

This article delves into the analysis of how the standard library in Go parses JSON and then explores popular JSON parsing libraries, their characteristics, and how they can better assist us in development in different scenarios.

This article was first published in the Medium MPP plan. If you are a Medium user, please follow me on Medium. Thank you very much.

I didn’t plan to look into JSON library performance. Recently, however, I ran pprof on my project and found from the flame graph that more than half of the time spent in business logic processing went to JSON parsing. That is how this article came about.

This article mainly analyzes the following libraries (as of 2024-06-13):

The official JSON parsing function, json.Unmarshal, takes two parameters: the JSON data to be parsed and the value v that receives the result. Before actually performing JSON parsing, reflect.ValueOf is called to obtain the reflection object of parameter v. Then, the parsing method is chosen based on the first non-whitespace character of the incoming data.

func (d *decodeState) value(v reflect.Value) error {
    switch d.opcode {
    default:
        panic(phasePanicMsg)
    // array
    case scanBeginArray:
        // ...
    // struct or map
    case scanBeginObject:
        // ...
    // Literals, including int, string, float, etc.
    case scanBeginLiteral:
        // ...
    }
    return nil
}

If the parsed object starts with [, it indicates that this is an array object and will enter the scanBeginArray branch; if it starts with {, it indicates that the parsed object is a struct or map, and then enters the scanBeginObject branch, and so on.
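The branching described above can be sketched in a few lines. This is only an illustration of dispatching on the first non-whitespace byte; kindOf is a hypothetical helper of mine, not part of encoding/json, and the real decoder drives this through its scanner state machine.

```go
package main

import "fmt"

// kindOf is a hypothetical helper (not part of encoding/json). It mirrors the
// idea of choosing a parsing branch from the first non-whitespace byte.
func kindOf(data []byte) string {
	for _, b := range data {
		switch b {
		case ' ', '\t', '\r', '\n':
			continue // skip leading JSON whitespace
		case '[':
			return "array"
		case '{':
			return "object"
		default:
			return "literal" // string, number, true, false, or null
		}
	}
	return "empty"
}

func main() {
	for _, in := range []string{`[1, 2]`, `  {"a": 1}`, `"text"`, `42`} {
		fmt.Printf("%q -> %s\n", in, kindOf([]byte(in)))
	}
}
```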

Sub Summary

Looking at Unmarshal’s source code, you can see that it relies heavily on reflection to obtain field values. If the JSON is nested, reflection has to be applied recursively to obtain the values, so performance is relatively poor.

However, if performance is not a major concern, using it directly is actually a very good choice. It is feature-complete, and the official team keeps iterating on and optimizing it, so its performance may take a qualitative leap in a future version. Among the libraries discussed here, it is also the only one that can directly convert JSON objects into Go structs.
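As a reminder of what converting JSON directly into a Go struct looks like, here is a minimal sketch with the standard library. The Name and Person types and the decodePerson helper are illustrative assumptions, not code from my project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Name and Person are illustrative types used only for this sketch.
type Name struct {
	First string `json:"first"`
	Last  string `json:"last"`
}

type Person struct {
	Name Name `json:"name"`
	Age  int  `json:"age"`
}

// decodePerson fills a Person from raw JSON; nested objects are mapped
// onto nested structs by encoding/json through reflection.
func decodePerson(data []byte) (Person, error) {
	var p Person
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	p, err := decodePerson([]byte(`{"name":{"first":"li","last":"dj"},"age":18}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(p.Name.First, p.Name.Last, p.Age)
}
```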

fastjson

The defining characteristic of this library is speed, just as its name suggests. Its introduction page says so:

Fast. As usual, up to 15x faster than the standard encoding/json.

Its usage is also very simple, as follows:

package main

import (
    "fmt"
    "log"

    "github.com/valyala/fastjson"
)

func main() {
    var p fastjson.Parser
    v, err := p.Parse(`{
                "str": "bar",
                "int": 123,
                "float": 1.23,
                "bool": true,
                "arr": [1, "foo", {}]
        }`)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("foo=%s\n", v.GetStringBytes("str"))
    fmt.Printf("int=%d\n", v.GetInt("int"))
    fmt.Printf("float=%f\n", v.GetFloat64("float"))
    fmt.Printf("bool=%v\n", v.GetBool("bool"))
    fmt.Printf("arr.1=%s\n", v.GetStringBytes("arr", "1"))
}

// Output:
// foo=bar
// int=123
// float=1.230000
// bool=true
// arr.1=foo

To use fastjson, first pass the JSON string to a Parser, then retrieve values from the object returned by the Parse method. For a nested value, pass the chain of parent and child keys directly to the Get method, as v.GetStringBytes("arr", "1") does above.

Analysis

The design of fastjson differs from the standard library Unmarshal in that it divides JSON parsing into two parts: Parse and Get.

Parse is responsible for parsing the JSON string into a structure and returning it. Data is then retrieved from the returned structure. Parse does not use any locks, so a single Parser cannot be shared between goroutines; if you want to call Parse concurrently, you need to use ParserPool.
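The idea behind a parser pool can be sketched with the standard library's sync.Pool. This is not fastjson's ParserPool; scratchParser is a hypothetical stand-in for any parser that reuses internal buffers and therefore must not be shared between goroutines.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// scratchParser is a hypothetical stand-in for a parser that keeps reusable
// internal buffers, which makes it unsafe to share between goroutines.
type scratchParser struct {
	fields []string
}

// countFields reuses p.fields as scratch space, like a parser reusing its cache.
func (p *scratchParser) countFields(s string) int {
	p.fields = append(p.fields[:0], strings.Split(s, ",")...)
	return len(p.fields)
}

var pool = sync.Pool{New: func() any { return new(scratchParser) }}

// countConcurrently lets every goroutine borrow its own parser from the pool.
func countConcurrently(inputs []string) []int {
	out := make([]int, len(inputs))
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			p := pool.Get().(*scratchParser)
			defer pool.Put(p) // hand the parser back for reuse
			out[i] = p.countFields(in)
		}(i, in)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(countConcurrently([]string{"a,b", "c", "d,e,f"}))
}
```

Each goroutine works on a parser that nobody else holds at that moment, so no locking is needed inside the parser itself.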

fastjson processes JSON by traversing it from top to bottom, storing the parsed data in a Value structure:

type Value struct {
    o Object
    a []*Value
    s string
    t Type
}

This structure is very simple:

  • o Object: Indicates that the parsed structure is an object.
  • a []*Value: Indicates that the parsed structure is an array.
  • s string: If the parsed structure is neither an object nor an array, other types of values are stored in this field as a string.
  • t Type: Represents the type of this structure, which can be TypeObject, TypeArray, TypeString, TypeNumber, etc.
type Object struct {
    kvs           []kv
    keysUnescaped bool
}

type kv struct {
    k string
    v *Value
}

This structure stores the recursive structure of objects. After parsing the JSON string in the example above, the resulting structure looks like this:
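To make the layout concrete, here is a rough sketch that builds part of that tree by hand and looks values up by key. The node and pair types and the get method are simplified stand-ins I made up for illustration; they are not fastjson's real definitions, and get only descends through objects.

```go
package main

import "fmt"

// node and pair are simplified stand-ins for fastjson's Value and kv types.
type node struct {
	obj  []pair  // set when the node is an object
	arr  []*node // set when the node is an array
	text string  // raw text for strings, numbers, and literals
}

type pair struct {
	key string
	val *node
}

// get walks the tree along keys, descending through objects only.
func (n *node) get(keys ...string) *node {
	cur := n
	for _, k := range keys {
		var next *node
		for _, p := range cur.obj {
			if p.key == k {
				next = p.val
				break
			}
		}
		if next == nil {
			return nil // key not found at this level
		}
		cur = next
	}
	return cur
}

func main() {
	// Hand-built tree for {"str":"bar","arr":[1,"foo",{}]}.
	root := &node{obj: []pair{
		{"str", &node{text: "bar"}},
		{"arr", &node{arr: []*node{{text: "1"}, {text: "foo"}, {}}}},
	}}
	fmt.Println(root.get("str").text)
	fmt.Println(len(root.get("arr").arr))
	fmt.Println(root.get("missing") == nil)
}
```

Because the tree stays in memory after parsing, any number of lookups can run against it without scanning the JSON text again.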

In terms of implementation, the absence of reflection code makes the entire parsing process very clean. Let’s look directly at the main part of the parsing:

func parseValue(s string, c *cache, depth int) (*Value, string, error) {
    if len(s) == 0 {
        return nil, s, fmt.Errorf("cannot parse empty string")
    }
    depth++
    // The maximum depth of the json string cannot exceed MaxDepth
    if depth > MaxDepth {
        return nil, s, fmt.Errorf("too big depth for the nested JSON; it exceeds %d", MaxDepth)
    }
    // parse object
    if s[0] == '{' {
        v, tail, err := parseObject(s[1:], c, depth)
        if err != nil {
            return nil, tail, fmt.Errorf("cannot parse object: %s", err)
        }
        return v, tail, nil
    }
    // parse array
    if s[0] == '[' {
        // ...
    }
    // parse string
    if s[0] == '"' {
        // ...
    }
    // ...
    return v, tail, nil
}

parseValue determines the type to parse based on the first non-whitespace character of the string. Here, parsing an object is used as the example:

func parseObject(s string, c *cache, depth int) (*Value, string, error) {
    // ...
    o := c.getValue()
    o.t = TypeObject
    o.o.reset()
    for {
        var err error
        // Get a kv entry from the Object structure
        kv := o.o.getKV()
        // ...
        // Parse the key
        kv.k, s, err = parseRawKey(s[1:])
        // ...
        // Recursively parse the value
        kv.v, s, err = parseValue(s, c, depth)
        // ...
        // On ',' continue parsing the next pair
        if s[0] == ',' {
            s = s[1:]
            continue
        }
        // Parsing is complete
        if s[0] == '}' {
            return o, s[1:], nil
        }
        return nil, s, fmt.Errorf("missing ',' after object value")
    }
}

The parseObject function is also very simple. It reads a key inside the loop, then calls parseValue recursively to parse the value, working through the JSON object from top to bottom, one pair at a time, until it finally encounters }.
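The same loop-plus-recursion shape can be reproduced in a small runnable sketch. This is my own simplified parser, not fastjson code: it assumes objects, unescaped strings, and bare tokens only (no arrays, no escape handling), and the value type and maxDepth constant are illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxDepth is an illustrative nesting limit, playing the role of MaxDepth.
const maxDepth = 300

// value is a simplified stand-in for a parsed node, not fastjson's Value.
type value struct {
	fields map[string]*value // non-nil for objects
	raw    string            // text of a string, number, or literal
}

func skipWS(s string) string { return strings.TrimLeft(s, " \t\r\n") }

// parseString reads up to the closing quote; escapes are not handled.
func parseString(s string) (string, string, error) {
	end := strings.IndexByte(s, '"')
	if end < 0 {
		return "", s, errors.New("missing closing quote")
	}
	return s[:end], s[end+1:], nil
}

// parseValue dispatches on the first character and returns the unparsed tail.
func parseValue(s string, depth int) (*value, string, error) {
	s = skipWS(s)
	if s == "" {
		return nil, s, errors.New("cannot parse empty string")
	}
	depth++
	if depth > maxDepth {
		return nil, s, errors.New("nesting is too deep")
	}
	switch s[0] {
	case '{':
		return parseObject(s[1:], depth)
	case '"':
		str, tail, err := parseString(s[1:])
		if err != nil {
			return nil, tail, err
		}
		return &value{raw: str}, tail, nil
	}
	// Anything else is kept as a raw token (number, true, false, null).
	end := strings.IndexAny(s, ",} \t\r\n")
	if end < 0 {
		end = len(s)
	}
	if end == 0 {
		return nil, s, errors.New("unexpected character")
	}
	return &value{raw: s[:end]}, s[end:], nil
}

// parseObject loops over key/value pairs and recurses into parseValue.
func parseObject(s string, depth int) (*value, string, error) {
	o := &value{fields: map[string]*value{}}
	if s = skipWS(s); s != "" && s[0] == '}' {
		return o, s[1:], nil // empty object
	}
	for {
		if s = skipWS(s); s == "" || s[0] != '"' {
			return nil, s, errors.New("missing object key")
		}
		key, tail, err := parseString(s[1:])
		if err != nil {
			return nil, tail, err
		}
		if s = skipWS(tail); s == "" || s[0] != ':' {
			return nil, s, errors.New("missing ':' after object key")
		}
		v, tail, err := parseValue(s[1:], depth)
		if err != nil {
			return nil, tail, err
		}
		o.fields[key] = v
		s = skipWS(tail)
		if s != "" && s[0] == ',' {
			s = s[1:]
			continue
		}
		if s != "" && s[0] == '}' {
			return o, s[1:], nil
		}
		return nil, s, errors.New("missing ',' after object value")
	}
}

func main() {
	v, tail, err := parseValue(`{"name": {"first": "li"}, "age": 18}`, 0)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(v.fields["name"].fields["first"].raw, v.fields["age"].raw, tail == "")
}
```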

Sub Summary

Through the above analysis, it can be seen that fastjson is much simpler in implementation and has higher performance than the standard library. Once Parse has built the JSON tree, the tree can be queried multiple times, which avoids repeated parsing and improves performance.

However, its functionality is very rudimentary and lacks common operations such as JSON to struct or JSON to map conversion. If you only want to retrieve values from JSON, this library is very convenient. But if you want to convert JSON values into a struct, you have to set each field manually.

GJSON

In my test, although the performance of GJSON is not as extreme as that of fastjson, its functionality is very complete and its performance is still quite good. Next, let me briefly introduce the functionality of GJSON.

The usage of GJSON is similar to fastjson and is also very simple. Just pass in the JSON string and the path of the value that needs to be obtained as parameters.

json := `{"name":{"first":"li","last":"dj"},"age":18}`
lastName := gjson.Get(json, "name.last")

In addition to this function, simple fuzzy matching can also be performed. It supports the wildcard characters * and ? in the key. * matches any number of characters, while ? matches a single character, as follows:

json := `{
    "name":{"first":"Tom", "last": "Anderson"},
    "age": 37,
    "children": ["Sara", "Alex", "Jack"]
}`
fmt.Println("third child*:", gjson.Get(json, "child*.2"))
fmt.Println("first c?ildren:", gjson.Get(json, "c?ildren.0"))

  • child*.2: First, child* matches children, then .2 reads the third element;
  • c?ildren.0: c?ildren matches children, then .0 reads the first element;
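To get a feel for how such wildcard keys select fields, here is a small sketch that uses path.Match from the standard library as a stand-in matcher. GJSON uses its own matcher, so this only approximates the behavior of * and ?; path.Match additionally understands character classes and treats / specially. matchKeys is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path"
)

// matchKeys returns the keys that match a pattern containing * or ?.
// path.Match is a standard-library stand-in; GJSON uses its own matcher.
func matchKeys(pattern string, keys []string) []string {
	var out []string
	for _, k := range keys {
		if ok, err := path.Match(pattern, k); err == nil && ok {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{"name", "age", "children"}
	fmt.Println(matchKeys("child*", keys))
	fmt.Println(matchKeys("c?ildren", keys))
	fmt.Println(matchKeys("*a*", keys))
}
```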
In addition to fuzzy matching, it also supports modifier operations.

json := `{
    "name":{"first":"Tom", "last": "Anderson"},
    "age": 37,
    "children": ["Sara", "Alex", "Jack"]
}`
fmt.Println("children reversed:", gjson.Get(json, "children|@reverse"))

children|@reverse first reads the array children, then reverses it with the @reverse modifier and outputs the result.

nestedJSON := `{"nested": ["one", "two", ["three", "four"]]}`
fmt.Println(gjson.Get(nestedJSON, "nested|@flatten"))

@flatten lifts the elements of the inner array of nested into the outer array and returns:

["one","two","three","four"]
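What these two modifiers do to an array can be imitated with the standard library. The sketch below is not how GJSON implements them (GJSON works on the raw JSON text); reverse, flatten, and apply are hypothetical helpers that decode the array, transform it, and encode it again, and flatten only lifts one level of nesting.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reverse returns a reversed copy of items.
func reverse(items []any) []any {
	out := make([]any, len(items))
	for i, v := range items {
		out[len(items)-1-i] = v
	}
	return out
}

// flatten lifts the elements of directly nested arrays into the outer array.
func flatten(items []any) []any {
	out := make([]any, 0, len(items))
	for _, v := range items {
		if inner, ok := v.([]any); ok {
			out = append(out, inner...)
		} else {
			out = append(out, v)
		}
	}
	return out
}

// apply decodes a JSON array, transforms it with f, and re-encodes it.
func apply(src string, f func([]any) []any) (string, error) {
	var items []any
	if err := json.Unmarshal([]byte(src), &items); err != nil {
		return "", err
	}
	b, err := json.Marshal(f(items))
	return string(b), err
}

func main() {
	r, _ := apply(`["Sara", "Alex", "Jack"]`, reverse)
	f, _ := apply(`["one", "two", ["three", "four"]]`, flatten)
	fmt.Println(r)
	fmt.Println(f)
}
```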
    

There are some other interesting features; you can check the official documentation for them.

Analysis

The Get method of GJSON takes two parameters: a JSON string, and a Path that describes where the JSON value to be obtained is located.

Because GJSON has to support many matching scenarios, parsing is divided into two parts: the Path is parsed first, and then the JSON string is traversed.

If a matching value is encountered during traversal, it is returned directly and there is no need to traverse any further. If the Path can match multiple values, the whole JSON string is traversed. If the Path cannot be matched in the JSON string, the complete JSON string is traversed as well.
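The early-return behavior can be illustrated with the standard library's streaming decoder. This is only a sketch of the idea, not GJSON's implementation: findTopLevel is a hypothetical helper that handles top-level keys only, and the sample document is deliberately broken after the matched key to show that the rest is never read.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// findTopLevel is a hypothetical helper, not GJSON code. It streams tokens
// from a JSON object and returns the raw value of key as soon as it is seen.
func findTopLevel(data, key string) (json.RawMessage, bool, error) {
	dec := json.NewDecoder(strings.NewReader(data))
	if tok, err := dec.Token(); err != nil || tok != json.Delim('{') {
		return nil, false, errors.New("expected a JSON object")
	}
	for dec.More() {
		name, err := dec.Token() // the next object key
		if err != nil {
			return nil, false, err
		}
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return nil, false, err
		}
		if name == key {
			return raw, true, nil // matched: stop without reading the rest
		}
	}
	return nil, false, nil // no match: the whole object was scanned
}

func main() {
	// The tail after "age" is deliberately broken to show it is never reached.
	doc := `{"name": "Tom", "age": 37, "children": oops`
	raw, ok, err := findTopLevel(doc, "age")
	fmt.Println(string(raw), ok, err)

	// A key that does not exist forces a full scan, which hits the broken tail.
	_, ok, err = findTopLevel(doc, "missing")
	fmt.Println(ok, err != nil)
}
```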

During parsing, GJSON does not save the parsed content in a reusable structure the way fastjson does. So when you call GetMany to return multiple values, the JSON string is actually traversed multiple times, and efficiency is relatively low.

It’s important to be aware that GJSON does not validate the JSON it parses. Even if the input string is not valid JSON, it will still be parsed. Therefore, users need to make sure that the input is indeed valid JSON to avoid potential issues.

func Get(json, path string) Result {
    // parse path
    if len(path) > 1 {
        // ...
    }
    var i int
    var c = &parseContext{json: json}
    if len(path) >= 2 && path[0] == '.' && path[1] == '.' {
        c.lines = true
        parseArray(c, 0, path[2:])
    } else {
        // Parse according to different objects, and loop here until '{' or '[' is found
        for ; i < len(c.json); i++ {
            if c.json[i] == '{' {
                parseObject(c, i, path)
                break
            }
            if c.json[i] == '[' {
                parseArray(c, i, path)
                break
            }
        }
    }
    if c.piped {
        res := c.value.Get(c.pipe)
        res.Index = 0
        return res
    }
    fillIndex(json, c)
    return c.value
}

In the Get method, you can see a long block of code used to parse the various kinds of Path. Then, a for loop traverses the JSON until it finds '{' or '[' and performs the corresponding processing.

func parseObject(c *parseContext, i int, path string) (int, bool) {
    var pmatch, kesc, vesc, ok, hit bool
    var key, val string
    rp := parseObjectPath(path)
    if !rp.more && rp.piped {
        c.pipe = rp.pipe
        c.piped = true
    }
    // Nest two for loops to find the key value
    for i < len(c.json) {
        for ; i < len(c.json); i++ {
            if c.json[i] == '"' {
                var s = i
                for ; i < len(c.json); i++ {
                    if c.json[i] > '\\' {
                        continue
                    }
                    // Find the key value and jump to parse_key_string_done
                    if c.json[i] == '"' {
                        i, key, kesc, ok = i+1, c.json[s:i], false, true
                        goto parse_key_string_done
                    }
                    // ...
                }
                key, kesc, ok = c.json[s:], false, false
            // break
            parse_key_string_done:
                break
            }
            if c.json[i] == '}' {
                return i + 1, false
            }
        }
        if !ok {
            return i, false
        }
        // Check whether it is a fuzzy match
        if rp.wild {
            if kesc {
                pmatch = match.Match(unescape(key), rp.part)
            } else {
                pmatch = match.Match(key, rp.part)
            }
        } else {
            if kesc {
                pmatch = rp.part == unescape(key)
            } else {
                pmatch = rp.part == key
            }
        }
        // parse value
        hit = pmatch && !rp.more
        for ; i < len(c.json); i++ {
            switch c.json[i] {
            default:
                continue
            case '"':
                i, val, vesc, ok = parseString(c.json, i)
                if !ok {
                    return i, false
                }
                if hit {
                    if vesc {
                        c.value.Str = unescape(val[1 : len(val)-1])
                    } else {
                        c.value.Str = val[1 : len(val)-1]
                    }
                    c.value.Raw = val
                    c.value.Type = String
                    return i, true
                }
            case '{':
                if pmatch && !hit {
                    i, hit = parseObject(c, i+1, rp.path)
                    if hit {
                        return i, true
                    }
                } else {
                    i, val = parseSquash(c.json, i)
                    if hit {
                        c.value.Raw = val
                        c.value.Type = JSON
                        return i, true
                    }
                }
            // ...
            }
            break
        }
    }
    return i, false
}

I did not walk through the parseObject code to teach JSON parsing or string traversal, but to illustrate a bad case. The nested for loops and consecutive if statements can be overwhelming, and may remind you of a colleague’s code you have encountered at work.

Sub Summary

Advantages:

  • Performance: GJSON performs relatively well compared to the standard library.
  • Flexibility: It offers various retrieval methods and customizable return values, making it very convenient.

Disadvantages:

  • No JSON Validation: It does not check for the correctness of the JSON input.
  • Code Smell: The code structure is cumbersome and hard to read, which can make maintenance challenging.

When parsing JSON to retrieve values, the GetMany function traverses the JSON string multiple times, once for each specified key. Converting the JSON to a map can reduce the number of traversals.

Conclusion

While GJSON has notable performance and flexibility, its lack of JSON validation and complex, hard-to-read code structure are significant drawbacks. If you need to parse JSON and retrieve values frequently, consider the trade-offs between performance and code maintainability.

jsonparser

Analysis

jsonparser also processes an input JSON byte slice and allows for quickly locating and returning values by passing multiple keys.

Similar to GJSON, jsonparser does not cache the parsed JSON string in a data structure as fastjson does. However, when multiple values need to be parsed, the EachKey function can be used to parse multiple values in a single pass through the JSON string.

If a matching value is found, jsonparser returns immediately without further traversal. For multiple matches, it traverses the entire JSON string. If a path does not match any value in the JSON string, it still traverses the entire string.

jsonparser reduces the use of recursion by employing loops during JSON traversal, thus decreasing the call stack depth and enhancing performance.
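The loop-instead-of-recursion idea can be sketched as follows: to skip over a nested object or array, keep a depth counter rather than calling a function for every nested level. This is a simplified illustration written for this article, not jsonparser's source; the blockEnd name and signature are assumptions.

```go
package main

import "fmt"

// blockEnd returns the index just past the object or array that starts at
// data[0], or -1 if it is never closed. Nesting is tracked with a counter
// instead of recursion. Name and signature are illustrative assumptions.
func blockEnd(data []byte) int {
	depth := 0
	inString := false
	for i := 0; i < len(data); i++ {
		c := data[i]
		if inString {
			switch c {
			case '\\':
				i++ // skip the escaped character
			case '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			inString = true
		case '{', '[':
			depth++
		case '}', ']':
			depth--
			if depth == 0 {
				return i + 1
			}
		}
	}
	return -1
}

func main() {
	data := []byte(`{"a":{"b":[1,2,{"c":"}"}]}},"rest"`)
	if end := blockEnd(data); end >= 0 {
		// The result is a subslice of the input; nothing is copied.
		fmt.Println(string(data[:end]))
	}
}
```

Returning an index into the original byte slice also fits the no-allocation style: the caller slices the input instead of receiving a copy.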

In terms of functionality, the ArrayEach, ObjectEach, and EachKey functions allow for passing a custom function to meet specific needs, greatly enhancing the utility of jsonparser.

The code for jsonparser is straightforward and clear, making it easy to analyze. Those interested can examine it themselves.

Sub Summary

The high performance of jsonparser compared to the standard library can be attributed to:

  • Using for loops to minimize recursion.
  • Avoiding the use of reflection, unlike the standard library.
  • Exiting immediately upon finding the corresponding key value without further recursion.
  • Operating on the passed-in JSON string without allocating new space, thus reducing memory allocations.

Additionally, the API design is highly practical. Functions like ArrayEach, ObjectEach, and EachKey allow for passing custom functions, solving many issues in actual business development.

However, jsonparser has a significant drawback: it does not validate JSON. If the input is not valid JSON, jsonparser will not detect it.
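One possible way to guard against this, at the cost of an extra pass over the input, is to check the payload with the standard library's json.Valid before handing it to a non-validating parser. The guarded helper and its extract parameter below are hypothetical; extract stands in for whatever lookup you would perform.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// guarded runs extract only if data is well-formed JSON.
// extract is a hypothetical stand-in for any non-validating lookup.
func guarded(data []byte, extract func([]byte) string) (string, bool) {
	if !json.Valid(data) {
		return "", false
	}
	return extract(data), true
}

func main() {
	// size is a placeholder extractor that just reports the payload length.
	size := func(b []byte) string { return fmt.Sprintf("%d bytes", len(b)) }

	out, ok := guarded([]byte(`{"ok": true}`), size)
	fmt.Println(out, ok)

	out, ok = guarded([]byte(`{"ok": tru`), size)
	fmt.Println(out, ok)
}
```

Whether the extra pass is acceptable depends on the workload; for trusted upstream services it may be reasonable to skip it.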

Performance Comparison

Parsing Small JSON Strings

Parsing a simple JSON string of approximately 190 bytes

Library | Operation | Time per Iteration | Memory Usage | Memory Allocations | Performance

Summary

During this comparison, I analyzed several high-performance JSON parsing libraries. It was evident that these libraries share several common characteristics:

  • They avoid using reflection.
  • They parse JSON by traversing the bytes of the JSON string sequentially.
  • They minimize memory allocation by directly parsing the input JSON string.
  • They sacrifice some compatibility for performance.

Despite these trade-offs, each library offers unique features. The fastjson API is the simplest to use; GJSON offers fuzzy searching capabilities and high customizability; jsonparser supports inserting callback functions during high-performance parsing, providing a degree of convenience.

For my use case, which involves simply parsing certain fields from HTTP response JSON strings with predetermined fields and occasional custom operations, jsonparser is the most suitable tool.

Therefore, if performance concerns you, consider selecting a JSON parser based on your business requirements.

Reference

https://github.com/buger/jsonparser
https://github.com/tidwall/gjson
https://github.com/valyala/fastjson
https://github.com/json-iterator/go
https://github.com/mailru/easyjson
https://github.com/Jeffail/gabs
https://github.com/bitly/go-simplejson