MAR 2 11

Ultra fast JSON encoder and decoder for Python

We do a lot of JSON encoding and decoding here at ESN. Python 2.6 ships with an accurate but rather slow implementation, which we’ve replaced with simplejson. There’s a lot going on with JavaScript and JSON today, and I thought this might be a place where my good old C optimization skills could be put to good use. To be honest, I also wanted to prove that I still had the skills.

UltraJSON

Not being able to stay out of this mess, I spent a weekend researching the quickest and (perhaps) also the dirtiest way of encoding and decoding JSON. I call the result UltraJSON, and by my preliminary and perhaps somewhat limited benchmarks it’s the fastest JSON encoder and decoder I’ve found so far (and if it’s not, I’m gonna make it faster!).

Python bindings

Neither the decoder nor the encoder part of UltraJSON is specific to any language. It can be integrated with almost anything, and since I wanted my colleagues to use it I implemented Python bindings for it as the module ‘ujson’.

UPDATE: UltraJSON is now available on PyPI as the package ujson. Installation should be simple through easy_install or pip!
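
Here is a minimal sketch of what using the bindings might look like. It assumes ujson mirrors the dumps/loads interface of the standard json module (ujson.loads shows up in the comments below; dumps is an assumption here), and it falls back to the standard library when ujson isn’t installed. The sample record is made up for illustration.

```python
# Usage sketch, not taken from the ujson documentation.
# Install first with: pip install ujson
try:
    import ujson as json_impl  # assumed to expose dumps() and loads()
except ImportError:
    import json as json_impl   # fallback so the sketch still runs

# Hypothetical sample data
record = {"name": "ESN", "tags": ["json", "python"], "count": 3}

encoded = json_impl.dumps(record)   # Python object -> JSON text
decoded = json_impl.loads(encoded)  # JSON text -> Python object

print(encoded)
print(decoded == record)
```

Because both modules are reached through the same alias, swapping one for the other should only require changing the import, as long as you stick to plain dumps and loads calls.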

Current benchmarks:

64-bit benchmarks Linux
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56)
OS Version: Ubuntu 10.10
System Type: x64-based PC
Processor: Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz
Total Physical Memory: 4096 MB

Array with 256 utf-8 strings:
ujson encode      : 2874.54652 calls/sec
simplejson encode : 1539.47999 calls/sec
cjson encode      : 132.33571 calls/sec

ujson decode      : 2072.09417 calls/sec
cjson decode      : 991.20903 calls/sec
simplejson decode : 310.75309 calls/sec

Medium complex object:
ujson encode      : 19001.01929 calls/sec
simplejson encode : 3512.29205 calls/sec
cjson encode      : 3063.69959 calls/sec

ujson decode      : 12791.80993 calls/sec
cjson decode      : 8288.32916 calls/sec
simplejson decode : 6640.22169 calls/sec

Array with 256 strings:
ujson encode      : 40161.78453 calls/sec
simplejson encode : 19301.40779 calls/sec
cjson encode      : 12337.13166 calls/sec

ujson decode      : 36944.81317 calls/sec
cjson decode      : 30187.40167 calls/sec
simplejson decode : 25105.56562 calls/sec

Array with 256 doubles:
ujson encode      : 6054.71950 calls/sec
simplejson encode : 2912.44353 calls/sec
cjson encode      : 3539.51228 calls/sec

ujson decode      : 27794.29735 calls/sec
cjson decode      : 14892.38775 calls/sec
simplejson decode : 14879.00070 calls/sec

Array with 256 True values:
ujson encode      : 168086.95325 calls/sec
simplejson encode : 49348.93309 calls/sec
cjson encode      : 67392.90623 calls/sec

ujson decode      : 139359.25968 calls/sec
cjson decode      : 82552.26652 calls/sec
simplejson decode : 114998.51396 calls/sec

Array with 256 dict{string, int} pairs:
ujson encode      : 24125.68837 calls/sec
simplejson encode : 5751.74871 calls/sec
cjson encode      : 4735.65147 calls/sec

ujson decode      : 17176.70493 calls/sec
cjson decode      : 13420.93963 calls/sec
simplejson decode : 9854.27352 calls/sec

Dict with 256 arrays with 256 dict{string, int} pairs:
ujson encode      : 86.52449 calls/sec
simplejson encode : 17.46117 calls/sec
cjson encode      : 18.31323 calls/sec

ujson decode      : 49.54660 calls/sec
cjson decode      : 38.34094 calls/sec
simplejson decode : 28.18035 calls/sec
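
If you want to get calls/sec figures of this kind on your own machine, a harness along these lines is one way to do it. This is a rough sketch and not the benchmark script used for the numbers above: the helper name calls_per_sec, the iteration count and the test data are all assumptions, and ujson is only measured if it happens to be installed.

```python
import json
import timeit


def calls_per_sec(func, arg, number=200):
    """Return how many times per second func(arg) completes (hypothetical helper)."""
    elapsed = timeit.timeit(lambda: func(arg), number=number)
    return number / elapsed


# Test data loosely modelled on the "Array with 256 doubles" case
doubles = [i * 1.5 + 0.25 for i in range(256)]
encoded = json.dumps(doubles)

# Always measure the standard library; add ujson only when it is available
candidates = {"json": json}
try:
    import ujson
    candidates["ujson"] = ujson
except ImportError:
    pass

results = {}
for name, module in candidates.items():
    results[name + " encode"] = calls_per_sec(module.dumps, doubles)
    results[name + " decode"] = calls_per_sec(module.loads, encoded)

for label, rate in sorted(results.items()):
    print("%-18s: %.5f calls/sec" % (label, rate))
```

The absolute numbers depend heavily on hardware, Python version and input data, so only the ratio between libraries on the same machine is worth comparing.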

More on GitHub!

I’d love to see more people using and contributing to this project, so please check out my GitHub repository!

Bindings to more languages would be awesome!

by Jonas Tärnström
  • YeaSayer

    Out of curiosity, did you check out the other JSON libraries listed at json.org? The two most popular C libraries are yajl and jansson. The former has no less than 3 python extensions: yajl, yajl-py, and ijson. I’d be interested to see those added to your benchmarks.

  • 勇浩 赖

    This article has been translated into Chinese. Read it here: http://blog.csdn.net/lanphaday/archive/2011/06/25/6567408.aspx

  • http://blog.jamiesun.me/archives/76 UltraJSON: super JSON encoding and decoding tool | 涛声

    [...] There is also a detailed benchmark here: http://pushingtheweb.com/2011/03/ultra-fast-json-encoding-decoding-python/ [...]

  • http://blog.seoprofiler.com Andre Voget

    Thank you very much for your work to develop a faster json library for the Python community!

  • Juztin

    Just found your project from a link on SO.
    Thanks for the work!
    My req/sec have increased by about 400.

  • Christopher Brown

    Splendid, love it! @TheYeasayer: yajl-py has either awful documentation or a bizarre interface. How do I even decode a simple string of JSON into a Python dict?

  • Huang ChuanTong

    Bug:
    Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on win32
    >>> import json, ujson
    >>> ujson.__version__
    '1.18'
    >>> ujson.loads('[18446098363113800555]')
    [-645710595751061L]
    >>> json.loads('[18446098363113800555]')
    [18446098363113800555L]

  • http://www.facebook.com/monotonemonk Rory McGuire

    Still appears to be the fastest at the little benchmark you provided.

  • Nathaniel

    version 1.20
    This problem still exists.

  • Allen Rice

    I’ve seen and admired the structure of ESN’s JSON via Battlelog. You and your teams have some great organization and readability in your designs! I particularly like how the status and return data have an extra layer of abstraction, so (afaik) you can handle application- or request-specific exceptions programmatically instead of generically via an error callback.

    I’m a C# developer and am admiring all the new Python work you’re doing, but this all seems so foreign to me. I’d be curious to see how the different serializers / deserializers perform on various platforms (.NET, Python, etc).

  • http://twitter.com/drtune drtune

    Sweet jesus that thing is fast. I wrote a quick tool to read a bunch of JSON files and munge them, basically doing nothing but a stack of “json.loads(file.read())”. I tried the ‘pypy’ JIT compiler as well.

    Time results:
    CPython + regular json lib: 30 secs
    PyPy + regular json lib: 7 secs (very good! ideal workload for JITting)
    but..
    CPython + ujson: 1.3 secs (!!!)
    PyPy + ujson : didn’t even bother trying it. :-)
    Very nice thx!

  • Jason

    Updated:

    Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05)
    [GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import json, ujson
    >>> ujson.__version__
    '1.30'
    >>> ujson.loads('[18446098363113800555]')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: Value is too big
    >>> json.loads('[18446098363113800555]')
    [18446098363113800555L]
