The Lightning Fast JSON Parser Library For Java
Written by Nikos Vaggalis   
Thursday, 24 August 2023

simdjson-java is the Java version of simdjson, the JSON parser that uses SIMD instructions. How fast can it go?

We had a look at simdjson back in 2021, when it reached version 1.0. In plain terms, simdjson is a C++ library that can parse JSON documents very fast:

Does parsing 3 gigabytes of JSON per second sound fast enough?

This library achieves it. In last year's benchmark against the fastest standard-compliant C++ JSON parsers, RapidJSON and sajson, simdjson outperformed them by far. It can parse 4x faster than RapidJSON and 25x faster than JSON for Modern C++.

This efficiency comes mainly from the library's use of SIMD (single instruction, multiple data) instructions under the hood. These excel at data-level parallelism: one instruction applies the same operation to many data elements at once, even on a single core.
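To make the idea of data-level parallelism concrete, here is a minimal sketch in plain Java. It is not how simdjson or simdjson-java is implemented and it does not use real SIMD instructions. Instead it packs 8 bytes into one 64-bit long (a trick often called SWAR, "SIMD within a register") and tests all 8 lanes for a quote character with a handful of arithmetic operations. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Illustrative only: counts a byte value 8 lanes at a time using a 64-bit long.
final class SwarQuoteCounter {
    // 0x7F in every byte lane
    private static final long LOW7 = 0x7F7F7F7F7F7F7F7FL;

    static int count(byte[] data, byte target) {
        // Broadcast the target byte into all 8 lanes of a long
        long pattern = (target & 0xFFL) * 0x0101010101010101L;
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int count = 0;
        int i = 0;
        for (; i + 8 <= data.length; i += 8) {
            long x = buf.getLong(i) ^ pattern; // lanes equal to target become 0x00
            long t = (x & LOW7) + LOW7;        // top bit of a lane is set if its low 7 bits are nonzero
            long zeroLanes = ~(t | x | LOW7);  // 0x80 in exactly the lanes that were 0x00
            count += Long.bitCount(zeroLanes); // one set bit per matching lane
        }
        // Handle the remaining (fewer than 8) bytes one at a time
        for (; i < data.length; i++) {
            if (data[i] == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] json = "{\"user\":{\"screen_name\":\"simd\"}}".getBytes(StandardCharsets.UTF_8);
        System.out.println(count(json, (byte) '"'));
    }
}
```

Real SIMD registers are wider (128, 256 or 512 bits) and have dedicated compare instructions, so the same principle covers 16 to 64 bytes per instruction rather than 8.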

You might think that, since it is a C++ library, only developers writing C++ benefit. This is not true, as there were already bindings for other languages such as Go, Ruby, Python and more. There's even a port for PostgreSQL, pg_simdjson. Well, now there's one for Java too.

With simdjson-java you can now leverage the power of the parser from Java, as easily as:

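import java.util.Iterator;
import org.simdjson.JsonValue;
import org.simdjson.SimdJsonParser;

// loadTwitterJson() is a helper that returns the raw JSON bytes; it is not shown here
byte[] json = loadTwitterJson();

SimdJsonParser parser = new SimdJsonParser();
JsonValue jsonValue = parser.parse(json, json.length);
Iterator<JsonValue> tweets = jsonValue.get("statuses").arrayIterator();

// Print the screen name of every user who still has the default profile
while (tweets.hasNext()) {
    JsonValue tweet = tweets.next();
    JsonValue user = tweet.get("user");
    if (user.get("default_profile").asBoolean()) {
        System.out.println(user.get("screen_name").asString());
    }
}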
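The snippet calls loadTwitterJson() without defining it. Since the parser takes a byte array and a length, any method that returns the file's raw bytes would do. The following is one hypothetical way to write such a helper with the standard library only; the class name, method name and the idea of passing a Path are assumptions, not part of simdjson-java.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads a whole JSON file into memory as raw bytes.
final class JsonFileLoader {
    static byte[] loadJson(Path path) {
        try {
            // The bytes are returned undecoded, so no String is ever built
            return Files.readAllBytes(path);
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause
            throw new UncheckedIOException(e);
        }
    }
}
```

Handing the parser bytes rather than a String avoids a decoding step before parsing starts.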

While the library is still not feature complete, it outperformed the other Java JSON libraries by far when benchmarked on a machine with the following specs:

  • CPU: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
  • OS: Ubuntu 23.04, kernel 6.2.0-23-generic
  • Java: OpenJDK 64-Bit Server VM Temurin-20.0.1+9

The benchmark showed that simdjson-java achieved 1450.951 ops/sec, while the rest (jackson, fastjson2, jsoniter) performed in the range of 500 ops/sec, a roughly threefold difference.

With that said, what's missing from the library at this early stage?

  • Support for Unicode characters
  • UTF-8 validation
  • Full support for parsing floats
  • Support for 512-bit vectors

These features are, however, on the project's roadmap, and once they are completed the library will be in an even better position.
That said, you can start using it in your own code now to enjoy the performance benefits, provided you take the missing functionality into consideration.
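One way to take the missing UTF-8 validation into consideration is to check untrusted input yourself before handing it to the parser. The sketch below uses only the JDK's CharsetDecoder; the class and method names are made up for illustration and nothing here is part of simdjson-java.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical pre-check: returns true only if the bytes are well-formed UTF-8.
final class Utf8Precheck {
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

This adds a full extra pass over the input and allocates a character buffer, so it eats into the speed advantage. It is probably only worth doing for data from sources you do not control.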

 

More Information

simdjson Java

simdjson

Related Articles

A Lightning Fast JSON Parser Library 

 
