When data is transmitted, it must first be serialized. Two criteria matter when selecting a suitable serialization format: size and performance. This article compares the efficiency of different serialization approaches (Java Serialization, ByteBuffer, Unsafe, Json and Protobuf).
Setup
To compare the efficiency of these formats, a simple sample data structure is used. For Protobuf, a message containing the same fields was generated.
The test simply serializes the object and then deserializes the resulting stream. The working code can be downloaded.
    package com.robert_franz.serialize;

    import java.io.Serializable;

    public class Testobject implements Serializable {
        private static final long serialVersionUID = 1L;

        String a;
        int b;
        long c;

        public String getA() { return a; }
        public void setA(String a) { this.a = a; }

        public int getB() { return b; }
        public void setB(int b) { this.b = b; }

        public long getC() { return c; }
        public void setC(long c) { this.c = c; }
    }
    package com.robert_franz.serialize;

    message Testobject {
        optional string a = 1;
        optional int32 b = 2;
        optional int64 c = 3;
    }
Java Serialization
This is the standard serialization mechanism provided by the JVM. In my opinion the performance is quite ok. In Java EE it is the standard serialization mechanism across a container's border, and I have not heard any complaints about its serialization performance there.
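A minimal sketch of such a round trip with the standard `ObjectOutputStream` and `ObjectInputStream` classes. The class `JavaSerializationSketch` and its nested `Sample` type are hypothetical stand-ins for the article's test object, and the test values are made up; this is not the benchmark code itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class JavaSerializationSketch {

    // Hypothetical stand-in for the article's Testobject (assumption).
    static final class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        final String a;
        final int b;
        final long c;

        Sample(String a, int b, long c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }
    }

    // Writes the object graph into an in-memory byte array.
    static byte[] serialize(Sample sample) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(sample);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the object back from the byte array.
    static Sample deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Sample) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Sample("abc", 42, 4711L));
        Sample copy = deserialize(data);
        System.out.println(copy.a + " " + copy.b + " " + copy.c + " (" + data.length + " bytes)");
    }
}
```

The stream carries class metadata in addition to the field values, which is one reason the serialized form is comparatively large.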
ByteBuffer
A standard API in Java that enables the programmer to write data into a byte array. Because the data is written directly, the programmer has to do everything by hand. It's binary serialization by hand.
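As an illustration of what "by hand" means, here is a small sketch using `java.nio.ByteBuffer`. The byte layout (length-prefixed UTF-8 string, then the int, then the long) is an assumption chosen for this example and not necessarily the layout used in the benchmark; `ByteBufferSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ByteBufferSketch {

    // Assumed layout: [int length of a][UTF-8 bytes of a][int b][long c]
    static byte[] serialize(String a, int b, long c) {
        byte[] text = a.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + text.length + Integer.BYTES + Long.BYTES);
        buffer.putInt(text.length);
        buffer.put(text);
        buffer.putInt(b);
        buffer.putLong(c);
        return buffer.array();
    }

    // Reads the fields back in exactly the order they were written.
    static Object[] deserialize(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        byte[] text = new byte[buffer.getInt()];
        buffer.get(text);
        String a = new String(text, StandardCharsets.UTF_8);
        int b = buffer.getInt();
        long c = buffer.getLong();
        return new Object[] { a, b, c };
    }

    public static void main(String[] args) {
        byte[] data = serialize("abc", 42, 4711L);
        Object[] fields = deserialize(data);
        System.out.println(fields[0] + " " + fields[1] + " " + fields[2] + " (" + data.length + " bytes)");
    }
}
```

Reader and writer must agree on field order and widths, since nothing in the byte array describes its own structure.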
Unsafe
It is quite tricky to handle and it uses an unofficial API of the JVM (sun.misc.Unsafe). But it is so widely used that it is almost an "official" API. I don't think Oracle will remove this API, as many pieces of software that provide state-of-the-art performance use it. It works like ByteBuffer, but writes directly into memory. It's extremely fast, but be aware: the JVM compiles the calls directly into machine code, and writing into the wrong memory location will definitely crash your JVM. A dream for every C or C++ programmer, a horror for people who don't like pointers. As it's not an official API, there is no official documentation.
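A hedged sketch of the idea, assuming `sun.misc.Unsafe` is still reachable via reflection on the JDK in use (newer JDKs print deprecation warnings for these methods). `UnsafeSketch` is a hypothetical name, and only the two numeric fields are written to keep the example short.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeSketch {

    // Obtains the singleton through reflection, since the class offers no public accessor for application code.
    static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Writes a long and an int into off-heap memory and reads them back.
    // Returns {b, c}. A wrong address here would crash the JVM, not throw.
    static long[] roundTrip(int b, long c) {
        Unsafe unsafe = loadUnsafe();
        long address = unsafe.allocateMemory(Long.BYTES + Integer.BYTES);
        try {
            unsafe.putLong(address, c);              // bytes 0..7
            unsafe.putInt(address + Long.BYTES, b);  // bytes 8..11
            int readB = unsafe.getInt(address + Long.BYTES);
            long readC = unsafe.getLong(address);
            return new long[] { readB, readC };
        } finally {
            unsafe.freeMemory(address);              // manual memory management
        }
    }

    public static void main(String[] args) {
        long[] values = roundTrip(42, 4711L);
        System.out.println(values[0] + " " + values[1]);
    }
}
```

There are no bounds checks and no garbage collection for this memory, which is where both the speed and the risk come from.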
Json
The JavaScript Object Notation support is provided by Jackson. Jackson's performance has improved a lot in the past. Json is less verbose than XML, and the output is still human readable. Furthermore, there is wide support for de/serializing Json, and many REST APIs support it.
Protobuf
This is a protocol provided by Google. They use it e.g. for the Android Play Store: have a look at the licence information of the store and you will find the protobuf licence, as it is open source under a BSD licence. Protocol Buffers is highly optimized concerning performance and transfer volume. It even compresses integers: since values are often very small (e.g. < 1000), protobuf saves space and doesn't require the full 4 bytes for them.
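The integer compression mentioned above is based on variable-length integers (varints): each byte carries 7 bits of payload, and the high bit signals whether another byte follows. The following is a simplified sketch for non-negative `int` values only; it is not the protobuf library's code, and `VarintSketch` is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;

final class VarintSketch {

    // Encodes a non-negative int using 7 payload bits per byte, least significant group first.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // more bytes follow
            value >>>= 7;
        }
        out.write(value);                     // last byte, high bit clear
        return out.toByteArray();
    }

    // Reassembles the value from its 7-bit groups.
    static int decode(byte[] data) {
        int result = 0;
        int shift = 0;
        for (byte b : data) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int value : new int[] { 1, 300, 999, 1_000_000 }) {
            System.out.println(value + " -> " + encode(value).length + " byte(s)");
        }
    }
}
```

With this scheme any value below 128 fits into one byte and any value below 16384 into two, instead of the fixed 4 bytes of a plain `int`.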
XML
I simply ignore it. There are reasons to use XML, but performance certainly isn't one of them, and neither is the size of the serialized object. In my experience, XML serialization is about 100 times slower than Java serialization! The killer feature is validation. If you don't need it, you don't need XML.
The Results
To get some results, the test setup serializes and deserializes the object 1,000,000 times.
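A rough sketch of how such a measurement could be structured: time a fixed number of iterations, repeat that for several runs, and summarize the run times. The number of runs, the placeholder workload and the name `BenchmarkSketch` are assumptions for illustration; the downloadable test code may be organized differently.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class BenchmarkSketch {

    // Runs the task the given number of times and returns the elapsed time in milliseconds.
    static long timeMillis(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    static double average(double[] values) {
        double sum = 0;
        for (double value : values) {
            sum += value;
        }
        return sum / values.length;
    }

    // Median of a copy of the input; for an even count the two middle values are averaged.
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int middle = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[middle] : (sorted[middle - 1] + sorted[middle]) / 2.0;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): stands in for one serialize/deserialize cycle.
        Runnable workload = () -> ByteBuffer.allocate(12).putInt(42).putLong(4711L);
        double[] runs = new double[5];
        for (int i = 0; i < runs.length; i++) {
            runs[i] = timeMillis(workload, 1_000_000);
        }
        System.out.println("average ms: " + average(runs) + ", median ms: " + median(runs));
    }
}
```

A simple loop like this is sensitive to JIT warm-up and garbage collection, so the first runs are typically slower than later ones.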
Handling
Standard Java serialization is definitely the easiest to use: every object marked with the Serializable interface can be transmitted. Jackson is also a good option. It only requires a bean (members with getters and setters). Protobuf requires running a source code generator, but after that it's quite simple to use, and there is no need to work at the byte/bit level. Concerning memory usage it's even better than a ByteBuffer, as it applies a lot of optimization. ByteBuffer performs well, but it's like writing a TCP/IP packet by hand: you define the binary format yourself. The trickiest is Unsafe. If you don't know what you are doing, don't use it. It is not well documented.
Performance
Standard Java serialization is the slowest solution. The second slowest is Jackson, but I'm quite impressed by its performance: compared with standard Java serialization it's about four times faster! Then come Protobuf and ByteBuffer. The fastest is Unsafe.
The difference between the fastest and the slowest is almost a factor of 90! Even Json is more than four times faster than standard Java serialization.
| | Java | Json | Protobuf | ByteBuffer | Unsafe |
|---|---|---|---|---|---|
| Average | 13754.57 | 3093.21 | 537.21 | 385.41 | 154.91 |
| Median | 13643.5 | 3086 | 533 | 383.5 | 154 |
| Deviation | 351.60 | 54.36 | 16.98 | 12.25 | 4.49 |
| Size | 109 | 26 | 9 | 22 | 22 |

Average/median time in ms, size in bytes.
Conclusion
Unsurprisingly (at least for me), the Unsafe solution won on performance. Java serialization loses by a wide margin, even to Json. I'm surprised by the small size of the Json output, but the test values were quite short, so longer values might lead to different results. I wouldn't use Unsafe unless there is a really, really good reason. But it shows what is possible if there is a need for it. I think Protobuf is a good choice, but only if you are fine with a code generator that provides the objects.
Files
Update – 2020-10-28
I added an update that includes a Java copy constructor. It measures cloning the object via a copy constructor by running the same tests with that copy constructor. Java 8 was used for these tests. As a different computer was used for these performance tests, you can find a full comparison of all serialization methods below:
| | Java | Json | Protobuf | ByteBuffer | Unsafe | Copy |
|---|---|---|---|---|---|---|
| Total | 6903 ms | 688 ms | 331 ms | 122 ms | 65 ms | 7 ms |
serialize+copy-constructor.tar.gz
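For readers unfamiliar with the term, a copy constructor simply takes another instance and copies its fields. A minimal sketch, assuming a reduced version of the test object; `CopyConstructorSketch` is a hypothetical wrapper name and the code in the archive may differ.

```java
final class CopyConstructorSketch {

    // Reduced stand-in for the article's Testobject (assumption).
    static final class Testobject {
        String a;
        int b;
        long c;

        Testobject(String a, int b, long c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        // Copy constructor: String is immutable, so sharing the reference is safe.
        Testobject(Testobject other) {
            this(other.a, other.b, other.c);
        }
    }

    public static void main(String[] args) {
        Testobject original = new Testobject("abc", 42, 4711L);
        Testobject copy = new Testobject(original);
        System.out.println(copy.a + " " + copy.b + " " + copy.c);
    }
}
```

No bytes are produced at all here, so this only clones an object within one JVM and cannot replace serialization for transfer over a network.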
Sources
- http://www.javacodegeeks.com/2012/07/native-cc-like-performance-for-java.html
- http://www.javacodegeeks.com/2010/07/java-best-practices-high-performance.html
- http://mishadoff.github.io/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/
- http://code.google.com/p/protobuf/
- https://github.com/FasterXML/jackson
Here is a complete comparison of Java serializers. https://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
Thanks for the detailed code for the comparison. I learned something new today, and I was also shocked by the results. I ran the Java and Json serialization with a dataset I have, and Java serialization takes 1.5 times as long as Json serialization.
My dataset contains 8 fields with real-world names like modificationTime, resourceId etc.
Considering that the chosen test object is a very simple one, it should be noted that a simple Java copy constructor would beat every other solution by a very big margin (10 times or so).
Do you think java copy constructor would be faster?
Correction to my above statement: It is from a cloning point of view and not for serialization or transfer over the network.
Honestly, I don't know, but I assume it might be faster. Extending the tests and then running them would of course be the best solution. Only then do you have proper and reliable results.