Storing arrays using JSON, serialize and var_export

Recently I was dealing with processing and storing large arrays in PHP (around 100 000 items) and I found out some quite surprising facts that are very useful in performance-critical applications.

Maybe you just want to see the results.

When I started looking for benchmarks, I found the article Cache a large array: JSON, serialize or var_export?. It is really good, but I wanted to compare a few more things (e.g. how the type of stored data influences save and load times), so I decided to extend that article a little.

What's this benchmark all about

In this test I'll compare the save time, load time and serialized size of 4 different types of arrays, each in five sizes. Just to make it clear:

Tested array types: string => string, string => int, int => string, int => int.

Array sizes: 10, 100, 1 000, 10 000 and 100 000 items.
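To make the setup concrete, here is a minimal sketch of how such test arrays could be generated. The helper name makeTestArray and the random contents are my assumptions for illustration, not the exact data used in the benchmark.

```php
<?php
// Hypothetical helper: builds one test array of the given type and size.
// $keyType and $valueType are either 'string' or 'int'. The real benchmark
// data is not known, so the generated contents are an assumption.
function makeTestArray(string $keyType, string $valueType, int $size): array
{
    $result = [];
    for ($i = 0; $i < $size; $i++) {
        // The non-numeric prefix stops PHP from casting string keys to int.
        $key = $keyType === 'int' ? $i : 'key_' . $i;
        $value = $valueType === 'int' ? mt_rand() : 'value_' . mt_rand();
        $result[$key] = $value;
    }
    return $result;
}

$sizes = [10, 100, 1000, 10000, 100000];
$types = [['string', 'string'], ['string', 'int'], ['int', 'string'], ['int', 'int']];

foreach ($types as [$keyType, $valueType]) {
    foreach ($sizes as $size) {
        $data = makeTestArray($keyType, $valueType, $size);
        printf("%s => %s, %d items\n", $keyType, $valueType, count($data));
    }
}
```

Note that PHP silently converts numeric string keys such as '5' to the integer 5, which is why the string keys above carry a prefix.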

I ran a total of 5 tests for the 3 functions; all values here are their averages.
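A measurement along these lines can be sketched as follows. This is not the original benchmark script: the averageTime helper, the sample data and the use of eval to load var_export output are my assumptions (the output could just as well be written to a file and loaded with include). It needs PHP 7.4 or newer because of the arrow functions.

```php
<?php
// Average wall-clock time of $fn over $runs runs, in seconds.
function averageTime(callable $fn, int $runs = 5): float
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);
        $fn();
        $total += microtime(true) - $start;
    }
    return $total / $runs;
}

// Sample data: 1 000 int => int items (chosen only for illustration).
$data = range(1, 1000);

// "save" turns the array into a string, "load" turns it back into an array.
$methods = [
    'json' => [
        'save' => fn(array $a): string => json_encode($a),
        'load' => fn(string $s): array => json_decode($s, true),
    ],
    'serialize' => [
        'save' => fn(array $a): string => serialize($a),
        'load' => fn(string $s): array => unserialize($s),
    ],
    'var_export' => [
        'save' => fn(array $a): string => var_export($a, true),
        // var_export produces PHP source, so loading means evaluating it.
        'load' => fn(string $s): array => eval('return ' . $s . ';'),
    ],
];

foreach ($methods as $name => $m) {
    $stored = $m['save']($data);
    $saveTime = averageTime(fn() => $m['save']($data));
    $loadTime = averageTime(fn() => $m['load']($stored));
    printf(
        "%-10s save %.6fs  load %.6fs  size %d bytes\n",
        $name,
        $saveTime,
        $loadTime,
        strlen($stored)
    );
}
```

Absolute numbers will differ between machines and PHP versions, so only the relative differences between the methods are meaningful.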

Results

Because 5 array sizes mean 5 charts for each array type, this article would be too long and hard to read. That's why I'm including just the results for 10, 1 000 and 100 000 items here. If you want to see the results for all array sizes, check this XLSX with all results and charts.

Type #1: string => string

Type #2: string => int

Type #3: int => string

Type #4: int => int

Conclusion

There's one general rule: always try to use integers for keys and values where possible (use type casting like (int)) and make sure numbers are stored as integers, not as strings. I know you usually can't avoid strings, but it's better to keep this rule in mind during development than to rewrite half of your code later.
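As a small illustration of this rule, the sketch below stores the same numbers once as strings and once as integers. The sample array is made up; the point is only that the integer version produces shorter output with both functions.

```php
<?php
// Values as they often arrive from a database or a form: numeric strings.
$asStrings = ['id' => '12345', 'count' => '678'];

// Cast every value to int before storing (string keys are preserved).
$asInts = array_map('intval', $asStrings);

echo serialize($asStrings), "\n";   // a:2:{s:2:"id";s:5:"12345";s:5:"count";s:3:"678";}
echo serialize($asInts), "\n";      // a:2:{s:2:"id";i:12345;s:5:"count";i:678;}
echo json_encode($asStrings), "\n"; // {"id":"12345","count":"678"}
echo json_encode($asInts), "\n";    // {"id":12345,"count":678}
```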

For smaller arrays, up to around 1 000 items, it's usually better to use serialize. In some cases it depends on which action you need to perform more often (e.g. for an int => int array, saving in JSON is much faster than loading). I recommend running a benchmark to see which method is better for your particular case.

For large arrays (let's say 100 000 items and more), JSON was always the better choice in these tests. Loading arrays with unserialize tends to be extremely slow.

Of course, with serialize you can also convert objects to strings and get them back as instances of the same class (which is not possible with JSON, because json_decode returns only arrays or stdClass objects). But in performance-critical applications it's good to be very careful and check twice that it's really necessary to use objects.
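The difference in how objects are handled can be seen in a short sketch. The Point class is hypothetical and exists only for this example.

```php
<?php
// Hypothetical class used only for this illustration.
class Point
{
    public $x;
    public $y;

    public function __construct(int $x, int $y)
    {
        $this->x = $x;
        $this->y = $y;
    }
}

$point = new Point(1, 2);

// serialize keeps the class name, so unserialize rebuilds a Point.
$fromSerialize = unserialize(serialize($point));
var_dump($fromSerialize instanceof Point); // bool(true)

// json_encode writes only the public properties; the class name is lost.
$json = json_encode($point);               // {"x":1,"y":2}
$fromJson = json_decode($json, true);
var_dump(is_array($fromJson));             // bool(true)
```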
