Today, we would like to illustrate why it is important to find the ideal number of new elements per commit in TitanDB: the commit size can have a noticeable impact on performance.
First, we will describe the algorithm for which we were looking for the proper commit size. Basically, it is graph merging. A newly inserted sub-graph is compared to the existing big graph in the database. The algorithm tries to find the common parts and creates only those vertices and edges that do not exist in the database yet. To summarize, in addition to creating new vertices and edges, it fetches vertices, compares attributes and traverses the graph.
The measurement itself was very simple. We already had the algorithm, so we only needed to change the number of newly created elements (vertices and edges together) per commit. Before the test, we searched the Internet for the proper commit size for TitanDB and found that it should be roughly 10k elements per commit.
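To make the merging idea more concrete, here is a minimal in-memory sketch in Python. It is not our production code and it does not use the Titan API. The dictionary-based `db`, the function name `merge_subgraph`, and the rule that two vertices are "the same" when all their attributes are equal are assumptions made purely for illustration.

```python
# In-memory stand-in for the database (hypothetical structure, not Titan).
# "index" maps a vertex's attribute signature to its position in "vertices".


def merge_subgraph(db, new_vertices, new_edges):
    """Merge a sub-graph into db and return the number of created elements.

    new_vertices: dict mapping a local id to an attribute dict
                  (attribute values must be hashable)
    new_edges:    iterable of (source_local_id, label, target_local_id)
    """
    created = 0
    local_to_db = {}

    for local_id, attrs in new_vertices.items():
        # Assumption: a vertex already exists if all its attributes match.
        signature = frozenset(attrs.items())
        db_id = db["index"].get(signature)
        if db_id is None:
            # No match found, so this vertex really is new.
            db_id = len(db["vertices"])
            db["vertices"].append(dict(attrs))
            db["index"][signature] = db_id
            created += 1
        local_to_db[local_id] = db_id

    for src, label, dst in new_edges:
        # Translate the edge to database ids and create it only if missing.
        edge = (local_to_db[src], label, local_to_db[dst])
        if edge not in db["edges"]:
            db["edges"].add(edge)
            created += 1

    return created


db = {"vertices": [], "index": {}, "edges": set()}

# First sub-graph: everything is new (2 vertices + 1 edge).
first = merge_subgraph(
    db,
    {"a": {"name": "orders"}, "b": {"name": "customers"}},
    [("a", "references", "b")],
)

# Second sub-graph: "orders" already exists, so only 1 vertex + 1 edge are created.
second = merge_subgraph(
    db,
    {"x": {"name": "orders"}, "y": {"name": "invoices"}},
    [("y", "references", "x")],
)
```

Note that even in this toy version most of the work is lookups and comparisons, and only a part of it is the creation of new elements.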
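The parameter we varied can be sketched as follows. `CountingTransaction` is a hypothetical stand-in that only counts commits; it is not a Titan class, and in a real test the `commit()` call would go to the database transaction instead.

```python
class CountingTransaction:
    """Hypothetical stand-in for a graph transaction; it only counts commits."""

    def __init__(self):
        self.pending = 0   # elements created since the last commit
        self.commits = 0   # number of commits issued so far

    def add_element(self, element):
        self.pending += 1

    def commit(self):
        self.commits += 1
        self.pending = 0


def insert_with_commit_size(tx, elements, commit_size):
    """Create elements and commit after every commit_size new elements."""
    for element in elements:
        tx.add_element(element)
        if tx.pending >= commit_size:
            tx.commit()
    if tx.pending:
        # Flush the last, smaller batch.
        tx.commit()
    return tx.commits


tx = CountingTransaction()
# 1200 elements with a commit size of 500: batches of 500, 500 and 200.
commits = insert_with_commit_size(tx, range(1200), 500)
```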
For the test we used two data samples, the same as in the previous post about indices. The first one was part of a real data warehouse with about 1M vertices and 2M edges. The second one was an artificial sample with 50K vertices and almost no edges, but with the super node problem.
You can see the results in the table:
The best result for our algorithm came at only 500 new elements per commit, which is 20 times smaller than the roughly 10k elements suggested by the universal advice. The difference in time spent was about 30%, which is pretty nice. The artificial sample needed an even smaller commit size, and the difference in performance was greater (91%). Why are our results so different? We think the cause is the nature of our algorithm, in which the creation of new objects is only one part of all database operations.
The main conclusion we can draw from this test is that the size of the commit can make a significant difference. We definitely recommend performing similar tests to find your ideal commit size.
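If you want to run a similar test, a small harness along these lines may be enough. This is only a sketch: `find_best_commit_size` and `run_workload` are hypothetical names, and `run_workload(commit_size)` is assumed to be your own function that runs the whole import with the given commit size against a fresh copy of the database.

```python
import time


def find_best_commit_size(run_workload, candidate_sizes, repeats=3,
                          timer=time.perf_counter):
    """Time run_workload(size) for every candidate size.

    Returns (best_size, timings), where timings maps each size to the
    fastest of its repeated runs, in the units of the timer.
    """
    timings = {}
    for size in candidate_sizes:
        fastest = float("inf")
        for _ in range(repeats):
            start = timer()
            run_workload(size)
            fastest = min(fastest, timer() - start)
        timings[size] = fastest
    # Pick the candidate with the shortest measured time.
    best_size = min(timings, key=timings.get)
    return best_size, timings
```

Taking the fastest of several runs is just one possible choice to reduce noise; an average or a median would work as well.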
The tests were done on a computer with the following configuration:
OS: Win 7 64-bit
Processor: Intel Core i7-3740QM; 8 CPUs; 2.7 GHz
RAM: 8GB DDR3 (1600MHz)
JVM: 2GB RAM
Backend Storage: Persistit
Db cache size: 0.6