I have been working a couple of years with NHibernate and thought I would share some experiences about NHibernate performance tuning. This blog post will focus on batch inserts. When we talk about NHibernate performance tuning you should take into account that tuning NHibernate and tuning the database are 2
I have been working a couple of years with NHibernate and thought I would share some experiences about NHibernate performance tuning. This blog post will focus on batch inserts.
When we talk about NHibernate performance tuning you should take into account that tuning NHibernate and tuning the database are 2 seperated things. This post doesn't discuss any database tuning. We will focus on the NHibernate part.
First when doing performance tuning you should install the NHibernate Profiler. Installation and use is pretty straightforward. You open the profiler when running the application and you will see a trace of all your queries. Below is a typical tracing session of NHibernate Profiler :
For this experiment I created a small console application that will insert 200 objects into the database. The inserting of the objects will be executed 3 times but each time with a different method. We will compare the different execution times between the methods.
You can find the project on github.
For our first method we will be inserting objects with ID's that are generated by the database. The object that we are inserting in the database is a good old Product class:
There is not special about this class. The ID field is of the type integer and will contain our database generated ID. Now If we look at the project you will find a class InsertSimulator. This class contains the following method :
As you can see there is nothing special about this method. We setup some stopwatches, create 200 instances of Product and insert them into the database. Now let's look what the result is when we run this query:
The query execution time was 172ms. Which is kind of slow. Remember that the query is executed on a rather empty database and on my local machine. I say rather empty because I do a warming up run before the actual test.
Now if we look at our NHibernate profiler you will see that we get a gray ball with the following message "Too many database calls per session". The reason is that NHibernate will execute a database call with each insert and this will result in 200 database calls. Too many database calls is bad for the performance and should be avoided.
Wouldn't it be better if we could batch our inserts ? Luckily for us NHibernate support this feature.
In method 2 we are going to batch insert statements. NHibernate has a configuration setting that makes it possible to let batch insert statements. This configuration setting is called AdoNetBatchSize(N) where N stands for the amount of inserts you want to batch.
Now you should know that batch inserts only works when the ID is known before inserting the data. There 2 options here you can use:
For this method I am going to go with HiLo. We will discuss the GUID later on. Now if you don't know what HiLo is, I suggest you read this post
Basically you ask the database a set of ID's you can use. These ID's are then reserved and can only be used for our data. With this in mind we expect the performance to be better so let's take a look at the result:
The execution time of the query was 29ms. That's a difference of 143ms ! Which is a huge performance gain. When we check the NHibernate profiler you will see no more warnings and the amount of database calls is reduced to 5. Remember that our previous method had 200 database calls.
For the third method we will batch the insert statements with GUID generated keys. The assignment of the GUID is done in code. Now when we run this code we get the following result:
The result is 55ms. It's slower than HiLo but in my tests the 2 methods are in the same range of performance. Sometimes GUID is faster than HiLo but when running 10 or more tests you will see that HiLo is the fastest. A downside for using Guids can be the readability. When you are troubleshooting reading numbers like 1000 is easier then using {25892e17-80f6-415f-9c65-7395632f0223} :-).
I created a table with all the query results. There can be a small difference in the numbers when you run the queries yourself. I also added some result from runs I did on a Azure database.
DB generated | HiLo generated | GUID generated | |
---|---|---|---|
Query Result (localhost) | 172ms | 29ms | 55ms |
Query Result (Azure) | 5320ms | 374ms | 271ms |
As you can see using HiLo or GUID's will give you far better performance when batching inserts. The difference in speed between HiLo and GUID is small.
Now if you look at the tests on Azure you will even see bigger differences between method 1 and the others. You must take into account that I called the Azure database from my local machine and I didn't host my application on Azure.
When to use which method ? Depends on your project and personal / team preferences (some people like GUIDs others not so much). On 1 of my customers projects we are using a mix of method 1 and 2. For data that doesn't change much, like countries or nationalities, we use db generated keys. The tables where we need to insert a lot of data we decided to use HiLo.
Although the performance of our tests are good you should remember that NHibernate is not really the best tool for batch inserts. From the moment you start to insert 1000 or more records you should consider using SqlBulkCopy.
I hope you'll find this post useful.