Find and fix issues in Hibernate with Retrace

How to Find Hibernate Performance Issues in Development and Production

Thorben Janssen Developer Tips, Tricks & Resources, Popular Leave a Comment

The Java Persistence API (JPA), and Hibernate ORM as its most popular implementation are used in most Java applications to interact with a relational database. The main reason for their popularity is that they use object-relational mapping to abstract the database interactions and make it very easy to implement simple CRUD operations.

But this abstraction also has its downsides. Hibernate uses a lot of internal optimizations and hides all database interactions behind its API.

As a developer using Hibernate, you most often don’t know if or when Hibernate will execute an SQL statement for your call of a method on the EntityManager. That makes it hard to find inefficiencies and potential performance problems before they cause trouble in production.

Options to Monitor Database Interactions

The best way to avoid these issues is to monitor the database interactions during development, test and in production. There are lots of different ways you can do that. You can analyze your database logs, activate Hibernate’s internal logging, use a JDBC data source that logs all executed statements or use an external tool like Retrace to monitor your system.

All of these options can help you to better understand how Hibernate interacts with the database and you should choose the one you feel most comfortable with.

As a Java developer, I don’t like to work with database logs, and I know that I can’t always replace the JDBC data source. That’s why I focus on Hibernate’s internal logging and Retrace in this post.

Let’s have a closer look at both options before I show you how you can use them to find different performance problems.

Hibernate Logging

It’s no surprise that Hibernate has all the information you need to understand how it interacts with the database. It can write log messages telling you:

  • When it executed which SQL statement,
  • Which bind parameter values it used,
  • How many records the query returned and
  • How long each execution took.

As most frameworks and applications, Hibernate uses different categories and log levels to log this information. With the correct logging configuration, you get all the information you’re looking for. Let’s start with the logging of the executed SQL statements.

Log SQL statements

The best way to activate the logging of executed SQL queries is to set the log level of the category org.hibernate.SQL to DEBUG. You can change it at any time without changing your deployed application. Hibernate will use the configured logging framework to write the messages to the log file.

You can see an example of a log4j configuration in the following code snippet and you can fork the project with this and all following code snippets on github.

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p [%c] - %m%n
 
log4j.rootLogger=info, stdout
# basic log level for all messages
log4j.logger.org.hibernate=info
 
# SQL statements and parameters
log4j.logger.org.hibernate.SQL=debug

When you activate this logging configuration, Hibernate will write a log message for each executed SQL statement.

14:36:39,325 DEBUG SQL:92 - select order0_.id as id1_1_, order0_.version as version2_1_ from purchase_order order0_ where order0_.id=1

Activate Hibernate Statistics for More Details

The logging of all executed queries provides some basic information about all database interactions. But Hibernate can do a lot more.

If you enable its statistics component, it measures how long it took to execute a query. It also provides a summary of the executed queries and their execution time at the end of each session. That makes it a lot easier to get a general overview of your application and helps you to identify the sessions you should take a closer look at.

But be careful, collecting all these information slows down your application and you shouldn’t use it in production.


Free Download

You can activate the statistics component by setting the system property hibernate.generate_statistics or a configuration parameter with the same name to true.

<persistence>
    <persistence-unit name="my-persistence-unit">
        <provider>org.hibernate.jpa.HibernatePersistenceProvider</provider>
        <properties>
            <property name="hibernate.generate_statistics" value="true" />

			...
        </properties>
    </persistence-unit>
</persistence>

Hibernate will then write a summary of all database interactions at the end of each session. As I will show you in a later part of this post, this is really helpful to find use cases in which Hibernate executes too many SQL queries.

14:37:30,715  INFO StatisticalLoggingSessionEventListener:258 - Session Metrics {
    48986 nanoseconds spent acquiring 1 JDBC connections;
    23326 nanoseconds spent releasing 1 JDBC connections;
    259859 nanoseconds spent preparing 1 JDBC statements;
    1092619 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    22383767 nanoseconds spent executing 1 flushes (flushing a total of 1 entities and 1 collections);
    72779 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}

When you activate the statistics component, you should also set the log level of the org.hibernate.stat category to DEBUG. Hibernate will then write a log message with the SQL statement, the number of returned rows and the execution time of each query to the log file.

14:38:05,231 DEBUG ConcurrentStatisticsImpl:387 - HHH000117: HQL: SELECT o FROM Order o WHERE id = 1, time: 48ms, rows: 1

Collecting information Takes Time

With these settings, Hibernate provides you a lot of information which help you to find potential performance issues before they cause problems in production.

A potential downside of this approach is that it can be difficult to identify all log statements that were triggered by a specific use case. That is especially the case when you increase the log output on a production system or any other system that’s used by multiple, parallel users.

And keep in mind, that collecting all these information and writing the additional log messages slow down your application. So better be careful and don’t use it in production. Retrace can provide you similar information for your production system and presents them in a graphical interface.

You can view and filter all performed SQL queries, and you get additional information about how often a query was executed and how long the average execution took.

Filter by SQL

If you need more information about a specific query, you can click on it to get the average execution time and the execution count over time. You can also see the web requests that triggered the SQL query.

Retrace Execution Count

Finding Typical Performance Issues

OK, we talked about different approaches to get more information about Hibernate’s database interactions so that you can find potential performance issues. But how do you find these problems and how can you fix them?

There are two groups of performance problems that you can easily spot with Hibernate’s additional log messages or with Retrace. The first group contains all problems that are caused by slow SQL queries and the second one the problems caused by too many SQL statements.

Let’s begin with the first group.

Find and Fix Slow SQL Queries

Analyzing these kinds of problems requires a good understanding of your database and the tools it offers. So, if you have an experienced database administrator in your team, don’t be shy and ask for some help. It will make both your jobs a lot easier if you work together.

If you followed the instructions at the beginning of this post, finding the slow queries is the easy part. Hibernate and Retrace show you exactly how long the execution of a query took.

Improve Your Queries

Improving the performance of these queries is often a lot harder. You first need to find out why a query is slow. That’s where your database tools and the experience of your administrator are important.

In general, the performance problems are either caused by the database, e.g. due to outdated statistics or missing indices, or by an inefficient SQL statement.

You can’t do anything in your Java code to fix the internal database issues. So, I will not get into any more details about that. Just talk to your database administrator. He will know what you need to do.

Your database administrator will also be able to help you to analyze and improve your inefficient SQL queries. Your job will then be to either transform the optimized SQL query into a JPQL statement or you can execute it as a native SQL query.

Hibernate’s and JPA’s support for native SQL queries allows you to execute a plain SQL statement. You can do that by calling the createNativeQuery method of the EntityManager with your SQL statement. Hibernate doesn’t parse these statements, so you can use all standard and proprietary SQL features that are supported by your database.

Order o = (Order) em.createNativeQuery("SELECT * FROM purchase_order o WHERE id = 1", Order.class).getSingleResult();

Find and fix too many queries

Another common group of performance problems is caused by too many queries. In contrast to the previous group, this one is only caused by Hibernate and how you use it. So, you need to fix it in your Java code.

The most common situation in which Hibernate performs a lot more queries than you might expect is called the n+1 select issue.

N+1 Select issue

The n+1 select issue is probably the most popular Hibernate performance issue. It’s caused by the initialization of a lazy association between two entities. Let’s take a look at an example.

An order consists of multiple items. If you model that as entities, you get an Order entity with a lazy one-to-many association to the OrderItem entity.

The n+1 select issue occurs if you load multiple Order entities and iterate through them to call the getOrderItem method for each of them.

List orders = em.createQuery("SELECT o FROM Order o", Order.class).getResultList();
for (Order o : orders) {
	log.info("Order: "+o.getId() + " Num items: "+o.getItems().size());
}

When you execute this code, Hibernate performs 1 query to load n Order entities. It then has to execute another SQL statement for each Order entity to fetch the associated OrderItem entities. So in the end, Hibernate performs n+1 queries.

14:44:59,316 DEBUG SQL:92 - select items0_.fk_order as fk_order4_0_0_, items0_.id as id1_0_0_, items0_.id as id1_0_1_, items0_.name as name2_0_1_, items0_.fk_order as fk_order4_0_1_, items0_.version as version3_0_1_ from OrderItem items0_ where items0_.fk_order=?
14:44:59,334  INFO TestOrder:55 - Order: 1 Num items: 3
14:44:59,334 DEBUG SQL:92 - select items0_.fk_order as fk_order4_0_0_, items0_.id as id1_0_0_, items0_.id as id1_0_1_, items0_.name as name2_0_1_, items0_.fk_order as fk_order4_0_1_, items0_.version as version3_0_1_ from OrderItem items0_ where items0_.fk_order=?
14:44:59,337  INFO TestOrder:55 - Order: 2 Num items: 3
14:44:59,356  INFO StatisticalLoggingSessionEventListener:258 - Session Metrics {
    26126 nanoseconds spent acquiring 1 JDBC connections;
    21927 nanoseconds spent releasing 1 JDBC connections;
    427344 nanoseconds spent preparing 3 JDBC statements;
    2645707 nanoseconds spent executing 3 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    17105884 nanoseconds spent executing 1 flushes (flushing a total of 8 entities and 2 collections);
    67647 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Depending on the number of selected Order entities, Hibernate might need to execute a huge number of queries which can slow down your application significantly.

This issue is easy to find. Hibernate’s session statistics and Retrace provide you with the number of queries that were executed within a given session. You then just need to analyze the executed queries and find the one that causes the n+1 select issue.

How to Fix the N+1 Select Issue

As soon as you’ve done that, you can fix the problem. That’s also not a big deal. You just need to tell Hibernate that it has to initialize the association when it loads the Order entity. You can do that with a JOIN FETCH clause in a JPQL query.

List orders = em.createQuery("SELECT DISTINCT o FROM Order o JOIN FETCH o.items", Order.class).getResultList();

As you can see in the log output, the JOIN FETCH clause changes the projection and adds a JOIN clause to the SQL statement. Hibernate now loads the Order entities and all associated OrderItem entities with one query.

14:47:12,121 DEBUG SQL:92 - select distinct order0_.id as id1_1_0_, items1_.id as id1_0_1_, order0_.version as version2_1_0_, items1_.name as name2_0_1_, items1_.fk_order as fk_order4_0_1_, items1_.version as version3_0_1_, items1_.fk_order as fk_order4_0_0__, items1_.id as id1_0_0__ from purchase_order order0_ inner join OrderItem items1_ on order0_.id=items1_.fk_order
14:47:12,183 DEBUG ConcurrentStatisticsImpl:387 - HHH000117: HQL: SELECT DISTINCT o FROM Order o JOIN FETCH o.items, time: 66ms, rows: 6
14:47:12,187  INFO TestOrder:71 - Order: 2 Num items: 3
14:47:12,187  INFO TestOrder:71 - Order: 1 Num items: 3
14:47:12,216  INFO StatisticalLoggingSessionEventListener:258 - Session Metrics {
    19594 nanoseconds spent acquiring 1 JDBC connections;
    58317 nanoseconds spent releasing 1 JDBC connections;
    244930 nanoseconds spent preparing 1 JDBC statements;
    1984630 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    23932655 nanoseconds spent executing 1 flushes (flushing a total of 8 entities and 2 collections);
    76044 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}

Summary

As you’ve seen, Retrace and Hibernate’s statistics can provide you with a lot of information about Hibernate’s database interactions. That can help you to identify and fix performance problems during development, test and in production.

About Thorben Janssen

Thorben is an independent trainer and author of the Amazon bestselling book Hibernate Tips - More than 70 solutions to common Hibernate problems.He writes about Java EE related topics on his blog Thoughts on Java.