Using native queries with JPA? Here’s the catch

I’m a big fan of native SQL queries. I mean when you start abstracting away from the real thing – like with JPA – you have to pay the price. Let’s say you have a use case where you need to fetch some data with some specific transformation function in Postgres or you have to deal with recursive queries?

The easiest option is to abandon JPA for that particular use-case and directly go with native queries. The effectiveness of the query will only depend on your expertise. Gaining back control is not a bad thing at all.

I believe in using JPA and Hibernate to help you with development and make sure you can move faster towards producing value instead of spending time writing elementary SQL for all your tables. But, that doesn’t mean you can’t implement a hybrid solution where you use JPA for the majority of your use-cases and where complex database layer logic should come in place, you deal with those using native queries.

The unforeseen gift

Using abstractions will hide a lot of things from you. That’s good and bad at the same time. If you don’t learn how to use the framework, you’ll shoot yourself in the foot. Why? Because JPA and Hibernate is complex. It’s not as straightforward as you’d think.

The first time you met JPA, I bet you read a tutorial on how to configure it and started using the features. You probably had some real simple use-cases at first, like this:

Product p1 = new Product(1, "notebook");
entityManager.persist(p1);

Awesome. Works like a charm.

Your product person comes there and says, hey Mr. Developer, we need to save 2 rows into the database instead. You’re a good worker, you do make the change.

Product p1 = new Product(1, "notebook");
entityManager.persist(p1);
Product p2 = new Product(2, "headphone");
entityManager.persist(p2);

Works perfectly. Since you are inserting more than one entity in a single transaction, you plan to enable statement batching for this particular transaction because you heard somewhere batching is good to build blazing fast applications. You implement the change.

entityManager.unwrap(Session.class).setJdbcBatchSize(2);

2021-10-14 11:05:11.801  INFO 8456 --- [Test worker] n.t.d.l.l.SLF4JQueryLoggingListener: Name:dataSource, Connection:6, Time:2, Success:True, Type:Prepared, Batch:True, QuerySize:1, BatchSize:2, Query:["insert into products (name, id) values (?, ?)"], Params:[(notebook,1),(headphone,2)]

Great. The 2 rows are grouped into a single insert statement.

Everything is fine until the product guy comes again and says. Well, this is good but instead of headphone we should query that name from another table for which you’d need a recursive query and native database functions.

Product p1 = new Product(1, "p1");
entityManager.persist(p1);
String name = (String) entityManager.createNativeQuery("// VERY COMPLICATED SELECT").getSingleResult();
Product p2 = new Product(2, name);
entityManager.persist(p2);

2021-10-14 11:07:57.644  INFO 2308 --- [Test worker] n.t.d.l.l.SLF4JQueryLoggingListener: Name:dataSource, Connection:6, Time:2, Success:True, Type:Prepared, Batch:True, QuerySize:1, BatchSize:1, Query:["insert into products (name, id) values (?, ?)"], Params:[(notebook,1)]
2021-10-14 11:07:57.663  INFO 2308 --- [Test worker] n.t.d.l.l.SLF4JQueryLoggingListener: Name:dataSource, Connection:6, Time:0, Success:True, Type:Prepared, Batch:False, QuerySize:1, BatchSize:0, Query:["// VERY COMPLICATED SELECT"], Params:[()]
2021-10-14 11:07:57.671  INFO 2308 --- [Test worker] n.t.d.l.l.SLF4JQueryLoggingListener: Name:dataSource, Connection:6, Time:0, Success:True, Type:Prepared, Batch:True, QuerySize:1, BatchSize:1, Query:["insert into products (name, id) values (?, ?)"], Params:[(notebook,2)]

What just happened? Instead of a single batched insert we have 2 separate inserts.

Is it a bug?

No, it is not a bug. In fact it behaves as expected even though we just ruined statement batching with that native query.

The reason this happens is because there’s a mechanism in the JPA world called flushing. Flushing is the fundamental concept related to the entity manager’s behavior i.e. collecting the changes made to the entities and at the right time, all those changes will be propagated to the underlying database. That’s flushing in a nutshell.

Summing up the JPA and Hibernate flush modes concludes 4 different flush strategies:

AUTO (JPA)
COMMIT (JPA)
ALWAYS (Hibernate)
MANUAL (Hibernate)

Out of those, AUTO is the default mode and the JPA specification has defined what kind of behavior it should achieve under certain circumstances. Handling native queries is just one example that could mess up your batching.

The solution to this use-case is to disable flushing prior to the native query, for example using COMMIT mode. Of course if the native query’s result depends on the changes collected prior to the query, your only option is to flush those changes, otherwise you’ll end up in having functional problems in your app.

Batching and flushing is a difficult topic. Last time I wrote about How to mess up statement batching in Hibernate, check it out.

For a complete explanation of flushing and batching, check out my course Effective Hibernate with Spring, you won’t regret it.

In the meantime, follow me on social media, Twitter and Facebook for more.