Batching SQL statements with Java
Using implementation-agnostic APIs
JDBC API
PreparedStatement.addBatch() + executeBatch(): manual batch management, partial failures, getUpdateCounts()
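The addBatch() / executeBatch() flow can be sketched as follows. This is a minimal illustration, not a reference implementation: the `person` table, its `name` column, and the class and method names are hypothetical, and the caller is assumed to own the connection and its transaction (typically with auto-commit disabled and a single commit at the end).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class JdbcBatchInsert {

    /**
     * Inserts the given names in chunks of batchSize rows.
     * Returns the number of executeBatch() calls that were issued.
     */
    static int insertNames(Connection connection, List<String> names, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        // Hypothetical table and column, for illustration only.
        String sql = "INSERT INTO person (name) VALUES (?)";
        int executedBatches = 0;
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            int pending = 0;
            for (String name : names) {
                statement.setString(1, name);
                statement.addBatch();          // queue the row on the client side
                pending++;
                if (pending == batchSize) {
                    statement.executeBatch();  // send the queued rows to the database
                    executedBatches++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                statement.executeBatch();      // send the last, incomplete chunk
                executedBatches++;
            }
        }
        return executedBatches;
    }
}
```

Executing in fixed-size chunks keeps the number of rows queued in the driver bounded. The right chunk size depends on the driver and the row width, so it is worth measuring rather than guessing.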
JPA
Manual flush in a loop (persist() + flush() + clear() every N entities). Limitations: no direct access to addBatch; statement ordering and optimization depend on the provider.
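The persist() + flush() + clear() loop might look like the sketch below. To keep it free of dependencies, the nested `PersistenceContext` interface is a hypothetical stand-in that mirrors only the three `EntityManager` methods involved; in real code you would pass a `jakarta.persistence.EntityManager` inside an active transaction. The class and method names are likewise illustrative.

```java
import java.util.List;

final class FlushClearLoop {

    /** Hypothetical stand-in for the three EntityManager methods used here. */
    interface PersistenceContext {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /**
     * Persists all entities, flushing and clearing every flushInterval entities.
     * Returns the number of flush/clear cycles performed.
     */
    static int persistAll(PersistenceContext em, List<?> entities, int flushInterval) {
        if (flushInterval <= 0) {
            throw new IllegalArgumentException("flushInterval must be positive");
        }
        int cycles = 0;
        int sinceLastFlush = 0;
        for (Object entity : entities) {
            em.persist(entity);
            sinceLastFlush++;
            if (sinceLastFlush == flushInterval) {
                em.flush();  // push the pending statements to the database
                em.clear();  // detach managed entities so the context stays small
                cycles++;
                sinceLastFlush = 0;
            }
        }
        if (sinceLastFlush > 0) {
            em.flush();      // handle the last, incomplete group
            em.clear();
            cycles++;
        }
        return cycles;
    }
}
```

Whether each flush turns into a real JDBC batch is up to the provider and its configuration; the loop itself only bounds the size of the persistence context.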
Using builders/DSL
jOOQ
dslContext.batch(insert…).execute(). Advantage: type-safe, and it produces a real JDBC batch.
MyBatis
SqlSessionFactory.openSession(ExecutorType.BATCH)
<foreach> for dynamic batches
Using an ORM
Hibernate
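Properties: hibernate.jdbc.batch_size, hibernate.order_inserts, hibernate.order_updates. @BatchSize on entities/collections (note that this annotation controls batch fetching on reads, not JDBC write batching). Periodic session.flush()/clear() is required.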
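As a small sketch, the three batching properties can be collected in a `java.util.Properties` object. The class name, the method name, and the value 50 used in the usage note are assumptions for illustration, not recommendations from this post.

```java
import java.util.Properties;

final class HibernateBatchSettings {

    /** Builds the Hibernate batching properties for the given JDBC batch size. */
    static Properties batchProperties(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Properties properties = new Properties();
        // Maximum number of statements Hibernate groups into one JDBC batch.
        properties.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Group statements by entity type so consecutive statements can share a batch.
        properties.setProperty("hibernate.order_inserts", "true");
        properties.setProperty("hibernate.order_updates", "true");
        return properties;
    }
}
```

For example, `HibernateBatchSettings.batchProperties(50)` could be passed as the property map to `Persistence.createEntityManagerFactory(unitName, properties)`, or the same keys could be declared in `persistence.xml`.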
Blaze-Persistence
Pitfalls and best practices
The choice
Any abstraction over the driver brings its own overhead.
When clear() is vital (memory, ID generation). Batch errors (the whole batch fails if a single row fails, unless continueOnError). Comparison: JDBC > DSL > ORM for raw performance; ORM for simplicity when working with entities.
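To illustrate the batch-error point at the JDBC level: when executeBatch() fails, the driver throws a `BatchUpdateException` whose getUpdateCounts() describes what happened. A driver that keeps going after a failure reports one entry per statement and marks the failed ones with `Statement.EXECUTE_FAILED`; a driver that stops at the first failure returns only the counts of the statements that ran before it. The helper below is a hedged sketch with hypothetical class and method names, and it covers only the first case.

```java
import java.sql.BatchUpdateException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class BatchFailures {

    /**
     * Returns the batch positions that the driver flagged as failed.
     * The list is empty when the driver stopped at the first failure,
     * because it then reports only the statements that succeeded.
     */
    static List<Integer> failedIndexes(BatchUpdateException exception) {
        List<Integer> failed = new ArrayList<>();
        int[] counts = exception.getUpdateCounts();
        if (counts == null) {
            return failed; // the driver supplied no update counts
        }
        for (int i = 0; i < counts.length; i++) {
            if (counts[i] == Statement.EXECUTE_FAILED) {
                failed.add(i);
            }
        }
        return failed;
    }
}
```

Comparing the length of getUpdateCounts() with the number of statements in the batch tells you which of the two driver behaviors you are dealing with.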
Demo
A showcase of the concepts illustrated in this post is available here: regex-performance-benchmark