I set the batch_size in application.yml:

```yaml
spring:
  profiles: dev
  jpa:
    properties:
      hibernate:
        generate_statistics: true
        jdbc:
          batch_size: 20
```
What's the reason that batch_size should not be set to as high a value as possible? What's the disadvantage of setting batch_size to, let's say, 10000? Doesn't it behave the same as batch_size: 20 for <= 20 queries, and for > 20 queries keep sending them in one batch instead of making another request to the database? Isn't that better behaviour than setting it to a lower value like 20?
Answer
The reason for keeping batch sizes small is memory consumption.
From the docs:
Hibernate caches all the newly inserted Customer instances in the session-level cache, so, when the transaction ends, 100 000 entities are managed by the persistence context. If the maximum memory allocated to the JVM is rather low, this example could fail with an OutOfMemoryException. The Java 1.8 JVM allocated either 1/4 of available RAM or 1Gb, which can easily accommodate 100 000 objects on the heap.
In addition, long-running transactions can deplete a connection pool, so other transactions don’t get a chance to proceed.
JDBC batching is not enabled by default, so every insert statement requires a database roundtrip. To enable JDBC batching, set the hibernate.jdbc.batch_size property to an integer between 10 and 50.
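The usual companion to a bounded batch_size is flushing and clearing the persistence context at the same interval, as in the batching example from the Hibernate documentation. Below is a minimal sketch of that pattern in a Spring service; the Customer entity, its constructor, and the service name are illustrative assumptions, and BATCH_SIZE mirrors the batch_size: 20 from the config above:

```java
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class CustomerImportService {

    // Keep in sync with hibernate.jdbc.batch_size so each flush
    // translates into (roughly) one batched INSERT round trip.
    private static final int BATCH_SIZE = 20;

    @PersistenceContext
    private EntityManager entityManager;

    @Transactional
    public void importCustomers(int count) {
        for (int i = 0; i < count; i++) {
            // Customer is a hypothetical entity used for illustration.
            entityManager.persist(new Customer("customer-" + i));

            // Without this, all `count` entities stay managed in the
            // session-level cache until commit, which is the memory
            // problem described in the quote above.
            if (i > 0 && i % BATCH_SIZE == 0) {
                entityManager.flush(); // push the current batch to the database
                entityManager.clear(); // detach flushed entities so the GC can reclaim them
            }
        }
    }
}
```

With batch_size: 10000 you would typically flush every 10000 entities as well, so both the driver's statement buffer and the persistence context grow accordingly; past a few dozen statements per round trip the network savings flatten out while heap usage keeps climbing, which is why the docs suggest values between 10 and 50.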