Testing Databases
github repository : https://github.com/DarrenForsythe/testing-databasesWelcome to the first in hopefully a series of blogs that will discuss some tools on how to test more effectively.
While my examples are in java the core concepts discussed not specific, even some tooling has implementations in other languages. I’ll finish each post with links to tools used and try to include other language implementations.
These blogs will discuss what I’ve seen as poorly, under-tested, or simply not tested integrations in services.
I’ll discuss some examples and how we could improve them to increase our confidence.
|
|
Mocks are always the answer
How often have you came across pieces of code close to this,
A simple service saves some arbitrary entity.
Note: The UserRepository could be any data access client e.g. JdbcTemplate with a row mapper provided, mongoose, or SQLAlchemy/Django-ORM. The point is all of these are higher level abstractions.
public class UserService {
private final UserRepository userRepository;
public UserService(final UserRepository userRepository) {
this.userRepository = userRepository;
}
public void saveUser(String name, String employeeId) {
var user = new User();
user.setName(name);
user.setEmployeeId(employeeId);
userRepository.save(user);
}
}
Our user entity is a simple class,
@Entity
public class User {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private long id;
private String name;
private String employeeId;
//Getters, Setters, equals, hashcode omitted for clarity.
}
With an accompanying test,
class UserServiceTest {
private UserService userService;
private UserRepository userRepository;
@BeforeEach
void setUp() {
userRepository = mock(UserRepository.class);
userService = new UserService(userRepository);
}
@Test
void userShouldSaveWithoutModificationOfNameOrEmployeeId() {
userService.saveUser("darren", "1");
var savedUser = new User();
savedUser.setName("darren");
savedUser.setEmployeeId("1");
verify(userRepository).save(savedUser);
verifyNoMoreInteractions(userRepository);
}
}
Looks fine? If we run a coverage tool against the test it will report that the class is 100% tested! Yay, job done.
Wrong!
Let’s take a look at the UserRepository
,
public interface UserRepository extends JpaRepository<User, Long> {
}
That’s it, the only thing we needed to do was add some arguments the type the repository will handle, User
and what the ID
will be a long in this case. There is a ton of complexity that Spring Data handles for us that
developers do not either care about, or think about until it goes wrong.
The Spring Data team has tested its JpaRepository
implementation
that when the save(S entity)
method is called an entity it will work its way through Hibernate,
and eventually the entity will end up in a database.
Should we care about testing this any more than expected?
Yes we should. Of the top of my head in any semi-complex application,
- How will the service handle errors?
- Is the entity setup correctly to be saved?
- Will the entity generate an
id
correctly? - Does the entity have the correct column definitions set?
- What if we use Liquibase or Flyway to manage database migrations, does the entity align correctly to what’s defined?
- Given a bug how could it be effectively re-created by using mocks?
- new or inexperienced developers making changes around the data access?
This is “only” a “simple” save
method.
Unless you work on the Spring Data team then I would doubt that a
person can know all the potential flows, gotchas, and possible
outcomes of the save
method, don’t forget that this is abstracting Hibernate too,
which in turns requires a datasource, which in turn would require a driver.
I do not trust any code base which only has mocked data access clients, as I am 100% sure any testing will involve the following.
me: I’ve updated the entity, the changelogs, and added some new methods to the UserRepository. How do we test that?
developer: We just deploy it and run through some [Rest API/UI] and verify there’s no errors
Would it not be better to write a better test to increase
your confidence that you are using the JpaRepository
and other interacting components/tools
as expected?
What’s the alternative?
Hopefully it has been made obvious we need some sort of database to connect to and use real objects over mocks for these complex abstractions.
What’s our choices?
In-Memory
I am sure most people are well aware using an in-memory database is a pretty good solution to avoiding mocking and give us our first step to gaining more confidence.
Using some Spring Boot Test black magic, @DataJpaTest
, an in-memory database can
easily be created with a datasource automatically created for us.
Note: While this does contain Spring Boot black magic the same concept can be applied to any language. Configure an in-memory database, configure a datasource and connect the client to it etc.
@DataJpaTest
class UserServiceH2Test {
private UserService userService;
@BeforeEach
void setUp(@Autowired UserRepository userRepository) {
userService = new UserService(userRepository);
}
@Test
void userShouldSaveWithoutModificationOfNameOrEmployeeId(
@Autowired UserRepository userRepository) {
userService.saveUser("darren", "1");
assertThat(userRepository.findAll())
.isNotEmpty()
.hasSize(1)
.element(0)
.extracting(User::getEmployeeId, User::getName, User::getId)
.containsOnly("1", "darren", 1L);
}
}
No longer do we need to write a tedious bit of verification code that the object
passed to the UserRepository#save
method has changed. We have in fact went further
and verified it hasn’t changed down to the database layer.
On the flip side we have introduced some Spring Boot annotations and magic which can lead to some interesting exceptions on start-up of the test, but I’ll discuss that in another blog.
Is this now good enough? We are,
- Using real objects and not mocking a complex abstraction.
- Verified our entity will at least work with H2.
- Able to verify down to the database layer that our entity will save to a table successfully without changing
No, it’s not. You do not deploy against H2. It does its very best with compatability modes, but it cannot replicate what would be used at runtime e.g. MySQL, DB2, Postgres, MSSQL etc.
Ideally you would not be using Hibernate’s DDL Auto mode to create the table for you, this has its own set of issues. Race conditions for greater than 1 instance, not knowing exactly what types for what database it should use. What next?
Test against what you will use at runtime
If you want to get to the highest levels of confidence you should be testing against what you will be connecting to at runtime.
Two years ago I still setup tests that would enable if they found a local MySql
database that could be connected. We even found an issue between our in-memory database,
and our deployed applications specifically around MySQL’s JSON
Data Type. We didn’t need
to wait until runtime and pickup on increased errors rates or manual tests.
That isn’t a great solution, it’s a local stateful database. We need a blank slate each time if we want them to be repeatable, and contained with in the repository to allow a CI pipeline to run them.
Testcontainers
Testcontainers to the rescue. I believe I found Testcontainers
sometime mid 2019, and it has changed my life when it comes to testing abstractions or dealing
with something that would cause network I/O.
Note: Various implementations exist, it was a java library at first (or rather a github gist ) but now has support with multiple languages with differing levels of support, but the same core idea.
Let’s make our simple application a little more complicated and introduce a changelog that will setup the required schema for the entity to be saved to the table, and assume we are going to use MySql at runtime.
<changeSet id="create-user-table" author="darrenforsythe">
<createTable tableName="user">
<column name="id" type="bigint" autoIncrement="true">
<constraints primaryKey="true"/>
</column>
<column name="name" type="varchar(255)">
<constraints nullable="false"/>
</column>
<column name="employee_id" type="varchar(255)"/>
</createTable>
</changeSet>
Let’s breakdown our changelog,
- We are creating a
user
table for our user entity - First column is our
id
table. The column is auto-incrementing of typeBIGINT
which maps directly to Java’s Long as both hold 8 Bytes of data. - Next the
name
column which is not nullable and of typevarchar(255)
- Finally, the
employee_id
column of typevarchar(255)
with nothing fancy about it.
We cannot easily test this against H2
it will run against MySQL
. This is also a simple example,
which I had to correct three times as I had named the employee_id
column named wrong, and the incorrect
column types that Hibernate was expecting.
Now let’s update our test to give us even more confidence by using Testcontainers
@DataJpaTest(excludeAutoConfiguration = TestDatabaseAutoConfiguration.class)
@Testcontainers
class UserServiceTestcontainersTest {
@Container private static final MySQLContainer<?> MY_SQL_CONTAINER = new MySQLContainer<>();
@DynamicPropertySource
static void setupMySQLDB(DynamicPropertyRegistry propertyRegistry) {
propertyRegistry.add("spring.datasource.username", MY_SQL_CONTAINER::getUsername);
propertyRegistry.add("spring.datasource.password", MY_SQL_CONTAINER::getPassword);
propertyRegistry.add("spring.datasource.url", MY_SQL_CONTAINER::getJdbcUrl);
}
private UserService userService;
@BeforeEach
void setUp(@Autowired UserRepository userRepository) {
userService = new UserService(userRepository);
}
@Test
void userShouldSaveWithoutModificationOfNameOrEmployeeId(
@Autowired UserRepository userRepository) {
userService.saveUser("darren", "1");
assertThat(userRepository.findAll())
.isNotEmpty()
.hasSize(1)
.element(0)
.extracting(User::getEmployeeId, User::getName)
.containsOnly("1", "darren");
}
}
Not much has changed, we’ve disabled the TestDatabaseAutoConfiguration
which was creating the H2
in-memory
database for us, and added a MySQLContainer
and a bit of Spring Framework magic will
allow us to set dynamic properties easily.
We have not changed our assertion, and it still passes successfully but now we have
increased confidence that our User
entity, Liquibase changelog, and MySQL will all
work together.
We could also add some more tests to further test our changelogs are acting as we expect.
@Test
void nameCannotBeNull() {
assertThatThrownBy(() -> userService.saveUser(null, "1"))
.hasCauseInstanceOf(ConstraintViolationException.class);
}
With 5 lines of code there’s now a guarantee that if the name
column ever was updated nullable
it would be caught before it ever got to into a development region.
Let’s do something more ridiculous,
@RepeatedTest(5)
void idShouldIncrementCorrectly(
@Autowired UserRepository userRepository, RepetitionInfo repetitionInfo) {
userService.saveUser("darren_" + repetitionInfo.getCurrentRepetition(), "1");
assertThat(
userRepository.findById(
Integer.toUnsignedLong(repetitionInfo.getCurrentRepetition())))
.isNotEmpty()
.get()
.extracting(User::getName)
.isEqualTo("darren_" + repetitionInfo.getCurrentRepetition());
}
Using the Junit5 RepeatedTest
annotation we repeat the same test 5 times, set the name to be darren_<currentIncrement>
and then validate that on each pass we finding a record with the correct name by the current test increment.
Not something you would normally test for, but something that may work differently on different databases or slightly differently is easily tested for.
Conclusion
Hopefully it’s easy to see why you should seriously consider using testcontainers the next time you writing a test for a component in which a component uses a database client please do not mock it. The amount of confidence that the test will give you is superficial here.
Testcontainers is super simple to set up, all it requires is Docker which is ubiquitous across software development. There is no need to assume the functionality of the abstractions you are mocking.
It gets you incredibly close to being able to write complex tests, and re-create bugs, far more simply than if you start mocking everything.
I will show off more complex uses of testcontainers
in the future, it is one of my favourite tools and have
uncovered many bugs before they ever got into any release.
Working examples of the code in this blog is available at the top!
Thanks for reading.