Test-Driven Architecture

Published: 2022-11-12 by Lars. Tags: methodology, architecture, test

Test-Driven Development (TDD) has been around for many years as a design and coding methodology. In this blog post we will investigate the value of also test-driving the overall architecture of a system.

[Image: Wolf blowing at the house of the three little pigs]

Test-Driven Development

With TDD we write automated tests for a feature before we write code to implement that feature. TDD is focused on functional requirements: The system is required to behave in a certain way; we will first write an (initially failing) test to verify this behavior; and only then will we write the code necessary to implement the behavior and make the test pass.

For example, we might have a requirement that prevents a user from logging in if they provide an invalid password. Before writing the code to implement this, we will write a test that executes the following steps:

create a user with a specific password
open the login page
enter user name and wrong password for the created user
verify the error message shown to the user
verify that navigating to an authentication-protected page fails

After running this test and seeing it fail at the two verification steps, we write the code that checks the password, and keep going until the test passes.
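The steps above could be sketched as a unit test like the one below. This is only an illustration: `FakeApp` and its methods are hypothetical in-memory stand-ins for the real system, and the error message text is an assumption. In a real project the test would drive the actual login page, and it would be written while the password check in `login` is still missing, so that both assertions fail at first.

```python
import unittest


class FakeApp:
    """Hypothetical in-memory stand-in for the system under test."""

    def __init__(self):
        self._passwords = {}
        self._logged_in = set()

    def create_user(self, name, password):
        self._passwords[name] = password

    def login(self, name, password):
        # Returns an error message on failure, or None on success.
        if self._passwords.get(name) != password:
            return "Invalid user name or password"
        self._logged_in.add(name)
        return None

    def open_protected_page(self, name):
        # Returns an HTTP-like status code for the protected page.
        return 200 if name in self._logged_in else 401


class LoginTest(unittest.TestCase):
    def test_wrong_password_is_rejected(self):
        app = FakeApp()
        # Create a user with a specific password.
        app.create_user("alice", "correct-password")
        # Enter user name and wrong password for the created user.
        error = app.login("alice", "wrong-password")
        # Verify the error message shown to the user.
        self.assertEqual(error, "Invalid user name or password")
        # Verify that navigating to a protected page fails.
        self.assertEqual(app.open_protected_page("alice"), 401)
```

Saved in a file, the test can be run with `python -m unittest`.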

TDD is a valuable methodology because it provides a number of benefits:

Testability: We know that we can add tests anywhere in the code base.
Quality: We can extend the code and will receive fast and robust feedback on any unintended changes in existing behavior.
Documentation: Tests provide correct descriptions of the behavior of existing features.

By writing the tests first, the team has some good incentives in place:

to never have large untested parts of the system
to never add more code than needed to fulfill the requirements

Since TDD has been so valuable for functional requirements, how can we apply the same methodology when working with the architecture of our systems?

Test-Driven Architecture

With Test-Driven Architecture, we write automated tests for an architectural capability before we add infrastructure to provide that capability. System architecture is focused on non-functional requirements. The system should provide a required capability; we will first write an (initially failing) test to verify this capability; and only then will we establish the infrastructure necessary to provide this capability and make the test pass.

For example, we might need a capability that minimizes data loss in case of catastrophic storage failure. Before adding the infrastructure to implement a backup mechanism, we will write a test that executes the following steps:

write a unique value somewhere in the database
run a backup
restore the backup to separate storage
verify that the unique value can be read from the newly restored database
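The four steps above could look roughly like this in code. The sketch assumes a SQLite database purely so that it is self-contained and runnable; the table name `markers`, the file names, and the helper functions are all made up for illustration. A real test would call whatever backup and restore tooling the actual infrastructure provides.

```python
import os
import sqlite3
import tempfile
import uuid


def run_backup(source: sqlite3.Connection, backup_path: str) -> None:
    """Copy the live database into a backup file."""
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)
    finally:
        target.close()


def restore_backup(backup_path: str, restored_path: str) -> sqlite3.Connection:
    """Restore the backup file into separate storage and return a connection."""
    backup_db = sqlite3.connect(backup_path)
    restored = sqlite3.connect(restored_path)
    try:
        backup_db.backup(restored)
    finally:
        backup_db.close()
    return restored


def check_backup_roundtrip() -> bool:
    marker = uuid.uuid4().hex  # the unique value
    with tempfile.TemporaryDirectory() as workdir:
        backup_path = os.path.join(workdir, "backup.db")
        # Write a unique value somewhere in the database.
        live = sqlite3.connect(os.path.join(workdir, "live.db"))
        live.execute("CREATE TABLE markers (value TEXT)")
        live.execute("INSERT INTO markers VALUES (?)", (marker,))
        live.commit()
        # Run a backup.
        run_backup(live, backup_path)
        live.close()
        # Restore the backup to separate storage.
        restored = restore_backup(backup_path, os.path.join(workdir, "restored.db"))
        try:
            # Verify that the unique value can be read from the restored database.
            row = restored.execute(
                "SELECT value FROM markers WHERE value = ?", (marker,)
            ).fetchone()
        finally:
            restored.close()
    return row is not None and row[0] == marker
```

Using a fresh random marker on every run means that the test cannot pass by accident because of data left over from an earlier backup.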

Using Test-Driven Architecture will provide us with a number of benefits similar to TDD:

Testability: We know that we can add tests for any part of the architecture.
Quality: We can extend the architecture and will receive fast and robust feedback on any unintended changes in existing capabilities.
Documentation: Tests provide correct descriptions of the capabilities of existing architectural elements.

By writing the architectural tests first, the team has some good incentives in place:

to never have large untested parts of the architecture
to never add more architecture elements than needed to provide the capabilities

Example architecture capabilities with tests

To illustrate how Test-Driven Architecture can work in practice, here is a list of typical architectural capabilities and the tests that we can write when we do Test-Driven Architecture:

Backup: as described above.
Message queue: Test that events are not lost under heavy load (e.g. when the receiving service starts responding with HTTP status 429), by posting a large number of events in a short period of time and verifying that all posted events are eventually registered.
Indexing: Test that search results are presented quickly, by running some random complex queries on a large data set and verifying that the query time stays within an agreed limit.
Logging: Test that relevant events are logged, by triggering some events (e.g. authentication failures) and verifying that those events are included in the logs.
Firewall: Test that relevant ports (e.g. the database port) are blocked, by attempting to connect to those ports and verifying that the connection fails.
Port scanning: Test that automated port scanning works, by temporarily deploying a firewall port opening and verifying that proper alerts are raised.
Auto scalability: Test that infrastructure is automatically scaled up and down, by temporarily posting a large number of requests and verifying that performance degrades far less than proportionally to the load, if at all.
Failover: Test database redundancy by terminating the leader database while posting requests and verifying that no requests are lost and that performance is only temporarily impacted.
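As one example, the message queue test could be sketched as below. Everything here is an assumption made to keep the example runnable in a single process: a small bounded `queue.Queue` plays the role of the message queue, a full queue plays the role of an HTTP 429 response, and the names `post_with_retry` and `run_load_test` are made up. A real test would post to the deployed queue and read back from the system that registers the events.

```python
import queue
import threading
from typing import List, Tuple


def post_with_retry(q: queue.Queue, event: int,
                    attempts: int = 1000, wait: float = 0.01) -> bool:
    """Post one event, retrying while the queue reports that it is full."""
    for _ in range(attempts):
        try:
            q.put(event, timeout=wait)
            return True
        except queue.Full:
            continue  # comparable to backing off after an HTTP 429
    return False


def run_load_test(event_count: int = 2000,
                  capacity: int = 10) -> Tuple[List[int], List[int]]:
    """Post many events quickly and return (posted, registered) events."""
    q: queue.Queue = queue.Queue(maxsize=capacity)
    registered: List[int] = []

    def consumer() -> None:
        while True:
            event = q.get()
            if event is None:  # sentinel: no more events will arrive
                break
            registered.append(event)

    worker = threading.Thread(target=consumer)
    worker.start()
    posted = [i for i in range(event_count) if post_with_retry(q, i)]
    q.put(None)
    worker.join()
    return posted, registered
```

The test then verifies that every posted event has been registered, for example with `assert sorted(registered) == posted`.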

Such tests are sometimes called "Fitness Functions", and Thoughtworks has proposed the term "Fitness Function-Driven Development" for something similar to what I call Test-Driven Architecture here. AWS also has a blog post on "Using Cloud Fitness Functions".

Conclusion

Most of these architectural tests can be implemented as normal system-level tests, end-to-end tests, load tests, performance tests, smoke tests, or similar. We already have plenty of tools available to help us write these kinds of tests. On existing projects we often already have some of these tests in place, and we can adopt a Test-Driven Architecture methodology by building on top of them.

Architectural tests can thus also be part of existing CI/CD pipelines, so that a change is blocked from being deployed to production if an architectural test fails, since a failing test suggests that the corresponding architectural capability is no longer in place.
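A pipeline step that blocks deployment could be as simple as the following sketch. The function names are hypothetical, and each check is assumed to be a callable that returns True when the capability is in place. The only convention relied on is that CI/CD tools generally treat a non-zero exit code as a failed step.

```python
from typing import Callable, Dict, List


def run_architecture_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A check that crashes counts as a missing capability.
            ok = False
        if not ok:
            failed.append(name)
    return failed


def exit_code(failed: List[str]) -> int:
    """Non-zero when any capability is missing, so the pipeline step fails."""
    return 1 if failed else 0
```

A pipeline script would build the dictionary of checks (for example `{"backup": ..., "firewall": ...}`), print the failed names, and finish with `sys.exit(exit_code(failed))`.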

We might consider in which environments to run these architectural tests. It will be useful to run them in a test environment, to be able to verify that an architectural change does not unintentionally impact an existing capability before deploying the change into production. It might also be useful to run at least some of them in the production environment, especially if there is reason to suspect that production and test environments are not 100% identical.

It would be nice to know what architectural coverage our tests provide. This could provide us with valuable feedback about architectural areas that have no or too few tests. I am not aware of any such tool, though, so please let me know if you have ideas for how this could be achieved.

Test-Driven Architecture is all about improving the way we work with architectural change: it is a structured methodology for architectural work that fits well with existing TDD practices and CI/CD tooling.

I would like to thank Lund og Bendsen for facilitating discussions about Test-Driven Architecture at a recent Software Architecture Open Space.
