By TEN BITCOMB

Signs That Your Tests Really Suck

Are your tests really helping you? We delude ourselves into thinking that any testing is good testing, but that's just not true.

When we are writing tests, it's common for us to value the wrong things. Tests are supposed to allow us to be more productive and confident in our work. Too often, I've seen the exact opposite.

Red Flags

Here, I'll go over the signs that your tests are bad. Later, I'll explain why we end up writing such awful tests, and how to avoid going down the wrong path in the first place.

The Tests Are Fragile

Do your tests fail spontaneously? Or worse, do they spontaneously fail only in TravisCI? Such tests are hard to diagnose, waste time, and lower developer confidence.

Can't or won't solve the underlying problem? Then disable or delete that specific test! It's shocking how often teams keep tests that fail even one run in ten. Tests that fail randomly tell you very little.

The most common reason I've seen tests fail randomly, when the problem can be understood, is a race condition that occurs when a multi-threaded service doesn't perform a change in a transactional way. Sometimes these conditions can be difficult to solve, but not always.
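As a minimal sketch of that kind of race (using a hypothetical `Counter` class, not taken from any particular codebase): the first test below asserts while a background thread may still be applying the change, so it passes or fails depending on scheduling; the second waits for the change to land and is deterministic.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0

    def increment_async(self):
        # The change lands on a background thread; callers have no way to
        # know when it has actually happened unless they hold onto the thread.
        thread = threading.Thread(target=self._apply)
        thread.start()
        return thread

    def _apply(self):
        self.value += 1

def test_increment_flaky():
    counter = Counter()
    counter.increment_async()
    # Races against the background thread: passes or fails depending on scheduling.
    assert counter.value == 1

def test_increment_deterministic():
    counter = Counter()
    counter.increment_async().join()  # wait for the change to complete first
    assert counter.value == 1
```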

They Are Slow

In my experience, tests are usually slow because they are too comprehensive. When a test scenario touches lines of code that aren't even its target, it effectively behaves like a full acceptance test, and that's a sign your application needs to be decoupled so that individual components can be tested in isolation.

Even worse is when too many test cases are actually end-to-end tests that test the entire system and its interaction with the external environment. If your tests rely heavily on the presence of a database, search index, web driver, etc., they will inevitably perform slower than tests that touch none of those things. Sometimes you need to test how code uses a database, but this shouldn't be the case most of the time.

You don't need to turn all your tests into unit tests either; instead, find the right balance between the end-to-end tests you genuinely need and integration tests, which focus on how components work with each other. Integration tests give you the biggest bang for your buck because they usually perform well and test what really matters, with neither too much isolation nor too much dependency.
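Here's a minimal sketch of what that balance can look like, using hypothetical `SignupService` and repository names: the component under test depends on a small repository interface, so most tests can run against an in-memory fake, and only a few need to exercise the real database adapter behind it.

```python
import pytest

class InMemoryUserRepo:
    def __init__(self):
        self.users = {}

    def add(self, user_id, email):
        self.users[user_id] = email

    def find_email(self, user_id):
        return self.users.get(user_id)

class SignupService:
    def __init__(self, repo):
        self.repo = repo  # anything with add() and find_email()

    def register(self, user_id, email):
        if self.repo.find_email(user_id) is not None:
            raise ValueError("user already exists")
        self.repo.add(user_id, email)

# Integration-style test: the service and repository work together,
# but no database is in the loop, so it runs in microseconds.
def test_register_rejects_duplicate_users():
    service = SignupService(InMemoryUserRepo())
    service.register(1, "a@example.com")
    with pytest.raises(ValueError):
        service.register(1, "b@example.com")
```

The real database adapter still deserves a few tests of its own, but it no longer has to sit in the path of every scenario.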

If only product owners realized how much time and money get wasted when tests run slowly. If a test suite takes 20+ minutes to run, which is common, that can easily add up to thousands of developer hours wasted per year.
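As a back-of-the-envelope illustration (the numbers here are assumptions, not measurements):

```python
# Assumed, illustrative numbers for a small team.
suite_minutes = 20        # one full run of the suite
runs_per_dev_per_day = 3  # local runs plus CI reruns
developers = 10
working_days = 230

wasted_hours = suite_minutes * runs_per_dev_per_day * developers * working_days / 60
print(wasted_hours)  # 2300.0 hours a year spent waiting on the suite
```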

The Tests Don't Reliably Catch Side Effects

Your tests will never catch every problem that comes up.

But do they often fail to catch problems caused by small code changes? If so, then why bother having tests at all?

You should have a high level of confidence in your tests. If you don't, the tests ought to be thrown out and rewritten from scratch.

The usual answer to this sort of flawed automated testing is to pair it with manual Quality Assurance (or, really, Quality Control) testing. QA has its advantages, but you are still sacrificing time and consistency so that you don't have to address fundamental problems.

QA is good to have, but it's not the answer to flawed automated tests. If your tests don't catch enough side effects, you need to sit down with your team and discuss how you are going to address testing going forward. Maybe you're relying too heavily on unit tests, or you aren't testing enough of the things that matter.
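To make "relying too heavily on unit tests" concrete, here's a small hypothetical sketch: the first test mocks its collaborator, so it will keep passing no matter how the collaborator's real behavior drifts; the second uses the real object and will catch the change.

```python
from unittest.mock import Mock

# Hypothetical collaborator whose behavior might change over time.
class TaxCalculator:
    def total_with_tax(self, amount):
        return round(amount * 1.2, 2)

def format_total(amount, calculator):
    return f"${calculator.total_with_tax(amount):.2f}"

# Over-mocked: pins the collaborator's answer, so the test stays green even
# if TaxCalculator starts returning cents, a different rate, or nonsense.
def test_format_total_with_mock():
    calculator = Mock()
    calculator.total_with_tax.return_value = 12.0
    assert format_total(10, calculator) == "$12.00"

# Uses the real collaborator: a change in TaxCalculator's contract shows up here.
def test_format_total_with_real_calculator():
    assert format_total(10, TaxCalculator()) == "$12.00"
```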

Why Our Tests Become Terrible

Not Writing Tests Early Enough

While I think TDD doesn't solve every problem, and that religiously adhering to such ideas can be an antipattern, writing your tests as early as possible is a good thing.

We are taught TDD as if it's a method you must stick to when writing every facet of an application. I've never found strict adherence to TDD to be a benefit, since you don't always know the nature of what you are building until you attempt to build it and see what happens.

Developers discover that rather quickly, conclude that "TDD sucks", and opt for TAD (test-after development). Even those who appreciate TDD might have it beaten into them by a boss that they should switch to TAD because TDD "wastes time".

If you write tests as early as possible, you will also catch poor architecture decisions earlier. Writing code with tests in mind encourages you to make that code testable, which usually means code that is more decoupled.
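A small hypothetical example of what "testable usually means decoupled" looks like: writing the test first makes it painful to hard-code a call to the system clock, so the dependency gets pushed out to the caller.

```python
from datetime import datetime, timezone

def is_weekend(now=None):
    # Accepting `now` as a parameter instead of always calling datetime.now()
    # inside keeps the function decoupled from the system clock -- and trivial
    # to test with a fixed date.
    now = now or datetime.now(timezone.utc)
    return now.weekday() >= 5  # Saturday is 5, Sunday is 6

def test_saturday_is_weekend():
    assert is_weekend(datetime(2024, 6, 1, tzinfo=timezone.utc))  # a Saturday

def test_wednesday_is_not_weekend():
    assert not is_weekend(datetime(2024, 6, 5, tzinfo=timezone.utc))  # a Wednesday
```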

Leaving Old Tests Lying Around

Another thing that leads to tests that suck is the very common belief that all tests should live forever. Code is always being added to an application, and therefore so are tests. If a year has gone by without a certain test ever failing, and the unit of code it covers almost never changes, why are you still keeping it around? That's developer seconds being wasted, and just as grains of sand make a sand castle, seconds become minutes, and minutes become hours.

This is especially true for teams practicing continuous integration or fail-fast development. If you are going to count on being able to quickly address problems, then tests that have passed forever are hardly helping you. Just disable or delete said tests, and if a problem comes up, write a new test in a way that's better than the one that you removed.

Unit tests are prime candidates for this sort of removal, as they test components in such isolation that it's fairly obvious whether they are obsolete. If a test case covers a function that hasn't been modified in years, the test is telling you nothing.

If you lack the confidence that you can remove a test and have your application continue to improve without new problems, then you ought to question how you are writing applications.

The Chase for Code Coverage

Code coverage is a very misunderstood testing metric.

Knowing whether your tests touch every line of your code is great and all, but coverage can't tell you anything about the quality of the tests exercising a particular line.
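A tiny hypothetical illustration: both of the tests below give `apply_discount` 100% line coverage, but only the second one would fail if the math broke.

```python
import pytest

def apply_discount(price, percent):
    if percent < 0 or percent > 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (100 - percent) / 100

# Coverage tools count every line of apply_discount as exercised by this test,
# even though it would still pass if the function returned garbage.
def test_touches_every_line_but_checks_nothing():
    apply_discount(200, 25)
    try:
        apply_discount(200, 150)
    except ValueError:
        pass

# Same coverage number, but this test actually pins down the behavior.
def test_checks_the_behavior():
    assert apply_discount(200, 25) == 150.0
    with pytest.raises(ValueError):
        apply_discount(200, 150)
```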

Getting a high percentage of code coverage can be tough when writing very granular tests. The quickest win is to write lots of grand-scale end-to-end and acceptance tests, because they are the most likely to "touch" many parts of an application. That doesn't mean your tests are testing your code well, and, more importantly, such tests can be slow and difficult to maintain.

Code coverage is better treated as a tool for finding code that qualifies for removal; deleting unused code is the best way to achieve high coverage. Rather than recording and replaying HTTP requests, automating browser activity in a web driver, or performing tricks with a test database just to cover a stray line, consider whether that code should be there at all.