Alexander Klein
Director Software Engineering

Cypress Best Practices & Troubleshooting

We at Taxdoo use an array of testing techniques: unit tests, integration tests, end-2-end tests, contract tests, browser-based tests, tests against Docker environments as well as tests against real AWS environments, canaries, … and of course some manual testing when automating a test simply isn’t feasible.

Looking at the possible techniques, browser-based testing often gets a bad reputation as being slow, fragile, and in need of constant maintenance. It is, however, the technique that can provide the strongest guarantees, and it is the easiest to understand (and to value) for non-technical stakeholders. For example, a browser-based test can ensure that certain critical functionality always works, taking even browser-specific quirks and infrastructure-specific subtleties into account. While browser-based tests have the major benefit of being as close to actual user behaviour as possible, they have the major downside that most engineers are not familiar with them: they pose a new set of tools and techniques that an engineer has to master, whereas unit tests, for example, are much closer to the coding techniques an engineer uses anyway.

In order to minimise effort and avoid the typical challenges of end-2-end tests, we have developed a set of best practices and troubleshooting tricks that we are happy to share here.

Maintainability of Browser-Based Tests

A core principle of automated testing is to create tests that test behaviour, not implementation.

This is particularly true for browser-based tests, as UIs are often subject to change, whether through deliberate changes in design and layout or incidental changes from updating the component libraries in use. Many developers have had bad experiences with browser-based tests because they seem to break easily as soon as someone changes a control or the timing of underlying backend functionality changes. This, however, is not an inherent problem of browser-based tests, but a result of using them incorrectly.

In fact, browser-based tests can be among the most stable tests, requiring the least maintenance while large refactorings of the underlying implementation are performed, precisely because they act on the “top layer” and can stay unaware of large architectural or implementation changes. Changes in APIs, serialisation formats, code structure/organisation, database schemas and so forth often require adapting unit tests. But when used correctly, browser-based tests only require updating when the actual behaviour changes.

There are two key principles to achieving stable browser-based tests that many software engineers get wrong:

  1. The correct usage of the data-testid attribute to identify elements of a page instead of using HTML id attributes, CSS classes, or textual content.
  2. Writing tests in an event-driven way that is independent of the actual timing of the application.

The first is rather obvious: the exact HTML tags and attributes used to render an application are often in flux while the application’s design evolves. It is quite common to move controls to different places, change margins, etc. Similarly, user-facing texts get updated and typos corrected. In these cases, we want the tests to remain valid and run without the need for change. If, on the other hand, a text control is replaced with a drop-down, the actual behaviour of the application changes and it is to be expected that browser-based tests need to be adapted. Still, when using data-testids, only the test code that manipulates the control has to be adapted rather than the code that locates the element, making what needs to change much more transparent.
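
To make the difference concrete, here is a minimal sketch; the CSS class, the visible text, and the test id "submit-order" are hypothetical examples:

    // brittle: breaks when styling or user-facing copy changes
    cy.get('.btn-primary').click();
    cy.contains('Submit order').click();

    // stable: tied to behaviour via a dedicated test id
    cy.get('[data-testid="submit-order"]').click();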

Using an event-driven approach to testing seems to be a much less commonly known practice. Many browser-based tests incorrectly use sleeps to wait for a predetermined amount of time before continuing. This not only slows tests down unnecessarily but can also be a source of flakiness, as tests might get executed on different CI runner machines from run to run.

In an earlier generation of browser-based testing tools (such as Puppeteer), commands often waited for network traffic to stop as an indicator that an operation, such as pressing a form submit button, was completed and the test could continue. This caused numerous hard-to-diagnose timing issues.

A much better solution is to wait for actual application events, such as a confirmation notification shown to the user or the changed data becoming visible in the UI. If that is not possible, our recommendation is to use Cypress intercepts to define an alias for the relevant requests, and then wait for the aliased XHR to complete. This is probably the most robust way to design tests, though it has the downside of needing updates if the request’s API path changes (which luckily happens only infrequently).
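
As an illustration, a minimal sketch of such an intercept-based wait; the API path and test ids are hypothetical:

    // alias the request before triggering it
    cy.intercept('POST', '/api/filings').as('createFiling');
    cy.get('[data-testid="submit-filing"]').click();

    // continue only once the aliased request has completed
    cy.wait('@createFiling');
    cy.get('[data-testid="filing-confirmation"]').should('be.visible');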

Using Custom Commands & APIs to Create Test Data

A technique that has shown great promise internally at Taxdoo is to have tests create their own test data using our internal service APIs and to wrap this in Cypress custom commands. From a test perspective, this leads to easy-to-understand code, e.g.

    cy.createClient(…);
    cy.addToAccount(…);
    cy.addFiling(…);

but it also keeps the tests easily maintainable. Since the commands just fire simple GraphQL queries against our internal APIs, changes in database schemas or similar, which would break tests that use SQL inserts to add test data, are no longer an issue. Any new logic added to the client-creation flow is automatically exercised every time the test runs, since the command goes through the internal APIs.

Compared to direct database insertion of test data, the effects on test runtime are mostly negligible and as a nice side effect, the tests also notice if there are any unexpected breaking changes in our APIs, something that can be very relevant because we expose some of our APIs to our customers as well.
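
As an illustration, here is a minimal sketch of how such a command could be registered; the GraphQL endpoint, mutation, and field names are hypothetical, not our actual API:

    // commands.ts – registers a custom command backed by an internal API
    declare global {
      namespace Cypress {
        interface Chainable {
          // creates a client via the internal API and yields its id
          createClient(name: string): Chainable<string>;
        }
      }
    }

    Cypress.Commands.add('createClient', (name: string) => {
      return cy
        .request('POST', '/graphql', {
          query: 'mutation ($name: String!) { createClient(name: $name) { id } }',
          variables: { name },
        })
        .its('body.data.createClient.id');
    });

    export {};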

To wrap up this post, here is a list of dos and don’ts we have developed internally for when we write tests using Cypress.

Things to Do

  • Keep defaults: Be wary of changing Cypress defaults lightly, such as disabling test isolation, extending timeouts beyond their default values, or changing the default folder structure: most of these defaults exist for a reason, and the majority of applications will work fine with them. Having to change a Cypress default can be an indicator that the application does something unusual and would benefit from moving to common web application standards.
  • Don’t assert on things that are checked implicitly: There is no need to validate that the test has navigated to the right URL if it then continues to search for and click on specific elements anyway. Adding such asserts just adds more points that can break during refactorings that don’t actually change the application’s behaviour. Use .should(…) for the things that really matter and are at the core of the test. That makes the intent clearer and the test easier to read.
  • Have tests create their own data: Don’t rely on predefined data in some database (e.g. a specific test user in a test environment) but – where feasible – have tests create their own data. This a) allows the tests to actually modify the data without side effects on other tests and b) avoids the danger of some external effect changing the predefined data and thus failing the tests for unrelated reasons.
  • Fixtures & Config: Put predefined values such as environment variables or subsets of test data into their appropriate places in the Cypress structure instead of hard coding them in the tests themselves. This makes it easier to run the same test in different environments, such as locally, in a Docker environment, and in an AWS test environment.
  • Data test IDs: Use the HTML attribute data-testid (be careful about consistent spelling: not test-data-id, not data-test-id) to mark frontend elements and then reference them from the Cypress tests. Values should be defined as constants that both the frontend and the tests use instead of duplicating the string in the tests (see the sketch after this list).
  • Create test data via APIs: Don’t do raw inserts into databases if it can be avoided. Using APIs a) is usually less subject to change and b) also nicely verifies that the underlying service doesn’t introduce breaking changes without anyone noticing.
  • Clear data before the test run: If the test is running not only in a temporary Docker container but also in a persistent AWS environment, it must clear the data it created before the next execution. The reason to clear before instead of after is that this makes debugging easier when a flaky test fails, as one can see the state the test was in when it aborted.
  • Integrate into the CI pipeline: When using the Cypress junit reporter, CI systems such as GitLab can be configured to pick up the results automatically and show them in the pipeline (see the configuration sketch after this list). The screenshots Cypress takes on error should also be stored as artefacts of the pipeline run. This makes searching for the cause of errors much faster.
  • Write tests so they can be run anywhere: Ideally, the difference between running a test against a local Docker environment and running it against our AWS test environment is just a change of the Cypress configuration / environment variables.
  • Don’t use intercepts to return fake data: While Cypress’ capability to intercept requests and return predefined data is very convenient, it adds more places that have to be adapted if the underlying API changes.
  • Use TypeScript instead of JavaScript: For writing tests, it usually doesn’t make much of a difference in practice. However, when trying to establish a culture of “every engineer writes tests” (as opposed to only dedicated QA engineers), using a more modern language like TypeScript increases the chance that engineers will accept tests as part of their ownership.
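
As an illustration of sharing test ids as constants, here is a minimal sketch; the module path and id values are hypothetical:

    // testIds.ts – imported by both the frontend components and the tests
    export const TEST_IDS = {
      submitFiling: 'submit-filing',
      filingConfirmation: 'filing-confirmation',
    } as const;

    // in a Cypress test
    import { TEST_IDS } from '../../src/testIds';
    cy.get(`[data-testid="${TEST_IDS.submitFiling}"]`).click();

And a sketch of a cypress.config.ts covering the reporter and environment points above; the environment variable name and file paths are assumptions for illustration:

    // cypress.config.ts
    import { defineConfig } from 'cypress';

    export default defineConfig({
      e2e: {
        // switch environments without touching the tests themselves
        baseUrl: process.env.E2E_BASE_URL ?? 'http://localhost:3000',
      },
      // junit output that CI systems such as GitLab can pick up
      reporter: 'junit',
      reporterOptions: {
        mochaFile: 'results/cypress-[hash].xml',
      },
    });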

Things to Avoid

  • Do not use cy.wait for fixed amounts of time: Timing will change depending on where the tests run. One CI runner might be faster, another slower, your local machine completely different. Therefore, things like “wait for 5 sec” must be avoided. Instead, use cy.intercept(…) to alias the relevant request and cy.wait(…) on that alias, so the test continues only once the request has finished.
  • No side effects: Tests should be able to run in any order or in parallel. If per-test setup would be prohibitively expensive, have the test suite (e.g. a single Cypress spec file) create the data once and run multiple tests (= it(…)) against it (see the sketch after this list).
  • Increasing timeouts: Be careful about increasing timeouts. If a long timeout is needed, this is usually an indicator that the application/frontend itself is slower than a user expects.
  • Too many custom commands: Don’t overdo it. Usually only very few custom commands are actually needed, and wrapping basic functionality into commands makes the code harder to read for people who don’t know what the commands do.
  • Don’t check for specific text: Strings presented to users often change, for example when typos are fixed or translations are adapted. Define a data-testid on the element that shows the text and test for that instead.
  • Don’t check against URLs: URLs are always subject to change. Instead, simply check for the content that is presented on the page.
  • Don’t use cypress-if: This is an anti-pattern, as the Cypress documentation also states.
  • Don’t assume a certain cookie is set: Unless the test itself sets it, assume a blank browser with empty local storage and no cookies. For example, the cookie consent form might show up depending on the environment, the test might have to switch to a particular user language, …
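
A minimal sketch of suite-level test data as mentioned in the list above, assuming the hypothetical cy.createClient command from earlier and made-up routes and test ids:

    describe('client overview', () => {
      let clientId: string;

      before(() => {
        // create shared data once via the API-backed custom command
        cy.createClient('Suite Client').then((id) => {
          clientId = id;
        });
      });

      it('lists the newly created client', () => {
        cy.visit('/clients');
        cy.get(`[data-testid="client-row-${clientId}"]`).should('exist');
      });

      it('opens the client detail page', () => {
        cy.visit(`/clients/${clientId}`);
        cy.get('[data-testid="client-name"]').should('be.visible');
      });
    });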

Conclusion

End-2-end tests are a powerful tool that is often underutilised because of misconceptions and suboptimal implementations. To employ them successfully in a project, one needs to make sure certain fundamental principles are adhered to, as otherwise they quickly get a bad reputation and will ultimately be rejected by much of the engineering team. Used correctly, they can be fast, provide strong, easy-to-understand guarantees, and require little maintenance. We hope the ideas in this post inspire you to use them if you haven’t done so far, or to reconsider them if you had poor results with them in the past.
