Tests are extremely important to the health and stability of Streetmix. We have established some systems and processes to help ensure the ongoing reliability of our platform.
We do not have a strict test-driven development (TDD) methodology, although individual engineers may use this approach if it’s the development pattern they’re most comfortable with. Also, while we do measure code coverage, our goal is not necessarily to reach 100%. We’re looking for “enough” coverage to have confidence that new features or refactoring will not introduce new bugs, which is admittedly a subjective measure. As Guillermo Rauch says, “Write tests. Not too many. Mostly integration.”
We did not have any test infrastructure in the early phases of Streetmix. Tests have been added over time and are constantly improving. This document reflects our current thinking about how we should test, but you’ll find many places in the codebase where tests are incomplete or non-existent. We could always use some help with writing tests!
Running tests locally
When testing in a local development environment, only linting and unit tests are run.
Full integration tests happen in our continuous integration infrastructure. You’re not required to run these locally, but if you’d like, you can do so with the commands sketched below.
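For example (a sketch assuming the npm scripts defined in the repository’s package.json; the full-suite script name in particular is hypothetical):

# Lint and run unit tests (the default local test task)
npm test

# Run the full suite, including integration tests (hypothetical script name)
npm run test:full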
Unit and integration tests
This section is incomplete and should be expanded.
Our primary test framework is the Jest test runner with React Testing Library (RTL). (These do not do the same thing and are not interchangeable; the two work closely together to provide a full unit and integration test environment.) A number of resources already exist that fully document why and how we use them; see the list below, which is followed by a short example.
Our goal is to be as close as possible to community-designed “best practice” in order to keep our tests easy to understand; do not do anything exotic in these tests if you can avoid it.
- Introducing the react-testing-library [Kent C. Dodds] - why RTL instead of Enzyme?
- How to use React Testing Library Tutorial [Robin Wieruch] - start here for the basics
- Common mistakes with React Testing Library [Kent C. Dodds]
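To make this concrete, here is a minimal sketch of a Jest + RTL test. The Button component, its props, and the file name are hypothetical, for illustration only:

// Button.test.jsx: a minimal sketch, not an actual Streetmix test
import React from 'react'
import { render, screen, fireEvent } from '@testing-library/react'
import Button from './Button' // hypothetical component under test

describe('Button', () => {
  it('renders its label and responds to clicks', () => {
    const handleClick = jest.fn()
    render(<Button label="Save" onClick={handleClick} />)

    // Query by accessible role and name, per RTL guidance
    fireEvent.click(screen.getByRole('button', { name: 'Save' }))
    expect(handleClick).toHaveBeenCalledTimes(1)
  })
})

Note that the test queries by accessible role and visible text rather than by CSS selector or component internals; this is the RTL approach of testing what the user actually experiences.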
End-to-end tests
This section is incomplete and should be expanded.
We use Cypress sparingly for end-to-end tests. We do eventually want more tests in Cypress, where appropriate; end-to-end tests can then replace the unit or integration tests that cover the same behavior.
Cypress only runs in our automated continuous integration test environment by default, but can also be run locally:
npm run cypress:run
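A Cypress test is written in a similar describe/it style. Here is a minimal sketch; the route and assertion are hypothetical, for illustration only:

// A minimal sketch of a Cypress end-to-end test
describe('street editor', () => {
  it('loads the application', () => {
    cy.visit('/') // hypothetical route
    cy.contains('Streetmix').should('be.visible')
  })
})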
In the future, we may adopt Prettier (or prettier-standard) to automatically format code. We have not introduced it yet because doing so across the entire codebase would be disruptive to existing work. We currently use Prettier to lint and format JSON. If someone wants to champion adoption of Prettier, please get in touch.
PropTypes is a runtime typechecking library used for React development. Because it is a runtime checker, PropTypes will only log errors to the console when code runs in the browser or in test suites. (The PropTypes library is not compiled into production code.)
We currently enforce using PropTypes for React components in development. This means that each React component must declare all of its props and the types of values each prop accepts. The benefit of this approach is that React components self-document the props they accept. Sometimes a prop can be overloaded with multiple types, but this is generally discouraged if you can avoid it.
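For example (a hypothetical component, for illustration only):

import React from 'react'
import PropTypes from 'prop-types'

function Label ({ text, width }) {
  return <span style={{ width }}>{text}</span>
}

Label.propTypes = {
  text: PropTypes.string.isRequired,
  // An overloaded prop accepting multiple types (discouraged, but possible)
  width: PropTypes.oneOfType([PropTypes.string, PropTypes.number])
}

export default Label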
We have experimented with TypeScript in auxiliary codebases, but we’ve not incorporated it into Streetmix itself. Because we already compile code with Babel, adopting TypeScript piecemeal is doable. However, we have not yet run into a situation where the solution is specifically to adopt TypeScript. That being said, if and when a good case can be made for adopting it, we will likely jump on board. If React components are migrated to TypeScript, it will supersede the use of PropTypes.
We do not currently implement device testing, but this is on our to-do list. We have a BrowserStack account for this purpose.
Continuous integration (CI)
We use Travis CI to automatically run tests for every commit and pull request to our repository.
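Travis CI reads its configuration from a .travis.yml file at the repository root. As a rough sketch (this is not Streetmix’s actual configuration; the Node version and scripts are assumptions):

language: node_js
node_js:
  - '12' # hypothetical Node version
install:
  - npm ci
script:
  - npm test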
Continuous integration testing is, unfortunately, not deterministic. Because there are various moving parts and third-party services involved, CI can sometimes fail even though the code runs perfectly locally (or even in production)! When CI fails, we need to examine why. Passing CI is almost always required to maintain confidence that a deploy will not break the production site.
Once you’ve determined that CI is failing for reasons other than a bug or linting problem, here are some common tips for addressing issues with the CI infrastructure.
- Try running the build again. Because CI isn’t deterministic, sometimes re-running it with no changes will cause it to pass. This is commonly the issue when the Selenium smoke test fails.
- Check the status of third-party services. Sometimes Travis CI itself has issues, so be sure to check the Travis CI status page as well.
CI can be skipped by appending [skip ci] to a commit message.
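For example:

git commit -m "Update documentation [skip ci]"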
Every commit or merged pull request to the main branch that passes CI is automatically deployed to the staging server.
Currently, there is no automatic deployment to the production server. We’ve noticed that each deploy introduces a small amount of lag while the server software restarts. As a result, we now manually trigger deployments to the production server.
In addition to continuous integration, we use some third-party services to keep an eye on code quality and test coverage. These services should be considered “code smell” detectors, but take their findings with a grain of salt. They are not required to pass before merging pull requests.
CodeClimate measures technical debt, or the long-term maintainability and readability of code. It applies heuristics to detect and track “code smells,” which are opportunities to refactor code or fix potential bugs. A CodeClimate review is triggered automatically on every pull request, but some of the thresholds it uses are quite arbitrary. Here are some of the issues it raises, and how we address them, in order of increasing severity (as they apply to Streetmix):
- Lines of code. CodeClimate triggers a warning when functions and modules exceed an arbitrary line limit. This means there is a potential opportunity to separate concerns, but we will never enforce this, since we don’t want to encourage “code golf” or quick workarounds instead of actually taking the time to separate logic. If something can be refactored into smaller pieces but can’t be prioritized immediately, add a TODO comment instead. If something doesn’t make sense to shorten, mark the issue as Wontfix.
- Duplicate code. CodeClimate triggers a warning when it detects code that looks the same as other code elsewhere. This can be an opportunity to refactor, but more often than not, CodeClimate is seeing similar-looking boilerplate code or patterns. In that case, mark the issue as Invalid.
- Cognitive complexity. CodeClimate triggers a warning when a function contains too many conditional statements, resulting in complex branching or looping code. Not all code can be made simpler, but you may want to consider whether it can be written differently. However, use your best judgment here. If you don’t agree with CodeClimate’s assessment, mark the issue as Wontfix.
- TODOs. CodeClimate tracks when a FIXME comment is written in the code. Because this is a developer’s own judgment call, it takes priority over other issues and should be addressed in the future. Never mark this as Wontfix or Invalid. If it’s no longer valid, remove the FIXME comment from the code instead.
Issues that should be addressed in the future, but can’t or won’t be addressed immediately, should be marked with Confirmed.
In spite of CodeClimate’s warnings, reviewers may approve a pull request even if the issues CodeClimate raises are not addressed right away.
Codecov measures code coverage, the percentage of code exercised by at least one test suite. This percentage is a commonly used metric for showing how complete a project’s test suites are. However, the percentage itself is not necessarily a measure of test quality. As a result, while we strive for higher coverage, 100% is not the goal.
A Codecov review is triggered automatically on every pull request, which allows a reviewer to see at a glance whether a pull request increases or decreases overall code coverage. It fails if a large amount of new code is added without increasing a corresponding amount of test coverage.
Because our test suite coverage is quite low at the moment, we prefer that all new and refactored code come with test coverage.
These additional resources from the developer community help guide our approach to testing. This is not an exhaustive list, and we’ll keep updating this over time.
- Write tests. Not too many. Mostly integration. [Kent C. Dodds], on Guillermo Rauch’s tweet.