Details
-
Task
-
Resolution: Done
-
P2: Important
-
None
-
None
-
0bc8b64629947cd5dd7bac30fd5c2ed8f79587e4
Description
Historically we have had a lot of problems with unstable tests. The testrunner can be extended to aid in our handling of unstable tests.
I suggest the following logic (optional, of course):
- if a test passes, good: nothing more needs to be done.
- if a test fails, run it a second time:
  - if it fails the second time, consider it a stable failure
  - if it passes the second time, run it a third time:
    - the third result determines the pass/fail status
In other words, if a test does not pass on the first attempt, a "best two out of three" method is used.
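The retry logic above can be sketched as follows. This is only an illustration, not the testrunner's actual code: `run_with_retries`, the `Outcome` names, and the `run_test` callable (assumed to return `True` on a pass) are all hypothetical.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    """Hypothetical result categories; the names are not from the testrunner."""
    PASS = "passed on first attempt"
    STABLE_FAIL = "failed twice in a row"
    FLAKY_PASS = "unstable, third attempt passed"
    FLAKY_FAIL = "unstable, third attempt failed"


def run_with_retries(run_test: Callable[[], bool]) -> Outcome:
    """Apply the proposed logic; run_test is assumed to return True on a pass."""
    if run_test():
        # First attempt passed: nothing more to be done.
        return Outcome.PASS
    if not run_test():
        # Failed on both the first and second attempt: stable failure.
        return Outcome.STABLE_FAIL
    # Failed, then passed: the test is unstable and the third run decides.
    return Outcome.FLAKY_PASS if run_test() else Outcome.FLAKY_FAIL
```

Keeping the two unstable outcomes separate from a plain pass or a stable failure means the caller can still report the instability even when the third attempt passes.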
This mechanism allows stability problems to be hidden or indefinitely postponed, so we have to be careful about how we use it.
Most of our testing tries to determine whether an incoming change introduces a regression. In this case, we should detect unstable tests, warn loudly about them, and fix up the results so that the instability alone does not block the change. The goal of this testing is to make sure new changes are good, and it does not make sense to semi-randomly block incoming changes because some tests are unstable.
We also have testing that tries to ascertain the product status, e.g. to verify that the code is ready for release. At this level, we should detect unstable tests and treat them as failures.
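The two policies could be expressed as a single decision function along these lines. This is a rough sketch under stated assumptions: the `Mode` names and the `verdict` function are hypothetical, and treating the last attempt as the "fixed up" result for incoming changes is one possible reading of the proposal, not a confirmed design.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Mode(Enum):
    """Hypothetical names for the two kinds of testing described above."""
    REGRESSION_CHECK = "testing an incoming change"
    RELEASE_CHECK = "ascertaining product status"


def verdict(attempts: List[bool], mode: Mode) -> Tuple[bool, Optional[str]]:
    """Return (passed, warning) for one test, given its attempt results in order."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    # A test is unstable if its attempts did not all agree.
    unstable = len(set(attempts)) > 1
    if not unstable:
        return attempts[0], None
    warning = f"UNSTABLE TEST: attempts were {attempts}"
    if mode is Mode.RELEASE_CHECK:
        # Release-level testing: instability itself counts as a failure.
        return False, warning
    # Incoming-change testing: warn loudly, but let the last attempt decide
    # (assumption: this is what "fix up the results" amounts to).
    return attempts[-1], warning
```

For example, `verdict([False, True, True], Mode.REGRESSION_CHECK)` yields a pass with a warning, while the same attempts under `Mode.RELEASE_CHECK` yield a failure with the same warning.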