The 5th Agile and Automation Days conference took place on 28-29 October 2019 in the Museum of the Second World War in Gdańsk. The theme for this year was “Delivery at Speed”, i.e. delivering fast while not jeopardizing quality.
We went there as Kainos Smart representatives in a group of 4: Anna Rogalska, Brendan Allen, Josh Galloway and Piotr Boho. The Museum building itself is amazing: it contains a lot of free space (well filled during the conference!), a pretty nice auditorium and a slightly smaller but still very nice cinema room, so talks were held simultaneously in two places. The format of this year’s conference consisted of a mix of talks, demos and workshops. In between these, we were provided with some tasty dishes, cakes and drinks to keep us fueled for the 2 days ahead.
Most of the speakers this year tried to show us different angles on how to achieve successful delivery at speed. They all came to the same conclusion, which can be summarized as follows:
- Quality is everyone’s responsibility (not so new, huh?)
- Observability – we should observe and learn from our product, analyzing the logs and events; this will help us to estimate risks better, test better, and expect the unexpected :)
- Testability / Testable Architecture (reusable components, strict process for technical design, well documented)
- Controllability – control risk exposure for instance using feature toggles, canary releases or A/B testing
It is important to remember that failure is inevitable – we should be prepared for this and work on resilience in our processes and tests.
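The controllability point above can be illustrated with a minimal feature-toggle sketch. All names here (the flag store, `is_enabled`, the checkout flows) are invented for illustration, not taken from any speaker's material; in production the flags would live in a config service rather than a dict:

```python
# Minimal feature-toggle sketch (hypothetical names throughout).
# Risky new code ships "dark" behind a flag and is enabled when we choose,
# which limits risk exposure without delaying the release itself.
FLAGS = {"new-checkout": False}

def is_enabled(flag: str) -> bool:
    """Return whether a feature flag is on; unknown flags default to off."""
    return FLAGS.get(flag, False)

def checkout(cart: list) -> str:
    if is_enabled("new-checkout"):
        return f"new flow: {len(cart)} items"  # new, riskier code path
    return f"old flow: {len(cart)} items"      # stable fallback

# Flipping the flag switches behaviour without a redeploy:
FLAGS["new-checkout"] = True
```

A canary release is then just a refinement of `is_enabled` that returns `True` for a small percentage of users instead of all of them.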
Fast Feedback From Automated Tests
The major theme touched on throughout much of the A&A Days conference was the concept that quality and speed go hand in hand. ‘Speed to Market’ is considered the new quality attribute and it’s now the case of ‘the fast eating the slow rather than the large eating the small’ according to the keynote speaker, Anne-Marie Charret. Therefore, it is the role of our automated checks to assist this high-speed delivery rather than become an obstacle in our processes.
So how do we ensure tests offer fast, reliable feedback? Firstly, we can maximize the value of our automated checks by identifying common components, using parallelization to reduce execution times and optimizing our scheduling to ensure quicker execution. We can avoid creating bottlenecks to speedy delivery by ensuring our processes are not bloated; in other words unnecessary testing and approvals. We can also question the value of certain automated checks and remove ones that we feel provide little to no value. Finally, we can remove flaky tests from our suites and run them independently to avoid unnecessary re-executions.
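One way to read that last point: tag known-flaky checks and keep them out of the main pipeline, so a red build always means a real failure. A toy sketch of such a quarantine list (the test names and tagging scheme are invented for illustration; real suites would use their framework's own markers):

```python
# Hypothetical quarantine sketch: flaky checks are tagged so the main suite
# gives fast, reliable feedback, while flaky ones run in a separate pass
# with retries instead of forcing re-executions of everything.
CHECKS = {
    "test_login":  {"flaky": False},
    "test_search": {"flaky": False},
    "test_upload": {"flaky": True},  # known-flaky: depends on the network
}

def select(checks: dict, flaky: bool) -> list:
    """Return the sorted names of checks matching the given flakiness tag."""
    return sorted(name for name, meta in checks.items()
                  if meta["flaky"] == flaky)

main_suite = select(CHECKS, flaky=False)  # blocks the pipeline on failure
quarantine = select(CHECKS, flaky=True)   # runs independently, retried
```

The same idea exists natively in most test frameworks as markers or tags; the point is the separation, not the mechanism.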
Can code reach production without exhaustive testing (but with monitoring tools and a rollback mechanism in place)?
A month has passed since the conference and I’m back in my own patch: the DevOps team and performance tests in Smart. I still have some of the conference’s key questions on my mind:
- Is our monitoring good enough to detect a production problem before it starts to grow?
- Are rollback steps clear enough and ready to revert any kind of release changes?
- Are we OK with restoring a database from a snapshot (losing newest data) if needed?
While we are in weekly-release mode, it may be too painful to revert all of a week’s changes from production and postpone delivery. In case of a major issue on the client side, our strategy is based on a midweek hotfix, and that’s OK. Fast rollback works best for small sets of changes, ideally ones not affecting the database.
We can leverage the monitoring/rollback approach once we move to a more frequent delivery model that can allow us to deliver features separately rather than bundled as a work of several teams.
As for must-have monitoring, we need to make sure not only that we have it, but also that we use it and that our reaction time is as fast as it can be: alerts flag abnormal behaviours, analysis is quick, and the people who can make the call are informed. Some products, e.g. Atlassian JIRA, have this process automated. Based on monitoring, they have a mechanism that detects which particular commit caused an issue and rolls it back from production.
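At its simplest, “alerts flagging abnormal behaviours” can be a threshold on recent metrics. A deliberately naive sketch, assuming response-time samples as input (real monitoring stacks use far richer detectors than this):

```python
from statistics import mean, stdev

def is_anomalous(history: list, latest: float, k: float = 3.0) -> bool:
    """Flag a metric sample more than k standard deviations from the
    recent mean. `history` is the baseline window of recent samples."""
    mu, sigma = mean(history), stdev(history)
    return abs(latest - mu) > k * sigma

# Hypothetical response times (ms) over the last few minutes:
baseline = [100, 102, 98, 101, 99]
```

With that baseline, a new sample of 200 ms would trip the alert while 101 ms would not; the hard part in practice is tuning `k` and the window so the alert fires before the problem starts to grow.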
After all, the whole process must have been… tested before they started to rely on it ;)
Experiment and have fun :)
The second day was workshop time. Geoffrey and Mark from the Netherlands introduced us to resilience testing. Using Gatling simulations as load and stress tests, we were able to watch what happens to an application (and to its test results) when we interfere with normal operations by exhausting shared resources such as CPU time or by randomly terminating server instances. We had VMs with auto-scaling set up on Kubernetes; the tool used to produce the abnormal behaviour was Netflix’s Simian Army.
The whole exercise gave me a better understanding of resilience testing, which examines how an application behaves once an abnormal situation occurs and how (and whether) it returns to normality after the unusual factor goes away. Proper handling of strange situations can save money in production, and our reputation too.
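A tiny sketch of that “back to normality” idea: a caller that retries through a transient outage, so the failure never becomes user-visible. The failing dependency here is simulated; the names are invented for illustration:

```python
import time

class FlakyDependency:
    """Simulated service that is down for its first few calls, then
    recovers -- the 'abnormal factor' going away on its own."""
    def __init__(self, failures: int = 2):
        self.remaining_failures = failures

    def call(self) -> str:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("dependency down")
        return "ok"

def call_with_retry(fn, attempts: int = 5, delay: float = 0.01):
    """Retry a transient failure so short outages stay invisible."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise          # outage outlasted our patience: fail loudly
            time.sleep(delay)  # brief back-off before the next try
```

A resilience test then becomes: inject the outage, and assert that the system still answers “ok” once the outage ends, which is exactly what the workshop demonstrated at infrastructure scale.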
This also leads me to the idea of another beneficial experiment that could answer questions about the quality of our monitoring: let’s prove that our environment monitoring alerts can detect an abnormal factor. This could enhance the Disaster Recovery tests periodically conducted by the Smart DevOps team.
To sum up, the conference was amazing and we can recommend it to everyone, whether you are a Tester, a Developer or a WebOps Engineer. This year we have learned how to approach inevitable problems that can always surprise us when we least expect them. We are now equipped with tools that will help us prepare for such situations, though we hope we will never have a chance to see them in action. :)
Automated Tests aren’t Always the Answer
A lot of the speakers throughout the conference explained the benefits of automation when used correctly, though most shared the view that if automation is used as the answer rather than the question, it causes problems. Tests don’t always show the correct result: a pass doesn’t always mean a pass. This can cause issues because until a test fails, no one will investigate it. The answer to problems like this is keeping tests simple and making good use of assertions. Flaky tests can also be ignored when they might in fact be showing genuine errors. Automation isn’t always cost-effective or necessary; we need to use our skills and knowledge to work out when automation brings benefit to a project. Automation doesn’t explore beyond what it’s coded to look for – “Exploration is Human”. Therefore, if you write good automated tests, you’ll get good results.
Of course, automated tests bring great benefits to the teams and projects that use them. They can reduce the cost and duration of testing, leaving more time for testers to do effective testing and planning. Test automation must be done right, and its metrics should then be measured and communicated to the team. It’s all about balance: automated tests are a great tool to have, but you must never forget that they aren’t always the solution. They are great if used correctly to improve testing coverage. Test automation needs to be efficient and sustainable, so we should make the components within our automated tests reusable; this increases code quality and also keeps our tests simple. All projects should be creating high-quality automated tests, rather than judging their coverage by the number of tests.
Agile and Automation Days was a fantastic conference, which was well run from start to finish. The conference was filled full of interesting talks and people. There was a variety of different topics discussed meaning there was something for everyone. It was a privilege to attend and it is a conference which will only improve each year. The conference was attended by a worldwide audience and conferences like this are a good way to share ideas throughout the testing community.
Authors: Anna Rogalska, Brendan Allen, Josh Galloway and Piotr Boho