Our systems always depend on other systems and services and thus may and will be subject to failures – network glitches, dropped connections, load spikes, deadlocks, slow or crashed subsystems. We will explore how to create robust systems that can sustain blows from its users, interconnecting networks, and supposedly allied systems yet carry on as well as possible, recovering quickly – instead of aggreviating these difficulties and turning them into an extended outage and potentially substiantial financial loss. In systems not designed for robustness, even a minor and transient failure tends to cause a chain reaction of failures, spreading destruction far and wide. Here you will learn how to avoid that with a few crucial yet simple stability patterns and the main antipatterns to be aware of. Based primarily on the book Release It! and Hystrix. (Presented at Iterate winter conference 2015.)
Cross-posted from The Holy Java.
Credit: Dennis Jarvis, CC
Are you tired of days spent in front of the screen, with no results to show? Have you once again engaged in yak shaving? Today, after having failed previously, I have finally managed to solve a problem while avoiding this trap by following rigorously two guidelines preached by grandmaster programmers. Be warned: Following this approach, you will get a working solution – but you won’t like it. It will be ugly, stained by compromises, far from the elegant solution you wish for. But if your resources are limited and you want to avoid death by too many yaks, this is your only option. But first, what are these guidelines?
One: Maintain a laser-sharp focus. A great programmer is constantly aware of what she is trying to achieve and never strays far from it. If the path leads away, she backs up. If something else pops up, she writes it down for later and gets back to the job. This is essentially about deciding what not to do. (Many thanks to Kent Beck for sharing his focus secret!)
Digest of chapter 9 of the Continuous Delivery bible by Humble and Farley. See also the digest of ch 8: Automated Acceptance Testing.
(“cross-functional” might be better as they too are crucial for functionality)
- f.ex. security, usability, maintainability, auditability, configurability but especially capacity, throughput, performance
- performance = time to process 1 transaction (tx); throughput = #tx/a period; capacity = max throughput we can handle under a given load while maintaining acceptable response times.
- NFRs determine the architecture so we must define them early; all (ops, devs, testers, customer) should meet to estimate their impact (it costs to increase any of them and often they go contrary to each other, e.g. security and performance)
- das appropriate, you can either create specific stories for NFRs (e.g. capacity; => easier to prioritize explicitely) or/and add them as requirements to feature stories
- if poorly analyzed then they constrain thinking, lead to overdesign and inappropriate optimization
- only ever optimize based on facts, i.e. realistic measurements (if you don’t know it: developers are really terrible at guessing the source of performance problems)
A strategy to address capacity problems:
These are my rather extensive notes from reading the chapter 8 on Automated Acceptance Testing in the Continuous Delivery bible by Humble and Farley. There is plenty of very good advice that I just had to take record of. Acceptance testing is an exciting and important subject. Why should you care about it? Because:
We know from experience that without excellent automated acceptance test coverage, one of three things happens: Either a lot of time is spent trying to find and fix bugs at the end of the process when you thought you were done, or you spend a great deal of time and money on manual acceptance and regression testing, or you end up releasing poor-quality software.
I have recently published quite a popular blog post Frustration-Driven Development – Towards DevOps, Lean, Clojure. My good colleague Tom Bang has pointed out that it actually shows nicely the reasons why many of us work for Iterate and why Iterate does what and how it does. I therefore republish it here under a modified name.
A post about development practices, speed, and frustration.
My wife has mentioned that she likes my passion for doing things right in software development. That made me thinking, why do I actually care so much and do not just enjoy the coding itself? It boils down to that I am not happy until my code is in production. Seeking the satisfaction of having my code used by and helping people while trying to eliminate all unnecessary mental drain is behind all the practices that I embrace and evangelize. It’s a drug I like to take often, in small doses.
practices = f(max(delivered value), min(mental energy))
So how does this relate to DevOps, Continuous Delivery, testing, single-piece-flow, Lean Startup, Clojure? It is simple.
Technical debt is not the only monster we have to fight – it has a hidden evil twin, as pointed out by Niklas Björnerstedt: Competence Debt. The rope of ignorance that binds our hands and suffocates us by fear so that we don’t dare to change the system. Technical debt makes change difficult because the structure of the system does not support it. Competence debt makes change difficult because we do not know the system well enough, what & where to change and what impacts a change may have.
There is an often neglected tool at our fingertips that might help us fight competence debt. Its name is – behold – JavaDoc. An example from practice: I have returned to a client after two years and needed to understand the functionality of a part of the system. And, to my surprise, I found my own JavaDoc providing exactly the answer I was looking for. A colleague of mine mentioned that I should get the award for the “most documenting developer”. But I don’t do it for fun or just to help my bad memory and to be nice to my colleagues. It is an important contribution to the fight against the ever growing hydra of legacy code. Next time you code, try to remember that you are not typing code, but fighting. As every fight, it is hard – but do not give up or the enemy will prevail.
Side note: Writing JavaDoc that helps yet is not too verbose and not too likely to get outdated soon is hard. Getting the right balance – neither too little nor too much, focusing on the why and the broader context and relationships instead of the changing implementation etc. is difficult. But it is worth it. Help yourself, help your fellow colleagues, strike the hydra. Write good JavaDoc.
PS: This post is about competence debt even though the title mentions technical debt, sorry for the confusion (even though they are two sides of the same coin). And “good enough” documentation, though important, is not the sole remedy, as well as developers’ lack of knowledge is not the sole cause of competence debt. Also, Niklas provides a more in-depth review of the technical & competence debt terms in Misunderstanding technical debt (tech. debt as evolving understanding [but not code] and crappy code). He wrote also A deeper look at Competence debt.
Our code has been broken for weeks. Compiler errors, failing tests, incorrect behavior plagued our team. Why? Because we have been struck by a Blind Frog Leap. By doing multiple concurrent changes to a key component in the hope of improving it, we have leaped far away from its ugly but stable and working state into the marshes of brokenness. Our best intentions have brought havoc upon us, something expected to be a few man-days work has paralized us for over a month until the changes were finally reverted (for the time being).
Lessons learned: Avoid Frog Leaps. Follow instead Kent Beck’s strategy of Sprinting Centipede – proceed in small, safe steps, that don’t break the code. Deploy it to production often, preferably daily, to force yourself to really small and really safe changes. Do not change multiple unrelated things at the same time. Don’t assume that you know how the code works. Don’t assume that your intended change is a simple one. Test thoroughly (and don’t trust your test suite overly). Let the computer give you feedback and hard facts about your changes – by running tests, by executing the code, by running the code in production.
Iterate has traditionally sponsored the conferences JavaZone and Smidig. This year we have decided to support flatMap(Oslo), the conference about Scala and functional programming on the JVM.
We believe that alternative JVM languages have the potential to enable higher productivity and better software. Especially functional programming (FP) is very promissing, with its focus on better reusability via higher-order functions and on concurrency via referentially transparent functions and immutability.
Many companies can benefit from the powerful mixture of new and time-proven features that these modern languages offer but they are often discouraged by their (relative) novelty, lack of knowledge and experience, and fear of unavailability of capable professionals. Therefore we want to contribute to spreading information, exchanging experience, and attracting more developers to these languages to enable the IT market to step forward and grab the benefits of FP.
Kent Beck in his recent post Functional TDD: A Clash of Cultures summarizes well the key principles and benefits that underlie test-driven development. I think it is really worthwhile becoming aware of and thinking over these foundation stones of TDD (and testing in general). Knowing them enables you to apply TDD in the most effective way with respect to a particular context to gain the maximum of these benefits. People that do not really understand the value of their tests and TDD tend to write hard to maintain tests of limited (and occasionally even negative) value. Are you one of them?
Too often companies and IT departments believe that they know what software they should create. However users often need and want something different than you believe, even if you’re a domain expert yourself. My colleague and dear friend Ivar had exactly this experience when designing a meal planning tool for his girlfriend and himself. He knew everything perfectly – the previous tool they’ve used (a paper on the fridge), the problem domain, the users. Yet the first prototype tested did not at all match the ideas of what he thought was needed.