Is Your Developer Team Designing for a Disaster?

Stackify BuildBetter

Development teams work at top speed, and the environment in which they work is demanding and complex. Software is no longer considered done until it’s shipped, and there’s been a tendency to view shipped software as preferable to perfect software.

These philosophies, created in large part by agile and lean methodologies, celebrate deployments and meeting release cycle deadlines. But have our standards of working software quality been trumped by the rapid pace of delivery? We may be designing faster, but are we designing disaster?

We practice Agile development at Stackify, and are advocates of the methodology as a system to build and deploy better software more frequently. What we don’t subscribe to is the notion that the process we take to create better performing, high-quality software is more important than the software itself. If the process gets in the way of the product, it’s time to re-evaluate it.

The beauty of a process like Agile is that you can modify it to suit your team and what’s most important in your delivery. Here’s a bit of what we’ve done to optimize, and some of the things we’ve learned along the way.

Don’t let quality draw the short straw.

I’ve yet to see a project that doesn’t have a problem when it gets past development and heads into the final push towards “done.” Primarily, I see one of two things happen:

Testing cycles are compressed or incomplete due to time constraints. A major contributor to this is that code often isn’t ready to be tested until it’s all ready. As the sprint burns down, more and more code tasks are complete but they’ve yet to be reviewed, merged, and deployed to test environments. The code is rushed through QA, and ultimately, issues are found in production. Fixing those issues robs time away from the next sprint.

Testing doesn’t get compressed, but it extends the sprint in order to be complete and/or fix problems. In a scenario where sprints overlap (i.e. once dev is complete for Sprint A, developers begin picking up tasks for Sprint B), this has a domino effect throughout the entire schedule of releases. We commonly joke about this being a “death march,” as it creates wider deltas of code diffs, more complex merges, and a general logjam of productivity.

We are no strangers to these phenomena ourselves. Like any other dev shop, we have testing tools that help out: automated UI tests, unit tests around core functions, and automated and manual integration tests of complex system functions. But it’s still really hard to get through everything we’d like, at the level of detail we’d like, in a reasonable time frame. There are two criteria I ultimately base a go/no-go release decision on:


  1. Confidence. Have we accomplished our goals for the release while also improving (or at least maintaining) our application’s overall performance, stability, and reliability?
  2. Risk. What have we changed, what can it impact, and do we fully know the scope and scale of that impact?

Throughout our development process, we use Prefix and Retrace to help us build confidence toward the next release and to assess how much risk we have. Our developers run Prefix on their development machines, finding and eliminating bugs, bad code patterns, and performance issues before they commit code.

Our developers, QA team, and management all use Retrace to look at overall performance and new and regressed errors in each one of our pre-production environments at each stage of our dev lifecycle. We know, from build to build, if we have introduced new problems and moved the platform forward or backward. It’s a tangible measurement of the overall health of our release.
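To make the build-to-build comparison concrete, here is a minimal sketch of the idea in Python. It is not how Retrace works internally and it does not use any Retrace API; the function name, the error signatures, and the notion of a "resolved" set are all assumptions made for illustration.

```python
def compare_builds(previous_errors, current_errors, resolved_errors):
    """Classify error signatures seen in the current build.

    All three arguments are sets of error signatures (plain strings here).
    `resolved_errors` is a hypothetical record of errors the team had
    previously marked as fixed.
    """
    return {
        # Never seen before, and not a known resolved error: a new problem.
        "new": current_errors - previous_errors - resolved_errors,
        # Marked as fixed earlier but showing up again: a regression.
        "regressed": current_errors & resolved_errors,
        # Present in the previous build, gone in this one.
        "fixed": previous_errors - current_errors,
    }


# Hypothetical error signatures for two consecutive builds.
previous = {"NullReference in Checkout", "Timeout in Search"}
resolved = {"Deadlock in Billing"}
current = {"Timeout in Search", "Deadlock in Billing", "KeyError in Profile"}

report = compare_builds(previous, current, resolved)
```

A build whose "new" and "regressed" sets are both empty has at least not moved the platform backward, which is the kind of tangible signal a go/no-go decision can lean on.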

Don’t fear a punch.

As Mike Tyson once said, “Everybody has a plan until they get punched in the face.” You’re going to get punched in the face. It’s inevitable. There will always be a problem, an unexpected server or cloud failure, a critical bug uncovered, support requests, or an “urgent” need from someone else in the company to get something that isn’t in the plan done ASAP.

If you know you are going to get punched, your process must plan for it. At a minimum, you must be leaving some capacity to deal with it. If you’re doing it well, you should have an “expedite” lane on your planning board (along with an established process) to deal with items that come out of nowhere with a lot of urgency.
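Leaving capacity for the punch can be as simple as arithmetic. The sketch below assumes capacity is measured in story points and that 20 percent is held back for unplanned work; both the unit and the percentage are illustrative assumptions, not a recommendation from our own process.

```python
def plannable_points(team_capacity, expedite_reserve_percent=20):
    """Return how many points to commit to planned work.

    `team_capacity` is the team's total capacity for the sprint in points.
    `expedite_reserve_percent` is the share held back for urgent,
    unplanned items (an assumed default of 20).
    """
    if not 0 <= expedite_reserve_percent < 100:
        raise ValueError("reserve must be between 0 and 99 percent")
    # Integer math keeps the result a whole number of points (rounded down).
    return team_capacity * (100 - expedite_reserve_percent) // 100


planned = plannable_points(40)       # commit to 32 points
reserve = 40 - planned               # keep 8 points free for the expedite lane
```

If the reserve goes unused in a sprint, the team can pull the next item from the backlog; the point is that the plan does not assume a sprint without surprises.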

Be wary of the tendency to let too many tasks fall into the “urgent” category. It happens sometimes due to pressure from somewhere in your organization, or just because of fallout from a chaotic application. By its very nature, urgent work carries higher risk and gives quality more chances to slip. If it’s all urgent, nothing is, and you’re just sacrificing the quality of your product.

Don’t be a process zealot.

In theory, having an Agile development shop sounds great. In practice, it can be great.

A trap that many fall into is trying to carry out a textbook implementation of Agile, oftentimes attempting to cram what really happens into what should be happening. But it’s not a one-size-fits-all process.

Craft your Agile implementation around the way your business needs to deliver software. A great example of this is the concept of “scrumban” that many teams have started to adopt. The focus is on limiting the work in progress, but still having timeboxed sprints and releases.
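The core mechanic of limiting work in progress is easy to sketch. The following Python example is a hypothetical model of a board column with a WIP limit; the class, function, and card names are invented for illustration and do not correspond to any particular planning tool.

```python
class Column:
    """A board column with an optional work-in-progress (WIP) limit."""

    def __init__(self, name, wip_limit=None):
        self.name = name
        self.wip_limit = wip_limit  # None means the column is unlimited
        self.cards = []

    def has_capacity(self):
        return self.wip_limit is None or len(self.cards) < self.wip_limit


def pull(card, source, target):
    """Move a card to `target` only if that column is under its WIP limit."""
    if card not in source.cards:
        raise ValueError(f"{card!r} is not in column {source.name!r}")
    if not target.has_capacity():
        return False  # the team must finish something before starting more
    source.cards.remove(card)
    target.cards.append(card)
    return True


backlog = Column("Sprint backlog")
in_progress = Column("In progress", wip_limit=2)
backlog.cards.extend(["login fix", "search page", "export"])

# Try to start everything at once; the third pull is refused.
results = [pull(card, backlog, in_progress) for card in list(backlog.cards)]
```

Here the first two cards move into progress and the third stays in the backlog until a slot frees up, while the sprint itself can still be timeboxed around the board.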

A textbook Scrum implementation would have you release at the end of each sprint. If this could have a negative impact on your customers, would be difficult to manage because of frequency, or for any other reason doesn’t fit your needs, then simply modify the process to work well for your specific project.

Make Agile work for you.

Agile is meant to help teams get on a cycle of continuous learning and deployment, and if systems are not in place to check for success or quality, you’ll have a lot of problems. Any amount of downtime or rework is too much for software teams and businesses that rely on applications as their primary revenue source. Efficiency, with regard to the software development lifecycle, should not be married to “faster” deployments, but to more stable, higher-quality deployments. Successful deployments should adhere to the rule of “working code.” Too many times, “done code” is not “complete code,” and then it turns into an emergency. Agile teams must build better criteria for success to accomplish the company’s objectives for the code, and not just adhere to a time-boxed deployment schedule.

Many companies wait until there’s a problem to make an adjustment to their team processes and products, but we think time is better spent preventing problems and avoiding work that distracts from the greater business objective. Prefix and Retrace are more than pieces of software. They’re comprehensive app performance tools that work alongside your Agile team to help you plan ahead, build proactively, and deploy without interruption or emergency.

There’s nothing like a few emergencies to challenge your team’s morale and focus. Agile is a great start, but building in support structures of purpose and product means that your developers can focus on their code performance and your company can focus on building better products. It’s a win-win for everybody.

 


Jason Taylor has worked in a number of high-growth business units centered around delivering Software as a Service. The experiences gained in those shops directly led him to Stackify, and those experiences help shape the product. Jason has led small- and medium-sized development teams throughout his career and is intently focused on delivering a great product while helping developers grow, learn, and realize their full potential.