Many of the articles on this blog dive down into lower-level Magik code details, so I thought I’d switch things up a bit and give a higher-level perspective on what it takes to write great code. Note these ideas apply to any development environment, not just Smallworld GIS and Magik, but the links I’ve provided point to Magik examples.
As you may know, the majority of software projects run significantly over budget, are substantially delayed or fail outright.
In a nutshell, it’s because people don’t follow good development practices. Despite decades of software engineering research and thousands of books on the subject, it appears there are still many in the industry who don’t understand how to build high quality software. They set arbitrary deadlines based on wishful thinking, leave developers out of the estimating process, cut corners and are guilty of a variety of other transgressions.
I’ve personally witnessed numerous examples over the years, but I’m not going to dwell on them here. Instead, I’ll list some of the things you can do to ensure your project not only delivers on time and on budget, but also delivers high quality software with very few bugs.
To kick things off, let’s start with what we’re trying to achieve.
The primary goal of a software project is to accurately implement business requirements in an application that is easy to maintain and enhance. The goal is not to hit an arbitrary deadline, nor is it subjective – it is objective, with an easy-to-define end criterion: have all the requirements been implemented? If yes, we’re done; otherwise, we continue with the project.
This provides a traceable baseline, but it must be built on top of sound development practices so the software is robust and can meet both current and future demands – including being easy to use, scalable, easy to maintain and enhance, and easy to test and cover with automated tests.
But in order to understand how to write high quality software, it’s necessary to define what the quintessential high quality software looks like. Of course this theoretical unicorn is not achievable in the real world, but it gives us an idea of what we need to do in order to move in that direction and come as close as practically possible given the limitations of a real project.
So how do we get there from here?
High Quality Software Development
As I touched upon earlier, we should be able to agree that high quality software must be easy to understand, easy to maintain, easy to enhance and test, and should not have serious bugs. It should be built on a platform that provides tested building blocks sufficient to do most things required by a general-purpose application, so developers can focus on writing custom business logic rather than on, say, developing code to open and write to a JSON file or wire up a GUI.
In contrast, ugly, low quality software consists of tightly coupled components where a change to one part affects multiple, unrelated pieces. Ugly software contains lots of duplicated code with hardcoded values throughout and is difficult to comprehend because its functionality and state are scattered across the codebase. It incurs an inordinate amount of technical debt which makes it difficult to work on or upgrade. In extreme cases, people don’t want to touch it because they’re afraid even minor changes will break it.
Nobody starts out planning to write ugly software. That just happens because most developers don’t view the process in a holistic manner. They simply implement their piece of the system in whatever way gets them to the finish line fastest. And most Project Managers have little technical skill, so their focus is on schedules and budgets. Even Tech Leads often don’t understand all the details from the very top to the very bottom — they should, but unfortunately that’s not usually the case. And Software Architects are even worse. They generally deal in concepts at such a high level, it’s rare they understand or can enforce low-level implementation details.
On top of all this, projects are usually siloed: new development proceeds independently of previous work and is brute forced into working without a thought for what previous project teams were thinking — who among us developers has not seen a method hanging off a class it has no business being part of, simply because someone under time pressure jumped at the first solution that worked (yes, gis_program_manager, I’m glancing in your general direction)?
The end result is a patchwork of pieces that must be stitched together to form the application, but as statistics show, this ad hoc methodology doesn’t work most of the time. And even if the application makes it into production on time, it’s quite probable the code is not of high quality.
To counter the status quo, software development practices should make it easy to create high quality software by defining a set of best practices and giving resources a step-by-step plan to implement them. I’ve already discussed the details in my DevReqs article, so I’ll just focus on high-level concepts here to explain how we can achieve high quality code and avoid ugly code.
It should be obvious that if our application just used well-tested library and core functionality, we would be confident bugs are minimized. It’s only when we start writing custom code in ways we can’t test that bugs and other problems are introduced.
As long as we’re not writing custom code, the chances of injecting a bug are close to zero. Further, creating an application that executes in a linear fashion (from start to finish with no branches) makes it easy to understand what’s happening, which reduces complexity and further eliminates bugs (because complexity makes it difficult to see errors while simplicity exposes errors without much effort).
So if we could implement an application that meets the requirements using only fully tested components, with no custom code, and only one path of execution through the code, that should put us on the right path.
There are, of course, other things we need to do but the two most important are that the application implements all business requirements and doesn’t contain significant bugs. We want the number of bugs to be as low as possible because it’s a good assumption our customers want the application to perform flawlessly and they don’t want to be continually finding and fixing bugs after the development process is supposed to have completed.
Obviously all non-trivial software falls short of this standard because we will always have to write custom business logic — so there will always be bugs introduced during the development process. Since we know this will happen, no matter what we do, we must focus on finding these bugs as early as possible in order to eliminate them before they escape the development phase.
Further, the types of bugs are of greater importance than the number of bugs.
We want to find all bugs, but we want to eliminate the most serious ones first. Given there are real world limitations (such as resources, budget and schedule), we must prioritize our efforts and concentrate on finding the most serious types of bugs first.
As an example, if our software contains, say, 100 minor or trivial bugs (such as spelling mistakes), that is better than if it contains just one serious bug that produces inaccurate results or causes the application to crash. Naturally we should strive to eliminate even the minor bugs, but we should also understand there are real world constraints most of the time.
Non-trivial bugs also tend to cause downstream problems that affect other components, so it is desirable to ensure all components are robust and can not only gracefully handle bad parameters but also help uncover upstream bugs as soon as possible.
There is an additive effect of bugs in an application, because if the output of one component feeds bad data into the input of another, failures can propagate through downstream functions.
Assertions are indispensable in quickly identifying these types of issues and should be liberally used in all functions and methods.
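To make that concrete, here is a minimal sketch of a function guarded by pre- and post-condition assertions. It’s written in Python rather than Magik since, as noted at the start, these ideas apply to any development environment; the function and its contract are hypothetical:

```python
def normalize_weights(weights):
    """Scale a list of positive weights so they sum to 1.0."""
    # Preconditions: catch bad input at the earliest possible point.
    assert len(weights) > 0, "weights must not be empty"
    assert all(w > 0 for w in weights), "weights must be positive"

    total = sum(weights)
    result = [w / total for w in weights]

    # Postcondition: a violation here points at a bug in THIS function,
    # before bad data can propagate to downstream components.
    assert abs(sum(result) - 1.0) < 1e-9, "normalized weights must sum to 1"
    return result
```

The assertions document the function’s contract and turn a silent data error into an immediate, loud failure at the exact point of origin.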
How Software Quality Impacts Schedules
There are numerous reasons why software quality is of the utmost importance, but let’s focus on what is perhaps the most highly visible one for now: accurate status reporting.
All projects follow a schedule and, for obvious reasons, it is extremely important to fully understand how accurate that schedule is. If a project reports the development on a one year project is 80% complete when in fact there are tons of unknown bugs lurking beneath the surface that will take a further year to fix, that report misstates the true state of the project, as measured against the schedule, and makes the status report pretty much worthless.
But how do we know there are unknown bugs waiting to pounce when they are… well… unknown?
That’s a very big dilemma. And it occurs because typical methods for measuring code quality are usually inadequate or nonexistent. Many projects rely on methodologies and reporting techniques whose results are not only flawed, but simply untrue — painting rosy predictions that more often than not fail to materialize.
However if our development processes allow us to automatically find bugs as soon as they are introduced, the reported status will be accurate (and even if we come across an issue that is so bad it will cause scheduled dates to be pushed, at least we can immediately report the problem and everyone can adjust to accommodate for it earlier, rather than simply believing the original schedule will be met and then discovering, just before scheduled deployment, there is a huge issue that will delay the project by another year).
Therefore it should be clear the accuracy of status reports depends heavily on the quality of code that has already been developed. High quality code is ready to implement into production and does not require further work. Low quality code will blow up and require going back to the development phase in order to fix it – thus increasing effort, cost and negatively impacting the schedule.
Furthermore, low quality code is usually more difficult to fix and test, so when a bug occurs, it takes far longer to fix than if a similar type of bug occurred in high quality code. So not only do bugs occur more frequently, but they take longer to fix.
As we improve our ability to eliminate bugs immediately after they are introduced, the accuracy of our status reports, and consequently our schedules, improve right along with it. Simply put, the goal of writing bug-free software has many benefits but the one that’s most visible to project managers, and customer executives, is schedule accuracy.
How to Measure Software Quality
The majority of bugs in an application result from bad or risky development practices and lack of solid, repeatable testing. This means we must use less risky algorithms and engage development processes that eliminate certain types of bugs so we don’t have to test for them. We must also write automated tests for all our code.
It really is that simple.
Simplicity is the key. We should be constantly moving in the direction of simplicity. The simplest code that will do the job is best because it’s usually easy to understand and easy to test. The fewest number of functions in a pipeline (and greatest percentage of pure functions) will ultimately provide the highest quality software.
If we can implement everything we need with the smallest number of cohesive, loosely coupled, pure functions, why would we use more complex, bigger, tightly coupled, impure ones — or worse, gigantic classes with loads of shared state?
It’s also wise to focus on using proven paradigms, processes and testing techniques rather than implementing ad hoc, untested ideas because not all algorithms, development processes and paradigms are equal. Some have far higher probabilities of introducing bugs while others eliminate any chance a particular type of bug can be created in the first place (for example, the use of pure functions – those that don’t mutate external state – eliminates threading errors and race conditions).
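As a quick illustration of that last point, here is a hedged sketch (in Python; the names are hypothetical) contrasting a function that mutates shared state – and can therefore race under concurrent callers – with a pure equivalent:

```python
# Impure: reads and mutates shared state, so two concurrent callers can
# interleave the read-modify-write and lose an update.
running_total = 0

def add_impure(x):
    global running_total
    running_total += x          # hidden dependency on shared state
    return running_total

# Pure: the result depends only on the arguments. There is no shared
# state to race on, and the function is trivially testable in isolation.
def add_pure(total, x):
    return total + x
```

Calling `add_pure(3, 4)` returns 7 every time; calling `add_impure(4)` returns a different value on every call, because its answer depends on state scattered elsewhere.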
But the problem is more complicated than just that. Once we’re using our shiny new processes, how do we know whether they’re moving code quality in the right direction?
We need an objective measurement and, unfortunately, that has proven elusive over the years.
To address that I’m going to suggest using a Bug to Functions Ratio (BFR) metric. Every time we find a bug outside the development phase, trace it back to the root function (note I’m using the term function to represent both procedures and methods) that caused the bug. Classify the bug type and severity and then identify all downstream functions that were impacted by this bug. We can now calculate two baseline ratios: Root Bug to Functions Ratio (RBFR) and Downstream Bug to Functions Ratio (DBFR).
Note the following:
- RBFR = number of Root bugs / total number of functions.
- DBFR = number of Downstream bugs / total number of functions.
- BFR = RBFR + DBFR.
- Total number of functions = number of custom functions written for the project (it does not include core or library functions).
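The definitions above translate directly into a small sketch (in Python; the function name is hypothetical):

```python
def bug_to_function_ratios(root_bugs, downstream_bugs, total_functions):
    """Compute the baseline ratios defined above.

    root_bugs / downstream_bugs: counts of bugs found outside the
    development phase, traced to Root vs. Downstream functions.
    total_functions: custom functions written for the project
    (core and library functions excluded).
    """
    assert total_functions > 0, "project must contain custom functions"
    rbfr = root_bugs / total_functions
    dbfr = downstream_bugs / total_functions
    return rbfr, dbfr, rbfr + dbfr   # (RBFR, DBFR, BFR)
```

For example, 25 Root bugs and 50 Downstream bugs in a project with 100 custom functions gives RBFR = 0.25, DBFR = 0.5 and BFR = 0.75.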
The RBFR tells us how many errors created in our code escaped the development phase (that is, they were not found during unit testing and were therefore checked into the main branch).
We also need to attach a weight to each bug based on type (i.e. severity) and number of occurrences. The first bug found in a function is assigned a weight based on its severity. As an example, if we find a low impact bug, its weight might be 0.25 whereas a high severity bug might be assigned a weight of 2.0. This makes sense because higher impact bugs should count more than lower impact ones.
Additionally, the number of bugs found in a function should also affect weightings. So if we find, say, three bugs in one function, we can say that function is more buggy than one where we only found one bug of the same type. But it should count for more than simply three times as much because the fact we found multiple bugs in that function indicates there may be a bigger problem lurking beneath the covers.
As such we would increase the weightings to account for this. As further successive bugs are discovered in the same function, the weightings continue to rise so they contribute more and more to that function’s Root bug count.
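One possible way to implement escalating weights is sketched below (in Python). The specific severity weights and the escalation factor are illustrative assumptions, not prescribed values – each project would tune them:

```python
# Illustrative severity weights -- the actual values are a project decision.
SEVERITY_WEIGHT = {"low": 0.25, "medium": 1.0, "high": 2.0}

def weighted_root_count(severities, escalation=1.5):
    """Weighted Root bug count for one function.

    severities: bug severities in the order the bugs were discovered.
    Each successive bug found in the same function is multiplied by a
    growing escalation factor, so a repeat offender contributes more
    than three separate bugs would by simple addition.
    """
    total = 0.0
    for i, severity in enumerate(severities):
        total += SEVERITY_WEIGHT[severity] * (escalation ** i)
    return total
```

With these numbers, two low-severity bugs in the same function count 0.25 + 0.25 × 1.5 = 0.625, rather than a flat 0.5 – the second discovery raises suspicion about the function itself.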
Functions with high weighted bug counts deserve closer inspection because they are contributing more to the lack of code quality. There may be a variety of reasons, such as being too complicated, too large, indirectly dependent on state scattered elsewhere, too difficult to test or were developed using risky coding practices. Whatever the cause, buggy functions need to be analyzed and the fundamental issues corrected. Increasing weights put them squarely in our sights.
The DBFR tells us how resilient our code is (that is, how well the code handles erroneous inputs and other problems).
If we use assertions to check pre- and post-conditions in all our code, not only would we have a better chance of finding Root bugs in the functions where they occur, but we could also enlist downstream functions to help us identify those bugs.
Think about it. Our code takes an input and applies successive operations and transformations (via functions) to that input until a final output is produced. Each component in the chain could introduce bugs, so we need a way to find those bugs immediately after they’ve been created. Assertions fill this role by checking that inputs and outputs to and from functions are valid. If something is amiss, an assertion will immediately alert the developer.
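A minimal sketch of such a chain (in Python; the stages and their contracts are hypothetical) shows how an input assertion in a downstream function flags an upstream bug the moment bad data arrives:

```python
def parse_reading(raw):
    """Stage 1: parse a sensor reading from text."""
    value = float(raw)
    # Postcondition on this stage's output.
    assert value >= 0, "readings are non-negative by contract"
    return value

def to_percentage(value, maximum=100.0):
    """Stage 2: convert a reading to a percentage of its maximum.

    The input assertion here acts as a tripwire: if an upstream stage
    ever produces an out-of-range value, the failure surfaces at this
    boundary instead of silently corrupting the final output."""
    assert 0 <= value <= maximum, "value outside expected range"
    return value / maximum * 100

def pipeline(raw):
    return to_percentage(parse_reading(raw))
```

When every boundary is guarded this way, a bug cannot travel far: the first function whose contract is violated names the point of failure.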
We should also add versioned metadata tags (via comments) to functions that were affected by Root and Downstream bugs, keeping a history of them in the source code repository (of course when promoting code to production, these metadata tags could be automatically removed by a script).
We can then use an automated tool to scan the entire source code tree at a particular commit point, function by function, analyzing the metadata tags, at a particular version, to produce a report that flags problematic functions – detailing the number of bugs, types, severity and weights.
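A tool along these lines could start as a simple line scanner. The tag format below is a hypothetical convention invented for illustration, not an established standard:

```python
import re

# Hypothetical tag convention: "# BUG[root|downstream] severity=<level> v<version>"
TAG = re.compile(r"#\s*BUG\[(root|downstream)\]\s+severity=(\w+)\s+v([\d.]+)")

def scan_source(text):
    """Collect bug-metadata tags from one source file, line by line."""
    report = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = TAG.search(line)
        if match:
            kind, severity, version = match.groups()
            report.append({"line": lineno, "kind": kind,
                           "severity": severity, "version": version})
    return report
```

Run over the whole source tree at a given commit, the collected records provide the raw counts needed to compute the weighted ratios described earlier.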
This report tells us a few things.
1. What types of bugs escaped the development phase.
2. How many bugs our coding practices let through.
3. How robust our code is.
4. A baseline of code quality at a particular point in time (the BFR).
We can use (1), (2) and (3) as feedback to improve our practices and use (4) to chart our progress as the project moves forward. If the quality of our code increases (for example, BFR decreases) over time, then we know we’re moving in the right direction.
Keep in mind this method only applies to bugs caused during the development phase (that is, erroneous code or code that doesn’t correctly implement a design specification). Problems such as missing or incorrect requirements, bad design specifications or other issues external to the creation of code must be handled using alternative methods during the different phases of the Software Development Life Cycle (SDLC).
While a single phase of a project may be capable of delivering excellent results, if those results depend on flawed outputs from previous phases, or if it passes its results downstream using poor practices, the final result will only be as good as the poorest performing phase in the SDLC. Note, however, that the impact of errors across SDLC phases is very unbalanced: errors in earlier phases hurt the project more than those in later phases, so a given type of error in the design phase is typically worse than a similar error in the testing phase.
As we’ve seen, different types of bugs cause different levels of problems. All bugs don’t negatively affect an application in the same way. A typo in a label, for example, is less severe than a bug that causes a traceback which is itself less severe than a bug that silently results in erroneous output.
So although even the most innocuous bugs should be pursued and eliminated, we should prioritize bug types by their severity and focus on eliminating the worst offenders first.
Critical code should attract the highest scrutiny – even going so far as to use the MagikCheck Property-based Testing library to re-implement that functionality using a different algorithm (it doesn’t matter whether the alternate code is slower or uses significantly more resources — because it won’t be implemented into production, just used for testing — it only matters that the output is correct). Then when testing this critical code, both algorithms are used on the same MagikCheck-generated random inputs and their outputs are confirmed to be the same. Make no mistake, this can be a significant amount of work. However for critical components it is well worth the effort to automate testing in this manner. Putting in the effort upfront once will pay huge dividends going forward and the project will continue to reap benefits as it progresses.
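The same dual-implementation idea can be sketched outside Magik. The example below is in Python and uses the standard random module as a simplified stand-in for a property-based framework like MagikCheck; the functions themselves are hypothetical. A fast production candidate is checked against a deliberately naive oracle on many random inputs:

```python
import random

def interval_overlap_fast(a1, a2, b1, b2):
    """Production candidate: branch-free overlap length of [a1, a2) and [b1, b2)."""
    return max(0, min(a2, b2) - max(a1, b1))

def interval_overlap_slow(a1, a2, b1, b2):
    """Oracle: deliberately naive re-implementation. Speed and resource
    use are irrelevant -- it never ships -- it only has to be obviously correct."""
    return sum(1 for x in range(min(a1, b1), max(a2, b2))
               if a1 <= x < a2 and b1 <= x < b2)

def check_agreement(trials=1000, seed=42):
    """Feed both algorithms the same random inputs; any disagreement
    exposes a bug in one of them."""
    rng = random.Random(seed)
    for _ in range(trials):
        a1 = rng.randint(0, 50); a2 = a1 + rng.randint(0, 50)
        b1 = rng.randint(0, 50); b2 = b1 + rng.randint(0, 50)
        assert interval_overlap_fast(a1, a2, b1, b2) == \
               interval_overlap_slow(a1, a2, b1, b2), (a1, a2, b1, b2)
    return True
```

A real property-based framework would also shrink failing inputs to a minimal counterexample, but even this simple harness demonstrates the principle: the oracle’s correctness is easy to verify by inspection, and the candidate is trusted only because it agrees with the oracle everywhere we’ve looked.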
Improving Development Processes
Before we continue, let’s pause a moment to recap what we’ve discussed so far.
- An application that reuses fully tested library and core functionality will exhibit significantly fewer bugs than one that uses custom code created from scratch.
- Applications that abstract away complexity and are easy to understand will have fewer bugs.
- Applications that use less-risky coding practices will have fewer bugs.
- Applications that can be easily tested will have fewer bugs.
- Loosely-coupled applications that don’t depend on mutable state scattered throughout the codebase will have fewer bugs.
- Applications that execute in a linear fashion will have fewer bugs.
So an application that is composed of small, cohesive, independent pure functions and reuses a significant number of built-in library and core components that are executed in a linear pipeline will be far easier to write and test, thereby resulting in fewer bugs.
But since we know custom code will have to be written to implement a particular business’s logic, we should separate it from all the other code and ensure it is as pure as possible. The important point to remember is by following this methodology, code quality will dramatically improve compared to the ad hoc development processes usually followed today.
Another important point to consider is how components interact with one another, because interactions can quickly become very complex. Think of multiple large classes, for example, each containing many methods that call each other in myriad ways and trigger a multitude of changes to objects’ internal state – the result is a tight coupling of objects, all dependent on each other’s state.
Not only is this design difficult to follow and understand, but it makes it impossible to write good, comprehensive tests – which inevitably leads to low quality software. Complicated, large and tightly coupled code that is difficult to reason about increases the severity and number of bugs in a nonlinear fashion – they are more than simply additive.
Contrast this to a design that breaks large classes into small, cohesive, pure and independent functions that are easy to understand and test. If we combine them using a pipe to create the application, bugs are easy to find and fix. The pipe will not add additional errors, so once bugs in each function are eradicated, we can be confident composition won’t introduce new bugs. This leads to high quality code.
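A minimal sketch of this style (in Python; the pipe helper and the individual steps are illustrative) might look like:

```python
from functools import reduce

def pipe(*functions):
    """Compose functions left-to-right into a single linear pipeline."""
    return lambda value: reduce(lambda acc, fn: fn(acc), functions, value)

# Small, cohesive, pure steps -- each is trivial to test in isolation.
strip = str.strip
lower = str.lower

def collapse_spaces(s):
    return " ".join(s.split())

normalize = pipe(strip, lower, collapse_spaces)
```

Each step can be exhaustively tested on its own, and since the pipe merely threads one step’s output into the next, the composition adds no behavior of its own to test.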
MagikFP, GSSKit and DevReqs
We now know the concepts necessary to create high quality code. But, as they say, the devil is in the details. High-level concepts are required to understand what has to be done — but without the necessary details, they’re unlikely to be implemented.
And that’s the reason I developed the MagikFP library, GSSKit framework and the DevReqs methodology… they ensure developers of all abilities can achieve the best possible results, within reasonable limits of cost and effort, by providing step-by-step instructions describing how to proceed.
Developers have different skill levels and their knowledge of best practices and patterns varies, so they use subjective practices when developing code. DevReqs was designed to standardize these practices so all developers are able to use the very best methodologies and paradigms, even if they don’t realize they’re using them. It includes the most commonly used building blocks for typical software applications – such as GUI separation, data connectors and business logic helper libraries.
Software development is a combination of art and science, but the more science we can inject into the process (through the use of better coding practices and automated testing, for example) the better our software will be.
Remember, an application’s quality is only as good as its weakest component. Creating a number of cohesive, well-tested, pure functions that are then fed into a tightly coupled, monolithic ball of mud will still result in a buggy application.
Therefore we should start with the best proven designs and processes then build upon them in a structured manner as new features are required.
We should never deviate from this plan. Ever.
It’s easy to become overwhelmed and begin taking shortcuts or start prematurely coding, especially when there are lots of vague requirements and a tight deadline with which to contend – but this will most likely result in poor quality code and increased technical debt. Always ensure the requirements and design specifications are complete, understandable and approved before starting the development phase. Also ensure estimates are reasonable.
Quality takes focused determination and discipline, but it’s significantly less expensive in the end than not pursuing quality. Attempting to hit arbitrary deadlines that aren’t based on reality or taking shortcuts will always be a mistake.
Software architecture, design and implementation is a learned process and, like any process, there are several good ways to do it but far more bad ones. If we learn and understand the best-of-breed methodologies at each phase in the SDLC and implement them correctly in the right order, our software will be of high quality.
Without using best practices, such as good architectural design or immediate bug elimination, it is all but certain our projects will run into severe problems as they progress. Imagine if our projects could drive all bug discovery into the development phase: not only would that enormously reduce the testing phase and greatly improve project efficiency, but our schedules would also be more accurate and we would eliminate the need to go back and fix code that has already been developed.
Quality code construction is self-evident when you see it. It’s like fine art – clean, aesthetically pleasing, easy to comprehend, built upon well-constructed layers and beautifully put together. Changes seamlessly fit in with very little effort. When we see high quality code, we just know it.
And in hindsight, based on my personal experiences and from analyzing the post-mortems of failed and delayed projects, doing things correctly the first time ends up being far less expensive while delivering higher quality results on schedule and with less effort. Your project doesn’t have to be average or deliver low-quality code.
Take the long view. Software development is a marathon, not a sprint.