Many legacy codebases have significant portions of technical debt in their database design. Such codebases tend to accumulate tons of business logic embedded in database scripts in the shape of stored procedures. From a maintenance perspective, the quality of that database code is at least as important as the application code. In practice, that’s unfortunately not always the case, and you might find that general software principles aren’t necessarily applied to the SQL design. The consequences are database scripts that are hard to understand and, hence, expensive to maintain and extend. Let’s see how we can uncover and highlight such potential problems.
CodeScene’s main analyses are language neutral, which means that it can detect hotspots and prioritize improvements to SQL scripts too. As an example, have a look at the following analysis of the ambitious PL/JSON project that implements generic
JSON support for Oracle’s
The previous figure shows a hotspot analysis of the SQL code, where a hotspot is defined as complex code that the organization has to work with often. In this example, I have highlighted the file
pljson_printer.package.sql, and we can see that it consists of 487 lines of effective code (comments and blanks stripped away) to which the organization has made 62 commits. CodeScene can also show the trend to indicate in which direction that code evolves. You see the complexity trend in the lower right corner of the figure. In this case, we note that the complexity trend seems to be decreasing. That’s a good sign.
Until recently, CodeScene’s SQL analyses stopped at a file level. Sometimes that’s good enough, but if you’re unfortunate enough to come across SQL scripts consisting of thousands of lines of code, a file level hotspot won’t deliver enough information. We need much more detailed recommendations. To help organizations address technical debt in SQL code, we have extended our X-Ray analysis to work with Oracle’s
PL/SQL code too.
An X-Ray runs CodeScene’s behavioral code analyses at the level of individual building blocks like functions and procedures. Let’s see what it looks like on the
As you see in the preceding figure, X-Ray provides a list of potential refactoring targets. The hotspot functions and procedures are ranked by their change frequency, which we use as a proxy for an interest rate on any potential technical debt. You use this list to prioritize which parts of the code that could benefit the most from refactorings. To aid your investigation, CodeScene also calculates the cyclomatic complexity of each function and procedure in order to indicate whether the hotspot needs refactoring or not.
The X-Ray analysis offers additional insights into the evolution of your SQL code. By combining a change coupling analysis with a copy-paste detection algorithm, CodeScene can identify duplicated code where you have to make predictable modifications in multiple places as part of the same change. The next figure shows an example:
From here we click the Compare button in CodeScene to reveal a missing abstraction:
Eliminating this kind of duplicated code is the low-hanging fruit that lets you reduce both code complexity and overall maintenance costs. In a large codebase there’s likely to be a substantial amount of duplicated code, but most of that duplication won’t affect you. That’s why CodeScene’s prioritized results are important as they offer findings that likely have a positive ROI on improvements.
Finally, technical debt may come in the shape of complex inter-dependencies between different modules in a system. One of the main challenges in working with legacy code is knowing how a particular change will impact other parts of the system. CodeScene can provide some guidance by visualizing such change coupling across whole clusters of files.
In the preceding figure, you see a cluster of three SQL files that are often modified together. CodeScene tells us that those files tend to be changed as part of the same commit in 70% of all cases. That’s quite a strong change coupling and it indicates that those files are logically related. To find out exactly why these files co-evolve, we run the X-Ray analysis on that cluster. Here’s what it looks like:
Using these X-Ray results, we narrow down our findings and get a view of the different stored procedures that co-evolve. This is data that we use either to guide our changes as we modify legacy code, or to identify refactoring opportunities. For that second purpose it’s interesting to know that CodeScene’s copy-paste detection algorithm works across multiple files too.
This additional support for
PL/SQL analyses open several new possibilities that will help you get deep insights into large-scale enterprise projects. As a final note, I want to point out an interesting implication of CodeScene’s support for detailed SQL analyses. With that support, you can now uncover change patterns in your whole development stack. Since CodeScene’s change coupling analysis is language neutral, it’s possible to highlight change patterns that go all the way from your