Nistor Marian is a senior software developer at Tecknoworks with an extensive experience in the development and implementation of software programs using various technologies.
I’m going to start off by explaining the concept of refactoring, what is it useful at and when it should be applied. Then I’m going to dive into the three big types of refactorings that I’ve come across in my experience: the simple changes, the execution path changes and the design changes. In order to determine the need for refactoring we must be able identify certain code smells. Following the definition of a code smell, I will list a couple of them along with C# code examples in which I will take the reader from the Bad Code to the Good Code, explaining in detail how I handled the change of the code and which types of refactorings I’ve applied to achieve my goal. The smells will be split into the three categories of refactorings based on what they change.
The article is spilt into three parts. In part 1, I will detail refactoring through simple changes.
Refactoring is one of our many tools as software craftsmen that enables us to mold the unclean code into clean code. There are many types of refactoring and they can be applied either individually or in bulk in order to create code that is clean. Refactoring should be part of our daily routine of coding, whenever we code a new feature, whenever we see some code that can be made better, we should do it. We just put our refactoring hat on and we make that code better.
The code is a living organism. It changes and adapts rapidly and wildly in some cases, but most of the time the code begins to rot and eventually it starts to smell worse and worse. “It stinks, me thinks!” is what I like to use in order to relay the code smelliness to the colleagues on my team. Some smells are worse than others, however every smell must be removed from the code. The code has to be odourless! The code must shine!
As you uncover the code smells you also uncover the need for refactoring. The smells are signals that a code has become unclean. There are many code smells out there in the void and this is not an extensive list, by far, but it provides good coverage of some of the most encountered code smells. For each smell I will give some examples of Bad Code and explanations of why that code is bad. Then, where relevant, I will list the Good Code so you can see for yourself what was actually changed. If the resulting refactoring is more complicated, I will also include a description of the steps I’ve taken in order to get to the Good Code from the Bad Code. The code smells are categorized based on their refactoring amplitude into the three categories listed above.
Simple changes
They do not affect the execution of the software system. These include renames, code removal, instructions compressions, etc. They usually only touch the look of the code and not the way the code gets executed. As a rule of thumb whenever you do these kind of refactorings you don’t really need unit tests or any other kind of mechanism to let you know you didn’t break something.
Bad Code
public void MainMethod(){
Method1();
Method2();
//Method3();
}
private void Method1(){
// Do something
}
private void Method2(){
// Do something
}
public void Method3(){
// Do some processing
}
The method was clearly used at some point in time but instead of it being removed, it was left inside the code base. This kind of practice is bad, especially, since in time you might end up with more unused code than actual code. Nowadays with the power that the source control systems give us we should avoid keeping old code, or code that is not used, at all costs. Let the source control system do its magic.
Bad Code
public int MainMethod(){
var x = 0;
x += Method1();
return Method2();
}
private int Method1(){
// Do something
return 4;
}
private int Method2(){
// Do something
return 5;
}
It is clearly obvious that the x variable is not used for anything purposely. Perhaps the old code was reading something like x += Method2(); return x;. You can only guess the reason for its existence! This is a very sloppy way of changing the MainMethod to only return the value of the Method2. Below you can see the correctly refactored code.
Bad Code
public int MainMethod(){
// here we do something extremely bizzare
Method1();
// here we return the result of something that looks like a dinosaur
return Method2();
}
private int Method1(){
// Do something
return 4;
}
private int Method2(){
// Do something
return 5;
}
Right from the very top the MainMethod contains comments that try to explain what the methods below them do. The code does not speak for itself, because in this case the name of the methods should state exactly what the comments are stating, making the comments futile and useless.
Good Code
public int MainMethod(){
DoSomethingExtremelyBizzare();
return SomethingThatLooksLikeADinosaur();
}
private int DoSomethingExtremelyBizzare(){
// Do something
return 4;
}
private int SomethingThatLooksLikeADinosaur(){
// Do something
return 5;
}
Bad Code
public int MainMethod(){
Method1(); return Method2();
}
private int Method1(){
// Do something
return 4;
}
private int Method2(){
// Do something
return 5;
}
If I take one look at this code my first question is where does Method1 end? Where does Method2 start? The code doesn’t allow me to easily distinguish this important information right away, instead I have to scout for the end curly braces myself. Nowadays most IDEs have a simple key combination that allow you to correctly indent all the file at once. In Visual Studio for example: Ctrl+K, Ctrl+D.
Bad Code
public int MainMethod(){
Method1();
return Method2();
}
private int Method1(){
return 4;
}
private int Method2(){
return 5;
}
As I look at this code I have no idea what 4 and 5 actually mean or why are they there. I can try to guess but the code gives me no clue. However, if I replace these number with constants that have a name, I can easily convey my message to the code reader about what does 4 or 5 mean.
Good Code
private const int NumberOfDolphins = 4;
private const int NumberOfWhales = 5;
public int MainMethod(){
Method1();
return Method2();
}
private int Method1(){
return NumberOfDolphins;
}
private int Method2(){
return NumberOfWhales;
}
Bad Code
private const int NOD = 4;
private const int NOW = 5;
public int MainMethod(){
return Method1() + Method2();
}
private int Method1(){
return NOD;
}
private int Method2(){
return NOW;
}
What does NOD mean? What does NOW mean? Is it referring to the actual time or what? What does Method1 do? What does Method2 do? I can’t tell. The function names are not conveying any message in regard to what they actually accomplish. The constant names do not represent eligible meanings.
Good Code
private const int NumberOfDolphins = 4;
private const int NumberOfWhales = 5;
public int GetTotalAquaticInhabitants(){
return GetNumberOfDolphins() + GetNumberOfWhales();
}
private int GetNumberOfDolphins(){
return NumberOfDolphins;
}
private int GetNumberOfWhales(){
return NumberOfWhales;
}
Now I can clearly understand what this code does without actually having to search for specs in regard to the functionality. I can read it like prose. If I need to change the number of dolphins I know exactly what constant I need to set. In case we have 4 tanks, each of them holding 4 dolphins I can easily change the GetNumberOfDolphins method in order to relay this information correctly.
Bad Code
public int MainMethod(){
return Method1() + Method2(false);
}
private int Method1(){
int x = 0;
for (int i = 0; i < 100; i++){
bool downstream = true; // Due to change request #123 this is always true! i < 50;
if (downstream){
x -= i % 10;
// do some more processing
}
else{
x += i % 10;
// do some more processing
}
}
return x;
}
private int Method2(bool downstream){
int y = downstream ? 100 : 0;
return y;
}
As we can see the if in Method1 is no longer useful, it was even commented to point out that it will always be true. Apparently the developer thought this code might be useful in the future, but that is rarely the case. We simply remove the unneeded code.
Good Code
public int MainMethod(){
return Method1() + Method2(false);
}
private int Method1(){
int x = 0;
for (int i = 0; i < 100; i++){
x -= i % 10;
// do some more processing
}
return x;
}
private int Method2(bool downstream){
int y = downstream ? 100 : 0;
return y;
}
Same rule can be applied to Method2, if we can make sure that it is never called with different Boolean values for the downstream parameter.
Another good example for unneeded code are methods that are no longer part of the execution path of the software system.
Bad Code
public int MainMethod(){
return Method2(false);
}
private int Method1(){
int x = 0;
for (int i = 0; i < 100; i++){
x -= i % 10;
// do some more processing
}
return x;
}
private int Method2(bool downstream){
int y = downstream ? 100 : 0;
for (int j = 0; j < 100; j++){
if (downstream)
y -= j % 5;
else
y += j % 5;
}
return y;
}
As we can see the Method1 is never called throughout the execution of the program. So we can safely remove it.
Good Code
public int MainMethod(){
return Method2(false);
}
private int Method2(bool downstream){
int y = downstream ? 100 : 0;
for (int j = 0; j < 100; j++){
if (downstream)
y -= j % 5;
else
y += j % 5;
}
return y;
}
There is no reason to have code that is no longer used or that never runs, ever!