Nistor Marian is a senior software developer at Tecknoworks with an extensive experience in the development and implementation of software programs using various technologies.
The article is a continuation of "Eliminating Code Smells Through the Use of Simple Refactorings" where we wrote about refactoring through simple changes. You can access the article here: http://www.blog.tecknoworks.com/posts/code-smells-1/
In this part, we give a few tips and examples on refactoring through execution path changes.
Execution path changes. These are more complex refactorings that usually change the execution plan and path of the software system. They are usually applied in specific classes and methods. The execution path changes but the system is still working in pretty much the same specific manner. These kind of refactorings require you to have a mechanism in place that can detect when you are breaking the functionality of the original code (unit tests are a good way for you to achieve this).
Duplicated code. Usually can spawn one or more methods, not necessarily in the same class. By virtue, this code should be extracted in its own class and reused. It is also a sign that the single responsibility principle is broken for that class, but be careful if the duplicated code is not actually part of different responsibilities.
Bad Code
public int MainMethod(){
Method1();
return Method2();
}
private int Method1(){
int x = 0;
for (int i = 0; i < 100; i++){
x += i % 10;
}
return x;
}
private int Method2(){
int y = 0;
for (int j = 0; j < 100; j++){
y += j % 5;
}
return y;
}
I have marked the duplicated code in the snippet with bold characters. As you can see the code does almost exactly the same thing. In order to get rid of this duplicated code we can do an extract method that takes in a parameter which is the divisor (10 or 5 in our case).
Good Code
public int MainMethod(){
Method1();
return Method2();
}
private int Method1(){
return SumOfModulo(10);
}
private int Method2(){
return SumOfModulo(5);
}
private int SumOfModulo(int divisor){
int x = 0;
for (int i = 0; i < 100; i++){
x += i % divisor;
}
return x;
}
You can further improve this code by removing the unnecessary methods Method1 and Method2, since now you can simply call in the SumOfModulo method.
Long conditions. Usually conditions that span multiple lines of code or are extremely hard to understand/read. They should be split up into smaller Boolean variables each representing a different part of the condition. The reason is to make the code more readable and easier to understand.
Bad Code
public int MainMethod(){
return Method1();
}
private int Method1(){
int x = 0;
for (int i = 0; i < 100; i++){
if (i % 10 == 0 && i / 5 > 1 && i < 30 && i > 10)
x -= i % 10;
}
return x;
}
The bolded condition is not extremely hard to follow or to make sense of, but it could also contain bugs, hidden bugs, that we are not aware of. Instead of having such a complex condition we can choose to split it into smaller Boolean values so we easily understand the logic behind it.
Good Code
public int MainMethod(){
return Method1();
}
private int Method1(){
int x = 0;
for (int i = 0; i < 100; i++){
bool isDivisibleBy10 = i % 10 == 0;
bool isQuotientBiggerThan1WhenDividedBy5 = i / 5 > 1;
bool isInRange10To30 = i > 10 && i < 30;
if (isDivisibleBy10 && isQuotientBiggerThan1WhenDividedBy5 && isInRange10To30)
x -= i % 10;
}
return x;
}
Not only this makes for an easier read, but you can easily detect issues with the logic of the condition just by simply looking at it. In this case I can simply acknowledge that the code for the if statement is only run for the value of 20.
Imbricated for statements that do a lot of work. This is usually a performance smell especially when dealing with databases. The code needs to be changed either by incorporating the logic into the database query or by reading data before the for statements.
Bad Code
public class OrderData{
public string OrderNumber { get; set; }
public decimal TotalPrice { get; set; }
}
public List<OrderData> LoadOrders(){
var orderList = new List<OrderData>();
for (int i = 1; i <= 100; i++){
var data = LoadOrderDataFromDatabase(i);
orderList.Add(data);
}
return orderList;
}
private OrderData LoadOrderDataFromDatabase(int orderId){
var orderData = new OrderData();
var order = LoadOrderFromDatabase(orderId);
orderData.OrderNumber = order.OrderNumber;
var orderDetails = LoadOrderDetailsFromDatabase(orderId);
foreach (var detail in orderDetails){
var product = LoadProductFromDatabase(detail.ProductId);
orderData.TotalPrice += product.Price * detail.Quantity;
}
return orderData;
}
private dynamic LoadOrderFromDatabase(int orderId){
using (SqlConnection conn = new SqlConnection())
using (SqlCommand cmd = new SqlCommand("SELECT * FROM Orders WHERE OrderId = @orderId"){
return new { OrderNumber = orderId };
}
}
private dynamic LoadOrderDetailsFromDatabase(int orderId){
using (SqlConnection conn = new SqlConnection())
using (SqlCommand cmd = new SqlCommand("SELECT * FROM OrderDetails WHERE OrderId = @orderId")){
return new List<dynamic>() { new { ProductId = 1, Quantity = 1 }, new { ProductId = 2, Quantity = 2 } };
}
}
private dynamic LoadProductFromDatabase(int productId){
using (SqlConnection conn = new SqlConnection())
using (SqlCommand cmd = new SqlCommand("SELECT * FROM Products WHERE ProductId = @productId")){
return new { ProductId = productId, Price = 200m };
}
}
As we can see in the above code a list of orders is loaded and then we load the order details pertaining to each of those orders in order to calculate the Total value. This results in a lot of queries being done to the database especially on orders that have a lot of details. Furthermore, the price of the product in the detail of an order needs to be loaded from a different table, which incurs another connection and database query. In order to correct this code and improve its performance we are going to change code in order to reduce it to a single query to the database, which will return the list of orders.
Good Code
public class OrderData{
public string OrderNumber { get; set; }
public decimal TotalPrice { get; set; }
}
public List<OrderData> LoadOrders(){
var orderIds = Enumerable.Range(1, 100).ToList();
var orderList = LoadOrdersFromDatabase(orderIds);
return orderList;
}
private List<OrderData> LoadOrdersFromDatabase(List<int> orderId){
using (SqlConnection conn = new SqlConnection())
using (SqlCommand cmd = new SqlCommand(@"SELECT O.OrderNumber, SUM(OD.Quantity * P.Price) AS TotalPrice
FROM Orders O
LEFT JOIN OrderDetails OD ON O.OrderId = OD.OrderId
LEFT JOIN Products P ON OD.ProductId = P.ProductId
WHERE O.OrderId IN (....)
GROUP BY O.OrderNumber")){
return new List<OrderData>();
}
}
As we can see the code has been drastically reduced in size and a lot its functionality has been moved to the database query. But the most important thing to look at is the fact that there is only one query being done to the database. From a performance perspective this is a huge gain.
Complicated code. Just for the sake of complication. Remove the complication.
Bad Code
public void Method1(){
int i = 0;
int sumOfNumbers = 0;
while (i < 100){
sumOfNumbers += (i % 2 == 0 ? (i++ * 2) : i++);
}
}
This is a simple example of code that not only is very hard to understand but also very easy to change in a way that affects its functionality resulting in bugs.
Good Code
public void Method1(){
int i = 0;
int sumOfNumbers = 0;
while (i < 100){
bool isEven = i % 2 == 0;
sumOfNumbers += isEven ? (i * 2) : i;
i++;
}
}
As we can see this code is now clearer. All even numbers are doubled into the sum. Also we moved the increment of the i counter variable to its own line so that the statement above is clearer and easy to peek at. The key point here is to determine how much of the code is actually complicated and needs to be expanded/extracted.
Hidden functionalities inside properties getters and setters. Replace getter or setter with a method.
Bad Code
private List<string> messages = new List<string>();
private int numberOfMessages = 0;
public int NumberOfMessages{
get{
return numberOfMessages;
}
set{
numberOfMessages = value;
if (value >= 2)
messages.Clear();
}
}
public void AddNewMessage(){
messages.Add("I am test message");
NumberOfMessages++;
}
public void AddAnotherMessage(){
messages.Add("I am another test message, and I will clear the number");
NumberOfMessages = NumberOfMessages + 1;
}
public void MainMethod(){
AddNewMessage();
AddNewMessage();
AddNewMessage();
AddAnotherMessage();
}
As we can see in the above code example if we are setting the NumberOfMessages property to anything bigger than 2 the private messages list clears. This is something that the user of our code does not expect to happen, and has no way of knowing it happens. In order to correct this, we must either move the code from the property to its own method or create methods that incorporate this code and call them from the setter. We are going to go with the first choice.
Good Code
private List<string> messages = new List<string>();
private int numberOfMessages = 0;
public int NumberOfMessages{
get{
return numberOfMessages;
}
}
public void SetNumberOfMessagesAndClearListIfBiggerThan2(int nr){
numberOfMessages = nr;
if (nr >= 2)
messages.Clear();
}
public void AddNewMessage(){
messages.Add("I am test message");
SetNumberOfMessagesAndClearListIfBiggerThan2(NumberOfMessages + 1);
}
public void AddAnotherMessage(){
messages.Add("I am another test message, and I will clear the number");
SetNumberOfMessagesAndClearListIfBiggerThan2(NumberOfMessages + 1);
}
public void MainMethod(){
AddNewMessage();
AddNewMessage();
AddNewMessage();
AddAnotherMessage();
}
The code becomes much more robust and our user can correctly understand what happens if he should set a number of messages bigger than 2. The code speaks for itself and more importantly it doesn’t hide features that it exposes!