In previous posts I have talked about some great new features in the newest (2013 Q2) release of JustTrace. The new features are terrific and certainly help make JustTrace an indispensable part of any .NET developer’s toolbox, but three of them are so cool that they deserve their own individual blog posts: the Disposed Object View, the Potential Binding Leaks in WPF View, and the Bottlenecks View.

I have covered the Disposed Object View and the Potential Binding Leaks in WPF View in the previous posts in this series. In this post I’ll demonstrate the Bottlenecks View and show you how a bottleneck can degrade the performance of your .NET application.

What is a Bottleneck?

A bottleneck is a condition where a large amount of your application’s workflow flows through a single component or a limited number of components. Sometimes these components are abstractions of resources with a limited amount of bandwidth, such as a file system or a database connection. If you were to map out a bottleneck using a diagram, it would look something like this:

image

Figure 1 – A diagram of a bottleneck

In this case we have three “clients”, each of which could be a component, service or interface in your application, that need to interact with various “worker” objects or methods. They must all go through the bottleneck in the center. As the number of clients and calls increases, the strain on the bottleneck increases and creates a drag on application performance.

Bottlenecks can be difficult to find in your code because on the surface nothing seems wrong or out of place. In most cases the code that leads to a bottleneck situation appears to be well designed and written .NET code. Most developers work to write their applications as a set of reusable components, so a method that can be leveraged and called from several points is good design.

Bottlenecks differ from hotspots in that a hotspot is a computationally expensive method, but is not necessarily called from multiple unique places in the application. In some cases having a bottleneck or hotspot is OK. An application that performs a complex computation will likely have the methods that perform these calculations identified as hotspots. This is expected; they are the methods doing most of the work. In this case these methods are hotspots by design.

This can also be true of bottlenecks. If an application needs to interact with a computer’s file system, the class that abstracts the file system is a bottleneck by design; if there were no gate between the application and the file system hardware, then the hardware would quickly become overwhelmed and errors would result. Even a bottleneck in code that is not protecting a limited resource, like the hardware that controls the file system, can be acceptable if the impact on the execution of the application is negligible or the architectural benefits outweigh that impact. However, problems can manifest as the application grows and gains more users. Issues related to unnecessary and unintended bottlenecks can grow quickly, and without a profiler these types of issues can be difficult to locate.

Find the Bottlenecks in Your Code With JustTrace

A new feature for the 2013 Q2 release of JustTrace is the bottleneck view. This view shows any methods in your code that consume a high percentage of processor time and are called from more than one unique parent. In addition to showing you which methods might be causing bottlenecks, JustTrace’s bottleneck view shows you which methods are calling your potential bottleneck method; this set of callers is sometimes referred to as the “fan-in”. In some cases, particularly cases where the fan-in methods are code you or someone on your team has written, it may be possible to tune your application to remove the bottleneck.

To demonstrate this, I’ve created a simple console application that takes a list of numbers in the Fibonacci sequence and sums them. The algorithm to get the Fibonacci sequence resides in a class called “Engine” and is pretty straightforward:

    using System;

    namespace BottleNeckView
    {
        public sealed class Engine
        {
            // Public API: returns the number at position n of the Fibonacci sequence.
            public ulong Fibonacci(long n)
            {
                return DoWork(n);
            }

            // Walks the sequence from the start, one loop iteration per position.
            private ulong DoWork(long n)
            {
                ulong X0 = 0;
                ulong X1 = 1;
                for (var idx = 0; idx < n; idx++)
                {
                    var step = X0;
                    X0 = X1;
                    X1 = step + X1;
                }
                return X0;
            }
        }
    }

For this example I’m replicating a scenario very common in most applications. I have a class with a public API (the Fibonacci method) that delegates work to an internal private method or methods, in this case the “DoWork” method. This is a very simple example, but in an actual line of business application the Fibonacci method may represent a method that performs a workflow by calling a series of steps implemented by a series of private methods, some of which are probably also leveraged by methods on the public API. There is nothing wrong with this. It is a great design pattern that enhances reuse of code and hides unnecessary implementation details from the consumer.

Speaking of the consumer, the console application consists of three methods which are responsible for calling the Fibonacci method on the Engine and summing the results:

    // Single Engine instance shared by all three client methods.
    private static readonly Engine engine = new Engine();

    private static void DoShortFib()
    {
        ulong shortTotal = 0;
        for (var idx = 0; idx < 100000; idx++)
        {
            shortTotal += engine.Fibonacci(idx);
        }
        Console.WriteLine("Short: done");
    }

    private static void DoMidFib()
    {
        ulong midTotal = 0;
        for (var idx = 0; idx < 200000; idx++)
        {
            midTotal += engine.Fibonacci(idx);
        }
        Console.WriteLine("Mid: done");
    }

    private static void DoLongFib()
    {
        ulong longTotal = 0;
        for (var idx = 0; idx < 300000; idx++)
        {
            longTotal += engine.Fibonacci(idx);
        }
        Console.WriteLine("Long: done");
    }

As you can see, there are short, medium and long versions of the method, each of which calls the Fibonacci method on a static instance of the Engine class. They request every position in the sequence from 0 up to (but not including) 100,000, 200,000 and 300,000 respectively. Looking at this code you will probably notice that it’s very repetitive. Normally these methods would be refactored into one method containing a loop that calls the Fibonacci method on the Engine class, with the number of positions to work through passed in as an argument. For the purposes of this demo they are intended to represent three unique workflows (as methods) that make calls to the Fibonacci method on the Engine object. In a normal line-of-business application it’s not uncommon to have three (or more) methods that all call the same method to perform some work.
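For reference, the refactoring described above might look something like the following. This is only a sketch: the SumFibonacci name and the RefactoredClient class are hypothetical, and the Engine is a compact copy of the loop-based listing so the example compiles on its own.

```csharp
using System;

// Compact copy of the loop-based Engine from the listing above.
public sealed class Engine
{
    public ulong Fibonacci(long n)
    {
        ulong x0 = 0;
        ulong x1 = 1;
        for (long idx = 0; idx < n; idx++)
        {
            var step = x0;
            x0 = x1;
            x1 = step + x1;
        }
        return x0;
    }
}

public static class RefactoredClient
{
    private static readonly Engine engine = new Engine();

    // Hypothetical helper: sums Fibonacci(0) through Fibonacci(count - 1).
    public static ulong SumFibonacci(int count)
    {
        ulong total = 0;
        for (var idx = 0; idx < count; idx++)
        {
            total += engine.Fibonacci(idx);
        }
        return total;
    }

    public static void Main()
    {
        // A small count keeps the sketch instant; the demo uses 100000, 200000 and 300000.
        Console.WriteLine("Sum of the first 10: {0}", SumFibonacci(10)); // prints 88
    }
}
```

With a single caller like this the profiler would no longer see three distinct parents, which is why the rest of this post keeps the three separate methods.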

The Main method of the console application calls these three methods in sequence and also tracks how long the application takes to complete:

    static void Main(string[] args)
    {
        var startTime = DateTime.Now;

        // Run the three workflows one after another.
        DoShortFib();
        DoMidFib();
        DoLongFib();

        Console.WriteLine("Done");
        var endTime = DateTime.Now;
        var elapsed = endTime - startTime;
        Console.WriteLine("Elapsed time: {0}", elapsed);
        Console.ReadKey();
    }
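Subtracting two DateTime.Now values is fine for a demo that runs for minutes. For shorter runs, System.Diagnostics.Stopwatch is usually the better tool, since it uses a high-resolution timer where one is available and is not affected by system clock adjustments. A minimal sketch follows; the Measure helper and TimingSketch class are hypothetical names, and Thread.Sleep stands in for the three Do*Fib calls.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class TimingSketch
{
    // Hypothetical helper: runs the given work and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action work)
    {
        var stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        // Thread.Sleep is a stand-in for DoShortFib(); DoMidFib(); DoLongFib();
        var elapsed = Measure(() => Thread.Sleep(50));
        Console.WriteLine("Elapsed time: {0}", elapsed);
    }
}
```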

In order to find the bottleneck in this code I need to select Performance Profiling from the JustTrace toolbar in Visual Studio:

image

Figure 2 – The Performance Profiler option is selected

The application runs without debugging (CTRL+F5) and the results are displayed on the console:

image

Figure 3 – The application has completed

The next step is to look at the results JustTrace collected. On the JustTrace tab in Visual Studio you will see a new view listed at the bottom of the “Caller Trees” section called “Bottlenecks”:

image

Figure 4 – The Bottlenecks view option in the JustTrace Caller Trees section

By clicking on “Bottlenecks” I can see that I do in fact have a bottleneck in my application:

image

Figure 5 – There is a bottleneck in the application

JustTrace has tagged the Fibonacci method on the Engine class as a bottleneck. This makes sense; it is a single method taking up a lot of the processor time. But that alone doesn’t make it a bottleneck. For that it would need to be called from several different places in the application. By expanding the row in the Bottleneck View we can see that it in fact does get called from several different places in the application:

image

Figure 6 – The unique parents of the bottleneck call

JustTrace has helped us find the bottleneck in our code. If I were to create a diagram similar to Figure 1 for this application, it would look like this:

image

Figure 7 – The Fibonacci method as a bottleneck

Now it’s time for some refactoring.

Optimizing the Bottlenecks Away

We have a couple of different options to optimize this code. The first and probably most obvious is to optimize the DoWork method on the Engine. If you recall from the code listing above, it uses a loop to determine which number of the Fibonacci sequence is at a particular position. Each call walks the sequence from the beginning, so the cost of a call grows with n, and we’re making hundreds of thousands of calls. Luckily there is a more efficient mathematical way to determine which number in the Fibonacci sequence is at the position we are interested in:

image
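In case the image does not render, the closed-form expression can be written out as follows. The symbols phi and psi are chosen here to match the field names used in the revised Engine class further down.

```latex
F_n = \frac{\varphi^{n} - \psi^{n}}{\sqrt{5}},
\qquad \varphi = \frac{1 + \sqrt{5}}{2},
\qquad \psi = \frac{1 - \sqrt{5}}{2}
```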

This is Binet’s formula and it provides a much more efficient way (well, more efficient for computers, which are good at math) to find the particular value of the Fibonacci sequence for a specified location in the sequence, represented by n in the formula. To utilize this formula I’ve made the following changes to the Engine class:

    using System;

    namespace BottleNeckView
    {
        public sealed class Engine
        {
            // Constants of Binet's formula, computed once per instance.
            private readonly double _sqrt5 = Math.Sqrt(5);
            private readonly double _phi;
            private readonly double _psi;

            public Engine()
            {
                _phi = (1 + _sqrt5) / 2;
                _psi = (1 - _sqrt5) / 2;
            }

            public ulong Fibonacci(long n)
            {
                return DoWork(n);
            }

            // Closed-form calculation: no loop, so the cost no longer grows with n.
            // Note: a double carries 53 bits of precision, so the result is only
            // exact for smaller values of n.
            private ulong DoWork(long n)
            {
                var result = ((Math.Pow(_phi, n) - Math.Pow(_psi, n)) / _sqrt5);
                // Round away floating-point noise before converting to an integer.
                return (ulong)Math.Round(result, 2);
            }
        }
    }

Since the square root of five, phi (the base of the left-hand term of the numerator) and psi (the base of the right-hand term) are all constants, I declared them as read-only instance fields and populated them with the appropriate values. From there it’s a simple matter of plugging my value for n into the equation and getting the number. When it’s run, this version of the application performs much faster than the previous one:

image

Figure 8 – The use of Binet’s formula is more efficient than loops.
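One caveat is worth checking before making a swap like this in real code. A double carries 53 bits of precision while a ulong holds 64, so the closed form cannot be exact for every Fibonacci number that fits in a ulong. (For positions above 93 the loop version no longer fits in a ulong either and wraps around silently under C#’s default unchecked arithmetic, which does not matter for a timing demo but would for real results.) The sketch below copies both DoWork bodies into hypothetical static helpers and reports the first position at which they disagree; the BinetCheck class and its method names are assumptions made for illustration.

```csharp
using System;

public static class BinetCheck
{
    // Loop-based version, copied from the first Engine listing.
    public static ulong Iterative(long n)
    {
        ulong x0 = 0;
        ulong x1 = 1;
        for (long idx = 0; idx < n; idx++)
        {
            var step = x0;
            x0 = x1;
            x1 = step + x1;
        }
        return x0;
    }

    // Closed-form version, copied from the revised Engine listing.
    public static ulong Binet(long n)
    {
        double sqrt5 = Math.Sqrt(5);
        double phi = (1 + sqrt5) / 2;
        double psi = (1 - sqrt5) / 2;
        var result = (Math.Pow(phi, n) - Math.Pow(psi, n)) / sqrt5;
        return (ulong)Math.Round(result, 2);
    }

    // Returns the first n in [0, limit] where the two versions differ, or -1 if none do.
    public static long FirstMismatch(long limit)
    {
        for (long n = 0; n <= limit; n++)
        {
            if (Iterative(n) != Binet(n))
            {
                return n;
            }
        }
        return -1;
    }

    public static void Main()
    {
        // 93 is the largest position whose Fibonacci number fits in a ulong.
        Console.WriteLine("First mismatch at or below 93: {0}", FirstMismatch(93));
    }
}
```

If the reported position is lower than the largest n your application needs, the loop (or an integer-only fast method) is the safer choice despite the extra cost.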

Clearly the use of Binet’s formula is much more efficient than looping through the whole Fibonacci sequence to find the number I’m looking for. The previous version of this application took over four minutes; now it takes about a second. Let’s take a look at the bottleneck view:

image

Figure 9 – A bottleneck still exists

Despite the fact that the application is much faster than before, we still appear to have a bottleneck. And this makes sense; I made the methods in the Engine class faster but I did nothing to change the architectural topology that routes a large portion of our application’s workflow through one place. The instance of the Engine class I use in Main is still a shared static instance. Yes, it’s faster. Yes, it’s MUCH faster. Right now. But as the application grows and becomes more complex, the potential for a bottleneck that creates a meaningful performance drain remains. It’s clear I still have a bottleneck to deal with.

Make Your Application Like a Super-Freeway

This particular problem can be solved by using the parallel programming features of .NET. Because the client methods (DoShortFib, DoMidFib and DoLongFib) do not depend on each other for any information, don’t create any potential resource lock situations and can run in any order, they are perfect candidates to be run in parallel. Some refactoring of the Main method is required:

    // Requires: using System.Collections.Generic; using System.Threading.Tasks;
    static void Main(string[] args)
    {
        var startTime = DateTime.Now;

        // One Action per client method.
        var actions = new List<Action>();
        actions.Add(new Action(DoShortFib));
        actions.Add(new Action(DoMidFib));
        actions.Add(new Action(DoLongFib));

        // Runs the actions in parallel and blocks until all of them have finished.
        Parallel.ForEach(actions, x => { x.Invoke(); });

        Console.WriteLine("Done");
        var endTime = DateTime.Now;
        var elapsed = endTime - startTime;
        Console.WriteLine("Elapsed time: {0}", elapsed);
        Console.ReadKey();
    }

To enable the client methods to run in parallel I’ve first had to create a list of Action objects. That list is then populated with new Action objects that will run each of the three client methods.
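When the set of methods is fixed and known up front, as it is here, Parallel.Invoke is a slightly shorter alternative to building a list and calling Parallel.ForEach. The sketch below is not the code profiled in this post: the three methods are trivial stand-ins for the real client methods, and the ParallelInvokeSketch class, RunAll method and completion counter are hypothetical additions so the example can be run and checked on its own.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelInvokeSketch
{
    // Counts finished workflows; updated atomically because they run concurrently.
    private static int completed;

    public static int Completed
    {
        get { return completed; }
    }

    // Stand-ins for the real DoShortFib, DoMidFib and DoLongFib methods.
    private static void DoShortFib() { Finish("Short"); }
    private static void DoMidFib() { Finish("Mid"); }
    private static void DoLongFib() { Finish("Long"); }

    private static void Finish(string name)
    {
        Interlocked.Increment(ref completed);
        Console.WriteLine("{0}: done", name);
    }

    public static void RunAll()
    {
        // Runs the three methods, potentially in parallel, and blocks until all have finished.
        Parallel.Invoke(DoShortFib, DoMidFib, DoLongFib);
    }

    public static void Main()
    {
        RunAll();
        Console.WriteLine("Done ({0} workflows)", Completed);
    }
}
```

The order of the "done" lines can vary from run to run, which is expected when the workflows execute concurrently.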

I also want to eliminate the dependency on a static instance of the Engine class, so I’ll refactor the DoShortFib, DoMidFib and DoLongFib methods to accommodate this:

    private static void DoShortFib()
    {
        // Each workflow now owns its Engine instead of sharing a static one.
        var engine = new Engine();
        ulong shortTotal = 0;
        for (var idx = 0; idx < 100000; idx++)
        {
            shortTotal += engine.Fibonacci(idx);
        }
        Console.WriteLine("Short: done");
    }

    private static void DoMidFib()
    {
        var engine = new Engine();
        ulong midTotal = 0;
        for (var idx = 0; idx < 200000; idx++)
        {
            midTotal += engine.Fibonacci(idx);
        }
        Console.WriteLine("Mid: done");
    }

    private static void DoLongFib()
    {
        var engine = new Engine();
        ulong longTotal = 0;
        for (var idx = 0; idx < 300000; idx++)
        {
            longTotal += engine.Fibonacci(idx);
        }
        Console.WriteLine("Long: done");
    }

Each method is creating its own instance of the Engine class. If you are a practitioner of TDD you no doubt see a problem here: a tight dependency on the Engine class, since it’s being instantiated in each method. In a normal application that I would be building using TDD, I would make sure that this dependency is supplied at runtime by using a Dependency Injection framework like Ninject or StructureMap. If I couldn’t do that I could still test this using the elevated mocking capabilities of JustMock. In either case there’s no excuse for not testing your code, so make sure you have your tests!
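To make the dependency point concrete, here is a rough sketch of supplying the engine from outside by hand, without any framework. Everything named here is hypothetical and not part of the demo application: the IEngine interface, the FibonacciSummer class and the ConstantEngine fake exist only to illustrate the shape. A container such as Ninject or StructureMap would take over the job of constructing FibonacciSummer with a real engine.

```csharp
using System;

// Hypothetical abstraction over the Engine class.
public interface IEngine
{
    ulong Fibonacci(long n);
}

// Loop-based engine from the first listing, now implementing the interface.
public sealed class Engine : IEngine
{
    public ulong Fibonacci(long n)
    {
        ulong x0 = 0;
        ulong x1 = 1;
        for (long idx = 0; idx < n; idx++)
        {
            var step = x0;
            x0 = x1;
            x1 = step + x1;
        }
        return x0;
    }
}

// Hand-written fake for tests: every position returns 1.
public sealed class ConstantEngine : IEngine
{
    public ulong Fibonacci(long n)
    {
        return 1;
    }
}

// Hypothetical client that receives its engine instead of creating it.
public sealed class FibonacciSummer
{
    private readonly IEngine _engine;

    public FibonacciSummer(IEngine engine)
    {
        if (engine == null)
        {
            throw new ArgumentNullException("engine");
        }
        _engine = engine;
    }

    // Sums Fibonacci(0) through Fibonacci(count - 1) using the supplied engine.
    public ulong Sum(int count)
    {
        ulong total = 0;
        for (var idx = 0; idx < count; idx++)
        {
            total += _engine.Fibonacci(idx);
        }
        return total;
    }
}

public static class InjectionSketch
{
    public static void Main()
    {
        Console.WriteLine(new FibonacciSummer(new Engine()).Sum(10));        // prints 88
        Console.WriteLine(new FibonacciSummer(new ConstantEngine()).Sum(5)); // prints 5
    }
}
```

Because the summing logic only sees IEngine, a test can verify it with the fake and never touch the real calculation.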

Also please note that giving each method its own Engine instance is not necessarily required. In many cases keeping the instance of Engine as a static object would be OK so long as I can lower the pressure the instance of Engine is under, meaning it takes a lower percentage of the processor time to complete its tasks. There are other types of refactoring I could potentially employ here, such as making the calls to the Fibonacci and/or DoWork methods asynchronous. In this case I can’t do that since I need the results of those methods immediately to complete my computation, but there are many cases where calling these asynchronously is not only possible, but advisable.

After this refactoring I’m now ready to test my application for bottlenecks again. When I run it I can see that the performance is even better than the previous version:

image 

Figure 10 – The application is a bit faster when run in parallel

The parallel version of the application completes in less than a second. Remember, though, that the previous change produced a huge performance gain without eliminating the bottleneck, so I’ll check the bottleneck view in JustTrace to see whether it still exists:

image

Figure 11 – No bottlenecks

Running my workflows in parallel has eliminated the bottleneck. This is because by running the Do*Fib methods in parallel I’ve enabled my application to use more resources, essentially widening the road from a one-lane road through the Engine object to a three-lane highway through multiple Engine objects. If I update the diagram in Figure 7 to reflect the changes, the new diagram would look like this:

image

Figure 12 – Each Do*Fib call runs in parallel with its own Engine

Each of the horizontal rows above runs as its own parallel task and has its own instance of Engine, thus eliminating the bottleneck. I can feel comfortable now that, as the complexity of my application increases, its performance will not be hindered by a bottleneck in the Engine class.

Summary

Code reuse and adherence to the Single Responsibility Principle are always something to strive for in designing and developing applications. However, as developers we have to keep an eye on application performance as well and understand the implications of having code that creates bottleneck conditions in our applications. JustTrace is a great tool that can help you find these issues in your applications, enabling you to make sure that your applications are not only well designed and built, but high performing as well. So download the 2013 Q2 version of JustTrace and check your application for bottlenecks. You may be surprised by what you find.

About the Author

James Bender is a Developer and has been involved in software development and architecture for almost 20 years. He has built everything from small, single-user applications to Enterprise-scale, multi-user systems. His specialties are .NET development and architecture, TDD, Web Development, cloud computing, and agile development methodologies. James is a Microsoft MVP and the author of two books: "Professional Test Driven Development with C#", which was released in May of 2011, and "Windows 8 Apps with HTML5 and JavaScript", which will be available soon. James has a blog at JamesCBender.com and his Twitter ID is @JamesBender.
