How To Time Managed Code Using QueryPerformanceCounter and QueryPerformanceFrequency

From Guidance Share
Revision as of 05:26, 17 December 2007 by JD (talk | contribs)
(diff) ←Older revision | Current revision (diff) | Newer revision→ (diff)
Jump to navigationJump to search

- J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman


Summary

This How To shows you how to create a managed wrapper class to encapsulate the Microsoft® Win32® functions QueryPerformanceCounter and QueryPerformanceFrequency. You can use this class to time the execution of managed code. This How To also provides examples that show you how to use the class to measure the overhead associated with boxing and string concatenation.


Applies To

Microsoft .NET Framework version 1.1


Overview

You can use the Win32 functions QueryPerformanceCounter and QueryPerformanceFrequency to measure the performance of your code to nanosecond accuracy. For comparison, a nanosecond (ns or nsec) is one billionth (10-9) of a second. A millisecond (ms or msec) is one thousandth of a second.

Note At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") provides a wrapper to simplify using QueryPerformanceCounter and QueryPerformanceFrequency.


Creating a QueryPerfCounter Wrapper Class

In this step, you will create a wrapper class to encapsulate the Win32 function calls used to obtain performance information.

To create the wrapper class

Use Microsoft Visual Studio® .NET or any text editor to create a new C# file named QueryPerfCounter.cs. Add an empty class named QueryPerfCounter as shown.

  public class QueryPerfCounter
  {
  }

Add a using statement to reference System.Runtime.InteropServices so that you can make calls to native Win32 functions.


  using System.Runtime.InteropServices;


Create the declarations to call the QueryPerformanceCounter and QueryPerformanceFrequency Win32 APIs as shown.


  [DllImport("KERNEL32")]
  private static extern bool QueryPerformanceCounter(out long lpPerformanceCount);
  [DllImport("Kernel32.dll")]
  private static extern bool QueryPerformanceFrequency(out long lpFrequency);


Add a constructor. In the constructor, call QueryPerformanceFrequency, passing a global variable to hold a value that will be used to calculate a duration in nanoseconds.


  private long frequency;
  public QueryPerfCounter()
  {
    if (QueryPerformanceFrequency(out frequency) == false)
    {
      // Frequency not supported
      throw new Win32Exception();
    }
  }


Create a Start method that gets the current value from QueryPerformanceCounter. Use a global variable to store the retrieved value.


  public void Start()
  {
    QueryPerformanceCounter(out start);
  }


Create a Stop method that gets the current value from QueryPerformanceCounter. Use another global variable to store the retrieved value.


  public void Stop()
  {
    QueryPerformanceCounter(out stop);
  }

Create a Duration method that accepts the number of iterations as an argument and returns a duration value. Use this method to calculate the number of ticks between the start and stop values. Next, multiply the result by the frequency multiplier to calculate the duration of all the operations, and then divide by the number of iterations to arrive at the duration per operation value.


  public double Duration(int iterations)
  {
    return ((((double)(stop - start)* 
              (double) multiplier) / 
              (double) frequency)/iterations);
  }

Your code in QueryPerfCounter.cs should resemble the following.

  QueryPerfCounter.cs


  // QueryPerfCounter.cs
  using System;
  using System.ComponentModel;
  using System.Runtime.InteropServices;


  public class QueryPerfCounter
  {
    [DllImport("KERNEL32")]
    private static extern bool QueryPerformanceCounter(
      out long lpPerformanceCount);
    [DllImport("Kernel32.dll")]
    private static extern bool QueryPerformanceFrequency(out long lpFrequency);
    private long start;
    private long stop;
    private long frequency;
    Decimal multiplier = new Decimal(1.0e9);
    public QueryPerfCounter()
    {
      if (QueryPerformanceFrequency(out frequency) == false)
      {
        // Frequency not supported
        throw new Win32Exception();
      }
    }
    public void Start()
    {
      QueryPerformanceCounter(out start);
    }
    public void Stop()
    {
      QueryPerformanceCounter(out stop);
    }
    public double Duration(int iterations)
    {
      return ((((double)(stop - start)* (double) multiplier) / (double)           frequency)/iterations);
    }
  }To compile the code, use the following command line.


  csc.exe /out:QueryPerfCounter.dll /t:library /r:System.dll QueryPerfCounter.cs


Using the Wrapper Class

To use the QueryPerfCounter wrapper class in your code, you need to reference QueryPerfCounter.dll and then instantiate the QueryPerfCounter class. Your client code should resemble the following.


  QueryPerfCounter myTimer = new QueryPerfCounter();
  // Measure without boxing
  myTimer.Start();
  for(int i = 0; i < iterations; i++)
  {
    // do some work to time
  }
  myTimer.Stop();
  // Calculate time per iteration in nanoseconds
  double result = myTimer.Duration(iterations);


The following sections show examples of how to use the wrapper to time the execution of managed code.


Validating Your QueryPerfCounter Class

In the following example, you will validate your QueryPerfCounter class by creating a simple console application. The application puts a thread to sleep for a specified time so that you can compare the results against your own timing results.

The following example code puts a thread to sleep for one second and loops five times. As a result, each iteration should take one second, and the total duration should be five seconds.

  ValidateQueryPerfCounter.cs


  // ValidateQueryPerfCounter.cs
  using System;
  public class ValidateQueryPerfCounter
  {
    public static void Main()
    {
      RunTest();
    }
 
    public static void RunTest()
    {
      int iterations=5;
      // Call the object and methods to JIT before the test run
      QueryPerfCounter myTimer = new QueryPerfCounter();
      myTimer.Start();
      myTimer.Stop();
      // Time the overall test duration
      DateTime dtStartTime = DateTime.Now;
      // Use QueryPerfCounters to get the average time per iteration
      myTimer.Start();
      for(int i = 0; i < iterations; i++)
      {
        // Method to time
        System.Threading.Thread.Sleep(1000);
      }
      myTimer.Stop();
      // Calculate time per iteration in nanoseconds
      double result = myTimer.Duration(iterations);
      // Show the average time per iteration results
      Console.WriteLine("Iterations: {0}", iterations);
      Console.WriteLine("Average time per iteration: ");
      Console.WriteLine(result/1000000000 + " seconds");
      Console.WriteLine(result/1000000 + " milliseconds");
      Console.WriteLine(result + " nanoseconds");
      // Show the overall test duration results
      DateTime dtEndTime = DateTime.Now;
      Double duration = ((TimeSpan)(dtEndTime-dtStartTime)).TotalMilliseconds;
      Console.WriteLine();
      Console.WriteLine("Duration of test run: ");
      Console.WriteLine(duration/1000 + " seconds");
      Console.WriteLine(duration + " milliseconds");
      Console.ReadLine();
    }
  }

To compile the code above, use the following command line.

  csc.exe /out:ValidateQueryPerfCounter.exe /r:System.dll,QueryPerfCounter.dll /t:exe      ValidateQueryPerfCounter.cs


Note the reference to the QueryPerfCounter.dll assembly that you built earlier.


Results

When you run ValidateQueryPerfCounter.exe, the output will resemble the following.


  Iterations: 5
  Average time per iteration:
  0.999648279320416 seconds
  999.648279320416 milliseconds
  999648279.320416 nanoseconds
  Duration of test run:
  5.137792 seconds
  5137.792 milliseconds


Example A: Boxing Overhead

In the following console application example, you will use your wrapper class, QueryPerfCounter, from your QueryPerfCounter.dll to measure the performance cost of boxing an integer.

  BoxingTest.cs


  // BoxingTest.cs
  using System;
  public class BoxingTest
  {
    public static void Main()
    {
      RunTest();
    }
    public static void RunTest()
    {
      int iterations=10000;
      // Call the object and methods to JIT before the test run
      QueryPerfCounter myTimer = new QueryPerfCounter();
      myTimer.Start();
      myTimer.Stop();
      // variables used for boxing/unboxing
      object obj = null;
      int value1 = 12;
      int value2 = 0;
      // Measure without boxing
      myTimer.Start();
      for(int i = 0; i < iterations; i++)
      {
        // a simple value copy of an integer to another integer
        value2 = value1;
      }
      myTimer.Stop();
 
      // Calculate time per iteration in nanoseconds
      double result = myTimer.Duration(iterations);
      Console.WriteLine("int to int (no boxing): " + result + " nanoseconds");
      // Measure boxing
      myTimer.Start();
      for(int i = 0; i < iterations; i++)
      {
        // point the object to a copy of the integer
        obj = value1;
      }
      myTimer.Stop();
 
      // Calculate time per iteration in nanoseconds
      result = myTimer.Duration(iterations);
      Console.WriteLine("int to object (boxing): " + result + " nanoseconds");
      // Measure unboxing
      myTimer.Start();
      for(int i = 0; i < iterations; i++)
      {
        // copy the integer value from the object to a second integer
        value2 = (int)obj;
      }
      myTimer.Stop();
 
      // Calculate time per iteration in nanoseconds
      result = myTimer.Duration(iterations);
      Console.WriteLine("object to int (unboxing): " + result + " nanoseconds");
      Console.ReadLine();
    }
  }

Compiling the Sample

To compile the code, use the following command line.


  csc.exe /out:BoxingTest.exe /r:System.dll,QueryPerfCounter.dll /t:exe BoxingTest.cs


Results

Run BoxingTest.exe. The results show you the overhead when boxing occurs.


  int to int (no boxing): 1.22920650529606 nanoseconds
  int to object (boxing): 77.132708207328 nanoseconds
  object to int (unboxing): 2.87746068285215 nanoseconds 

In the scenario above, an additional object is created when the boxing occurs.


Example B: String Concatenation

In this example, you will use the QueryPerfCounter class to measure the performance impact of concatenating strings. This example allows you to increase iterations so that you can observe the impact as the number of iterations grows.

  StringConcatTest.cs


  // StringConcatTest.cs
  using System;
  using System.Text;
  public class StringConcatTest
  {
    public static void Main()
    {
      RunTest(10);
      RunTest(100);
    }
    public static void RunTest(int iterations)
    {
      // Call the object and methods to JIT before the test run
      QueryPerfCounter myTimer = new QueryPerfCounter();
      myTimer.Start();
      myTimer.Stop();


      Console.WriteLine("");
      Console.WriteLine("Iterations = " + iterations.ToString());
     Console.WriteLine("(Time shown is in nanoseconds)");
      // Measure StringBuilder performance
      StringBuilder sb = new StringBuilder("");
      myTimer.Start();
      for (int i=0; i<iterations; i++)
      {
        sb.Append(i.ToString());
      }
      myTimer.Stop();
 
      // Pass in 1 for iterations to calculate overall duration
      double result = myTimer.Duration(1);
      Console.WriteLine(result + " StringBuilder version");
      // Measure string concatenation
      string s = string.Empty;
      myTimer.Start();
      for (int i=0; i<iterations; i++)
      {
        s += i.ToString();
      }
      myTimer.Stop();
 
      // Pass in 1 for iterations to calculate overall duration
      result = myTimer.Duration(1);
      Console.WriteLine(result + " string concatenation version");
      Console.ReadLine();
    }
  }


Compiling the Sample

To compile the code, use the following command line.

  csc.exe /out:StringConcat.exe /r:System.dll,QueryPerfCounter.dll /t:exe StringConcat.cs


Results

With a small number of concatenations, the benefits of using StringBuilder are less obvious. However, with a hundred concatenations, the difference is more apparent. For example:

  10 Iterations


  Iterations = 10
  12292.0650529606 StringBuilder version
  20393.6533833211 string concatenation version
  100 Iterations


  Iterations = 100
  62019.0554944832 StringBuilder version
  112304.776165686 string concatenation version