Useful null reference exceptions

As a .NET developer you’re guaranteed to have at some point run into the “Object reference not set to instance of an Object” message, or NullReferenceException. Encountering one without exception means there is a logic problem in your code, but finding what the supposed “instance of an Object” or target was can be quite hard. After all, the target was not assigned to the reference, so how do we know what should have been there ? In fact we can’t, as Eric Lippert explains.

However, just because we don’t know the target doesn’t mean we don’t know anything: in most cases there is some information we can obtain like:

  • type info: the type of the reference, and so the type of the target or at least what it inherits from (baseclass) or what it implements (interface)
  • operation info: the type of operation that caused the reference to be dereferenced

For example, in case the operation is a method call (callvirt), we know the C# compiler only compiles if it knows the methods are supported by the supposed target (through inheritance or directly), so with the info about the intended method we would have a good hint about what object is null.

According to MSFT, some other instructions that can throw NullReferenceException are: “The following Microsoft intermediate language (MSIL) instructions throw NullReferenceException: callvirt, cpblk, cpobj, initblk, ldelem.<type>, ldelema, ldfld, ldflda, ldind.<type>, ldlen, stelem.<type>, stfld, stind.<type>, throw, and unbox“. The amount of useful information will depend on the IL instruction context. However, currently none of this information is in the NullReferenceException: it just states some object reference was null, it was dereferenced, and in what method, nothing more, deal with it.

If you’re running debug code under Visual Studio this isn’t so much of a problem, as the debugger will break as soon as the dereferencing occurs (‘first chance’) and show you the source, but what about production code without source? Sure you can catch the exception and log it, but between the time it is thrown and the moment a catch handler is found, the runtime (CLR) is in control. At the time we are back in user code inside our catch handler, all we have to go on is the information in the NullReferenceException itself, which is essentially nothing. Debugging pros can attach WinDbg+SOS to a crashdump, but when the crash is the result of a ‘second chance exception’ (no exception handler was found) we’re at an even later stage. What we really want is a tool which can attach to the production process like a debugger, and get more info about first chance exceptions as they are thrown, as that’s the moment when all the info is available.

To give you an idea of the available info, the code below periodically throws a series of NullReferenceExceptions with a very different origin:

using System;
using System.Threading;

namespace TestNullReference
{
    interface TestInterface
    {
        void TestCall();
    }

    class TestClass
    {
        public int TestField;

        public void TestCall() { }
    }

    class TestClass2 : TestClass
    {   
    }

    class Program
    {
        #region methods to invoke null reference exceptions for various IL opcodes

        /// <summary>
        /// IL throw
        /// </summary>
        static void Throw()
        {
            throw null;
        }

        /// <summary>
        /// IL callvirt on interface
        /// </summary>
        static void CallVirtIf()
        {
            TestInterface i = null;
            i.TestCall();
        }

        /// <summary>
        /// IL callvirt on class
        /// </summary>
        static void CallVirtClass()
        {
            TestClass i = null;
            i.TestCall();
        }

        /// <summary>
        /// IL callvirt on inherited class
        /// </summary>
        static void CallVirtBaseClass()
        {
            TestClass2 i = null;
            i.TestCall();
        }

        /// <summary>
        /// IL ldelem
        /// </summary>
        /// <param name="a"></param>
        static void LdElem()
        {
            int[] array = null;
            var firstElement = array[0];
        }

        /// <summary>
        /// IL ldelema
        /// </summary>
        /// <param name="a"></param>
        static unsafe void LdElemA()
        {
            int[] array = null;
            fixed (int* firstElementA = &(array[0]))
            {                
            }
        }

        /// <summary>
        /// IL stelem
        /// </summary>
        /// <param name="a"></param>
        static void StElem()
        {
            int[] array = null;
            array[0] = 3;
        }

        /// <summary>
        /// IL ldlen
        /// </summary>
        /// <param name="a"></param>
        static void LdLen()
        {
            int[] array = null;
            var len = array.Length;
        }

        /// <summary>
        /// IL ldfld
        /// </summary>
        /// <param name="a"></param>
        static void LdFld()
        {
            TestClass c = null;
            var fld = c.TestField;
        }

        /// <summary>
        /// IL ldflda
        /// </summary>
        /// <param name="a"></param>
        static unsafe void LdFldA()
        {
            TestClass c = null;
            fixed (int* fld = &(c.TestField))
            {
                
            }
        }

        /// <summary>
        /// IL stfld
        /// </summary>
        /// <param name="a"></param>
        static void StFld()
        {
            TestClass c = null;
            c.TestField = 3;
        }

        /// <summary>
        /// IL unbox_any
        /// </summary>
        static void Unbox()
        {
            object o = null;
            var val = (int) o;
        }

        /// <summary>
        /// IL ldind
        /// </summary>
        static unsafe void LdInd()
        {
            int* valA = null;
           
            var val = *valA;
        }

        /// <summary>
        /// IL ldind
        /// </summary>
        static unsafe void StInd()
        {
            int* valA = null;

            *valA = 3;
        }
        #endregion

        static void LogNullReference(Action a)
        {
            try
            {
                a();
            }
            catch (NullReferenceException ex)
            {
                var msg = string.Format("NullReferenceException executing {0} : {1}", a.Method.Name, ex.Message);
                Console.WriteLine(msg);
            }
        }

        static void Main(string[] args)
        {
            while (!Console.KeyAvailable)
            {
                LogNullReference(Throw);

                LogNullReference(CallVirtIf);
                LogNullReference(CallVirtClass);
                LogNullReference(CallVirtBaseClass);

                LogNullReference(LdElem);
                LogNullReference(LdElemA);
                LogNullReference(StElem);
                LogNullReference(LdLen);

                LogNullReference(LdFld);
                LogNullReference(LdFldA);
                LogNullReference(StFld);

                LogNullReference(Unbox);

                LogNullReference(LdInd);
                LogNullReference(StInd);

                Thread.Sleep(2000);   
            }           
        }
    }
}

All 14 of them will give us the dreaded “Object reference not set to an instance of an object” message.

Now what happens if we attach a tracing tool that gets as much info as possible:

Attempted to throw an uninitialized exception object. In static void TestNullReference.Program::Throw() cil managed  IL 1/1 (reported/actual).
Attempted to call void TestNullReference.TestInterface::TestCall() cil managed  on an uninitialized type. In static void TestNullReference.Program::CallVirtIf() cil managed  IL 3/3 (reported/actual).
Attempted to call void TestNullReference.TestClass::TestCall() cil managed  on an uninitialized type. In static void TestNullReference.Program::CallVirtClass() cil managed  IL 3/3 (reported/actual).
Attempted to call void TestNullReference.TestClass::TestCall() cil managed  on an uninitialized type. In static void TestNullReference.Program::CallVirtBaseClass() cil managed  IL 3/3 (reported/actual).
Attempted to load elements of type System.Int32 from an uninitialized array. In static void TestNullReference.Program::LdElem() cil managed  IL 3/4 (reported/actual).
Attempted to load elements of type System.Int32 from an uninitialized array. In static void TestNullReference.Program::LdElemA() cil managed  IL 3/4 (reported/actual).
Attempted to store elements of type System.Int32 in an uninitialized array. In static void TestNullReference.Program::StElem() cil managed  IL 3/5 (reported/actual).
Attempted to get the length of an uninitialized array. In static void TestNullReference.Program::LdLen() cil managed  IL 3/3 (reported/actual).
Attempted to load non-static field int TestNullReference.TestClass::TestField from an uninitialized type. In static void TestNullReference.Program::LdFld() cil managed  IL 3/3 (reported/actual).
Attempted to load non-static field int TestNullReference.TestClass::TestField from an uninitialized type. In static void TestNullReference.Program::LdFldA() cil managed  IL 3/3 (reported/actual).
Attempted to store non-static field int TestNullReference.TestClass::TestField in an uninitialized type. In static void TestNullReference.Program::StFld() cil managed  IL 3/4 (reported/actual).
Attempted to cast/unbox a value/reference type of type System.Int32 using an uninitialized address. In static void TestNullReference.Program::Unbox() cil managed  IL 3/3 (reported/actual).
Attempted to load elements of type System.Int32 indirectly from an illegal address. In static void TestNullReference.Program::LdInd() cil managed  IL 4/4 (reported/actual).
Attempted to store elements of type System.Int32 indirectly to a misaligned or illegal address. In static void TestNullReference.Program::StInd() cil managed  IL 4/5 (reported/actual).

You can download and play with the tool already. Below I’ll shed some light on how this info can be obtained.

What the tracer does

One blog post is not enough to fully explain how to write a managed debugger. However, enough has been written about how to leverage the managed debugging API so for this post I’m going to assume we’ve attached a managed debugger to the target process, implemented debugger callback handlers, hooked them up and are handling exception callbacks.

The exception callback has the following signature:

HRESULT Exception (
    [in] ICorDebugAppDomain   *pAppDomain,
    [in] ICorDebugThread      *pThread,
    [in] ICorDebugFrame       *pFrame,
    [in] ULONG32              nOffset,
    [in] CorDebugExceptionCallbackType dwEventType,
    [in] DWORD                dwFlags
);

The actual exception can be obtained from the thread as an ICorDebugReferenceValue which can be dereferenced to an ICorDebugObjectValue of which you can ultimately get the ICorDebugClass and metadata token (mdTypeDef). To find out if this exception is a NullReferenceException, you can either look up this token using the metadata APIs, or compare it to a prefetched metadata token.

When we know we’re dealing with a 1st chance null reference exception, we can dig deeper and try to find out the offending IL instruction. From nOffset, we already have the IL offset in the method frame’s code. The code itself can be obtained by querying the ICorDebugFrame for an ICorDebugILFrame interface, and requesting it for its code (ICorDebugCode2), which has a method for retreiving the actual IL bytes.

Depending on the IL instruction we find at nOffset in the IL bytes, we can get various details and log them.

For the instructions that can throw:

  • callvirt: a call to a known instance method (mdMethodDef) on an uninitialized type
  • cpblk, cpobj, initblk: shouldn’t happen (not exposed by C#)
  • ldelem.<type>, ldelema, stelem.<type>: an attempt to load/store elements of a known type (mdTypeDef) from/to an uninitialized array
  • ldfld, ldflda, stfld: an attempt to load/store a known non-static field (mdFieldDef) of a known uninitialized type
  • ldind.<type>, stind.<type>: an invalid address was passed to the load instruction, or a misaligned address was passed to the store instruction (shouldn’t happen as this would be a compiler instead of user code bug)
  • ldlen: an attempt to get the length of an uninitialized array
  • throw: an attempt to throw an uninitialized exception object
  • unbox, unbox_any: an attempt to cast/unbox a value/reference type of a known type (mdTypeDef) using an uninitialized address

The various metadata tokens can be looked up using the metadata APIs mentioned before, and finally formatted into a nice message.

TraceCLI: a production debugging and tracing tool

Ever had this application that just crashed or malfunctioned in a Production environment ? And whatever you tried, it wasn’t reproducible in Test ? Or perhaps it was just slow in serving some pages (web application), and you wanted to see the actual SQL your Object Relational Model (ORM) generated and sent to the database ?

Well, there are some options.

For one, you could add more logging. Typically though, Murphy’s Law interferes and even though you think you log ‘everything’, you find out  when you really need it you forgot about that one spot, or need more detail (like argument or field values). Changing the code and building a new release, not to mention get it through Test, Acceptance and into Production is not really an attractive option here.

An alternative is to use WinDbg (+ SOS) on a crashdump obtained using a GUI app like DebugDiag or a command line app like ProcDump. For real crashes this ‘post mortem debugging’ option is often sufficient: you’ll have a full memory dump to work with taken at the exact time of the crash so all the details are there. However, using WinDbg is a pretty hardcore skill not every (.NET) developer has, and for more subtle malfunctions – for instance your application not behaving nicely with other systems – or in case of performance issues, you typically want to inspect a live target anyway.

And here it becomes problematic: I probably don’t have to tell you attaching a debugger to a live production process and stepping through it is not a good idea. Although even without WinDbg there’s still a large set of tools at your disposal (SQL tracing/profiling, performance counters, ETW tracing), those are mostly indirect and slow to setup: you’re watching symptoms instead of the real disease.

What I really wanted was a tool which I can attach to a live target without disrupting it, and in which I can specify some classes, methods or even namespaces to monitor (hitcounts, timings). Perhaps I even want to dump some (private) fields of a class when a method is hit. That sort of thing.

It didn’t exist (or I couldn’t find it), and I was searching for a challenge anyway, so I made one myself using native C++ and the debugging APIs.

The tool – for now named TraceCLI – allows you to live attach to a process (x86 or x64), and instrument the namespaces, classes, and methods you want. On hitting them, you can either just monitor their hitcounts or do more advanced stuff like measure the time in the method (for now including calls to other methods), as well as dump fields of the class the method is a member of on entering the method. For simple much used scenarios, there are presets (like SQL tracing) and for very advanced scenarios or multiple hit filter settings, you can also provide a configuration with an xml config file.

Although ultimately the goal is to use it on production processes, it’s still very alpha, both in terms of stability as well as feature set. So please use at your own risk.

Example: SQL tracing

For example, to trace SQL queries manually (without a preset), we could use the following commandline:

traceCLI_cmdline

which specifies we want to dump the field ‘_commandText’ of a class with a method ‘RunExecuteReader’ upon entering the method.

During startup, TraceCLI attaches to the process and checks all loaded assemblies for such methods, and instruments them:

14:33:53.588 TraceCLI (x86) v0.1 - Ruurd Keizer
14:33:53.588 ### PARAMETERS
14:33:53.589 Filter #1: namespace (null), class (null), method RunExecuteReader.
14:33:53.589   Fields dumped on entry: _commandText
14:33:53.658 ### ATTACH
14:33:53.659 Attaching to process with pId 5320: C:\Program Files (x86)\IIS Express\iisexpress.exe
14:33:53.738 Processing filter...
14:33:53.739 Searching domain: DefaultDomain
14:33:53.739 Searching assembly: C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Configuration\v4.0_4.0.0.0__b03f5f7f11d50a3a\System.Configuration.dll
14:33:53.755 Searching assembly: C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Core\v4.0_4.0.0.0__b77a5c561934e089\System.Core.dll
14:33:53.800 Searching assembly: C:\Windows\Microsoft.Net\assembly\GAC_MSIL\Microsoft.Build.Utilities.v4.0\v4.0_4.0.0.0__b03f5f7f11d50a3a\Microsoft.Build.Utilities.v4.0.dll
(...)
14:33:54.077 Searching domain: /LM/W3SVC/4/ROOT-1-130487604823520283
14:33:54.078 Searching assembly: C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Data.Linq\v4.0_4.0.0.0__b77a5c561934e089\System.Data.Linq.dll
14:33:54.093 Searching assembly: C:\Windows\Microsoft.NET\Framework\v4.0.30319\Temporary ASP.NET Files\root\4ecd037f\b8cc2ad2\assembly\dl3\2c89eab9\1cd35552_3295cf01\Solvay.Common.dll
14:33:54.094 Searching assembly: C:\Windows\Microsoft.Net\assembly\GAC_MSIL\Microsoft.Web.Infrastructure\v4.0_1.0.0.0__31bf3856ad364e35\Microsoft.Web.Infrastructure.dll
(...)
14:33:54.796 Searching assembly: C:\Windows\Microsoft.NET\Framework\v4.0.30319\Temporary ASP.NET Files\root\4ecd037f\b8cc2ad2\assembly\dl3\5283b6cf\d83f43ef_2d8acf01\System.Web.Http.WebHost.dll
14:33:54.798 Searching assembly: C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
14:33:54.902 Found 4 methods satisfying the filters
14:33:54.902 Found method: internal System.Data.SqlClient.SqlDataReader System.Data.SqlClient.SqlCommand::RunExecuteReader(System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader) cil managed
14:33:54.903 Found method: internal System.Data.SqlClient.SqlDataReader System.Data.SqlClient.SqlCommand::RunExecuteReader(System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader) cil managed
14:33:54.903 Found method: private System.Data.SqlClient.SqlDataReader System.Data.SqlClient.SqlCommand::RunExecuteReaderTds(System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader) cil managed
14:33:54.903 Found method: private System.Data.SqlClient.SqlDataReader System.Data.SqlClient.SqlCommand::RunExecuteReaderSmi(System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader, System.Data.SqlClient.SqlDataReader) cil managed
14:33:54.904 Activating all breakpoints

While the tool is attached it dumps SQL to the log as it executes it:

14:33:54.904 ### TRACING
14:34:00.617 Field: private string System.Data.SqlClient.SqlCommand::_commandText, Value: SELECT
    [GroupBy1].[A1] AS [C1]
    FROM ( SELECT
        COUNT(1) AS [A1]
        FROM  [dbo].[T_Computer] AS [Extent1]
        INNER JOIN [dbo].[T_OS] AS [Extent2] ON [Extent1].[HWComputerIndex] = [Extent2].[ComputerIndex]
    )  AS [GroupBy1]
14:34:00.626 Field: private string System.Data.SqlClient.SqlCommand::_commandText, Value: SELECT
    [GroupBy1].[A1] AS [C1]
    FROM ( SELECT
        COUNT(1) AS [A1]
        FROM  [dbo].[T_Computer] AS [Extent1]
        INNER JOIN [dbo].[T_OS] AS [Extent2] ON [Extent1].[HWComputerIndex] = [Extent2].[ComputerIndex]
    )  AS [GroupBy1]
14:34:00.635 Field: private string System.Data.SqlClient.SqlCommand::_commandText, Value: SELECT
    [GroupBy1].[A1] AS [C1]
    FROM ( SELECT
        COUNT(1) AS [A1]
        FROM ( SELECT DISTINCT
            [Extent1].[SlaSID] AS [SlaSID],
            [Extent1].[SlaInfo] AS [SlaInfo],
            [Extent1].[SLaTeam] AS [SLaTeam],
            [Extent1].[SlaSource] AS [SlaSource]
            FROM [dbo].[T_SID_SLA_DRP] AS [Extent1]
            WHERE ( NOT ((N'int' = [Extent1].[SLaTeam]) AND ([Extent1].[SLaTeam] IS NOT NULL))) AND ( NOT ((N'wintel' = [Extent1].[SlaSource]) AND ([Extent1].[SlaSource] IS NOT NULL)))
        )  AS [Distinct1]
    )  AS [GroupBy1]