Creating an automatic self-updating process

Recently I was asked by a client to replace a single monolithic custom workflow engine with a more scalable and loosely coupled modern alternative. We decided on a centralized queue which contained and persisted the work items, with a manager (scheduler) on top accepting connections from a dynamically scalable number of processors, which request and then do the actual work. It’s an interesting setup in itself, relying heavily on dependency injection, Command-Query Separation, Entity Framework Code First with Migrations for the database, and code first WCF for strongly typed communication between the scheduler and its processors.

Since there would be many processors, with no central administration of where they were installed, one of the wishes was to have them self-update at runtime whenever a new version of the code became available.

Detecting a new version

A key component of the design is that the processors register themselves with the scheduler when they start. In the same spirit, they can periodically call an update manager service to check for updates. I implemented this by placing a version inside the processor’s primary assembly (in the form of an embedded resource). The update manager returns the latest available version and its download location. If that version is more recent than the built-in one, the decision to update can be made.
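
A minimal sketch of that check (the resource name, the update manager contract and its return shape are assumptions for illustration, not the actual code):

using System;
using System.IO;
using System.Reflection;

public static class UpdateCheck
{
    // Reads the version embedded as a resource in the primary assembly and
    // compares it with the latest version reported by the update manager.
    public static bool UpdateAvailable(IUpdateManager updateManager) // IUpdateManager: assumed service contract
    {
        Version builtIn;
        using (var stream = Assembly.GetExecutingAssembly().GetManifestResourceStream("Processor.Version.txt")) // assumed resource name
        using (var reader = new StreamReader(stream))
        {
            builtIn = Version.Parse(reader.ReadToEnd().Trim());
        }

        var latest = updateManager.GetLatestVersion(); // returns latest version + download location
        return latest.Version > builtIn;
    }
}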

This completes the easy part.

Updating

The problem with updating a process in-place at runtime is that the operating system locks executable images (exe/dll) while they are mapped into a running process. So when you try to overwrite them, you get ‘file is in use by another process’ errors. The natural approach would therefore be to unload every non-OS library except the executable itself, overwrite the files, and reload.

In fact this works for native code/processes; managed assemblies, however, cannot be unloaded once loaded. It therefore appears we are out of luck and can’t use this method. We do have a (brute force) escape though: while we can’t unload managed assemblies, we can unload the whole AppDomain they have been loaded into.

Updating: managed approach

The idea therefore becomes to spin up the process with almost nothing in the default AppDomain (which can never be unloaded), and from there spawn a new AppDomain with the actual Processor code. If an update is detected, we can unload that domain, update the files, and respawn it.

And still it didn’t work… the problem I now ran into is that somehow the default domain persisted in loading one of the user-defined assemblies. I loaded my new AppDomain with the following lines:

using System;
using System.Reflection;

public class Processor : MarshalByRefObject
{
    static AppDomain _processorDomain;

    public void Start()
    {
       // startup code here...
    }

    public static Processor HostInNewDomain()
    {
        // Setup config of new domain to look at parent domain app/web config.
        var procSetup = AppDomain.CurrentDomain.SetupInformation;
        procSetup.ConfigurationFile = AppDomain.CurrentDomain.SetupInformation.ConfigurationFile;

        // Start the processor in a new AppDomain.
        _processorDomain = AppDomain.CreateDomain("Processor", AppDomain.CurrentDomain.Evidence, procSetup);

        // Create the Processor in the new domain and return a proxy to it.
        return (Processor)_processorDomain.CreateInstanceAndUnwrap(Assembly.GetExecutingAssembly().FullName, typeof(Processor).FullName);
    }
}

and in a separate assembly:

public class ProcessorHost
{
    Processor _proc;

    public void StartProcessor()
    {
        _proc = Processor.HostInNewDomain();
        _proc.Start();
    }
}

There are several problems in this code:

  • the Processor type is used inside the default AppDomain to identify the assembly and type to spawn in the new domain – this causes the assembly containing the type to get loaded in the default domain as well.
  • after spawning the new AppDomain, we call into Processor.Start() to get it going. For the remoting to work, the runtime generates a proxy inside the default domain to reach the Processor (a MarshalByRefObject) in the Processor domain. It does so by loading the type from the assembly containing the Processor type and reflecting on it. I tried different approaches (reflection, casting to dynamic), but the underlying mechanism to generate the proxy is always the same.

So what is the solution? For one, we can make the Processor autostart by kicking off all the action in its constructor. That way we don’t need to call anything to start it, so the runtime doesn’t generate a proxy. Moreover, we can take a stringly typed dependency on the assembly and type. The code above then changes to:

public class Processor : MarshalByRefObject
{
    public Processor()
    {
        Start();
    }

    public void Start()
    {
        // startup code here....
    }
}

and in a separate assembly:

using System;
using System.Runtime.Remoting;

public class ProcessorHost
{
    private const string ProcessorAssembly = "Processor, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null";
    private const string ProcessorType = "Processor.Processor";

    AppDomain _processorDomain;
    ObjectHandle _hProcessor;

    public void Start()
    {
        // Setup config of new domain to look at parent domain app/web config.
        var procSetup = AppDomain.CurrentDomain.SetupInformation;
        procSetup.ConfigurationFile = procSetup.ConfigurationFile;

        // Start the processor in a new AppDomain.
        _processorDomain = AppDomain.CreateDomain("Processor", AppDomain.CurrentDomain.Evidence, procSetup);

        // Just keep an ObjectHandle, no need to unwrap this.
        _hProcessor = _processorDomain.CreateInstance(ProcessorAssembly, ProcessorType);
    }
}

Communicating with the new AppDomain

Above I circumvented the proxy generation (and thereby type assembly loading in the default AppDomain) by kicking off the startup code automatically in the Processor constructor. However, this restriction introduces a new problem: we cannot ever call into or out of the new domain through user-defined types, as that would lock user-defined assemblies in place. How then do we tell the parent/default domain an update is ready?

For the moment I do this by writing AppDomain data in the Processor domain – AppDomain.SetData(someKey, someData) – and reading it periodically from the parent domain – AppDomain.GetData(someKey). It’s not ideal as it requires polling, but it works: only standard framework methods and types are involved, and so the update works.
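
In code, the signalling looks something like this (the key name is mine, purely illustrative):

// In the Processor domain: flag that a new version is ready.
AppDomain.CurrentDomain.SetData("UpdateReady", true);

// In the default domain, on a timer: poll the flag.
if (_processorDomain.GetData("UpdateReady") as bool? == true)
{
    AppDomain.Unload(_processorDomain);
    // overwrite the assemblies on disk, then respawn via CreateInstance as shown above
}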


Contract first WCF service (and client)

There is a popular misconception that WCF is obsolete/legacy, supposedly because it has been replaced by newer techniques like ASP.NET Web API. For sure: Web API was created to simplify the development of one very specific – but extensively used – kind of service: HTTP(S) REST on port 80. If you are developing exactly such a service, you would be foolish to do it in WCF (that’s just extra work).

However, declaring a technology obsolete because there is a better alternative for one very specific use case – when there are over 9000 use cases no other tech addresses – is silly. Your personal bubble might be 100% web, but that doesn’t mean there is nothing else: WCF – unlike Web API – is a communications abstraction. Its strong point is versatility: if I want my service to use HTTP today and named pipes or message queueing tomorrow, I can do so with minimal effort.

It’s in this versatility that we hit the second misconception: ‘WCF is too hard’. I have to admit I spent quite some days cursing at my screen over WCF. Misconfigured services, contracts, versioning, endpoints, bindings, wsdls, etc.: there is a lot you can do wrong, and every link in the chain has to work before you can stop cursing.

That, however, is WCF done the classic way. None of it is necessary if you do it the right way, which is contract first, with little to no configuration.

WCF the right way™: contract first

The basic ingredients you need are:

  • a service contract in a shared library
  • the service: an implementation of the contract
  • some testing code (client)

And that’s it. Below I’ll give a minimal implementation of this concept to show the power and simplicity of the approach. Did I mention no configuration?

The service contract

using System.ServiceModel;

namespace Contracts
{
    [ServiceContract]
    public interface IContractFirstServiceContract
    {
        [OperationContract]
        bool Ping();
    }
}

It’s important to place this contract/interface in a separate – shared – assembly so both the service implementation and the client(s) can access it.

The service implementation

using System;
using Contracts;

namespace Service
{
    public class ContractFirstService : IContractFirstServiceContract
    {
        public bool Ping()
        {
            return true;
        }
    }
}

That’s all, and the service can now be hosted. This can be done in IIS with an svc file, or by self-hosting it (in a console app). In either case, we need a ServiceHost implementation, which I’ll place in the service project:

using System;
using System.ServiceModel;
using System.ServiceModel.Channels;
using Contracts;

namespace Service
{
    public class ContractFirstServiceHost : ServiceHost
    {
        public ContractFirstServiceHost(Binding binding, Uri baseAddress)
            : base(typeof(ContractFirstService), baseAddress)
        {
            AddServiceEndpoint(typeof(IContractFirstServiceContract), binding, baseAddress);
        }
    }
}

The client (testing the service)

In the following piece of code I’ll self-host the service, open a client and test the WCF call roundtrip.

using System;
using System.ServiceModel;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Contracts;
using Service;

namespace Tests
{
    [TestClass]
    public class IntegrationTest
    {
        [TestMethod]
        public void TestClientServer()
        {
            var binding = new BasicHttpBinding();
            var endpoint = new EndpointAddress("http://localhost/contractfirstservice");

            // Self host the service.
            var service = new ContractFirstServiceHost(binding, endpoint.Uri);
            service.Open();

            // Create client.
            var channelFactory = new ChannelFactory<IContractFirstServiceContract>(binding, endpoint);
            var client = channelFactory.CreateChannel();

            // Call the roundtrip test function.
            var roundtripResult = client.Ping();

            Assert.IsTrue(roundtripResult);
        }
    }
}

There you have it: a contract first WCF service in a few lines.
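
For completeness: hosting the same service in a console app instead of a test is equally small. A sketch:

using System;
using System.ServiceModel;
using Service;

class Program
{
    static void Main()
    {
        // Same host class as above, now kept alive for as long as the console runs.
        var host = new ContractFirstServiceHost(new BasicHttpBinding(), new Uri("http://localhost/contractfirstservice"));
        host.Open();

        Console.WriteLine("Service running, press enter to stop.");
        Console.ReadLine();

        host.Close();
    }
}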


Compile time marshalling

In one of my posts about managed/unmanaged interop in C# (P/Invoke), I left you with the promise of answering a few questions: can we manually create our own marshalling stubs in C# (at compile time), and can they be faster than the runtime generated ones?

A bit of background

It’s funny that when I raised these questions back in March, I was still unaware of .NET Native and ASP.NET vNext, which were announced by Microsoft in the following months. The main idea behind these initiatives is to speed up .NET code – especially its startup time – on resource constrained systems (mobile, cloud).
For instance, while traditionally on desktop systems the intermediate language (IL) in .NET assemblies is compiled to machine code at runtime by the Just-In-Time compiler (JIT), .NET Native moves this step to compile time. While this has several advantages, a direct consequence of the lack of runtime IL compilation is that we can’t generate and run IL code on the fly anymore. Not much user code uses this feature, but the framework itself critically depends on it for interop marshalling stub generation. Since it is no longer available in .NET Native, this phase had to be moved to compile time as well. This step – called Marshalling and Code Generation (MCG) – is in fact one of the elements of the .NET Native toolchain. By the way, .NET Native isn’t the first project to do compile time marshalling; it has, for example, long been used in the DXSharp project.

The basic concepts are always the same: generate code which marshals the input arguments and return values, and wrap it around a calli IL instruction. Since the C# compiler will never emit a calli instruction, this actual call always has to be implemented in IL directly (or the compiler has to be extended, recently made possible with Roslyn). Where the desktop .NET runtime (CLR) emits the whole marshalling stub in IL, the MCG generated code is C#, so it requires a separate call to an IL method containing the calli implementation. If you drill down far enough in the generated sources for a .NET Native project, in the end you’ll find something like this (all other classes/methods omitted for brevity):

internal unsafe static partial class Interop
{
    private static partial class McgNative
    {
        internal static partial class Intrinsics
        {
            internal static T StdCall<T>(IntPtr pfn, void* arg0, int arg1)
            {
                // This method is implemented elsewhere in the toolchain
                return default(T);
            }
        }
    }
}

Note the giveaway comment ‘this method is implemented elsewhere in the toolchain’, which you can read as ‘this is as far as we can go with C#’, and which indicates that some other tool in the .NET Native chain will emit the real body for the method.

DIY compile time marshalling

So what would the .NET Native ‘implemented elsewhere’ source look like – or: how can we do our own marshalling? To call a native function which expects an integer argument (like the Sleep function I used in previous posts), we first need an IL calli implementation which takes the address of the native callsite and the integer argument:

.assembly extern mscorlib {}
.assembly CalliImpl { .ver 0:0:0:0 }
.module CalliImpl.dll

.class public CalliHelpers
{
    .method public static void Action_uint32(native int, unsigned int32) cil managed
    {
        ldarg.1
        ldarg.0
        calli unmanaged stdcall void(int32)
        ret
    }
}

If we feed it the address of the Sleep function in kernel32 (obtained via LoadLibrary and GetProcAddress, which we ironically invoke through P/Invoke…), we see the CalliHelpers method on the managed stack instead of the familiar DomainBoundILStubClass. In other words, compile time marshalling in action:

Child SP IP Call Site
00f2f264 77a9d4bc [InlinedCallFrame: 00f2f264]
00f2f260 010b03e4 CalliHelpers.Action_uint32(IntPtr, UInt32)
00f2f290 010b013b TestPInvoke.Program.Main(System.String[])
00f2f428 63c92652 [GCFrame: 00f2f428]
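
For reference, the driver code producing this stack looks roughly like this (a sketch; the P/Invoke declarations are the standard kernel32 ones):

using System;
using System.Runtime.InteropServices;

class Program
{
    // Ironically, we use P/Invoke to obtain the native callsite address.
    [DllImport("kernel32.dll", CharSet = CharSet.Ansi)]
    static extern IntPtr LoadLibrary(string fileName);

    [DllImport("kernel32.dll", CharSet = CharSet.Ansi)]
    static extern IntPtr GetProcAddress(IntPtr module, string procName);

    static void Main()
    {
        // Resolve the address of Sleep in kernel32...
        var pfnSleep = GetProcAddress(LoadLibrary("kernel32.dll"), "Sleep");

        // ...and call it through the hand-written IL stub.
        CalliHelpers.Action_uint32(pfnSleep, 1000);
    }
}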

This ‘hello world’ example is nice, but ideally you would use well tested code. I therefore wanted to try and leverage the MCG from .NET Native, but that turned out to be a bit more work than anticipated, as you need to somehow inject the actual IL calli stubs to make the calls work. So perhaps in a future blog.

What about C++ interop?

There seems to be a lot of confusion around this type of interop: some claim it to be faster, some slower. In reality it can be both, depending on what you do. The C++ compiler understands both types of code (native and managed), and with that comes its main selling point: not speed but type safety. Where in C# the developer has to provide the P/Invoke signature, including the calling convention and the marshalling of arguments and return values, the C++ compiler already knows all this from the native header files. Therefore, in C++/CLI you simply include the header, and where necessary (when you are in a managed section) the compiler does the P/Invoke for you implicitly.

#include <Windows.h>

using namespace System;

int main(array<String^>^ args)
{
    Console::WriteLine(L"Press any key...");
    while (!Console::KeyAvailable)
    {
        Sleep(500);
    }
    return 0;
}

Sleep is an unmanaged function included from Windows.h, and invoked from a managed code body. From the managed stack in WinDbg you can see how it works:

00e3f16c 00fa2065 DomainBoundILStubClass.IL_STUB_PInvoke(UInt32)
00e3f170 00fa1fcc [InlinedCallFrame: 00e3f170] <Module>.Sleep(UInt32)
00e3f1b4 00fa1fcc <Module>.main(System.String[])
00e3f1c8 00fa1cff <Module>.mainCRTStartupStrArray(System.String[])

As you can see, there is again a marshalling stub, just as in C#; it is however generated without developer intervention. This alone should be reason enough to use C++/CLI in heavy interop scenarios, but there are more advantages. For instance, the C++ compiler can optimize away multiple dependent calls across the interop boundary, making the whole thing faster, and it can P/Invoke to native C++ class instance functions, something entirely impossible in C#. Moreover, apart from depending on external native code, it allows you to create ‘mixed mode’ or IJW (It Just Works) assemblies, which contain native code as well as the usual managed code in a self contained unit.
Despite all this, the P/Invoke offered by C++/CLI still leverages the runtime stub generation mechanism, and is therefore not intrinsically faster than explicit P/Invoke.

Word of warning

Let me end with this: the aim of this post is to offer insight into the black box called interop, not to promote DIY marshalling. If you find yourself in need of creating your own (compile time) marshalling stubs for faster interop, chances are you are doing something wrong. Especially in enterprise/web development it’s not very likely that the interop itself is the bottleneck. Focusing on improving the interop scenario yourself – instead of letting the .NET framework team worry about it – is therefore very, very likely a case of premature optimization. However, in game/datacenter/scientific scenarios you can end up wanting to use every CPU cycle efficiently, and perhaps after reading this post you’ll have a better idea of where to look.


PInvoke: beyond the magic

Ever run into problems passing data between unmanaged and managed code? Or just curious what really happens when you slap that [DllImport] on a method? This post is for you: below I’ll shine some light inside the black box that’s called Platform Invoke.

Let’s start with a very minimal console app that has a call to an unmanaged Win32 function:

using System;
using System.Runtime.InteropServices;

namespace TestPInvoke
{
{
    class Program
    {
        [DllImport("kernel32.dll")]
        static extern void Sleep(uint dwMilliseconds);

        static void Main(string[] args)
        {
            Console.WriteLine("Press any key...");

            while (!Console.KeyAvailable)
            {
                Sleep(1000);
            }
        }
    }
}

Nothing exciting going on there: just the console polling for a keypress, sleeping the thread for 1 second after every poll. The important thing of course is how we sleep the thread: via P/Invoke instead of the usual mscorlib System.Threading.Thread.Sleep(Int32).

Now let’s run it under WinDbg + SOS, and see if we can find out what happens. The managed stack while sleeping looks like this:

Child SP IP       Call Site
00ebee24 0108013d DomainBoundILStubClass.IL_STUB_PInvoke(UInt32)
00ebee28 0108008e [InlinedCallFrame: 00ebee28] TestPInvoke.Program.Sleep(UInt32)
00ebee6c 0108008e TestPInvoke.Program.Main(System.String[])

On the bottom is the entrypoint. The next frame up is just an information frame telling us the call to Program.Sleep was inlined in Main (notice the same IP). The top frame is more interesting: as the last frame on the managed stack, this must be our marshalling stub.

We can dump the MethodDescriptor of the Program.Main and DomainBoundILStubClass.IL_STUB_PInvoke methods for comparison, which gives us:

0:000> !IP2MD 0108008e
MethodDesc: 00fc37c8
Method Name: TestPInvoke.Program.Main(System.String[])
Class: 00fc12a8
MethodTable: 00fc37e4
mdToken: 06000002
Module: 00fc2ed4
IsJitted: yes
CodeAddr: 01080050

and

0:000> !IP2MD 0108013d
MethodDesc: 00fc38f0
Method Name: DomainBoundILStubClass.IL_STUB_PInvoke(UInt32)
Class: 00fc385c
MethodTable: 00fc38b0
mdToken: 06000000
Module: 00fc2ed4
IsJitted: yes
CodeAddr: 010800c0

This tells us both methods are originally IL code, and they are JIT compiled. For the Main method we knew this of course, and for the PInvoke stub it can’t be a surprise either given the class and method names. So let’s dump out the IL:

0:000> !DumpIL 00fc37c8
IL_0001: ldstr "Press any key..."
IL_0006: call System.Console::WriteLine
IL_000c: br.s IL_001b
IL_000f: ldc.i4 1000
IL_0014: call TestPInvoke.Program::Sleep
IL_001b: call System.Console::get_KeyAvailable
IL_0020: ldc.i4.0
IL_0021: ceq
IL_0025: brtrue.s IL_000f
IL_0027: ret

No surprises there. Next the stub:

0:000> !DumpIL 00fc38f0
error decoding IL

OK, that’s weird. The metadata tells us we have an IL compiled method, and the JITted code is there:

0:000> !u 010800c0
Normal JIT generated code
DomainBoundILStubClass.IL_STUB_PInvoke(UInt32)
(actual code left out)

but where is the IL body?

In fact, it turns out that since .NET 4.0, all interop stubs are generated at runtime in IL and JIT compiled for the relevant architecture. Note this runtime IL clearly differs from the IL emitted into runtime generated assemblies (for instance the ones generated for XML serialization), as the interop stubs aren’t contained in a runtime generated assembly or module. Instead, the module token is spoofed to be identical to the calling frame’s module (you can check this above). Likewise, there is only runtime data for these methods, and looking up the class info gives:

!DumpClass 00fc385c
Class Name: 
mdToken: 02000000
File: C:\dev\voidcall\Profiler\ProfilerNext\TestPInvoke\TestPInvoke\bin\Debug\TestPInvoke.exe
Parent Class: 00000000
Module: 00fc2ed4
Method Table: 00fc38b0
Total Method Slots: 0
Class Attributes: 101
Transparency: Critical

This containing class – DomainBoundILStubClass – is a weird thing as well: it doesn’t inherit from anything (not even System.Object), the name isn’t filled in, and there are no method slots, even though we know there is at least one method in this class, namely the one we just followed to get here. So probably this class is just a construct for keeping the CLR internal datastructures consistent.

So there really seems to be no good way to get at the IL of those stubs. The CLR team realized this as well and decided to publish the generated IL as ETW events, which the ILStub Diagnostics tool can intercept. If we do this for our test program, we see the following (formatted for readability):

// Managed Signature: void(uint32)
// Native Signature: unmanaged stdcall void(int32)
.maxstack 3
.locals (int32 A,int32 B)
// Initialize
    call native int [mscorlib] System.StubHelpers.StubHelpers::GetStubContext()
    call void [mscorlib] System.StubHelpers.StubHelpers::DemandPermission(native int)
// Marshal
    ldc.i4 0x0
    stloc.0    
IL_0010:
    ldarg.0
    stloc.1    
// CallMethod
    ldloc.1
    call native int [mscorlib] System.StubHelpers.StubHelpers::GetStubContext()
    ldc.i4 0x14
    add
    ldind.i
    ldind.i
    calli unmanaged stdcall void(int32) //actual unmanaged method call
// Unmarshal (nothing in this case)
// Return
    ret

The (un)marshalling isn’t very interesting in this case (int32 in, nothing out). To make it clearer for those who don’t use IL daily, I used ILAsm to compile this method body into a dll and ILSpy to view it decompiled to C#:

static void ILStub_PInvoke(int A)
{
    //initialize
    StubHelpers.DemandPermission(StubHelpers.GetStubContext());
    //CallMethod
    calli(void(int32), A, *(*(StubHelpers.GetStubContext() + 20))); //not actual C#, but more readable anyway
}

The call to the unmanaged method is done with a calli instruction, which is a strongly typed call to an unmanaged callsite. The first parameter (not on the stack but encoded in the IL) is the signature of the callsite [void(int32)], followed by (on the stack) the argument (in this case A), ultimately followed by the unmanaged function pointer (which is stored at offset 20 of the context returned by StubHelpers.GetStubContext()).

So what magic takes place in StubHelpers.GetStubContext()?

The answer comes naturally if we take, for example, a simple program with 2 P/Invoke methods that have the same input and output arguments:

[DllImport("kernel32.dll")]
static extern void ExitThread(uint dwExitCode);

[DllImport("kernel32.dll")]
static extern void Sleep(uint dwMilliseconds);

If I let the CLR generate an IL stub for both methods, I have exactly the same input and output marshalling, and even the unmanaged function call signature (not address) is the same.

That seems a bit of a waste, so how could one optimize this?

Indeed, we would save on basically everything we care about (RAM, JIT compilation) by just generating one IL stub for every unique input+output argument signature, and injecting that stub with the unmanaged address it needs to call.

This is exactly how it works: when the CLR encounters a PInvoke method, it pushes a frame on the stack (InlinedCallFrame) with info about – among other things – the unmanaged function address just before calling the actual IL stub.

The stub in turn requests this information through StubHelpers.GetStubContext() (aka ‘gimme my callframe’), and calls into the unmanaged function.

To see this in action, consider the code:

using System;
using System.Runtime.InteropServices;

namespace TestPInvoke
{
{
    class Program
    {
        [DllImport("kernel32.dll")]
        static extern void Sleep(uint dwMilliseconds);

        [DllImport("kernel32.dll", EntryPoint = "Sleep")]
        static extern void SleepAgain(uint dwMilliseconds);

        static void Main(string[] args)
        {
            Console.WriteLine("Press any key...");

            while (!Console.KeyAvailable)
            {
                Sleep(500);
                SleepAgain(500);
            }
        }
    }
}

I’ll run this from WinDbg+SOS; here’s the disassembly of the calls to Sleep and SleepAgain in Main:

mov     ecx,1F4h
call    0042c04c (TestPInvoke.Program.Sleep(UInt32), mdToken: 06000001)
mov     ecx,1F4h
call    0042c058 (TestPInvoke.Program.SleepAgain(UInt32), mdToken: 06000002)

You see the calls to Sleep and SleepAgain are pointing to different addresses. If we dump the unmanaged code at these locations we have:

!u 0042c04c (Sleep)
Unmanaged code
mov     eax,42379Ch
jmp     006100d0 (DomainBoundILStubClass.IL_STUB_PInvoke(UInt32))

!u 0042c058 (SleepAgain)
Unmanaged code
mov     eax,4237C8h
jmp     006100d0 (DomainBoundILStubClass.IL_STUB_PInvoke(UInt32))

Indeed, we see that just a different value is loaded into eax before jumping to the same address (the IL stub). Since the value in eax is the only thing separating the two, this must be a pointer to our call frame.

So let’s consider these as memory addresses and check what’s there:

dd 42379Ch
0042379c  63000001 20ea0005 00000000 00192385
004237ac  001925ec 00423808 0042c010 00000000

dd 4237C8h
004237c8  630b0002 20ea0006 00000000 00192385
004237d8  001925ec 00423810 0042c01c 00000000

Now remember the offset in the calli instruction above? The unmanaged call was through a pointer reference at offset 20 (14h) in our stub context. Or in plain words: take the value at offset 20 in the call frame (00423808 and 00423810 in the dumps above), and dereference it. This gives us:

00423808 => 7747cf49 (KERNEL32!SleepStub)
00423810 => 7747cf49 (same)

And there we have it, PInvoke demystified.

In a next post I’ll address the following questions:

  • can we manually create our own marshalling stubs in C# (at compile time)?
  • can they be faster than the runtime generated ones?
  • what about the reverse case (unmanaged code calling us)?

Reactive Extensions and ObserveOn

I’ve been actively following and using the work of the Reactive Extensions (Rx) team from early on, as Rx is truly a unique library for working with events. However, some days ago I discovered something that didn’t quite work as expected, involving the ObserveOn and SubscribeOn methods of Observable.

The problem case

We had an event stream – in particular XML messages arriving on a TCP port – coming in at a relatively high rate. We did all of the event detection, filtering and handling in an Rx chain, which worked great. In the end, the event data had to be persisted to a database, and this last step is where we met the real problem: the database operation could take longer than the time between arriving events, causing the incoming events to queue up.

The solution (or so we naively thought)

Let’s put every database operation on a separate thread, so we offload all IO delays and free our main thread for the real computations. How? There is this nice little method on Observable called ObserveOn which allows you to specify where you want the observing to take place:

public static class Observable
{
    public static IObservable<TSource> ObserveOn<TSource>(this IObservable<TSource> source, IScheduler scheduler);

    public static IObservable<TSource> SubscribeOn<TSource>(this IObservable<TSource> source, IScheduler scheduler);
}

So let’s ObserveOn the ThreadPool, and our problem is fixed!

WTF, or ‘Why are my events still queueing?’

The essential thing to remember is that Rx is not multithreaded. If you specify you want to observe events on a particular thread, Rx will help you, but that doesn’t mean your main thread won’t block until that call returns. So what’s the point of ObserveOn and SubscribeOn? They are mostly useful in STA scenarios, most notably the one where a UI thread receives events, which you ObserveOn a background thread to prevent blocking the UI thread, and eventually ObserveOn the UI thread again to update the UI. Sure, that case uses two threads, but it’s all sequential.
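
For illustration, the UI pattern ObserveOn is meant for looks something like this (a sketch assuming Rx’s WPF DispatcherScheduler package; uiEvents, Crunch and resultLabel are made up):

uiEvents
    .ObserveOn(Scheduler.Default)            // do the heavy lifting off the UI thread
    .Select(e => Crunch(e))                  // still strictly sequential, just elsewhere
    .ObserveOn(DispatcherScheduler.Current)  // hop back to the UI thread
    .Subscribe(result => resultLabel.Text = result);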

The real fix

Explicitly spawn a new thread/task in the OnNext which takes care of the database update, and return to the observable immediately.
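
Or in code (PersistToDatabase stands in for our actual database call):

eventStream.Subscribe(evt =>
{
    // Offload the slow IO explicitly; OnNext returns immediately,
    // so the Rx chain keeps flowing while the database write runs.
    Task.Run(() => PersistToDatabase(evt));
});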

Self-hosting ASP.NET MVC

Recently I was considering which technology to use for the GUI of a Windows desktop client application I’m working on in my spare time. The standard picks are obviously WinForms or the more modern WPF. However, I have some problems with them in this case:

  • the technologies above are (windows) platform dependent, while for instance HTML5 isn’t.
  • the application I’m working on is a debugger. Perhaps details about that will follow in a future post, but I would like to be able to run it remotely from an arbitrary machine (and OS).
  • I really appreciate the power of modern javascript libraries such as knockout.js, and I can’t use those with WPF or WinForms.

Now, on Windows 8, store applications can already be developed in HTML5, while desktop apps can’t. Of course I could resort to creating a real web application (hosted in IIS), but that would require the debugger host machine to have IIS installed. The same is true for the rehostable web core.

There is an open source library – NancyFX – which can be self-hosted, and I was tempted to use it, but coming from MVC it had some unknowns: controllers are modules in Nancy, and I couldn’t quite find out which existing MVC functionality was or wasn’t available in the Nancy library.

So with all other options out of the window, I set out to self-host ASP.NET MVC4.

Surprisingly, much of what’s needed to do this is undocumented: while the MVC pipeline is well known, the internals of what happens between the TCP port and the first entry point in ASP.NET aren’t. However, logic dictates:

  1. there should be an open TCP/IP port listening somewhere
  2. there should be a dedicated app domain for the web application
  3. somehow the request received on the port should be transferred to the web application http runtime

1. Receiving requests

The traditional way of listening for web requests was to simply open a socket on port 80 using the Windows Sockets API (Winsock), and run the HTTP protocol stack on top of it in your app in user mode. This had severe drawbacks, however, which led to the creation of a dedicated kernel mode HTTP protocol stack – http.sys – which does all the low level HTTP work and sits directly on top of the TCP/IP protocol stack.

In user mode space, applications communicate with http.sys through the Windows HTTP Server API. IIS uses this API, and if we want to self-host ASP.NET, we will have to as well. Fortunately, the .NET framework includes a wrapper class around this API: System.Net.HttpListener.

Using HttpListener to process requests should then be easy. I started by implementing a listener on top of HttpListener, using Reactive Extensions to push the incoming request contexts to clients:

using System;
using System.Collections.Generic;
using System.Net;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Reactive.Threading.Tasks;

internal class Listener : IDisposable
{
    private HttpListener _listener = null;
    public IObservable<HttpListenerContext> IncomingRequests { get; private set; }

    internal Listener(int port)
    {
        _listener = new HttpListener();
        _listener.Prefixes.Add(string.Format("http://localhost:{0}/", port));
        _listener.AuthenticationSchemes = AuthenticationSchemes.Anonymous;
        _listener.Start();

        IncomingRequests = _listener.GetContextsAsObservable().ObserveOn(NewThreadScheduler.Default);
    }

    public void Dispose()
    {
        try
        {
            if (_listener == null) return;
            _listener.Stop();
            _listener = null;
        }
        catch (ObjectDisposedException)
        {
        }
    }
}

internal static class ListenerExtensions
{
    private static IEnumerable<IObservable<HttpListenerContext>> Listen(this HttpListener listener)
    {
        while (true)
        {
            yield return listener.GetContextAsync().ToObservable();
        }
    }

    internal static IObservable<HttpListenerContext> GetContextsAsObservable(this HttpListener listener)
    {
        return listener.Listen().Concat();
    }
}

Now all client code has to do is create a new listener and subscribe to the request context stream:

private Listener _listener;

private void SetupListener(int port)
{
    _log.InfoFormat("Setting up new httplistener on port {0}", port);
    _listener = new Listener(port);

    _log.InfoFormat("Start forwarding incoming requests to ASP.NET pipeline");
    _listener.IncomingRequests.Subscribe(
    (c) =>
    {
        try
        {
            ProcessRequest(c);
        }
        catch (Exception ex)
        {
            _log.Error("Exception processing request", ex);
        }
    },
    (ex) => _log.Error("Exception in request sequence", ex),
    () => _log.Info("HttpListener completed"));
    _log.Info("Completed httplistener setup");
}

The real magic happens in the yet to be implemented ProcessRequest method. However, before feeding requests to the ASP.NET pipeline, we first have to set that pipeline up in its own AppDomain. When hosting your project in IIS, your application is typically bound to a dedicated application pool and gets its own AppDomain. In theory the web application could run in the already running default app domain; however, in this post I’ll try to stay as close as possible to the IIS approach.

System.Web contains the class ApplicationHost which allows you to do exactly what we want: create a new AppDomain, and specify a user supplied type (class) to be instantiated there to bootstrap the domain.

Again, in theory, you could use any class, but there are two things to keep in mind:

  • you want to communicate with the new AppDomain from your default AppDomain
  • you don’t want this bootstrapping instance to be garbage collected as it’s the root object in the new domain

The classic solution for the first issue is to use a class derived from MarshalByRefObject, as this automagically enables remoting by RPC between your AppDomains. Another – more modern – option would be WCF, but I haven’t checked whether that works in practice.

The second issue is fixed by explicitly telling the remoting infrastructure the object’s lease is never expiring (which ties its actual lifetime to the lifetime of the new AppDomain).

The code below demonstrates this:

public class AppHost : MarshalByRefObject
{
    //factory method
    private static AppHost GetHost(string virtualDir, string physicalPath)
    {
        // Fix for strange CreateApplicationHost behavior (can only resolve assembly when in GAC or bin folder)
        if (!physicalPath.EndsWith("\\")) physicalPath += "\\";

        // Copy this hosting DLL into the /bin directory of the application
        var fileName = Assembly.GetExecutingAssembly().Location;
        try
        {
            var folderName = string.Format("{0}bin\\", physicalPath);

            //create folder if it doesn't exist
            if (!Directory.Exists(folderName)) Directory.CreateDirectory(folderName);

            //copy file
            File.Copy(fileName, Path.Combine(folderName, Path.GetFileName(fileName)), true);

            //and all assemblies
            var pathToAppHost = Path.GetDirectoryName(fileName);
            foreach (var fn in Directory.EnumerateFiles(pathToAppHost, "*.dll", SearchOption.TopDirectoryOnly))
            {
                File.Copy(fn, Path.Combine(folderName, Path.GetFileName(fn)), true);
            }
        }
        catch { }

        return (AppHost)ApplicationHost.CreateApplicationHost(typeof(AppHost), virtualDir, physicalPath);
    }

    //set an infinite lease
    public override object InitializeLifetimeService()
    {
        return null;
    }
}

The code before the actual CreateApplicationHost call in the factory method requires an explanation: while CreateApplicationHost is a convenient method, it’s hardwired to only search for assemblies in the bin folder relative to the physical path of the web project (or in the GAC). Rick Strahl mentions this on his blog, and indeed, if you check the framework reference sources or inspect the assemblies with ILSpy, you’ll discover a hardwired bin reference.

So as a quick fix, in the code above I just copy the relevant assemblies to that folder. A more elegant solution would be to do a bit more work: create the AppDomain yourself using AppDomain.CreateDomain(…) and tell it where to find assemblies, or better still, hook the AssemblyResolve event.
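
Such an AssemblyResolve handler could be as simple as this (a sketch; hostAssemblyFolder is whatever folder the host keeps its assemblies in):

AppDomain.CurrentDomain.AssemblyResolve += (sender, args) =>
{
    // Try to resolve the requested assembly from the host's own folder.
    var fileName = new AssemblyName(args.Name).Name + ".dll";
    var candidate = Path.Combine(hostAssemblyFolder, fileName);
    return File.Exists(candidate) ? Assembly.LoadFrom(candidate) : null;
};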

Next we want to create an instance of your web/MVC project’s main class – the class in global.asax.cs derived from HttpApplication – so we also need a method on AppHost which does this:

private HttpApplication _mvcApp;

private void HostMvcApp<T>() where T : HttpApplication
{
    //usually IIS does this, but we need to serve static files ourselves
    HttpApplication.RegisterModule(typeof(StaticFileHandlerModule));

    _mvcApp = Activator.CreateInstance<T>();
}

This will bootstrap your MVC application and call into its Application_Start method, where you can register routes, areas, bundles, etc. as usual.

Since the ASP.NET pipeline does not serve static files, we need to do that ourselves. Here I do it by registering a StaticFileHandlerModule, which I’ll implement properly in a future post.
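
As a teaser, a first approximation of such a module could look like this (a sketch, not the final implementation):

using System.IO;
using System.Web;

public class StaticFileHandlerModule : IHttpModule
{
    public void Init(HttpApplication app)
    {
        app.BeginRequest += (sender, e) =>
        {
            // If the request maps to a physical file on disk, serve it directly
            // and skip the rest of the ASP.NET pipeline.
            var context = app.Context;
            var path = context.Server.MapPath(context.Request.AppRelativeCurrentExecutionFilePath);
            if (File.Exists(path))
            {
                context.Response.ContentType = MimeMapping.GetMimeMapping(path);
                context.Response.TransmitFile(path);
                app.CompleteRequest();
            }
        };
    }

    public void Dispose() { }
}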

So now we have hosted the MVC application and set up an HTTP listener. The only thing left is to connect the two, so the ASP.NET pipeline will handle these requests – an undertaking which turns out to be more complex than it sounds.

The way in which requests should be handed to the HttpRuntime is through a call to the static ProcessRequest method:

public static void ProcessRequest(HttpWorkerRequest wr)

The problem is that HttpWorkerRequest is abstract and has over 70 methods that need to be implemented to be able to pass every possible request to the ASP.NET pipeline (the HttpRuntime calls into those methods to find out the details of the request).

So whatever host calls into the HttpRuntime will have to provide its own implementation of HttpWorkerRequest for wrapping its http requests.

We can check who did so by searching for derived types:

(screenshot: the types derived from HttpWorkerRequest – SimpleWorkerRequest plus a set of IIS-internal worker request implementations)

All except SimpleWorkerRequest are IIS specific implementations, tightly integrated with the proprietary, native code IIS engine. SimpleWorkerRequest itself is an implementation which, when used to wrap HTTP requests, allows you to send simple requests into the ASP.NET pipeline. However, that’s about it: the concept of streams, certificates (security), etc. is completely missing, so it’s not much more than a proof of concept: it won’t let you unleash the full power of the ASP.NET engine.

So we’re out of luck: we have to make our own implementation. An (incomplete) example can be found in an old MSDN Magazine.
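
To give an idea of what’s involved, here is a heavily trimmed sketch of such a wrapper, covering only the abstract members (barely enough for simple GET requests; a real implementation needs many more overrides for headers, streams and security):

using System;
using System.IO;
using System.Net;
using System.Web;

public class HttpListenerWorkerRequest : HttpWorkerRequest
{
    private readonly HttpListenerContext _context;

    public HttpListenerWorkerRequest(HttpListenerContext context, string virtualDir, string physicalDir)
    {
        _context = context;
        // virtualDir/physicalDir are needed for the path mapping overrides, omitted here.
    }

    // Request side: delegate everything to the HttpListener request.
    public override string GetUriPath() { return _context.Request.Url.LocalPath; }
    public override string GetQueryString() { return _context.Request.Url.Query.TrimStart('?'); }
    public override string GetRawUrl() { return _context.Request.RawUrl; }
    public override string GetHttpVerbName() { return _context.Request.HttpMethod; }
    public override string GetHttpVersion() { return "HTTP/" + _context.Request.ProtocolVersion; }
    public override string GetRemoteAddress() { return _context.Request.RemoteEndPoint.Address.ToString(); }
    public override int GetRemotePort() { return _context.Request.RemoteEndPoint.Port; }
    public override string GetLocalAddress() { return _context.Request.LocalEndPoint.Address.ToString(); }
    public override int GetLocalPort() { return _context.Request.LocalEndPoint.Port; }

    // Response side: copy whatever the HttpRuntime produces onto the HttpListener response.
    public override void SendStatus(int statusCode, string statusDescription)
    {
        _context.Response.StatusCode = statusCode;
        _context.Response.StatusDescription = statusDescription;
    }

    public override void SendKnownResponseHeader(int index, string value)
    {
        // Restricted headers (like content-length) must go through dedicated properties.
        if (index == HeaderContentLength)
            _context.Response.ContentLength64 = long.Parse(value);
        else
            _context.Response.Headers[GetKnownResponseHeaderName(index)] = value;
    }

    public override void SendUnknownResponseHeader(string name, string value)
    {
        _context.Response.Headers[name] = value;
    }

    public override void SendResponseFromMemory(byte[] data, int length)
    {
        _context.Response.OutputStream.Write(data, 0, length);
    }

    public override void SendResponseFromFile(string filename, long offset, long length)
    {
        using (var file = File.OpenRead(filename))
        {
            file.Seek(offset, SeekOrigin.Begin);
            var buffer = new byte[length];
            var read = file.Read(buffer, 0, buffer.Length);
            _context.Response.OutputStream.Write(buffer, 0, read);
        }
    }

    public override void SendResponseFromFile(IntPtr handle, long offset, long length)
    {
        throw new NotSupportedException();
    }

    public override void FlushResponse(bool finalFlush)
    {
        _context.Response.OutputStream.Flush();
    }

    public override void EndOfRequest()
    {
        _context.Response.OutputStream.Close();
    }
}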

Wrapping the request and sending it into the pipeline then looks like this:

private void ProcessRequest(HttpListenerContext context)
{
   _log.DebugFormat("Processing request");
   var wr = new HttpListenerWorkerRequest(context, VPath, PPath);
   HttpContext.Current = new HttpContext(wr);

   HttpRuntime.ProcessRequest(wr);
   _log.DebugFormat("Finished processing request");
}

But it fails at runtime…

It turns out we run into another problem: since the request is received in the default AppDomain and processed in the ASP.NET AppDomain, we have to pass either the HttpListenerContext – like we did above – or our implementation of HttpWorkerRequest between AppDomains. No matter how you do this (MarshalByRefObject, WCF), it requires serialization. And guess what? Neither is easily serializable, not least because they contain the request and response streams…

At this point I decided it would save me a lot of work if I moved the Listener we created into the ASP.NET AppDomain as well, and do away with the serialization mess.

So ultimately, I ended up with a factory method which:

  1. creates a new appdomain and hosts my implementation of HttpApplication
  2. sets up a HttpListener in this new AppDomain
  3. wraps the HttpListenerContext in a custom implementation of HttpWorkerRequest
  4. sends it into the HttpRuntime

And finally…it works…I managed to have a fully functioning MVC application hosted in-process with my main application.

There are some loose ends to tie up though, like serving static files. I may touch on that in a follow up, but if you want to try it yourself, try the hostasp NuGet package incorporating these concepts right now.

UPDATE: or try the source code


One (AppFabric) workflow with multiple persistence stores

For a new project, we decided to model and implement the business processes of our customers using Windows Workflow Foundation 4.

To do this we use a single workflow definition in the form of a workflow service host (.xamlx) in IIS, which depending on the type of process invokes different final actions. To persist our workflow instances and manage their lifecycle we use Windows Server AppFabric.

Initially, we started with the default configuration: a single persistence database separate from our customer databases. This presented a problem: in rollback scenarios, rolling back only the database of one customer would cause the workflows and that customer’s other business data to get out of sync. One solution would be to also roll back the workflow database; however, in a scenario where multiple customers all use the same persistence database, that is not an option either.

In the end, the best approach is to keep a customer’s business data and workflow persistence data in the same location, so both can be managed at the same time. However, AppFabric has no built-in support for the scenario where a single WF service definition, exposed to all customers, uses a separate instance store per customer for persistence.

So we had to create support for this scenario ourselves, and the schematic solution consists of the following steps:

  • Create a factory for the xamlx service, which enables us to set the persistence store for a new workflow service host in code based on the customer.
  • Tell the factory which customer should be serviced for a request. The only way in which we can do this is by exposing the factory on a different url for every customer.
  • To expose the factory dynamically based on the customers in the database, we used a virtual path provider.

Technically, what we did was:

1 – In the Global.asax of the web project, register the virtual path provider:

protected override void OnApplicationStarted()
{            
    HostingEnvironment.RegisterVirtualPathProvider(new MyPathProvider());
    base.OnApplicationStarted();
}

2 – The virtual path provider uses a simple parser – MyServiceUrlParser (not shown) – to find out which customer is connecting (a unique id in the url); if the url ends in “svc” it returns a virtual file (MyServiceFile) with a service host definition. The first two overrides are necessary for serving the file, the last two for preventing IIS from throwing exceptions for not finding the folder or file in its cache:

public class MyPathProvider : VirtualPathProvider
{                
    public override bool FileExists(string virtualPath)
    {            
        return !string.IsNullOrWhiteSpace(MyServiceUrlParser.Parse(virtualPath).Service) || base.FileExists(virtualPath);
    }

    public override VirtualFile GetFile(string virtualPath)
    {
        var match = MyServiceUrlParser.Parse(virtualPath);

        if (match.ServiceType.EndsWith("svc")) return new MyServiceFile(virtualPath);
        if (match.ServiceType.EndsWith("xamlx")) return base.GetFile("~/Workflows/MyFlow.xamlx");

        return base.GetFile(virtualPath);
    }

    public override System.Web.Caching.CacheDependency GetCacheDependency(string virtualPath, System.Collections.IEnumerable virtualPathDependencies, DateTime utcStart)
    {
        return MyServiceUrlParser.IsValidServiceUrl(virtualPath) ? null : base.GetCacheDependency(virtualPath, virtualPathDependencies, utcStart);            
    }

    public override bool DirectoryExists(string virtualDir)
    {
        return MyServiceUrlParser.IsValidServiceUrl(virtualDir) || base.DirectoryExists(virtualDir);
    }
}

3 – The virtual file (svc) is served at this customer dependent path, which triggers IIS to load the MyFlow.xamlx from the customer specific url. Since our virtual path provider is set up to return MyFlow.xamlx for any xamlx request, it serves the one workflow for all customers:

public class MyServiceFile : VirtualFile
{
    public MyServiceFile(string virtualPath) : base(virtualPath) { }

    //return the same factory and service for every url... only the factory will pick another persistence db based on the url
    public override Stream Open()
    {
        var serviceDef = new MemoryStream();
        var defWriter = new StreamWriter(serviceDef);

        // Write host definition. The exact directive was lost in the original post's markup;
        // something like this, pointing at our factory, is what's intended:
        defWriter.Write("<%@ ServiceHost Service=\"MyFlow.xamlx\" Factory=\"DynamicHostFactory\" %>");
        defWriter.Flush();

        serviceDef.Position = 0;
        return serviceDef;
    }
}

4 – Finally, the factory for setting the persistence store:

public class DynamicHostFactory : WorkflowServiceHostFactory
{        
    protected override WorkflowServiceHost CreateWorkflowServiceHost(WorkflowService service, Uri[] baseAddresses)
    {
        var host = new WorkflowServiceHost(service, baseAddresses);

        host.DurableInstancingOptions.InstanceStore = SetupInstanceStore(baseAddresses);

        return host;
    }

    private static SqlWorkflowInstanceStore SetupInstanceStore(Uri[] baseAddresses)
    {
        // Derive the customer from the url and look up the matching connection string.
        // GetConnectionStringForCustomer is a placeholder for that lookup.
        var connectionString = GetConnectionStringForCustomer(baseAddresses[0]);
        return new SqlWorkflowInstanceStore(connectionString);
    }
}

And there we have it, a single xamlx workflow definition with a dedicated persistence database per customer.