Wednesday 12 December 2007

Not so Static Classes?

A colleague of mine noticed something interesting whilst working with C# the other day.
He had a breakpoint in a static constructor (on a non-static class) which seemed to be hit several times during execution of his program.
He discussed this with me and we both agreed that in C# it should not be possible for a static constructor to be invoked more than once in the same AppDomain, contrary to the behaviour he was experiencing.

I consulted MSDN, and found the following definitions:
Static Classes [C#]
"Static class members can be used to separate data and behavior that is independent of any object identity..."
"Static classes are loaded automatically by the .NET Framework common language runtime (CLR) when the program or namespace containing the class is loaded."

Static Constructors
"A static constructor is used to initialize any static data, or to perform a particular action that needs performed once only. It is called automatically before the first instance is created or any static members are referenced."

My colleague was working with .Net 3.0 using a class that derived from ClientBase in the System.ServiceModel namespace.
Basically, he added a static constructor to define a single EndPoint for all the instances of this derived ClientBase(T).

Upon further inspection, we saw that what was actually happening was that the Assembly containing the derived class was being loaded and instantiated by WCF using Reflection.

With this in mind, I decided to go back to basics and create a simple program to see whether there was a difference in the behaviour of static classes and static constructors when working with an Assembly loaded using Reflection.

I created a new C# solution that contained a Console Application called Demo and a Class Library called ReflectedAssembly.

1. Demo
In Demo, I added a static class called MyStaticClass.
To this class I added a static constructor and a public static method called Call which had a string parameter.
When Call is invoked, it will write the string parameter to the Console.
In the static constructor, I added a call to the Call method, passing "cctor" as a parameter.

2. ReflectedAssembly
I added a reference from ReflectedAssembly to Demo.
I added a class named ReflectedClass and a single method named DoIt which accepted a string as a parameter.
In the DoIt method, I added a call to MyStaticClass.Call, passing the string parameter.

3. Demo
Back in Demo, I wrote the following in the program Main:
---
Assembly TheAssembly;
Type TheType;

MyStaticClass.Call("main start");

TheAssembly = Assembly.LoadFile("**PATH**");
TheType = TheAssembly.GetType("ReflectedClass");

if (TheType != null)
TheType.GetMethod("DoIt").Invoke(Activator.CreateInstance(TheType), new object[] { "hello" });

MyStaticClass.Call("main end");

---

I added a breakpoint to the static constructor in MyStaticClass, but was disappointed to see it only get hit once. It seems I had not managed to reproduce the scenario that my colleague had experienced.

I spent some time trying to figure out precisely what would cause the static class to be constructed mulitple times.
In the end I discovered that it could be achieved in the following way:
1. In the program code, making a copy of the ReflectedAssembly and Demo Assemblies in a different folder.
2. Using Assembly.LoadFrom to load the ReflectedAssembly Assembly as opposed to Assembly.LoadFile.

Once my code was changed to use this method, I found that my static constructor was being invoked twice.

Next, I wanted to find out why this was happening.
I added some additional code to the Call method in MyStaticClass, such that the following information was also written out to the Console:
  • The process ID

  • The managed thread ID

  • The AppDomain ID

  • The handle of the current Call method

I was hoping to see that somewhere, one of these values would be different.
My initial concern was that there may be some kind of context based issue around multi-threading.

After running the program again, I discovered that the process, thread and AppDomain IDs were all the same.
The difference was in the MethodHandle for the Call method.
In the two calls from the program Main (where "main start" and "main end" were written), the handle is the same, but in the call from ReflectedClass, the handle was different.
This seems to indicate that the instance of MyStaticClass that is used by ReflectedClass differs from the instance used by the program Main, which would explain why the static constructor was being called twice.

By enumerating the Assemblies of the current AppDomain, I can see that there are actualy two copies of my Demo Assembly in memory.
It appears that each contains an instance of the static class MyStaticClass, and this is why the static constructor has been invoked twice.
It is interesting to note that if I use Assembly.LoadFile to load ReflectedAssembly, there will also be two copies of the Assembly in memory, but the reference to the Assembly has been resolved to the same instance as my program Main is running.
However, if I use Assembly.LoadFrom, there seems to be an issue with resolving the reference to the existing Assembly and the reference is resolved to the new Assembly in memory instead.

I decided to take my work a little further and see if I could produce multiple instances of the static class in memory.
I modified my program so that it loads the ReflectedAssembly Assembly three times, from a different location each time.
If I use Assembly.LoadFile, although the Assembly is actually loaded into memory an additional three times, the reference to Demo is always resolved to the first copy in memory, the static constructor only runs once, and the calls to the MyStaticClass.Call method are on the same instance of the static class.
However, using Assembly.LoadFrom (with the Assemblies in a different location each time) causes the same behaviour as before - this time the static constructor is invoked 4 times, and the calls to MyStaticClass.Call are on different instances of the static class.

I refer to the MSDN entry for the Assembly.LoadFrom method:
"If an assembly with the same identity is already loaded, LoadFrom returns the loaded assembly even if a different path was specified."
This does not appear to be true. The code I ran above demonstrates that actually, LoadFrom is returning the newly loaded Assembly each time, and that the Assembly references are being resolved to newly loaded Assemblies too.

Demo Program
A demo program that shows this in detail can be downloaded here.
You are welcome to the source code, which is written in C# 2.0. Please contact me if you would like it.

Conclusion
Static classes and static constructors may only static in the context of the loaded Assembly.
If using Reflection to load Assemblies, you may unknowingly end up with multiple instances of your static classes contrary to accepted wisdom.
Essentially, you cannot rely on there being a single instance of a static class in your AppDomain.

In addition, using Assembly.LoadFrom does not guarantee that any Assemblies you are loading are resolved to in-memory copies, despite what is asserted in MSDN.
This is particularly important because this is what it appears is happening when using WCF in .Net 3.0.

None of this necessarily poses a problem however: as long as we are aware that this is the case we can work with it.

3 comments:

Anonymous said...

Bravo! I stumbled across the same problem in the 2.0 and was also mislead by MSDN. But the truth seems to be what you wrote here!

Wayne Bayever said...

Thank you! Thank you! I have been hitting my head against this problem for hours. Your post helped me figure it out.

Daniel said...

Excellent. I just hit that problem experiencing the same thing. Did you inform Microsoft? (Besides, you did work very scientifically - congratulations. You might want to publish a paper on that issue...)