Wednesday, 12 December 2007

Not so Static Classes?

A colleague of mine noticed something interesting whilst working with C# the other day.
He had a breakpoint in a static constructor (on a non-static class) which seemed to be hit several times during execution of his program.
He discussed this with me and we both agreed that in C# it should not be possible for a static constructor to be invoked more than once in the same AppDomain, contrary to the behaviour he was experiencing.

I consulted MSDN, and found the following definitions:
Static Classes [C#]
"Static class members can be used to separate data and behavior that is independent of any object identity..."
"Static classes are loaded automatically by the .NET Framework common language runtime (CLR) when the program or namespace containing the class is loaded."

Static Constructors
"A static constructor is used to initialize any static data, or to perform a particular action that needs performed once only. It is called automatically before the first instance is created or any static members are referenced."

My colleague was working with .Net 3.0 using a class that derived from ClientBase in the System.ServiceModel namespace.
Basically, he added a static constructor to define a single EndPoint for all the instances of this derived ClientBase(T).

Upon further inspection, we saw that what was actually happening was that the Assembly containing the derived class was being loaded and instantiated by WCF using Reflection.

With this in mind, I decided to go back to basics and create a simple program to see whether there was a difference in the behaviour of static classes and static constructors when working with an Assembly loaded using Reflection.

I created a new C# solution that contained a Console Application called Demo and a Class Library called ReflectedAssembly.

1. Demo
In Demo, I added a static class called MyStaticClass.
To this class I added a static constructor and a public static method called Call which had a string parameter.
When Call is invoked, it will write the string parameter to the Console.
In the static constructor, I added a call to the Call method, passing "cctor" as a parameter.

2. ReflectedAssembly
I added a reference from ReflectedAssembly to Demo.
I added a class named ReflectedClass and a single method named DoIt which accepted a string as a parameter.
In the DoIt method, I added a call to MyStaticClass.Call, passing the string parameter.

3. Demo
Back in Demo, I wrote the following in the program Main:
---
Assembly TheAssembly;
Type TheType;

MyStaticClass.Call("main start");

TheAssembly = Assembly.LoadFile("**PATH**");
TheType = TheAssembly.GetType("ReflectedClass");

if (TheType != null)
TheType.GetMethod("DoIt").Invoke(Activator.CreateInstance(TheType), new object[] { "hello" });

MyStaticClass.Call("main end");

---

I added a breakpoint to the static constructor in MyStaticClass, but was disappointed to see it only get hit once. It seems I had not managed to reproduce the scenario that my colleague had experienced.

I spent some time trying to figure out precisely what would cause the static class to be constructed mulitple times.
In the end I discovered that it could be achieved in the following way:
1. In the program code, making a copy of the ReflectedAssembly and Demo Assemblies in a different folder.
2. Using Assembly.LoadFrom to load the ReflectedAssembly Assembly as opposed to Assembly.LoadFile.

Once my code was changed to use this method, I found that my static constructor was being invoked twice.

Next, I wanted to find out why this was happening.
I added some additional code to the Call method in MyStaticClass, such that the following information was also written out to the Console:
  • The process ID

  • The managed thread ID

  • The AppDomain ID

  • The handle of the current Call method

I was hoping to see that somewhere, one of these values would be different.
My initial concern was that there may be some kind of context based issue around multi-threading.

After running the program again, I discovered that the process, thread and AppDomain IDs were all the same.
The difference was in the MethodHandle for the Call method.
In the two calls from the program Main (where "main start" and "main end" were written), the handle is the same, but in the call from ReflectedClass, the handle was different.
This seems to indicate that the instance of MyStaticClass that is used by ReflectedClass differs from the instance used by the program Main, which would explain why the static constructor was being called twice.

By enumerating the Assemblies of the current AppDomain, I can see that there are actualy two copies of my Demo Assembly in memory.
It appears that each contains an instance of the static class MyStaticClass, and this is why the static constructor has been invoked twice.
It is interesting to note that if I use Assembly.LoadFile to load ReflectedAssembly, there will also be two copies of the Assembly in memory, but the reference to the Assembly has been resolved to the same instance as my program Main is running.
However, if I use Assembly.LoadFrom, there seems to be an issue with resolving the reference to the existing Assembly and the reference is resolved to the new Assembly in memory instead.

I decided to take my work a little further and see if I could produce multiple instances of the static class in memory.
I modified my program so that it loads the ReflectedAssembly Assembly three times, from a different location each time.
If I use Assembly.LoadFile, although the Assembly is actually loaded into memory an additional three times, the reference to Demo is always resolved to the first copy in memory, the static constructor only runs once, and the calls to the MyStaticClass.Call method are on the same instance of the static class.
However, using Assembly.LoadFrom (with the Assemblies in a different location each time) causes the same behaviour as before - this time the static constructor is invoked 4 times, and the calls to MyStaticClass.Call are on different instances of the static class.

I refer to the MSDN entry for the Assembly.LoadFrom method:
"If an assembly with the same identity is already loaded, LoadFrom returns the loaded assembly even if a different path was specified."
This does not appear to be true. The code I ran above demonstrates that actually, LoadFrom is returning the newly loaded Assembly each time, and that the Assembly references are being resolved to newly loaded Assemblies too.

Demo Program
A demo program that shows this in detail can be downloaded here.
You are welcome to the source code, which is written in C# 2.0. Please contact me if you would like it.

Conclusion
Static classes and static constructors may only static in the context of the loaded Assembly.
If using Reflection to load Assemblies, you may unknowingly end up with multiple instances of your static classes contrary to accepted wisdom.
Essentially, you cannot rely on there being a single instance of a static class in your AppDomain.

In addition, using Assembly.LoadFrom does not guarantee that any Assemblies you are loading are resolved to in-memory copies, despite what is asserted in MSDN.
This is particularly important because this is what it appears is happening when using WCF in .Net 3.0.

None of this necessarily poses a problem however: as long as we are aware that this is the case we can work with it.

Sunday, 2 December 2007

Private Invoking Using Reflection

Whilst working on a product the other day, a thought occured to me with regard to Reflection in Microsoft.Net.
I wondered whether it would be possible to create an instance of a type with Internal or Private scope which was located in another assembly.

As I understand it:
  • An object declared with Internal scope should only be visible inside the assembly in which it is located.

  • An object with Private scope should only be visible to the object that contains it. Private objects can not be declared at namespace level, only within other objects.
To see whether it was possible, I created a test.
I created a Class Library, FormLib, which contained a Public class name FormWrapper.
FormWrapper contains three further classes: PublicForm, InternalForm and PrivateForm.

PublicForm is declared as a Public class with a Public constructor.
InternalForm is declared as an Internal class with an Internal constructor.
PrivateForm is declared as a Private class with a Private constructor.
All three classes inherit from Form and contain a single Label control that states "This is a ... form", with "..." replaced by the scope of the class.

I then created a Windows Forms project called Demo which contained a Form. Demo is the startup project.
In Demo, I added three buttons, which would attempt to create an instance of PublicForm, InternalForm and PrivateForm respecitvely.

To start, I needed to obtain the Assembly object representing FormLib. This was relatively easy and was achieved by referencing FormLib from my Demo project and then enumerating the Assembly objects contained in the current AppDomain:
---
...
Assembly _Target;

foreach (Assembly Item in AppDomain.CurrentDomain.GetAssemblies())
{
  if (Item.FullName.StartsWith("FormLib"))
  {
    _Target = Item;
    break;
  }
}
...

---

I assumed creating an instance of the Public class would be easiest, so I implemented this first.
The Assembly object has a method CreateInstance, which can be used to generate an instance of a type, as so:
---
...
object TheObject;

TheObject = _Target.CreateInstance("FormLib.FormWrapper+PublicForm", false, BindingFlags.Public | BindingFlags.Instance, null, null, null, null);
...

---

The first parameter of CreateInstance is the full name of the type I want to create.
"FormWrapper+PublicForm" indicates that the type PublicForm is inside the type FormWrapper, which is unlike the standard syntax for a type inside a namespace, which would simply be "Namespace.Type".
The third paremeter, which is a BindingFlags enumeration, indicates:
  • Public - The public declarations of the type should be searched

  • Instance - An instance should be created
The way that CreateInstance actually works is to generate the Type specified in the first parameter ("FormLib.FormWrapper+PublicForm" in this case) and then using the static CreateInstance method of the Activator object to actually generate the instance. This in turn uses RuntimeType.CreateInstanceImpl (RuntimeType is Internal to System) which actually invokes the constructor of the Type.
We could, of course, use Activator directly, but we would be unable to do this with Internal and Private types in a different Assembly because the types are not available to us directly.

Now if the above code works, I want to actually display the form I have just created. To achieve this, following the above code I carry out the following:
---
...
if (TheObject is Form)
{
  Form TheForm;

  TheForm = (Form)TheObject;
  TheForm.ShowDialog();
}
...

---

I ran my code to see what happened, and PublicForm was shown as expected.

Moving on, I decided to try and repeat the above with the Internal and Private classes.
To achieve this, I made a small modification to the call to CreateInstance, first to create the types of the InternalForm and PrivateForm types, and second to change the BindingFlags to Private instead of Public:
---
...
TheObject = _Target.CreateInstance("FormLib.FormWrapper+InternalForm", false, BindingFlags.Private | BindingFlags.Instance, null, null, null, null);
...
TheObject = _Target.CreateInstance("FormLib.FormWrapper+PrivateForm", false, BindingFlags.Private | BindingFlags.Instance, null, null, null, null);
...

---

Upon executing the code, in both cases the form was shown as expected.
To provide further proof that internal or private methods could be executed using reflection, I decided to add a method named "AMethod" to each of the classes.
The scope of the method was the same as the class itself, so there was a Public, Internal and Private method.
In each method, a messagebox will be shown displaying that the method had been invoked.

Directly following the line "TheForm.ShowDialog();" in each case I added the following code to attempt to invoke the method:
---
...
foreach (MethodBase Item in TheObject.GetType().GetMethods(BindingFlags.Public | BindingFlags.Private | BindingFlags.Instance)
{
  Item.Invoke(TheObject, null);
  break;
}
...

---

The objective of the above code is to enumerate all of the publically and privately available methods in the type represented by TheObject.
This reveals the "AMethod" method (which is of type MethodBase), which I can then invoke by calling Invoke and passing the instance of the type in as a parameter.
The second parameter for Invoke is the parameters for the method call (which can be passed as an array of objects - "object[]"). In this case, the method has no parameters so I simply pass in null.

The method could also be invoked by using the InvokeMember method of the Type. However, I found this unsuccessful with methods with Private access so I discounted it.

A Potential Solution
A method that you could use to add some protection to your internal or private code is to monitor the stack trace when your method is invoked. You could also do this upon object construction by adding a monitor to your class constructors.
Basically, you can retrieve a list of the current call stack at the start of each method and ensure that the call to your method came from within your own class or within your own assembly, depending on whether it is a private or internal method respectively.
The following code demonstrates the use of this to ensure the caller is in the current assembly, which will validate access to an internal method:
---
...
StackTrace Trace = new StackTrace();
if (Trace.FrameCount > 2)
  if (Trace.GetFrame(1).GetType().Assembly != Assembly.GetExecutingAssembly())
    throw new Exception("The call to this method is invalid");
...

---

This works by examining the second method on the call stack (frame 1), which will be the direct caller of the current method on the call stack (frame 0). If the caller is not in the same assembly as the current frame then you know that it is in another assembly and has not been made internally.

This works a treat but sadly adds complexity to your program and slows it down. However, if you are producing code that must only be executed internally or privately, this method may offer a reasonable way of protecting it.

Conclusion
You may create classes for internal use only in your assembly, which have internal or private members, properties or methods, but you cannot rely on the fact that these classes and members will not be created and consumed outside of your own code.
The power of .Net Reflection demands that you must not consider your internal and private code to be internal or private.

If you would like the demo program I produced to show this working, it is available here.

Until next time!