So, in the end, what went into Pumpkin?

Was control performed at compilation time or at execution time? And if at execution time, using which technique?

In general, compilation has a big pro (you can immediately notify the snippet's creator that they did something wrong, and even prevent the code block from becoming an executable snippet) and a big con (you control only the code that is written. What if the user code calls down some (legitimate) path in the BCL that results in undesired behaviour?)

AppDomain sandboxing has some big pros (simple, designed with security in mind) and a big con (no "direct" way to control some kinds of resource usage, like thread time or CPU time).
Hosting has a big advantage (fine control of everything, including "third" assemblies like the BCL) which is also its big disadvantage (you HAVE to do everything by yourself).

So each of them can handle the same issue with different efficacy. Consider the issue of controlling thread creation:
  • at compilation, you "catch" constructs that create a new thread (new Thread, Task.Factory.StartNew, ThreadPool.QueueUserWorkItem, ...)
    • you have to find all of them, and live with code that creates a thread indirectly (for example, through a library call you did not catch).
    • but you can do wonderful things, like intercepting calls to thread and sync primitives and substituting them - run them on your own scheduler!
  • at runtime, you:
    • (AppDomain) check periodically, counting the threads created since the last check.
    • (hosting) you are notified of thread creation, so you monitor it.
    • (debugger) you are notified as well, and you can even suspend the user code immediately before/after.
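
To make the comparison above a bit more concrete, here is a minimal sketch of the two ends of the spectrum: a compile-time scan for thread-creating constructs and a runtime check that polls the thread count. It is written in Python as an analogue only, not Pumpkin's actual C# code; the deny-list, `find_thread_creation` and `ThreadCountMonitor` are hypothetical names made up for illustration.

```python
import ast
import threading

# Hypothetical deny-list: Python stand-ins for new Thread,
# Task.Factory.StartNew, ThreadPool.QueueUserWorkItem, ...
THREAD_CREATING_CALLS = {
    "threading.Thread",
    "_thread.start_new_thread",
    "concurrent.futures.ThreadPoolExecutor",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_thread_creation(source):
    """Compile-time check: list (line, call_name) for each deny-listed call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in THREAD_CREATING_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)


class ThreadCountMonitor:
    """Runtime check by polling: count threads created since the last check."""

    def __init__(self, max_new_threads):
        self.max_new_threads = max_new_threads
        self.last_count = threading.active_count()

    def check(self):
        """Return (within_limit, new_threads_since_last_check)."""
        current = threading.active_count()
        new_threads = max(0, current - self.last_count)
        self.last_count = current
        return new_threads <= self.max_new_threads, new_threads
```

The scan only sees what is literally written: an aliased import such as `from threading import Thread`, or a library that spawns threads internally, slips through, which is exactly the "find all of them" problem. The polling monitor catches those cases, but only after the fact and only at the next check.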

Another example:
  • at compilation, you control which namespaces can be used (indirectly controlling the assembly)
  • at runtime you can control which assemblies are really loaded (you are either notified OR asked to load them - and you can prevent the loading)
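
The compile-time half of this second example can be sketched in the same way. Again this is a Python analogue under stated assumptions, not the checks Pumpkin really performs: Python modules stand in for .NET namespaces, and `ALLOWED_MODULES` and `disallowed_imports` are hypothetical names.

```python
import ast

# Hypothetical allow-list, standing in for the namespaces a snippet may use.
ALLOWED_MODULES = {"math", "json", "itertools"}


def disallowed_imports(source):
    """Return the sorted top-level modules imported outside the allow-list."""
    bad = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name; they are reported as "".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in ALLOWED_MODULES:
                bad.add(top_level)
    return sorted(bad)
```

A snippet that fails this check can be rejected before it ever becomes executable. The check says nothing about what is loaded dynamically at execution time, which is why the runtime side (being notified of, or asked about, each assembly load) is still needed.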

What I ended up doing is using a mix of techniques.

In particular, I implemented some compiler checks.
Then, I run the compiled IL in a separate AppDomain with a restricted PermissionSet (sandboxing).
Finally, all the managed code runs in a hosted CLR.

I am not crazy...
 

Guess who is using the same technique? (well, not compiler checks/rewriting, but AppDomain sandboxing + Hosting?)
A piece of software that has the same problem, i.e. running unknown, third party pieces of code from different parties in a reliable, efficient way: IIS.
There is very little information on the subject; it is not one of those things for which extensive documentation is already available. Sure, MSDN has documented it (MSDN has documentation for everything, thankfully), but there are no tutorials and no Q&As on the subject on StackOverflow. But the pieces of information you find in blogs and articles suggest that this technology is used in two Microsoft products: SQL Server, for which the Hosting API was created, and IIS.

Also, this is a POC, so one of the goals is to let me explore different ways of doing the same thing, and to assess their robustness and speed of execution. Testing different technologies is part of the game :)


Copyright 2020 - Lorenzo Dematte