More on yield

Yesterday I discovered and introduced the yield keyword. Today, I wanted to see some examples and to discover how it is implemented. Let's start with a classic example, the Fibonacci function, both in Python and in C#.
def fib():
   a, b = 0, 1
   while 1:
       yield b
       a, b = b, a + b
 
static IEnumerable<int> Fib(int max)
{
int a = 0;
int b = 1;
yield return 1;

for (int i = 0; i < max - 1; i++)
{
int c = a + b;
yield return c;

a = b;
b = c;
}
}

The two method are essentially equivalent. How are they implemented? The first thought that came into my mind when I saw the Fibonacci example was: the compiler traslate it to static variables. But  clearly this is not the case: if two clients call the same function, they will share its internal state, which is not desiderable.
The second guess was: well, as in anonymous delegates the current state is captured by an anonymous class generated by the compiler. The compiler generates it inseting one field for every local variable in the function, and then instantiate an object of that class for every call site. This explains the Fibonacci function, but what about this one?

static IEnumerable<string> gen2()
{

yield return "A";

string s = "A" + "B";
yield return s;

s = 1.ToString();
yield return s;

for (int i = 0; i < 5; ++i)
yield return i.ToString();
}
Clearly, execution of the function body must stop after a yield statement, like after a return statement, but it must also be resumed in the same place, not at the begininng of the function. So, we have to save also the point at which execution stopped, and resume it (with a jump right at the beginning of the function) right at the same point. Using the excellent Reflector tool by Lutz Roeder you can see in pseudo-code (reverse engineered C#) how this class is generated and how the function is wrapped inside MoveNext, with a big switch right at the beginning of the function that allows resuming execution at different points based on a state variable:
[CompilerGenerated]
private sealed class d__7 : IEnumerable<string>, IEnumerable, 
   
IEnumerator<string>,IEnumerator, IDisposable { // Methods [DebuggerHidden] public d__7(int <>1__state);
      private bool MoveNext()
      {
         switch (this.<>1__state)
         {
            case 0:
            {
                  this.<>1__state = -1;
                  this.<>2__current = "A";
                  this.<>1__state = 1;
                  return true;
            }
            case 1:
            {
                  this.<>1__state = -1;
                  this.5__8 = "AB";
                  this.<>2__current = this.5__8;
                  this.<>1__state = 2;
                  return true;
            }
            case 2:
            {
                  this.<>1__state = -1;
                  int num2 = 1;
                  this.5__8 = num2.ToString();
                  this.<>2__current = this.5__8;
                  this.<>1__state = 3;
                  return true;
            }
            case 3:
            {
                  this.<>1__state = -1;
                  this.5__9 = 0;
                  while (this.5__9 < 5)
                  {
                        this.<>2__current = this.5__9.ToString();
                        this.<>1__state = 4;
                        return true;
                  Label_00DE:
                        this.<>1__state = -1;
                        this.5__9++;
                  }
                  break;
            }
            case 4:
            {
                  goto Label_00DE;
            }
         }
         return false;
      }
[DebuggerHidden] IEnumerator<string> IEnumerable<string>.GetEnumerator(); [DebuggerHidden] IEnumerator IEnumerable.GetEnumerator(); [DebuggerHidden] void IEnumerator.Reset(); void IDisposable.Dispose(); // Properties string IEnumerator<string>.Current { [DebuggerHidden] get; } object IEnumerator.Current { [DebuggerHidden] get; } // Fields private int <>1__state; private string <>2__current; public int 5__9; public string 5__8; }
All this reasoning happened yesterday with a guy that work on dynamic languages (I don't want to say who right now... I want only to say "cross your fingers for me!" I may have great news for you in the future). This guy correctly pointed out an issue with this approach: what if the yield statement is inside a try block? In CIL, you can't jump inside a try block: you can only jump at the first instruction of the block. I was puzzled at first: what can we do? The simplest solution was to not permit the mixing of try/catch and yield. But surely this is limitating (Python allow yield to be used almost everywhere, the only exception being yield not allowed in try-finally). The guy then gave me an hint: obviously, once you are inside the block, you can jump whenever you like inside that very block. So, the execution can be resumed using a sort of multiple-dispatch. It was all very clear at once: you have to make a first jump inside the right try block, then at the correct location inside the block. In a function like this one
def f():
   try:
       yield 1
       try:
           yield 2
           1/0
           yield 3  # never get here
       except ZeroDivisionError:
           yield 4
           yield 5
           raise
       except:
           yield 6
       yield 7     # the "raise" above stops this
   except:
       yield 8
   yield 9
   try:
       x = 12
   finally:
       yield 10
   yield 11

The compiler will translate this code in something like:

 L_0001: ldfld int32 YieldTest.Program/d__7::<>1__state
 L_0006: stloc.1 
 L_0007: ldloc.1 
 L_0008: switch (L_0031, L_0023, L_0025, L_0027, L_002c)
 L_0021: br.s L_0033
 L_0023: br.s L_0059
 L_0025: br.s L_0085
 L_0027: br L_00b2
...
 L_0033: switch (L_0034, L_0037)
 L_0034: ldc.i4 2 
 L_0035: stfld int32 YieldTest.Program/d__7::<>2__current
 L_0036: br L_0105


The first switch (at position L_0008) selects the try block, the second one (position L_0033) selects the yield statement inside that block. I tried to translate this Python function in C#, to see how the IL code generated by the C# compiler looks like, and with my great surprise the following code

static IEnumerable<int> f()
{
try
{
yield return 1;
try
{
yield return 2;
int k = 1 / 0;
yield return 3; //never get here
}
catch (System.DivideByZeroException)
{
yield return 4;
yield return 5;
throw;
}
catch
{
yield return 6;
yield return 7; //the "raise" above stops this
}
}
catch
{
yield return 8;
yield return 9;
}
try
{
int x = 12;
}
finally
{
yield return 10;
yield return 11;
}
}

did not compiled at all! The compiler spits out the following errors:

error CS1626: Cannot yield a value in the body of a try block with a catch clause
error CS1631: Cannot yield a value in the body of a catch clause
error CS1625: Cannot yield in the body of a finally clause

Apparently, the C# team glissed over the problem by not allowing the same behaviour as in Python.


Copyright 2020 - Lorenzo Dematte