A Brief Analysis of the Iterator Improvements in C# 2.0 and How They Are Implemented

2024-07-21 02:19:50

Contributed by: a site user

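C# borrowed a very practical statement from VB: foreach. For an instance of any class that supports the IEnumerable interface, foreach traverses the items through one uniform interface, so the tedious bookkeeping of the old, verbose for loops is handled entirely by the compiler. A class that supports IEnumerable usually implements IEnumerator in a nested class and, through IEnumerable.GetEnumerator, lets consumers such as the foreach statement carry out the traversal.
The feature is very convenient to use, but it comes at a cost. Juval Lowy's article "Create Elegant Code with Anonymous Methods, Iterators, and Partial Classes", published in the May 2004 issue of MSDN Magazine, describes iterator support and other new features of C# 2.0 in some detail.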

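First, because the IEnumerator.Current property is typed as object, every element of a value-type collection goes through a useless box and unbox when it is traversed with foreach. Even for a reference-type collection, foreach needs a redundant castclass instruction to guarantee that each enumerated value is converted to the correct type.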

Quoted code:

using System;
using System.Collections;

public class Tokens : IEnumerable
{
    ...
    Tokens f = new Tokens(...);

    foreach (string item in f)
    {
        Console.WriteLine(item);
    }
    ...
}

The simple code above is automatically expanded by the compiler into

Quoted code:

Tokens f = new Tokens(...);

IEnumerator e = f.GetEnumerator();
try
{
    // MoveNext must be called before Current is read for the first time
    while (e.MoveNext())
    {
        string item = (string)e.Current; // redundant cast

        Console.WriteLine(item);
    }
}
finally
{
    // must check at run time whether the class implementing IEnumerator also supports IDisposable
    IDisposable d = e as IDisposable;
    if (d != null)
    {
        d.Dispose();
    }
}
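
To make the hidden cast visible, here is a small sketch. The class name CastDemo and its helper method are made up for illustration. It runs foreach (string ...) over an ArrayList and reports how many items were read before the compiler-inserted cast failed on an element that is not a string.

```csharp
using System;
using System.Collections;

public static class CastDemo
{
    // Returns the number of items read before the hidden cast failed,
    // or -1 if every element could be cast to string.
    public static int CountUntilBadCast(ArrayList list)
    {
        int count = 0;
        try
        {
            // The compiler inserts (string)e.Current for every element.
            foreach (string item in list)
            {
                count++;
            }
        }
        catch (InvalidCastException)
        {
            return count;
        }
        return -1;
    }
}
```

The failure only shows up at run time, which is exactly the weakness the strongly typed generic interfaces address.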

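Fortunately, C# 2.0 supports generics and provides a strongly typed, generic version of IEnumerable. In pseudocode:

Quoted code:

namespace System.Collections.Generic
{
    // Simplified: in the released framework IEnumerable<T> also derives from IEnumerable.
    public interface IEnumerable<ItemType>
    {
        IEnumerator<ItemType> GetEnumerator();
    }

    // Simplified: the released IEnumerator<T> also derives from the non-generic IEnumerator.
    public interface IEnumerator<ItemType> : IDisposable
    {
        ItemType Current { get; }
        bool MoveNext();
    }
}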

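This both guarantees type safety while traversing the collection and lets the code work directly with the collection's actual element type, which avoids the redundant conversions and improves efficiency.

Quoted code:

using System;
using System.Collections.Generic;

public class Tokens : IEnumerable<string>
{
    ... // implement the IEnumerable<string> interface

    Tokens f = new Tokens(...);

    foreach (string item in f)
    {
        Console.WriteLine(item);
    }
}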

The code above is automatically expanded into

Quoted code:

Tokens f = new Tokens(...);

IEnumerator<string> e = f.GetEnumerator();
try
{
    while (e.MoveNext())
    {
        string item = e.Current; // no cast needed

        Console.WriteLine(item);
    }
}
finally
{
    // no run-time check for IDisposable is needed,
    // because IEnumerator<string> itself derives from IDisposable
    if (e != null)
    {
        e.Dispose();
    }
}
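
The try/finally in this expansion can be observed from ordinary code. The sketch below uses two hypothetical classes, Countdown and CountdownEnumerator, which are not part of the article. The enumerator is written by hand and records whether Dispose was called, so a caller can check that foreach disposes it even when the loop is left early with break.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator that counts down from a start value and records disposal.
public sealed class CountdownEnumerator : IEnumerator<int>
{
    private int next;
    private int current;
    public bool Disposed;

    public CountdownEnumerator(int start) { next = start; }

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        if (next <= 0) return false;
        current = next--;
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { Disposed = true; }
}

// Hypothetical collection that remembers the last enumerator it handed out.
public sealed class Countdown : IEnumerable<int>
{
    public CountdownEnumerator LastEnumerator;
    private readonly int start;

    public Countdown(int start) { this.start = start; }

    public IEnumerator<int> GetEnumerator()
    {
        LastEnumerator = new CountdownEnumerator(start);
        return LastEnumerator;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

After a loop such as foreach (int n in countdown) { if (n == 2) break; }, the LastEnumerator.Disposed flag is set, and no cast or box is needed to read each int.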

Besides the redundant conversions that hurt traversal performance, another annoyance in the current version of C# is that implementing IEnumerator is simply too much work. It is usually done in a nested class, and apart from get_Current() the rest of that class is basically the same every time, for example

Quoted code:

using System.Collections;

public class Tokens : IEnumerable
{
    public string[] elements;

    Tokens(string source, char[] delimiters)
    {
        // parse the string into tokens:
        elements = source.Split(delimiters);
    }

    public IEnumerator GetEnumerator()
    {
        return new TokenEnumerator(this);
    }

    // inner class implements IEnumerator interface:
    private class TokenEnumerator : IEnumerator
    {
        private int position = -1;
        private Tokens t;

        public TokenEnumerator(Tokens t)
        {
            this.t = t;
        }

        // declare the MoveNext method required by IEnumerator:
        public bool MoveNext()
        {
            if (position < t.elements.Length - 1)
            {
                position++;
                return true;
            }
            else
            {
                return false;
            }
        }

        // declare the Reset method required by IEnumerator:
        public void Reset()
        {
            position = -1;
        }

        // declare the Current property required by IEnumerator:
        public object Current
        {
            get // the get_Current function
            {
                return t.elements[position];
            }
        }
    }
    ...
}

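The position and Tokens fields of the nested TokenEnumerator class are in fact common to every class that implements IEnumerator; only the getter of the Current property differs. C# 2.0 improves this considerably: it adds the yield keyword, which lets that shared logic be reused. The lengthy code above shrinks to a few lines in C# 2.0, for example

Quoted code:

using System.Collections;
using System.Collections.Generic;

public class Tokens : IEnumerable<string>
{
    public IEnumerator<string> GetEnumerator()
    {
        for (int i = 0; i < elements.Length; i++)
            yield return elements[i];
    }

    // the released IEnumerable<string> also derives from IEnumerable,
    // so the non-generic call simply forwards to the generic one
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
    ...
}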

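GetEnumerator here is an iterator block, a construct supported by C# 2.0. The yield return statement tells the compiler when to return which value, and the compiler then does the bookkeeping of implementing IEnumerator<string> automatically. The yield break statement ends the iteration directly from inside the iterator block, for example

Quoted code:

public IEnumerator<int> GetEnumerator()
{
    for (int i = 1; i < 5; i++)
    {
        yield return i;
        if (i > 2)
            yield break; // end the traversal once i > 2
    }
}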
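
A quick way to check what this loop produces is to wrap it in an IEnumerable<int> iterator and collect the values. The class YieldBreakDemo and its methods are hypothetical names used only for this sketch.

```csharp
using System.Collections.Generic;

public static class YieldBreakDemo
{
    // Same loop as in the quoted code, written as an IEnumerable<int>
    // iterator so that it can be consumed directly.
    public static IEnumerable<int> FirstFew()
    {
        for (int i = 1; i < 5; i++)
        {
            yield return i;
            if (i > 2)
                yield break; // checked only after i has been yielded
        }
    }

    // Collects the sequence into a list: 1, 2, 3.
    public static List<int> Collect()
    {
        return new List<int>(FirstFew());
    }
}
```

Note that 3 is still produced: the test i > 2 runs after the yield return, so the iteration stops when the consumer asks for the value following 3.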

With this, implementing IEnumerator becomes easy, and a single class can conveniently offer several enumeration orders, for example

Quoted code:

using System.Collections.Generic;

public class CityCollection
{
    string[] m_Cities = { "New York", "Paris", "London" };

    public IEnumerable<string> Reverse
    {
        get
        {
            for (int i = m_Cities.Length - 1; i >= 0; i--)
                yield return m_Cities[i];
        }
    }
}
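
As a usage sketch, a class can expose each order as its own named property and the caller picks one in the foreach statement. CityList and its Forward property are hypothetical additions for illustration; only the reverse order appears in the quoted code.

```csharp
using System.Collections.Generic;

public class CityList
{
    private readonly string[] cities = { "New York", "Paris", "London" };

    // Items in storage order.
    public IEnumerable<string> Forward
    {
        get
        {
            foreach (string city in cities)
                yield return city;
        }
    }

    // Items from last to first.
    public IEnumerable<string> Reverse
    {
        get
        {
            for (int i = cities.Length - 1; i >= 0; i--)
                yield return cities[i];
        }
    }
}
```

A caller would then write foreach (string city in list.Reverse) or foreach (string city in list.Forward), with no enumerator class written by hand for either order.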

Next, let's look at what the compiler does for us behind such a convenient language feature. Taking the Tokens class above that supports IEnumerable<string> as the example, the compiler wraps the code of GetEnumerator in a class. In pseudocode (the names containing $ and <this> are compiler-generated and are not legal C# identifiers):

Quoted code:

public class Tokens : IEnumerable<string>
{
    private sealed class GetEnumerator$00000000__IEnumeratorImpl
        : IEnumerator<string>, IEnumerator, IDisposable
    {
        private int $pc = 0;
        private string $_current;
        private Tokens <this>;
        public int i$00000001 = 0;

        // implements the IEnumerator<string> interface
        string IEnumerator<string>.get_Current()
        {
            return $_current;
        }

        bool IEnumerator<string>.MoveNext()
        {
            switch ($pc)
            {
                case 0:
                {
                    $pc = -1;
                    i$00000001 = 0;
                    break;
                }
                case 1:
                {
                    $pc = -1;
                    i$00000001++;
                    break;
                }
                default:
                {
                    return false;
                }
            }

            if (i$00000001 < <this>.elements.Length)
            {
                $_current = <this>.elements[i$00000001];
                $pc = 1;

                return true;
            }
            else
            {
                return false;
            }
        }

        // implements the IEnumerator interface
        void IEnumerator.Reset()
        {
            throw new NotSupportedException();
        }

        object IEnumerator.get_Current()
        {
            return $_current;
        }

        bool IEnumerator.MoveNext()
        {
            return ((IEnumerator<string>)this).MoveNext(); // calls the IEnumerator<string> implementation
        }

        // implements the IDisposable interface
        void IDisposable.Dispose()
        {
        }
    }

    public IEnumerator<string> GetEnumerator()
    {
        GetEnumerator$00000000__IEnumeratorImpl impl = new GetEnumerator$00000000__IEnumeratorImpl();

        impl.<this> = this;

        return impl;
    }
}

The pseudocode shows that the C# 2.0 compiler in fact maintains a nested class very similar to the TokenEnumerator class we wrote by hand earlier, and uses it to encapsulate the implementation of IEnumerator<string>. The logic of this nested class is determined by where the yield statements appear in GetEnumerator.
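
Because the generated names are not legal C#, the same state machine can be restated as a hand-written class that compiles. This is only a sketch of the idea and not the compiler's actual output: the class name ArrayIterator and the field names pc, i and current are stand-ins chosen for readability.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written stand-in for the generated class, using legal C# names.
public sealed class ArrayIterator : IEnumerator<string>
{
    private readonly string[] elements;
    private int pc;         // 0 = not started, 1 = suspended at the yield, -1 = running or finished
    private int i;
    private string current;

    public ArrayIterator(string[] elements) { this.elements = elements; }

    public string Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (pc)
        {
            case 0: pc = -1; i = 0; break;  // first call: run the loop initializer
            case 1: pc = -1; i++; break;    // resume after the yield: run the loop increment
            default: return false;          // already finished
        }

        if (i < elements.Length)
        {
            current = elements[i];
            pc = 1;                         // remember where to resume
            return true;
        }
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

Each call to MoveNext resumes from the state stored in pc, which is what allows a single for loop in the source to be spread across many separate method calls.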
Next, let's look at a more complex iterator block, one that supports recursive iteration. The code is as follows:

Quoted code:

using System;
using System.Collections.Generic;

class Node<T>
{
    public Node<T> LeftNode;
    public Node<T> RightNode;
    public T Item;
}

public class BinaryTree<T>
{
    Node<T> m_Root;

    public void Add(params T[] items)
    {
        foreach (T item in items)
            Add(item);
    }

    public void Add(T item)
    {
        // ...
    }

    public IEnumerable<T> InOrder
    {
        get
        {
            return ScanInOrder(m_Root);
        }
    }

    IEnumerable<T> ScanInOrder(Node<T> root)
    {
        if (root.LeftNode != null)
        {
            foreach (T item in ScanInOrder(root.LeftNode))
            {
                yield return item;
            }
        }

        yield return root.Item;

        if (root.RightNode != null)
        {
            foreach (T item in ScanInOrder(root.RightNode))
            {
                yield return item;
            }
        }
    }
}
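
Since the body of Add is elided in the quoted code, the recursive traversal cannot be run as it stands. The following self-contained sketch fills that gap under stated assumptions: SortedTree<T>, its nested Node class and the insertion rule (smaller items go to the left) are hypothetical and are not taken from the article. Only the shape of the recursive iterator mirrors ScanInOrder.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal tree; the insertion rule is an assumption for this sketch.
public class SortedTree<T> where T : IComparable<T>
{
    private class Node
    {
        public Node Left, Right;
        public T Item;
    }

    private Node root;

    public void Add(T item)
    {
        Node node = new Node();
        node.Item = item;
        if (root == null) { root = node; return; }

        Node cur = root;
        while (true)
        {
            if (item.CompareTo(cur.Item) < 0)
            {
                if (cur.Left == null) { cur.Left = node; return; }
                cur = cur.Left;
            }
            else
            {
                if (cur.Right == null) { cur.Right = node; return; }
                cur = cur.Right;
            }
        }
    }

    public IEnumerable<T> InOrder { get { return Scan(root); } }

    // Recursive iterator: left subtree, then the node itself, then the right subtree.
    private IEnumerable<T> Scan(Node n)
    {
        if (n == null) yield break;   // also makes an empty tree safe to enumerate
        foreach (T x in Scan(n.Left)) yield return x;
        yield return n.Item;
        foreach (T x in Scan(n.Right)) yield return x;
    }
}
```

Adding 4, 2, 6, 1, 3 and enumerating InOrder yields 1, 2, 3, 4, 6. Every level of recursion creates its own enumerator, so each item from a deep node is passed up through one MoveNext call per level above it.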

BinaryTree<T> exposes an InOrder property that supports IEnumerable<T> and traverses the whole binary tree through the ScanInOrder function. Because IEnumerable<T> is implemented here not by the class itself but by a property (more precisely, because the iterator block ScanInOrder returns IEnumerable<T> rather than IEnumerator<T>), the compiler first has to generate a nested class that supports IEnumerable<T>. In pseudocode:

Quoted code:

public class BinaryTree<T>
{
    private sealed class ScanInOrder$00000000__IEnumeratorImpl<T>
        : IEnumerator<T>, IEnumerator, IDisposable
    {
        BinaryTree<T> <this>;
        Node<T> root;

        // ...
    }

    private sealed class ScanInOrder$00000000__IEnumerableImpl<T>
        : IEnumerable<T>, IEnumerable
    {
        BinaryTree<T> <this>;
        Node<T> root;

        IEnumerator<T> IEnumerable<T>.GetEnumerator()
        {
            ScanInOrder$00000000__IEnumeratorImpl<T> impl = new ScanInOrder$00000000__IEnumeratorImpl<T>();

            impl.<this> = this.<this>;
            impl.root = this.root;

            return impl;
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            ScanInOrder$00000000__IEnumeratorImpl<T> impl = new ScanInOrder$00000000__IEnumeratorImpl<T>();

            impl.<this> = this.<this>;
            impl.root = this.root;

            return impl;
        }
    }

    IEnumerable<T> ScanInOrder(Node<T> root)
    {
        ScanInOrder$00000000__IEnumerableImpl<T> impl = new ScanInOrder$00000000__IEnumerableImpl<T>();

        impl.<this> = this;
        impl.root = root;

        return impl;
    }
}

Because the body of ScanInOrder uses the root parameter, the wrapper classes for both IEnumerable<T> and IEnumerator<T> need a root field that stores the argument passed to ScanInOrder and hands it on to the code that finally does the work.
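
One observable consequence of storing the arguments in the IEnumerable<T> wrapper is that calling the iterator method runs none of its body. The body starts only when an enumerator is advanced, and every new enumeration starts over with the saved arguments. The sketch below illustrates this; DeferredDemo, UpTo and the BodyRuns counter are hypothetical names.

```csharp
using System.Collections.Generic;

public static class DeferredDemo
{
    // Counts how many times the iterator body has actually started running.
    public static int BodyRuns;

    public static IEnumerable<int> UpTo(int limit)
    {
        BodyRuns++;                       // not executed by the call to UpTo itself
        for (int i = 1; i <= limit; i++)
            yield return i;
    }
}
```

After IEnumerable<int> seq = DeferredDemo.UpTo(2); the counter is still 0. Enumerating seq twice with foreach raises it to 2, and both passes see the same limit because it was copied into the wrapper when UpTo was called.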
The nested wrapper class ScanInOrder$00000000__IEnumeratorImpl<T>, which implements IEnumerator<T>, works on much the same principle as the one in the previous example. The differences are that the program logic is far more complex and that the IDisposable interface is now needed to release resources.

Quoted code:

public class BinaryTree<T>
{
    private sealed class ScanInOrder$00000000__IEnumeratorImpl<T>
        : IEnumerator<T>, IEnumerator, IDisposable
    {
        private int $pc = 0;
        private T $_current;
        private BinaryTree<T> <this>;

        public IEnumerator<T> __wrap$00000003;
        public IEnumerator<T> __wrap$00000004;
        public T item$00000001;
        public T item$00000002;
        public Node<T> root;

        // implements the IEnumerator<T> interface
        T IEnumerator<T>.get_Current()
        {
            return $_current;
        }

        bool IEnumerator<T>.MoveNext()
        {
            switch ($pc)
            {
                case 0:
                {
                    $pc = -1;
                    if (root.LeftNode != null)
                    {
                        __wrap$00000003 = <this>.ScanInOrder(root.LeftNode).GetEnumerator();

                        goto scanLeft;
                    }
                    else
                    {
                        goto getItem;
                    }
                }
                case 1:
                {
                    return false;
                }
                case 2:
                {
                    goto scanLeft;
                }
                case 3:
                {
                    $pc = -1;
                    if (root.RightNode != null)
                    {
                        __wrap$00000004 = <this>.ScanInOrder(root.RightNode).GetEnumerator();

                        goto scanRight;
                    }
                    else
                    {
                        return false;
                    }
                }
                case 4:
                {
                    return false;
                }
                case 5:
                {
                    goto scanRight;
                }
                default:
                {
                    return false;
                }
            }

        scanLeft:
            $pc = 1;

            if (__wrap$00000003.MoveNext())
            {
                $_current = item$00000001 = __wrap$00000003.Current;
                $pc = 2;
                return true;
            }

        getItem:
            $pc = -1;
            if (__wrap$00000003 != null)
            {
                ((IDisposable)__wrap$00000003).Dispose();
            }
            $_current = root.Item;
            $pc = 3;
            return true;

        scanRight:
            $pc = 4;

            if (__wrap$00000004.MoveNext())
            {
                $_current = item$00000002 = __wrap$00000004.Current;
                $pc = 5;
                return true;
            }
            else
            {
                $pc = -1;
                if (__wrap$00000004 != null)
                {
                    ((IDisposable)__wrap$00000004).Dispose();
                }
                return false;
            }
        }

        // implements the IDisposable interface
        void IDisposable.Dispose()
        {
            switch ($pc)
            {
                case 1:
                case 2:
                {
                    $pc = -1;
                    if (__wrap$00000003 != null)
                    {
                        ((IDisposable)__wrap$00000003).Dispose();
                    }
                    break;
                }
                case 4:
                case 5:
                {
                    $pc = -1;
                    if (__wrap$00000004 != null)
                    {
                        ((IDisposable)__wrap$00000004).Dispose();
                    }
                    break;
                }
            }
        }
    }
}

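From this pseudocode we can see that C# 2.0 actually implements the recursive iterator block as a finite state machine whose state variable is $pc, perhaps because finite state machines are easy to generate automatically. The Dispose() function, in turn, cleans up the state machine's intermediate variables, here the nested enumerators that are still open.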
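
The role of Dispose() can be observed without reading generated code. In the sketch below, whose names DisposeDemo, Numbers and FirstOnly are hypothetical, an iterator is abandoned after its first value. The foreach statement then disposes the enumerator, and disposal runs the finally block of the suspended iterator.

```csharp
using System.Collections.Generic;

public static class DisposeDemo
{
    public static bool CleanedUp;

    public static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
            yield return 3;
        }
        finally
        {
            // Runs on normal completion, or when the enumerator is disposed early.
            CleanedUp = true;
        }
    }

    public static int FirstOnly()
    {
        foreach (int n in Numbers())
        {
            return n;   // leaving the loop early disposes the enumerator
        }
        return -1;
    }
}
```

FirstOnly() returns 1, and CleanedUp is true afterwards even though the iterator never reached its end. The same mechanism is what lets a suspended outer iterator release the nested enumerators it still holds.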

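Readers who want to learn more about iterators can read the iterator-related posts on Grant Ri's blog.
Once you understand how iterators are implemented, those discussions will no longer mislead you with their surface appearance :D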
