<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>I am LAZY bones ? &#187; 精华</title>
	<atom:link href="http://luy.li/category/best/feed/" rel="self" type="application/rss+xml" />
	<link>http://luy.li</link>
	<description>all linux</description>
	<lastBuildDate>Sun, 05 Sep 2010 14:50:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>A simple HTTP server with upload support</title>
		<link>http://luy.li/2010/05/15/simplehttpserverwithupload/</link>
		<comments>http://luy.li/2010/05/15/simplehttpserverwithupload/#comments</comments>
		<pubDate>Sat, 15 May 2010 08:34:40 +0000</pubDate>
		<dc:creator>bones7456</dc:creator>
				<category><![CDATA[精华]]></category>

		<guid isPermaLink="false">http://li2z.cn/?p=1541</guid>
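		<description><![CDATA[By now many people know that Python ships with SimpleHTTPServer, which makes sharing files easy. For example, to send a file to a classmate on the same LAN, you just cd into its directory and run this one line: python -m SimpleHTTPServer Others can then open http://your-IP:8000 to reach the files you are sharing. I aliased this command long ago. But one day you need to copy a file from a classmate to your own machine, so you say to him: XX, share that directory for me. Just as you expect to reach port 8000 on his machine over HTTP, he tells you: sorry, I'm on Windows~~ Of course you could install Python on his Windows machine, or fall back to samba, ftp, or some other method, but is there a way as simple as before? There is: what you need is a simple HTTP server that supports uploads, namely this one of mine: SimpleHTTPServerWithUpload.py, haha. You start the service and let the other person upload to it. It is really just a modified SimpleHTTPServer to which I added the most primitive upload feature. I have not verified its security, but in theory nobody should be leaving this running all the time, right? Also, my understanding of RFC1867 may not be thorough, so: Use at your own risk! Screenshot: The code is here: a single file, zero configuration, run it directly with python.]]></description>
			<content:encoded><![CDATA[<p>By now many people know that Python ships with SimpleHTTPServer, which makes sharing files easy. For example, to send a file to a classmate on the same LAN, you just cd into its directory and run this one line:</p>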

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">python -m <span style="color: #dc143c;">SimpleHTTPServer</span></pre></div></div>

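<p>Others can then open http://your-IP:8000 to reach the files you are sharing.<br />
I aliased this command long ago.<br />
But one day you need to copy a file from a classmate to your own machine, so you say to him: XX, share that directory for me. Just as you expect to reach port 8000 on his machine over HTTP, he tells you: sorry, I'm on Windows~~<br />
Of course you could install Python on his Windows machine, or fall back to samba, ftp, or some other method, but is there a way as simple as before?<br />
There is: what you need is a simple HTTP server that supports uploads, namely this one of mine: SimpleHTTPServerWithUpload.py, haha. You start the service and let the other person upload to it.<br />
It is really just a modified SimpleHTTPServer to which I added the most primitive upload feature. I have not verified its security, but in theory nobody should be leaving this running all the time, right? Also, my understanding of RFC1867 may not be thorough, so: Use at your own risk!<br />
Screenshot:<br />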
<img src="http://luy.li/wp-content/uploads/2010/05/SimpleHTTPServerWithUpload.png" alt="" title="SimpleHTTPServerWithUpload" width="528" height="374" class="alignnone size-full wp-image-1542" /><br />
The code is <a href="http://bones7456.googlecode.com/svn/trunk/SimpleHTTPServerWithUpload.py">here</a>: a single file, zero configuration, run it directly with python.</p>
]]></content:encoded>
			<wfw:commentRss>http://luy.li/2010/05/15/simplehttpserverwithupload/feed/</wfw:commentRss>
		<slash:comments>28</slash:comments>
		</item>
		<item>
		<title>Python regular expressions: the re module</title>
		<link>http://luy.li/2010/05/12/python-re/</link>
		<comments>http://luy.li/2010/05/12/python-re/#comments</comments>
		<pubDate>Wed, 12 May 2010 06:54:59 +0000</pubDate>
		<dc:creator>bones7456</dc:creator>
				<category><![CDATA[精华]]></category>
		<category><![CDATA[编程相关]]></category>

		<guid isPermaLink="false">http://li2z.cn/?p=1491</guid>
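		<description><![CDATA[Further reading: Python's built-in functions and subprocess. This is the third article in the series. As before, the content comes from the official documentation, but it reflects my own understanding rather than being a plain translation. If I have misunderstood something, corrections are welcome. Thanks. This module provides regular expression operations similar to those found in Perl. Both the regular expression itself and the string being searched can be Unicode strings; there is no need to worry about this, as Python handles them just as gracefully as ASCII strings. Regular expressions use the backslash (\) to escape special characters so that they match the character itself instead of carrying a special meaning. This can collide with the escaping of Python string literals, which may be a little confusing. For example, to match a single literal backslash you would have to write '\\\\' as the pattern string, because the regular expression must be \\, and inside a string literal each backslash must itself be written as \\. You can avoid some of this confusion by prefixing the string with r. A Python string literal that starts with r is a raw string, in which backslashes are not treated as escape characters: r'\n' is a string consisting of a backslash followed by the letter n, while '\n', as we know, is a newline. So the '\\\\' above can also be written as r'\\', which should be much easier to read. Take a look at this session: &#62;&#62;&#62; import re &#62;&#62;&#62; s = '\x5c' # 0x5c is the backslash &#62;&#62;&#62; print s \ &#62;&#62;&#62; re.match&#40;'\\\\', s&#41; # this matches &#60;_sre.SRE_Match object at 0xb6949e20&#62; &#62;&#62;&#62; re.match&#40;r'\\', s&#41; # this works too &#60;_sre.SRE_Match object at 0x80ce2c0&#62; &#62;&#62;&#62; re.match&#40;'\\', s&#41; # but this fails Traceback &#40;most recent call last&#41;: File &#34;&#60;stdin&#62;&#34;, line 1, [...]]]></description>
			<content:encoded><![CDATA[<p>Further reading: Python's <a href="http://luy.li/2009/11/29/python-built-in-functions/">built-in functions</a> and <a href="http://luy.li/2010/04/14/python_subprocess/">subprocess</a>. This is the third article in the series. As before, the content comes from the <a href="http://docs.python.org/library/re.html">official documentation</a>, but it reflects my own understanding rather than being a plain translation. If I have misunderstood something, corrections are welcome. Thanks.</p>
<p>This module provides regular expression operations similar to those found in Perl. Both the regular expression itself and the string being searched can be Unicode strings; there is no need to worry about this, as Python handles them just as gracefully as ASCII strings.<br />
Regular expressions use the backslash (<code>\</code>) to escape special characters so that they match the character itself instead of carrying a special meaning. This can collide with the escaping of Python string literals, which may be a little confusing. For example, to match a single literal backslash you would have to write <code>'\\\\'</code> as the pattern string, because the regular expression must be <code>\\</code>, and inside a string literal each backslash must itself be written as <code>\\</code>.<br />
You can avoid some of this confusion by prefixing the string with r. A Python string literal that starts with r is a raw string, in which backslashes are not treated as escape characters: <code>r'\n'</code> is a string consisting of a backslash followed by the letter n, while <code>'\n'</code>, as we know, is a newline. So the <code>'\\\\'</code> above can also be written as <code>r'\\'</code>, which should be much easier to read. Take a look at this session:</p>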

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">re</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> s = <span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\x</span>5c'</span>  <span style="color: #808080; font-style: italic;"># 0x5c is the backslash</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> s
\
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\\</span><span style="color: #000099; font-weight: bold;">\\</span>'</span>, s<span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;"># this matches</span>
<span style="color: #66cc66;">&lt;</span>_sre.<span style="color: black;">SRE_Match</span> <span style="color: #008000;">object</span> at 0xb6949e20<span style="color: #66cc66;">&gt;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\\</span>'</span>, s<span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;"># this works too</span>
<span style="color: #66cc66;">&lt;</span>_sre.<span style="color: black;">SRE_Match</span> <span style="color: #008000;">object</span> at 0x80ce2c0<span style="color: #66cc66;">&gt;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\\</span>'</span>, s<span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;"># but this fails</span>
Traceback <span style="color: black;">&#40;</span>most recent call last<span style="color: black;">&#41;</span>:
  File <span style="color: #483d8b;">&quot;&lt;stdin&gt;&quot;</span>, line <span style="color: #ff4500;">1</span>, <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #66cc66;">&lt;</span>module<span style="color: #66cc66;">&gt;</span>
  File <span style="color: #483d8b;">&quot;/usr/lib/python2.6/re.py&quot;</span>, line <span style="color: #ff4500;">137</span>, <span style="color: #ff7700;font-weight:bold;">in</span> match
    <span style="color: #ff7700;font-weight:bold;">return</span> _compile<span style="color: black;">&#40;</span>pattern, flags<span style="color: black;">&#41;</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">string</span><span style="color: black;">&#41;</span>
  File <span style="color: #483d8b;">&quot;/usr/lib/python2.6/re.py&quot;</span>, line <span style="color: #ff4500;">245</span>, <span style="color: #ff7700;font-weight:bold;">in</span> _compile
    <span style="color: #ff7700;font-weight:bold;">raise</span> error, v <span style="color: #808080; font-style: italic;"># invalid expression</span>
sre_constants.<span style="color: black;">error</span>: bogus escape <span style="color: black;">&#40;</span>end of line<span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span></pre></div></div>

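<p>It is also worth mentioning that most functions of the re module are also available as methods of RegexObject; the difference between the two lies in execution efficiency. I will come back to this later.</p>
<p><big><strong>Regular expression syntax</strong></big></p>
<p>A regular expression (RE) specifies a set of strings that match it; the functions in this module let you check whether a given string matches a given regular expression.<br />
Regular expressions can be concatenated to form new regular expressions; if A and B are both regular expressions, then AB is also a regular expression. In general, if a string <em>p</em> matches A and a string <em>q</em> matches B, then the string <em>pq</em> matches AB, unless A or B contains boundary conditions or group references. In other words, complex regular expressions can be built by concatenating simple ones.<br />
A regular expression can contain both special and ordinary characters. Most characters, such as <code>'A'</code>, <code>'a'</code>, and <code>'0'</code>, are ordinary characters; used as a regular expression, they simply match themselves. Because regular expressions can be concatenated, the regular expression <code>last</code>, built from several ordinary characters, matches <code>'last'</code>. (From here on, regular expressions are written without quotes and strings with quotes.)</p>
<p>The special characters of regular expressions are:</p>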
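<p><code>'.'</code><br />
Dot. In the default mode it matches any single character except a newline; if the <strong>DOTALL</strong> flag is specified, it matches any single character, including a newline.</p>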
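<p>A minimal sketch of the difference, written for Python 3 (unlike the Python 2 sessions in this article); the sample string and variable names are made up for illustration:</p>

```python
import re

text = "ab\ncd"  # two lines separated by a newline

# Default mode: '.' stops at the newline, so each line is a separate match.
default_matches = re.findall(r".+", text)

# DOTALL mode: '.' also matches the newline, so the whole string is one match.
dotall_matches = re.findall(r".+", text, re.DOTALL)

print(default_matches)  # ['ab', 'cd']
print(dotall_matches)   # ['ab\ncd']
```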
<p><code>'^'</code><br />
Caret. Matches the start of the string; in <strong>MULTILINE</strong> mode it also matches immediately after each newline.</p>
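<p>As a small illustration of the caret in both modes (a Python 3 sketch; the three-line sample text is an assumption, not taken from the documentation):</p>

```python
import re

text = "one\ntwo\nthree"

# Without MULTILINE, '^' anchors only at the very start of the string.
first_only = re.findall(r"^\w+", text)

# With MULTILINE, '^' also anchors right after every newline.
every_line = re.findall(r"^\w+", text, re.MULTILINE)

print(first_only)  # ['one']
print(every_line)  # ['one', 'two', 'three']
```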
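<p><code>'$'</code><br />
Dollar sign. Matches the end of the string or just before the newline at the end of the string; in <strong>MULTILINE</strong> mode it also matches at the end of every line. That is, in the default mode, searching <code>'foo1\nfoo2\n'</code> with <code>foo.$</code> finds only <code>'foo2'</code>, but in <strong>MULTILINE</strong> mode it also finds <code>'foo1'</code>. Searching <code>'foo\n'</code> with a single <code>$</code> finds two empty matches: one just before the trailing newline and one at the very end of the string. A demonstration:</p>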

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(foo.$)'</span>, <span style="color: #483d8b;">'foo1<span style="color: #000099; font-weight: bold;">\n</span>foo2<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'foo2'</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(foo.$)'</span>, <span style="color: #483d8b;">'foo1<span style="color: #000099; font-weight: bold;">\n</span>foo2<span style="color: #000099; font-weight: bold;">\n</span>'</span>, <span style="color: #dc143c;">re</span>.<span style="color: black;">MULTILINE</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'foo1'</span>, <span style="color: #483d8b;">'foo2'</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'($)'</span>, <span style="color: #483d8b;">'foo<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">''</span>, <span style="color: #483d8b;">''</span><span style="color: black;">&#93;</span></pre></div></div>

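<p><code>'*'</code><br />
Asterisk. Repeats the preceding RE zero or more times, always matching as many repetitions as possible.</p>
<p><code>'+'</code><br />
Plus sign. Repeats the preceding RE one or more times, always matching as many repetitions as possible.</p>
<p><code>'?'</code><br />
Question mark. Repeats the preceding RE zero times or once, preferring to match once when it can.</p>
<p><code>*?</code>, <code>+?</code>, <code>??</code><br />
As described above, <code>'*'</code>, <code>'+'</code>, and <code>'?'</code> are all <em>greedy</em>, which may not be what we want. Appending a question mark switches them to <em>non-greedy</em> matching, so that they match as few repetitions of the RE as possible. The example shows the difference:</p>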

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'&lt;(.*)&gt;'</span>, <span style="color: #483d8b;">'&lt;H1&gt;title&lt;/H1&gt;'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'H1&gt;title&lt;/H1'</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'&lt;(.*?)&gt;'</span>, <span style="color: #483d8b;">'&lt;H1&gt;title&lt;/H1&gt;'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'H1'</span>, <span style="color: #483d8b;">'/H1'</span><span style="color: black;">&#93;</span></pre></div></div>

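<p><code>{m}</code><br />
m is a number; repeats the preceding RE exactly m times.</p>
<p><code>{m,n}</code><br />
m and n are both numbers; repeats the preceding RE between m and n times. For example, <code>a{3,5}</code> matches 3 to 5 consecutive a's. If m is omitted, it matches 0 to n repetitions of the preceding RE; if n is omitted, it matches m or more repetitions, with no upper bound. The comma cannot be omitted, or the expression becomes the previous form.</p>
<p><code>{m,n}?</code><br />
The <code>{m,n}</code> form described above is also greedy: given 5 or more consecutive a's, <code>a{3,5}</code> matches 5 of them. Appending a question mark changes this too: <code>a{3,5}?</code> matches only 3 a's whenever possible.</p>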
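<p>The counted forms can be compared side by side. This is only a sketch in Python 3; the seven-character sample string and the variable names are assumptions chosen for illustration:</p>

```python
import re

text = "aaaaaaa"  # seven consecutive a's

greedy = re.match(r"a{3,5}", text).group()    # takes as many as allowed
lazy = re.match(r"a{3,5}?", text).group()     # takes as few as allowed
at_least = re.match(r"a{3,}", text).group()   # n omitted: no upper bound
at_most = re.match(r"a{,2}", text).group()    # m omitted: lower bound is 0

print(greedy)    # aaaaa
print(lazy)      # aaa
print(at_least)  # aaaaaaa
print(at_most)   # aa
```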
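<p><code>'\'</code><br />
Backslash. Escapes special characters such as <code>'*'</code> and <code>'?'</code>, or introduces a special sequence (described in detail below).<br />
For the reasons given earlier, using raw strings to write regular expressions is strongly recommended.</p>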
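<p><code>[]</code><br />
Square brackets. Specify a set of characters. Characters can be listed individually, or a range can be given by joining its first and last characters with <code>'-'</code>. Special characters lose their special meaning inside brackets: <code>[akm$]</code> matches any of the characters <code>'a'</code>, <code>'k'</code>, <code>'m'</code>, or <code>'$'</code>, so here $ becomes an ordinary character. <code>[a-z]</code> matches any lowercase letter, and <code>[a-zA-Z0-9]</code> matches any letter or digit. To match a literal <code>']'</code> or <code>'-'</code>, escape it with a backslash or place it first in the set; for example, <code>[]]</code> matches <code>']'</code>.<br />
You can also <em>complement</em> a character set to match any character that is not in it. The <em>complement</em> is written by placing a <code>'^'</code> as the first character of the set; a <code>'^'</code> anywhere else has no special meaning. For example, <code>[^5]</code> matches any character other than <code>'5'</code>, and <code>[^^]</code> matches any character other than <code>'^'</code>.<br />
Note: inside brackets, characters such as <code>+</code>, <code>*</code>, <code>(</code>, and <code>)</code> lose their special meaning and are treated as ordinary characters. Backreferences cannot be used inside brackets either.</p>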
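<p>A few of these character-set rules, sketched in Python 3. The input strings are invented for the example; only the patterns come from the text above:</p>

```python
import re

# '$' is an ordinary character inside a set.
listed = re.findall(r"[akm$]", "make $5")

# A leading '^' complements the set; elsewhere it is a plain caret.
not_five = re.findall(r"[^5]", "5a5")
not_caret = re.findall(r"[^^]", "^a^")

# ']' placed first in the set is taken literally.
bracket = re.findall(r"[]]", "a]b")

# '+', '*', '(' and ')' are ordinary characters inside a set.
operators = re.findall(r"[+*()]", "(1+2)*3")

print(listed)     # ['m', 'a', 'k', '$']
print(not_five)   # ['a']
print(not_caret)  # ['a']
print(bracket)    # [']']
print(operators)  # ['(', '+', ')', '*']
```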
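<p><code>'|'</code><br />
Pipe. If A and B are arbitrary REs, <code>A|B</code> is a new RE that matches either A or B. Any number of REs can be joined with pipes in this way. This form can also be used inside <strong>groups</strong> (described in detail below). The REs separated by <code>'|'</code> are tried against the target string from left to right; as soon as one succeeds, the remaining ones are not tried, even if a later RE would match a longer string. In other words, the <code>'|'</code> operator is never greedy. To match a literal <code>'|'</code>, escape it with a backslash, <code>\|</code>, or enclose it in square brackets, <code>[|]</code>.</p>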
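<p>The left-to-right behaviour of alternation is easy to check. A short Python 3 sketch, with alternatives and inputs that are made up for illustration:</p>

```python
import re

# The first alternative that succeeds wins, even if a later one is longer.
short_first = re.match(r"ab|abcd", "abcd").group()
long_first = re.match(r"abcd|ab", "abcd").group()

# Two ways to match a literal pipe character.
escaped = re.findall(r"\|", "a|b")
in_set = re.findall(r"[|]", "a|b")

print(short_first)  # ab
print(long_first)   # abcd
print(escaped)      # ['|']
print(in_set)       # ['|']
```

<p>If the longest alternative should win, one option is simply to list the alternatives from longest to shortest.</p>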
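<p><code>(...)</code><br />
Matches whatever the RE inside the parentheses matches, and marks the start and end of a <strong>group</strong>. The contents of a group can be retrieved after the match, and can also be matched again later in the pattern with the <code>\number</code> special sequence. To match a literal <code>'('</code> or <code>')'</code>, escape it with a backslash, <code>\(</code> or <code>\)</code>, or enclose it in square brackets, <code>[(]</code> or <code>[)]</code>.</p>
<p><code>(?...)</code><br />
This is an extension notation. The first character after the <code>'?'</code> determines the syntax and meaning of the whole construct. Except for <code>(?P&lt;name&gt;...)</code>, extensions do not create a new group. The currently supported extensions are:</p>
<p><code>(?iLmsux)</code><br />
One or more letters from <code>'i'</code>, <code>'L'</code>, <code>'m'</code>, <code>'s'</code>, <code>'u'</code>, <code>'x'</code>. The construct matches no characters, but sets the corresponding flags: <strong>re.I</strong> (ignore case), <strong>re.L</strong> (locale dependent), <strong>re.M</strong> (multi-line), <strong>re.S</strong> (dot matches all), <strong>re.U</strong> (Unicode dependent), and <strong>re.X</strong> (verbose). Each mode is described in its own entry below. This syntax can be used instead of passing the <em>flag</em> argument to <code>re.compile()</code> or to the matching function.<br />
The earlier example can be rewritten like this, with the same effect as specifying <code>re.MULTILINE</code>:</p>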

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(?m)(foo.$)'</span>, <span style="color: #483d8b;">'foo1<span style="color: #000099; font-weight: bold;">\n</span>foo2<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'foo1'</span>, <span style="color: #483d8b;">'foo2'</span><span style="color: black;">&#93;</span></pre></div></div>

<p>Also note that the <code>(?x)</code> flag, if used, should come first in the expression.</p>
<p><code>(?:...)</code><br />
Matches whatever the RE inside matches, but does not create a <strong>group</strong>.</p>
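<p>One place where the difference shows up is <code>re.findall</code>, which returns the group contents when the pattern has a capturing group. A Python 3 sketch under that assumption, with an invented sample string:</p>

```python
import re

text = "ababc"

# Capturing group: findall returns what the group captured last.
capturing = re.findall(r"(ab)+c", text)

# Non-capturing group: no group exists, so findall returns the whole match.
non_capturing = re.findall(r"(?:ab)+c", text)

# Only the real capturing group is numbered.
groups = re.match(r"(?:ab)+(c)", text).groups()

print(capturing)      # ['ab']
print(non_capturing)  # ['ababc']
print(groups)         # ('c',)
```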
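<p><code>(?P&lt;name&gt;...)</code><br />
Similar to regular parentheses, but the substring matched by the group can also be retrieved through the name <em>name</em>. A group <em>name</em> must be a valid Python identifier and must be unique within the expression. A named group is still numbered like an ordinary group and can be retrieved by number as well; the name is just an extra attribute.<br />
A demonstration:</p>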

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m=<span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(?P&lt;var&gt;[a-zA-Z_]<span style="color: #000099; font-weight: bold;">\w</span>*)'</span>, <span style="color: #483d8b;">'abc=123'</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'var'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'abc'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'abc'</span></pre></div></div>

<p><code>(?P=name)</code><br />
Matches whatever text was matched by the earlier group named <em>name</em>.<br />
A demonstration:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'&lt;(?P&lt;tagname&gt;<span style="color: #000099; font-weight: bold;">\w</span>*)&gt;.*&lt;/(?P=tagname)&gt;'</span>, <span style="color: #483d8b;">'&lt;h1&gt;xxx&lt;/h2&gt;'</span><span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;">#这个不匹配</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'&lt;(?P&lt;tagname&gt;<span style="color: #000099; font-weight: bold;">\w</span>*)&gt;.*&lt;/(?P=tagname)&gt;'</span>, <span style="color: #483d8b;">'&lt;h1&gt;xxx&lt;/h1&gt;'</span><span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;">#这个匹配</span>
<span style="color: #66cc66;">&lt;</span>_sre.<span style="color: black;">SRE_Match</span> <span style="color: #008000;">object</span> at 0xb69588e0<span style="color: #66cc66;">&gt;</span></pre></div></div>

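<p><code>(?#...)</code><br />
A comment; the contents of the parentheses are ignored.</p>
<p><code>(?=...)</code><br />
Matches only if <code>...</code> matches the text that follows, but does not consume any of it. For example, <code>Isaac (?=Asimov)</code> matches <code>'Isaac '</code> only when it is followed by <code>'Asimov'</code>. This is called a "lookahead assertion".</p>
<p><code>(?!...)</code><br />
The opposite of the above: matches only if the text that follows does <strong>not</strong> match <code>...</code>. This is called a "negative lookahead assertion".</p>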
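<p>Both lookahead forms can be tried on one string. A Python 3 sketch; the sample sentence and the trailing <code>\w+</code> in the second pattern are additions made for illustration:</p>

```python
import re

text = "Isaac Asimov, Isaac Newton"

# Positive lookahead: only the 'Isaac ' followed by 'Asimov' matches,
# and 'Asimov' itself is not consumed.
positive = re.search(r"Isaac (?=Asimov)", text)

# Negative lookahead: skips the first 'Isaac ' because 'Asimov' follows it.
negative = re.search(r"Isaac (?!Asimov)\w+", text)

print(positive.group())  # 'Isaac ' (with the trailing space)
print(positive.end())    # 6
print(negative.group())  # Isaac Newton
```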
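<p><code>(?&lt;=...)</code><br />
Matches only if the text immediately before the current position matches <code>...</code>. This is called a "positive lookbehind assertion". <code>(?&lt;=abc)def</code> finds a match in <code>'abcdef'</code>, because the lookbehind backs up 3 characters and checks whether they are abc. The contained sub-RE must therefore match strings of a fixed length: <code>abc</code> and <code>a|b</code> are allowed, but <code>a*</code> and <code>a{3,4}</code> are not. Note that a pattern starting with this kind of assertion can never match at the beginning of the string, so it is normally used with search() rather than match(). For example, to find the word that follows a hyphen (<code>'-'</code>):</p>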

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">search</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(?&lt;=-)<span style="color: #000099; font-weight: bold;">\w</span>+'</span>, <span style="color: #483d8b;">'spam-egg'</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'egg'</span></pre></div></div>

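<p><code>(?&lt;!...)</code><br />
Likewise, this is called a "negative lookbehind assertion". The sub-RE must have a fixed length, and the whole pattern matches only if the text before the current position does not match <code>...</code>.</p>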
<p><code>(?(id/name)yes-pattern|no-pattern)</code><br />
If the group with the given <em>id</em> or <em>name</em> took part in the match, tries to match <code>yes-pattern</code>; otherwise tries to match <code>no-pattern</code>. <code>no-pattern</code> is optional and can be omitted. For example, with <code>re.match</code>, <code>(&lt;)?(\w+@\w+(?:\.\w+)+)(?(1)&gt;)</code> matches <code>'&lt;user@host.com&gt;'</code> and <code>'user@host.com'</code>, but not <code>'&lt;user@host.com'</code>.</p>
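<p>The same idea can be sketched with a hypothetical variant of that pattern that uses parentheses instead of angle brackets. The pattern, the trailing <code>$</code>, and the helper name <code>is_balanced</code> are all assumptions made for this Python 3 example:</p>

```python
import re

# Group 1 optionally matches an opening parenthesis. The conditional
# (?(1)\)) then demands a closing one only if group 1 took part in the match.
# The final '$' is added here so that leftover text makes the match fail.
pattern = re.compile(r"(\()?(\w+)(?(1)\))$")

def is_balanced(text):
    """Return True if text is a word, optionally wrapped in one pair of parentheses."""
    return pattern.match(text) is not None

print(is_balanced("(abc)"))  # True
print(is_balanced("abc"))    # True
print(is_balanced("(abc"))   # False
print(is_balanced("abc)"))   # False
```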
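<p>The special sequences that begin with <code>'\'</code> are listed below. If the character after the backslash is not listed below, the resulting RE matches just that character; for example, <code>\$</code> matches only a literal <code>'$'</code>.</p>
<p><code>\number</code><br />
Matches the same text as the group with that number. Groups are numbered starting from 1. For example, <code>(.+) \1</code> matches <code>'the the'</code> and <code>'55 55'</code>, but not <code>'the end'</code>. This sequence can refer only to the first 99 groups. If <em>number</em> starts with 0, or is 3 octal digits long, it is interpreted as the character with that octal value rather than as a group reference. This sequence also cannot be used inside square brackets.</p>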
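<p>The three strings mentioned above can be checked directly. A Python 3 sketch; the <code>re.sub</code> call at the end is an extra illustration that is not part of the original text:</p>

```python
import re

pattern = re.compile(r"(.+) \1")

# \1 must repeat exactly what group 1 captured.
results = {s: pattern.match(s) is not None
           for s in ("the the", "55 55", "the end")}

# Backreferences also work in replacement strings: collapse a doubled word.
collapsed = re.sub(r"(\w+) \1", r"\1", "the the cat")

print(results)    # {'the the': True, '55 55': True, 'the end': False}
print(collapsed)  # the cat
```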
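<p><code>\A</code><br />
Matches only at the start of the string.</p>
<p><code>\b</code><br />
Matches a word boundary (at the start or the end of a word). A "word" here is a sequence of letters, digits, and underscores. Note that <code>\b</code> is defined as the boundary between <code>\w</code> and <code>\W</code>, so its precise meaning depends on the <code>UNICODE</code> and <code>LOCALE</code> flags.</p>
<p><code>\B</code><br />
The opposite of <code>\b</code>: <code>\B</code> matches where there is no word boundary. It also depends on the <code>UNICODE</code> and <code>LOCALE</code> flags.</p>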
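<p>A short Python 3 sketch of both sequences, using invented sample strings. Raw strings matter here, because in an ordinary string literal <code>'\b'</code> is the backspace character:</p>

```python
import re

# \b on both sides: 'foo' must stand alone as a word.
whole_words = re.findall(r"\bfoo\b", "foo foobar barfoo (foo)")

# \B on both sides: 'foo' must be embedded inside a longer word.
embedded = re.findall(r"\Bfoo\B", "afoob foo")

print(whole_words)  # ['foo', 'foo']
print(embedded)     # ['foo']
```

<p>In the first call the matches are the standalone <code>foo</code> and the one inside the parentheses; <code>foobar</code> and <code>barfoo</code> are skipped.</p>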
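<p><code>\d</code><br />
Without the <code>UNICODE</code> flag, matches any decimal digit; equivalent to <code>[0-9]</code>. With the <code>UNICODE</code> flag, it also matches the other characters that the Unicode database classifies as digits. An example to make this concrete (it took some effort to find one, heh):</p>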

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">#\u2076\和u2084分别是上标的6和下标的4，属于unicode的DIGIT</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> unistr = u<span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\u</span>2076<span style="color: #000099; font-weight: bold;">\u</span>2084abc'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> unistr
⁶₄abc
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\d</span>+'</span>, unistr, <span style="color: #dc143c;">re</span>.<span style="color: black;">U</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>
⁶₄</pre></div></div>

<p><code>\D</code><br />
The opposite of <code>\d</code>: matches any character that is not a digit.</p>
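<p><code>\s</code><br />
With neither the <code>UNICODE</code> nor the <code>LOCALE</code> flag, matches any whitespace character; equivalent to <code>[ \t\n\r\f\v]</code>. With <code>LOCALE</code>, it additionally matches whatever the current locale defines as whitespace; with <code>UNICODE</code>, it additionally matches the characters that the Unicode database classifies as whitespace, such as the no-break space (\u00A0) and the ideographic space (\u3000).</p>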
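<p><code>\S</code><br />
The opposite of <code>\s</code>: matches any non-whitespace character.</p>
<p><code>\w</code><br />
With neither the <code>UNICODE</code> nor the <code>LOCALE</code> flag, equivalent to <code>[a-zA-Z0-9_]</code>. With <code>LOCALE</code>, matches <code>[0-9_]</code> plus whatever the current locale defines as alphanumeric. With <code>UNICODE</code>, matches <code>[0-9_]</code> plus whatever the Unicode database classifies as alphanumeric.</p>
<p><code>\W</code><br />
The opposite of <code>\w</code>: matches any non-word character.</p>
<p><code>\Z</code><br />
Matches only at the end of the string.</p>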
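<p><big><strong>Matching vs. searching</strong></big></p>
<p>Python offers two primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (which is what Perl does by default).<br />
Note that match and search can still differ even when the regular expression passed to search begins with <code>'^'</code>: in <strong>MULTILINE</strong> mode, <code>'^'</code> also matches right after each newline, whereas match succeeds only at the start of the string regardless of mode.</p>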

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;c&quot;</span>, <span style="color: #483d8b;">&quot;abcdef&quot;</span><span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;"># 不匹配</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">search</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;c&quot;</span>, <span style="color: #483d8b;">&quot;abcdef&quot;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># 匹配</span>
<span style="color: #66cc66;">&lt;</span>_sre.<span style="color: black;">SRE_Match</span> <span style="color: #008000;">object</span> at ...<span style="color: #66cc66;">&gt;</span></pre></div></div>
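<p>The point about <code>'^'</code> can be made concrete with a multi-line string. A Python 3 sketch; the two-line sample text and the variable names are assumptions for illustration:</p>

```python
import re

text = "first\nsecond"

# search with '^' in MULTILINE mode can match at the start of any line.
search_multiline = re.search(r"^second", text, re.MULTILINE)

# Without MULTILINE, '^' only anchors at the start of the string.
search_default = re.search(r"^second", text)

# match always starts at position 0, whatever the flags are.
match_multiline = re.match(r"^second", text, re.MULTILINE)

print(search_multiline.start())  # 6
print(search_default)            # None
print(match_multiline)           # None
```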

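<p><big><strong>Module attributes and functions</strong></big></p>
<p><code>re</code>.<strong>compile</strong>(<em>pattern</em>[, <em>flags</em>])<br />
Compiles a regular expression <em>pattern</em> into a regular expression object, whose <strong>match</strong> and <strong>search</strong> methods can then be used.<br />
The behaviour (that is, the mode) of the resulting object can be set with <em>flags</em>, whose value can be built by OR-ing together several of the flag values listed below.<br />
The following two snippets produce the same result:</p>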

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">prog = <span style="color: #dc143c;">re</span>.<span style="color: #008000;">compile</span><span style="color: black;">&#40;</span>pattern<span style="color: black;">&#41;</span>
result = prog.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">string</span><span style="color: black;">&#41;</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">result = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>pattern, <span style="color: #dc143c;">string</span><span style="color: black;">&#41;</span></pre></div></div>

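<p>The difference is that with <strong>re.compile</strong> the regular expression object is kept around, so efficiency improves considerably when the same expression is used many times. To demonstrate with the earlier example, matching the same string against the same regular expression one million times shows how much compile helps (numbers from my Hasee laptop with its 1.86 GHz CPU):</p>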

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">timeit</span>.<span style="color: #dc143c;">timeit</span><span style="color: black;">&#40;</span>
...     <span style="color: black;">setup</span>=<span style="color: #483d8b;">''</span><span style="color: #483d8b;">'import re; reg = re.compile('</span><span style="color: #66cc66;">&lt;</span><span style="color: black;">&#40;</span><span style="color: #66cc66;">?</span>P<span style="color: #66cc66;">&lt;</span>tagname<span style="color: #66cc66;">&gt;</span>\w<span style="color: #66cc66;">*</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">&gt;</span>.<span style="color: #66cc66;">*&lt;</span>/<span style="color: black;">&#40;</span><span style="color: #66cc66;">?</span>P=tagname<span style="color: black;">&#41;</span><span style="color: #66cc66;">&gt;</span><span style="color: #483d8b;">')'</span><span style="color: #483d8b;">''</span>,
...     <span style="color: black;">stmt</span>=<span style="color: #483d8b;">''</span><span style="color: #483d8b;">'reg.match('</span><span style="color: #66cc66;">&lt;</span>h1<span style="color: #66cc66;">&gt;</span>xxx<span style="color: #66cc66;">&lt;</span>/h1<span style="color: #66cc66;">&gt;</span><span style="color: #483d8b;">')'</span><span style="color: #483d8b;">''</span>,
...     <span style="color: black;">number</span>=<span style="color: #ff4500;">1000000</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">1.2062149047851562</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">timeit</span>.<span style="color: #dc143c;">timeit</span><span style="color: black;">&#40;</span>
...     <span style="color: black;">setup</span>=<span style="color: #483d8b;">''</span><span style="color: #483d8b;">'import re'</span><span style="color: #483d8b;">''</span>,
...     <span style="color: black;">stmt</span>=<span style="color: #483d8b;">''</span><span style="color: #483d8b;">'re.match('</span><span style="color: #66cc66;">&lt;</span><span style="color: black;">&#40;</span><span style="color: #66cc66;">?</span>P<span style="color: #66cc66;">&lt;</span>tagname<span style="color: #66cc66;">&gt;</span>\w<span style="color: #66cc66;">*</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">&gt;</span>.<span style="color: #66cc66;">*&lt;</span>/<span style="color: black;">&#40;</span><span style="color: #66cc66;">?</span>P=tagname<span style="color: black;">&#41;</span><span style="color: #66cc66;">&gt;</span><span style="color: #483d8b;">', '</span><span style="color: #66cc66;">&lt;</span>h1<span style="color: #66cc66;">&gt;</span>xxx<span style="color: #66cc66;">&lt;</span>/h1<span style="color: #66cc66;">&gt;</span><span style="color: #483d8b;">')'</span><span style="color: #483d8b;">''</span>,
...     <span style="color: black;">number</span>=<span style="color: #ff4500;">1000000</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">4.4380838871002197</span></pre></div></div>

<p><code>re</code>.<strong>I</strong><br />
<code>re</code>.<strong>IGNORECASE</strong><br />
Makes the regular expression case-insensitive, so that <code>[A-Z]</code> also matches lowercase letters. This is not affected by the current locale.</p>
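<p>A quick Python 3 sketch of the effect on <code>[A-Z]</code>; the sample string is made up for the example:</p>

```python
import re

text = "Hello World"

# Case-sensitive: only the two capital letters match.
case_sensitive = re.findall(r"[A-Z]+", text)

# IGNORECASE: the same set now matches lowercase letters too.
case_insensitive = re.findall(r"[A-Z]+", text, re.IGNORECASE)

print(case_sensitive)    # ['H', 'W']
print(case_insensitive)  # ['Hello', 'World']
```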
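<p><code>re</code>.<strong>L</strong><br />
<code>re</code>.<strong>LOCALE</strong><br />
Makes <code>\w</code>, <code>\W</code>, <code>\b</code>, <code>\B</code>, <code>\s</code>, and <code>\S</code> depend on the current locale.</p>
<p><code>re</code>.<strong>M</strong><br />
<code>re</code>.<strong>MULTILINE</strong><br />
Changes the behaviour of <code>'^'</code> and <code>'$'</code>. When specified, <code>'^'</code> additionally matches at the start of each line (immediately after each newline), and <code>'$'</code> additionally matches at the end of each line (immediately before each newline).</p>
<p><code>re</code>.<strong>S</strong><br />
<code>re</code>.<strong>DOTALL</strong><br />
Changes the behaviour of <code>'.'</code>. Normally <code>'.'</code> matches any character except a newline; with this flag it matches a newline as well.</p>
<p><code>re</code>.<strong>U</strong><br />
<code>re</code>.<strong>UNICODE</strong><br />
Makes <code>\w</code>, <code>\W</code>, <code>\b</code>, <code>\B</code>, <code>\d</code>, <code>\D</code>, <code>\s</code>, and <code>\S</code> depend on the Unicode character properties database.</p>
<p><code>re</code>.<strong>X</strong><br />
<code>re</code>.<strong>VERBOSE</strong><br />
This flag lets you write more readable regular expressions. All whitespace is ignored, except inside square brackets or when escaped with a backslash, and on each line everything after an unescaped '#' is ignored as well, which makes it easy to put comments inside a regular expression. In other words, the following two regular expressions are equivalent:</p>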

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">a = <span style="color: #dc143c;">re</span>.<span style="color: #008000;">compile</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;&quot;&quot;<span style="color: #000099; font-weight: bold;">\d</span> +  # the integral part
                   <span style="color: #000099; font-weight: bold;">\.</span>    # the decimal point
                   <span style="color: #000099; font-weight: bold;">\d</span> *  # some fractional digits&quot;&quot;&quot;</span>, <span style="color: #dc143c;">re</span>.<span style="color: black;">X</span><span style="color: black;">&#41;</span>
b = <span style="color: #dc143c;">re</span>.<span style="color: #008000;">compile</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\d</span>+<span style="color: #000099; font-weight: bold;">\.</span><span style="color: #000099; font-weight: bold;">\d</span>*&quot;</span><span style="color: black;">&#41;</span></pre></div></div>

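<p><code>re</code>.<strong>search</strong>(<em>pattern</em>, <em>string</em>[, <em>flags</em>])<br />
Scans through <em>string</em> looking for a position where the regular expression <em>pattern</em> matches. If one is found, returns a <strong>MatchObject</strong> instance; otherwise returns <strong>None</strong>. Note that this is different from finding a zero-length match. The search is affected by <em>flags</em>.</p>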
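<p>The distinction between <strong>None</strong> and a zero-length match, sketched in Python 3 with invented patterns and input:</p>

```python
import re

# 'x' occurs nowhere in the string, so search returns None.
no_match = re.search(r"x", "abc")

# 'x*' can match zero characters, so search succeeds with an empty match.
empty_match = re.search(r"x*", "abc")

print(no_match)             # None
print(empty_match.group())  # prints an empty line: the match is ''
print(empty_match.span())   # (0, 0)

# A match object is always truthy, even when the matched text is empty,
# so testing the result of search directly is a safe success check.
print(bool(empty_match))    # True
```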
<p><code>re</code>.<strong>match</strong>(<em>pattern</em>, <em>string</em>[, <em>flags</em>])<br />
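If the beginning of <em>string</em> matches the regular expression <em>pattern</em>, returns the corresponding <strong>MatchObject</strong> instance; otherwise returns <strong>None</strong>.</p>
<blockquote><p>Note: to look for a match anywhere in the string, use <strong>search()</strong> above instead.</p></blockquote>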
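<p>The difference between the two is easiest to see side by side. The snippet below is a small sketch in Python 3 syntax; the sample text and variable names are my own assumptions, not part of the original documentation:</p>

```python
import re

text = "say hello"

m1 = re.match("hello", text)   # None: "hello" is not at the start of text
m2 = re.search("hello", text)  # succeeds: "hello" is found further in

found_by_match = m1 is not None
found_by_search = m2 is not None

# A failed call returns None, so a plain "if" is enough to test for success.
if m2:
    start = m2.start()
    print("search() found a match at index", start)
```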
<p><code>re</code>.<strong>split</strong>(<em>pattern</em>, <em>string</em>[, <em>maxsplit=0</em>])<br />
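Splits <em>string</em> at the substrings matched by <em>pattern</em>. If <em>pattern</em> contains capturing parentheses, the text matched by <em>pattern</em> is also included in the returned list. If <em>maxsplit</em> is nonzero, at most <em>maxsplit</em> splits are made, and the remainder of the string is returned as the final element of the list.</p>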

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\W</span>+'</span>, <span style="color: #483d8b;">'Words, words, words.'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'Words'</span>, <span style="color: #483d8b;">'words'</span>, <span style="color: #483d8b;">'words'</span>, <span style="color: #483d8b;">''</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(<span style="color: #000099; font-weight: bold;">\W</span>+)'</span>, <span style="color: #483d8b;">'Words, words, words.'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'Words'</span>, <span style="color: #483d8b;">', '</span>, <span style="color: #483d8b;">'words'</span>, <span style="color: #483d8b;">', '</span>, <span style="color: #483d8b;">'words'</span>, <span style="color: #483d8b;">'.'</span>, <span style="color: #483d8b;">''</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\W</span>+'</span>, <span style="color: #483d8b;">'Words, words, words.'</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'Words'</span>, <span style="color: #483d8b;">'words, words.'</span><span style="color: black;">&#93;</span></pre></div></div>

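<p>If the pattern contains capturing parentheses and it matches at the very start of the string, the result begins with an extra empty string. The same applies when it matches at the end of the string:</p>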

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(<span style="color: #000099; font-weight: bold;">\W</span>+)'</span>, <span style="color: #483d8b;">'...words, words...'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">''</span>, <span style="color: #483d8b;">'...'</span>, <span style="color: #483d8b;">'words'</span>, <span style="color: #483d8b;">', '</span>, <span style="color: #483d8b;">'words'</span>, <span style="color: #483d8b;">'...'</span>, <span style="color: #483d8b;">''</span><span style="color: black;">&#93;</span></pre></div></div>

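<p>Note that, in the Python 2 <code>re</code> module described here, <em>split</em> never splits a string on a zero-length match. For example:</p>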

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'x*'</span>, <span style="color: #483d8b;">'foo'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'foo'</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;(?m)^$&quot;</span>, <span style="color: #483d8b;">&quot;foo<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>bar<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'foo<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>bar<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: black;">&#93;</span></pre></div></div>

<p><code>re</code>.<strong>findall</strong>(<em>pattern</em>, <em>string</em>[, <em>flags</em>])<br />
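Returns all non-overlapping matches of <em>pattern</em> in <em>string</em> as a list. <em>string</em> is scanned from left to right, and the matches are returned in the order found. If <em>pattern</em> contains a <strong>group</strong>, a list of the text matched by that group is returned; if <em>pattern</em> contains several groups, each match becomes a tuple of its groups, and the result is a list of tuples.<br />
Because this function does not involve concepts such as <strong>MatchObject</strong>, it is probably the easiest one for beginners to understand and use. Here are a few simple examples:</p>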

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;"># a simple findall</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\w</span>+'</span>, <span style="color: #483d8b;">'hello, world!'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'hello'</span>, <span style="color: #483d8b;">'world'</span><span style="color: black;">&#93;</span>
<span style="color: #808080; font-style: italic;"># this one returns a list of tuples</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">findall</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'(<span style="color: #000099; font-weight: bold;">\d</span>+)<span style="color: #000099; font-weight: bold;">\.</span>(<span style="color: #000099; font-weight: bold;">\d</span>+)<span style="color: #000099; font-weight: bold;">\.</span>(<span style="color: #000099; font-weight: bold;">\d</span>+)<span style="color: #000099; font-weight: bold;">\.</span>(<span style="color: #000099; font-weight: bold;">\d</span>+)'</span>, <span style="color: #483d8b;">'My IP is 192.168.0.2, and your is 192.168.0.3.'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'192'</span>, <span style="color: #483d8b;">'168'</span>, <span style="color: #483d8b;">'0'</span>, <span style="color: #483d8b;">'2'</span><span style="color: black;">&#41;</span>, <span style="color: black;">&#40;</span><span style="color: #483d8b;">'192'</span>, <span style="color: #483d8b;">'168'</span>, <span style="color: #483d8b;">'0'</span>, <span style="color: #483d8b;">'3'</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span></pre></div></div>
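<p>The examples above cover the no-group and the multi-group cases; the single-group case sits in between and returns a plain list of strings. A short sketch in Python 3 syntax, with a made-up input string:</p>

```python
import re

text = "width=80, height=24"

# One group: the list holds only the text captured by that group.
values = re.findall(r"\w+=(\d+)", text)

# Two groups: each match becomes a tuple of its groups.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(values)
print(pairs)
```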

<p><code>re</code>.<strong>finditer</strong>(<em>pattern</em>, <em>string</em>[, <em>flags</em>])<br />
Like <strong>findall()</strong> above, but returns an iterator of <strong>MatchObject</strong> instances.<br />
Again, an example makes this clearer:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">&gt;&gt;&gt; for m in re.finditer('\w+', 'hello, world!'):
...     print m.group()
...
hello
world</pre></div></div>
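<p>Since each item produced by <strong>finditer()</strong> is a full match object, you also get positions, which <strong>findall()</strong> cannot give you. A minimal sketch in Python 3 syntax (the name <code>positions</code> is just illustrative):</p>

```python
import re

# Collect each word together with where it starts and ends in the string.
positions = [(m.group(), m.start(), m.end())
             for m in re.finditer(r"\w+", "hello, world!")]

print(positions)
```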
<p><code>re</code>.<strong>sub</strong>(<em>pattern</em>, <em>repl</em>, <em>string</em>[, <em>count</em>])<br />
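Substitution: replaces the parts of <em>string</em> that match <em>pattern</em> with <em>repl</em>, performing at most <em>count</em> replacements (any remaining matches are left untouched; if <em>count</em> is omitted or 0, all matches are replaced), and returns the resulting string. If nothing in <em>string</em> matches <em>pattern</em>, <em>string</em> is returned unchanged. <em>repl</em> can be a string or a function (see also my earlier <a href="http://luy.li/2009/04/01/python_re/">example</a>). If <em>repl</em> is a string, backslash escapes in it are processed: for example, <code>\n</code> is converted to a newline, and a backslash followed by a number is replaced by the corresponding group, so <code>\6</code> stands for the text matched by group 6 of <em>pattern</em>.<br />
Example:</p>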

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">'def<span style="color: #000099; font-weight: bold;">\s</span>+([a-zA-Z_][a-zA-Z_0-9]*)<span style="color: #000099; font-weight: bold;">\s</span>*<span style="color: #000099; font-weight: bold;">\(</span><span style="color: #000099; font-weight: bold;">\s</span>*<span style="color: #000099; font-weight: bold;">\)</span>:'</span>,
...        <span style="color: black;">r</span><span style="color: #483d8b;">'static PyObject*<span style="color: #000099; font-weight: bold;">\n</span>py_<span style="color: #000099; font-weight: bold;">\1</span>(void)<span style="color: #000099; font-weight: bold;">\n</span>{'</span>,
...        <span style="color: #483d8b;">'def myfunc():'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'static PyObject*<span style="color: #000099; font-weight: bold;">\n</span>py_myfunc(void)<span style="color: #000099; font-weight: bold;">\n</span>{'</span></pre></div></div>

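<p>If <em>repl</em> is a function, it is called once for every match of <em>pattern</em>. It receives the corresponding <strong>MatchObject</strong> and must return a string, which is inserted at the position of the match.<br />
Example:</p>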

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">def</span> dashrepl<span style="color: black;">&#40;</span>matchobj<span style="color: black;">&#41;</span>:
...     <span style="color: #ff7700;font-weight:bold;">if</span> matchobj.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span> == <span style="color: #483d8b;">'-'</span>: <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #483d8b;">' '</span>
...     <span style="color: #ff7700;font-weight:bold;">else</span>: <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #483d8b;">'-'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'-{1,2}'</span>, dashrepl, <span style="color: #483d8b;">'pro----gram-files'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'pro--gram files'</span></pre></div></div>

<p>Zero-length matches are replaced too, but only when they are not adjacent to a previous match. For example:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'x*'</span>, <span style="color: #483d8b;">'-'</span>, <span style="color: #483d8b;">'abcxxd'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'-a-b-c-d-'</span></pre></div></div>

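<p>As a special case, <code>\g&lt;name&gt;</code> in the replacement string refers to the named group of the pattern (the thing defined with <code>(?P&lt;name&gt;...)</code>, as introduced earlier). <code>\g&lt;number&gt;</code> refers to a numbered group, so <code>\g&lt;2&gt;</code> is normally equivalent to <code>\2</code>. However, if you need a literal 0 immediately after <code>\2</code>, you cannot write <code>\20</code> (that would mean group 20); you have to write <code>\g&lt;2&gt;0</code> instead. In addition, <code>\g&lt;0&gt;</code> stands for the entire matched substring.<br />
Example:</p>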

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">sub</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'-(<span style="color: #000099; font-weight: bold;">\d</span>+)-'</span>, <span style="color: #483d8b;">'-<span style="color: #000099; font-weight: bold;">\g</span>&lt;1&gt;0<span style="color: #000099; font-weight: bold;">\g</span>&lt;0&gt;'</span>, <span style="color: #483d8b;">'a-11-b-22-c'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'a-110-11-b-220-22-c'</span></pre></div></div>
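<p>The example above uses numbered references only. For completeness, here is a sketch of the named form in Python 3 syntax; the group names <code>year</code>, <code>month</code> and <code>day</code> and the date string are made up for illustration:</p>

```python
import re

date = "2010-05-12"
pattern = r"(?P<year>\d+)-(?P<month>\d+)-(?P<day>\d+)"

# Refer to the groups by name in the replacement string.
swapped = re.sub(pattern, r"\g<day>/\g<month>/\g<year>", date)

print(swapped)
```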

<p><code>re</code>.<strong>subn</strong>(<em>pattern</em>, <em>repl</em>, <em>string</em>[, <em>count</em>])<br />
Works just like <code>sub()</code> above, except that it returns a tuple <code>(new string, number of substitutions made)</code>.<br />
Again, an example says it best:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">subn</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'-(<span style="color: #000099; font-weight: bold;">\d</span>+)-'</span>, <span style="color: #483d8b;">'-<span style="color: #000099; font-weight: bold;">\g</span>&lt;1&gt;0<span style="color: #000099; font-weight: bold;">\g</span>&lt;0&gt;'</span>, <span style="color: #483d8b;">'a-11-b-22-c'</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#40;</span><span style="color: #483d8b;">'a-110-11-b-220-22-c'</span>, <span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span></pre></div></div>
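<p>None of the examples so far uses the <em>count</em> argument, so here is a small sketch of it next to <code>subn()</code>. It is written in Python 3 syntax and passes <em>count</em> as a keyword; the input string is an arbitrary choice of mine:</p>

```python
import re

# Replace only the first two numbers; the rest are left untouched.
limited = re.sub(r"\d+", "N", "1 2 3 4", count=2)

# subn() replaces everything and also reports how many replacements it made.
result, n = re.subn(r"\d+", "N", "1 2 3 4")

print(limited)
print(result, n)
```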

<p><code>re</code>.<strong>escape</strong>(<em>string</em>)<br />
Returns <em>string</em> with a backslash added in front of every character that is not a letter or a digit.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #dc143c;">re</span>.<span style="color: black;">escape</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'abc123_@#$'</span><span style="color: black;">&#41;</span>
abc123\_\@\<span style="color: #808080; font-style: italic;">#\$</span></pre></div></div>
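<p>The typical use of <code>escape()</code> is to search for a piece of literal text that happens to contain regex metacharacters. A minimal sketch in Python 3 syntax; the strings are invented for the example:</p>

```python
import re

literal = "1+1=2 (really?)"
haystack = "we know that 1+1=2 (really?) holds"

# Escaped: "+", "(", ")" and "?" are treated as ordinary characters.
hit = re.search(re.escape(literal), haystack)

# Unescaped: the same text is read as a regex and no longer matches itself.
miss = re.search(literal, haystack)

print(hit.group())
print(miss)
```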

<p><em>exception</em> <code>re</code>.<strong>error</strong><br />
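Raised when a string cannot be compiled into a valid regular expression, or when an error occurs while a regular expression is being matched. It is never raised merely because a pattern fails to match any text.</p>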
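<p>In practice this exception mostly shows up when the pattern comes from user input. The helper below is a hypothetical sketch (the name <code>is_valid_regex</code> is mine, not part of the <code>re</code> module), written in Python 3 syntax:</p>

```python
import re

def is_valid_regex(source):
    """Return True if source compiles as a regular expression."""
    try:
        re.compile(source)
    except re.error:
        return False
    return True

ok = is_valid_regex(r"\d+")
bad = is_valid_regex("(unclosed")

# A pattern that simply does not match raises nothing; it returns None.
no_match = re.search(r"\d+", "no digits here")

print(ok, bad, no_match)
```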
<p><big><strong>Regular Expression Objects</strong></big></p>
<p>Regular expression objects are returned by <code>re</code>.<strong>compile()</strong>. They have the following attributes and methods.</p>
<p><strong>match</strong>(<em>string</em>[, <em>pos</em>[, <em>endpos</em>]])<br />
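Works like the module-level <strong>match()</strong> function; the difference lies in the last two parameters.<br />
<em>pos</em> is the position where the search starts and defaults to 0. <em>endpos</em> is the position where the search ends; if <em>endpos</em> is less than <em>pos</em>, no match will be found. In other words, only the characters from <em>pos</em> to <em>endpos</em>-1 are searched.<br />
Example:</p>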

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern = <span style="color: #dc143c;">re</span>.<span style="color: #008000;">compile</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;o&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;dog&quot;</span><span style="color: black;">&#41;</span>      <span style="color: #808080; font-style: italic;"># no match: the string does not start with "o"</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;dog&quot;</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>   <span style="color: #808080; font-style: italic;"># match: the second character is "o"</span>
<span style="color: #66cc66;">&lt;</span>_sre.<span style="color: black;">SRE_Match</span> <span style="color: #008000;">object</span> at ...<span style="color: #66cc66;">&gt;</span></pre></div></div>
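<p>The example above only exercises <em>pos</em>. The following sketch, in Python 3 syntax with a made-up string, adds <em>endpos</em> to show that only the window from <em>pos</em> to <em>endpos</em>-1 is considered:</p>

```python
import re

pattern = re.compile(r"\d+")
text = "abc12345"

full = pattern.search(text)            # no limits: sees all the digits
windowed = pattern.match(text, 3, 6)   # only text[3:6] is visible
empty = pattern.match(text, 3, 3)      # empty window: nothing can match

print(full.group(), windowed.group(), empty)
```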

<p><strong>search</strong>(<em>string</em>[, <em>pos</em>[, <em>endpos</em>]])<br />
Works like the module-level <strong>search()</strong> function; the <em>pos</em> and <em>endpos</em> parameters behave as in <strong>match()</strong> above.</p>
<p><strong>split</strong>(<em>string</em>[, <em>maxsplit</em>=0])<br />
<strong>findall</strong>(<em>string</em>[, <em>pos</em>[, <em>endpos</em>]])<br />
<strong>finditer</strong>(<em>string</em>[, <em>pos</em>[, <em>endpos</em>]])<br />
<strong>sub</strong>(<em>repl</em>, <em>string</em>[, <em>count=0</em>])<br />
<strong>subn</strong>(<em>repl</em>, <em>string</em>[, <em>count=0</em>])<br />
These methods all behave the same as the corresponding module-level functions.</p>
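<p>To make "behave the same" concrete, the sketch below compares a compiled pattern's methods with the module-level calls. It uses Python 3 syntax, and the variable names are only illustrative:</p>

```python
import re

word = re.compile(r"\w+")
text = "hello, world!"

# The method on the compiled object gives the same result as the function.
same_findall = word.findall(text) == re.findall(r"\w+", text)
same_sub = word.sub("X", text) == re.sub(r"\w+", "X", text)

# The methods additionally accept pos, which the module functions do not.
tail = word.findall(text, 6)

print(same_findall, same_sub, tail)
```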
<p><strong>flags</strong><br />
The flags that were specified when this RE was compiled, or 0 if no flags were given.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern = <span style="color: #dc143c;">re</span>.<span style="color: #008000;">compile</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;o&quot;</span>, <span style="color: #dc143c;">re</span>.<span style="color: black;">S</span>|re.<span style="color: black;">U</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern.<span style="color: black;">flags</span>
<span style="color: #ff4500;">48</span></pre></div></div>

<p><strong>groups</strong><br />
The number of groups in the RE.</p>
<p><strong>groupindex</strong><br />
A dictionary mapping the names of named groups to their group numbers.<br />
Example:</p>

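<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;"># This regex has 3 groups: the first is named quhao (area code), the last is named fenjihao (extension), and the middle one is unnamed</span>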
<span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern = <span style="color: #dc143c;">re</span>.<span style="color: #008000;">compile</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;(?P&lt;quhao&gt;<span style="color: #000099; font-weight: bold;">\d</span>+)-(<span style="color: #000099; font-weight: bold;">\d</span>+)-(?P&lt;fenjihao&gt;<span style="color: #000099; font-weight: bold;">\d</span>+)&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern.<span style="color: black;">groups</span>
<span style="color: #ff4500;">3</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> pattern.<span style="color: black;">groupindex</span>
<span style="color: black;">&#123;</span><span style="color: #483d8b;">'fenjihao'</span>: <span style="color: #ff4500;">3</span>, <span style="color: #483d8b;">'quhao'</span>: <span style="color: #ff4500;">1</span><span style="color: black;">&#125;</span></pre></div></div>

<p><strong>pattern</strong><br />
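The original string from which this RE was built; its source code, so to speak, haha.<br />
Using the same regex as above, you can see it comes back unchanged:</p>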

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> pattern.<span style="color: black;">pattern</span>
<span style="color: black;">&#40;</span><span style="color: #66cc66;">?</span>P<span style="color: #66cc66;">&lt;</span>quhao<span style="color: #66cc66;">&gt;</span>\d+<span style="color: black;">&#41;</span>-<span style="color: black;">&#40;</span>\d+<span style="color: black;">&#41;</span>-<span style="color: black;">&#40;</span><span style="color: #66cc66;">?</span>P<span style="color: #66cc66;">&lt;</span>fenjihao<span style="color: #66cc66;">&gt;</span>\d+<span style="color: black;">&#41;</span></pre></div></div>

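<p><big><strong>Match Objects</strong></big></p>
<p>A <code>re</code>.<strong>MatchObject</strong> always evaluates to True in a boolean context, while a failed match returns <strong>None</strong>, so it is safe to use an if statement to test whether a <strong>match()</strong> succeeded.<br />
It has the following methods and attributes:</p>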
<p><strong>expand</strong>(<em>template</em>)<br />
Expands the <strong>MatchObject</strong> using <em>template</em> as the template, the same way <strong>sub()</strong> does. See the example:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'a=(<span style="color: #000099; font-weight: bold;">\d</span>+)'</span>, <span style="color: #483d8b;">'a=100'</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">expand</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'above a is <span style="color: #000099; font-weight: bold;">\g</span>&lt;1&gt;'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'above a is 100'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">expand</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">'above a is <span style="color: #000099; font-weight: bold;">\1</span>'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'above a is 100'</span></pre></div></div>

<p><strong>group</strong>([<em>group1</em>, <em>...</em>])<br />
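Returns one or more subgroups of the match. With a single argument, the result is a single string; with several arguments, the result is a tuple of the corresponding strings. With no arguments, it behaves as if 0 had been passed and returns the entire match. If some <em>groupN</em> did not take part in the match, <strong>None</strong> is returned in its place. If some <em>groupN</em> is negative or larger than the number of groups in the pattern, an <strong>IndexError</strong> is raised.</p>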

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;(<span style="color: #000099; font-weight: bold;">\w</span>+) (<span style="color: #000099; font-weight: bold;">\w</span>+)&quot;</span>, <span style="color: #483d8b;">&quot;Isaac Newton, physicist&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>       <span style="color: #808080; font-style: italic;"># the entire match</span>
<span style="color: #483d8b;">'Isaac Newton'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>       <span style="color: #808080; font-style: italic;"># the first subgroup</span>
<span style="color: #483d8b;">'Isaac'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span>       <span style="color: #808080; font-style: italic;"># the second subgroup</span>
<span style="color: #483d8b;">'Newton'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span>    <span style="color: #808080; font-style: italic;"># a tuple of several subgroups</span>
<span style="color: black;">&#40;</span><span style="color: #483d8b;">'Isaac'</span>, <span style="color: #483d8b;">'Newton'</span><span style="color: black;">&#41;</span></pre></div></div>

<p>If the pattern contains groups named with the <code>(?P&lt;name&gt;...)</code> syntax, the corresponding <em>groupN</em> arguments may also be the name strings. For example:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;(?P&lt;first_name&gt;<span style="color: #000099; font-weight: bold;">\w</span>+) (?P&lt;last_name&gt;<span style="color: #000099; font-weight: bold;">\w</span>+)&quot;</span>, <span style="color: #483d8b;">&quot;Malcolm Reynolds&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'first_name'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'Malcolm'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'last_name'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'Reynolds'</span></pre></div></div>

<p>If a group matches several times, only the last match can be retrieved:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;(..)+&quot;</span>, <span style="color: #483d8b;">&quot;a1b2c3&quot;</span><span style="color: black;">&#41;</span>  <span style="color: #808080; font-style: italic;"># the group matches 3 times</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">group</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>                        <span style="color: #808080; font-style: italic;"># only the last match is returned</span>
<span style="color: #483d8b;">'c3'</span></pre></div></div>

<p><strong>groups</strong>([<em>default</em>])<br />
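Returns a tuple containing all the subgroups of the match. The <em>default</em> argument is used for groups that did not take part in the match; it defaults to <strong>None</strong>.<br />
For example:</p>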

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;(<span style="color: #000099; font-weight: bold;">\d</span>+)<span style="color: #000099; font-weight: bold;">\.</span>(<span style="color: #000099; font-weight: bold;">\d</span>+)&quot;</span>, <span style="color: #483d8b;">&quot;24.1632&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">groups</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#40;</span><span style="color: #483d8b;">'24'</span>, <span style="color: #483d8b;">'1632'</span><span style="color: black;">&#41;</span></pre></div></div>

<p>What <em>default</em> does:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;(<span style="color: #000099; font-weight: bold;">\d</span>+)<span style="color: #000099; font-weight: bold;">\.</span>?(<span style="color: #000099; font-weight: bold;">\d</span>+)?&quot;</span>, <span style="color: #483d8b;">&quot;24&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">groups</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>      <span style="color: #808080; font-style: italic;"># the second group defaults to None</span>
<span style="color: black;">&#40;</span><span style="color: #483d8b;">'24'</span>, <span style="color: #008000;">None</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">groups</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0'</span><span style="color: black;">&#41;</span>   <span style="color: #808080; font-style: italic;"># now the default is '0'</span>
<span style="color: black;">&#40;</span><span style="color: #483d8b;">'24'</span>, <span style="color: #483d8b;">'0'</span><span style="color: black;">&#41;</span></pre></div></div>

<p><strong>groupdict</strong>([<em>default</em>])<br />
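Returns a dictionary containing the names and matched substrings of all named groups. The <em>default</em> argument is used for groups that did not take part in the match; it defaults to <strong>None</strong>. For example:</p>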

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">match</span><span style="color: black;">&#40;</span>r<span style="color: #483d8b;">&quot;(?P&lt;first_name&gt;<span style="color: #000099; font-weight: bold;">\w</span>+) (?P&lt;last_name&gt;<span style="color: #000099; font-weight: bold;">\w</span>+)&quot;</span>, <span style="color: #483d8b;">&quot;Malcolm Reynolds&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">groupdict</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#123;</span><span style="color: #483d8b;">'first_name'</span>: <span style="color: #483d8b;">'Malcolm'</span>, <span style="color: #483d8b;">'last_name'</span>: <span style="color: #483d8b;">'Reynolds'</span><span style="color: black;">&#125;</span></pre></div></div>
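<p>The example above does not show <em>default</em> in action for <strong>groupdict()</strong>. Here is a sketch in Python 3 syntax; the group names <code>major</code> and <code>minor</code> are hypothetical. Note that the unnamed group in the middle does not appear in the dictionary at all:</p>

```python
import re

# The ".minor" part is optional, so the "minor" group may not participate.
m = re.match(r"(?P<major>\d+)(\.(?P<minor>\d+))?", "24")

plain = m.groupdict()        # missing groups map to None
filled = m.groupdict("0")    # missing groups map to the given default

print(plain)
print(filled)
```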

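<p><strong>start</strong>([<em>group</em>])<br />
<strong>end</strong>([<em>group</em>])<br />
Return the start and end positions, within the original string, of the substring matched by <em>group</em>. If <em>group</em> is omitted or 0, the whole match is used. If <em>group</em> exists but did not take part in the match, <code>-1</code> is returned.<br />
For a match object m and a group g that took part in the match, <code>m.group(g)</code> is equivalent to <code>m.string[m.start(g):m.end(g)]</code>.<br />
Note: if <em>group</em> matched an empty string, <em>m.start(group)</em> and <em>m.end(group)</em> are equal.<br />
For example:</p>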

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">search</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'b(c?)'</span>, <span style="color: #483d8b;">'cba'</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">start</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">1</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">end</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">2</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">start</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">2</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m.<span style="color: black;">end</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">2</span></pre></div></div>

<p>Here is an example that removes "remove_this" from an email address:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">email</span> = <span style="color: #483d8b;">&quot;tony@tiremove_thisger.net&quot;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> m = <span style="color: #dc143c;">re</span>.<span style="color: black;">search</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;remove_this&quot;</span>, <span style="color: #dc143c;">email</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #dc143c;">email</span><span style="color: black;">&#91;</span>:m.<span style="color: black;">start</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span> + <span style="color: #dc143c;">email</span><span style="color: black;">&#91;</span>m.<span style="color: black;">end</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:<span style="color: black;">&#93;</span>
<span style="color: #483d8b;">'tony@tiger.net'</span></pre></div></div>

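<p><strong>span</strong>([<em>group</em>])<br />
Returns a tuple: <code>(m.start(group), m.end(group))</code></p>
<p><strong>pos</strong><br />
The value of the <em>pos</em> argument passed to the <strong>search()</strong> or <strong>match()</strong> method of the RE object, that is, the index in the string at which the RE started searching.</p>
<p><strong>endpos</strong><br />
The value of the <em>endpos</em> argument passed to the <strong>search()</strong> or <strong>match()</strong> method of the RE object, that is, the index in the string at which the RE stops searching.</p>
<p><strong>lastindex</strong><br />
The integer index of the last matched group, or <strong>None</strong> if no group was matched.<br />
For example, matching <code>(a)b</code>, <code>((a)(b))</code>, or <code>((ab))</code> against <code>'ab'</code> gives a <strong>lastindex</strong> of 1, whereas matching <code>(a)(b)</code> against <code>'ab'</code> gives a <strong>lastindex</strong> of 2.</p>
<p><strong>lastgroup</strong><br />
The name of the last matched group, or <strong>None</strong> if no group was matched or the last group has no name.</p>
<p><strong>re</strong><br />
The regular expression object that produced this Match object, that is, the object whose <strong>search()</strong> or <strong>match()</strong> method was called.</p>
<p><strong>string</strong><br />
The string passed to <strong>search()</strong> or <strong>match()</strong>.</p>
<p><br/></p>
<p>I will skip the remaining examples, since I have already added plenty of my own along the way; if you need more, refer to the original English documentation.<br />
Finally, thanks to my wife for her hard work proofreading this, haha.</p>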
]]></content:encoded>
			<wfw:commentRss>http://luy.li/2010/05/12/python-re/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>An awk implementation of "Quick Reference for sed One-Line Scripts"</title>
		<link>http://luy.li/2009/12/07/sed_awk/</link>
		<comments>http://luy.li/2009/12/07/sed_awk/#comments</comments>
		<pubDate>Mon, 07 Dec 2009 13:10:34 +0000</pubDate>
		<dc:creator>bones7456</dc:creator>
				<category><![CDATA[CLI software]]></category>
		<category><![CDATA[Best]]></category>

		<guid isPermaLink="false">http://li2z.cn/?p=1216</guid>
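		<description><![CDATA[sed and awk are both stream-oriented text-processing tools commonly used on Linux, and each has its own strengths. This post is not meant as a comparison; just for fun, I redid the one-liners from "Quick Reference for sed One-Line Scripts" in awk. As for which is better, that is really hard to say. Generally, sed commands are shorter but also harder to read, while awk commands are a bit longer, but with constructs such as if and while the logic is more explicit and the result feels more like a “program”. Which one you prefer is up to you, dear reader! PS: With this color scheme, a long run of single-line code blocks can be a bit dizzying while scrolling; please bear with it, haha. Text spacing: &#8212;&#8212;&#8211; # Append a blank line after every line sed G awk '{printf(&#34;%s\n\n&#34;,$0)}' # Delete all existing blank lines, then append a blank line after every line. # The output then has exactly one blank line after each line of text. sed '/^$/d;G' awk '!/^$/{printf(&#34;%s\n\n&#34;,$0)}' # Append two blank lines after every line sed 'G;G' awk '{printf(&#34;%s\n\n\n&#34;,$0)}' # Delete all the blank lines produced by the first script (that is, delete every even-numbered line) sed 'n;d' awk '{f=!f;if(f)print $0}' # Insert a blank line before every line that matches “regex” sed '/regex/{x;p;x;}' awk '{if(/regex/)printf(&#34;\n%s\n&#34;,$0);else print $0}' # Insert a blank line after every line that matches “regex” sed '/regex/G' awk '{if(/regex/)printf(&#34;%s\n\n&#34;,$0);else print $0}' # Insert a blank line both before and after every line that matches “regex” sed '/regex/{x;p;x;G;}' awk '{if(/regex/)printf(&#34;\n%s\n\n&#34;,$0);else [...]]]></description>
			<content:encoded><![CDATA[<p>sed and awk are both stream-oriented text-processing tools commonly used on Linux, and each has its own strengths. This post is not meant as a comparison; just for fun, I redid the one-liners from <a href="http://sed.sourceforge.net/sed1line_zh-CN.html">"Quick Reference for sed One-Line Scripts"</a> in awk.<br />
As for which is better, that is really hard to say. Generally, sed commands are shorter but also harder to read, while awk commands are a bit longer, but with constructs such as if and while the logic is more explicit and the result feels more like a “program”. Which one you prefer is up to you, dear reader!<br />
PS: With this color scheme, a long run of single-line code blocks can be a bit dizzying while scrolling; please bear with it, haha.</p>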
<p>Text spacing:<br />
&#8212;&#8212;&#8211;</p>
<p> # Append a blank line after every line</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> G</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{printf(&quot;%s\n\n&quot;,$0)}'</span></pre></div></div>

<p> # Delete all existing blank lines, then append a blank line after every line.<br />
 # The output then has exactly one blank line after each line of text.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^$/d;G'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'!/^$/{printf(&quot;%s\n\n&quot;,$0)}'</span></pre></div></div>

<p> # Append two blank lines after every line</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'G;G'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{printf(&quot;%s\n\n\n&quot;,$0)}'</span></pre></div></div>

<p> # Delete all the blank lines produced by the first script (that is, delete every even-numbered line)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'n;d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{f=!f;if(f)print $0}'</span></pre></div></div>
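<p>The toggle works because an unset awk variable counts as false, so <code>f=!f</code> is true on lines 1, 3, 5, and so on. A minimal sketch of the same filter written with the built-in record counter <code>NR</code> instead; the function name <code>odd_lines</code> is made up for illustration:</p>

```shell
# Hypothetical helper: keep only odd-numbered lines (1, 3, 5, ...).
odd_lines() {
    awk 'NR % 2 == 1'
}

# Double-spaced input comes back single-spaced.
printf 'a\n\nb\n\nc\n\n' | odd_lines
```

<p>A pattern with no action prints the matching line, which is why no explicit <code>print</code> is needed here.</p>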

<p> # Insert a blank line before every line that matches “regex”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/regex/{x;p;x;}'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(/regex/)printf(&quot;\n%s\n&quot;,$0);else print $0}'</span></pre></div></div>

<p> # Insert a blank line after every line that matches “regex”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/regex/G'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(/regex/)printf(&quot;%s\n\n&quot;,$0);else print $0}'</span></pre></div></div>

<p> # Insert a blank line both before and after every line that matches “regex”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/regex/{x;p;x;G;}'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(/regex/)printf(&quot;\n%s\n\n&quot;,$0);else print $0}'</span></pre></div></div>

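<p>Numbering:<br />
&#8212;&#8212;&#8211;</p>
<p> # Number every line of a file (simple left alignment). A tab (see the note on<br />
 # &#8217;\t&#8217; at the end of this document) is used instead of spaces to keep the margin aligned.</p>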

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> = filename <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'N;s/\n/\t/'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{i++;printf(&quot;%d\t%s\n&quot;,i,$0)}'</span></pre></div></div>

<p> # Number every line of a file (numbers on the left, right-aligned).</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> = filename <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{i++;printf(&quot;%6d  %s\n&quot;,i,$0)}'</span></pre></div></div>

<p> # Number every line of a file, but only print the numbers of non-blank lines.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/./='</span> filename <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/./N; s/\n/ /'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{i++;if(!/^$/)printf(&quot;%d %s\n&quot;,i,$0);else print}'</span></pre></div></div>

<p> # Count lines (emulates &#8220;wc -l&#8221;)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'$='</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{i++}END{print i}'</span></pre></div></div>
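<p>A separate counter is not strictly needed, because awk keeps the number of records read so far in <code>NR</code>, and that value is still available in the <code>END</code> block. A minimal sketch (the function name <code>count_lines</code> is hypothetical):</p>

```shell
# Hypothetical helper: print the number of input lines, like "wc -l".
count_lines() {
    awk 'END { print NR }'
}

printf 'one\ntwo\nthree\n' | count_lines
```

<p>One difference from the counter version: on empty input this prints 0, whereas an unset counter variable prints as an empty line.</p>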

<p>Text conversion and substitution:<br />
&#8212;&#8212;&#8211;</p>
<p> # Unix environment: convert DOS newlines (CR/LF) to Unix format.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/.$//'</span>                     <span style="color: #666666; font-style: italic;"># assumes that all lines end with CR/LF</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/^M$//'</span>                    <span style="color: #666666; font-style: italic;"># in bash/tcsh, type the ^M by pressing Ctrl-V then Ctrl-M</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/\x0D$//'</span>                  <span style="color: #666666; font-style: italic;"># ssed, gsed 3.02.80, or later</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{sub(/\x0D$/,&quot;&quot;);print $0}'</span></pre></div></div>

<p> # Unix environment: convert Unix newlines (LF) to DOS format.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">&quot;s/$/<span style="color: #780078;">`echo -e \\\r`</span>/&quot;</span>        <span style="color: #666666; font-style: italic;"># command line under ksh</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/$'</span><span style="color: #ff0000;">&quot;/<span style="color: #780078;">`echo \\\r`</span>/&quot;</span>         <span style="color: #666666; font-style: italic;"># command line under bash</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">&quot;s/$/<span style="color: #780078;">`echo \\\r`</span>/&quot;</span>           <span style="color: #666666; font-style: italic;"># command line under zsh</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/$/\r/'</span>                    <span style="color: #666666; font-style: italic;"># gsed 3.02.80 or later</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{printf(&quot;%s\r\n&quot;,$0)}'</span></pre></div></div>
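<p>The two awk conversions are inverses of each other, which makes for an easy sanity check: convert to DOS format and back, and the text should be unchanged. A small sketch, assuming an awk that understands the <code>\r</code> escape in regular expressions; the function names <code>to_dos</code> and <code>to_unix</code> are made up for illustration:</p>

```shell
# Hypothetical helpers for converting line endings in both directions.
to_dos()  { awk '{ printf("%s\r\n", $0) }'; }
to_unix() { awk '{ sub(/\r$/, ""); print }'; }

# Round trip: the final output should be identical to the input.
printf 'one\ntwo\n' | to_dos | to_unix
```

<p>Piping the intermediate result through <code>wc -c</code> is a quick way to confirm that one extra byte per line was really added.</p>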

<p> # DOS environment: convert Unix newlines (LF) to DOS format.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">&quot;s/$//&quot;</span>                      <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> p                         <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">(DOS environment: skipped)</pre></div></div>

<p> # DOS environment: convert DOS newlines (CR/LF) to Unix format.<br />
 # The scripts below only work with UnxUtils sed 4.0.7 or later. The UnxUtils version of<br />
 # sed can be identified by its “--text” option. Run it with the help option (“--help”) and check<br />
 # whether a “--text” entry is listed to find out whether you are using the UnxUtils version. Other DOS<br />
 # versions of sed cannot do this conversion, but “tr” can be used instead.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">&quot;s/<span style="color: #000099; font-weight: bold;">\r</span>//&quot;</span> infile <span style="color: #000000; font-weight: bold;">&gt;</span>outfile     <span style="color: #666666; font-style: italic;"># UnxUtils sed v4.0.7 or later</span>
<span style="color: #c20cb9; font-weight: bold;">tr</span> <span style="color: #660033;">-d</span> \r <span style="color: #000000; font-weight: bold;">&lt;</span>infile <span style="color: #000000; font-weight: bold;">&gt;</span>outfile        <span style="color: #666666; font-style: italic;"># GNU tr 1.22 or later</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">(DOS environment: skipped)</pre></div></div>

<p> # Delete leading whitespace (spaces, tabs) from every line<br />
 # so that the text is left-aligned</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/^[ \t]*//'</span>                <span style="color: #666666; font-style: italic;"># see the note on '\t' at the end of this document</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{sub(/^[ \t]+/,&quot;&quot;);print $0}'</span></pre></div></div>

<p> # Delete trailing whitespace (spaces, tabs) from every line</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/[ \t]*$//'</span>                <span style="color: #666666; font-style: italic;"># see the note on '\t' at the end of this document</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{sub(/[ \t]+$/,&quot;&quot;);print $0}'</span></pre></div></div>

<p> # Delete both leading and trailing whitespace from every line</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/^[ \t]*//;s/[ \t]*$//'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{sub(/^[ \t]+/,&quot;&quot;);sub(/[ \t]+$/,&quot;&quot;);print $0}'</span></pre></div></div>
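<p>The two <code>sub()</code> calls can also be folded into a single <code>gsub()</code> with an alternation, since the two anchored patterns can each match at most once per line. A minimal sketch (the function name <code>trim</code> is hypothetical):</p>

```shell
# Hypothetical helper: strip leading and trailing spaces and tabs.
trim() {
    awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print }'
}

printf '  \thello world \t \n' | trim
```

<p>Whitespace in the middle of the line is left alone, because neither alternative can match without touching the start or the end of the line.</p>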

<p> # Insert 5 spaces at the beginning of every line (shifting the whole text 5 characters to the right)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/^/     /'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{printf(&quot;     %s\n&quot;,$0)}'</span></pre></div></div>

<p> # Right-align all text on a 79-column width<br />
 # (78 characters plus a final space)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'s/^.\{1,78\}$/ &amp;/;ta'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{printf(&quot;%79s\n&quot;,$0)}'</span></pre></div></div>

<p> # Center all text on a 79-column width. In method 1, spaces are padded at both the beginning<br />
 # and the end of each line to center it. In method 2, spaces are only added in front of the text<br />
 # while centering, and half of them are removed at the end; no spaces are appended to the end of the line.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span>  <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'s/^.\{1,77\}$/ &amp; /;ta'</span>                     <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span>  <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'s/^.\{1,77\}$/ &amp;/;ta'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'s/\( *\)\1/\1/'</span>  <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{for(i=0;i&lt;39-length($0)/2;i++)printf(&quot; &quot;);printf(&quot;%s\n&quot;,$0)}'</span>  <span style="color: #666666; font-style: italic;"># equivalent to method 2 above</span></pre></div></div>

<p> # Find the string “foo” on each line and replace it with “bar”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/foo/bar/'</span>                 <span style="color: #666666; font-style: italic;"># replaces only the first “foo” on each line</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/foo/bar/4'</span>                <span style="color: #666666; font-style: italic;"># replaces only the fourth “foo” on each line</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/foo/bar/g'</span>                <span style="color: #666666; font-style: italic;"># replaces every “foo” on each line with “bar”</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/\(.*\)foo\(.*foo\)/\1bar\2/'</span> <span style="color: #666666; font-style: italic;"># replaces the next-to-last “foo”</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/\(.*\)foo/\1bar/'</span>            <span style="color: #666666; font-style: italic;"># replaces the last “foo”</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{gsub(/foo/,&quot;bar&quot;);print $0}'</span>   <span style="color: #666666; font-style: italic;"># replaces every “foo” on each line with “bar”</span></pre></div></div>
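<p>The awk line above only covers the “replace all” case. Replacing just the first match is a matter of using <code>sub()</code> instead of <code>gsub()</code>, but POSIX awk has no direct counterpart to sed's <code>s/foo/bar/4</code>. One possible sketch walks the line match by match with <code>match()</code>; the function name <code>replace_nth</code> and its argument are made up for illustration:</p>

```shell
# Hypothetical helper: replace only the n-th "foo" on each line with "bar".
# Usage: replace_nth N
replace_nth() {
    awk -v n="$1" '{
        rest = $0; out = ""; count = 0
        # Walk the line match by match, copying text to "out" as we go.
        while (match(rest, /foo/)) {
            count++
            out = out substr(rest, 1, RSTART - 1)
            if (count == n) out = out "bar"; else out = out "foo"
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}

echo 'foo foo foo foo foo' | replace_nth 4
```

<p>Lines with fewer than N matches pass through unchanged, which is also what the sed version does.</p>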

<p> # Replace “foo” with “bar” only on lines that contain “baz”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/baz/s/foo/bar/g'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(/baz/)gsub(/foo/,&quot;bar&quot;);print $0}'</span></pre></div></div>

<p> # Replace “foo” with “bar” only on lines that do not contain “baz”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/baz/!s/foo/bar/g'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(!/baz/)gsub(/foo/,&quot;bar&quot;);print $0}'</span></pre></div></div>

<p> # Change “scarlet”, “ruby”, or “puce” to “red”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/scarlet/red/g;s/ruby/red/g;s/puce/red/g'</span>  <span style="color: #666666; font-style: italic;"># works with most seds</span>
gsed <span style="color: #ff0000;">'s/scarlet\|ruby\|puce/red/g'</span>               <span style="color: #666666; font-style: italic;"># GNU sed only</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{gsub(/scarlet|ruby|puce/,&quot;red&quot;);print $0}'</span></pre></div></div>

<p> # Reverse the order of lines, so that the first line becomes the last, and so on (emulates “tac”).<br />
 # For some reason, HHsed v1.5 deletes blank lines from the file when running the commands below</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'1!G;h;$!d'</span>               <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'1!G;h;$p'</span>             <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{A[i++]=$0}END{for(j=i-1;j&gt;=0;j--)print A[j]}'</span></pre></div></div>

<p> # Reverse the characters on each line, so that the first character becomes the last, and so on (emulates “rev”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/\n/!G;s/\(.\)\(.*\n\)/&amp;\2\1/;//D;s/.//'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{for(i=length($0);i&gt;0;i--)printf(&quot;%s&quot;,substr($0,i,1));printf(&quot;\n&quot;)}'</span></pre></div></div>

<p> # Join every pair of lines into one (like “paste”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'$!N;s/\n/ /'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{f=!f;if(f)printf(&quot;%s&quot;,$0);else printf(&quot; %s\n&quot;,$0)}END{if(f)printf(&quot;\n&quot;)}'</span></pre></div></div>

<p> # If a line ends with a backslash “\”, append the next line to it<br />
 # and remove the trailing backslash</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/\\$/N; s/\\\n//; ta'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(/\\$/)printf(&quot;%s&quot;,substr($0,1,length($0)-1));else printf(&quot;%s\n&quot;,$0)}'</span></pre></div></div>

<p> # If a line begins with an equals sign, append it to the previous line<br />
 # and replace the leading “=” with a single space</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'$!N;s/\n=/ /;ta'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'P;D'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(/^=/)printf(&quot; %s&quot;,substr($0,2));else printf(&quot;%s%s&quot;,a,$0);a=&quot;\n&quot;}END{printf(&quot;\n&quot;)}'</span></pre></div></div>

<p> # Add commas as thousands separators to numeric strings, changing “1234567” to “1,234,567”</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">gsed <span style="color: #ff0000;">':a;s/\B[0-9]\{3\}\&gt;/,&amp;/;ta'</span>                     <span style="color: #666666; font-style: italic;"># GNU sed</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'s/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'</span>  <span style="color: #666666; font-style: italic;"># other seds</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># awk regular expressions have no backreferences, so this one is a bit clumsy, haha.</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{while(match($0,/[0-9][0-9][0-9][0-9]+/)){$0=sprintf(&quot;%s,%s&quot;,substr($0,1,RSTART+RLENGTH-4),substr($0,RSTART+RLENGTH-3))}print $0}'</span></pre></div></div>
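<p>The loop in the awk version is easier to follow when spelled out: <code>match()</code> finds the leftmost run of four or more digits, a comma is placed before the last three digits of that run, and the search repeats until no such run is left. A commented sketch of the same idea (the function name <code>add_commas</code> and the variable <code>cut</code> are made up for illustration):</p>

```shell
# Hypothetical helper: insert thousands separators into runs of digits.
add_commas() {
    awk '{
        # While a run of four or more digits remains, split it before
        # its last three digits and look again.
        while (match($0, /[0-9][0-9][0-9][0-9]+/)) {
            cut = RSTART + RLENGTH - 4   # index of the digit left of the comma
            $0 = substr($0, 1, cut) "," substr($0, cut + 1)
        }
        print
    }'
}

echo 'total 1234567 bytes' | add_commas
```

<p>Note that string positions in awk start at 1, so the prefix is taken with <code>substr($0, 1, cut)</code>. This simple version also groups digits that follow a decimal point.</p>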

<p> # Add commas to numbers with decimal points and minus signs (GNU sed)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">gsed <span style="color: #660033;">-r</span> <span style="color: #ff0000;">':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># much the same as the previous example</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{while(match($0,/(^|[^.0-9])[0-9][0-9][0-9][0-9]+/)){$0=sprintf(&quot;%s,%s&quot;,substr($0,1,RSTART+RLENGTH-4),substr($0,RSTART+RLENGTH-3))}print $0}'</span></pre></div></div>

<p> # Add a blank line after every 5 lines (after lines 5, 10, 15, 20, and so on)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">gsed <span style="color: #ff0000;">'0~5G'</span>                      <span style="color: #666666; font-style: italic;"># GNU sed only</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'n;n;n;n;G;'</span>                 <span style="color: #666666; font-style: italic;"># other seds</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{print $0;i++;if(i==5){printf(&quot;\n&quot;);i=0}}'</span></pre></div></div>
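<p>Instead of keeping and resetting a counter, the same effect can be had by testing <code>NR</code> modulo the group size. A minimal sketch with the group size passed in as a parameter; the function name <code>blank_every</code> is hypothetical:</p>

```shell
# Hypothetical helper: print a blank line after every N-th input line.
# Usage: blank_every N
blank_every() {
    awk -v n="$1" '{ print } NR % n == 0 { print "" }'
}

printf '%s\n' 1 2 3 4 5 6 7 | blank_every 5
```

<p>With the input above, the blank line lands between 5 and 6.</p>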

<p>Selective printing of certain lines:<br />
&#8212;&#8212;&#8211;</p>
<p> # Print the first 10 lines of a file (emulates “head”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> 10q</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{print;if(NR==10)exit}'</span></pre></div></div>

<p> # Print the first line of a file (emulates “head -1”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> q</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{print;exit}'</span></pre></div></div>

<p> # Print the last 10 lines of a file (emulates “tail”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'$q;N;11,$D;ba'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># awk is at a disadvantage here: this version buffers the whole file, so it is slow on large files</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{A[NR]=$0}END{for(i=(NR&gt;10?NR-9:1);i&lt;=NR;i++)print A[i]}'</span></pre></div></div>
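<p>Buffering the whole file is not the only option. One possible sketch stores each line in slot <code>NR % n</code> of an array, so that at most <code>n</code> lines are held in memory at any time; the function name <code>last_lines</code> and the variable names are made up for illustration:</p>

```shell
# Hypothetical helper: print the last N lines while storing at most N of them.
# Usage: last_lines N
last_lines() {
    awk -v n="$1" '
        { buf[NR % n] = $0 }              # overwrite the slot used n lines ago
        END {
            first = (NR > n) ? NR - n + 1 : 1
            for (i = first; i <= NR; i++) print buf[i % n]
        }'
}

printf '%s\n' a b c d e | last_lines 2
```

<p>The <code>first</code> computation also covers inputs shorter than N lines, in which case every line is printed once.</p>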

<p> # Print the last 2 lines of a file (emulates “tail -2”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'$!N;$!D'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{A[NR]=$0}END{for(i=(NR&gt;2?NR-1:1);i&lt;=NR;i++)print A[i]}'</span></pre></div></div>

<p> # Print the last line of a file (emulates “tail -1”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'$!d'</span>                        <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'$p'</span>                      <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># this one is easy: just keep the most recent line</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{A=$0}END{print A}'</span></pre></div></div>

<p> # Print the next-to-last line of a file</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'$!{h;d;}'</span> <span style="color: #660033;">-e</span> x              <span style="color: #666666; font-style: italic;"># for a 1-line file, prints a blank line</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'1{$q;}'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'$!{h;d;}'</span> <span style="color: #660033;">-e</span> x  <span style="color: #666666; font-style: italic;"># for a 1-line file, prints that line</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'1{$d;}'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'$!{h;d;}'</span> <span style="color: #660033;">-e</span> x  <span style="color: #666666; font-style: italic;"># for a 1-line file, prints nothing</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># just keep two lines (for a 1-line file, prints a blank line)</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{B=A;A=$0}END{print B}'</span></pre></div></div>

<p> # Print only lines that match the regular expression (emulates “grep”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/regexp/p'</span>               <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/regexp/!d'</span>                 <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/regexp/{print}'</span></pre></div></div>

<p> # Print only lines that do NOT match the regular expression (emulates “grep -v”)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/regexp/!p'</span>              <span style="color: #666666; font-style: italic;"># method 1, corresponds to the command above</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/regexp/d'</span>                  <span style="color: #666666; font-style: italic;"># method 2, similar syntax</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'!/regexp/{print}'</span></pre></div></div>

<p> # Print the line immediately before each line matching "regexp", but not the matching line itself</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/regexp/{g;1!p;};h'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'NR&gt;1 &amp;&amp; /regexp/{print A}{A=$0}'</span></pre></div></div>

<p> # Print the line immediately after each line matching "regexp", but not the matching line itself</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/regexp/{n;p;}'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(A)print;A=0}/regexp/{A=1}'</span></pre></div></div>
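<p>A small sketch that runs the two "neighbouring line" awk idioms on the same input may make the difference easier to see. The function names and the sample data are made up for illustration, and the "line before" variant here skips a match on line 1 the way the sed version's <code>1!p</code> does.</p>

```shell
# Hypothetical sample input: five lines, with "regexp" on line 3.
sample() { printf '%s\n' one two 'a regexp here' four five; }

# Print the line before each match; A always holds the previous line.
# The NR>1 guard avoids printing an empty line when line 1 matches.
line_before() { awk 'NR>1 && /regexp/{print A}{A=$0}'; }

# Print the line after each match; A is a flag set by the previous line.
line_after() { awk '{if(A)print;A=0}/regexp/{A=1}'; }

sample | line_before   # prints: two
sample | line_after    # prints: four
```

<p>Both variants keep a single line (or flag) of state, so they work on streams of any size.</p>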

<p> # Print each line containing "regexp" with one line of context before and after it, preceded by the line number of the matching line (similar to "grep -A1 -B1")</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/regexp/{=;x;1!p;g;$!N;p;D;}'</span> <span style="color: #660033;">-e</span> h</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(F)print;F=0}/regexp/{print NR;print b;print;F=1}{b=$0}'</span></pre></div></div>

<p> # Print lines that contain "AAA", "BBB" and "CCC" (in any order)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/AAA/!d; /BBB/!d; /CCC/!d'</span>   <span style="color: #666666; font-style: italic;"># the order of the strings does not affect the result</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(match($0,/AAA/) &amp;&amp; match($0,/BBB/) &amp;&amp; match($0,/CCC/))print}'</span></pre></div></div>

<p> # Print lines that contain "AAA", "BBB" and "CCC" (in that order)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/AAA.*BBB.*CCC/!d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(match($0,/AAA.*BBB.*CCC/))print}'</span></pre></div></div>

<p> # Print lines that contain "AAA", "BBB" or "CCC" (emulates "egrep")</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/AAA/b'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/BBB/b'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/CCC/b'</span> <span style="color: #660033;">-e</span> d    <span style="color: #666666; font-style: italic;"># most seds</span>
gsed <span style="color: #ff0000;">'/AAA\|BBB\|CCC/!d'</span>                        <span style="color: #666666; font-style: italic;"># GNU sed only</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/AAA/{print;next}/BBB/{print;next}/CCC/{print}'</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/AAA|BBB|CCC/{print}'</span></pre></div></div>

<p> # Print paragraphs that contain "AAA" (paragraphs are separated by blank lines)<br />
 # HHsed v1.5 must insert "G;" after "x;"; this applies to the next three scripts</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/./{H;$!d;}'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'x;/AAA/!d;'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'BEGIN{RS=&quot;&quot;}/AAA/{print}'</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #660033;">-vRS</span>= <span style="color: #ff0000;">'/AAA/{print}'</span></pre></div></div>

<p> # Print paragraphs that contain all three strings "AAA", "BBB" and "CCC" (in any order)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/./{H;$!d;}'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'x;/AAA/!d;/BBB/!d;/CCC/!d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #660033;">-vRS</span>= <span style="color: #ff0000;">'{if(match($0,/AAA/) &amp;&amp; match($0,/BBB/) &amp;&amp; match($0,/CCC/))print}'</span></pre></div></div>

<p> # Print paragraphs that contain any one of the strings "AAA", "BBB" or "CCC"</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/./{H;$!d;}'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'x;/AAA/b'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/BBB/b'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/CCC/b'</span> <span style="color: #660033;">-e</span> d
gsed <span style="color: #ff0000;">'/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d'</span>         <span style="color: #666666; font-style: italic;"># GNU sed only</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #660033;">-vRS</span>= <span style="color: #ff0000;">'/AAA|BBB|CCC/{print &quot;&quot;;print}'</span></pre></div></div>
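<p>The paragraph scripts above all rely on awk's paragraph mode. As a hedged illustration (the sample text and function names are invented here), setting <code>RS</code> to the empty string makes every run of non-blank lines one record, and setting <code>ORS</code> to two newlines restores a blank line between the paragraphs that get printed.</p>

```shell
# Hypothetical sample: three paragraphs separated by one or more blank lines.
paras() { printf 'alpha\nAAA one\n\nbeta\n\n\ngamma\nCCC three\n'; }

# RS="" turns on paragraph mode; ORS="\n\n" separates output paragraphs
# with a blank line. The pattern alone prints every matching record.
paras_matching() { awk -v RS= -v ORS='\n\n' '/AAA|CCC/'; }

paras | paras_matching   # prints the first and third paragraphs
```

<p>Because the whole paragraph is one record, <code>/AAA|CCC/</code> matches across its lines without any hold-space juggling.</p>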

<p> # Print lines of 65 characters or more</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/^.\{65\}/p'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(length($0)&gt;=65)print}'</span></pre></div></div>

<p> # Print lines of fewer than 65 characters</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/^.\{65\}/!p'</span>            <span style="color: #666666; font-style: italic;"># method 1, mirrors the script above</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^.\{65\}/d'</span>                <span style="color: #666666; font-style: italic;"># method 2, simpler syntax</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(length($0)&lt;65)print}'</span></pre></div></div>

<p> # Print part of the file: from the first line matching the regular expression to the last line</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/regexp/,$p'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/regexp/{F=1}{if(F)print}'</span></pre></div></div>

<p> # Print part of the file by line number (lines 8 to 12, inclusive)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'8,12p'</span>                   <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'8,12!d'</span>                     <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(NR&gt;=8 &amp;&amp; NR&lt;=12)print}'</span></pre></div></div>
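<p>For large files it can help to stop reading once the range has been printed. The following is only a sketch: <code>print_range</code>, <code>first</code> and <code>last</code> are names chosen here for illustration, and <code>seq</code> is assumed to be available to generate test input.</p>

```shell
# Print lines first..last (inclusive), then exit instead of reading on.
print_range() {
  awk -v first="$1" -v last="$2" 'NR>=first && NR<=last; NR==last{exit}'
}

seq 1 20 | print_range 8 12   # prints 8, 9, 10, 11, 12 on separate lines
```

<p>The early <code>exit</code> plays the same role as <code>q</code> in the sed scripts: nothing after the last wanted line is processed.</p>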

<p> # Print line 52</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'52p'</span>                     <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'52!d'</span>                       <span style="color: #666666; font-style: italic;"># method 2</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'52q;d'</span>                      <span style="color: #666666; font-style: italic;"># method 3, more efficient on large files</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(NR==52){print;exit}}'</span></pre></div></div>

<p> # Starting at line 3, print every 7th line</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">gsed <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'3~7p'</span>                   <span style="color: #666666; font-style: italic;"># GNU sed only</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'3,${p;n;n;n;n;n;n;}'</span>     <span style="color: #666666; font-style: italic;"># other seds</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(NR==3)F=1}{if(F){i++;if(i%7==1)print}}'</span></pre></div></div>
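<p>The same "start at line 3, step 7" selection can also be written with plain arithmetic on <code>NR</code>. This is a sketch with made-up names (<code>every_nth</code>, <code>start</code>, <code>step</code>), assuming <code>seq</code> is available for the demonstration.</p>

```shell
# Print line "start" and every "step"-th line after it.
every_nth() {
  awk -v start="$1" -v step="$2" 'NR>=start && (NR-start)%step==0'
}

seq 1 20 | every_nth 3 7   # prints 3, 10, 17 on separate lines
```

<p>This mirrors GNU sed's <code>first~step</code> address without needing a flag or a counter.</p>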

<p> # Print the text between two regular expressions (inclusive)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/Iowa/,/Montana/p'</span>       <span style="color: #666666; font-style: italic;"># case-sensitive</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/Iowa/{F=1}{if(F)print}/Montana/{F=0}'</span></pre></div></div>
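<p>With the flag approach, whether the boundary lines are printed depends only on the order of the three rules. A sketch on invented sample data (the function names are illustrative):</p>

```shell
# Hypothetical sample input.
states() { printf '%s\n' Ohio Iowa Kansas Maine Montana Utah; }

# Inclusive: set the flag before printing, clear it after printing.
between_inclusive() { awk '/Iowa/{F=1}{if(F)print}/Montana/{F=0}'; }

# Exclusive: clear the flag before printing, set it after printing.
between_exclusive() { awk '/Montana/{F=0}F;/Iowa/{F=1}'; }

states | between_inclusive   # Iowa, Kansas, Maine, Montana
states | between_exclusive   # Kansas, Maine
```

<p>In the exclusive form the bare pattern <code>F</code> prints the line whenever the flag is set.</p>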

<p>Selective deletion of certain lines:<br />
&#8212;&#8212;&#8211;</p>
<p> # Print the whole file except the section between two regular expressions</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/Iowa/,/Montana/d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/Iowa/{F=1}{if(!F)print}/Montana/{F=0}'</span></pre></div></div>

<p> # Delete adjacent duplicate lines (emulates "uniq")<br />
 # The first line of each run of duplicates is kept; the rest are deleted</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'$!N; /^\(.*\)\n\1$/!P; D'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(NR==1 || $0!=B)print;B=$0}'</span></pre></div></div>

<p> # Delete duplicate lines whether or not they are adjacent. Mind the buffer size your sed's hold space can handle, or use GNU sed.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'G; s/\n/&amp;&amp;/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'</span>  <span style="color: #666666; font-style: italic;"># bones7456's note: this command does not work correctly on my system</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(!($0 in B))print;B[$0]=1}'</span></pre></div></div>
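<p>The array-membership test above is often shortened to a single expression. This sketch shows that compact form; <code>dedupe</code> and <code>seen</code> are just illustrative names.</p>

```shell
# Keep the first occurrence of every line, preserving input order.
# seen[$0]++ evaluates to 0 the first time a line appears, so the
# negated expression is true exactly once per distinct line.
dedupe() { awk '!seen[$0]++'; }

printf '%s\n' b a b c a | dedupe   # prints b, a, c on separate lines
```

<p>Like the longer version, it stores every distinct line in memory, so the cost grows with the number of unique lines rather than the file size.</p>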

<p> # Delete all lines except duplicated ones (emulates "uniq -d")</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'$!N; s/^\(.*\)\n\1$/\1/; t; D'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'NR&gt;1 &amp;&amp; $0==B{if(!d)print;d=1;next}{d=0;B=$0}'</span></pre></div></div>

<p> # Delete the first 10 lines of the file</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'1,10d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(NR&gt;10)print}'</span></pre></div></div>

<p> # Delete the last line of the file</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'$d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># awk does not know the total number of lines while reading, so this buffers the whole file; it may not suit large files. The next two scripts work the same way.</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{B[NR]=$0}END{for(i=1;i&lt;=NR-1;i++)print B[i]}'</span></pre></div></div>

<p> # Delete the last two lines of the file</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'N;$!P;$!D;$d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{B[NR]=$0}END{for(i=1;i&lt;=NR-2;i++)print B[i]}'</span></pre></div></div>

<p> # Delete the last 10 lines of the file</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'$d;N;2,10ba'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'P;D'</span>   <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'1,10!{P;N;D;};N;ba'</span>  <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{B[NR]=$0}END{for(i=1;i&lt;=NR-10;i++)print B[i]}'</span></pre></div></div>
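<p>Buffering the whole file is not strictly necessary for the "delete the last N lines" scripts. As an alternative sketch (not the approach used above; <code>drop_last</code>, <code>n</code> and <code>buf</code> are names invented here, and <code>n</code> is assumed to be at least 1), a ring buffer of N lines is enough:</p>

```shell
# Drop the last n lines while holding only n lines in memory.
# Line NR lives in slot NR%n. Before a slot is overwritten, the line
# stored there n lines earlier is printed; the final n lines never are.
drop_last() {
  awk -v n="$1" 'NR>n{print buf[NR%n]}{buf[NR%n]=$0}'
}

seq 1 15 | drop_last 10   # prints 1 through 5
```

<p>If the input has n lines or fewer, nothing is printed, which matches what the sed versions do.</p>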

<p> # Delete every 8th line</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">gsed <span style="color: #ff0000;">'0~8d'</span>                           <span style="color: #666666; font-style: italic;"># GNU sed only</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'n;n;n;n;n;n;n;d;'</span>                <span style="color: #666666; font-style: italic;"># other seds</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(NR%8!=0)print}'</span></pre></div></div>

<p> # Delete lines matching a pattern</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/pattern/d'</span>                      <span style="color: #666666; font-style: italic;"># deletes lines containing pattern; pattern can be any valid regular expression</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(!match($0,/pattern/))print}'</span></pre></div></div>

<p> # Delete all blank lines from the file (same effect as "grep &#8216;.&#8217; ")</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^$/d'</span>                           <span style="color: #666666; font-style: italic;"># method 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/./!d'</span>                           <span style="color: #666666; font-style: italic;"># method 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(!match($0,/^$/))print}'</span></pre></div></div>

<p> # Keep only the first of any run of consecutive blank lines, and delete the blank lines at the top and bottom of the file.<br />
 # (emulates "cat -s")</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/./,/^$/!d'</span>        <span style="color: #666666; font-style: italic;"># method 1, deletes blank lines at the top, allows one blank line at the end</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^$/N;/\n$/D'</span>      <span style="color: #666666; font-style: italic;"># method 2, allows one blank line at the top, none at the end</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(!match($0,/^$/)){print;F=1}else{if(F)print;F=0}}'</span>  <span style="color: #666666; font-style: italic;"># behaves like method 1 above: no blank lines at the top, at most one at the end</span></pre></div></div>

<p> # Keep only the first two of any run of consecutive blank lines.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^$/N;/\n$/N;//D'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(!match($0,/^$/)){print;F=0}else{if(F&lt;2)print;F++}}'</span></pre></div></div>

<p> # Delete all blank lines at the top of the file</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/./,$!d'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(F || !match($0,/^$/)){print;F=1}}'</span></pre></div></div>

<p> # Delete all blank lines at the end of the file</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/^\n*$/{$d;N;ba'</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'}'</span>  <span style="color: #666666; font-style: italic;"># works with all seds</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'/^\n*$/N;/\n$/ba'</span>        <span style="color: #666666; font-style: italic;"># same, but works only with gsed 3.02.*</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/^.+$/{for(i=l;i&lt;NR-1;i++)print &quot;&quot;;print;l=NR}'</span></pre></div></div>

<p> # Delete the last line of each paragraph</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-n</span> <span style="color: #ff0000;">'/^$/{p;h;};/./{x;/./p;}'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># Long and ugly; there is probably a better way</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #660033;">-vRS</span>= <span style="color: #ff0000;">'{B=$0;l=0;f=1;while(match(B,/\n/)&gt;0){print substr(B,l,RSTART-l-f);l=RSTART;sub(/\n/,&quot;&quot;,B);f=0};print &quot;&quot;}'</span></pre></div></div>

<p>Special applications:<br />
&#8212;&#8212;&#8211;</p>
<p> # Remove nroff overstrikes from man pages. On Unix System V or in the bash shell,<br />
 # the &#8216;echo&#8217; command may need the -e option.</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">&quot;s/.<span style="color: #780078;">`echo \\\b`</span>//g&quot;</span>    <span style="color: #666666; font-style: italic;"># the outer double quotes are required (Unix environment)</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/.^H//g'</span>             <span style="color: #666666; font-style: italic;"># in bash or tcsh, press Ctrl-V and then Ctrl-H</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/.\x08//g'</span>           <span style="color: #666666; font-style: italic;"># hexadecimal notation used by sed 1.5, GNU sed and ssed</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{gsub(/.\x08/,&quot;&quot;,$0);print}'</span></pre></div></div>

<p> # Extract the header of a newsgroup or e-mail message</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^$/q'</span>                <span style="color: #666666; font-style: italic;"># deletes everything after the first blank line</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{print}/^$/{exit}'</span></pre></div></div>

<p> # Extract the body of a newsgroup or e-mail message</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'1,/^$/d'</span>              <span style="color: #666666; font-style: italic;"># deletes everything up to and including the first blank line</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{if(F)print}/^$/{F=1}'</span></pre></div></div>

<p> # Extract the "Subject" header field and strip the leading "Subject:" label</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^Subject: */!d; s///;q'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/^Subject:.*/{print substr($0,10)}/^$/{exit}'</span></pre></div></div>

<p> # Get the reply address from the message header</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^Reply-To:/q; /^From:/h; /./d;g;q'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># Print the first Reply-To: line; if the header has none, fall back to the From: line (which is what h and g do in the sed version)</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/^Reply-To:/{print;exit}/^From:/{F=$0}/^$/{print F;exit}'</span></pre></div></div>

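<p> # Get the e-mail address itself: strip everything that is not part of the address from the one-line header produced by the previous script (see the previous script)</p>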

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/ *(.*)//; s/&gt;.*//; s/.*[:&lt;] *//'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># Takes whatever is inside the angle brackets (prints nothing useful for addresses written without them)</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #660033;">-F</span><span style="color: #ff0000;">'[&lt;&gt;]+'</span> <span style="color: #ff0000;">'{print $2}'</span></pre></div></div>
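<p>The two header scripts are meant to be chained. The sketch below shows that pipeline on an invented message whose address uses the "address (Name)" form, a case the angle-bracket awk splitter does not cover; the function names and the sample message are assumptions made for illustration.</p>

```shell
# Hypothetical message: header lines, a blank line, then the body.
msg() {
  printf '%s\n' 'From: alice@example.com (Alice)' 'Subject: hi' '' 'body text'
}

# Prefer a Reply-To: line; otherwise print the From: line once the
# blank line that ends the header is reached.
reply_line() {
  awk '/^Reply-To:/{print;exit}/^From:/{F=$0}/^$/{print F;exit}'
}

# Strip the "(Name)" comment, anything after a closing bracket, and
# everything up to the last colon or opening bracket.
bare_addr() { sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'; }

msg | reply_line | bare_addr   # prints alice@example.com
```

<p>Stopping at the blank line keeps body text that happens to start with "Reply-To:" from being picked up.</p>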

<p> # Add an angle bracket and a space to the start of each line (quote a message)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/^/&gt; /'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{print &quot;&gt; &quot; $0}'</span></pre></div></div>

<p> # Remove the leading angle bracket and space from each line (unquote a message)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/^&gt; //'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{sub(/^&gt; /,&quot;&quot;);print}'</span></pre></div></div>

<p> # Remove most HTML tags (including tags that span several lines)</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> :a <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'s/&lt;[^&gt;]*&gt;//g;/&lt;/N;//ba'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{gsub(/&lt;[^&gt;]*&gt;/,&quot;&quot;,$0);print}'</span>  <span style="color: #666666; font-style: italic;"># only handles tags that open and close on the same line</span></pre></div></div>

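<p> # Decode a uuencoded file that was split into several parts: remove the extra header information and keep only the uuencoded portion.<br />
 # The files must be passed to sed in the proper order. The first version below can be typed directly on the command line;<br />
 # the second can be placed in an executable shell script. (Modified from a script<br />
 # by Rahul Dhesi.)</p>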

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^end/,/^begin/d'</span> file1 file2 ... fileX <span style="color: #000000; font-weight: bold;">|</span> uudecode   <span style="color: #666666; font-style: italic;"># vers. 1</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/^end/,/^begin/d'</span> <span style="color: #ff0000;">&quot;$@&quot;</span> <span style="color: #000000; font-weight: bold;">|</span> uudecode                    <span style="color: #666666; font-style: italic;"># vers. 2</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #666666; font-style: italic;"># I did not want to install uudecode to verify this, so it is only a rough equivalent that mirrors the sed range deletion</span>
<span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'/^end/{F=1}{if(!F)print}/^begin/{F=0}'</span> file1 file2 ... fileX</pre></div></div>

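<p> # Sort the paragraphs of a file alphabetically. Paragraphs are separated by one or more blank lines. GNU sed uses<br />
 # "\v" for the vertical tab, which serves here as a placeholder for the newline; any other<br />
 # character that does not occur in the file would work just as well.</p>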

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'/./{H;d;};x;s/\n/={NL}=/g'</span> <span style="color: #c20cb9; font-weight: bold;">file</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sort</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'1s/={NL}=//;s/={NL}=/\n/g'</span>
gsed <span style="color: #ff0000;">'/./{H;d};x;y/\n/\v/'</span> <span style="color: #c20cb9; font-weight: bold;">file</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sort</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'1s/\v//;y/\v/\n/'</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #660033;">-vRS</span>= <span style="color: #ff0000;">'{gsub(/\n/,&quot;\v&quot;,$0);print}'</span> file <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sort</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">awk</span> <span style="color: #ff0000;">'{gsub(/\v/,&quot;\n&quot;,$0);print;print &quot;&quot;}'</span></pre></div></div>

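<p> # Zip each .TXT file individually, delete the original after compressing it, and give<br />
 # the .ZIP file the same name as the original (only the extension differs). (DOS environment: "dir /b"<br />
 # lists file names without their paths.)</p>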

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #7a0874; font-weight: bold;">echo</span> <span style="color: #000000; font-weight: bold;">@</span><span style="color: #7a0874; font-weight: bold;">echo</span> off <span style="color: #000000; font-weight: bold;">&gt;</span>zipup.bat
<span style="color: #c20cb9; font-weight: bold;">dir</span> <span style="color: #000000; font-weight: bold;">/</span>b <span style="color: #000000; font-weight: bold;">*</span>.txt <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">&quot;s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/&quot;</span> <span style="color: #000000; font-weight: bold;">&gt;&gt;</span>zipup.bat</pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">Skipping the DOS environment once again; besides, I think <span style="color: #c20cb9; font-weight: bold;">bash</span> parameter expansion, <span style="color: #800000;">${i%.TXT}</span>.zip, would be a neater way to do this.</pre></div></div>

<p>The remaining sed notes are skipped here; readers who need them can consult the original text.</p>
]]></content:encoded>
			<wfw:commentRss>http://luy.li/2009/12/07/sed_awk/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Python built-in functions</title>
		<link>http://luy.li/2009/11/29/python-built-in-functions/</link>
		<comments>http://luy.li/2009/11/29/python-built-in-functions/#comments</comments>
		<pubDate>Sun, 29 Nov 2009 07:08:45 +0000</pubDate>
		<dc:creator>bones7456</dc:creator>
				<category><![CDATA[精华]]></category>
		<category><![CDATA[编程相关]]></category>

		<guid isPermaLink="false">http://li2z.cn/?p=1107</guid>
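		<description><![CDATA[Note: everything in this article comes from the official Python documentation, but filtered through my own understanding rather than translated word for word. The article is long; if you spot any mistakes, please point them out. abs(x) Returns the absolute value of x; when x is a complex number, returns its modulus. That's right, Python has built-in support for complex numbers; see the complex() function below. all(iterable) Returns True only when every item in iterable is true. Equivalent to: def all&#40;iterable&#41;: for element in iterable: if not element: return False return True any(iterable) Returns True as soon as any item in iterable is true. Equivalent to: def any&#40;iterable&#41;: for element in iterable: if element: return True return False basestring() The abstract superclass of str and unicode. It cannot be called or instantiated, but it can be used with isinstance: isinstance(obj, basestring) is equivalent to isinstance(obj, (str, unicode)). &#62;&#62;&#62; isinstance&#40;123, basestring&#41; False &#62;&#62;&#62; isinstance&#40;&#34;123&#34;, basestring&#41; True &#62;&#62;&#62; isinstance&#40;u&#34;一二三&#34;, [...]]]></description>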
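			<content:encoded><![CDATA[<p>Note: everything in this article comes from the <a href="http://docs.python.org/library/functions.html">official Python documentation</a>, but filtered through my own understanding rather than translated word for word. The article is long; if you spot any mistakes, please point them out.</p>
<p><strong>abs</strong>(<em>x</em>)<br />
Returns the absolute value of <em>x</em>; when <em>x</em> is a complex number, returns its modulus. That's right, Python has built-in support for complex numbers; see the <strong>complex</strong>() function below.</p>
<p><strong>all</strong>(<em>iterable</em>)<br />
Returns True only when every item in <em>iterable</em> is true (an empty <em>iterable</em> also gives True). Equivalent to:</p>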

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #008000;">all</span><span style="color: black;">&#40;</span>iterable<span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">for</span> element <span style="color: #ff7700;font-weight:bold;">in</span> iterable:
        <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> element:
            <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">False</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">True</span></pre></div></div>

<p><strong>any</strong>(<em>iterable</em>)<br />
Returns True as soon as any item in <em>iterable</em> is true (an empty <em>iterable</em> gives False). Equivalent to:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #008000;">any</span><span style="color: black;">&#40;</span>iterable<span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">for</span> element <span style="color: #ff7700;font-weight:bold;">in</span> iterable:
        <span style="color: #ff7700;font-weight:bold;">if</span> element:
            <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">True</span>
    <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">False</span></pre></div></div>
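<p>To see the two loops above in action, here is a minimal sketch. The list <code>numbers</code> and the result names are made up for this illustration; the empty-iterable results follow directly from the equivalent loops, whose bodies never run.</p>

```python
# Both functions accept any iterable, including generator expressions.
numbers = [2, 4, 6, 7]

all_even = all(n % 2 == 0 for n in numbers)   # False: 7 is odd
any_odd = any(n % 2 == 1 for n in numbers)    # True: 7 is odd

# With no elements the loop body never runs,
# so all() falls through to True and any() to False.
all_empty = all([])
any_empty = any([])
```

<p>Both functions stop at the first element that decides the answer, so they are cheap even on long iterables.</p>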

<p><strong>basestring</strong>()<br />
The abstract superclass of str and unicode. It cannot be called or instantiated, but it can be used with <strong>isinstance</strong>: isinstance(obj, basestring) is equivalent to isinstance(obj, (str, unicode)).</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">123</span>, <span style="color: #008000;">basestring</span><span style="color: black;">&#41;</span>
<span style="color: #008000;">False</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;123&quot;</span>, <span style="color: #008000;">basestring</span><span style="color: black;">&#41;</span>
<span style="color: #008000;">True</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span>u<span style="color: #483d8b;">&quot;一二三&quot;</span>, <span style="color: #008000;">basestring</span><span style="color: black;">&#41;</span>
<span style="color: #008000;">True</span></pre></div></div>

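<p><strong>bin</strong>(<em>x</em>)<br />
Converts the integer <em>x</em> to a binary string; the result is a valid Python expression equal to <em>x</em>. If <em>x</em> is not an integer type, its class must define an <strong>__index__</strong>() method that returns an integer.</p>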
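<p>A small sketch of both cases. The <code>Flags</code> class is hypothetical and exists only to show the <code>__index__()</code> hook; it is not part of any library.</p>

```python
class Flags(object):
    """Hypothetical class used only to illustrate __index__()."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # bin() calls this to obtain a real integer.
        return self.value


plain = bin(10)            # '0b1010'
negative = bin(-10)        # '-0b1010'
custom = bin(Flags(5))     # '0b101'

# The string is a valid Python expression, so it evaluates back to the number.
round_trip = eval(plain)   # 10
```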
<p><strong>bool</strong>([<em>x</em>])<br />
Returns a Boolean value: False if <em>x</em> is false or omitted, True otherwise.</p>
<p><strong>callable</strong>(<em>object</em>)<br />
Tests whether <em>object</em> is callable. Returns True if <em>object</em> is a function, a class, or an instance of a class that defines <strong>__call__</strong>().</p>
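<p>A quick sketch covering each case. <code>Greeter</code>, <code>Plain</code>, and <code>func</code> are made-up names for this example. Note that callable() was absent from Python 3.0 and 3.1 and came back in 3.2.</p>

```python
class Greeter(object):
    """Hypothetical class whose instances are callable."""

    def __call__(self, name):
        return "hello, " + name


class Plain(object):
    """Hypothetical class without __call__()."""


def func():
    pass


results = [
    callable(func),        # a function
    callable(Greeter),     # a class (calling it creates an instance)
    callable(Greeter()),   # an instance whose class defines __call__
    callable(Plain()),     # an instance without __call__
    callable(42),          # an integer
]
greeting = Greeter()("world")
```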
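<p><strong>chr</strong>(<em>i</em>)<br />
Returns a one-character string whose ASCII code is <em>i</em> (0&lt;=<em>i</em>&lt;=255). It is the inverse of <strong>ord</strong>(). For values above 255, where a unicode character is wanted, use <strong>unichr</strong>() instead.</p>
<p><strong>classmethod</strong>(<em>function</em>)<br />
Returns a class method for <em>function</em>. Unlike an instance method, a class method can be called through the class name without creating an instance first. A class method is defined like this:</p>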

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> C:
    @<span style="color: #008000;">classmethod</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> f<span style="color: black;">&#40;</span>cls, arg1, arg2, ...<span style="color: black;">&#41;</span>: ...</pre></div></div>
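<p>The skeleton above leaves the body out, so here is a runnable sketch of one common use, an alternative constructor. The <code>Temperature</code> class and its method names are invented for this illustration.</p>

```python
class Temperature(object):
    """Hypothetical class with a class method as an alternative constructor."""

    def __init__(self, celsius):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # cls is the class itself, so subclasses get instances of their own type.
        return cls((fahrenheit - 32) * 5.0 / 9.0)


# Called on the class; no instance is needed beforehand.
boiling = Temperature.from_fahrenheit(212)

# It can also be called through an instance; cls is still the class.
freezing = boiling.from_fahrenheit(32)
```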

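<p><strong>cmp</strong>(<em>x</em>, <em>y</em>)<br />
Compares the two objects <em>x</em> and <em>y</em>. Returns a negative number if <em>x</em> is less than <em>y</em>, a positive number if it is greater, and 0 if they are equal.</p>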
<p><strong>compile</strong>(<em>source</em>, <em>filename</em>, <em>mode</em>[, <em>flags</em>[, <em>dont_inherit</em>]])<br />
Compiles <em>source</em> into a code object or an <a href="http://docs.python.org/library/ast.html#module-ast">AST object</a>. I have no use for it at the moment, so I am skipping it for now.</p>
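<p><strong>complex</strong>([<em>real</em>[, <em>imag</em>]])<br />
Creates a complex number from the given real and imaginary parts.</p>
<p><strong>delattr</strong>(<em>object</em>, <em>name</em>)<br />
Deletes an attribute of an object; <em>name</em> must be a string. Equivalent to del object.name, and the counterpart of <strong>setattr</strong>.</p>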
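<p>A minimal sketch of delattr working together with setattr and hasattr. The <code>Config</code> class and the attribute name <code>debug</code> are made up for this example.</p>

```python
class Config(object):
    """Hypothetical empty class used as an attribute container."""


cfg = Config()

setattr(cfg, "debug", True)           # same as: cfg.debug = True
had_before = hasattr(cfg, "debug")

delattr(cfg, "debug")                 # same as: del cfg.debug
has_after = hasattr(cfg, "debug")
```

<p>The function forms earn their keep when the attribute name is only known at run time, for example when it comes from a configuration file.</p>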
<p><strong>dict</strong>([<em>arg</em>])<br />
Creates a new dictionary, optionally initialized from the argument.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">dict</span><span style="color: black;">&#40;</span><span style="color: black;">&#123;</span><span style="color: #483d8b;">&quot;a&quot;</span>:<span style="color: #483d8b;">&quot;b&quot;</span>,<span style="color: #483d8b;">&quot;c&quot;</span>:<span style="color: #483d8b;">&quot;d&quot;</span><span style="color: black;">&#125;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#123;</span><span style="color: #483d8b;">'a'</span>: <span style="color: #483d8b;">'b'</span>, <span style="color: #483d8b;">'c'</span>: <span style="color: #483d8b;">'d'</span><span style="color: black;">&#125;</span></pre></div></div>

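<p><strong>dir</strong>([<em>object</em>])<br />
Without an argument, returns the list of names in the current scope.<br />
With an <em>object</em> argument, returns a list of that object's attribute names, gathered by a fairly complicated set of rules. Note that when <em>object</em> defines <strong>__dir__</strong>() or <strong>__getattr__</strong>(), the result is not necessarily accurate or complete.<br />
Example:</p>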

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">dir</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'__builtins__'</span>, <span style="color: #483d8b;">'__doc__'</span>, <span style="color: #483d8b;">'__name__'</span>, <span style="color: #483d8b;">'__package__'</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> t=<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>,<span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">dir</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'__builtins__'</span>, <span style="color: #483d8b;">'__doc__'</span>, <span style="color: #483d8b;">'__name__'</span>, <span style="color: #483d8b;">'__package__'</span>, <span style="color: #483d8b;">'t'</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">dir</span><span style="color: black;">&#40;</span>t<span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'__add__'</span>, <span style="color: #483d8b;">'__class__'</span>, <span style="color: #483d8b;">'__contains__'</span>, <span style="color: #483d8b;">'__delattr__'</span>, <span style="color: #483d8b;">'__delitem__'</span>, <span style="color: #483d8b;">'__delslice__'</span>, <span style="color: #483d8b;">'__doc__'</span>, <span style="color: #483d8b;">'__eq__'</span>, <span style="color: #483d8b;">'__format__'</span>, <span style="color: #483d8b;">'__ge__'</span>, <span style="color: #483d8b;">'__getattribute__'</span>, <span style="color: #483d8b;">'__getitem__'</span>, <span style="color: #483d8b;">'__getslice__'</span>, <span style="color: #483d8b;">'__gt__'</span>, <span style="color: #483d8b;">'__hash__'</span>, <span style="color: #483d8b;">'__iadd__'</span>, <span style="color: #483d8b;">'__imul__'</span>, <span style="color: #483d8b;">'__init__'</span>, <span style="color: #483d8b;">'__iter__'</span>, <span style="color: #483d8b;">'__le__'</span>, <span style="color: #483d8b;">'__len__'</span>, <span style="color: #483d8b;">'__lt__'</span>, <span style="color: #483d8b;">'__mul__'</span>, <span style="color: #483d8b;">'__ne__'</span>, <span style="color: #483d8b;">'__new__'</span>, <span style="color: #483d8b;">'__reduce__'</span>, <span style="color: #483d8b;">'__reduce_ex__'</span>, <span style="color: #483d8b;">'__repr__'</span>, <span style="color: #483d8b;">'__reversed__'</span>, <span style="color: #483d8b;">'__rmul__'</span>, <span style="color: #483d8b;">'__setattr__'</span>, <span style="color: #483d8b;">'__setitem__'</span>, <span style="color: #483d8b;">'__setslice__'</span>, <span style="color: #483d8b;">'__sizeof__'</span>, <span style="color: #483d8b;">'__str__'</span>, <span style="color: #483d8b;">'__subclasshook__'</span>, <span style="color: #483d8b;">'append'</span>, <span style="color: #483d8b;">'count'</span>, <span style="color: #483d8b;">'extend'</span>, <span 
style="color: #483d8b;">'index'</span>, <span style="color: #483d8b;">'insert'</span>, <span style="color: #483d8b;">'pop'</span>, <span style="color: #483d8b;">'remove'</span>, <span style="color: #483d8b;">'reverse'</span>, <span style="color: #483d8b;">'sort'</span><span style="color: black;">&#93;</span></pre></div></div>

<p><strong>divmod</strong>(<em>a</em>, <em>b</em>)<br />
Returns the quotient and remainder of <em>a</em> and <em>b</em> as a tuple, normally <em>(a // b, a % b)</em>. The arguments cannot be complex numbers.</p>
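<p>A few worked values, as a sketch; the numbers are arbitrary. The negative case is the one that tends to surprise people, because the quotient is floored rather than truncated.</p>

```python
quotient, remainder = divmod(17, 5)    # 17 == 3 * 5 + 2

# With a negative operand the quotient is floored (-3.4 becomes -4),
# so the remainder takes the sign of the divisor: -17 == -4 * 5 + 3.
negative_pair = divmod(-17, 5)

# With a float operand the result is a pair of floats.
float_pair = divmod(7.5, 2)
```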
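<p><strong>enumerate</strong>(<em>sequence</em>[, <em>start</em>=0])<br />
Returns an enumerate object; <em>sequence</em> must support iteration. The returned object has a next() method that yields, one at a time, tuples of a count that starts at <em>start</em> and the corresponding element of <em>sequence</em>. See the example below:</p>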

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> enu=<span style="color: #008000;">enumerate</span><span style="color: black;">&#40;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'Spring'</span>, <span style="color: #483d8b;">'Summer'</span>, <span style="color: #483d8b;">'Fall'</span>, <span style="color: #483d8b;">'Winter'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> enu.<span style="color: black;">next</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span>, <span style="color: #483d8b;">'Spring'</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> enu.<span style="color: black;">next</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #483d8b;">'Summer'</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">for</span> i, season <span style="color: #ff7700;font-weight:bold;">in</span> enu:
...     <span style="color: #ff7700;font-weight:bold;">print</span> i, season
... 
<span style="color: #ff4500;">2</span> Fall
<span style="color: #ff4500;">3</span> Winter</pre></div></div>
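<p>The session above never uses the <em>start</em> argument, so here is a short sketch of it. It uses the built-in next() function, which works on Python 2.6 and later as well as Python 3; the <code>.next()</code> method shown above exists only on Python 2.</p>

```python
seasons = ["Spring", "Summer", "Fall", "Winter"]

# Default numbering starts at 0.
from_zero = list(enumerate(seasons))

# With start=1 only the counter changes; the elements stay the same.
from_one = list(enumerate(seasons, start=1))

# start may also be passed positionally.
first = next(enumerate(seasons, 10))
```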

<p><strong>eval</strong>(<em>expression</em>[, <em>globals</em>[, <em>locals</em>]])<br />
Evaluates the expression <em>expression</em>; <em>globals</em> and <em>locals</em> can be used to restrict which variables <em>expression</em> can access.<br />
Note that <em>expression</em> can be not only a plain string but also a code object returned by compile().</p>
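<p>Here is a sketch of narrowing the namespace. The dictionary names and expressions are made up for this example, and emptying <code>__builtins__</code> is shown only to make the restriction visible; it should not be mistaken for a security sandbox.</p>

```python
# Only the names placed in these dicts are visible to the expression.
allowed_globals = {"__builtins__": {}}
allowed_locals = {"x": 3, "y": 4}

result = eval("x * x + y * y", allowed_globals, allowed_locals)

# A code object produced by compile() is accepted as well.
# "expr" is just a label used in tracebacks.
code = compile("x + y", "expr", "eval")
result_from_code = eval(code, allowed_globals, allowed_locals)

# Names outside the supplied namespaces are not reachable.
try:
    eval("len('abc')", allowed_globals, allowed_locals)
    len_visible = True
except NameError:
    len_visible = False
```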
<p><strong>execfile</strong>(<em>filename</em>[, <em>globals</em>[, <em>locals</em>]])<br />
Similar to the <a href="http://docs.python.org/reference/simple_stmts.html#exec">exec</a> statement, except that it reads the code from a file. It differs from import in that execfile reads the file unconditionally and does not create a new module.<br />
<em>globals</em> and <em>locals</em> work the same way as for eval above.</p>
<p><strong>file</strong>(<em>filename</em>[, <em>mode</em>[, <em>bufsize</em>]])<br />
Constructor for the <a href="http://docs.python.org/library/stdtypes.html#bltin-file-objects">file type</a>; its arguments work the same way as those of the open() function described below.<br />
Note that open() is the better choice for opening a file, while file is better suited to type tests, for example: isinstance(f, file)</p>
<p><strong>filter</strong>(<em>function</em>, <em>iterable</em>)<br />
Builds a list of those elements of <em>iterable</em> for which <em>function</em> returns true. If <em>iterable</em> is a string or a tuple, the result has that same type; otherwise it is a list.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">filter</span><span style="color: black;">&#40;</span><span style="color: #ff7700;font-weight:bold;">lambda</span> c: c <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #483d8b;">'abc'</span>, <span style="color: #483d8b;">'abcdcba'</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'abccba'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">filter</span><span style="color: black;">&#40;</span><span style="color: #ff7700;font-weight:bold;">lambda</span> i: i <span style="color: #66cc66;">&lt;</span> <span style="color: #ff4500;">3</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">filter</span><span style="color: black;">&#40;</span><span style="color: #ff7700;font-weight:bold;">lambda</span> i: i <span style="color: #66cc66;">&lt;</span> <span style="color: #ff4500;">3</span>, <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span></pre></div></div>

<p>If <em>function</em> is None, the elements of <em>iterable</em> that are false are removed. In other words, when <em>function</em> is not None, <code>filter(function, iterable)</code> is equivalent to <code>[item for item in iterable if function(item)]</code>; otherwise it is equivalent to <code>[item for item in iterable if item]</code>.</p>
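<p>The two equivalences can be checked directly. In this sketch <code>items</code> and <code>is_even</code> are made-up names, and the results are wrapped in list() so that the code also runs on Python 3, where filter() returns a lazy iterator instead of a list.</p>

```python
items = [0, 1, 2, 3, 4, 5]


def is_even(n):
    return n % 2 == 0


# list() changes nothing on Python 2 and materializes the iterator on Python 3.
with_function = list(filter(is_even, items))
as_comprehension = [item for item in items if is_even(item)]

# function=None keeps only the items that are true, so 0 is dropped.
with_none = list(filter(None, items))
truthy_comprehension = [item for item in items if item]
```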
<p><strong>float</strong>([<em>x</em>])<br />
Converts a string or a number (an integer or a float) to a float.</p>
<p><strong>format</strong>(<em>value</em>[, <em>format_spec</em>])<br />
Formats <em>value</em> according to <em>format_spec</em>. In effect it simply calls <code>value.__format__(format_spec)</code>; many built-in types share a standard <a href="http://docs.python.org/library/string.html#formatspec">format specification</a>.</p>
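<p>A sketch of both sides of that delegation: a few standard format specs on built-in types, and a class with its own <code>__format__()</code>. The <code>Money</code> class is hypothetical and exists only for this illustration.</p>

```python
fixed = format(3.14159, ".2f")    # two digits after the decimal point
padded = format(42, "08d")        # zero-padded to a width of 8
binary = format(10, "b")          # binary digits without the 0b prefix


class Money(object):
    """Hypothetical class that supplies its own __format__()."""

    def __init__(self, amount):
        self.amount = amount

    def __format__(self, format_spec):
        # Reuse the float formatting rules and add a currency sign.
        return "$" + format(self.amount, format_spec)


custom = format(Money(1234.5), ".2f")
```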
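<p><strong>frozenset</strong>([<em>iterable</em>])<br />
Creates a <a href="http://docs.python.org/library/stdtypes.html#types-set">frozenset object</a> from <em>iterable</em>. A frozenset is the immutable counterpart of set (it is not actually a subclass of set): it lacks the operations that would modify the set, such as add, remove, pop, and clear. Think of it as a constant set.</p>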
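<p>A short sketch of what immutability buys you. The variable names and values are made up; the point is that a frozenset is hashable and can therefore be used as a dictionary key, which a plain set cannot.</p>

```python
fs = frozenset([1, 2, 2, 3])

# Mutating methods such as add() simply do not exist on a frozenset.
can_add = hasattr(fs, "add")

# Being immutable, a frozenset is hashable and can serve as a dict key.
lookup = {fs: "small numbers"}

# A plain set is mutable, hence unhashable.
try:
    {set([1, 2]): "x"}
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

# Non-mutating operations still work and return new frozensets.
union = fs | frozenset([4])
```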
<p><strong>getattr</strong>(<em>object</em>, <em>name</em>[, <em>default</em>])<br />
Returns the value of an attribute of an object; <em>name</em> must be a string. If <em>name</em> is an attribute of <em>object</em>, then <code>getattr(x, 'foobar')</code> is equivalent to <code>x.foobar</code>. If it is not, <em>default</em> is returned, or, when no <em>default</em> was given, an AttributeError exception is raised.</p>
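<p>The three outcomes side by side, as a sketch. The <code>Point</code> class and its attribute are invented for this example.</p>

```python
class Point(object):
    """Hypothetical class with a single attribute."""

    def __init__(self):
        self.x = 10


p = Point()

existing = getattr(p, "x")       # same as p.x
fallback = getattr(p, "y", 0)    # no attribute y, so the default is returned

try:
    getattr(p, "y")              # no default given: AttributeError
    raised = False
except AttributeError:
    raised = True
```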
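<p><strong>globals</strong>()<br />
Returns a dict representing the current global symbol table.</p>
<p><strong>hasattr</strong>(<em>object</em>, <em>name</em>)<br />
Takes an object and a string. Returns True if <em>object</em> has an attribute named <em>name</em>, False otherwise. It can be used to check that an attribute exists before calling <code>getattr(object, name)</code>.</p>
<p><strong>hash</strong>(<em>object</em>)<br />
Returns the hash value of <em>object</em>, if it has one. A hash value is an integer used to compare two objects quickly. Numeric values that compare equal have the same hash value, for example:</p>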

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">hash</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span> == <span style="color: #008000;">hash</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1.0</span><span style="color: black;">&#41;</span>
<span style="color: #008000;">True</span></pre></div></div>

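<p><strong>help</strong>([<em>object</em>])<br />
Invokes the built-in help system (intended for interactive use).<br />
Without an argument, it starts the help console with a <code>help></code> prompt; type a topic there to see its help.<br />
If the argument is a string, it is looked up as the name of a module, function, class, method, keyword, or documentation topic, and the matching help is shown.<br />
If the argument is any other kind of object, the help for that object is shown.</p>
<p><strong>hex</strong>(<em>x</em>)<br />
Converts an integer of any size to a hexadecimal string.<br />
To get the hexadecimal form of a float, use the <code>float.hex()</code> method instead.</p>
<p><strong>id</strong>(<em>object</em>)<br />
Returns the identity of <em>object</em>: an integer (or long integer) that is unique to the object for as long as it lives. Note: two objects whose lifetimes do not overlap may end up with the same identity. (In CPython this is simply the address of <em>object</em>.)</p>
<p><strong>input</strong>([<em>prompt</em>])<br />
Equivalent to <code>eval(raw_input(prompt))</code><br />
Returns the value of the Python expression typed by the user. In short: mind the security implications, because whatever the user types gets evaluated.</p>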
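<p><strong>int</strong>([<em>x</em>[, <em>base</em>]])<br />
Returns an integer built from <em>x</em>, which can be a string containing a number or a numeric value (integer, long integer, or float; a complex number raises TypeError). Floats are truncated toward zero. The optional <em>base</em> argument gives the radix and may be 0 or a number from 2 to 36; it can only be used when <em>x</em> is a string. If <em>base</em> is 0, a <a href="http://docs.python.org/reference/lexical_analysis.html#numbers">suitable</a> radix is chosen from the prefix of <em>x</em>. With no arguments, returns 0.</p>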
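<p>Some worked conversions, as a sketch with arbitrary values. The <code>base=0</code> lines rely on the prefix of the string (<code>0x</code>, <code>0b</code>) to pick the radix.</p>

```python
from_decimal = int("42")
from_float = int(3.99)           # truncated toward zero, not rounded
from_negative = int(-3.99)       # also toward zero
from_hex = int("ff", 16)
from_binary = int("1010", 2)
auto_hex = int("0x1f", 0)        # radix taken from the 0x prefix
auto_binary = int("0b101", 0)    # radix taken from the 0b prefix
no_argument = int()
```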
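<p><strong>isinstance</strong>(<em>object</em>, <em>classinfo</em>)<br />
Returns True if <em>object</em> is an instance of <em>classinfo</em> or of a direct or indirect subclass of it; when <em>classinfo</em> is a type object, returns True if <em>object</em> is an object of that type. <em>classinfo</em> can also be a tuple of classes or types, in which case True is returned as soon as <em>object</em> matches any one of them:</p>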

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: black;">&#40;</span><span style="color: #008000;">int</span>,<span style="color: #008000;">float</span><span style="color: black;">&#41;</span> <span style="color: black;">&#41;</span>
<span style="color: #008000;">True</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1.0</span>, <span style="color: black;">&#40;</span><span style="color: #008000;">int</span>,<span style="color: #008000;">float</span><span style="color: black;">&#41;</span> <span style="color: black;">&#41;</span>
<span style="color: #008000;">True</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;1.0&quot;</span>, <span style="color: black;">&#40;</span><span style="color: #008000;">int</span>,<span style="color: #008000;">float</span><span style="color: black;">&#41;</span> <span style="color: black;">&#41;</span>
<span style="color: #008000;">False</span></pre></div></div>

<p><strong>issubclass</strong>(<em>class</em>, <em>classinfo</em>)<br />
Returns True if <em>class</em> is a direct or indirect subclass of <em>classinfo</em>. A class is considered a subclass of itself. As in the previous example, <em>classinfo</em> can also be a tuple.</p>
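<p>A sketch of the direct, indirect, and tuple cases. The four classes are hypothetical and only form a tiny hierarchy for the checks.</p>

```python
class Base(object):
    """Hypothetical base class."""


class Child(Base):
    """Hypothetical direct subclass of Base."""


class GrandChild(Child):
    """Hypothetical indirect subclass of Base."""


class Unrelated(object):
    """Hypothetical class outside the hierarchy."""


direct = issubclass(Child, Base)
indirect = issubclass(GrandChild, Base)
itself = issubclass(Base, Base)
unrelated = issubclass(Unrelated, Base)
in_tuple = issubclass(GrandChild, (Unrelated, Base))

# isinstance() follows the same hierarchy, but starts from an instance.
instance_check = isinstance(GrandChild(), Base)
```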
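<p><strong>iter</strong>(<em>o</em>[, <em>sentinel</em>])<br />
Returns an iterator object; how the first argument is read depends on whether <em>sentinel</em> is given. Without the second argument, <em>o</em> must be an object that supports <a href="http://docs.python.org/reference/datamodel.html#object.__iter__">__iter__()</a> or <a href="http://docs.python.org/reference/datamodel.html#object.__getitem__">__getitem__()</a>, otherwise a TypeError exception is raised. If <em>sentinel</em> is given, <em>o</em> must be a callable object: the iterator calls it repeatedly and yields each return value until a returned value equals <em>sentinel</em>, at which point StopIteration is raised.<br />
The second form is particularly handy for opening a file and processing it line by line until a specific line is reached:</p>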

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">with</span> <span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;mydata.txt&quot;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">as</span> fp:
    # readline() keeps the trailing newline, so the sentinel includes it
    <span style="color: #ff7700;font-weight:bold;">for</span> line <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">iter</span><span style="color: black;">&#40;</span>fp.<span style="color: #dc143c;">readline</span>, <span style="color: #483d8b;">&quot;STOP\n&quot;</span><span style="color: black;">&#41;</span>:
        process_line<span style="color: black;">&#40;</span>line<span style="color: black;">&#41;</span></pre></div></div>
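<p>The snippet above needs a real file and a <code>process_line</code> function, so here is a self-contained sketch of the same sentinel form. It reads from an in-memory <code>io.StringIO</code> whose contents are made up. One detail worth knowing: readline() returns each line with its trailing newline, so the sentinel has to include the newline to match.</p>

```python
import io

# An in-memory stand-in for a text file (hypothetical contents).
fp = io.StringIO(u"alpha\nbeta\nSTOP\ngamma\n")

# iter() calls fp.readline() again and again until it returns the sentinel.
lines = [line.strip() for line in iter(fp.readline, u"STOP\n")]

# Whatever follows the sentinel line has not been read yet.
rest = fp.read()
```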

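<p><strong>len</strong>(<em>s</em>)<br />
Returns the length of <em>s</em>, that is, the number of items. Internally it calls the object's __len__ method.</p>
<p><strong>list</strong>([<em>iterable</em>])<br />
Returns a list containing all the elements of <em>iterable</em>. With no argument, returns an empty list.</p>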
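<p><strong>locals</strong>()<br />
The counterpart of <code>globals()</code> above: returns a dict representing the current local symbol table. Free variables are included when it is called inside a function block, but not inside a class block.</p>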
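<p><strong>long</strong>([<em>x</em>[, <em>base</em>]])<br />
Returns a long integer built from a string or a number. The arguments have the same meaning as for <code>int</code> above.</p>
<p><strong>map</strong>(<em>function</em>, <em>iterable</em>, <em>&#8230;</em>)<br />
Applies <em>function</em> to every item of <em>iterable</em> and returns the results as a list. If three or more arguments are passed, the extra ones must be iterable too, and <em>function</em> must accept that many arguments: map takes one item from each iterable in parallel and passes them to <em>function</em> together. For example, this adds two tuples element by element and returns a list:</p>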

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">map</span><span style="color: black;">&#40;</span><span style="color: #ff7700;font-weight:bold;">lambda</span> x, add: x + add, <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#41;</span> <span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">7</span>, <span style="color: #ff4500;">11</span><span style="color: black;">&#93;</span></pre></div></div>

<p>If the iterables differ in length, the missing items are filled in with None, which is why the lambda fails here:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">map</span><span style="color: black;">&#40;</span><span style="color: #ff7700;font-weight:bold;">lambda</span> x, add: x + add, <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">3</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
Traceback <span style="color: black;">&#40;</span>most recent call last<span style="color: black;">&#41;</span>:
  File <span style="color: #483d8b;">&quot;&lt;stdin&gt;&quot;</span>, line <span style="color: #ff4500;">1</span>, <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #66cc66;">&lt;</span>module<span style="color: #66cc66;">&gt;</span>
  File <span style="color: #483d8b;">&quot;&lt;stdin&gt;&quot;</span>, line <span style="color: #ff4500;">1</span>, <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #66cc66;">&lt;</span>lambda<span style="color: #66cc66;">&gt;</span>
<span style="color: #008000;">TypeError</span>: unsupported operand <span style="color: #008000;">type</span><span style="color: black;">&#40;</span>s<span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">for</span> +: <span style="color: #483d8b;">'int'</span> <span style="color: #ff7700;font-weight:bold;">and</span> <span style="color: #483d8b;">'NoneType'</span></pre></div></div>

<p>If <em>function</em> is None, the identity function is used instead, so the items come out exactly as they went in.</p>
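<p>For readers on Python 3, a sketch of the same element-wise addition that runs there too. The tuple names are made up. Keep in mind that on Python 3 map() returns a lazy iterator, hence the list() calls, and it stops at the shortest iterable instead of padding with None.</p>

```python
first = (2, 4, 5)
second = (1, 3, 6)

# list() changes nothing on Python 2 and materializes the map object on Python 3.
sums = list(map(lambda x, add: x + add, first, second))

# The same pairing written with zip() and a list comprehension.
sums_zip = [x + add for x, add in zip(first, second)]

# A single iterable with an existing function.
lengths = list(map(len, ["a", "bb", "ccc"]))
```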
<p><strong>max</strong>(<em>iterable</em>[, <em>args</em>...][, <em>key</em>])<br />
With a single argument, returns the largest item of <em>iterable</em>; with several arguments, returns the largest of the arguments.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">max</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;abcd&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'d'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">max</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">3</span></pre></div></div>

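<p>The optional key argument is a one-argument function that computes the value each item is compared by. For example, this returns the item with the largest remainder when divided by 3:</p>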

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">max</span><span style="color: black;">&#40;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span>, key=<span style="color: #ff7700;font-weight:bold;">lambda</span> x: x <span style="color: #66cc66;">%</span> <span style="color: #ff4500;">3</span><span style="color: black;">&#41;</span>
<span style="color: #ff4500;">2</span></pre></div></div>

<p><strong>min</strong>(<em>iterable</em>[, <em>args</em>...][, <em>key</em>])<br />
Same as above, but returns the smallest item.</p>
<p><strong>next</strong>(<em>iterator</em>[, <em>default</em>])<br />
Returns the next item of the iterator <em>iterator</em> on each call. When <em>iterator</em> has no more items, <em>default</em> is returned if it was given; otherwise a StopIteration exception is raised.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> a = <span style="color: #008000;">iter</span><span style="color: black;">&#40;</span><span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">3</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> next<span style="color: black;">&#40;</span>a<span style="color: black;">&#41;</span>
<span style="color: #ff4500;">0</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> next<span style="color: black;">&#40;</span>a<span style="color: black;">&#41;</span>
<span style="color: #ff4500;">1</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> next<span style="color: black;">&#40;</span>a<span style="color: black;">&#41;</span>
<span style="color: #ff4500;">2</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> next<span style="color: black;">&#40;</span>a<span style="color: black;">&#41;</span>
Traceback <span style="color: black;">&#40;</span>most recent call last<span style="color: black;">&#41;</span>:
  File <span style="color: #483d8b;">&quot;&lt;stdin&gt;&quot;</span>, line <span style="color: #ff4500;">1</span>, <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #66cc66;">&lt;</span>module<span style="color: #66cc66;">&gt;</span>
<span style="color: #008000;">StopIteration</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> next<span style="color: black;">&#40;</span>a, <span style="color: #483d8b;">&quot;No More Item...&quot;</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'No More Item...'</span></pre></div></div>

<p><strong>object</strong>()<br />
Returns a new, featureless object. It still carries the attributes common to all objects:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> o = <span style="color: #008000;">object</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">dir</span><span style="color: black;">&#40;</span>o<span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #483d8b;">'__class__'</span>, <span style="color: #483d8b;">'__delattr__'</span>, <span style="color: #483d8b;">'__doc__'</span>, <span style="color: #483d8b;">'__format__'</span>, <span style="color: #483d8b;">'__getattribute__'</span>, <span style="color: #483d8b;">'__hash__'</span>, <span style="color: #483d8b;">'__init__'</span>, <span style="color: #483d8b;">'__new__'</span>, <span style="color: #483d8b;">'__reduce__'</span>, <span style="color: #483d8b;">'__reduce_ex__'</span>, <span style="color: #483d8b;">'__repr__'</span>, <span style="color: #483d8b;">'__setattr__'</span>, <span style="color: #483d8b;">'__sizeof__'</span>, <span style="color: #483d8b;">'__str__'</span>, <span style="color: #483d8b;">'__subclasshook__'</span><span style="color: black;">&#93;</span></pre></div></div>

<p><strong>oct</strong>(<em>x</em>)<br />
Converts an integer <em>x</em> of any size to an octal string.</p>
<p><strong>open</strong>(<em>filename</em>[, <em>mode</em>[, <em>bufsize</em>]])<br />
打开文件，返回一个<a href="http://docs.python.org/library/stdtypes.html#bltin-file-objects">文件对象</a>，如果文件打不开，将抛出<code>IOError</code>错误。<br />
<em>filename</em>参数，是要打开的文件名。<br />
<em>mode</em>参数是打开方式，通常是<code>'r'</code>表示读，<code>'w'</code>表示写（如果已存在则会覆盖），<code>'a'</code>表示追加。缺省为<code>'r'</code>。另外，缺省使用的是文本模式，会把<code>'\n'</code>转成系统相关的换行符，如果要避免这个引起的问题，需要在各个模式后面加一个<code>'b'</code>表示使用二进制模式。另外还有些&#8217;+uU&#8217;之类的模式，不常用，也就不介绍了吧。<br />
可选的<em>bufsize</em>参数表示缓冲区的大小。0表示不缓冲，1表示行缓冲，其他正数表示近视的缓冲区字节数，负数表示使用系统默认值。默认是0。</p>
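<p>A minimal sketch of the three common modes plus binary mode. The scratch file name <code>demo.txt</code> and the temporary directory are assumptions made up for this illustration, and only the <em>mode</em> argument is used so that the snippet behaves the same on Python 2.6 and Python 3:</p>

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as f:      # 'w' creates the file or overwrites an existing one
    f.write("first\n")

with open(path, "a") as f:      # 'a' appends to the end
    f.write("second\n")

with open(path, "r") as f:      # 'r' is the default mode
    lines = f.read().splitlines()

with open(path, "rb") as f:     # 'b' skips newline translation and returns the raw bytes
    raw = f.read()

print(lines)
```

<p>On Windows the raw bytes would contain <code>\r\n</code> line endings, which is exactly the translation that binary mode lets you see (and avoid).</p>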
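<p><strong>ord</strong>(<em>c</em>)<br />
Given a string of length one or a single unicode character, returns that character's ASCII code or Unicode code point. In the first case it is the inverse of <code>chr()</code>; in the second it is the inverse of <code>unichr()</code>.</p>
<p><strong>pow</strong>(<em>x</em>, <em>y</em>[, <em>z</em>])<br />
Returns <em>x</em> to the power <em>y</em>, that is, <code>x**y</code>. If <em>z</em> is given, returns <em>x</em> to the power <em>y</em>, modulo <em>z</em>. This is more efficient than <code>pow(x, y) % z</code>; see <a href="http://code.google.com/p/nrciz/source/browse/trunk/projecteuler/048/048-bones7456.py?spec=svn195&#038;r=195">my code for problem 48</a> of <a href="http://luy.li/2009/10/26/projecteuler/">Project Euler</a>, which used to be slow and is now fast.<br />
If the second argument is negative, the result is a float, and in that case <em>z</em> must not be given.</p>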
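<p>A small sketch of the two-argument and three-argument forms. The numbers are arbitrary values picked for illustration and are not taken from the Project Euler solution linked above:</p>

```python
base, exponent, modulus = 7, 10 ** 5, 10 ** 10

fast = pow(base, exponent, modulus)     # reduces modulo `modulus` while multiplying
slow = pow(base, exponent) % modulus    # builds the whole huge integer first, then reduces

print(fast == slow)

negative = pow(2, -2)                   # a negative exponent gives a float: 0.25
```

<p>Both forms give the same answer; the three-argument form simply never has to hold the full tens-of-thousands-of-digits intermediate value.</p>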
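<p><strong>print</strong>([<em>object</em>, <em>...</em>][, <em>sep</em>=' '][, <em>end</em>='\n'][, <em>file</em>=<em>sys.stdout</em>])<br />
Writes one or more <em>object</em>s to <em>file</em>, separated by <em>sep</em> and followed by <em>end</em>.<br />
The last three arguments, if given, must be passed as keyword arguments, i.e. with the parameter name spelled out; otherwise they are treated as more <em>object</em>s and printed.<br />
Note the difference from the print statement of earlier versions: in Python 2.6 the name <code>print</code> is still the statement by default, and the function only becomes available after <code>from __future__ import print_function</code>.</p>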
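<p>A quick sketch of the keyword-only behaviour. It assumes Python 3, where <code>print</code> is always a function; the <code>io.StringIO</code> buffer is only a stand-in for any object with a <code>write()</code> method:</p>

```python
import io

buffer = io.StringIO()   # stand-in for a real file

# sep, end and file are given by name, so they control the formatting.
print("a", "b", "c", sep="-", end="!\n", file=buffer)

# "-" given positionally is just one more object to print.
print("a", "b", "-", file=buffer)

output = buffer.getvalue()
```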
<p><strong>property</strong>([<em>fget</em>[, <em>fset</em>[, <em>fdel</em>[, <em>doc</em>]]]])<br />
Returns a property attribute. The arguments are the getter, setter and deleter functions, plus a doc string. Here is an example:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">class</span> C<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:
...     <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
...         <span style="color: #008000;">self</span>._x = <span style="color: #008000;">None</span>
...     <span style="color: #ff7700;font-weight:bold;">def</span> getx<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
...         <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;OK. give you:&quot;</span>, <span style="color: #008000;">self</span>._x
...         <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">self</span>._x
...     <span style="color: #ff7700;font-weight:bold;">def</span> setx<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, value<span style="color: black;">&#41;</span>:
...         <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;Now x is:&quot;</span>, value
...         <span style="color: #008000;">self</span>._x = value
...     <span style="color: #ff7700;font-weight:bold;">def</span> delx<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
...         <span style="color: #ff7700;font-weight:bold;">del</span> <span style="color: #008000;">self</span>._x
...     <span style="color: black;">x</span> = <span style="color: #008000;">property</span><span style="color: black;">&#40;</span>getx, setx, delx, <span style="color: #483d8b;">&quot;I'm the 'x' property.&quot;</span><span style="color: black;">&#41;</span>
... 
<span style="color: #66cc66;">&gt;&gt;&gt;</span> a = C<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> a.<span style="color: black;">x</span> = <span style="color: #ff4500;">123</span>
Now x <span style="color: #ff7700;font-weight:bold;">is</span>: <span style="color: #ff4500;">123</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> a.<span style="color: black;">x</span>
OK. <span style="color: black;">give</span> you: <span style="color: #ff4500;">123</span>
<span style="color: #ff4500;">123</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">help</span><span style="color: black;">&#40;</span>a.<span style="color: black;">x</span><span style="color: black;">&#41;</span>
OK. <span style="color: black;">give</span> you: <span style="color: #ff4500;">123</span>
&nbsp;
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">help</span><span style="color: black;">&#40;</span>C.<span style="color: black;">x</span><span style="color: black;">&#41;</span>
<span style="color: #808080; font-style: italic;">#Here you can see: I'm the 'x' property.</span></pre></div></div>
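<p>The same idea can also be written with decorators, which Python 2.6 and later support through the <code>setter</code> and <code>deleter</code> attributes of a property. This is only a sketch; the class <code>Temperature</code> and its attribute names are made up for illustration:</p>

```python
class Temperature(object):
    """Hypothetical class used only for illustration."""

    def __init__(self):
        self._celsius = None

    @property
    def celsius(self):
        """I'm the 'celsius' property."""   # becomes the property's doc string
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        self._celsius = float(value)

    @celsius.deleter
    def celsius(self):
        del self._celsius


t = Temperature()
t.celsius = 21          # goes through the setter, which stores 21.0
```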

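<p><strong>range</strong>([<em>start</em>], <em>stop</em>[, <em>step</em>])<br />
A convenient way to build a list containing an arithmetic progression. If <em>start</em> is omitted it defaults to 0; if <em>step</em> is omitted it defaults to 1. It is often used in for loops. Note that the result does not include <em>stop</em>.</p>
<p><strong>raw_input</strong>([<em>prompt</em>])<br />
Reads one line from input and returns it as a string, with the trailing newline removed. If <em>prompt</em> is given, it is shown as the input prompt.</p>
<p><strong>reduce</strong>(<em>function</em>, <em>iterable</em>[, <em>initializer</em>])<br />
Applies the two-argument <em>function</em> cumulatively to the items of the iterable, from left to right. For example, <code>reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])</code> computes <code>((((1+2)+3)+4)+5)</code>. If the optional <em>initializer</em> is given, it is placed in front of the items of the iterable in the calculation.</p>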
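<p>A short sketch of how the <em>initializer</em> changes the calculation. The import is an assumption about your interpreter: <code>reduce</code> is a plain built-in on Python 2, while <code>functools.reduce</code> works on both Python 2.6 and Python 3:</p>

```python
from functools import reduce
import operator

total = reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])        # ((((1+2)+3)+4)+5)
with_initial = reduce(lambda x, y: x + y, [1, 2, 3], 100)   # (((100+1)+2)+3)
only_initial = reduce(operator.add, [], 0)                  # empty input: the initializer comes back
```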
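<p><strong>reload</strong>(<em>module</em>)<br />
Reloads a previously imported module. When you are working on a module and have changed its code in an external editor, you can use reload to import it again and check that the module still works.<br />
The details of what happens during a reload are not described here.</p>
<p><strong>repr</strong>(<em>object</em>)<br />
Returns a string containing as much information about <em>object</em> as possible. In fact, when you type an object at the interactive Python interpreter and press Enter, what you see is that object's repr.<br />
For many common objects, the returned string is, as far as possible, something that <code>eval</code> can evaluate to get the object back; for the rest it tries to include the module, the type and the address.<br />
A class can customize the value returned by repr by defining a <a href="http://docs.python.org/reference/datamodel.html#object.__repr__">__repr__()</a> method.</p>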
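<p>A minimal sketch of a custom <code>__repr__()</code> that round-trips through <code>eval</code>. The class <code>Point</code> is hypothetical and exists only for this illustration:</p>

```python
class Point(object):
    """Hypothetical class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Aim for a string that eval() can turn back into an equal object.
        return "Point(%r, %r)" % (self.x, self.y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, "two")
text = repr(p)        # "Point(1, 'two')"
clone = eval(text)    # rebuilds an equal Point from the string
```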
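<p><strong>reversed</strong>(<em>seq</em>)<br />
Returns a reverse iterator. <em>seq</em> must either have a __reversed__() method or support the sequence protocol (that is, a __len__() method and a __getitem__() method that takes integer indexes starting at 0).<br />
Example:</p>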

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">reversed</span><span style="color: black;">&#40;</span><span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&lt;</span>listreverseiterator <span style="color: #008000;">object</span> at 0x80a658c<span style="color: #66cc66;">&gt;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: black;">&#91;</span>i <span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">reversed</span><span style="color: black;">&#40;</span><span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span></pre></div></div>
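<p>To illustrate the second case, here is a sketch of a class that defines only <code>__len__()</code> and <code>__getitem__()</code> and can still be passed to <code>reversed()</code>. The class <code>Countdown</code> is made up for this example:</p>

```python
class Countdown(object):
    """Hypothetical sequence exposing only __len__ and __getitem__."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if index not in range(self.n):
            raise IndexError(index)
        return index * 10


# reversed() asks for the length, then fetches items from the last index down to 0.
result = list(reversed(Countdown(4)))
```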

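<p><strong>round</strong>(<em>x</em>[, <em>n</em>])<br />
Rounds the floating-point number <em>x</em> to <em>n</em> digits after the decimal point. <em>n</em> defaults to 0, i.e. rounding to a whole number (the result is still a float).</p>
<p><strong>set</strong>([<em>iterable</em>])<br />
Returns a set built from the items of <em>iterable</em>. The elements of a set are unordered (their order is arbitrary, not something to rely on) and contain no duplicates. This function is especially handy for removing duplicates from a list:</p>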

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> l = <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">set</span><span style="color: black;">&#40;</span>l<span style="color: black;">&#41;</span>
<span style="color: #008000;">set</span><span style="color: black;">&#40;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">list</span><span style="color: black;">&#40;</span><span style="color: #008000;">set</span><span style="color: black;">&#40;</span>l<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #483d8b;">''</span>.<span style="color: black;">join</span><span style="color: black;">&#40;</span><span style="color: #008000;">set</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;hello&quot;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #483d8b;">'helo'</span></pre></div></div>
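<p>Because a set does not promise any particular order, <code>list(set(l))</code> may shuffle the items. When the original order matters, one common workaround is sketched here; the variable names are just for illustration:</p>

```python
items = [3, 1, 3, 2, 1, 4]

unique_any_order = set(items)     # duplicates gone, order not guaranteed

seen = set()
unique_in_order = []
for item in items:
    if item not in seen:          # set membership tests are fast
        seen.add(item)
        unique_in_order.append(item)
```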

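<p><strong>setattr</strong>(<em>object</em>, <em>name</em>, <em>value</em>)<br />
The counterpart of <code>getattr()</code>: <code>setattr(x, 'foobar', 123)</code> is equivalent to <code>x.foobar = 123</code>.</p>
<p><strong>slice</strong>([<em>start</em>], <em>stop</em>[, <em>step</em>])<br />
Returns a slice object. A slice object holds nothing but the three values <em>start</em>, <em>stop</em> and <em>step</em>. Slice objects are widely used inside Python itself and in some third-party libraries; in fact, slicing syntax such as a[1:3] produces slice objects as well. If <em>start</em> or <em>step</em> is omitted, it defaults to None.<br />
As you can see, the following two expressions are equivalent:</p>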

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #008000;">slice</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>:<span style="color: #ff4500;">4</span>:<span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span></pre></div></div>
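<p>A small sketch showing that a slice object really is just three stored values, and that it can be named and reused. The variable names are made up for this example:</p>

```python
data = list(range(10))

every_other = slice(1, 8, 2)   # same as writing data[1:8:2]
first_three = slice(3)         # only stop given: start and step are None

parts = (every_other.start, every_other.stop, every_other.step)

picked = data[every_other]
head = data[first_three]
```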

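<p><strong>sorted</strong>(<em>iterable</em>[, <em>cmp</em>[, <em>key</em>[, <em>reverse</em>]]])<br />
Returns a new sorted list built from the items of <em>iterable</em>; the remaining arguments control how the sorting is done.<br />
<em>cmp</em> is a custom comparison function that takes two arguments and returns a negative number if the first argument is smaller, 0 if the two are equal, and a positive number if the first argument is larger.<br />
<em>key</em> can be thought of as a function that computes a sort value for each item. If <em>key</em> is given, it is applied to every item before any comparison and the items are ordered by those results, but the returned list still contains the original items.<br />
If <em>reverse</em> is True, the result is in descending order; the default is ascending order.<br />
Some examples:</p>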

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;">#A plain sort</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">sorted</span><span style="color: black;">&#40;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">5</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">5</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span>
<span style="color: #808080; font-style: italic;">#Descending order</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">sorted</span><span style="color: black;">&#40;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">5</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span>, reverse=<span style="color: #008000;">True</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">6</span>, <span style="color: #ff4500;">5</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>
<span style="color: #808080; font-style: italic;">#With key: items with the smallest remainder modulo 3 come first</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">sorted</span><span style="color: black;">&#40;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">5</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span>, key=<span style="color: #ff7700;font-weight:bold;">lambda</span> x: x<span style="color: #66cc66;">%</span>3<span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">6</span>, <span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">5</span><span style="color: black;">&#93;</span>
<span style="color: #808080; font-style: italic;">#The same ordering implemented with cmp</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">sorted</span><span style="color: black;">&#40;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">5</span>, <span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span>, <span style="color: #008000;">cmp</span>=<span style="color: #ff7700;font-weight:bold;">lambda</span> x,y: x<span style="color: #66cc66;">%</span>3 - y<span style="color: #66cc66;">%</span>3<span style="color: black;">&#41;</span>
<span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">6</span>, <span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">5</span><span style="color: black;">&#93;</span></pre></div></div>

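<p>It is worth noting that although both <em>cmp</em> and <em>key</em> can produce the remainder-modulo-3 ordering shown above, <em>cmp</em> has to call its function once for every pair of items that gets compared, whereas <em>key</em> is called only once per item, so <em>cmp</em> is less efficient than <em>key</em>.</p>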
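<p>One way to see the difference is to count the calls. This is only a sketch: the input data is arbitrary, and it uses <code>functools.cmp_to_key</code> (available in Python 2.7+ and 3.x, not in 2.6) as a stand-in for the <em>cmp</em> argument so that it also runs on Python 3:</p>

```python
from functools import cmp_to_key

data = [(i * 37) % 101 for i in range(100)]   # 100 arbitrary integers
calls = {"key": 0, "cmp": 0}

def by_key(x):
    calls["key"] += 1
    return x % 3

def by_cmp(x, y):
    calls["cmp"] += 1
    return x % 3 - y % 3

via_key = sorted(data, key=by_key)
via_cmp = sorted(data, key=cmp_to_key(by_cmp))

# The key function runs once per item; the comparison function runs once per comparison.
print(calls)
```

<p>Both calls return the same list, because the sort is stable and the ordering rule is identical; only the amount of work differs.</p>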
<p><strong>staticmethod</strong>(<em>function</em>)<br />
Returns a static method for <em>function</em>.<br />
To declare a static method, use the following syntax:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">class</span> C:
    @<span style="color: #008000;">staticmethod</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> f<span style="color: black;">&#40;</span>arg1, arg2, ...<span style="color: black;">&#41;</span>: ...</pre></div></div>

<p>A static method can be called on the class itself (for example <code>C.f()</code>) or on an instance of the class (for example <code>C().f()</code>).</p>
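<p>A runnable sketch of both call styles. The class <code>MathUtil</code> and its method are hypothetical and exist only for this illustration:</p>

```python
class MathUtil(object):
    """Hypothetical helper class used only for illustration."""

    @staticmethod
    def add(a, b):          # no self or cls parameter
        return a + b


from_class = MathUtil.add(1, 2)        # called on the class
from_instance = MathUtil().add(3, 4)   # called on an instance
```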
<p><strong>str</strong>([<em>object</em>])<br />
Returns a nicely printable string describing <em>object</em>. Unlike <code>repr(object)</code>, the string returned by <code>str(object)</code> is not necessarily something eval() can evaluate to get the object back; the goal of <code>str(object)</code> is simply to be printable and readable.</p>
<p><strong>sum</strong>(<em>iterable</em>[, <em>start</em>])<br />
Adds up the items of <em>iterable</em>, starting from <em>start</em>. <em>start</em> defaults to 0.<br />
Note that this function cannot be used to add (concatenate) strings; use <code>''.join(sequence)</code> for that. Also, <code>sum(range(n), m)</code> is equivalent to <code>reduce(operator.add, range(n), m)</code>. To add floating-point numbers with better precision, use <a href="http://docs.python.org/library/math.html#math.fsum">math.fsum()</a>.</p>
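<p>The points above, sketched as runnable code. The values of <code>n</code> and <code>m</code> and the word list are arbitrary, and <code>reduce</code> is imported from <code>functools</code> so the snippet also runs on Python 3:</p>

```python
import math
import operator
from functools import reduce

n, m = 5, 10
by_sum = sum(range(n), m)
by_reduce = reduce(operator.add, range(n), m)

words = ["a", "b", "c"]
joined = "".join(words)          # the supported way to concatenate strings

try:
    sum(words, "")
    string_sum_allowed = True
except TypeError:                # sum() refuses to add strings
    string_sum_allowed = False

exact = math.fsum([0.1] * 10)    # tracks partial sums to avoid rounding drift
```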
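<p><strong>super</strong>(<em>type</em>[, <em>object-or-type</em>])<br />
Returns an object that stands in for a parent or sibling class of <em>type</em>; through it you can indirectly call methods of that parent or sibling class. This is very useful when the class hierarchy is complicated. When you need it, <a href="http://blog.csdn.net/johnsonguo/archive/2006/01/20/585193.aspx">this article</a> (in Chinese) is worth studying.</p>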
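<p>To make the "parent or sibling" part concrete, here is a small diamond-shaped hierarchy. All class names are invented for this sketch; the two-argument form of <code>super</code> is used because it works on Python 2 new-style classes as well as on Python 3:</p>

```python
class Base(object):
    def greet(self):
        return ["Base"]

class Left(Base):
    def greet(self):
        return ["Left"] + super(Left, self).greet()

class Right(Base):
    def greet(self):
        return ["Right"] + super(Right, self).greet()

class Child(Left, Right):
    def greet(self):
        return ["Child"] + super(Child, self).greet()


# For a Child instance, super(Left, self) resolves to Right (a sibling), not Base.
order = Child().greet()
```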
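<p><strong>tuple</strong>([<em>iterable</em>])<br />
Returns a tuple whose items come from <em>iterable</em>. If the argument is omitted, an empty tuple is returned.</p>
<p><strong>type</strong>(<em>object</em>)<br />
Returns the type of <em>object</em>; the return value is itself a "type object". Note that <code>isinstance()</code> is the recommended way to test an object's type.</p>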

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">type</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&lt;</span>type <span style="color: #483d8b;">'int'</span><span style="color: #66cc66;">&gt;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">type</span><span style="color: black;">&#40;</span><span style="color: #008000;">type</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&lt;</span>type <span style="color: #483d8b;">'type'</span><span style="color: #66cc66;">&gt;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">type</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span> == <span style="color: #008000;">int</span>  <span style="color: #808080; font-style: italic;">#Strongly discouraged.</span>
<span style="color: #008000;">True</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>,<span style="color: #008000;">int</span><span style="color: black;">&#41;</span>   <span style="color: #808080; font-style: italic;">#Recommended.</span>
<span style="color: #008000;">True</span></pre></div></div>

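<p><strong>type</strong>(<em>name</em>, <em>bases</em>, <em>dict</em>)<br />
Unlike the one-argument type above, this form builds a class on the fly. The three arguments become the __name__, __bases__ and __dict__ of the resulting class.<br />
For example, the following two definitions of X are equivalent:</p>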

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">class</span> X<span style="color: black;">&#40;</span><span style="color: #008000;">object</span><span style="color: black;">&#41;</span>:
...     <span style="color: black;">a</span> = <span style="color: #ff4500;">1</span>
...
<span style="color: #66cc66;">&gt;&gt;&gt;</span> X = <span style="color: #008000;">type</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'X'</span>, <span style="color: black;">&#40;</span><span style="color: #008000;">object</span>,<span style="color: black;">&#41;</span>, <span style="color: #008000;">dict</span><span style="color: black;">&#40;</span>a=<span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span></pre></div></div>
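<p>The namespace dict can hold functions too, and they become methods of the new class. A short sketch, with the names <code>Dynamic</code> and <code>describe</code> invented for illustration:</p>

```python
def describe(self):
    return "%s with a=%d" % (type(self).__name__, self.a)

# A class built at runtime from a name, a tuple of bases and a namespace dict.
Dynamic = type("Dynamic", (object,), {"a": 1, "describe": describe})

obj = Dynamic()
summary = obj.describe()
```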

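<p><strong>unichr</strong>(<em>i</em>)<br />
Returns a unicode string of one character whose Unicode code point is <em>i</em>. For unicode strings, this function is the inverse of ord(). The valid range of <em>i</em> depends on how the Python interpreter was built.</p>
<p><strong>unicode</strong>([<em>object</em>[, <em>encoding</em>[, <em>errors</em>]]])<br />
Returns a unicode string representing <em>object</em>.<br />
If <em>encoding</em> and/or <em>errors</em> are given, <em>object</em> is decoded using the codec named by <em>encoding</em> (ascii if none is given). When a decoding error occurs, the value of <em>errors</em> decides what happens next: with <code>'strict'</code> (the default) a ValueError is raised; with <code>'ignore'</code> the error is ignored and decoding continues; with <code>'replace'</code> the offending character is replaced by U+FFFD.<br />
Here is an example, run in my utf8 environment:</p>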

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'我是bones7456'</span>, encoding=<span style="color: #483d8b;">'utf8'</span><span style="color: black;">&#41;</span>
u<span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\u</span>6211<span style="color: #000099; font-weight: bold;">\u</span>662fbones7456'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'我是bones7456'</span>, encoding=<span style="color: #483d8b;">'utf8'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#Decoding clearly succeeded.</span>
我是bones7456
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'我是bones7456'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#No encoding given, so ascii is used by default, and decoding fails.</span>
Traceback <span style="color: black;">&#40;</span>most recent call last<span style="color: black;">&#41;</span>:
  File <span style="color: #483d8b;">&quot;&lt;stdin&gt;&quot;</span>, line <span style="color: #ff4500;">1</span>, <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #66cc66;">&lt;</span>module<span style="color: #66cc66;">&gt;</span>
<span style="color: #008000;">UnicodeDecodeError</span>: <span style="color: #483d8b;">'ascii'</span> codec can't decode byte 0xe6 in position 0: ordinal not in range(128)
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'我是bones7456'</span>, errors=<span style="color: #483d8b;">'ignore'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#Errors are ignored, so only the letters and digits survive.</span>
u<span style="color: #483d8b;">'bones7456'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'我是bones7456'</span>, errors=<span style="color: #483d8b;">'replace'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#With replace you get a bunch of ??? instead, haha.</span>
u<span style="color: #483d8b;">'\ufffd\ufffd\ufffd\ufffd\ufffd\ufffdbones7456'</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #008000;">unicode</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'我是bones7456'</span>, errors=<span style="color: #483d8b;">'replace'</span><span style="color: black;">&#41;</span>
������bones7456</pre></div></div>

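<p>Without the last two arguments, <code>unicode()</code> behaves like <code>str()</code>, except that it returns a unicode string.<br />
If <em>object</em> provides a __unicode__() method, that method is called to produce the unicode string, so a class can customize the result.</p>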
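<p>The same decoding step can be spelled with the <code>decode()</code> method of a byte string, which is also the form that survives in Python 3 (where <code>unicode()</code> itself no longer exists). This sketch makes that substitution; the sample text is the one from the session above, written with escapes so the snippet is pure ASCII:</p>

```python
# The UTF-8 bytes that a Python 2 str literal would hold in a utf8 environment.
raw = u"\u6211\u662fbones7456".encode("utf8")

decoded = raw.decode("utf8")                 # the right codec: decoding succeeds
ignored = raw.decode("ascii", "ignore")      # undecodable bytes are dropped
replaced = raw.decode("ascii", "replace")    # each undecodable byte becomes U+FFFD

try:
    raw.decode("ascii")                      # errors defaults to 'strict'
    strict_failed = False
except ValueError:                           # UnicodeDecodeError is a subclass of ValueError
    strict_failed = True
```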
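<p><strong>vars</strong>([<em>object</em>])<br />
If <em>object</em> is omitted, <code>vars()</code> behaves like locals(). If <em>object</em> is a module, a class, an instance of a class, or any other object that has a __dict__ attribute, its __dict__ is returned.</p>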
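<p><strong>xrange</strong>([<em>start</em>], <em>stop</em>[, <em>step</em>])<br />
This function is very similar to range(), but it returns an xrange object instead of a list. An xrange object yields the same values as the corresponding list when you iterate over it or index it, but those values are produced on demand rather than all sitting in memory at once. Its advantage over range is that it is more compact, which matters most for very large ranges.</p>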
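<p><strong>zip</strong>([<em>iterable</em>, <em>...</em>])<br />
Ha, speaking of this function, I once reported <a href="http://bugs.python.org/issue6084">a bug</a> against the official Python documentation, because the sample code in an earlier version of the docs had a small problem. The whole story is <a href="http://luy.li/2009/05/22/python_doc_bug/">here</a>.<br />
zip returns a list of tuples, where the <em>i</em>-th tuple contains the <em>i</em>-th item of each <em>iterable</em>. If the <em>iterable</em>s differ in length, the result is truncated to the length of the shortest one.<br />
Also, calling zip with a <code>*</code> in front of its argument performs the inverse operation (unzip). Example:</p>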

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> x = <span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> y = <span style="color: black;">&#91;</span><span style="color: #ff4500;">4</span>, <span style="color: #ff4500;">5</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> zipped = <span style="color: #008000;">zip</span><span style="color: black;">&#40;</span>x, y<span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> zipped
<span style="color: black;">&#91;</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span>, <span style="color: #ff4500;">4</span><span style="color: black;">&#41;</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span>, <span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span>, <span style="color: black;">&#40;</span><span style="color: #ff4500;">3</span>, <span style="color: #ff4500;">6</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> x2, y2 = <span style="color: #008000;">zip</span><span style="color: black;">&#40;</span><span style="color: #66cc66;">*</span>zipped<span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> x == <span style="color: #008000;">list</span><span style="color: black;">&#40;</span>x2<span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> y == <span style="color: #008000;">list</span><span style="color: black;">&#40;</span>y2<span style="color: black;">&#41;</span>
<span style="color: #008000;">True</span></pre></div></div>
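<p>A short sketch of the truncation rule together with the unzip trick. The sample lists are arbitrary, and the <code>list()</code> wrapper is there only because on Python 3 zip returns an iterator rather than a list:</p>

```python
names = ["a", "b", "c"]
numbers = [1, 2]                      # the shorter input decides the result length

pairs = list(zip(names, numbers))     # "c" has no partner and is dropped
letters, digits = zip(*pairs)         # the * passes each pair as a separate argument
```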

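<p><strong>__import__</strong>(<em>name</em>[, <em>globals</em>[, <em>locals</em>[, <em>fromlist</em>[, <em>level</em>]]]])<br />
This function is invoked by the import statement. It is rarely used in ordinary code, unless the name of the module you want to import is only known at run time. I will not go into detail.</p>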
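<p>For the "name only known at run time" case, a minimal sketch might look like this. The variable <code>module_name</code> stands in for a value that would really come from a config file or user input, which is an assumption made for the example:</p>

```python
module_name = "json"                   # imagine this arrives at run time

module = __import__(module_name)
encoded = module.dumps([1, 2])

# For a dotted name, __import__ returns the top-level package, not the submodule.
top = __import__("os.path")
```

<p>That last quirk is one reason the function is seldom called directly.</p>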
]]></content:encoded>
			<wfw:commentRss>http://luy.li/2009/11/29/python-built-in-functions/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>The Ubuntu mirror I set up</title>
		<link>http://luy.li/2009/09/18/ubuntu_mirror/</link>
		<comments>http://luy.li/2009/09/18/ubuntu_mirror/#comments</comments>
		<pubDate>Fri, 18 Sep 2009 15:28:43 +0000</pubDate>
		<dc:creator>bones7456</dc:creator>
				<category><![CDATA[精华]]></category>

		<guid isPermaLink="false">http://li2z.cn/?p=1061</guid>
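		<description><![CDATA[September 19 is Software Freedom Day 2009. On this rather special day I have a gift for Ubuntu fans everywhere: a new Ubuntu mirror. As I said in the post I just published, the point of this mirror is that it is guaranteed to be complete and up to date. As for speed, that depends on everyone's test results (it should not be too bad). The mirror is hosted on China Telecom in Hangzhou with 100 Mbit/s of shared bandwidth; I do not know how fast it is for China Netcom users. I hope it can be added to the official Ubuntu mirror list by the time 9.10 is released, which would make it more convenient to use. That is all for now. You can reach it through these domain names: http://ubuntu.srt.cn/ http://ubuntu.hzlug.org/ http://u.srt.cn/]]></description>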
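			<content:encoded><![CDATA[<p>September 19 is <a href="http://zh.wikipedia.org/zh-cn/%E8%BD%AF%E4%BB%B6%E8%87%AA%E7%94%B1%E6%97%A5">Software Freedom Day</a> 2009. On this rather special day I have a gift for Ubuntu fans everywhere: a new Ubuntu mirror.<br />
As I said in <a href="http://luy.li/2009/09/18/script_of_ubuntu_mirror/">the post I just published</a>, the point of this mirror is that it is guaranteed to be complete and up to date. As for speed, that depends on everyone's test results (it should not be too bad).<br />
The mirror is hosted on China Telecom in Hangzhou with 100 Mbit/s of shared bandwidth; I do not know how fast it is for China Netcom users.<br />
I hope it can be added to the <a href="https://launchpad.net/ubuntu/+archivemirrors">official Ubuntu mirror list</a> by the time 9.10 is released, which would make it more convenient to use.<br />
That is all for now. You can reach it through these domain names:<br />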
<a href="http://ubuntu.srt.cn/">http://ubuntu.srt.cn/</a><br />
<a href="http://ubuntu.hzlug.org/">http://ubuntu.hzlug.org/</a><br />
<a href="http://u.srt.cn/">http://u.srt.cn/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://luy.li/2009/09/18/ubuntu_mirror/feed/</wfw:commentRss>
		<slash:comments>51</slash:comments>
		</item>
	</channel>
</rss>
