<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Becomin&#039; Charles &#187; lucene</title>
	<atom:link href="http://sexywp.com/tags/lucene/feed" rel="self" type="application/rss+xml" />
	<link>http://sexywp.com</link>
	<description>Building another myself~~</description>
	<lastBuildDate>Fri, 27 Jan 2012 16:00:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Lucene Notes 07: Chinese Word Segmentation</title>
		<link>http://sexywp.com/lucene-note-07.htm</link>
		<comments>http://sexywp.com/lucene-note-07.htm#comments</comments>
		<pubDate>Tue, 28 Apr 2009 08:55:10 +0000</pubDate>
		<dc:creator>Charles</dc:creator>
				<category><![CDATA[Work-related]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[note]]></category>

		<guid isPermaLink="false">http://sexywp.com/lucene-note-07.htm</guid>
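		<description><![CDATA[What is word segmentation? What is Chinese word segmentation?

Word segmentation means cutting a piece of text into its smallest semantic units. In Chinese, many characters carry a fairly independent meaning on their own, but more often a character combines with one or more other characters to form a meaning. For example, “我是一个学生” ("I am a student") is segmented as “我/是/一个/学生”, and another example is “我/打算/去/做/分词/的/研究” ("I plan to do research on word segmentation"). Chinese word segmentation, then, divides a Chinese passage into words.

Segmentation is a prerequisite for understanding meaning. Humans, drawing on their own knowledge, segment text automatically as they read. Computers have neither human knowledge nor human intelligence, so getting a machine to split text automatically has become an important research topic within natural language processing. Of all languages, Chinese may be the hardest to segment, because its grammar is complex, with few rules, many exceptions, and strong ambiguity. Chinese text processing lags well behind that for Western languages, and segmentation is one of the limiting factors.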

<span class="readmore"><a href="http://sexywp.com/lucene-note-07.htm" title="Lucene Notes 07: Chinese Word Segmentation">Keep Reading --- 2169 words totally</a></span>]]></description>
			<content:encoded><![CDATA[<p><span id="more-345"></span><br />
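<h3>What is word segmentation? What is Chinese word segmentation?</h3>
<p>Word segmentation means cutting a piece of text into its smallest semantic units. In Chinese, many characters carry a fairly independent meaning on their own, but more often a character combines with one or more other characters to form a meaning. For example, “我是一个学生” ("I am a student") is segmented as “我/是/一个/学生”, and another example is “我/打算/去/做/分词/的/研究” ("I plan to do research on word segmentation"). Chinese word segmentation, then, divides a Chinese passage into words.</p>
<p>Segmentation is a prerequisite for understanding meaning. Humans, drawing on their own knowledge, segment text automatically as they read. Computers have neither human knowledge nor human intelligence, so getting a machine to split text automatically has become an important research topic within natural language processing. Of all languages, Chinese may be the hardest to segment, because its grammar is complex, with few rules, many exceptions, and strong ambiguity. Chinese text processing lags well behind that for Western languages, and segmentation is one of the limiting factors.</p>
<p>Segmentation is an important step when a search engine builds its index. We could of course index individual Chinese characters, but such an index is bulky, inefficient, slow to search, and low in precision. After segmentation, the index shrinks considerably and retrieval becomes more efficient. For retrieval, segmentation may still not be strictly required, but for natural language processing (machine translation being a typical case) it is indispensable.</p>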
<h3>&#160;</h3>
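<h3>Basic segmentation methods</h3>
<p>There are generally three approaches to segmentation. The first is string-matching segmentation: the text is matched, entry by entry, against a sufficiently large dictionary. Depending on the length priority, it can use maximum matching or minimum matching; depending on the matching direction, forward matching or reverse matching. These variants can be combined.</p>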
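<p>As a rough illustration of dictionary matching, here is a minimal sketch of forward maximum matching. The class name <code>ForwardMaxMatch</code>, the tiny dictionary, and the <code>maxWordLen</code> parameter are all assumptions made for this example; a real segmenter would use a far larger dictionary and a faster lookup structure.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class ForwardMaxMatch {
    // Forward maximum matching: at each position take the longest dictionary
    // word that starts there; fall back to a single character if none matches.
    static List<String> segment(String text, Set<String> dict, int maxWordLen) {
        List<String> words = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + maxWordLen);
            // Shrink the candidate from the longest length down to one character.
            while (end > pos + 1 && !dict.contains(text.substring(pos, end))) {
                end--;
            }
            words.add(text.substring(pos, end));
            pos = end;
        }
        return words;
    }

    public static void main(String[] args) {
        // Toy dictionary, assumed for illustration only.
        Set<String> dict = Set.of("一个", "学生", "打算", "分词", "研究");
        // prints 我/是/一个/学生
        System.out.println(String.join("/", segment("我是一个学生", dict, 4)));
    }
}
```

<p>Reverse matching would scan from the end of the text instead, and minimum matching would grow the candidate from the shortest length rather than shrink it from the longest.</p>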
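<p>The second is understanding-based segmentation: the computer imitates how a person understands a sentence, performing syntactic and semantic analysis before segmenting. This approach is hard to implement and needs a large amount of linguistic knowledge and information; so far there is no production-ready system.</p>
<p>The third is statistics-based segmentation, which identifies words by counting, over a large corpus, how often two characters appear next to each other. This method needs no dictionary and can also recognize new words.</p>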
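<p>The counting step behind the statistical approach can be sketched as follows. This is only a toy: the class name <code>BigramCounter</code>, the three-sentence corpus, and the threshold of 2 are assumptions for illustration, and real systems use much larger corpora and more careful statistics than raw counts.</p>

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BigramCounter {
    // Count how often each pair of adjacent characters occurs in the corpus.
    static Map<String, Integer> countPairs(List<String> corpus) {
        Map<String, Integer> counts = new HashMap<>();
        for (String sentence : corpus) {
            for (int i = 0; i + 1 < sentence.length(); i++) {
                counts.merge(sentence.substring(i, i + 2), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> corpus = List.of("我是学生", "学生分词", "研究分词");
        Map<String, Integer> counts = countPairs(corpus);
        // Pairs seen at least twice are treated as candidate words (toy threshold).
        counts.forEach((pair, n) -> {
            if (n >= 2) {
                System.out.println(pair + " " + n);
            }
        });
    }
}
```

<p>In this corpus only “学生” and “分词” reach the threshold, which is how frequent neighbours surface as candidate words without any dictionary.</p>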
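<p>Each of the three approaches has its strengths and weaknesses; see reference 【2】. Real-world segmentation systems are usually hybrids that combine two or more of these methods to get the best results.</p>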
<p>&#160;</p>
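<h3>Segmentation in Lucene</h3>
<p>In Lucene, segmentation is performed by an Analyzer object. Its key method is tokenStream, which returns a collection of tokens in the form of a TokenStream object. TokenStream itself is an abstract class with an iterator-like interface, and it has two kinds of concrete subclasses: Tokenizer, which takes a Reader as input, and TokenFilter, which takes another TokenStream as input.</p>
<p>From this it is easy to see that the Tokenizer does the actual cutting, while the TokenFilter, as its name suggests, filters the result of that cutting. Producing a fully segmented result requires various Tokenizers and TokenFilters working together, and the Analyzer acts as the assembler.</p>
<p>StandardAnalyzer shows Lucene's approach clearly. It first creates a StandardTokenizer for the initial cut, then a StandardFilter, which normalizes the tokens (for example, stripping a trailing 's and removing the dots from acronyms), then a LowerCaseFilter, which converts every token to lower case, and finally a StopFilter, which removes all stop words (function words that carry little meaning). That completes the segmentation of an English passage.</p>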
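<p>To make the tokenizer-then-filters chain concrete, here is a small stand-alone sketch in plain Java. It does not use Lucene at all: <code>ToyAnalyzer</code>, its stop-word list, and the choice of stripping a trailing 's as the normalization step are assumptions for illustration, meant only to mirror the order of the stages.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

class ToyAnalyzer {
    // Toy stop-word list, assumed for illustration.
    private static final Set<String> STOP_WORDS = Set.of("a", "an", "the", "is", "of");

    static List<String> analyze(String text) {
        // Stage 1 (tokenizer role): split on anything that is not a letter, digit, or apostrophe.
        return Arrays.stream(text.split("[^\\p{L}\\p{N}']+"))
                // Stage 2 (normalizing filter role): strip a trailing 's.
                .map(t -> t.endsWith("'s") ? t.substring(0, t.length() - 2) : t)
                .filter(t -> !t.isEmpty())
                // Stage 3 (lower-case filter role).
                .map(t -> t.toLowerCase(Locale.ROOT))
                // Stage 4 (stop filter role): drop stop words.
                .filter(t -> !STOP_WORDS.contains(t))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [analyzer, job, tokenizing, text]
        System.out.println(analyze("The Analyzer's job is tokenizing TEXT"));
    }
}
```

<p>Each stage consumes the output of the previous one, which is the same shape as a Tokenizer feeding a chain of TokenFilters.</p>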
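<p>This design hides the complexity of segmentation from the user, so using it is very easy: create an Analyzer object and pass it to an IndexWriter or a QueryParser (parsing a user's query also starts with segmentation).</p>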
<p>&#160;</p>
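<h3>Implementing Chinese word segmentation in Lucene</h3>
<p>With the above in mind, we know that for Lucene to segment Chinese we need to create our own Analyzer, along with its associated Tokenizer and TokenFilter. With these classes working together, Chinese segmentation becomes possible.</p>
<p>I surveyed several Chinese segmentation systems online, but few of them implement Lucene's interfaces. Fortunately my use is general research, where performance and efficiency are not a concern, so I simply chose the Paoding Analysis package, which is designed around Lucene's interfaces. If we want higher precision and segmentation efficiency, we need a better segmentation component and then have to wrap it by hand to implement Lucene's interfaces, though this is not complicated. Many segmentation components are already quite complete. ICTCLAS, for example, already supports customized segmentation, and the result it returns is already the final result. Wrapping ICTCLAS is therefore simple: wrap the result it returns in a TokenStream, and implementing just an Analyzer may be enough. Even with a more thorough treatment, it would not be too complicated.</p>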
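<p>The wrapping idea might look roughly like the sketch below. Everything here is hypothetical: <code>SegmenterTokenStream</code> is not a Lucene class, the external segmenter is modelled as a plain function, and the slash-separated output format is an assumption rather than the actual output of ICTCLAS or any other component.</p>

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;

class SegmenterTokenStream {
    private final Iterator<String> words;

    // 'segmenter' stands in for an external component; it is assumed to map
    // text to a slash-separated result such as "我/是/一个/学生".
    SegmenterTokenStream(String text, Function<String, String> segmenter) {
        this.words = Arrays.stream(segmenter.apply(text).split("/"))
                .filter(w -> !w.isEmpty())
                .iterator();
    }

    // Iterator-like access: returns the next token, or null when exhausted.
    String next() {
        return words.hasNext() ? words.next() : null;
    }

    public static void main(String[] args) {
        // A fake segmenter that always returns a fixed result, for demonstration.
        Function<String, String> fake = text -> "我/是/一个/学生";
        SegmenterTokenStream stream = new SegmenterTokenStream("我是一个学生", fake);
        for (String token = stream.next(); token != null; token = stream.next()) {
            System.out.println(token);
        }
    }
}
```

<p>A real adapter would extend Lucene's own stream class and also carry offsets for each token, but the core of the job is this kind of thin iteration layer over results the segmenter has already produced.</p>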
<p>&#160;</p>
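<h3>Summary</h3>
<p>There are many Chinese segmentation components available, and for ordinary applications we can simply use an existing one; there is no need to reinvent the wheel. Most of these systems, however, are hard to apply to more demanding commercial systems. In that case we have to buy a suitable solution or build further on an open-source system.</p>
<p>In my own project I chose Paoding Analysis, mainly because it is implemented entirely in Java, is highly portable across platforms, is the least trouble to use, and segments reasonably well. Its efficiency may be a problem, but since the system holds little content, that is not the main concern. It can be swapped later for a faster and more accurate system. The precondition, of course, is that you do not hard-code the Analyzer into your system when you build it, but plug it in through a configuration file or a similar mechanism.</p>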
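<p>One way to avoid hard-coding the analyzer is to read its class name from configuration and instantiate it by reflection. The sketch below assumes a hypothetical <code>TextAnalyzer</code> interface and an <code>analyzer.class</code> property key; neither comes from Lucene, and in a real project the interface would be Lucene's own Analyzer type.</p>

```java
import java.util.Properties;

class AnalyzerConfig {
    // Hypothetical stand-in for an analyzer interface.
    interface TextAnalyzer {
        String name();
    }

    // One possible implementation that could be named in the configuration.
    static class WhitespaceTextAnalyzer implements TextAnalyzer {
        public String name() {
            return "whitespace";
        }
    }

    // Read the class name from configuration and instantiate it reflectively.
    static TextAnalyzer load(Properties config) {
        String className = config.getProperty("analyzer.class");
        try {
            return (TextAnalyzer) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create analyzer: " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        // In practice this value would come from a properties file on disk.
        config.setProperty("analyzer.class", WhitespaceTextAnalyzer.class.getName());
        System.out.println(load(config).name()); // prints whitespace
    }
}
```

<p>Swapping in a different segmenter then means changing one line of configuration rather than recompiling the system.</p>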
<p>&#160;</p>
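<p>References:</p>
<p>【1】Baidu Baike: Chinese word segmentation <a title="http://baike.baidu.com/view/19109.html" href="http://baike.baidu.com/view/19109.html">http://baike.baidu.com/view/19109.html</a> (basic methods, technical difficulties, and basic applications of Chinese word segmentation)</p>
<p>【2】A comparison of the strengths and weaknesses of three Chinese word segmentation algorithms <a title="http://www.blogjava.net/jiangyz/articles/238120.html" href="http://www.blogjava.net/jiangyz/articles/238120.html">http://www.blogjava.net/jiangyz/articles/238120.html</a></p>
<p>【3】<a href="http://www.coreseek.com/forum/index.php?action=vthread&amp;forum=1&amp;topic=2">Open-source Chinese word segmentation software</a></p>
<p>【4】<a href="http://approximation.javaeye.com/blog/345885">Comparing the segmentation accuracy and performance of Chinese analyzers for Lucene</a></p>
<p>【5】<a href="http://searcher.org.cn/search/20070903/98.html">An introduction to several free Chinese word segmentation modules</a> (the original article is no longer available) </p>
<p>【6】<a href="http://www.webryan.cn/2009/04/something-about-chinese-seg/">Miscellaneous notes on Chinese word segmentation</a></p>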
<p>&#160;</p>
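<p>Common segmentation systems:</p>
<p>【1】ICTCLAS, from the Institute of Computing Technology, Chinese Academy of Sciences <a title="http://ictclas.org/index.html" href="http://ictclas.org/index.html">http://ictclas.org/index.html</a> (the segmentation system used by many companies and research institutions; includes part-of-speech tagging, which is handy for research)</p>
<p>【2】Chinese intelligent word segmentation from Hylanda (海量信息技术有限公司) <a title="http://www.hylanda.com" href="http://www.hylanda.com">http://www.hylanda.com</a> (reportedly the segmenter used by Zhongsou, and said to be regarded in the industry as the best)</p>
<p>【3】Lietu (猎兔) Chinese word segmentation <a title="http://www.lietu.com/demo/index.jsp" href="http://www.lietu.com/demo/index.jsp">http://www.lietu.com/demo/index.jsp</a></p>
<p>【4】Jesoft (极易) Chinese word segmentation component <a title="http://www.jesoft.cn/" href="http://www.jesoft.cn/">http://www.jesoft.cn/</a> (based on the forward maximum matching algorithm)</p>
<p>【5】Paoding (庖丁解牛) Chinese word segmentation <a title="http://code.google.com/p/paoding/" href="http://code.google.com/p/paoding/">http://code.google.com/p/paoding/</a> (open-source project)</p>
<p>【6】Simple Chinese Word Segmentation system (简易中文分词系统, SCWS) <a title="http://www.hightman.cn/index.php?scws" href="http://www.hightman.cn/index.php?scws">http://www.hightman.cn/index.php?scws</a> (provides a PHP extension)</p>
<p>【7】And many more</p>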
	Tags: <a href="http://sexywp.com/tags/lucene" title="lucene" rel="tag">lucene</a>, <a href="http://sexywp.com/tags/note" title="note" rel="tag">note</a><br />
]]></content:encoded>
			<wfw:commentRss>http://sexywp.com/lucene-note-07.htm/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Lucene Notes 06</title>
		<link>http://sexywp.com/lucene-note-06.htm</link>
		<comments>http://sexywp.com/lucene-note-06.htm#comments</comments>
		<pubDate>Tue, 14 Apr 2009 11:36:41 +0000</pubDate>
		<dc:creator>Charles</dc:creator>
				<category><![CDATA[Work-related]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[note]]></category>

		<guid isPermaLink="false">http://sexywp.com/lucene-note-06.htm</guid>
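		<description><![CDATA[In Note 03 I already mentioned the components needed to search with Lucene:



IndexSearcher: this object has many overloaded search methods; searching an index mainly means using an instance of it. 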

<span class="readmore"><a href="http://sexywp.com/lucene-note-06.htm" title="Lucene Notes 06">Keep Reading --- 987 words totally</a></span>]]></description>
			<content:encoded><![CDATA[<p><span id="more-340"></span>
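<p>In <a title="Lucene Note 03" href="http://sexywp.com/lucene-note-03.htm">Note 03</a> I already mentioned the components needed to search with Lucene:</p>
<ul>
<li><strong>IndexSearcher</strong>: this object has many overloaded search methods; searching an index mainly means using an instance of it. </li>
<li><strong>Query</strong>: an abstract class; objects of its subclasses encapsulate the various forms of search.
<ul>
<li>TermQuery: matches documents that contain a single query term. Can be combined using BooleanQuery. </li>
<li>BooleanQuery: matches documents that satisfy a Boolean combination of other queries (TermQuery, PhraseQuery, or other BooleanQuery objects). </li>
<li>FuzzyQuery: fuzzy query. </li>
<li>RangeQuery: range query. </li>
<li>And many more... </li>
</ul>
</li>
<li><strong>QueryParser</strong>: translates a human-readable query string into one of the Query objects above. </li>
<li><strong>TopDocs</strong>: the container for search results. TopFieldDocs is a subclass of it and also holds search results. </li>
</ul>
<p>&#160;</p>
<p>As mentioned above, IndexSearcher has many overloaded search methods, but on closer inspection only a few of them are recommended.</p>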
<pre><span style="color: #0000ff">public</span> TopFieldDocs search(Query query, Filter filter, <span style="color: #0000ff">int</span> n, Sort sort)
<span style="color: #0000ff">public</span> TopDocs search(Query query, Filter filter, <span style="color: #0000ff">int</span> n)
<span style="color: #0000ff">public</span> TopDocs search(Query query, <span style="color: #0000ff">int</span> n)

<span style="color: #008000">//following low-level</span>
<span style="color: #0000ff">public</span> <span style="color: #0000ff">void</span> search(Query query, HitCollector results)
<span style="color: #0000ff">public</span> <span style="color: #0000ff">void</span> search(Query query, Filter filter, HitCollector results)</pre>
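<p>Filter is an abstract class. As its name indicates, it filters search results, blocking some and letting others through. If dates were indexed in YYYYMMDD format, then at search time we can use a PrefixFilter to find the documents from a given year YYYY. We can also use a RangeFilter to find documents in which a particular field falls within a given range. Sort is an object that changes the ordering of results: it can tell Lucene to sort the results by one or more specific fields, and it can also reverse the order.</p>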
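<p>The reason a prefix or a range works on YYYYMMDD values is that such strings sort lexicographically in chronological order. The sketch below shows just that idea with plain Java predicates; <code>DateFilters</code> and its methods are made up for illustration and are not Lucene's PrefixFilter or RangeFilter.</p>

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DateFilters {
    // Prefix filter role: keep dates that start with the given year.
    static Predicate<String> prefix(String year) {
        return date -> date.startsWith(year);
    }

    // Range filter role: keep dates in [from, to]. YYYYMMDD strings compare
    // lexicographically in the same order as the dates they represent.
    static Predicate<String> range(String from, String to) {
        return date -> date.compareTo(from) >= 0 && date.compareTo(to) <= 0;
    }

    static List<String> apply(List<String> dates, Predicate<String> filter) {
        return dates.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> dates = List.of("20081231", "20090414", "20090428", "20100101");
        System.out.println(apply(dates, prefix("2009")));             // prints [20090414, 20090428]
        System.out.println(apply(dates, range("20090401", "20090420"))); // prints [20090414]
    }
}
```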
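<p>HitCollector is an abstract class whose subclasses must provide a collect method; IndexSearcher passes each document's id and raw score to it. This is where a custom ordering of the final results can be implemented. Here is the collect method of the default TopDocCollector:</p>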
<pre>  <span style="color: #0000ff">public</span> <span style="color: #0000ff">void</span> collect(<span style="color: #0000ff">int</span> doc, <span style="color: #0000ff">float</span> score) {
    <span style="color: #0000ff">if</span> (score &gt; 0.0f) {
      totalHits++;
      <span style="color: #0000ff">if</span> (reusableSD == <span style="color: #0000ff">null</span>) {
        reusableSD = <span style="color: #0000ff">new</span> ScoreDoc(doc, score);
      } <span style="color: #0000ff">else</span> <span style="color: #0000ff">if</span> (score &gt;= reusableSD.score) {
        <span style="color: #008000">// reusableSD holds the last &quot;rejected&quot; entry, so, if</span>
        <span style="color: #008000">// this new score is not better than that, there's no</span>
        <span style="color: #008000">// need to try inserting it</span>
        reusableSD.doc = doc;
        reusableSD.score = score;
      } <span style="color: #0000ff">else</span> {
        <span style="color: #0000ff">return</span>;
      }
      reusableSD = (ScoreDoc) hq.insertWithOverflow(reusableSD);
    }
  }</pre>
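<p>The default algorithm orders the hits by raw score. It is essentially a heap sort; the hq object used here is a priority queue.</p>
<ol>
<li>For each incoming document id, compare the document's score with the lowest-scoring document left over from the previous operation (the priority queue has a bounded length; once it is full, the lowest-scoring document is discarded). </li>
<li>If the score is higher, insert the document into the queue and remember the document pushed out by this operation; otherwise, discard it. </li>
</ol>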
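<p>The same keep-the-best-N idea can be sketched with the JDK's own <code>PriorityQueue</code>. This is a simplified illustration, not Lucene's code: <code>TopNCollector</code> and <code>Hit</code> are names invented here, it does not reuse objects the way the listing above does, and it assumes a capacity of at least 1.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class TopNCollector {
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int capacity;
    // Min-heap on score: the head is always the weakest hit kept so far.
    private final PriorityQueue<Hit> heap =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    TopNCollector(int capacity) {
        this.capacity = capacity; // assumed to be >= 1
    }

    void collect(int doc, float score) {
        if (score <= 0.0f) {
            return; // ignore non-positive scores
        }
        if (heap.size() < capacity) {
            heap.add(new Hit(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll(); // evict the current lowest-scoring hit
            heap.add(new Hit(doc, score));
        }
    }

    // Copy the heap and return document ids from best to worst score.
    List<Integer> topDocs() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        List<Integer> docs = new ArrayList<>();
        for (Hit h : hits) {
            docs.add(h.doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        TopNCollector collector = new TopNCollector(2);
        collector.collect(1, 0.5f);
        collector.collect(2, 0.9f);
        collector.collect(3, 0.1f);
        collector.collect(4, 0.7f);
        System.out.println(collector.topDocs()); // prints [2, 4]
    }
}
```

<p>Because the heap never holds more than N entries, collecting M hits costs on the order of M log N rather than sorting all M.</p>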
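<p>If we define our own ordering, we would probably first use an IndexReader to fetch the fields the computation needs, and then implement our formula inside this method. That made me wonder whether we could step into Lucene's search process earlier. Here the score has already been computed, so is there a chance to intervene before the score is computed and calculate it with our own formula? This is only an idea; I have not verified or investigated it further.</p>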
<p>&#160;</p>
<p>Excerpts from some articles I read today:</p>
<ol>
<li><a title="Six reasons not to choose Lucene" href="http://blog.csdn.net/accesine960/archive/2008/03/22/2207462.aspx" target="_blank">Six reasons not to choose Lucene</a> </li>
<li><a title="Why Lucene isn&#39;t that good?" href="http://www.jroller.com/melix/entry/why_lucene_isn_t_that" target="_blank">Moving Lucene a step forward</a>: the English original referred to in the Chinese article above. </li>
<li><a title="The introduction of TFIDF" href="http://www.googlechinablog.com/2006/06/blog-post_27.html" target="_blank">The Beauty of Mathematics, Part 9: How to determine the relevance between a web page and a query</a>: a basic introduction to the TF-IDF statistic. </li>
</ol>
	Tags: <a href="http://sexywp.com/tags/lucene" title="lucene" rel="tag">lucene</a>, <a href="http://sexywp.com/tags/note" title="note" rel="tag">note</a><br />
]]></content:encoded>
			<wfw:commentRss>http://sexywp.com/lucene-note-06.htm/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lucene Notes 05</title>
		<link>http://sexywp.com/lucene-note-05.htm</link>
		<comments>http://sexywp.com/lucene-note-05.htm#comments</comments>
		<pubDate>Tue, 14 Apr 2009 04:39:14 +0000</pubDate>
		<dc:creator>Charles</dc:creator>
				<category><![CDATA[Work-related]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[note]]></category>

		<guid isPermaLink="false">http://sexywp.com/lucene-note-05.htm</guid>
		<description><![CDATA[Lucene allows concurrent operations on an index, subject to three simple but strict rules:



Any number of read-only operations may run in parallel. 

<span class="readmore"><a href="http://sexywp.com/lucene-note-05.htm" title="Lucene Notes 05">Keep Reading --- 366 words totally</a></span>]]></description>
			<content:encoded><![CDATA[<p><span id="more-339"></span>
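<p>Lucene allows concurrent operations on an index, subject to three simple but strict rules:</p>
<ul>
<li>Any number of read-only operations may run in parallel. </li>
<li>Any number of read-only operations may also run while the index is being written. </li>
<li>Write operations may not run in parallel: only one instance may modify the index at a time. </li>
</ul>
<p>These rules are simple and match intuition, so they are easy to remember. Lucene does not actually force you to follow them, but breaking them carries unpredictable risks, such as a corrupted index.</p>
<p>In practice, a good approach is to keep a single instance of the object that performs writes, that is, to apply the Singleton design pattern. Lucene's index-manipulating objects are designed to be thread-safe, so multiple threads can call them directly without extra synchronization. That is quite considerate.</p>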
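<p>A minimal sketch of the single-writer idea, using only the JDK. The class <code>SharedIndexUpdater</code> and its methods are hypothetical stand-ins invented for this example; they are not part of Lucene. It only shows how one shared, thread-safe instance can be handed to every thread that needs to write.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the single object that is allowed to modify an index. */
final class SharedIndexUpdater {
    // Initialization-on-demand holder: the JVM creates INSTANCE lazily and exactly once.
    private static final class Holder {
        static final SharedIndexUpdater INSTANCE = new SharedIndexUpdater();
    }

    // Toy "index": just the list of documents added so far.
    private final List<String> documents = new ArrayList<>();

    private SharedIndexUpdater() { }

    static SharedIndexUpdater getInstance() {
        return Holder.INSTANCE;
    }

    // synchronized keeps this toy class safe to call from several threads.
    synchronized void addDocument(String text) {
        documents.add(text);
    }

    synchronized int size() {
        return documents.size();
    }
}
```

<p>Every caller goes through <code>SharedIndexUpdater.getInstance()</code>, so there is never a second writer to conflict with the first.</p>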
<p>&#160;</p>
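<p>Hardly anyone breaks Lucene's concurrency rules on purpose; when it happens, it is usually by accident. For that reason Lucene provides a set of locks to protect the index.</p>
<p>Lucene's locks are stored on disk as files. There are two of them: write.lock and commit.lock. (commit.lock belongs to older releases; as far as I know, since Lucene 2.1 only write.lock remains.)</p>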
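<p>To make the idea of a file-based lock concrete, here is a rough sketch using only the JDK. It is not Lucene's implementation: the class <code>ToyWriteLock</code> is hypothetical, and the only thing it borrows from the text is the file name write.lock. It relies on <code>Files.createFile</code> failing atomically when the file already exists.</p>

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy lock file: whoever creates the file first holds the lock. */
final class ToyWriteLock implements AutoCloseable {
    private final Path lockFile;

    private ToyWriteLock(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Returns the lock, or null if another holder already created the file. */
    static ToyWriteLock tryObtain(Path indexDir) throws IOException {
        Path lockFile = indexDir.resolve("write.lock");
        try {
            Files.createFile(lockFile); // atomic: fails if the file already exists
            return new ToyWriteLock(lockFile);
        } catch (FileAlreadyExistsException e) {
            return null;
        }
    }

    /** Releasing the lock means deleting the file. */
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(lockFile);
    }
}
```

<p>A second call to <code>tryObtain</code> on the same directory returns null until the first holder calls <code>close()</code>. A leftover lock file after a crash would block later writers, which is the usual weakness of this kind of lock.</p>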
	Tags: <a href="http://sexywp.com/tags/lucene" title="lucene" rel="tag">lucene</a>, <a href="http://sexywp.com/tags/note" title="note" rel="tag">note</a><br />
]]></content:encoded>
			<wfw:commentRss>http://sexywp.com/lucene-note-05.htm/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lucene Notes 04</title>
		<link>http://sexywp.com/lucene-note-04.htm</link>
		<comments>http://sexywp.com/lucene-note-04.htm#comments</comments>
		<pubDate>Mon, 13 Apr 2009 16:34:17 +0000</pubDate>
		<dc:creator>Charles</dc:creator>
				<category><![CDATA[Work]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[note]]></category>

		<guid isPermaLink="false">http://sexywp.com/lucene-note-04.htm</guid>
		<description><![CDATA[This post covers in detail the principles, background, configuration parameters, and basic hands-on operations of building an index with Lucene.]]></description>
			<content:encoded><![CDATA[</p>
<p> <span id="more-338"></span>
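<p>Building an index with Lucene involves three main steps.</p>
<p><strong>Extract the text.</strong> Lucene can only index plain text, so any material to be indexed must first be filtered down to plain text. For Word and PDF files, that means using the relevant APIs to pull the text out; for XML and HTML, it means stripping all the tags.</p>
<p><strong>Analyze the text.</strong> To build an index, the text is first broken into pieces, usually words, though sometimes phrases or sentences. The pieces may then be normalized to maximize the chance of a match, for example by lowercasing everything so that searches can ignore case. For alphabetic languages there is one more step, reducing words to their base form. In the languages I know a little about, such as English, German and French, words generally inflect for number, case and tense, and the inflected forms of one word should be treated as the same word rather than as different words. Chinese has no inflection, which makes this part easy, but it has a different difficulty: its smallest unit of meaning is the word, not the character, so Chinese needs word segmentation. English words are separated by spaces, which makes splitting them far simpler. Apart from these steps there is one common to all languages: removing stop words, roughly speaking words that carry no meaning, which generally means function words such as numerals, measure words, particles, prepositions and pronouns.</p>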
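<p>The analysis step can be sketched in a few lines of plain Java. This is a toy, not a Lucene Analyzer: the class <code>ToyAnalyzer</code> and its tiny stop-word list are assumptions made up for the example, and it skips both stemming and Chinese word segmentation.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy analyzer: split on non-letter/digit characters, lowercase, drop stop words. */
final class ToyAnalyzer {
    // A tiny, hypothetical stop-word list chosen for this example.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "is", "to"));

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            // split() can yield an empty first element when the text starts with a separator
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}
```

<p>For example, <code>ToyAnalyzer.analyze("The Art of Searching")</code> yields the two tokens <code>art</code> and <code>searching</code>.</p>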
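<p><strong>Write the index to disk.</strong> Lucene writes the analyzed text to disk in a data structure called an inverted index. The inverted index exists purely to make searching convenient. If a "forward index" answers the question "which keywords does this document contain?", the inverted index answers the opposite one: "which documents contain keyword X?". The inverted index is the core structure of all of today's mainstream search engines; what sets them apart is the distinctive parameters they attach when building it, such as Google's famous PageRank. Those parameters determine the final ordering of the search results.</p>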
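<p>Here is a minimal in-memory sketch of an inverted index, assuming whitespace tokenization and integer document ids. The class <code>ToyInvertedIndex</code> is hypothetical and says nothing about Lucene's on-disk format; it only shows the term-to-documents mapping.</p>

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Toy inverted index: maps each term to the sorted ids of the documents containing it. */
final class ToyInvertedIndex {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    void add(int docId, String text) {
        // Whitespace splitting stands in for a real analysis step.
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Answers "which documents contain this term?" */
    SortedSet<Integer> search(String term) {
        SortedSet<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        if (hits == null) {
            return Collections.emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(hits);
    }
}
```

<p>A forward index would be the reverse map, from document id to its terms; the lookup in <code>search</code> is cheap precisely because the map is keyed by term.</p>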
<p>&#160;</p>
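<p>In <a title="Lucene Notes 02" href="http://sexywp.com/lucene-note-02.htm">Note 02</a> I listed the basic building blocks of indexing without studying them in detail. Here is a look at the Lucene index from a logical point of view. Logically the index is a single whole, but its internal structure is complex; indexes can be merged and entries deleted easily, and concurrent access is handled well. Inside the index, the smallest unit is the field (Field), and a collection of fields is a document (Document). This document is not the same thing as the "document" we are indexing; it is a form of organization internal to the Lucene index. As I understand it, a group of related fields gathered together is a Lucene Document. For example, a Lucene Document might contain five fields: title, author, date, keywords and content. The number of fields is decided when the index is built, and you can add as many as you need, such as a URL or category information. In fact, once the index is built, what a search returns is Lucene Documents, not the original files we indexed. These Documents are tied to the real files through the information in their fields, just as every result in Google carries a hyperlink to a page that actually lives somewhere on the Internet.</p>
<p>A field is a data structure with a name and a corresponding value, so it can be seen as a key-value pair. The value can be ordinary text, such as the full body of a web page; when a Field is constructed, such a value is wrapped in a java.lang.String or a java.io.Reader, and at indexing time the field goes through the second step above, text analysis. The value can also be an atomic keyword, which is not analyzed at indexing time; such values typically identify a file's date, URL and so on. Whether a field is stored in the index is optional; stored fields are included in the search results.</p>
<p>Depending on whether a field is analyzed, whether it is indexed, and whether it is stored, there should be eight kinds of field in total (2 × 2 × 2). Some of these are meaningless, though: a field that is analyzed but not indexed, for instance, does not really exist. So only four kinds are in common use.</p>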
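<p><strong>Keyword</strong>: not analyzed, but indexed and stored. The integrity of such values needs to be protected: URLs, phone numbers, dates, personal names and so on.</p>
<p><strong>UnIndexed</strong>: neither analyzed nor indexed, but stored. This kind is generally used for content you want to show in search results but do not usually need to search on. Personally I have doubts about it; I can hardly think of information that fits. With search engines where they are today, we might type anything into the Google search box. I once read an article that said, "...you might not imagine how high a proportion of people type a complete URL straight into the search box when using Google...". It is true: users like my dad set Google as their home page and ignore the browser's address bar entirely. But I digress. If such a field has to exist, I suppose it would hold information describing attributes of the search result. Say you search for "Charles" and the results include "Gender: male", "Species: human", "Residence: Earth". That kind of information would qualify, though it is not a real-life example. Heh, if search engines were really that scary, some people would be in trouble.</p>
<p><strong>UnStored</strong>: the exact opposite of UnIndexed. It is analyzed and indexed, but not stored. Examples are the full text of a book or the full body of a web page.</p>
<p><strong>Text</strong>: analyzed and indexed, and, when the value is plain text (a String rather than a Reader), stored as well. An example I can think of is the abstract of a paper: such a passage is dense with keywords and summarizes the whole document almost completely.</p>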
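<p>The four kinds can be written down as three boolean flags each. The enum below is only an illustration in plain Java; <code>LegacyFieldKind</code> and its members are names made up for this sketch, not Lucene types, and the Text entry assumes the String-valued case, which is the one that gets stored.</p>

```java
/** The four legacy field kinds described above, expressed as three boolean flags. */
enum LegacyFieldKind {
    //        analyzed, indexed, stored
    KEYWORD(false, true, true),
    UNINDEXED(false, false, true),
    UNSTORED(true, true, false),
    TEXT(true, true, true); // assumes a String value, which is stored

    final boolean analyzed;
    final boolean indexed;
    final boolean stored;

    LegacyFieldKind(boolean analyzed, boolean indexed, boolean stored) {
        this.analyzed = analyzed;
        this.indexed = indexed;
        this.stored = stored;
    }

    /** A field can be found by a query only if it is indexed. */
    boolean searchable() {
        return indexed;
    }

    /** A field value can be shown in results only if it is stored. */
    boolean retrievable() {
        return stored;
    }
}
```

<p>Read this way, UnIndexed is the only kind that is retrievable but not searchable, and UnStored is the only one that is searchable but not retrievable.</p>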
<p>&#160;</p>
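<p>In early Lucene versions, each of the four kinds above had a matching factory on Field, and you built a field simply by using the corresponding name. The latest version uses a more uniform form: there is only the one Field class, and parameters describe the field's properties. By combining parameters you can produce every kind, and in theory even the four kinds that "do not exist".</p>
<p>The current Field constructor has the following prototype:</p>
<pre><span style="color: #0000ff">public</span> Field(String name, String value, Store store, Index index) </pre>
<p>With the analysis above, this constructor should be quite easy to understand. I made a very basic mistake at first precisely because I had not understood it thoroughly. The table below summarizes the content above in terms of the new constructor's parameters.</p>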
<table cellspacing="0" cellpadding="2" width="600" border="1">
<tbody>
<tr>
<td valign="top" width="150">Columns: Store; rows: Index</td>
<td valign="top" width="150">YES</td>
<td valign="top" width="150">NO</td>
<td valign="top" width="150">COMPRESS</td>
</tr>
<tr>
<td valign="top" width="150">NO</td>
<td valign="top" width="150">Stored but not indexed, and therefore not analyzed. Such a field cannot be searched, but it appears in search results.</td>
<td valign="top" width="150">Meaningless. Throws an IllegalArgumentException.</td>
<td valign="top" width="150">Essentially the same as YES, plus compression. Use it for long text and binary fields. Costs more computation.</td>
</tr>
<tr>
<td valign="top" width="150">ANALYZED</td>
<td valign="top" width="150">Analyzed, indexed, stored.</td>
<td valign="top" width="150">Analyzed, indexed, not stored.</td>
<td valign="top" width="150">Analyzed, indexed, stored compressed.</td>
</tr>
<tr>
<td valign="top" width="150">NOT_ANALYZED</td>
<td valign="top" width="150">Not analyzed, but indexed and stored.</td>
<td valign="top" width="150">Not analyzed, indexed as-is, not stored.</td>
<td valign="top" width="150">Not analyzed, but indexed and stored compressed.</td>
</tr>
<tr>
<td valign="top" width="150">NOT_ANALYZED_NO_NORMS</td>
<td valign="top" width="150">(advanced)</td>
<td valign="top" width="150">(advanced)</td>
<td valign="top" width="150">(advanced)</td>
</tr>
<tr>
<td valign="top" width="150">ANALYZED_NO_NORMS</td>
<td valign="top" width="150">(advanced)</td>
<td valign="top" width="150">(advanced)</td>
<td valign="top" width="150">(advanced)</td>
</tr>
</tbody>
</table>
<p>&#160;</p>
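<p>Heh, in fact Lucene has as many as fourteen kinds of field internally, as the table above describes: three Store values times five Index values, minus the one illegal combination. The table shows that a real, production-capable system is far more complex and fine-grained than the theoretical model and has to consider many more details.</p>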
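<p>The count of fourteen can be checked by enumerating the table. The two enums below simply mirror the row and column names of the table; they are local stand-ins defined for this sketch, not the real Lucene classes, and the legality rule is the one the table states (not stored and not indexed is rejected).</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Enumerates Store x Index combinations and filters out the one the table marks as illegal. */
final class FieldOptionCombos {
    enum Store { YES, NO, COMPRESS }
    enum Index { NO, ANALYZED, NOT_ANALYZED, NOT_ANALYZED_NO_NORMS, ANALYZED_NO_NORMS }

    /** A field that is neither stored nor indexed is rejected, as in the table. */
    static boolean isLegal(Store store, Index index) {
        return !(store == Store.NO && index == Index.NO);
    }

    static List<String> legalCombos() {
        List<String> combos = new ArrayList<>();
        for (Store store : Store.values()) {
            for (Index index : Index.values()) {
                if (isLegal(store, index)) {
                    combos.add(store + "/" + index);
                }
            }
        }
        return combos;
    }
}
```

<p>Three Store values times five Index values gives fifteen pairs, and <code>legalCombos()</code> keeps fourteen of them.</p>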
<p>&#160;</p>
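<p>A Field can actually be constructed with one more optional parameter, TermVector. Taken literally, a term vector is a vector of keywords. Put simply, it is the list of terms actually indexed for that field, together with information attached to each term, such as its position (which token it is) or its offset (the character offset in the text, which helps with highlighting). This implies that a field which is not indexed cannot be given a TermVector parameter. TermVector has five values:</p>
<p>NO: the default. Nothing is kept.</p>
<p>YES: keeps each term and the number of times it occurs.</p>
<p>WITH_POSITIONS: additionally keeps each term's positions.</p>
<p>WITH_OFFSETS: additionally keeps each term's offsets.</p>
<p>WITH_POSITIONS_OFFSETS: keeps both positions and offsets.</p>
<p>The TermVector option is set per Field; as far as I can tell, the vectors are then kept per document, one for each field that enables them.</p>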
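<p>To show what positions and offsets mean, here is a toy term vector for a single field value, written with the JDK only. The class <code>ToyTermVector</code> is hypothetical and assumes whitespace tokenization; it roughly corresponds to the information the WITH_POSITIONS_OFFSETS setting is described as keeping, restricted to start offsets.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy term vector for one field value: per term, its token positions and start offsets. */
final class ToyTermVector {
    final Map<String, List<Integer>> positions = new LinkedHashMap<>();
    final Map<String, List<Integer>> startOffsets = new LinkedHashMap<>();

    ToyTermVector(String fieldValue) {
        Matcher m = Pattern.compile("\\S+").matcher(fieldValue);
        int position = 0; // index of the token within the field
        while (m.find()) {
            String term = m.group().toLowerCase(Locale.ROOT);
            positions.computeIfAbsent(term, t -> new ArrayList<>()).add(position);
            // m.start() is a character offset into the original string
            startOffsets.computeIfAbsent(term, t -> new ArrayList<>()).add(m.start());
            position++;
        }
    }

    /** Term frequency is simply the number of recorded positions. */
    int frequency(String term) {
        List<Integer> p = positions.get(term);
        return p == null ? 0 : p.size();
    }
}
```

<p>For the value <code>"to be or not to be"</code>, the term <code>to</code> has positions 0 and 4 and start offsets 0 and 13. A highlighter can use the offsets to mark the matching characters without analyzing the text again.</p>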
	Tags: <a href="http://sexywp.com/tags/lucene" title="lucene" rel="tag">lucene</a>, <a href="http://sexywp.com/tags/note" title="note" rel="tag">note</a><br />
]]></content:encoded>
			<wfw:commentRss>http://sexywp.com/lucene-note-04.htm/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lucene Notes 03</title>
		<link>http://sexywp.com/lucene-note-03.htm</link>
		<comments>http://sexywp.com/lucene-note-03.htm#comments</comments>
		<pubDate>Thu, 18 Dec 2008 01:40:03 +0000</pubDate>
		<dc:creator>Charles</dc:creator>
				<category><![CDATA[Work]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[note]]></category>

		<guid isPermaLink="false">http://www.charlestang.cn/?p=280</guid>
		<description><![CDATA[To carry out the most basic search, Lucene needs the following objects working together:

IndexSearcher: this object searches the index files produced by IndexWriter, so it is constructed with a Directory object that holds the index location. IndexSearcher provides read-only access to the index and offers several search methods. The search method used in the code from my first note takes a Query object and a HitCollector object and returns nothing. The search results are filled into the HitCollector.

<span class="readmore"><a href="http://sexywp.com/lucene-note-03.htm" title="Lucene Notes 03">Keep Reading --- 618 words totally</a></span>]]></description>
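			<content:encoded><![CDATA[<p><span id="more-280"></span>To carry out the most basic search, Lucene needs the following objects working together:</p>
<ul>
<li>IndexSearcher: this object searches the index files produced by IndexWriter, so it is constructed with a Directory object that holds the index location. IndexSearcher provides read-only access to the index and offers several search methods. The search method used in the code from my first note takes a Query object and a HitCollector object and returns nothing. The search results are filled into the HitCollector.</li>
<li>Term: an object similar to Field, holding a name and value pair. So far I have not come across it in my code; the book says it is used in both indexing and searching, but I have not actually seen it.</li>
<li>Query: an abstract class with many implementations inside Lucene. The book says the most basic Query is TermQuery, but looking at the internals, the code from Note 1 actually ends up using BooleanQuery rather than TermQuery.</li>
<li>TermQuery: the most basic Query, as mentioned above. It matches documents that contain a specific value in a specific field. I have not run into it yet either.</li>
<li>Hits: this was meant to be a simple container for the ranked search results, but it no longer appears in the code from Note 1, because Lucene has deprecated it. What is used now is HitCollector, which seems to be a more advanced container; in the code we take an array of Documents out of it, and its elements are exactly the search results.</li>
<li>QueryParser: the book does not mention this object, but I think it is indispensable. In essence it turns a string into a Query object. Its design must be quite complex, because search engines usually offer a rich query syntax, and Lucene is no exception. An Analyzer can also be specified when constructing a QueryParser.</li>
</ul>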
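<p>How these pieces fit together can be sketched without Lucene. In the toy below, <code>parse</code> plays the role of a query parser, each parsed term acts like a single-term query, and <code>search</code> combines them the way a boolean query with all clauses required would. The class <code>ToySearcher</code>, its methods, and the choice of AND semantics are all assumptions of this example, not Lucene's behavior.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy search: a query string becomes a list of terms that must all match (boolean AND). */
final class ToySearcher {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    void index(int docId, String text) {
        for (String term : parse(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    /** Stand-in for a query parser: lowercase and split on whitespace. */
    static List<String> parse(String query) {
        List<String> terms = new ArrayList<>();
        for (String t : query.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Returns the ids of documents containing every term of the query, in ascending order. */
    Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : parse(query)) {
            Set<Integer> docs = postings.get(term);
            if (docs == null) {
                return new TreeSet<>(); // one term matches nothing, so the AND is empty
            }
            if (result == null) {
                result = new TreeSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        if (result == null) {
            return new TreeSet<>(); // empty query
        }
        return result;
    }
}
```

<p>The returned set stands in for the container of hits; it carries only document ids, with no scoring or ranking.</p>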
	Tags: <a href="http://sexywp.com/tags/lucene" title="lucene" rel="tag">lucene</a>, <a href="http://sexywp.com/tags/note" title="note" rel="tag">note</a><br />
]]></content:encoded>
			<wfw:commentRss>http://sexywp.com/lucene-note-03.htm/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Lucene Notes 02</title>
		<link>http://sexywp.com/lucene-note-02.htm</link>
		<comments>http://sexywp.com/lucene-note-02.htm#comments</comments>
		<pubDate>Wed, 17 Dec 2008 06:15:29 +0000</pubDate>
		<dc:creator>Charles</dc:creator>
				<category><![CDATA[Work]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[note]]></category>

		<guid isPermaLink="false">http://www.charlestang.cn/?p=278</guid>
		<description><![CDATA[To carry out the most basic indexing process, Lucene needs the following objects working together:

<span class="readmore"><a href="http://sexywp.com/lucene-note-02.htm" title="Lucene Notes 02">Keep Reading --- 458 words totally</a></span>]]></description>
			<content:encoded><![CDATA[
<p> <span id="more-278"></span>To complete the most basic indexing process, Lucene needs the following objects to work together:
</p>
<ul>
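<li>IndexWriter: the most important component Lucene uses internally to create an index. It can create a new index, or build an index incrementally from documents. </li>
<li>Directory: an abstract class that represents the location where the index is stored. Lucene ships two implementations, FSDirectory and RAMDirectory, and the names speak for themselves (file system and memory). Directory may also provide a locking mechanism internally, so that indexing and searching can proceed at the same time. </li>
<li>Analyzer: another abstract class, and one of the building blocks of IndexWriter. It analyzes text, which includes tokenization, removing stop words, and so on. Choosing or creating the right analyzer is crucial when building a project. </li>
<li>Document: the object Lucene processes. A Document is a collection of Fields. </li>
<li>Field: in the index Lucene builds, every Document contains one or more named fields, each wrapped in the Field class. Field comes in several types: Keyword, UnIndexed, UnStored, and Text. </li>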
</ul>
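<p>According to the book, even the simplest indexing process has to use all of these classes. Yet in the code I posted in the previous note, Directory does not seem to be mentioned directly anywhere in the Indexer code. I stepped into the IndexWriter constructor to check, and it is in fact used: if we construct an IndexWriter and pass it only a path instead of a Directory, it falls back to an FSDirectory object by default, which is a Directory that uses a simple locking mechanism.</p>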
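<p>To make the Analyzer and Document/Field roles from the list above more concrete, here is a minimal sketch in plain Java. It is only a toy model and does not use Lucene: the class name ToyIndexingModel, the stop word list, and the field names "path" and "contents" are all assumptions made up for this illustration, and a real analyzer does considerably more.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy stand-ins for the Analyzer and Document/Field roles. These are NOT Lucene classes. */
public class ToyIndexingModel {

    // Hypothetical stop word list, chosen only for this illustration.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the", "is", "of"));

    /** Analyzer role: lowercase the text, split on non-alphanumerics, drop stop words. */
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.toLowerCase().split("[^a-z0-9]+")) {
            // split() can yield an empty first element when the text starts with a separator
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    /** Document role: an ordered collection of named fields (name mapped to value). */
    public static Map<String, String> newDocument(String path, String contents) {
        Map<String, String> doc = new LinkedHashMap<String, String>();
        doc.put("path", path);         // kept verbatim, in the spirit of a keyword field
        doc.put("contents", contents); // free text that would be passed through the analyzer
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> doc = newDocument("notes/a.txt", "Lucene is a search library");
        System.out.println(doc.keySet());                 // the field names of the document
        System.out.println(analyze(doc.get("contents"))); // the tokens that would be indexed
    }
}
```

<p>In this sketch the "path" value is kept whole while the "contents" value is broken into tokens, which is roughly the distinction between a keyword-style field and a text-style field described in the list.</p>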
	Tags: <a href="http://sexywp.com/tags/lucene" title="lucene" rel="tag">lucene</a>, <a href="http://sexywp.com/tags/note" title="note" rel="tag">note</a><br />
]]></content:encoded>
			<wfw:commentRss>http://sexywp.com/lucene-note-02.htm/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lucene Notes 01</title>
		<link>http://sexywp.com/lucene-note-01.htm</link>
		<comments>http://sexywp.com/lucene-note-01.htm#comments</comments>
		<pubDate>Wed, 10 Dec 2008 08:24:30 +0000</pubDate>
		<dc:creator>Charles</dc:creator>
				<category><![CDATA[Diary]]></category>
		<category><![CDATA[code examples]]></category>
		<category><![CDATA[lucene]]></category>
		<category><![CDATA[note]]></category>

		<guid isPermaLink="false">http://www.charlestang.cn/?p=272</guid>
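		<description><![CDATA[Because of a project requirement I have started learning Lucene. The only material I have on hand is one book, Lucene in Action. Unfortunately the book is already quite dated, so I have decided to verify things as I read, mainly by checking them against the source code. There is no way around that, so I will keep a few short notes on the blog as I go.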



From the very beginning

<span class="readmore"><a href="http://sexywp.com/lucene-note-01.htm" title="Lucene笔记01">Keep Reading --- 455 words totally</a></span>]]></description>
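			<content:encoded><![CDATA[<p>Because of a project requirement I have started learning Lucene. The only material I have on hand is one book, <a href="http://sexywp.com/tags/lucene" class="st_tag internal_tag" rel="tag" title="Posts tagged lucene">Lucene</a> in Action. Unfortunately the book is already quite dated, so I have decided to verify things as I read, mainly by checking them against the source code. There is no way around that, so I will keep a few short notes on the blog as I go.<br />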
<span id="more-272"></span></p>
<h3>From the very beginning</h3>
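<p>Chapter 1 contains two small examples that show what Lucene can do. Unfortunately both are written against Lucene 1.x. Lucene has since reached version 2.4.0 and is heading toward 3.0, and the structure of many things has changed.</p>
<p>I typed in both examples exactly as printed and found that they would not run at all. Based on my own understanding I changed a few things, and I am recording the result below:</p>
<p>Sample code for the Indexer:</p>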
<div class="hl-surround"><div class="hl-main"><ol class="hl-main ln-show" title="Double click to hide line number." ondblclick = "linenumber(this)"><li>&nbsp;<span style="color: Green;">package</span><span style="color: Gray;"> </span><span style="color: Blue;">MainTest</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">java</span><span style="color: Gray;">.</span><span style="color: Blue;">io</span><span style="color: Gray;">.</span><span style="color: Blue;">File</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">java</span><span style="color: Gray;">.</span><span style="color: Blue;">io</span><span style="color: Gray;">.</span><span style="color: Blue;">FileReader</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">java</span><span style="color: Gray;">.</span><span style="color: Blue;">io</span><span style="color: Gray;">.</span><span style="color: Blue;">IOException</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">java</span><span style="color: Gray;">.</span><span style="color: Blue;">util</span><span style="color: Gray;">.</span><span style="color: Blue;">Date</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">analysis</span><span style="color: Gray;">.</span><span style="color: Blue;">standard</span><span style="color: Gray;">.</span><span style="color: Blue;">StandardAnalyzer</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">document</span><span style="color: Gray;">.</span><span style="color: Blue;">Document</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">document</span><span style="color: Gray;">.</span><span style="color: Blue;">Field</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">index</span><span style="color: Gray;">.</span><span style="color: Blue;">IndexWriter</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">index</span><span style="color: Gray;">.</span><span style="color: Blue;">IndexWriter</span><span style="color: Gray;">.</span><span style="color: Blue;">MaxFieldLength</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">public</span><span style="color: Gray;">&nbsp;</span><span style="color: Green;">class</span><span style="color: Gray;"> </span><span style="color: Blue;">Indexer</span><span style="color: Gray;"> </span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: #ffa500;">/*</span><span style="color: #ffa500;">*</span></li>
<li><span style="color: #ffa500;">&nbsp;&nbsp; &nbsp; &nbsp;*</span><span style="color: Blue;"> @param </span><span style="color: #ffa500;">args</span></li>
<li><span style="color: #ffa500;">&nbsp;&nbsp; &nbsp; &nbsp;*</span><span style="color: Blue;"> @throws </span><span style="color: #ffa500;">Exception </span></li>
<li><span style="color: #ffa500;">&nbsp;&nbsp; &nbsp; &nbsp;</span><span style="color: #ffa500;">*/</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Green;">public</span><span style="color: Gray;">&nbsp;</span><span >static</span><span style="color: Gray;"> </span><span >void</span><span style="color: Gray;"> </span><span style="color: Blue;">main</span><span style="color: Olive;">(</span><span style="color: Blue;">String</span><span style="color: Olive;">[</span><span style="color: Olive;">]</span><span style="color: Gray;"> </span><span style="color: Blue;">args</span><span style="color: Olive;">)</span><span style="color: Gray;"> </span><span style="color: Green;">throws</span><span style="color: Gray;"> </span><span style="color: Blue;">Exception</span><span style="color: Gray;"> </span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">if</span><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">(</span><span style="color: Blue;">args</span><span style="color: Gray;">.</span><span style="color: Blue;">length</span><span style="color: Gray;"> != </span><span style="color: Maroon;">2</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throw</span><span style="color: Gray;">&nbsp;</span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Exception</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">Usage: java </span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;"> + </span><span style="color: Blue;">Indexer</span><span style="color: Gray;">.</span><span style="color: Green;">class</span><span style="color: Gray;">.</span><span style="color: Blue;">getName</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">+ </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> &lt;index dir&gt; &lt;data dir&gt;</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">File</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">indexDir</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">File</span><span style="color: Olive;">(</span><span style="color: Blue;">args</span><span style="color: Olive;">[</span><span style="color: Maroon;">0</span><span style="color: Olive;">]</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">File</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">dataDir</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">File</span><span style="color: Olive;">(</span><span style="color: Blue;">args</span><span style="color: Olive;">[</span><span style="color: Maroon;">1</span><span style="color: Olive;">]</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span >long</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">start</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Date</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">.</span><span style="color: Blue;">getTime</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span >int</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">numIndexed</span><span style="color: Gray;"> = </span><span style="color: Blue;">index</span><span style="color: Olive;">(</span><span style="color: Blue;">indexDir</span><span style="color: Gray;">, </span><span style="color: Blue;">dataDir</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span >long</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">end</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Date</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">.</span><span style="color: Blue;">getTime</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">System</span><span style="color: Gray;">.</span><span style="color: Blue;">out</span><span style="color: Gray;">.</span><span style="color: Blue;">println</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">Indexing </span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;">+</span><span style="color: Blue;">numIndexed</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> files took </span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;"> + </span><span style="color: Olive;">(</span><span style="color: Blue;">end</span><span style="color: Gray;"> - </span><span style="color: Blue;">start</span><span style="color: Olive;">)</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> milliseconds</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Green;">public</span><span style="color: Gray;">&nbsp;</span><span >static</span><span style="color: Gray;"> </span><span >int</span><span style="color: Gray;"> </span><span style="color: Blue;">index</span><span style="color: Olive;">(</span><span style="color: Blue;">File</span><span style="color: Gray;"> </span><span style="color: Blue;">indexDir</span><span style="color: Gray;">, </span><span style="color: Blue;">File</span><span style="color: Gray;"> </span><span style="color: Blue;">dataDir</span><span style="color: Olive;">)</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throws</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">IOException</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">if</span><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">(</span><span style="color: Gray;">!</span><span style="color: Blue;">dataDir</span><span style="color: Gray;">.</span><span style="color: Blue;">exists</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;"> || !</span><span style="color: Blue;">dataDir</span><span style="color: Gray;">.</span><span style="color: Blue;">isDirectory</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throw</span><span style="color: Gray;">&nbsp;</span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">IOException</span><span style="color: Olive;">(</span><span style="color: Blue;">dataDir</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> does not exist or is not a directory</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">IndexWriter</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">writer</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">IndexWriter</span><span style="color: Olive;">(</span><span style="color: Blue;">indexDir</span><span style="color: Gray;">, </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">StandardAnalyzer</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">, </span><span style="color: Green;">true</span><span style="color: Gray;">, </span><span style="color: Blue;">MaxFieldLength</span><span style="color: Gray;">.</span><span style="color: Blue;">LIMITED</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">indexDirectory</span><span style="color: Olive;">(</span><span style="color: Blue;">writer</span><span style="color: Gray;">, </span><span style="color: Blue;">dataDir</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span >int</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">numIndexed</span><span style="color: Gray;"> = </span><span style="color: Blue;">writer</span><span style="color: Gray;">.</span><span style="color: Blue;">maxDoc</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">writer</span><span style="color: Gray;">.</span><span style="color: Blue;">optimize</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">writer</span><span style="color: Gray;">.</span><span style="color: Blue;">close</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">return</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">numIndexed</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Green;">private</span><span style="color: Gray;">&nbsp;</span><span >static</span><span style="color: Gray;"> </span><span >void</span><span style="color: Gray;"> </span><span style="color: Blue;">indexDirectory</span><span style="color: Olive;">(</span><span style="color: Blue;">IndexWriter</span><span style="color: Gray;"> </span><span style="color: Blue;">writer</span><span style="color: Gray;">, </span><span style="color: Blue;">File</span><span style="color: Gray;"> </span><span style="color: Blue;">dir</span><span style="color: Olive;">)</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throws</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">IOException</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">File</span><span style="color: Olive;">[</span><span style="color: Olive;">]</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">files</span><span style="color: Gray;"> = </span><span style="color: Blue;">dir</span><span style="color: Gray;">.</span><span style="color: Blue;">listFiles</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">for</span><span style="color: Olive;">(</span><span >int</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">i</span><span style="color: Gray;"> = </span><span style="color: Maroon;">0</span><span style="color: Gray;">; </span><span style="color: Blue;">i</span><span style="color: Gray;">&lt; </span><span style="color: Blue;">files</span><span style="color: Gray;">.</span><span style="color: Blue;">length</span><span style="color: Gray;">; </span><span style="color: Blue;">i</span><span style="color: Gray;">++</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">File</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">f</span><span style="color: Gray;"> = </span><span style="color: Blue;">files</span><span style="color: Olive;">[</span><span style="color: Blue;">i</span><span style="color: Olive;">]</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">if</span><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">(</span><span style="color: Blue;">f</span><span style="color: Gray;">.</span><span style="color: Blue;">isDirectory</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">indexDirectory</span><span style="color: Olive;">(</span><span style="color: Blue;">writer</span><span style="color: Gray;">,</span><span style="color: Blue;">f</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span><span style="color: Green;">else</span><span style="color: Gray;">&nbsp;</span><span style="color: Green;">if</span><span style="color: Gray;"> </span><span style="color: Olive;">(</span><span style="color: Blue;">f</span><span style="color: Gray;">.</span><span style="color: Blue;">getName</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">.</span><span style="color: Blue;">endsWith</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">.txt</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">indexFile</span><span style="color: Olive;">(</span><span style="color: Blue;">writer</span><span style="color: Gray;">,</span><span style="color: Blue;">f</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Green;">private</span><span style="color: Gray;">&nbsp;</span><span >static</span><span style="color: Gray;"> </span><span >void</span><span style="color: Gray;"> </span><span style="color: Blue;">indexFile</span><span style="color: Olive;">(</span><span style="color: Blue;">IndexWriter</span><span style="color: Gray;"> </span><span style="color: Blue;">writer</span><span style="color: Gray;">, </span><span style="color: Blue;">File</span><span style="color: Gray;"> </span><span style="color: Blue;">f</span><span style="color: Olive;">)</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throws</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">IOException</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">if</span><span style="color: Olive;">(</span><span style="color: Blue;">f</span><span style="color: Gray;">.</span><span style="color: Blue;">isHidden</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;"> || !</span><span style="color: Blue;">f</span><span style="color: Gray;">.</span><span style="color: Blue;">exists</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;"> || !</span><span style="color: Blue;">f</span><span style="color: Gray;">.</span><span style="color: Blue;">canRead</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">return</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">System</span><span style="color: Gray;">.</span><span style="color: Blue;">out</span><span style="color: Gray;">.</span><span style="color: Blue;">println</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">Indexing </span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;"> + </span><span style="color: Blue;">f</span><span style="color: Gray;">.</span><span style="color: Blue;">getCanonicalPath</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">Document</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">doc</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Document</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: #ffa500;">//</span><span style="color: #ffa500;">doc.add(Field.Text(&quot;contents&quot;, new FileReader(f)));&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: #ffa500;">//</span><span style="color: #ffa500;">doc.add(Field.Keyword(&quot;filename&quot;, f.getCanonicalPath()));</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: #ffa500;">//</span><span style="color: #ffa500;"> contents: tokenized and indexed from the Reader, but not stored</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">Field</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">contents</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Field</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">contents</span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;">,</span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">FileReader</span><span style="color: Olive;">(</span><span style="color: Blue;">f</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: #ffa500;">//</span><span style="color: #ffa500;"> filename: stored so search results can print it, but not indexed</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">Field</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">filename</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Field</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">filename</span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;">, </span><span style="color: Blue;">f</span><span style="color: Gray;">.</span><span style="color: Blue;">getCanonicalPath</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">, </span><span style="color: Blue;">Field</span><span style="color: Gray;">.</span><span style="color: Blue;">Store</span><span style="color: Gray;">.</span><span style="color: Blue;">YES</span><span style="color: Gray;">, </span><span style="color: Blue;">Field</span><span style="color: Gray;">.</span><span style="color: Blue;">Index</span><span style="color: Gray;">.</span><span style="color: Blue;">NO</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">doc</span><span style="color: Gray;">.</span><span style="color: Blue;">add</span><span style="color: Olive;">(</span><span style="color: Blue;">contents</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">doc</span><span style="color: Gray;">.</span><span style="color: Blue;">add</span><span style="color: Olive;">(</span><span style="color: Blue;">filename</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">writer</span><span style="color: Gray;">.</span><span style="color: Blue;">addDocument</span><span style="color: Olive;">(</span><span style="color: Blue;">doc</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">}</span></li></ol></div></div>
<p>Sample code for the Searcher. Quite a lot seems to have changed in this part:</p>
<div class="hl-surround"><div class="hl-main"><ol class="hl-main ln-show" title="Double click to hide line number." ondblclick = "linenumber(this)"><li>&nbsp;<span style="color: Green;">package</span><span style="color: Gray;"> </span><span style="color: Blue;">MainTest</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">java</span><span style="color: Gray;">.</span><span style="color: Blue;">io</span><span style="color: Gray;">.</span><span style="color: Blue;">File</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">java</span><span style="color: Gray;">.</span><span style="color: Blue;">util</span><span style="color: Gray;">.</span><span style="color: Blue;">Date</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">analysis</span><span style="color: Gray;">.</span><span style="color: Blue;">standard</span><span style="color: Gray;">.</span><span style="color: Blue;">StandardAnalyzer</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">document</span><span style="color: Gray;">.</span><span style="color: Blue;">Document</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">queryParser</span><span style="color: Gray;">.</span><span style="color: Blue;">QueryParser</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">search</span><span style="color: Gray;">.</span><span style="color: Blue;">IndexSearcher</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">search</span><span style="color: Gray;">.</span><span style="color: Blue;">Query</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">search</span><span style="color: Gray;">.</span><span style="color: Blue;">ScoreDoc</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">search</span><span style="color: Gray;">.</span><span style="color: Blue;">TopDocCollector</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">store</span><span style="color: Gray;">.</span><span style="color: Blue;">Directory</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">import</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">org</span><span style="color: Gray;">.</span><span style="color: Blue;">apache</span><span style="color: Gray;">.</span><span style="color: Blue;">lucene</span><span style="color: Gray;">.</span><span style="color: Blue;">store</span><span style="color: Gray;">.</span><span style="color: Blue;">FSDirectory</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Green;">public</span><span style="color: Gray;">&nbsp;</span><span style="color: Green;">class</span><span style="color: Gray;"> </span><span style="color: Blue;">Searcher</span><span style="color: Gray;"> </span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: #ffa500;">/*</span><span style="color: #ffa500;">*</span></li>
<li><span style="color: #ffa500;">&nbsp;&nbsp; &nbsp; &nbsp;*</span><span style="color: Blue;"> @param </span><span style="color: #ffa500;">args</span></li>
<li><span style="color: #ffa500;">&nbsp;&nbsp; &nbsp; &nbsp;*</span><span style="color: Blue;"> @throws </span><span style="color: #ffa500;">Exception </span></li>
<li><span style="color: #ffa500;">&nbsp;&nbsp; &nbsp; &nbsp;</span><span style="color: #ffa500;">*/</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Green;">public</span><span style="color: Gray;">&nbsp;</span><span >static</span><span style="color: Gray;"> </span><span >void</span><span style="color: Gray;"> </span><span style="color: Blue;">main</span><span style="color: Olive;">(</span><span style="color: Blue;">String</span><span style="color: Olive;">[</span><span style="color: Olive;">]</span><span style="color: Gray;"> </span><span style="color: Blue;">args</span><span style="color: Olive;">)</span><span style="color: Gray;"> </span><span style="color: Green;">throws</span><span style="color: Gray;"> </span><span style="color: Blue;">Exception</span><span style="color: Gray;"> </span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">if</span><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">(</span><span style="color: Blue;">args</span><span style="color: Gray;">.</span><span style="color: Blue;">length</span><span style="color: Gray;"> != </span><span style="color: Maroon;">2</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throw</span><span style="color: Gray;">&nbsp;</span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Exception</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">Usage: java </span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;">+ </span><span style="color: Blue;">Searcher</span><span style="color: Gray;">.</span><span style="color: Green;">class</span><span style="color: Gray;">.</span><span style="color: Blue;">getName</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> &lt;index dir&gt; &lt;query&gt;</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">File</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">indexDir</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">File</span><span style="color: Olive;">(</span><span style="color: Blue;">args</span><span style="color: Olive;">[</span><span style="color: Maroon;">0</span><span style="color: Olive;">]</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">String</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">q</span><span style="color: Gray;"> = </span><span style="color: Blue;">args</span><span style="color: Olive;">[</span><span style="color: Maroon;">1</span><span style="color: Olive;">]</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">if</span><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">(</span><span style="color: Gray;">!</span><span style="color: Blue;">indexDir</span><span style="color: Gray;">.</span><span style="color: Blue;">exists</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;"> || !</span><span style="color: Blue;">indexDir</span><span style="color: Gray;">.</span><span style="color: Blue;">isDirectory</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throw</span><span style="color: Gray;">&nbsp;</span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Exception</span><span style="color: Olive;">(</span><span style="color: Blue;">indexDir</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> does not exist or is not a directory.</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">search</span><span style="color: Olive;">(</span><span style="color: Blue;">indexDir</span><span style="color: Gray;">, </span><span style="color: Blue;">q</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Green;">public</span><span style="color: Gray;">&nbsp;</span><span >static</span><span style="color: Gray;"> </span><span >void</span><span style="color: Gray;"> </span><span style="color: Blue;">search</span><span style="color: Olive;">(</span><span style="color: Blue;">File</span><span style="color: Gray;"> </span><span style="color: Blue;">indexDir</span><span style="color: Gray;">, </span><span style="color: Blue;">String</span><span style="color: Gray;"> </span><span style="color: Blue;">q</span><span style="color: Olive;">)</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">throws</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">Exception</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">Directory</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">fsDir</span><span style="color: Gray;"> = </span><span style="color: Blue;">FSDirectory</span><span style="color: Gray;">.</span><span style="color: Blue;">getDirectory</span><span style="color: Olive;">(</span><span style="color: Blue;">indexDir</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">IndexSearcher</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">is</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">IndexSearcher</span><span style="color: Olive;">(</span><span style="color: Blue;">fsDir</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">QueryParser</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">qp</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">QueryParser</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">contents</span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;">, </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">StandardAnalyzer</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">Query</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">query</span><span style="color: Gray;"> = </span><span style="color: Blue;">qp</span><span style="color: Gray;">.</span><span style="color: Blue;">parse</span><span style="color: Olive;">(</span><span style="color: Blue;">q</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span >int</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">hitsPerPage</span><span style="color: Gray;"> = </span><span style="color: Maroon;">10</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">TopDocCollector</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">collector</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">TopDocCollector</span><span style="color: Olive;">(</span><span style="color: Blue;">hitsPerPage</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span >long</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">start</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Date</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">.</span><span style="color: Blue;">getTime</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">is</span><span style="color: Gray;">.</span><span style="color: Blue;">search</span><span style="color: Olive;">(</span><span style="color: Blue;">query</span><span style="color: Gray;">, </span><span style="color: Blue;">collector</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">ScoreDoc</span><span style="color: Olive;">[</span><span style="color: Olive;">]</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">hits</span><span style="color: Gray;"> = </span><span style="color: Blue;">collector</span><span style="color: Gray;">.</span><span style="color: Blue;">topDocs</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">.</span><span style="color: Blue;">scoreDocs</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span >long</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">end</span><span style="color: Gray;"> = </span><span style="color: Green;">new</span><span style="color: Gray;"> </span><span style="color: Blue;">Date</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">.</span><span style="color: Blue;">getTime</span><span style="color: Olive;">(</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">System</span><span style="color: Gray;">.</span><span style="color: Blue;">err</span><span style="color: Gray;">.</span><span style="color: Blue;">println</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">Found </span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;"> + </span><span style="color: Blue;">hits</span><span style="color: Gray;">.</span><span style="color: Blue;">length</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> document(s) (in </span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;"> + </span><span style="color: Olive;">(</span><span style="color: Blue;">end</span><span style="color: Gray;"> - </span><span style="color: Blue;">start</span><span style="color: Olive;">)</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;"> milliseconds) that matched query '</span><span style="color: #8b0000;">&quot;</span><span style="color: Gray;"> + </span><span style="color: Blue;">q</span><span style="color: Gray;"> + </span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">':</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Green;">for</span><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">(</span><span >int</span><span style="color: Gray;"> </span><span style="color: Blue;">i</span><span style="color: Gray;"> = </span><span style="color: Maroon;">0</span><span style="color: Gray;">; </span><span style="color: Blue;">i</span><span style="color: Gray;"> &lt; </span><span style="color: Blue;">hits</span><span style="color: Gray;">.</span><span style="color: Blue;">length</span><span style="color: Gray;">; </span><span style="color: Blue;">i</span><span style="color: Gray;">++</span><span style="color: Olive;">)</span><span style="color: Olive;">{</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span >int</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">docId</span><span style="color: Gray;"> = </span><span style="color: Blue;">hits</span><span style="color: Olive;">[</span><span style="color: Blue;">i</span><span style="color: Olive;">]</span><span style="color: Gray;">.</span><span style="color: Blue;">doc</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">Document</span><span style="color: Gray;">&nbsp;</span><span style="color: Blue;">doc</span><span style="color: Gray;"> = </span><span style="color: Blue;">is</span><span style="color: Gray;">.</span><span style="color: Blue;">doc</span><span style="color: Olive;">(</span><span style="color: Blue;">docId</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Blue;">System</span><span style="color: Gray;">.</span><span style="color: Blue;">out</span><span style="color: Gray;">.</span><span style="color: Blue;">println</span><span style="color: Olive;">(</span><span style="color: Blue;">doc</span><span style="color: Gray;">.</span><span style="color: Blue;">get</span><span style="color: Olive;">(</span><span style="color: #8b0000;">&quot;</span><span style="color: Red;">filename</span><span style="color: #8b0000;">&quot;</span><span style="color: Olive;">)</span><span style="color: Olive;">)</span><span style="color: Gray;">;</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; </span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;&nbsp; &nbsp; </span><span style="color: Olive;">}</span></li>
<li><span style="color: Gray;">&nbsp;</span><span style="color: Olive;">}</span></li></ol></div></div>
<p>All of the code above was tested on JDK 1.6.x with Lucene 2.4.0.</p>
	Tags: <a href="http://sexywp.com/tags/code-examples" title="code examples" rel="tag">code examples</a>, <a href="http://sexywp.com/tags/lucene" title="lucene" rel="tag">lucene</a>, <a href="http://sexywp.com/tags/note" title="note" rel="tag">note</a><br />
]]></content:encoded>
			<wfw:commentRss>http://sexywp.com/lucene-note-01.htm/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

