org.aminds.lucene.analysis
Class EnglishSubReader
java.lang.Object
java.io.Reader
org.apache.lucene.analysis.CharStream
org.apache.lucene.analysis.CharFilter
org.aminds.lucene.analysis.SubReader
org.aminds.lucene.analysis.CodePointBasedSubReader
org.aminds.lucene.analysis.EnglishSubReader
- All Implemented Interfaces:
- Closeable, Readable, ReusableCharFilter
public class EnglishSubReader
- extends CodePointBasedSubReader
A SubReader that ignores whitespace between a hyphen and a following alphabetic character. This
allows clean tokenization of multi-line or multi-page English text that contains hyphenation.
The current implementation finds a '-' followed by whitespace and replaces the match with
the '-' alone. The hyphen character itself is not removed; removing a hyphen correctly would
require a dictionary.
- Author:
- Masashi Nakanishi
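The whitespace-skipping behavior described above can be illustrated with a minimal, standalone sketch. It is not the EnglishSubReader implementation: the class name HyphenJoinSketch and the method joinHyphenated are hypothetical, it works on a whole String rather than on a Reader or CharFilter, and it assumes that "alphabet character" means Character.isLetter and "whitespace" means Character.isWhitespace.

```java
/** Hypothetical illustration only; not part of org.aminds.lucene.analysis. */
public final class HyphenJoinSketch {
    private HyphenJoinSketch() {}

    /**
     * Drops any run of whitespace that sits between a '-' and a following letter.
     * The hyphen itself is always kept, mirroring the behavior described above.
     */
    public static String joinHyphenated(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            out.appendCodePoint(cp);
            i += Character.charCount(cp);
            if (cp == '-') {
                // Look ahead over the whitespace run that follows the hyphen.
                int j = i;
                while (j < text.length() && Character.isWhitespace(text.codePointAt(j))) {
                    j += Character.charCount(text.codePointAt(j));
                }
                // Assumption: skip the run only when a letter comes right after it.
                if (j > i && j < text.length() && Character.isLetter(text.codePointAt(j))) {
                    i = j;
                }
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, a word broken across lines as "hyphen-" and "ation" comes out as "hyphen-ation", which a tokenizer can then treat as one hyphenated unit. Deciding whether the result should instead be "hyphenation" is the part that would need a dictionary.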
Method Summary
protected boolean | accept(int codePoint)
static boolean | isPrintable(int codePoint)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
EnglishSubReader
public EnglishSubReader()
accept
protected boolean accept(int codePoint)
- Specified by:
- accept in class CodePointBasedSubReader
isPrintable
public static boolean isPrintable(int codePoint)
Copyright (c) 2008-2011 Masashi Nakanishi.