Japanese full-text search with mecab [Development Team]


▼What you need

▼Installing mecab

mecab itself

"--prefix=" sets the installation directory.

# cd mecab-0.97
# ./configure --enable-utf8-only --enable-mutex --prefix=/usr
# make
# make check
# sudo make install

mecab dictionary

# cd mecab-ipadic-2.7.0-20060707
# ./configure --with-charset=utf8 --prefix=/usr
# make
# sudo make install

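Input test

Once both of the above are installed, the mecab command is available.

# mecab

After you run the command, the cursor moves to the next line. Paste a sentence there and press Enter.

# mecab   (type your input once the cursor has moved to the line below)
クマをネコ並みに強くしたかった

If the analysis result is printed, the installation succeeded.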

# mecab
クマをネコ並みに強くしたかった
クマ    名詞,一般,*,*,*,*,クマ,クマ,クマ
を      助詞,格助詞,一般,*,*,*,を,ヲ,ヲ
ネコ    名詞,一般,*,*,*,*,ネコ,ネコ,ネコ
並み    名詞,接尾,一般,*,*,*,並み,ナミ,ナミ
に      助詞,格助詞,一般,*,*,*,に,ニ,ニ
強く    形容詞,自立,*,*,形容詞・アウオ段,連用テ接続,強い,ツヨク,ツヨク
し      動詞,自立,*,*,サ変・スル,連用形,する,シ,シ
たかっ  助動詞,*,*,*,特殊・タイ,連用タ接続,たい,タカッ,タカッ
た      助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
EOS
猫がネコのぬいぐるみが欲しいと言うので、わざわざ買いに都内へ出た
猫      名詞,一般,*,*,*,*,猫,ネコ,ネコ
が      助詞,格助詞,一般,*,*,*,が,ガ,ガ
ネコ    名詞,一般,*,*,*,*,ネコ,ネコ,ネコ
の      助詞,連体化,*,*,*,*,の,ノ,ノ
ぬいぐるみ      名詞,一般,*,*,*,*,ぬいぐるみ,ヌイグルミ,ヌイグルミ
が      助詞,格助詞,一般,*,*,*,が,ガ,ガ
欲しい  形容詞,自立,*,*,形容詞・イ段,基本形,欲しい,ホシイ,ホシイ
と      助詞,格助詞,引用,*,*,*,と,ト,ト
言う    動詞,自立,*,*,五段・ワ行促音便,基本形,言う,イウ,イウ
ので    助詞,接続助詞,*,*,*,*,ので,ノデ,ノデ
、      記号,読点,*,*,*,*,、,、,、
わざわざ        副詞,助詞類接続,*,*,*,*,わざわざ,ワザワザ,ワザワザ
買い    動詞,自立,*,*,五段・ワ行促音便,連用形,買う,カイ,カイ
に      助詞,格助詞,一般,*,*,*,に,ニ,ニ
都内    名詞,一般,*,*,*,*,都内,トナイ,トナイ
へ      助詞,格助詞,一般,*,*,*,へ,ヘ,エ
出      動詞,自立,*,*,一段,連用形,出る,デ,デ
た      助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
EOS
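
In its default output format, mecab prints one token per line (surface form, a tab, then comma-separated features) and closes each sentence with an EOS line. For full-text search, usually only the surface forms are needed. The sketch below shows one way to reduce that output to space-separated words. The function name mecab_surfaces is a hypothetical helper for illustration, not part of mecab, and the sample input is shortened from the output above. mecab can also produce this kind of output directly with "mecab -O wakati".

```shell
#!/bin/sh
# Hypothetical helper (not part of mecab): read mecab's default output on
# stdin and print the surface forms of each sentence on one line,
# separated by single spaces.
mecab_surfaces() {
  awk -F '\t' '
    $0 == "EOS" { print line; line = ""; next }   # end of sentence: flush the line
    { line = (line == "" ? $1 : line " " $1) }    # append the surface form (field 1)
  '
}

# Sample input: the first two tokens of the first sentence above,
# tab-separated as mecab prints them.
printf 'クマ\t名詞,一般,*,*,*,*,クマ,クマ,クマ\nを\t助詞,格助詞,一般,*,*,*,を,ヲ,ヲ\nEOS\n' | mecab_surfaces
# prints: クマ を
```

With mecab installed, the same function can be placed at the end of a pipeline, for example: echo "クマをネコ並みに強くしたかった" | mecab | mecab_surfaces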

apt-get install mecab mecab-ipadic-utf8

apt-get -t lenny-backports install postgresql-common

apt-get install postgresql-8.3

▼Installing postgres

# adduser postgres
# mkdir /usr/local/pgsql-8.3.7
# ./configure --prefix=/usr/local/pgsql-8.3.7 --with-readline

◇If ./configure fails with an error, install the readline packages, re-run ./configure, and then build

# apt-get install libreadline5 libreadline5-dev
# make
# make install

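◇Append the following to ".bashrc" in the postgres user's home directory. The paths use /usr/local/pgsql, which assumes that this path resolves to the install prefix /usr/local/pgsql-8.3.7 (for example through a symbolic link).

==========================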
PATH="$PATH":/usr/local/pgsql/bin

export POSTGRES_HOME=/usr/local/pgsql
export PGLIB=$POSTGRES_HOME/lib
export PGDATA=$POSTGRES_HOME/data
export MANPATH="$MANPATH":$POSTGRES_HOME/man
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH":"$PGLIB"
==========================

◇After editing, reload the file to apply the settings

# source .bashrc

◇Set the encoding: initdb --encoding=UTF8 --no-locale

◇Start the server: postgres -D /usr/local/pgsql/data

◇Connect to the DB (the "zero" DB was created beforehand): /usr/local/pgsql/bin/psql -d zero

▼Installing textsearch-ja

http://lets.postgresql.jp/documents/technical/text-processing/3#exactmatch
Reference site for word-segmented (wakachi-gaki) search. Download the source into /usr/local/pgsql-8.3.7/contrib.

[root]#cd /usr/local/pgsql-8.3.7/contrib
[root]#tar zxfv textsearch_ja-X.X.X.tar.gz 
[root]#cd textsearch_ja 
[root]#make 
[root]#make install 

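To make the textsearch-ja module callable from SQL, run the following.

[root]#su postgres
[postgres]#psql -f textsearch_ja.sql YourFreeWordSearchDB   (the name of the database you want to run full-text search on)

Verifying that it works

[postgres]#psql -d [DB name]
postgres=#SELECT to_tsvector('japanese', '今日は昨日に比べて良い天気です。');   [Enter key]
Note: the 'japanese' argument can be left out if that works in your setup; to_tsvector then uses the default text search configuration.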

Reference

★Creating the table


CREATE TABLE kuma (
  id serial PRIMARY KEY,
  body text
);

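★SQL commands

▽Bulk-load data: seed one row, then repeat the second INSERT (each run doubles the row count)
INSERT INTO kuma(body) VALUES ('くま');
INSERT INTO kuma(body) SELECT body||id FROM kuma;
▽Create the index
CREATE INDEX body ON kuma USING gin(to_tsvector('japanese', body));
▽Check search speed (EXPLAIN shows only the plan; EXPLAIN ANALYZE also runs the query and reports actual times)
EXPLAIN SELECT * FROM kuma WHERE to_tsvector('japanese', body) @@ to_tsquery('japanese', 'パンダ');
EXPLAIN SELECT * FROM kuma WHERE body LIKE '%パンダ%';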
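
Because the self-referencing INSERT copies every existing row, the table size doubles on each execution, so the number of runs needed for a test data set grows only logarithmically with the target size. The sketch below works out that number. The function name doublings_needed and the target of 100000 rows are assumptions chosen for illustration, not values from the original notes.

```shell
#!/bin/sh
# Hypothetical helper: how many doubling INSERTs are needed to grow a table
# from `start` rows to at least `target` rows.
doublings_needed() {
  start=$1; target=$2
  if [ "$start" -lt 1 ]; then
    echo "start must be at least 1 (an empty table never grows)" >&2
    return 1
  fi
  rows=$start; runs=0
  while [ "$rows" -lt "$target" ]; do
    rows=$((rows * 2))      # one execution of the INSERT ... SELECT
    runs=$((runs + 1))
  done
  echo "$runs"
}

doublings_needed 1 100000
# prints: 17   (2^16 = 65536 is too few, 2^17 = 131072 is enough)
```

Starting from the single seed row, running the INSERT ... SELECT statement 17 times therefore yields 131072 rows, which should be enough to make the difference between the GIN index scan and the LIKE scan visible in the plans.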

★Commands

 /usr/local/pgsql/bin/psql -d zero -c 'DROP TABLE IF EXISTS kuma'
 /usr/local/pgsql/bin/psql -d zero -c "INSERT INTO kuma(body) VALUES ('くま');"
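
Passing an SQL string literal through psql -c is easy to get wrong, because the shell does not accept a backslash-escaped single quote inside a single-quoted string. One way around this is to build the statement in a double-quoted shell string and to double any single quote inside the value, which is the standard SQL escape. The sketch below illustrates this. The function name sql_quote is a hypothetical helper, and it assumes the value contains no backslashes, since older PostgreSQL versions may treat a backslash inside an ordinary string literal as an escape character.

```shell
#!/bin/sh
# Hypothetical helper: wrap a value in single quotes for use as an SQL
# string literal, doubling any embedded single quote.
# Assumption: the value contains no backslashes.
sql_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# Build the statement and print it instead of executing it.
stmt="INSERT INTO kuma(body) VALUES ($(sql_quote "くま"));"
printf '%s\n' "$stmt"
# prints: INSERT INTO kuma(body) VALUES ('くま');

# To execute it against the "zero" database:
#   /usr/local/pgsql/bin/psql -d zero -c "$stmt"
```

This is only suitable for trusted, hand-written values in ad hoc scripts; it is not a general defence against SQL injection.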
mecabで日本語全文検索.txt · Last modified: 2010/07/27 15:47 (external edit)