view · edit · sidebar · attach · print · history

20130314-repair-fulltext-index-dictonary-of-postgresql

<< | Index | >>


Summary

  • Repair Dictionary of PostgreSQL on production server
    • oddb.org needs default_german and default_french dictionaries of ODBA.
    • ODBA needs old-version tesarch2 structure (8.0 with patch or 8.1 -8.2). but Production server has new version tsearch2 of postgresql 8.3.

Index


Repair postgerSQL Dictionary for fulltext index search

Try old Dictiorary

tried sql for 8.2 on 8.3.
got many errors.

ERROR:  language "plpgsql" already exists
ERROR:  could not find function "gtsvector_in" in file "/usr/lib64/postgresql-9.1/lib64/tsearch2.so"
ERROR:  function public.gtsvector_in(cstring) does not exist
ERROR:  could not find function "gtsvector_out" in file "/usr/lib64/postgresql-9.1/lib64/tsearch2.so"
ERROR:  function public.gtsvector_out(gtsvector) does not exist
ERROR:  function gtsvector_in(cstring) does not exist
...

But some old version tables are created.

oddb.org.ruby193=# \d pg_ts_cfg;
  Table "public.pg_ts_cfg"
  Column  | Type | Modifiers 
----------+------+-----------
 ts_name  | text | not null
 prs_name | text | not null
 locale   | text | 
Indexes:
    "pg_ts_cfg_pkey" PRIMARY KEY, btree (ts_name)

oddb.org.ruby193=# \d pg_ts_cfgmap;
  Table "public.pg_ts_cfgmap"
  Column   |  Type  | Modifiers 
-----------+--------+-----------
 ts_name   | text   | not null
 tok_alias | text   | not null
 dict_name | text[] | 
Indexes:
    "pg_ts_cfgmap_pkey" PRIMARY KEY, btree (ts_name, tok_alias)

But there are many conflicts in tsearch2 function and tables.
This way dose not work.

Links


Create 8.3 version Dictionary (ispell and configuration)

Update ODBA to be suitable for tsearch2 > 8.3.

oddb.org.ruby193=# CREATE TEXT SEARCH CONFIGURATION public.default_german ( COPY = pg_catalog.german );
CREATE TEXT SEARCH CONFIGURATION
oddb.org.ruby193=# select * from pg_ts_config;
    cfgname     | cfgnamespace | cfgowner | cfgparser 
----------------+--------------+----------+-----------
 simple         |           11 |       10 |      3722

 ...

 default_german |         2200 |       10 |      3722  #=> default_#{language} comes from ODBA
(17 rows)

Dictionary Files von oddb.org are

  • ext/fulltext/data/dicts/german/fulltext.dict
  • ext/fulltext/data/dicts/german/fulltext.aff (old format and old suffix)
  • ext/fulltext/data/dicts/german/fulltext.stop

PostgreSQL 8.3 dose not work with PATH. copied these files into

$ ls -la /usr/share/postgresql/tsearch_data
-rw-r--r-- 1 yasuhiro yasuhiro   27227 Mar 14 14:42 fulltext.affix
-rw-r--r-- 1 yasuhiro yasuhiro 4471026 Mar 14 14:42 fulltext.dict
-rw-r--r-- 1 yasuhiro yasuhiro     706 Mar 14 14:42 fulltext.stop

But aff file got error.

oddb.org.ruby193=# CREATE TEXT SEARCH DICTIONARY german_ispell (
  TEMPLATE = ispell,
  DictFile  = fulltext,
  AffFile   = fulltext,
  StopWords = fulltext
);
ERROR:  wrong affix file format for flag
CONTEXT:  line 25 of configuration file "/usr/share/postgresql-9.1/tsearch_data/fulltext.affix": "PFX G N 1"

These german ispell (affix) files also don't work by Encoding problem.

Finaly commented out this from fulltext.aff of oddb.org like above german tar files.

#PFX G N 1
#   .    > GE
oddb.org.ruby193=# CREATE TEXT SEARCH DICTIONARY german_ispell (
  TEMPLATE = ispell,
  DictFile = fulltext,
  AffFile  = fulltext,
  StopWords = fulltext
);
CREATE TEXT SEARCH DICTIONARY

oddb.org.ruby193=# select * from pg_ts_dict;
    dictname     | dictnamespace | dictowner | dicttemplate |                           dictinitoption                            
-----------------+---------------+-----------+--------------+---------------------------------------------------------------------
  ...
  german_stem     |            11 |        10 |        11378 | language = 'german', stopwords = 'german' 
  ... 
  german_ispell   |          2200 |        10 |         3733 | dictfile = 'fulltext', afffile = 'fulltext', stopwords = 'fulltext'

SQLs (german, french):

# Drop
DROP TEXT SEARCH CONFIGURATION default_german;
DROP TEXT SEARCH DICTIONARY german_ispell;

# Creation
CREATE TEXT SEARCH CONFIGURATION public.default_german ( COPY = pg_catalog.german );

CREATE TEXT SEARCH DICTIONARY german_ispell (
  TEMPLATE = ispell,
  DictFile = fulltext,
  AffFile  = fulltext,
  StopWords = fulltext
);

ALTER TEXT SEARCH CONFIGURATION default_german
ALTER MAPPING FOR
  word, hword, hword_part, hword_numpart,
  numword, numhword
WITH german_ispell, german_stem;

ALTER TEXT SEARCH CONFIGURATION default_german
ALTER MAPPING FOR
 asciiword, asciihword, hword_asciipart,
 email, protocol, url, host, url_path, file,
 sfloat, float, int, uint, version, tag, entity, blank
WITH simple;

This Created Dictionary works fine.

oddb.org.ruby193=# select * from ts_debug('default_german', 'En Güte');
   alias   |    description    | token |        dictionaries         | dictionary  | lexemes 
-----------+-------------------+-------+-----------------------------+-------------+---------
 asciiword | Word, all ASCII   | En    | {simple}                    | simple      | {en}
 blank     | Space symbols     |       | {simple}                    | simple      | {" "}
 word      | Word, all letters | Güte  | {german_ispell,german_stem} | german_stem | {gut}
(3 rows)

Then update ODBA to be suitable for this verion's tsearch2.

Links

view · edit · sidebar · attach · print · history
Page last modified on December 12, 2017, at 10:00 AM