view · edit · sidebar · attach · print · history

Index>

20171212-postgres10-dictionaries

Summary

Commits

Index

Update to postgres 10

Running the watir tests shows 19 remaining errors

rspec ./spec/address_correction_spec.rb[1:1] # ch.oddb.org should be possible to correct an address for a company
rspec ./spec/address_correction_spec.rb[1:2] # ch.oddb.org should be possible to correct an address for a doctor
rspec ./spec/address_correction_spec.rb[1:3] # ch.oddb.org should be possible to correct an address for a hospital
rspec ./spec/address_correction_spec.rb[1:4] # ch.oddb.org should be possible to correct an address for a pharmacy
rspec ./spec/doctors_spec.rb:64 # ch.oddb.org check doctors
rspec ./spec/evidentia_spec.rb:213 # ch.oddb.org should work all the time for Cellcept when evidenita LNF is not enabled
rspec ./spec/home_2_list_spec.rb:82 # ch.oddb.org admin in home_companies we should see all companies when logged in as admin
rspec ./spec/home_2_list_spec.rb:87 # ch.oddb.org admin in home_companies we should have the link active_companies if logged in as admin
rspec ./spec/paypal_spec.rb:113 # ch.oddb.org should show the poweruser dialog with the top left logo
rspec ./spec/searchbar_spec.rb[1:1] # ch.oddb.org should be possible to find 1,25-Dihydroxycholecalciferol when searching via 125Dihydroxycholecalciferol 1000 in analysen
rspec ./spec/searchbar_spec.rb[1:2] # ch.oddb.org should be possible to find 1,25-Dihydroxycholecalciferol when searching via 125Dihydroxycholecalciferol in analysen
rspec ./spec/searchbar_spec.rb[1:3] # ch.oddb.org should be possible to find CA 125 when searching via 125 in analysen
rspec ./spec/searchbar_spec.rb:101 # ch.oddb.org should be possible to find Dr. Peter Schönbucher via doctors 
rspec ./spec/searchbar_spec.rb:141 # ch.oddb.org should be possible to find the Kantonsspital Glarus via Spital
rspec ./spec/searchbar_spec.rb:153 # ch.oddb.org should be possible to find Krücke via MiGeL
rspec ./spec/searchbar_spec.rb:325 # ch.oddb.org should set best_result when searching Rivoleve via search_type
rspec ./spec/smoketest_spec.rb:223 # ch.oddb.org desitin should have a link to the migel
rspec ./spec/smoketest_spec.rb[1:2:31] # ch.oddb.org desitin should have a working instant search for Inderal and going back
rspec ./spec/welcome_spec.rb[1:5] # ch.oddb.org should be possible to see the sponsored log-in in zone migel

Reasons are:

  • Spec was not adapted to removal of "Zurück zu"
  • Dictionaries do not work (oddb.ch and migel)
  • Address changes/Status page do not work

Trying to generate dictionary on oddb.org via bin/admin shows

ch.oddb> generate_dictionaries
-> ERROR:  could not open dictionary file "/usr/local/pgsql-10.1/share/tsearch_data/french_fulltext.dict": Datei oder Verzeichnis nicht gefunden

This must be fixed.

In the oddbd log we find


failsafe rescued DBI::ProgrammingError < StandardError
ERROR:  could not open dictionary file "/usr/local/pgsql-10.1/share/tsearch_data/french_fulltext.dict": Datei oder Verzeichnis nicht gefunden
/var/www/oddb.org/src/util/oddbapp.rb:1688:in `instance_eval'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/ydbd-pg-0.5.3/lib/dbd/pg/statement.rb:37:in `execute'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/ydbi-0.5.3/lib/dbi/base_classes/database.rb:96:in `execute'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/ydbi-0.5.3/lib/dbi/base_classes/database.rb:114:in `do'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/ydbi-0.5.3/lib/dbi/handles/database.rb:106:in `do'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/odba-1.1.2/lib/odba/connection_pool.rb:39:in `block in method_missing'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/odba-1.1.2/lib/odba/connection_pool.rb:29:in `next_connection'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/odba-1.1.2/lib/odba/connection_pool.rb:38:in `method_missing'
/var/www/oddb.org/vendor/ruby/2.4.0/gems/odba-1.1.2/lib/odba/storage.rb:315:in `generate_dictionary'
/var/www/oddb.org/src/util/oddbapp.rb:1475:in `generate_dictionary'
/var/www/oddb.org/src/util/oddbapp.rb:1482:in `generate_french_dictionary'
/var/www/oddb.org/src/util/oddbapp.rb:1478:in `generate_dictionaries'
/var/www/oddb.org/src/util/oddbapp.rb:1568:in `block in method_missing'
/var/www/oddb.org/src/util/oddbapp.rb:1567:in `synchronize'
/var/www/oddb.org/src/util/oddbapp.rb:1567:in `method_missing'
(eval):1:in `block (2 levels) in _admin'
/var/www/oddb.org/src/util/oddbapp.rb:1688:in `instance_eval'
/var/www/oddb.org/src/util/oddbapp.rb:1688:in `block (2 levels) in _admin'
/var/www/oddb.org/src/util/failsafe.rb:10:in `failsafe'
/var/www/oddb.org/src/util/oddbapp.rb:1687:in `block in _admin'

Generating text search indices must be handled differently under Postgres 10, see Creating indices and Controlling Text Search

Copy old

sudo cp -pv  /usr/share/postgresql/tsearch_data/*fulltext* /usr/local/pgsql-10.1/share/tsearch_data/
‘/usr/share/postgresql/tsearch_data/french_fulltext.affix’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/french_fulltext.affix’
‘/usr/share/postgresql/tsearch_data/french_fulltext.dict’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/french_fulltext.dict’
‘/usr/share/postgresql/tsearch_data/french_fulltext.stop’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/french_fulltext.stop’
‘/usr/share/postgresql/tsearch_data/fulltext.affix’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.affix’
‘/usr/share/postgresql/tsearch_data/fulltext.de.affix’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.de.affix’
‘/usr/share/postgresql/tsearch_data/fulltext.de.dict’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.de.dict’
‘/usr/share/postgresql/tsearch_data/fulltext.de.stop’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.de.stop’
‘/usr/share/postgresql/tsearch_data/fulltext.dict’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.dict’
‘/usr/share/postgresql/tsearch_data/fulltext.fr.affix’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.fr.affix’
‘/usr/share/postgresql/tsearch_data/fulltext.fr.stop’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.fr.stop’
‘/usr/share/postgresql/tsearch_data/fulltext.stop’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/fulltext.stop’
‘/usr/share/postgresql/tsearch_data/german_fulltext.affix’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/german_fulltext.affix’
‘/usr/share/postgresql/tsearch_data/german_fulltext.dict’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/german_fulltext.dict’
‘/usr/share/postgresql/tsearch_data/german_fulltext.stop’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/german_fulltext.stop

After that bin/admin was able to run generate_dictionaries.

But searching for a Migle article works (e.g. we get Binden), but the migel service log shows

Returning 2017-07-01
2017-12-12 10:47:27.838 CET [27894] ERROR:  invalid input syntax for type oid: "simple"
2017-12-12 10:47:27.838 CET [27894] STATEMENT:                                  SELECT target_id, 
                                                max(ts_rank(search_term, to_tsquery($1, $2))) AS relevance
                                        FROM doctor_index 
                                        WHERE search_term @@ to_tsquery($3, $4) 
                                        GROUP BY target_id
                                        ORDER BY relevance DESC

Pushed commits

Calling jobs/rebuild_indices doctor_index also fails with ERROR: invalid input syntax for type oid: "simple". Pry shows

value = <<-SQL, origin_id, dict, search_term, target_id
INSERT INTO #{index_name} (origin_id, search_term, target_id)
VALUES (?, to_tsvector(?, ?), ?)
SQL
=> ["INSERT INTO doctor_index (origin_id, search_term, target_id)\n" + "VALUES (?, to_tsvector(?, ?), ?)\n",
 425438,
 "simple",
 "Hammel Theres Elisabeth thammelhinch Allgemeine Innere Medizin 1985 Schweiz Kardiologie 1985 Schweiz 7601000216885 Limmattalstr 197 8049 Zuerich Zuerich Zuerich 8049",
 425438]
[4] pry(#<ODBA::Storage>)> puts value.join(' ')
INSERT INTO doctor_index (origin_id, search_term, target_id)
VALUES (?, to_tsvector(?, ?), ?)
 425438 simple Hammel Theres Elisabeth thammelhinch Allgemeine Innere Medizin 1985 Schweiz Kardiologie 1985 Schweiz 7601000216885 Limmattalstr 197 8049 Zuerich Zuerich Zuerich 8049 425438
=> nil

The PSL looks like this

oddb.org.ruby193=# \dS doctor_index
               Table "public.doctor_index"
   Column    |   Type   | Collation | Nullable | Default 
-------------+----------+-----------+----------+---------
 origin_id   | integer  |           |          | 
 search_term | tsvector |           |          | 
 target_id   | integer  |           |          | 
Indexes:
    "origin_id_doctor_index" btree (origin_id)
    "search_term_doctor_index" gist (search_term)
    "target_id_doctor_index" btree (target_id)

After changing the dictionaray to default_german I get the error ERROR: invalid input syntax for type oid: "default_german"

Trying after removing the dictionary line for doctor_index.

Copied some more files

sudo cp -pv /usr/share/postgresql/tsearch_data/ispell_sample.* /usr/local/pgsql-10.1/share/tsearch_data/
‘/usr/share/postgresql/tsearch_data/ispell_sample.affix’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/ispell_sample.affix’
‘/usr/share/postgresql/tsearch_data/ispell_sample.dict’ -> ‘/usr/local/pgsql-10.1/share/tsearch_data/ispell_sample.dict’

Trying hospital_index (it needs much less time than the doctor_index) I see

dropping: hospital_index
creating: hospital_index
filling: hospital_index
@hospitals.values
source.size: 1112
ERROR:  invalid input syntax for type oid: "simple"

From: /home/niklaus/git/odba/lib/odba/storage.rb @ line 613 ODBA::Storage#update_fulltext_index:

    594:     def update_fulltext_index(index_name, origin_id, search_term, target_id, dict)
    595:       search_term = search_term.gsub(/\s+/, ' ').strip
    596:       if(target_id)
    597:                          value = <<-SQL, origin_id, dict, search_term, target_id
    598: INSERT INTO #{index_name} (origin_id, search_term, target_id)
    599: VALUES (?, to_tsvector(?, ?), ?)
    600: SQL
    601:         self.dbi.do <<-SQL, origin_id, dict, search_term, target_id
    602: INSERT INTO #{index_name} (origin_id, search_term, target_id)
    603: VALUES (?, to_tsvector(?, ?), ?)
    604:         SQL
    605:       else
    606:         self.dbi.do <<-SQL, dict, search_term, origin_id
    607: UPDATE #{index_name} SET search_term=to_tsvector(?, ?)
    608: WHERE origin_id=?
    609:         SQL
    610:       end
    611:                          rescue => error
    612:         require 'pry'; binding.pry
 => 613:       end

[1] pry(#<ODBA::Storage>)> puts value
INSERT INTO hospital_index (origin_id, search_term, target_id)
VALUES (?, to_tsvector(?, ?), ?)
689846
simple
Hopital Cantonal HUG Service de 7601001049048 pharmacologie et toxicologie cliniques Rue Micheli Rue MicheliduCrest 24 Rue Micheli du Crest 24 1211 Geneve 14 GE 022 3829932 022 3829940 Geneve 14 1211
689846
=> nil

Tried various commands via psql as seen on https://www.postgresql.org/docs/10/static/textsearch-dictionaries.html

oddb.org.ruby193=# select version();
                                         version                                         
-----------------------------------------------------------------------------------------
 PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (Funtoo 5.3.0-r1) 5.3.0, 64-bit
(1 row)

oddb.org.ruby193=# select * FROM ts_debug('english', 'Paris');
   alias   |   description   | token |  dictionaries  |  dictionary  | lexemes 
-----------+-----------------+-------+----------------+--------------+---------
 asciiword | Word, all ASCII | Paris | {english_stem} | english_stem | {pari}
(1 row)

oddb.org.ruby193=# SELECT to_tsvector('english','in the list of stop words');
ERROR:  invalid input syntax for type oid: "english"
LINE 1: SELECT to_tsvector('english','in the list of stop words');
                           ^
oddb.org.ruby193=# SELECT to_tsvector('simple','in the list of stop words');
ERROR:  invalid input syntax for type oid: "simple"
LINE 1: SELECT to_tsvector('simple','in the list of stop words');
                           ^
oddb.org.ruby193=# SELECT to_tsvector('default_german','in the list of stop words');
ERROR:  invalid input syntax for type oid: "default_german"
LINE 1: SELECT to_tsvector('default_german','in the list of stop wor...
                           ^

I as able to suppress the annoying "NOTICE" about already generated tables/indices by adding client_min_messages = warning to /opt/data/postgres-10.1/data/postgresql.conf.

Tried to find any problems with pry in dbd/pg/statement.rb in the method execute. But there the oid is not calculated and the bind_vars look like follow:

var is 689846
var is german
var is Hopital Cantonal HUG Service de 7601001049048 pharmacologie et toxicologie cliniques Rue Micheli Rue MicheliduCrest 24 Rue Micheli du Crest 24 1211 Geneve 14 GE 022 3829932 022 3829940 Geneve 14 1211
var is 689846

Changing the methods to fill the indexes by never setting a 'simple', 'default_german', etc as first parameter to to_tsvector. Running update_all_indexs overnight.

view · edit · sidebar · attach · print · history
Page last modified on December 12, 2017, at 05:02 PM