Using PostgreSQL hstore for i18n
Asked Answered
B

2

8

I'm just modeling a PostgreSQL data base for a new project that needs localization-support. I wonder if hstore might be a good way to localize certain data-fields.

For example: let's take a simplified table for blog posts with the fields post_title and post_content. With the key-value type hstore one could add several translations to those fields, identified with a language-code as key:

id | post_title (hstore)   | post_content (hstore)
---|-----------------------|------------------------------
 1 | "en" => "Hello World",| "en" => "Content in english",
   | "de" => "Hallo Welt"  | "de" => "Inhalt auf deutsch"
---|-----------------------|------------------------------
 2 | ...                   | ...

Is anyone experienced in that approach? It seems to be very flexible and easy to use, but maybe I'm missing some significant drawbacks here?

Breeding answered 3/11, 2015 at 18:20 Comment(3)
Interesting idea I think querying on this would be a pain in the dick but I'm not sure how it's any better with other i18n models. Also not sure how the query planner would optimize for things like joins across hstore values.Graceless
Is it possible to do full text search with this approach?Trommel
@OlegYablokov, you can use hstore values in WHERE clauses and you can select specific fields (e.g. for building document indices), so I guess it should be feasible. I recently read an interesting article on full-text search in Postgres – not related to hstore, but its ideas can used with hstore as well.Breeding
S
3

I am using this approach in a project as well and I believe it was a good decision, because I can easily add a new language to my project without the need of writing code or updating the database schema

Also I can easily add fallback to queries like this:

select coalesce(nullif(post_title -> :lang, ''), post_title -> 'en') from posts

Plus there are plenty powerful sql functions at your disposal.

Steverson answered 18/5, 2018 at 4:26 Comment(1)
I'm also using this approach and it works really great from almost every point of view. The only complication I found is when I need the column to be unique only based on the language. I.e. the entry ('en' => 'foo', 'es' => 'foo') is possible, but another entry with ('en' => 'foo', 'es' => 'bar') should not be possible, as 'en' => 'foo' already exists. Any suggestions of an easy way to create a unique constraint that works this way? I'm currently using a UNIQUE INDEX but it's not a very maintainable solution as I need to create an index for each laguageCanuck
P
0

Yes, this is a very good approach that can be made very efficient for search using indexes. Some time ago I did some research on how to do an easily searchable multilingual data model for a products database. Ended up with this type of schema:

schema

The downside is that you have to create an index for each supported language

CREATE EXTENSION IF NOT EXISTS pg_trgm;


CREATE INDEX product_details_name_idx ON Product
USING GIN (lower(details_name_value) gin_trgm_ops);


CREATE INDEX product_translated_name_idx ON Product
USING GIN (lower(details_name_translationbylanguage -> 'pl') gin_trgm_ops);

But in result you get a very efficient execution plan:

explain select
    product0_.id as id1_0_,
    product0_.details_code as details_2_0_,
    product0_.details_description_translationByLanguage as details_3_0_,
    product0_.details_description_value as details_4_0_,
    product0_.details_name_translationByLanguage as details_5_0_,
    product0_.details_name_value as details_6_0_,
    product0_.details_price as details_7_0_
from Product product0_
where lower(product0_.details_name_value) like ('%'||lower('aaa')||'%')
or lower(product0_.details_name_translationByLanguage -> 'pl') like ('%'||lower('aaa')||'%')

  Bitmap Heap Scan on product product0_  (cost=175.84..5669.84 rows=15680 width=173)
  Recheck Cond: ((lower((details_name_value)::text) ~~ '%aaa%'::text) OR (lower((details_name_translationbylanguage -> 'pl'::text)) ~~ '%aaa%'::text))
  ->  BitmapOr  (cost=175.84..175.84 rows=16000 width=0)
        ->  Bitmap Index Scan on product_details_name_idx  (cost=0.00..84.00 rows=8000 width=0)
              Index Cond: (lower((details_name_value)::text) ~~ '%aaa%'::text)
        ->  Bitmap Index Scan on product_translated_name_idx  (cost=0.00..84.00 rows=8000 width=0)
              Index Cond: (lower((details_name_translationbylanguage -> 'pl'::text)) ~~ '%aaa%'::text)

It even was very easy to integrate with an ORM in Javas JPA/Hibernate.

Pau answered 31/7, 2023 at 15:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.