PostgreSQL(Full Text Search) vs ElasticSearch
Asked Answered
B

4

85

Hi I am doing some research before I implement search feature into my service. I'm currently using PostgreSQL as my main storage. I could definitely use PostgreSQL's built-in Full-Text-Search but the problem is that I have data scattered around several tables.

My service is an e-commerce website. So if a customer searches "good apple laptop", I need to join Brand table, post table and review table(1 post is a combination of several reviews + short summary) to fully search all posts. If I were to use elasticsearch, I could insert complete posts by preprocessing.

From my research, some people said PostgreSQL's FTS and elasticsearch have similar performance and some people said elasticsearch is faster. Which would be better solution for my case?

Thanks in advance

Boldface answered 12/11, 2019 at 5:2 Comment(3)
How do you know the searching keyword is related to some tables you have stored in your database?Erotomania
I don't.. So I was thinking of joining all possible columns in different tables and turning them into ts_vector. Are there any better solutions?Boldface
Hmm, this will involve in semantic recognition problem and it's another story...Erotomania
E
-14

Short Answer: Elasticsearch is better

Explanation: PostgreSQL and Elasticsearch are 2 different types of databases. Elasticsearch is powerful for document searching, and PostgreSQL is a traditional RDBMS. No matter how well PostgreSQL does on its full-text searches, Elasticsearch is designed to search in enormous texts and documents(or records). And the more size you want to search in, the more Elasticsearch is better than PostgreSQL in performance. Additionally, you could also get many benefits and great performance if you pre-process the posts into several fields and indexes well before storing into Elasticsearch.

If you surely need the full-text feature, you may consider MSSQL, which may do better than PostgreSQL.

Reply on Comments: It should be commonsense for the properties comparison on those different types of DBs. Since OP didn't provide what amount and size of data are stored. If this is small size data-in-search, Maybe choose Postgres or ES, both are OK. However, if transactions and data repository become larger in future, ES will provide benefits.

You could check this site to know the current ranking of each type DB, and choose the best one for your requirements, architecture and future data growth of your applications.

Erotomania answered 12/11, 2019 at 5:22 Comment(7)
Agreed on the rethoric but if you have some proof or other sources, it will be more reliable.Paginate
Your answer is only based on your opinions, you haven't wrote any example, benchmark or link to prove your point and I can't see other yours answers on the subject that can prove you know about these software. I see you are a new contributor so I would suggest you for next time to not write absolute sentence and to report your experiences, real data or links to prove your thesis.Pointblank
@conifers good the update and clarification on your answer but the link you added don't prove your point. I had been interested if you would have added an URL with a comparison or a benchmark.Pointblank
ranking by popularity does not mean Elasticsearch outperforms PostgreSQL when it comes to full text search. "Better" and "It's should be the common sense" mean we expect to see some benchmark or test that compares those two technologies in your answer which there is not.Fistulous
"Number of mentioning" and "Public interest" are the worst possible criteria for software selection, save maybe site design.Porshaport
This article rocky.dev/full-text-search provides some quantitative comparisons that support your conclusions. Elastic Search is 5x faster than Postgres for its synthetic data set. This matches my experience.Territorial
@PhilipConstantinou the article you cited has been updated with a GIN index and "queries down to 6-10ms when measured in psql, practically the same as Elasticsearch" ...Pointblank
P
105

If PostgreSQL is already in your stack the best option for you is using the PostgreSQL full-text search.

Why full-text search (FTS) in PostgreSQL ?

Because otherwise you have to feed database content to external search engines.

External search engines (e.g. elasticsearch) are fast BUT:

  • They can't index all documents - could be totally virtual
  • They don't have access to attributes - no complex queries
  • They have to be maintained — headache for DBA
  • Sometimes they need to be certified
  • They don't provide instant search (need time to download new data and reindex)
  • They don't provide consistency — search results can be already deleted from database

If you want to read more about FTS in PostgreSQL there's a great presentation by Oleg Bartunov (I extracted the list above from here): "Do you need a Full-Text Search in PostgreSQL ?"

This as a short example how you can create a "Document" (read the text search documentation) from more than one table in SQL:

SELECT to_tsvector(posts.summary || ' ' || brands.name) 
FROM posts
INNER JOIN brands ON (brand_id = brands.id);

If you are using Django for your e-commerce website you can also read this article I wrote on "Full-Text Search in Django with PostgreSQL"

Pointblank answered 14/11, 2019 at 13:34 Comment(10)
Something about elasticsearch's statement is wrong... They can't index all documents: Surely you can! If you have already identified and transform it into your configuration while indexing, just like in PostgreSQL you need to define the DDL first. They don't have access to attributes: Yes it might true due to due to PostgreSQL is general use databases, need to support CRUD well. They have to be maintained: Does PostgreSQL need not to be maintained?... The routine backup, performance tuning is still required no matter what type DB.Erotomania
They don't provide instant search: Well, ES is just strong on instant search...plz try Kibana firstly. They don't provide consistency: This might be the only true statement due to any RDBMS is required on ACID properties.Erotomania
The complete sentence is They don't provide instant search (need time to download new data and reindex) : it means if your user on the e-commerce website (as in the question) buy the last Item1 available, this information is instantly stored on PostgreSQL, and if you use the full-text search of PostgreSQL other users will not find Item1 in the search section. Otherwise if you use Elasitcsearch, you need time to send this new information to Elasticsearch and to reindex before other users will stop seeing Item1 in the search result. Maybe they try to buy it but it's no longer available. :-(Pointblank
About all other points in the list there's only one thing I want to write: In original question @j-s-c wrote that they already have PostgreSQL in their stack so the data are already stored there, they already have access to all attribute to execute full-text search with relational query. BUT if you use Elasticsearch you have to add time to send a small part of data (not all attributes) from PG to ES, time to reindex data in ES. At the end using ES you will have another service to manage, more memory occupied, more storage space to store redundant data and delay in your whole process.Pointblank
A lot of the argument here lacks substance. Arguably the reasons to use Postgres here over ES would be considerations, but not a strong argument for choosing one over the other.Terrill
docs.djangoproject.com/en/3.2/ref/contrib/postgres/search nice example of using postgres with djangoMessenger
Another con is that you would be inducing coupling between PG and ES. A headache for Devs and Dev Architects. Not one can vary without surely not impacting the other.Possession
Sorry, but all of this reads incredibly biased.Bema
This is a reasonable answer for most developers but it doesn't seem to address the question of what types of queries the various solutions offer. For instance, doesn't elasticsearch offer more search options?Philbert
As some other people alluded, the answer is not wrong in that it is easier to use what you already have if you can. But I am not sure one can make an apple to apples comparison here. Elastic Search is a search powerhouse with a ton of features. For example look at all the ways to fine tune relevance results elastic.co/guide/en/app-search/current/…Kamakura
O
11

I've found research for 2021 with some benchmarks

Postgresql vs ElasticSearch performance graph

and useful Conclusion
With each new version of PostgreSQL, the search response time is improving, and it is proceeding toward an apple to apple comparison when compared with ElasticSearch. So, if the project is not going to have millions of records or large-scale data, Postgresql Full-Text Search would be the best option to opt for.

Ovalle answered 27/12, 2022 at 8:44 Comment(4)
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From ReviewAtkinson
hey! i've posted link to benchmark graph. IT IS the essential part of the research, there is NO LINK to research in my answer, only graph with conclusion. What's wrong?Ovalle
Maybe also include a link to the quoted source itself, too, so that people can read more than what you quoted, and so the authors can get some credit? That source appears to be fueled.com/the-cache/posts/backend/… , based on a Google Images search.Shelf
@Shelf thanks! i m've missed itOvalle
C
1

For a full text search implementation on a Postgres stack, I would recommend PGroonga. PGroonga is a PostgreSQL extension which offers: fast, easy to implement (use standard SQL queries with some pgroonga specific operators), full text search support for all languages, including character based languages, like Japanese and Chinese. And all this is out of the box, no extra configurations required.

PGroonga Documentation

PGroonga Github

Cornie answered 3/5 at 2:14 Comment(0)
E
-14

Short Answer: Elasticsearch is better

Explanation: PostgreSQL and Elasticsearch are 2 different types of databases. Elasticsearch is powerful for document searching, and PostgreSQL is a traditional RDBMS. No matter how well PostgreSQL does on its full-text searches, Elasticsearch is designed to search in enormous texts and documents(or records). And the more size you want to search in, the more Elasticsearch is better than PostgreSQL in performance. Additionally, you could also get many benefits and great performance if you pre-process the posts into several fields and indexes well before storing into Elasticsearch.

If you surely need the full-text feature, you may consider MSSQL, which may do better than PostgreSQL.

Reply on Comments: It should be commonsense for the properties comparison on those different types of DBs. Since OP didn't provide what amount and size of data are stored. If this is small size data-in-search, Maybe choose Postgres or ES, both are OK. However, if transactions and data repository become larger in future, ES will provide benefits.

You could check this site to know the current ranking of each type DB, and choose the best one for your requirements, architecture and future data growth of your applications.

Erotomania answered 12/11, 2019 at 5:22 Comment(7)
Agreed on the rethoric but if you have some proof or other sources, it will be more reliable.Paginate
Your answer is only based on your opinions, you haven't wrote any example, benchmark or link to prove your point and I can't see other yours answers on the subject that can prove you know about these software. I see you are a new contributor so I would suggest you for next time to not write absolute sentence and to report your experiences, real data or links to prove your thesis.Pointblank
@conifers good the update and clarification on your answer but the link you added don't prove your point. I had been interested if you would have added an URL with a comparison or a benchmark.Pointblank
ranking by popularity does not mean Elasticsearch outperforms PostgreSQL when it comes to full text search. "Better" and "It's should be the common sense" mean we expect to see some benchmark or test that compares those two technologies in your answer which there is not.Fistulous
"Number of mentioning" and "Public interest" are the worst possible criteria for software selection, save maybe site design.Porshaport
This article rocky.dev/full-text-search provides some quantitative comparisons that support your conclusions. Elastic Search is 5x faster than Postgres for its synthetic data set. This matches my experience.Territorial
@PhilipConstantinou the article you cited has been updated with a GIN index and "queries down to 6-10ms when measured in psql, practically the same as Elasticsearch" ...Pointblank

© 2022 - 2024 — McMap. All rights reserved.