I have a MySQL database with a couple of tables, and I want to migrate the MySQL data to Elasticsearch. It's easy to migrate the whole database to ES via a batch job, but how should I update ES from MySQL in real time? i.e. if there is an update operation in MySQL, I should perform the same operation in ES. I researched the MySQL binlog, which reflects any changes in MySQL, but I would have to parse the binlog into ES syntax, which I think would be really painful. Thanks! (The same applies to Solr.)
There is an existing project which takes your binlog, transforms it, and ships it to Elasticsearch. You can check it out at: https://github.com/siddontang/go-mysql-elasticsearch
Another option is this one: https://github.com/noplay/python-mysql-replication.
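To make the transformation step concrete, here is a minimal sketch of turning a MySQL row-change event into an Elasticsearch bulk action. In a real pipeline the events would come from `pymysqlreplication.BinLogStreamReader` (as `WriteRowsEvent`, `UpdateRowsEvent`, `DeleteRowsEvent` objects); they are modeled as plain dicts here so the mapping logic is self-contained, and the index name `products` and the `id` primary key are assumptions about your schema.

```python
import json

def to_bulk_action(event, index="products"):
    """Map one row-change event to Elasticsearch _bulk request lines.

    `event` is a dict standing in for a pymysqlreplication row event:
    {"type": "insert" | "update" | "delete", "row": {...column values...}}.
    Assumes a single-column primary key named "id" (hypothetical schema).
    """
    op = event["type"]
    row = event["row"]
    doc_id = row["id"]
    if op == "delete":
        # Deletes need only an action line, no document body.
        return [json.dumps({"delete": {"_index": index, "_id": doc_id}})]
    # Inserts and updates both become full-document index operations,
    # so replaying the binlog in order converges on the current row state.
    return [
        json.dumps({"index": {"_index": index, "_id": doc_id}}),
        json.dumps(row),
    ]

if __name__ == "__main__":
    events = [
        {"type": "insert", "row": {"id": 1, "name": "widget"}},
        {"type": "update", "row": {"id": 1, "name": "widget v2"}},
        {"type": "delete", "row": {"id": 1}},
    ]
    # The joined lines form an ES _bulk request body (newline-delimited JSON).
    body = "\n".join(line for e in events for line in to_bulk_action(e)) + "\n"
    print(body)
```

The full-document "index" on update (rather than a partial `_update`) keeps the translation simple: a row event carries the whole row image, so overwriting the document is both correct and idempotent on replay.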
Note, however, that whichever you pick, it's good practice to pre-create your index and mappings before indexing your binlog. That gives you more control over your data types.
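A sketch of what pre-creating the mappings might look like: you define the field types up front so ES doesn't infer them from the first document it sees. The index name and the fields (`name`, `price`, `updated_at`) are placeholders for your own schema.

```python
import json

def build_index_body():
    """Build the settings/mappings body for creating the index explicitly.

    Field names and types are illustrative, not a prescribed schema.
    """
    return {
        "settings": {"number_of_shards": 1},
        "mappings": {
            "properties": {
                "name": {"type": "text"},
                # scaled_float stores prices as scaled integers (cents here),
                # avoiding float rounding in aggregations.
                "price": {"type": "scaled_float", "scaling_factor": 100},
                "updated_at": {"type": "date"},
            }
        },
    }

if __name__ == "__main__":
    # Send this once before the initial batch load, e.g.:
    #   curl -X PUT localhost:9200/products \
    #        -H 'Content-Type: application/json' -d @body.json
    print(json.dumps(build_index_body(), indent=2))
```

Without this step, dynamic mapping might, for example, type `updated_at` as text if the first value it sees isn't a recognizable date, and you cannot change a field's type after the fact without reindexing.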
UPDATE:
Here is another interesting blog article on the subject: How to keep Elasticsearch synchronized with a relational database using Logstash
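The Logstash approach in that article is poll-based: it repeatedly selects rows whose modification timestamp is newer than a saved checkpoint and pushes only those to ES. The core idea can be sketched as below; SQLite stands in for MySQL so the example is self-contained, and the table and column names (`products`, `updated_at`) are placeholders.

```python
import sqlite3

def fetch_changes(conn, since):
    """Return rows modified after `since`, plus the advanced checkpoint.

    This mirrors the incremental-sync query a JDBC-style poller would run:
    only rows changed since the last poll are fetched and shipped to ES.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM products "
        "WHERE updated_at > ? ORDER BY updated_at",
        (since,),
    ).fetchall()
    # If nothing changed, keep the old checkpoint; otherwise advance it
    # to the newest timestamp we have seen.
    new_checkpoint = rows[-1][2] if rows else since
    return rows, new_checkpoint

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER, name TEXT, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [(1, "widget", "2024-01-01T00:00:00"),
         (2, "gadget", "2024-01-02T00:00:00")],
    )
    changes, checkpoint = fetch_changes(conn, "2024-01-01T12:00:00")
    print(changes, checkpoint)  # only the row updated after the checkpoint
```

Note one limitation of polling versus the binlog: hard deletes leave no row with a new timestamp, so they are invisible to this query; the article's workaround is soft deletes (a flag column) or periodic full reconciliation.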
On update, ES creates a new version of the document and marks the previous version for deletion. This is the standard way ES works. – Ardella

The best open source solution would be this. You can run it from the command line and supply the incremental logic in the command as well. Go through this session to get a complete idea.
I guess the best option is to simply use the Kafka Connect plugin called Debezium: use the Debezium MySQL connector as the source and the Elasticsearch sink connector as the destination.
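For context on what flows through that pipeline: Debezium emits JSON change events whose payload carries an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read) plus `before`/`after` row images. In practice the Elasticsearch sink connector performs this mapping for you; the function below only illustrates the shape of the translation, and the index name and `id` key are assumptions.

```python
def debezium_to_es(payload, index="products"):
    """Translate one Debezium event payload into an ES action tuple.

    Returns (action, metadata, document); document is None for deletes.
    """
    op = payload["op"]
    if op == "d":
        # Deletes carry the old row in "before"; remove that document.
        doc_id = payload["before"]["id"]
        return ("delete", {"_index": index, "_id": doc_id}, None)
    # Creates, updates, and snapshot reads all carry the new row in "after";
    # indexing the full document makes the operation idempotent on replay.
    doc = payload["after"]
    return ("index", {"_index": index, "_id": doc["id"]}, doc)

if __name__ == "__main__":
    create = {"op": "c", "before": None, "after": {"id": 7, "name": "widget"}}
    delete = {"op": "d", "before": {"id": 7, "name": "widget"}, "after": None}
    print(debezium_to_es(create))
    print(debezium_to_es(delete))
```

The advantage over hand-rolled binlog parsing is that Kafka sits between the connectors, so the ES side can fall behind or be rebuilt without losing changes.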