Using BufferedWriter in flask-whooshalchemy
Hi, I am running a Flask app with a PostgreSQL database. I get LockErrors when using multiple workers. I learned that this happens because the Whoosh search index is locked while it is being written to:

https://mcmap.net/q/1627848/-postgres-lockerror-how-to-investigate

As explained in this link, I have to use BufferedWriter. I googled around, but I really can't figure out how to implement it. Here is my database setup as far as Whoosh is concerned:

import sys
if sys.version_info >= (3, 0):
    enable_search = False
else:
    enable_search = True
    import flask.ext.whooshalchemy as whooshalchemy

class User(db.Model):
    __searchable__ = ['username','email','position','institute','id'] # these fields will be indexed by whoosh

    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(100), index=True)
    ...

    def __repr__(self):
        return '<User %r>' % (self.username)

if enable_search:
    whooshalchemy.whoosh_index(app, User)

Help is much appreciated. Thanks, Carl

EDIT: If flask-whooshalchemy has no support for parallel access, are there any alternatives you could suggest?

Biceps answered 22/4, 2016 at 19:47 Comment(0)

As you can read here:

http://whoosh.readthedocs.io/en/latest/threads.html

Only one writer can hold the lock at a time. A buffered writer keeps your data in memory for a while, but at some point the documents have to be written to the index, and that means taking the lock.

According to that document, the async writer is what you are looking for, but it has a cost: it tries to take the lock, and if that fails it starts an additional thread and retries there. Suppose you are throwing 1000 new items at it; potentially you end up with something like 1000 threads. It can be better to treat each insert as a task and hand it to a separate thread. If there are many processes, you can queue those tasks up: for instance, insert 10 and wait. If those 10 are inserted as a batch in a short time, that will work, at least for some time.
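The "treat each insert as a task" idea can be sketched with the standard library alone. This is a minimal illustration, not something flask-whooshalchemy provides: `run_batch_writer` and `commit_batch` are hypothetical names, and `commit_batch` stands in for whatever opens a Whoosh writer, adds the documents and commits. Here it only records the batch, so the example runs without an index.

```python
import queue
import threading


def run_batch_writer(tasks, commit_batch, batch_size=10):
    """Drain `tasks` and pass documents to `commit_batch` in groups.

    Only this one thread ever calls `commit_batch`, so only one writer
    would ask for the index lock at a time. A `None` item means shutdown.
    """
    batch = []
    while True:
        item = tasks.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            commit_batch(batch)
            batch = []
    if batch:
        # Flush whatever is left when the shutdown signal arrives.
        commit_batch(batch)


committed = []  # stand-in for the index: one entry per committed batch
tasks = queue.Queue()
worker = threading.Thread(
    target=run_batch_writer, args=(tasks, committed.append, 10)
)
worker.start()

# Request handlers would only enqueue; they never touch the writer.
for n in range(25):
    tasks.put({"title": "doc %d" % n})

tasks.put(None)  # ask the writer thread to finish
worker.join()

print([len(batch) for batch in committed])  # [10, 10, 5]
```

An in-process queue like this only serializes writes inside one worker process. With several worker processes, the tasks would have to go through something shared between them, such as an external task queue.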

Edit

Sample with the async writer. To make it buffered, simply change the import and the class you instantiate.

import os, os.path
from whoosh import index
from whoosh.fields import SchemaClass, TEXT, KEYWORD, ID

if not os.path.exists("data"):
    os.mkdir("data")

# http://whoosh.readthedocs.io/en/latest/schema.html
class MySchema(SchemaClass):
    path = ID(stored=True)
    title = TEXT(stored=True)
    icon = TEXT
    content = TEXT(stored=True)
    tags = KEYWORD

# http://whoosh.readthedocs.io/en/latest/indexing.html
ix = index.create_in("data", MySchema, indexname="myindex")

writer = ix.writer()
writer.add_document(title=u"My document", content=u"This is my document!",
                    path=u"/a", tags=u"first short", icon=u"/icons/star.png")
writer.add_document(title=u"Second try", content=u"This is the second example.",
                    path=u"/b", tags=u"second short", icon=u"/icons/sheep.png")
writer.add_document(title=u"Third time's the charm", content=u"Examples are many.",
                    path=u"/c", tags=u"short", icon=u"/icons/book.png")
writer.commit()

# needed to release lock
ix.close()

#http://whoosh.readthedocs.io/en/latest/api/writing.html#whoosh.writing.AsyncWriter
from whoosh.writing import AsyncWriter

ix = index.open_dir("data", indexname="myindex")

writer = AsyncWriter(ix)
writer.add_document(title=u"My document no 4", content=u"This is my document!",
                    path=u"/a", tags=u"four short", icon=u"/icons/star.png")
writer.add_document(title=u"5th try", content=u"This is the second example.",
                    path=u"/b", tags=u"5 short", icon=u"/icons/sheep.png")
writer.add_document(title=u"Number six is coming", content=u"Examples are many.",
                    path=u"/c", tags=u"short", icon=u"/icons/book.png")
writer.commit()
Mulberry answered 7/5, 2016 at 22:12 Comment(3)
Hi Michal, thanks for your reply. In my case I do not have a big load on the database; I just get a lock error every now and then and would like to avoid it. A standard solution like BufferedWriter would be completely fine. I just can't find an example of how to use it with flask-whooshalchemy. – Biceps
Does that mean flask-whooshalchemy does not provide this functionality? – Biceps
I don't know. I had an idea to create something that would be to Whoosh what Elasticsearch is to Lucene, maybe with a compatible interface :) – Radiotelephony
