Availability tracking with Algolia
Asked Answered
E

1

13

I am working on an Airbnb-like website and I am in the process of rewriting our in-house, SQL-based search system with Algolia. It's been a really pleasant journey so far, as I have managed to remove a lot of legacy code and outsource it, with awesome results. However, there is one critical piece of our search system which I am not sure can be implemented with Algolia.

Internally, we store the availability/unavailability (and price) of each date for each asset as a single row in the database. This means our availabilities table looks like this:

asset_id | date       | status      | price_cents
-------- | ---------- | ----------- | -----------
1        | 2017-02-09 | available   | 15000
1        | 2017-02-10 | available   | 15000
1        | 2017-02-11 | unavailable | NULL
1        | 2017-02-12 | available   | 20000

When a user searches for available properties, they enter a date range and, optionally, a price range.

What we're doing now is simply querying the availabilities table and making sure that all dates in the date range are available for that asset (i.e. the count of available dates is equal to the number of days in the range). If the user enters a price range, we also make sure that the average price for those dates is within the requested range. The SQL query is fairly complex, but this is what it does at the end of the day.

I have been trying to replicate this with Algolia, but couldn't find any documentation about a similar feature. In fact, I am facing two separate issues right now:

  • I have no way to ensure all dates in the provided date range are available, because Algolia has little to no knowledge about associations, and
  • I have no way to calculate (and query) the average price for the provided date range, because it depends on user input (i.e. the date range).

Is there a way to achieve this with Algolia? If not, is it feasible to use SQL or another tool in combination with Algolia to achieve the desired result? Of course, I could do all of this with Elasticsearch, but Algolia is so fast and easy that I'd hate to step away from it because of these issues.

Excrescence answered 9/2, 2017 at 13:57 Comment(1)
Have you found a solution to this? I'm very curious to know how it's done. I've randomly been checking out a housing rental website and I thought of this and now I really need to scratch this itch!Export
B
2

This use-case is definitely complex, and Algolia needs precomputed data in order to work.


Edit 2020 (better solution)

In each item, you could simply store the list of days where the location is available, e.g.

{
  name: "2 bedroom appartment",
  location: "Paris",
  availabilities: ['2020-04-27', '2020-04-28', '2020-04-30']
  price_cents: 30000
}

You could then, at search time, generate the list of all the availabilities you require your items to have, e.g. (available from April 28th to April 30th):

index.search('', {
  filters: '' +
    'availabilities:2020-04-28 AND availabilities:2020-04-29 AND availabilities:2020-04-30 AND ' +
    'price_cents >= ' + lowPriceRange + ' AND price_cents <= ' + highPriceRange 
}) 

In this example, the record wouldn't match as it lacks 2020-04-29.


Another solution, which works more generically, but requires way more records:

I'm assuming there is a cap of the amount of days in advance you can book, I'll assume here it's 90 days.
You could generate every date range possible inside those 90 days.
This would mean generating 90 + 89 + ... = 90 * 91 / 2 = 4095 date ranges.
Then for each of those ranges, and each of the flats you're offering on your service, you could generate an object like this:

{
  name: "2 bedroom appartment",
  location: "Paris",
  availability_range: "2017-02-09 -> 2017-02-10",
  availability_start_timestamp: 10001000,
  availability_end_timestamp: 10002000,
  price_cents: 30000
}

With those objects, then searching for an date range would be as easy as:

index.search('', {
  filters: '' +
    'availability_range:"' + startDate + ' -> ' + endDate + '" AND ' +
    'price_cents >= ' + lowPriceRange + ' AND price_cents <= ' + highPriceRange 
}) 

You would only be indexing available time ranges, so this should greatly reduce the amount of objects, but it would still be probably huge.

Finally, the timestamps in the object would be here to know which ones to delete when a booking is made. The call would be something like:

index.deleteByQuery('', {
  filters: 'availability_start_timestamp < ' + booking_end_timestamp + ' AND availability_end_timestamp > ' + booking_start_timestamp
})
Boride answered 9/2, 2017 at 16:20 Comment(6)
Unfortunately we store availabilities for the entire year. Is there any way I can use SQL to filter before or after Algolia?Excrescence
Then you would be looking at 66795 date ranges per flat, which would probably be way too expensive, indeed.Boride
Is the list of results always expected to be low (<1000)? For instance, are you also restricting by location? If you are, you could maybe use Algolia to get a list of objectIDs matching the text + location query and send this list of IDs to MySQL to compute the available ones out of this list.Boride
You would lose Algolia's speed but would still benefit from its textual relevance.Boride
For the time being, we won't have many results, but going forward we might. How well would such a solution scale? What techniques could I adopt to make it more scalable?Excrescence
Well, as soon as your Algolia results list goes over 1000 items, you'll probably get stuck. You can increase this limit by an order of magnitude, but this won't be infinitely scalable. But I would guess that by then, you would have resources who would be dedicated on working on that. :)Boride

© 2022 - 2024 — McMap. All rights reserved.