Is it worth trying MonetDB?
Asked Answered
C

2

8

Has anyone had any experience with MonetDB? Currently, I have a MySQL database that is growing too large, and queries are getting too slow. According to column-oriented paradigm, insertions will be slower (which I don't mind at all), but data retrieval becomes very fast. Do I stand a chance of getting more data retrieval performance just by switching to MonetDB? Is it MonetDB mature enough?

Carbylamine answered 15/9, 2011 at 17:48 Comment(2)
Any benchmark comparing MonetDB against Hyperdex, Aerospike, DynamoDB, Voldermort, VoltDB or ExtremeDB?Dawndawna
Just wonder if you tried MonetDB out? If the performance is good for you?Sanferd
H
18

You have a chance of improving the performance of your application. The gain is, however, largely dependent on your workload, the size of your database and your hardware. MonetDB is developed/tuned under two main assumptions:

  1. Your workload is analytical, i.e., you have lots of (grouped) aggregations and the like.
  2. Even more important: your hot dataset (the data that you actually work with) fits into the main memory of your system. MonetDB does not have it's own Buffer Manager but relies on the OS to handle disk I/O. Since the OS (especially windows but Linux too) is sometimes very dumb about disk swapping that may become a problem (especially for joins that run out of memory).

As for the maturity, there are probably more opinions on that than people inhabiting this planet. Personally, I find it mature enough but I am a member of the development team and, thus, biased. But MonetDB is a research project so if you have an interesting application we'd love to hear about it and see if we can help.

Hoodwink answered 16/9, 2011 at 10:34 Comment(5)
Some further description: let's say my table has these fields (name, birth_date, social_security_id, drivers_licence_id, annual_income), I want to be able to do this: select * from persons where name > "M" and birth_date between DATE1 and DATE2 and annual_income between 10 and 100; AND i want to be able to SORT BY any of these fields. All this ranges are killing performance if the table grows really big. I have the feeling MonetDB can't help much in this case but if there's small chance, I'll give it a try.Carbylamine
Well, I'd say that depends on the size of your intermediate results (i.e., the number of tuples that qualify for each condition). If their IDs (internal 64-bit integers) fit into main memory you should be fine. If not, it might still perform decently if you omit the order by. One thing to note about MonetDB is that all operations are implemented very efficiently but all intermediate results are materialized in main memory (or potentially disk) which may kill performance if you don't have enough RAM. I'd say you might give MonetDB a try.Hoodwink
"Fit in RAM" compressed or uncompressed? I mean, shall I have RAM enought to fit all the content of the "dbfarm" folder ? (talking about one database with one big table). ThanksCherlycherlyn
No. You don't need keep you whole dbfarm folder in memory. Only the data that is hot should fit in-memory (mind that this includes intermediate results). Given that MonetDB is a "column-store", having enough memory to keep the hot columns in memory is enough. You may even get by with less if one parts of your column is accessed (the OS determines "hotness" of data on a memory page granularity - usually 4K).Hoodwink
What DB is able to work with larger database not fitting in memory (10TB) . I would like to be able to perform operations such as sorting or converting tables from wide format to long. A DB working natively on Windows would be ideal.Dawndawna
H
4

The answer of course depends on your payload but my experience so far would seem to indicate that about everything is faster in MonetDB than I've seen in MySQL. The exception would be joins, which not only seem slow, but seem completely inept at pipelining so you end up needing gobs of memory to process large ones. That said my experience with joins in MySQL hasn't exactly been stellar either, so I'm guessing your expectations may be low. If you really want good join performance, I'd probably recommend SQL Server or the like; for those other queries you mention in the follow up comments, MonetDB should be awesome.

For instance, given a table with about 2 million rows in it, I was able to range on one column (wherin there were about 800K rows in the range) and order by another column and the limited result was processed and returned in 25ms. Performance of those type of queries does seem to degrade with scale, but that should give you a taste for what you might expect at that scale.

I should caution that the optimistic concurrency model might throw off those that have only been exposed to pessimistic concurrency (most people). I'd research it before wondering why some of your commits fail under concurrent load.

Hypopituitarism answered 2/6, 2012 at 6:22 Comment(1)
I would say most people are familiar with the OCC model since most ORMs do it. Pessimistic concurrency and MVCC on the other hand most people are not familiar with it (MySQL didn't original support it and most non-enterprise web apps are devoid of transactions and some ORMS do not even support row/table locking).Elvieelvin

© 2022 - 2024 — McMap. All rights reserved.