Creating an in-memory table in PostgreSQL?
Asked Answered
B

2

17

My understanding of an in-memory table is a table that will be created in memory and would resort to disk as little as possible, if at all. I am assuming that I have enough RAM to fit the table there, or at least most of it. I do not want to use an explicit function to load tables (like pg_prewarm) in memory, I just want the table to be there by default as soon as I issue a CREATE TABLE or CREATE TABLE AS select statement, unless memory is full or unless I indicate otherwise. I do not particularly care about logging to disk.

7 years ago, a similar question was asked here PostgreSQL equivalent of MySQL memory tables?. It has received 2 answers and one of them was a bit late (4 years later).

One answer says to create a RAM disk and to add a tablespace for it. Or to use an UNLOGGED table. Or to wait for global temporary tables. However, I do not have special hardware, I only have regular RAM - so I am not sure how to go about that. I can use UNLOGGED feature, but as I understand, there is still quite a bit of disk interaction involved (this is what I am trying to reduce) and I am not sure if tables will be loaded in memory by default. Furthermore, I do not see how global temporary spaces are related. My understanding of them is that they are just tables in spaces that can be shared.

Another answer recommends an in-memory column store engine. And to then use a function to load everything in memory. The issue I have with this approach is that the engine being referred to looks old and unmaintained and I cannot find any other. Also, I was hoping I wouldn't have to explicitly resort to using a 'load into memory' function, but instead that everything will happen by default.

I was just wondering how to get in-memory tables now in Postgres 12, 7 years later.

Brashear answered 8/3, 2020 at 2:22 Comment(0)
V
13

Postgres does not have in-memory tables, and I do not have any information about any serious work on this topic now. If you need this capability then you can use one of the special in-memory databases like REDIS, MEMCACHED or MonetDB. There are FDW drivers for these databases. So you can create in-memory tables in a specialized database and you can work with these tables from Postgres via foreign tables.

MySQL in-memory tables were necessary when there was only the MyISAM engine, because this engine had very primitive capabilities with regard to IO and MySQL did not have its own buffers. Now MySQL has the InnoDB engine (with modern form of joins like other databases) and a lot of the arguments for using MySQL in-memory tables are obsolete. In comparison to the old MySQL Postgres has its own buffers and does not bypass file system caches, so all of the RAM is available for your data and you have to do nothing. Ten years ago we had to use MySQL in-memory engine to have good enough performance. But after migrating to Postgres we have had better performance without in-memory tables.

If you have a lot of memory then Postgres can use it by default - via file system cache.

Valles answered 8/3, 2020 at 4:42 Comment(8)
Well, I am in a situation where I must use Postgres and I am not particularly interested in MySQL. How would I use it by default on Posgres "via file system cache"?Brashear
@Brashear - do nothing, postgres does it by everytimeValles
I am interested in creating table using CTAS syntax. Currently in PostgreSQL, this invokes disk IO, that is what I am trying to minimize because I have a lot of available memory. I am seeing your suggestion to use those FDWs from PostgreSQL, but my understanding is that they do not support CTAS? Is this correct?Brashear
@Brashear - if there is lot of write operations, then Postgres has to write to disc. You can reduce writing with unlogged tables or temporary tables, but you cannot to eliminate writing. But unlogged or temp tables are not guarded by transaction log, so the number of write operations is significantly reduced. For FDW you can use COPY (SELECT) TO fdwtab or INSERT INTO fdwtab SELECT ...Valles
Thanks. I was wondering, since you are more familiar with the internals, would you know if the "intermediate" relations of the query (at the query optimizer code) invoke as much disk access? I am assuming these internal structures stay in memory for as long as possible and avoid/delay writing to disk (as I want), maybe sacrificing some guarantees. I wouldn't mind exploring the code further if it is the case and I can access these structures.Brashear
There are not any intermediate relations - Postgres has special structure - tuplestore. When this structure is lower, then work_mem, then data are buffered in memory. When data are higher, then are stored in temp files.Valles
Is there straightforward functionality to perform SQL over tuplestores?Brashear
You can use SELECT * FROM tab ORDER BY column. Sort is based on usage of tuplestoreValles
H
2

I'm not fully sure, if the following fits to the requirements of the questioner, but it would have helped me reading this Q&A here.

PostgreSQL supports temporary tables, as listed e.g. here in the Create Table Documentation.

According to this answer here and the documentation about temp_buffers, it should be mostly in memory and dropped after usage (depending on the ON COMMIT option).

Horsemint answered 23/8, 2023 at 19:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.