Possible to perform cross-database queries with PostgreSQL?
Asked Answered
O

10

217

I'm going to guess that the answer is "no" based on the below error message (and this Google result), but is there anyway to perform a cross-database query using PostgreSQL?

databaseA=# select * from databaseB.public.someTableName;
ERROR:  cross-database references are not implemented:
 "databaseB.public.someTableName"

I'm working with some data that is partitioned across two databases although data is really shared between the two (userid columns in one database come from the users table in the other database). I have no idea why these are two separate databases instead of schema, but c'est la vie...

Olympian answered 5/9, 2008 at 17:9 Comment(0)
Z
165

Note: As the original asker implied, if you are setting up two databases on the same machine you probably want to make two schemas instead - in that case you don't need anything special to query across them.

postgres_fdw

Use postgres_fdw (foreign data wrapper) to connect to tables in any Postgres database - local or remote.

Note that there are foreign data wrappers for other popular data sources. At this time, only postgres_fdw and file_fdw are part of the official Postgres distribution.

For Postgres versions before 9.3

Versions this old are no longer supported, but if you need to do this in a pre-2013 Postgres installation, there is a function called dblink.

I've never used it, but it is maintained and distributed with the rest of PostgreSQL. If you're using the version of PostgreSQL that came with your Linux distro, you might need to install a package called postgresql-contrib.

Zechariah answered 5/9, 2008 at 18:10 Comment(4)
Need to install postgresql-contrib before dblink? Or postgresql-contrib includes dblink? And then the OP's query will work, or do you have to query it differently?Guildhall
From what I can read, dblink doesn't handle the case where you want a query that spans two databases.Rejuvenate
But about performance and simplicity when fdw table is not remote, as in the question, all cross-databases are local and in the same cluster? All documentation is about "remote database" (!) and say nothing about "local database"... There are some simplification and/or some performance gain when all databases are local?Torrens
If you using FDW in the same pg server... The FDW performance is ugly! I was using it, and seeming very slow... So I compare the same query by EXPLAIN ANALYZE, and it shows that FDW-query is ~450 times slow! Yes, use all in the same database splitting things by SQL-schema.Torrens
R
68

dblink() -- executes a query in a remote database

dblink executes a query (usually a SELECT, but it can be any SQL statement that returns rows) in a remote database.

When two text arguments are given, the first one is first looked up as a persistent connection's name; if found, the command is executed on that connection. If not found, the first argument is treated as a connection info string as for dblink_connect, and the indicated connection is made just for the duration of this command.

one of the good example:

SELECT * 
FROM   table1 tb1 
LEFT   JOIN (
   SELECT *
   FROM   dblink('dbname=db2','SELECT id, code FROM table2')
   AS     tb2(id int, code text);
) AS tb2 ON tb2.column = tb1.column;

Note: I am giving this information for future reference. Reference

Redfield answered 5/9, 2008 at 17:9 Comment(3)
Don't forget to create extension CREATE EXTENSION IF NOT EXISTS dblink;Zoospore
Note: We can pass dbname, port, host, user, password attributes in first argument of dblink() function. Example : dblink('dbname=testrds port=5432 host=testrds.rds.amazonaws.com user=testuser password=passwd', 'SELECT id, code FROM table2') AS tb2(id int, code text);Hsiuhsu
From the perspective of designing microservices I believe this is a no go since the databases should be able to migrate independently between two different microservices. Using this solution would create tight coupling between databases belonging to different microservices.Stop
T
25

I have run into this before an came to the same conclusion about cross database queries as you. What I ended up doing was using schemas to divide the table space that way I could keep the tables grouped but still query them all.

Tala answered 5/9, 2008 at 17:17 Comment(2)
If you're coming from a MySQL environment, what MySQL calls databases are really schemas (CREATE SCHEMA == CREATE DATABASE in MySQL), so if you porting something from MySQL using multiple databases, use schemasLinetta
Yep, Postgres 'Databases' are essentially full tenants. They exist in their own universe. (And I've seen horrifying security holes caused by MySQL treating schemas as databases. On one hosting site, our security guy managed to get full access to about 400 other databases SOMEHOW. That got patched next version, but it was an absolute mess of a problem)Xylotomy
K
11

Just to add a bit more information.

There is no way to query a database other than the current one. Because PostgreSQL loads database-specific system catalogs, it is uncertain how a cross-database query should even behave.

contrib/dblink allows cross-database queries using function calls. Of course, a client can also make simultaneous connections to different databases and merge the results on the client side.

PostgreSQL FAQ

Kemberlykemble answered 6/5, 2010 at 21:10 Comment(2)
This additonal information may be misleading and may discourage users to use the above solution.Campbellbannerman
Not only is it not misleading, its the correct answer. The other solutions are innapropriate to many use-cases.Xylotomy
S
6

Yes, you can by using DBlink (postgresql only) and DBI-Link (allows foreign cross database queriers) and TDS_LInk which allows queries to be run against MS SQL server.

I have used DB-Link and TDS-link before with great success.

Starlet answered 22/9, 2008 at 5:47 Comment(0)
T
4

If performance is important and most queries are read-only, I would suggest to replicate data over to another database. While this seems like unneeded duplication of data, it might help if indexes are required.

This can be done with simple on insert triggers which in turn call dblink to update another copy. There are also full-blown replication options (like Slony) but that's off-topic.

Tonnage answered 12/9, 2008 at 15:56 Comment(0)
T
4

I have checked and tried to create a foreign key relationships between 2 tables in 2 different databases using both dblink and postgres_fdw but with no result.

Having read the other peoples feedback on this, for example here and here and in some other sources it looks like there is no way to do that currently:

The dblink and postgres_fdw indeed enable one to connect to and query tables in other databases, which is not possible with the standard Postgres, but they do not allow to establish foreign key relationships between tables in different databases.

Turpeth answered 12/7, 2019 at 13:32 Comment(0)
I
2

In case someone needs a more involved example on how to do cross-database queries, here's an example that cleans up the databasechangeloglock table on every database that has it:

CREATE EXTENSION IF NOT EXISTS dblink;

DO 
$$
DECLARE database_name TEXT;
DECLARE conn_template TEXT;
DECLARE conn_string TEXT;
DECLARE table_exists Boolean;
BEGIN
    conn_template = 'user=myuser password=mypass dbname=';

    FOR database_name IN
        SELECT datname FROM pg_database
        WHERE datistemplate = false
    LOOP
        conn_string = conn_template || database_name;

        table_exists = (select table_exists_ from dblink(conn_string, '(select Count(*) > 0 from information_schema.tables where table_name = ''databasechangeloglock'')') as (table_exists_ Boolean));
        IF table_exists THEN
            perform dblink_exec(conn_string, 'delete from databasechangeloglock');
        END IF;     
    END LOOP;

END
$$
Ionium answered 12/4, 2019 at 13:58 Comment(0)
S
2

see https://www.cybertec-postgresql.com/en/joining-data-from-multiple-postgres-databases/ [published 2017]

These days you also have the option to use https://prestodb.io/

You can run SQL on that PrestoDB node and it will distribute the SQL query as required. It can connect to the same node twice for different databases, or it might be connecting to different nodes on different hosts.

It does not support:

DELETE
ALTER TABLE
CREATE TABLE (CREATE TABLE AS is supported)
GRANT
REVOKE
SHOW GRANTS
SHOW ROLES
SHOW ROLE GRANTS

So you should only use it for SELECT and JOIN needs. Connect directly to each database for the above needs. (It looks like you can also INSERT or UPDATE which is nice)

Client applications connect to PrestoDB primarily using JDBC, but other types of connection are possible including a Tableu compatible web API

This is an open source tool governed by the Linux Foundation and Presto Foundation.

The founding members of the Presto Foundation are: Facebook, Uber, Twitter, and Alibaba.

The current members are: Facebook, Uber, Twitter, Alibaba, Alluxio, Ahana, Upsolver, and Intel.

Sackman answered 6/2, 2021 at 15:55 Comment(1)
Should I use Presto or Trino?Lofton
K
0

In Sybase/MSSQLServer, the logical organization is db.user.table

In Oracle, the logical organization is schema.table

I understand that in PostgreSQL is db.schema.table, but db cannot be shared

The secret is to use only 2 level, db.<defaultuser.>table in Sybase or schema.table in Oracle.

Then in PostgreSQL, if you need to share tables between levels, use only one db (do not create multiples databases, use only postgres), and use multiples schemas like Oracle: postgres.schemaX.table

Kurt answered 20/5, 2023 at 20:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.