How to persist data in a dockerized postgres database using volumes
J

6

405

My docker compose file has three containers, web, nginx, and postgres. Postgres looks like this:

postgres:
  container_name: postgres
  restart: always
  image: postgres:latest
  volumes:
    - ./database:/var/lib/postgresql
  ports:
    - 5432:5432

My goal is to mount a local folder called ./database into the postgres container at /var/lib/postgresql. When I start these containers and insert data into postgres, I can verify (inside the postgres container) that /var/lib/postgresql/data/base/ is full of the data I'm adding, but on my local system ./database only gets a data folder in it, i.e. ./database/data is created, but it's empty. Why?

Notes:

UPDATE 1

Per Nick's suggestion, I did a docker inspect and found:

    "Mounts": [
        {
            "Source": "/Users/alex/Documents/MyApp/database",
            "Destination": "/var/lib/postgresql",
            "Mode": "rw",
            "RW": true,
            "Propagation": "rprivate"
        },
        {
            "Name": "e5bf22471215db058127109053e72e0a423d97b05a2afb4824b411322efd2c35",
            "Source": "/var/lib/docker/volumes/e5bf22471215db058127109053e72e0a423d97b05a2afb4824b411322efd2c35/_data",
            "Destination": "/var/lib/postgresql/data",
            "Driver": "local",
            "Mode": "",
            "RW": true,
            "Propagation": ""
        }
    ],

Which makes it seem like the data is being stolen by another volume I didn't code myself. Not sure why that is. Is the postgres image creating that volume for me? If so, is there some way to use that volume instead of the volume I'm mounting when I restart? Otherwise, is there a good way of disabling that other volume and using my own, ./database?

Jackass answered 13/1, 2017 at 15:2 Comment(6)
Did you already run the initdb command to initialize your database cluster? - Ariose
Are you sure your data subdirectory is really empty? It might have special access permissions. - Patchouli
Thanks for getting back to me so fast! I'm using a flask app, so I run from app import db and db.create_all() via docker run after starting the containers. I don't run initdb directly from the command line. - Jackass
@YaroslavStavnichiy I don't know how else to check that than to sudo su - and look in ./database/data. There's nothing in there as far as I can tell. - Jackass
Someone might find this useful: a sample compose file persisting postgres, elastic search and media data, https://mcmap.net/q/87623/-django-app-with-docker-compose-keep-the-data-in-media-volume - Offprint
Seems like if you specify a path as the first part of the entry, this creates a bind mount instead of a volume? - Vonvona
J
520

Strangely enough, the solution ended up being to change

volumes:
  - ./postgres-data:/var/lib/postgresql

to

volumes:
  - ./postgres-data:/var/lib/postgresql/data
Jackass answered 14/1, 2017 at 14:10 Comment(11)
Just a quick "why" for this answer (which works). Per the postgres folks, the default data directory is /var/lib/postgresql/data; you can read the PGDATA variable notes here: store.docker.com/images/… - Mannerless
And add the local directory to your .dockerignore file, especially if you'll ever turn this into a production image. See codefresh.io/blog/not-ignore-dockerignore for a discussion. - Langrage
This still does not work for me (macOS High Sierra). - Thanet
@OlliD-Metz I had to do a docker rm my_postgres_container_1 before it worked (also High Sierra). - Skyway
@Skyway not a very helpful tip :) - Boudicca
If you ran docker-compose during testing without specifying the volume info, I think you will have automatically created volumes that persist. In my case, I had to manually delete them before the container would use the included volume info. My solution turned out to be: docker volume rm <long volume string> - Eslinger
Make sure to check the logs when starting the image back up again. In my case I got an error because the local directory I was trying to mount to wasn't empty. - Sochor
Also, I had an issue where it wasn't realizing the directory was empty if I deleted the data directory and started it up again. Adding a trailing slash to the local folder fixed that (e.g. - ./postgres-data/ instead of - ./postgres-data). - Sochor
Can someone explain why it works? The first (highly upvoted) comment just states the obvious. - Auditor
Upvoted. What is the difference between using a named volume for postgres and using a directory? - Inessential
Be careful not to repeat the same silly mistake that I made: the path as defined in PGDATA is /var/lib/postgresql/data and not /var/lib/postgres/data. If you spot the difference, you are all good to go. It works for me. - Leavitt
M
206

You can create a named volume for all Postgres data:

docker volume create pgdata

or you can declare it in the compose file:

version: "3"
services:
  db:
    image: postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgress
      - POSTGRES_DB=postgres
    ports:
      - "5433:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
    networks:
      - suruse
volumes:
  pgdata:

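This creates a volume named pgdata and mounts it at the container's data path. Note that Compose prefixes the volume name with the project name (for example <project>_pgdata) unless the volume is declared external or given an explicit name.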
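
If the volume was created beforehand with docker volume create pgdata, one way to have Compose reuse it instead of creating its own is to mark it as external. This is a minimal sketch, assuming the volume already exists; it reuses the service name db from the file above, the password is a placeholder, and the postgres:15 tag is an assumption chosen so that the data path matches.

```yaml
version: "3"
services:
  db:
    image: postgres:15  # assumed tag; pick the one you actually run
    environment:
      - POSTGRES_PASSWORD=postgres  # placeholder value
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
    # Tells Compose not to create the volume; it must already exist,
    # for example from `docker volume create pgdata`.
    external: true
```

Volumes declared as external are not removed by docker-compose down -v, which may be useful if accidental data loss is a concern.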

You can inspect this volume

docker volume inspect pgdata
// output will be
[
    {
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/var/lib/docker/volumes/pgdata/_data",
        "Name": "pgdata",
        "Options": {},
        "Scope": "local"
    }
]

Maurilia answered 10/8, 2017 at 6:51 Comment(6)
Commenting a bit late, but won't this clear the data if I do a docker-compose down -v? And what is the solution to that? - Segregationist
@Sid, yes, it will! Just be careful with this option. - Thanet
So with docker-compose [down] the volume is no longer persisted? Does it do a full cleanup, even of the volume? - Kartis
@Segregationist Commenting even later, but you can use docker-compose down --rmi all without the -v option and it'll clear out "everything" except the volumes, i.e. containers, networks, images, etc. I do that when deploying while allowing data persistence. - Unlatch
What is the suruse for? - Auberta
@IanChuTe It's a network name; you can give it any name best suited to your project. - Maurilia
C
18

I would avoid using a relative path. Remember that Docker is split into a client and a daemon.

When you run compose, it essentially breaks down into various docker client commands, which are then passed to the daemon. That ./database is then relative to the daemon, not the client.

Now, the docker dev team has some back and forth on this issue, but the bottom line is it can have some unexpected results.

In short, don't use a relative path, use an absolute path.
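
A minimal sketch of what that looks like in a compose file. It assumes the host path shown in the question's docker inspect output and the container-side path that the asker's own answer settled on; the postgres:15 tag and the password are placeholders, not part of the original question.

```yaml
postgres:
  image: postgres:15  # assumed tag
  environment:
    POSTGRES_PASSWORD: postgres  # placeholder value
  volumes:
    # Absolute host path on the left, container path on the right.
    - /Users/alex/Documents/MyApp/database:/var/lib/postgresql/data
```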

Critchfield answered 13/1, 2017 at 18:43 Comment(3)
Thanks for this answer! Sadly, I don't think it worked. I changed the line to an absolute path, and after inserting the data, the database/data folder is still empty =( - Jackass
K. Next up is to run docker inspect on the container and make sure that the container is aware of the volume (just in case compose is confused or something). (Note: docker inspect output can contain sensitive data, so don't paste it here without munging ;-) After that, it's a matter of checking permissions (although that would usually show an error). - Critchfield
Aha! @Nick Burke I think you've found something. I've updated the question. - Jackass
B
9

The postgres image's Dockerfile contains this line:

VOLUME /var/lib/postgresql/data

When a container starts up based on this image, if nothing else is mounted there, Docker will create an anonymous volume and automatically mount it. You can see this volume using commands like docker volume ls, and that's also the second mount in the docker inspect output you quote.

The main consequence of this is that your external volume mount must be on /var/lib/postgresql/data and not a parent directory. (Also see @AlexLenail's answer.)

If you mount a host directory on /var/lib/postgresql instead, then:

  1. Docker internally sorts the mounts, so parent directories get mounted first.
  2. The host directory is mounted on /var/lib/postgresql.
  3. Nothing is mounted on /var/lib/postgresql/data, so Docker creates the anonymous volume.
  4. The mount point does not exist (it is in the empty bind-mounted host directory) so Docker creates it, also creating the ./database/data directory on the host.
  5. The anonymous volume is mounted on that directory, but only in the container filesystem.

The result is what you see: the database appears to operate correctly, but its data is not necessarily persisted, and you get only the empty data directory on the host.
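
As a minimal sketch of the corrected mount, here is the service from the question with only the container-side path changed. It assumes the VOLUME line quoted above applies to the image tag in use (postgres:15 is used here, as in another answer on this page), and the password is a placeholder.

```yaml
postgres:
  container_name: postgres
  restart: always
  image: postgres:15  # assumed tag; the question used postgres:latest
  environment:
    POSTGRES_PASSWORD: postgres  # placeholder value
  volumes:
    # Bind the host directory directly onto the path the image declares
    # with VOLUME, so nothing is left for an anonymous volume to cover.
    - ./database:/var/lib/postgresql/data
  ports:
    - 5432:5432
```

With a mount like this, docker inspect should show ./database as the only mount at /var/lib/postgresql/data, with no second anonymous volume entry.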

Bonanza answered 10/5, 2023 at 11:0 Comment(0)
K
3

I think you just need to create your volume outside docker first with a docker create -v /location --name and then reuse it.

Back when I used Docker a lot, it wasn't possible to use a static Docker volume with a Dockerfile definition, so my suggestion is to try the command line (possibly with a script).

Karlsbad answered 13/1, 2017 at 23:40 Comment(0)
C
0

I had a similar issue where postgres would create an anonymous volume in addition to the specified one.

  postgres:
    container_name: postgres
    image: postgres:15
    environment:
      PGUSER: postgres
      POSTGRES_PASSWORD: postgres
      PGDATA: /data/postgres
    volumes:
      - postgres_data:/data/postgres

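It turns out that pointing the PGDATA environment variable at a different path causes that: the image still declares a volume at /var/lib/postgresql/data, and nothing is mounted there. The container would properly use the postgres_data volume, but would in addition create an empty anonymous volume.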

You can list the volumes in use with docker inspect postgres; here is the Mounts section of the output:

    "Mounts": [
        {
            "Type": "volume",
            "Name": "modular-service-layer_postgres_data",
            "Source": "/var/lib/docker/volumes/modular-service-layer_postgres_data/_data",
            "Destination": "/data/postgres",
            "Driver": "local",
            "Mode": "z",
            "RW": true,
            "Propagation": ""
        },
        {
            "Type": "volume",
            "Name": "ceef11ea50a07400a798fbda75db4896f35a4a33d41c70cc68f976c187736dbd",
            "Source": "/var/lib/docker/volumes/ceef11ea50a07400a798fbda75db4896f35a4a33d41c70cc68f976c187736dbd/_data",
            "Destination": "/var/lib/postgresql/data",
            "Driver": "local",
            "Mode": "",
            "RW": true,
            "Propagation": ""
        }
    ],

I fixed that by using the proper mount point directly without setting PGDATA:

  postgres:
    container_name: postgres
    image: postgres:15
    environment:
      PGUSER: postgres
      POSTGRES_PASSWORD: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
Crazyweed answered 7/3 at 11:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.