Dockerfile COPY vs "docker run ... cp"

We have a build pipeline that first does a docker build using a Dockerfile. That Dockerfile has a number of COPY commands. A later step does a docker run with a cp command, as follows:

docker run --volume /hostDirA:/containerDirB --workdir /folderB dockerRepo:somebuildnum cp -R /hostDirC/. /containerDirB

First, before the main point, it is my understanding that the cp command is copying from one folder to another, both folders on the container. Is that a correct understanding?

Second, why would a cp be done this way in the docker run when COPY is already being done in the docker build via the Dockerfile? Are there valid reasons not to move this cp into the Dockerfile?

Palpate answered 28/1, 2021 at 14:14 Comment(6)
How do HostDirA and HostDirC relate to each other? If one is under the other, you need to show that. And are there any other VOLUME specifications in your Dockerfile? - Fantoccini
COPY is appropriate when you're creating a new layer. cp is appropriate when you're copying content out of the container onto a volume. To be clear, docker run cp does not create a new layer. Thus, it is not a substitute for a COPY directive. - Fantoccini
If containerDirB has a symlink in its path that should make it land on a volume, that too would make this make more sense. - Fantoccini
That said, "why is this pre-existing code written the way it is?" questions often require us to be mind readers, and this is one of those times. Sometimes there are extra circumstances (such as the above-referenced symlink possibility). Sometimes the person writing it just made a mistake. Figuring out which requires investigating context -- what code relies on that cp having occurred, how and when is it invoked, does it actually work as intended? -- which is not something suited to our format. - Fantoccini
Can you clarify the directory names a little bit? If the cp source folder is something from the image (not a host directory), and its destination folder is the second --volume directory, then this seems like it's trying to copy data out of an image build on to the host. - Rotherham
@CharlesDuffy I'm inquiring at my company as to why this was done in such a way. My hope was that this was a clear pattern or anti-pattern. Lots of meetings today, so it will be a bit before I can address your questions. Thanks! - Palpate

Without knowing what the files are for, we can only take wild guesses.

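Volumes are used for data persistence, whereas COPY bakes data into the image at build time. A container can read and modify that data while it runs, but the modifications live in the container's writable layer and are discarded when the container is removed.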

One possible valid scenario for why data is copied into the volume instead of using the COPY command is that persistent data needs to be initialized and they don't want that initialization data to add bloat to the container image. Like I said, it's just a wild guess.

Another guess is that the Dockerfile is shared between developers and the initialization data between groups may vary, so one set of developers might copy different data into the volume.

Whatever the data is, if you shut down and remove the container, any changes made to files that came from COPY vanish with the container, and a new container starts again from the copy baked into the image. Data written into the volume by the cp, which runs inside the container, stays in the host directory that was mounted. That data may have changed while the container was running, but it doesn't reset when you remove the container and spawn a new one from the image.
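
As a rough illustration of the seeding guess above, here is a minimal sketch and not the pipeline's actual script. It assumes the seed files were already placed in the image at build time. The function name seed_volume, the mount point /seed-target, and the paths in the example invocation are hypothetical. The cp runs inside a throwaway container, so its source is a path in the image's filesystem and its destination is the bind-mounted host directory.

```shell
#!/bin/sh
# Sketch only: copy files that exist inside an image into a host directory
# by running cp in a short-lived container. All names are hypothetical.

DOCKER="${DOCKER:-docker}"   # set DOCKER=echo to print the command instead of running it

seed_volume() {
    host_dir=$1                 # host directory that should receive the files
    image=$2                    # image whose filesystem holds the seed files
    src=$3                      # path inside the image (for example, put there by COPY)
    mount_point=/seed-target    # hypothetical mount point inside the container

    # --rm removes the container once cp exits. The copied files remain in
    # host_dir because they were written through the bind mount.
    "$DOCKER" run --rm \
        --volume "$host_dir:$mount_point" \
        "$image" \
        cp -R "$src/." "$mount_point"
}

# Example invocation (not executed here):
#   seed_volume /srv/app-data dockerRepo:somebuildnum /opt/seed
```

Because nothing is committed back to the image, this step adds no layer. It only changes the contents of the host directory, which is why it cannot simply be replaced by another COPY in the Dockerfile.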

You should ask the developer what all the files are for, and whether the files need to persist or whether they can just be "ephemeral". This will probably answer your questions as to why they are copied the way they are.

Meissen answered 21/10, 2021 at 6:48
