Is there an elegant way to check file integrity with md5 in ansible using md5 files fetched from server?

I have several files on a server that I need to download from an ansible playbook, but because the connection is likely to be interrupted, I would like to check their integrity after download.

I'm considering two approaches:

  1. Store the md5 of those files in ansible as vars
  2. Store the md5 of those files on the server as files with the extension .md5. Such a pair would look like: file.extension and file.extension.md5.

The first approach introduces overhead in maintaining the md5s in ansible: every time someone adds a new file, they need to make sure they add the md5 in the right place.

Its advantage is that a solution already exists: the built-in check of the get_url module, via the checksum parameter with an md5: prefix. E.g.:

- get_url: url=http://example.com/path/file.conf dest=/etc/foo.conf checksum=md5:66dffb5228a211e61d6d7ef4a86f5758

The second approach is more elegant and narrows the responsibility. When someone adds a new file on the server, they make sure to add the .md5 as well and don't even need to touch the ansible playbooks.

Is there a way to use the checksum approach to match the md5 from a file?

Irs answered 6/4, 2016 at 10:52 Comment(0)
S
24

If you wish to go with your method of storing the checksum in files on the server, you can definitely use the get_url checksum arg to validate it.

Download the .md5 file and read it into a var. The file lookup reads from the control node, so the .md5 file must be present there:

- set_fact:
    md5_value: "{{ lookup('file', '/etc/myfile.md5') }}"
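
If the .md5 file ends up on the managed host instead of the control node, the file lookup cannot see it. A minimal sketch of one alternative, assuming the sum is published next to the file as in the question: fetch the .md5 with get_url, then read it back with the slurp module. The URL and the /tmp path are hypothetical placeholders.

```yaml
# Sketch only: the URL and /tmp path are assumptions, not taken from the answer.
- name: Fetch the .md5 file onto the managed host
  ansible.builtin.get_url:
    url: http://example.com/path/file.conf.md5
    dest: /tmp/file.conf.md5
    force: true

- name: Read the remote .md5 file back to the control node
  ansible.builtin.slurp:
    src: /tmp/file.conf.md5
  register: md5_file

# slurp returns the content base64-encoded; split() keeps only the hash
# in case the file also contains a file name (md5sum output format).
- name: Store the hash in md5_value
  ansible.builtin.set_fact:
    md5_value: "{{ (md5_file.content | b64decode).split()[0] }}"
```

This sets the same md5_value variable, so the tasks that consume it do not need to change.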

And then when you download the file, pass the contents of md5_value to get_url:

- get_url:
    url: http://example.com
    dest: /my/dest/file
    checksum: "md5:{{ md5_value }}"
    force: true

Note that it is vital to specify a path to a file in dest; if you set this to a directory (and have a filename in url), the behavior changes significantly.

Note also that you probably need force: true, which causes a fresh download on every run. The checksum is only verified when a file is downloaded; if the file already exists on your host, get_url won't validate the sum of the existing file, which might not be desirable.

To avoid downloading every time, you could use stat to check whether the file already exists, compare its sum with the expected one, and set the force param conditionally.

- stat:
    path: /my/dest/file
    checksum_algorithm: md5  # newer Ansible no longer returns stat.md5; request MD5 in stat.checksum
  register: existing_file

- set_fact:
    force_new_download: "{{ existing_file.stat.checksum != md5_value }}"
  when: existing_file.stat.exists

- get_url:
    url: http://example.com
    dest: /my/dest/file
    checksum: "md5:{{ md5_value }}"
    force: "{{ force_new_download | default(false) }}"

Also, if you are pulling the sums/artifacts from some sort of web server, you can read the value of the sum straight from the URL without first downloading the .md5 file to the host. Here is an example using a Nexus server that hosts the artifacts and their sums:

- set_fact:
    md5_value: "{{ item }}"
  with_url: http://my_nexus_server.com:8081/nexus/service/local/artifact/maven/content?g=log4j&a=log4j&v=1.2.9&r=central&e=jar.md5

This could be used in place of using get_url to download the md5 file and then using lookup to read from it.

Squalid answered 23/5, 2016 at 19:8 Comment(4)
This cannot be right. The file lookup plugin only works on localhost; it CANNOT look up remote files, whereas get_url stores files at the remote dest. Thus the whole concept is actually wrong. - Encrata
At least add delegate_to: 127.0.0.1 to your get_url task. - Encrata
When using stat, also set checksum_algorithm, added in version 2.0 of ansible.builtin. - Scientism
In the latest docs for get_url it's written that "If the checksum does not equal destination_checksum, the destination file is deleted." So it's safe to use it without force: true, assuming nobody else writes the file. - Carolecarolee
T
3

With the stat module:

- stat:
    path: "path/to/your/file"
    checksum_algorithm: md5  # stat.checksum is SHA-1 unless an algorithm is requested
  register: your_file_info

- debug:
    var: your_file_info.stat.checksum
Tocci answered 9/5, 2017 at 15:18 Comment(1)
The stat.md5 attribute no longer exists in the stat output. It now contains only the stat.checksum attribute, which is the SHA-1 sum of the file by default. - Buffalo
S
2

An elegant solution is to combine the following three features provided by ansible itself:

  1. http://docs.ansible.com/ansible/stat_module.html

    Use the stat module to extract the md5 value and register it in a variable.

  2. http://docs.ansible.com/ansible/copy_module.html

    While using the copy module to copy the file from the server, register the checksum it returns in another variable.

  3. http://docs.ansible.com/ansible/playbooks_conditionals.html

    Use a conditional to compare the two variables and report whether the file was copied properly or not.
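
The three steps above could look roughly like the sketch below. It rests on several assumptions: the file paths are hypothetical, the source file is taken to live on the control node (which is where copy reads from), and the comparison uses SHA-1 rather than MD5 because that is what the copy module reports in its checksum return value.

```yaml
# Sketch only: paths are hypothetical; SHA-1 is swapped in for MD5 here.
- name: Compute the SHA-1 of the source file on the control node
  ansible.builtin.stat:
    path: "{{ playbook_dir }}/files/file.conf"
    checksum_algorithm: sha1
  delegate_to: localhost
  register: source_info

- name: Copy the file to the managed host and keep the module result
  ansible.builtin.copy:
    src: "{{ playbook_dir }}/files/file.conf"
    dest: /etc/foo.conf
  register: copy_result

- name: Fail if the copied file does not match the source
  ansible.builtin.assert:
    that:
      - copy_result.checksum == source_info.stat.checksum
    fail_msg: "Checksum mismatch after copy"
```

Note that this compares a file shipped from the control node; it does not cover files pulled over HTTP with get_url, which is the case the question asks about.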

Soft answered 15/4, 2016 at 11:11 Comment(0)
I
1

Another solution is to use url lookup (tested on ansible-2.3.1.0):

- name: Download
  get_url:
    url: "http://localhost/file"
    dest: "/tmp/file"
    checksum: "md5:{{ lookup('url', 'http://localhost/file.md5') }}"
Inductive answered 28/7, 2017 at 12:44 Comment(0)
F
0

I wrote an ansible module with the help of https://pypi.org/project/checksumdir

The module can be found here

Example:

- get_checksum:
    path: path/to/directory
    checksum_type: sha256  # one of: sha1, md5, sha256, sha512
  register: checksum
Forewing answered 8/5, 2019 at 22:30 Comment(0)
S
0

I looked all over for a way to check md5, and none of the answers provided a really elegant one. Finally I came up with this:

- name: download the country data
  get_url:
    url: "{{ pbf_url }}"
    dest: "{{ data_dir }}"
    tmp_dest: "{{ tmp_dir }}"
    checksum: "md5:{{ lookup('url', pbf_url + '.md5', split_lines=False).split()[0] }}"

By default the url lookup returns a list of lines, so split_lines=False disables the splitting and returns the content as a single string. The md5 file (the output of the md5sum command) also contains the file name, like this:

8596eb4d50e63bff41c7121c8964e44a  estonia-latest.osm.pbf

To get only the first part, I do .split()[0].
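
If the .md5 URL might return something unexpected (an error page, an empty body), it may help to validate the extracted value before passing it to get_url. A minimal sketch, assuming the same pbf_url variable as above; expected_md5 is a hypothetical variable name introduced for illustration.

```yaml
# Sketch only: expected_md5 is a made-up name; pbf_url comes from the task above.
- name: Read the expected MD5 from the published .md5 file
  ansible.builtin.set_fact:
    expected_md5: "{{ lookup('url', pbf_url + '.md5', split_lines=False).split()[0] | lower }}"

# An MD5 digest is 32 hexadecimal characters; anything else is rejected early.
- name: Stop if the value does not look like an MD5 digest
  ansible.builtin.assert:
    that:
      - "expected_md5 is match('^[0-9a-f]{32}$')"
    fail_msg: "Unexpected content in the .md5 file"
```

The download task can then use checksum: "md5:{{ expected_md5 }}" instead of repeating the lookup.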

Stridulous answered 10/1 at 8:55 Comment(0)
