How to de-reference a list of external links using pytables?

I have created external links from one HDF5 file to another using PyTables. How do I de-reference them in a loop?

For example:

Let's assume file_name = "collection.h5" is the file where the external links are stored.

I created the external links under the root node, and when I traverse the nodes under the root, I get the following output:

/link1 (ExternalLink) -> /files/data1.h5:/weights/Image
/link2 (ExternalLink) -> /files/data2.h5:/weights/Image

and so on.

I know that a single link can be de-referenced with natural naming, like this:

from tables import open_file

f = open_file('collection.h5', mode='r')
plink1 = f.root.link1()
plink2 = f.root.link2()

But I want to do this in a for loop. Any help with this?

Eatton asked 28/3, 2019 at 6:25

You can use iter_nodes() or walk_nodes(): walk_nodes() is recursive, iter_nodes() is not. An example of iter_nodes() is explained in my answer to this SO topic: cannot-retrieve-datasets-in-pytables-using-natural-naming. I also found that I couldn't use get_node() to de-reference an ExternalLink; you need to call the link node instead.
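To make the difference between the two iterators concrete, here is a minimal sketch. It assumes PyTables is installed; the file name demo_nodes.h5 and the node names top and inner are made up for illustration and are not part of the question.

```python
import os
import tempfile

import tables as tb

# Build a throwaway file: a group /top that contains an array /top/inner.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo_nodes.h5")  # hypothetical file name

with tb.open_file(path, mode="w") as h5f:
    grp = h5f.create_group("/", "top")
    h5f.create_array(grp, "inner", [1, 2, 3])

with tb.open_file(path, mode="r") as h5f:
    # iter_nodes() yields only the direct children of the given group.
    direct = [node._v_pathname for node in h5f.iter_nodes("/")]
    # walk_nodes() yields the starting group and everything beneath it.
    recursive = [node._v_pathname for node in h5f.walk_nodes("/")]

print(direct)
print(recursive)
```

The first list holds only the root's children, while the second also includes the root itself and the nested array.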

Here's a simple example that creates collection.h5 from a list of HDF5 files in my local folder, then uses iter_nodes() in a for loop. Note that this is a very basic example: it does not check each Node's object type (Group, Leaf, or ExternalLink). It assumes every Node at the root level is an ExternalLink and de-references it by calling the node. PyTables has additional methods and attributes to check for these situations; see my other, more detailed answer for a more robust (and more complicated) method.

import tables as tb
import glob

# Create collection.h5 with one external link per matching HDF5 file
h5f = tb.open_file('collection.h5', mode='w')
link_cnt = 0
for h5name in glob.glob('./SO*.h5'):
    link_cnt += 1
    h5f.create_external_link('/', 'link'+str(link_cnt), h5name+':/')
h5f.close()

# Reopen it and de-reference every node at the root level
h5f = tb.open_file('collection.h5', mode='r')
for link_node in h5f.iter_nodes('/'):
    print("``%s`` is an external link to: ``%s``" % (link_node, link_node.target))
    # Calling the link opens the linked file and returns the target node
    # (here the linked file's root group, since the target ends in ':/')
    plink = link_node(mode='r')

h5f.close()
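For the layout in the question, where each link points at a dataset such as /weights/Image rather than at the root of the linked file, the same loop could look roughly like the sketch below. It assumes PyTables and NumPy are installed. The temporary directory, the tiny 2x2 arrays, and the images dictionary are illustrative choices of mine, not part of the question.

```python
import os
import tempfile

import numpy as np
import tables as tb

tmp_dir = tempfile.mkdtemp()

# Create two small data files, each holding /weights/Image (illustrative data).
data_paths = []
for i in (1, 2):
    data_path = os.path.join(tmp_dir, "data%d.h5" % i)
    with tb.open_file(data_path, mode="w") as data_f:
        grp = data_f.create_group("/", "weights")
        data_f.create_array(grp, "Image", np.full((2, 2), i))
    data_paths.append(data_path)

# Create collection.h5 with one external link per data file.
collection = os.path.join(tmp_dir, "collection.h5")
with tb.open_file(collection, mode="w") as h5f:
    for i, data_path in enumerate(data_paths, start=1):
        h5f.create_external_link("/", "link%d" % i, data_path + ":/weights/Image")

# Loop over the root nodes and de-reference only the external links.
images = {}
with tb.open_file(collection, mode="r") as h5f:
    for node in h5f.iter_nodes("/"):
        if not isinstance(node, tb.link.ExternalLink):
            continue  # skip ordinary groups and leaves
        target = node(mode="r")  # opens the linked file, returns the target node
        images[node._v_name] = target.read()  # copy the array into memory
        node.umount()  # close the linked file opened by the call above

print(sorted(images))
```

Reading the data into memory before calling umount() means the arrays stay usable after the linked files are closed.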
Wiedmann answered 28/3, 2019 at 14:4

This is a more complete (more robust, but more complicated) answer that handles the general case, where an ExternalLink can sit at any group level. It is similar to the simple example in my other answer, but uses walk_nodes() because collection.h5 now has 3 groups at the root level, and it tests for ExternalLink types (see isinstance()). It also shows how to use the _v_children attribute to get a dictionary of nodes. (I couldn't get list_nodes() to work with an ExternalLink.)

import tables as tb
import glob

h5f = tb.open_file('collection.h5',mode='w')
link_cnt = 0
pre_list = ['SO_53', 'SO_54', 'SO_55']
for h5f_pre in pre_list :
    h5f_pre_grp = h5f.create_group('/', h5f_pre)
    for h5name in glob.glob('./'+h5f_pre+'*.h5'):
        link_cnt += 1
        h5f.create_external_link(h5f_pre_grp, 'link_'+'%02d'%(link_cnt), h5name+':/')
h5f.close()

h5f = tb.open_file('collection.h5', mode='r')
for link_node in h5f.walk_nodes('/'):
    if isinstance(link_node, tb.link.ExternalLink):
        print('\nFor Node %s:' % (link_node._v_pathname))
        print("``%s`` is an external link to: ``%s``" % (link_node, link_node.target))
        # Calling the link opens the linked file and returns the target node
        # (here the linked file's root group)
        plink = link_node(mode='r')
        linked_nodes = plink._v_children  # dict-like mapping of child names to nodes
        print(linked_nodes)

h5f.close()
Wiedmann answered 28/3, 2019 at 19:3
