How to store a hash in extended file attributes on OS X with Java?
Preface

I am working on a platform-independent media database written in Java, where media files are identified by a file hash. Users should be able to move files around, so I do NOT want to rely on file paths. Once a file is imported, I store its path and hash in my database. I developed a fast file-hash-id algorithm that trades some accuracy for performance, but fast is not always fast enough. :)

In order to update and import media files, I need to (re)create the file hashes of all files in my library. My idea is to calculate the hash just once and store it in the file's metadata (extended attributes), which should boost performance on filesystems that support extended file attributes (NTFS, HFS+, ext3, ...). I have already implemented this; the current source is in archimedesJ.io.metadata.
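The caching idea can be sketched as a small lookup-or-compute helper. This is only an illustration under assumptions: `HashCache`, `AttributeStore`, and `InMemoryStore` are hypothetical names, SHA-256 stands in for the custom fast hash (which is not shown in this post), and the in-memory store stands in for real extended attributes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

class HashCache {

    /** Hypothetical abstraction over "a place to keep one string per file and key". */
    interface AttributeStore {
        String read(Path file, String key) throws IOException; // null if absent
        void write(Path file, String key, String value) throws IOException;
    }

    /** In-memory stand-in for demonstration; a real store would use extended attributes. */
    static class InMemoryStore implements AttributeStore {
        private final Map<String, String> values = new HashMap<>();
        public String read(Path file, String key) { return values.get(file + "#" + key); }
        public void write(Path file, String key, String value) { values.put(file + "#" + key, value); }
    }

    private final AttributeStore store;
    private final String key;

    HashCache(AttributeStore store, String key) {
        this.store = store;
        this.key = key;
    }

    /** Returns the stored hash if there is one, otherwise computes and stores it. */
    String hashOf(Path file) throws IOException {
        String cached = store.read(file, key);
        if (cached != null) {
            return cached;
        }
        String fresh = sha256Hex(file);
        store.write(file, key, fresh);
        return fresh;
    }

    /** Streams the file through SHA-256 (a stand-in for the custom hash). */
    static String sha256Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                digest.update(chunk, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Keeping the storage behind an interface means the same lookup logic could sit on top of whichever attribute mechanism turns out to work on a given platform.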

Attempts

At first glance, Java 1.7 offers a nice way to handle metadata with UserDefinedFileAttributeView. This works on most platforms. Sadly, UserDefinedFileAttributeView does not work on HFS+. I do not understand why HFS+ in particular is not supported, since it is one of the leading filesystems for metadata. (See the related question, which does not provide a solution.)
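For reference, a minimal sketch of the UserDefinedFileAttributeView approach on platforms where it is available. The class name `UserAttributes` is hypothetical; the NIO calls are the standard `java.nio.file` API. The support check lets the caller fall back to another mechanism where the view is missing, as reported here for HFS+.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;

class UserAttributes {

    /** True if the file store holding the file reports support for user-defined attributes. */
    static boolean isSupported(Path file) throws IOException {
        return Files.getFileStore(file)
                .supportsFileAttributeView(UserDefinedFileAttributeView.class);
    }

    /** Writes the value as UTF-8 bytes into the named attribute. */
    static void write(Path file, String name, String value) throws IOException {
        view(file).write(name, StandardCharsets.UTF_8.encode(value));
    }

    /** Reads the named attribute as a UTF-8 string, or returns null if it is absent. */
    static String read(Path file, String name) throws IOException {
        UserDefinedFileAttributeView view = view(file);
        if (!view.list().contains(name)) {
            return null;
        }
        ByteBuffer buffer = ByteBuffer.allocate(view.size(name));
        view.read(name, buffer);
        buffer.flip();
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }

    private static UserDefinedFileAttributeView view(Path file) {
        UserDefinedFileAttributeView view =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        if (view == null) {
            // getFileAttributeView returns null when the view is not available
            throw new UnsupportedOperationException("user-defined attributes not available for " + file);
        }
        return view;
    }
}
```

A caller would typically check `UserAttributes.isSupported(file)` once per file store and only then call `write` and `read`.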

How to store extended file attributes on OS X with Java?

To work around this Java limitation, I decided to use the xattr command-line tool that ships with OS X, running it through Java's Process handling and reading its output. My implementation works, but it is very slow. (Recalculating the file hash is faster, how ironic! I am testing on a MacBook Pro Retina with an SSD.)

It turned out that the xattr tool itself is quite slow. (Writing is very slow, but more importantly, reading an attribute is slow as well.) To show that the problem is the tool rather than Java, I wrote a simple bash script that runs xattr on several files carrying my custom attribute:

# the glob makes the loop visit each file instead of only the directory itself
FILES=/Users/IsNull/Pictures/*
for f in $FILES
do
  xattr -p vidada.hash "$f"
done

When I run it, the lines appear fairly quickly one after another, but I would expect all of the output to appear within milliseconds. A small delay per file is clearly visible, so I conclude that the tool itself is not that fast. Calling it from Java adds the overhead of creating a process and parsing its output, which makes it even slower.
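The Java side of this approach looks roughly like the sketch below. It is not the actual implementation from my project: `XattrTool`, `printCommand`, and `run` are hypothetical names, and the only thing assumed about the tool is the `xattr -p <key> <file>` invocation used in the script.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class XattrTool {

    /** Builds the command line for `xattr -p <key> <file>`. */
    static List<String> printCommand(String key, String file) {
        return Arrays.asList("xattr", "-p", key, file);
    }

    /**
     * Runs a command and returns its trimmed output, or null if it exits
     * with a non-zero status. Throws IOException if the command cannot be started.
     */
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr so the pipe cannot fill up unread
                .start();
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                captured.write(chunk, 0, n);
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            return null;
        }
        return new String(captured.toByteArray(), StandardCharsets.UTF_8).trim();
    }
}
```

On a Mac, `XattrTool.run(XattrTool.printCommand("vidada.hash", path))` starts one new process per file, which is where the extra overhead comes from.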

Is there a better way to access the extended attributes on HFS+ with Java? What is a fast way to work with the extended attributes on OS X with Java?

Outrelief answered 2/5, 2013 at 13:52 Comment(0)
O
4

I have now created a JNI wrapper that accesses the extended attributes directly through the C API. It is an open-source Java Maven project, available on GitHub/xattrj

For reference, I post the interesting source pieces here. For the latest sources, please refer to the above project page.

Xattrj.java

import java.io.File;

public class Xattrj {

    /**
     * Write the extended attribute to the given file
     * @param file
     * @param attrKey
     * @param attrValue
     */
    public void writeAttribute(File file, String attrKey, String attrValue){
        writeAttribute(file.getAbsolutePath(), attrKey, attrValue);
    }

    /**
     * Read the extended attribute from the given file
     * @param file
     * @param attrKey
     * @return
     */
    public String readAttribute(File file, String attrKey){
        return readAttribute(file.getAbsolutePath(), attrKey);
    }

    /**
     * Write the extended attribute to the given file
     * @param file
     * @param attrKey
     * @param attrValue
     */
    private native void writeAttribute(String file, String attrKey, String attrValue);

    /**
     * Read the extended attribute from the given file
     * @param file
     * @param attrKey
     * @return
     */
    private native String readAttribute(String file, String attrKey);


    static {
        try {
            System.out.println("loading xattrj...");
            LibraryLoader.loadLibrary("xattrj");
            System.out.println("loaded!");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

org_securityvision_xattrj_Xattrj.cpp

#include <jni.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "org_securityvision_xattrj_Xattrj.h"
#include <sys/xattr.h>


/**
 * writeAttribute
 * writes the extended attribute
 *
 */
JNIEXPORT void JNICALL Java_org_securityvision_xattrj_Xattrj_writeAttribute
    (JNIEnv *env, jobject jobj, jstring jfilePath, jstring jattrName, jstring jattrValue){

    const char *filePath= env->GetStringUTFChars(jfilePath, 0);
    const char *attrName= env->GetStringUTFChars(jattrName, 0);
    const char *attrValue=env->GetStringUTFChars(jattrValue,0);

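    int res = setxattr(filePath,
                attrName,
                (void *)attrValue,
                strlen(attrValue), 0,  0); // options 0: XATTR_NOFOLLOW is not set
    if(res){
        // an error occurred, see errno
        perror("native:writeAttribute: error on write");
    }

    // release the UTF-8 copies of the java strings
    env->ReleaseStringUTFChars(jfilePath, filePath);
    env->ReleaseStringUTFChars(jattrName, attrName);
    env->ReleaseStringUTFChars(jattrValue, attrValue);
}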


/**
 * readAttribute
 * Reads the extended attribute as string
 *
 * If the attribute does not exist (or any other error occurs)
 * a null string is returned.
 *
 *
 */
JNIEXPORT jstring JNICALL Java_org_securityvision_xattrj_Xattrj_readAttribute
    (JNIEnv *env, jobject jobj, jstring jfilePath, jstring jattrName){

    jstring jvalue = NULL;

    const char *filePath= env->GetStringUTFChars(jfilePath, 0);
    const char *attrName= env->GetStringUTFChars(jattrName, 0);

    // get size of needed buffer
    ssize_t bufferLength = getxattr(filePath, attrName, NULL, 0, 0, 0);

    if(bufferLength > 0){
        // make a buffer of sufficient length, plus one byte for the terminator
        char *buffer = (char*)malloc(bufferLength + 1);

        if(buffer != NULL){
            // now actually get the attribute string
            ssize_t s = getxattr(filePath, attrName, buffer, bufferLength, 0, 0);

            if(s > 0){
                // null terminate the value and convert it to a java string
                buffer[s] = 0;
                jvalue = env->NewStringUTF(buffer);
            }
            free(buffer);
        }
    }

    // release the UTF-8 copies of the java strings
    env->ReleaseStringUTFChars(jfilePath, filePath);
    env->ReleaseStringUTFChars(jattrName, attrName);
    return jvalue;
}

Now the makefile, which took me quite a while to get working:

CC=gcc
LDFLAGS= -fPIC -bundle
CFLAGS= -c -shared -I/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers -m64


SOURCES_DIR=src/main/c++
OBJECTS_DIR=target/c++
EXECUTABLE=target/classes/libxattrj.dylib

SOURCES=$(shell find '$(SOURCES_DIR)' -type f -name '*.cpp')
OBJECTS=$(SOURCES:$(SOURCES_DIR)/%.cpp=$(OBJECTS_DIR)/%.o)

all: $(EXECUTABLE)

$(EXECUTABLE): $(OBJECTS)
	$(CC) $(LDFLAGS) $(OBJECTS) -o $@

# compile each source file into its own object file
$(OBJECTS_DIR)/%.o: $(SOURCES_DIR)/%.cpp
	mkdir -p $(dir $@)
	$(CC) $(CFLAGS) $< -o $@

clean:
	rm -rf $(OBJECTS_DIR) $(EXECUTABLE)
Outrelief answered 3/5, 2013 at 15:57 Comment(2)
Nice, glad to see you got it working. Just a couple comments on your code... Why do you hardcode 255 as the buffer size in your second call to getxattr()? You've already dynamically determined the size of the attribute value. Secondly, if this is intended to set and retrieve arbitrary attributes (rather than being specific to your application) you can't treat all xattr values as strings. They can and often do contain arbitrary binary data; for an example, download a file using Safari and check its xattrs. - Rouault
I plan to create separate methods to support binary data. The buffer size came from the official OS X example, and I simply did it the same way. I will adopt your suggestions. By the way, my experience in C/C++ is rather limited, so any other suggestions are highly appreciated :) For example, I guess my C-string handling could be simplified, since I use the C++ string.h... - Outrelief
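As a possible interim option for binary values such as a raw hash (an assumption for illustration, not something xattrj does), the bytes could be hex-encoded on the Java side before they go through the string-only methods, so the native code never sees a zero byte. `HexCodec` below is a hypothetical helper:

```java
class HexCodec {

    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Encodes each byte as two lowercase hex digits. */
    static String toHex(byte[] data) {
        char[] out = new char[data.length * 2];
        for (int i = 0; i < data.length; i++) {
            int v = data[i] & 0xff;
            out[2 * i] = DIGITS[v >>> 4];
            out[2 * i + 1] = DIGITS[v & 0x0f];
        }
        return new String(out);
    }

    /** Decodes a hex string produced by toHex back into bytes. */
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

The stored attribute is twice as long as the raw value, which is the price for staying within a plain ASCII string.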
R
4

OS X's /usr/bin/xattr is probably rather slow because it's implemented as a Python script. The C API for setting extended attributes is setxattr(2). Here's an example:

if(setxattr("/path/to/file",
            attribute_name,
            (void *)attribute_data,
            attribute_size,
            0,
            XATTR_NOFOLLOW) != 0)
{
  /* an error occurred, see errno */
}

You can create a JNI wrapper to access this function from Java; you might also want getxattr(2), listxattr(2), and removexattr(2), depending on what else your app needs to do.

Rouault answered 2/5, 2013 at 17:42 Comment(2)
I will give the JNI approach a try and post my results! - Outrelief
I have created the JNI wrapper (see my answer) and it's now blazing fast. Thank you! - Outrelief
G
3

I've just contributed my <sys/xattr.h> JNA wrapper to the JNA-platform project. Since it's JNA you won't need to compile any native libraries. :)

@see https://github.com/twall/jna/pull/338

It should be part of the next JNA release.

Graycegrayheaded answered 18/6, 2014 at 13:17 Comment(0)
