Where are the reference pages of the Google App Engine bulkloader transform?

Starting from an empty datastore, I was able to auto-generate a bulkloader.yaml file. It contains only the python_preamble; the transformers section is empty.

python_preamble:
- import: google.appengine.ext.bulkload.transform
- import: google.appengine.ext.bulkload.bulkloader_wizard
- import: my_own_transformers
- import: data_models  # This is where the SomeData class is defined.
# some more imports here

Then based on the examples in the documentation, I need to define a property map for each of the columns in my CSV:

transformers:
- kind: SomeData
  connector: csv
  property_map:
    - property: date
      import_transform: transform.some_undocumented_function

Two Questions:

My understanding is that the function defined as the import_transform will transform the ordinary CSV string into a Property Class worthy of the datastore. I want to understand how the transforms work, so I think I have two alternatives.

  1. Where is the library reference for google.appengine.ext.bulkload.transform? I want to know how to use transform.some_undocumented_function, as well as all the other transform.some_other_undocumented_transformers.

  2. You can see from my python_preamble that I import my_own_transformers. In that module, I defined a function transform_date that takes an ISO date string such as 2001-01-01 and transforms it into a type that can fit into db.DateProperty(). If my concept is correct, can I use:

property_map:
  - property: date
    import_transform: my_own_transformers.transform_date
Riojas answered 25/7, 2011 at 14:24 Comment(7)
You don't transform data into property classes, or even property class instances. Property classes are a detail of how db.Model denotes models; the data types they can store are completely separate. - Turtleback
@Nick, I think you mean db.Model is an example of a Model Class, a subclass of which I called SomeData in my question, and a db.DateProperty() is an example of a Property Class. That's what I also meant. I'm not sure where I stumbled in my explanation. - Riojas
@Nick, I think I should have said "transform the CSV string into a data type required by a Property Class instance owned by SomeData". Is that correct? - Riojas
No, you transform the string into a data type supported by the datastore. Property classes don't come into it, because the datastore loads data at a lower level than that. - Turtleback
@Nick, Property Classes do play a major role. If one of my models has a name = db.StringProperty(), name becomes an instance of db.StringProperty, which requires that any value assigned to name must be a Python str or unicode. Otherwise, name won't accept any other data type, and raises an exception. - Riojas
You're describing how db.Model works. The bulkloader (the yaml-based one, that is) does not use models by default - it inserts data directly into the datastore. - Turtleback
Thanks, @Nick. It's all clearer to me now. - Riojas

1)
You can check the source code, or run something like this in the interactive console:

from google.appengine.ext.bulkload import transform 
help(transform)

You will get:

Help on module google.appengine.ext.bulkload.transform in google.appengine.ext.bulkload:

NAME
    google.appengine.ext.bulkload.transform - Bulkloader Transform Helper functions.

FILE
    /Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/bulkload/transform.py

DESCRIPTION
    A collection of helper functions for bulkloading data, typically referenced
    from a bulkloader.yaml file.

FUNCTIONS
    blob_to_file(filename_hint_propertyname=None, directory_hint='')
        Write the blob contents to a file, and replace them with the filename.

        Args:
          filename_hint_propertyname: If present, the filename will begin with
            the contents of this value in the entity being exported.
          directory_hint: If present, the files will be stored in this directory.

        Returns:
          A function which writes the input blob to a file.

    blobproperty_from_base64 = wrapper(value)

    bytestring_from_base64 = wrapper(value)

    child_node_from_list(child_node_name)
        Return a value suitable for generating an XML child node on export.

        The return value is a list of tuples which the simplexml connector will
        use to build a child node.

        See also list_from_child_node

        Args:
          child_node_name: The name to use for each child node.

        Returns:
          Transform function which works as described in the args.

    create_deep_key(*path_info)
        A method to make multi-level Key objects.

        Generates multi-level key from multiple fields in the input dictionary.

        This is typically used for Keys for entities which have variable parent keys,
        e.g. ones with owned relationships. It can used for both __key__ and
        references.

        Use create_foreign_key as a simpler way to create single level keys.

        Args:
          path_info: List of tuples, describing (kind, property, is_id=False).
            kind: The kind name.
            property: The external property in the current import dictionary, or
              transform.CURRENT_PROPERTY for the value passed to the transform.
            is_id: Converts value to int and treats as numeric ID if True, otherwise
              the value is a string name. Default is False.
            Example:
              create_deep_key(('rootkind', 'rootcolumn'),
                              ('childkind', 'childcolumn', True),
                              ('leafkind', transform.CURRENT_PROPERTY))

        Returns:
          Transform method which parses the info from the current neutral dictionary
          into a Key with parents as described by path_info.

    create_foreign_key(kind, key_is_id=False)
        A method to make one-level Key objects.

        These are typically used in ReferenceProperty in Python, where the reference
        value is a key with kind (or model) name name.

        This helper method does not support keys with parents. Use create_deep_key
        instead to create keys with parents.

        Args:
          kind: The kind name of the reference as a string.
          key_is_id: If true, convert the key into an integer to be used as an id.
            If false, leave the key in the input format (typically a string).

        Returns:
          Single argument method which parses a value into a Key of kind entity_kind.

    empty_if_none(fn)
        A wrapper for a value to return '' if it's None. Useful on export.

        Can be used in config files (e.g. "transform.empty_if_none(unicode)" or
        as a decorator.

        Args:
          fn: Single argument transform function.

        Returns:
          Wrapped function.

    export_date_time(format)
        A wrapper around strftime. Also returns '' if the input is None.

        Args:
          format: Format string for strftime.

        Returns:
          Single argument method which convers a datetime into a string using format.

    import_date_time(format, _strptime=None)
        A wrapper around strptime. Also returns None if the input is empty.

        Args:
          format: Format string for strptime.

        Returns:
          Single argument method which parses a string into a datetime using format.

    join_list(delimeter)
        Join a list into a string using the delimeter.

        This is just a wrapper for string.join.

        Args:
          delimeter: The delimiter to use when joining the string.

        Returns:
          Method which joins the list into a string with the delimeter.

    key_id_or_name_as_string = transform_function(key)

    key_id_or_name_as_string_n(index)
        Pull out the nth (0-based) key id or name from a key which has parents.

        If a key is present, return its id or name as a string.

        Note that this loses the distinction between integer IDs and strings
        which happen to look like integers. Use key_type to distinguish them.

        This is a useful complement to create_deep_key.

        Args:
          index: The depth of the id or name to extract. Zero is the root key.
              Negative one is the leaf key.

        Returns:
          Function extracting the name or ID of the key at depth index, as a unicode
          string. Returns '' if key is empty (unsaved), otherwise raises IndexError
          if the key is not as deep as described.

    key_kind = wrapper(value)

    key_kind_n(index)
        Pull out the nth (0-based) key kind from a key which has parents.

        This is a useful complement to create_deep_key.

        Args:
          index: The depth of the id or name to extract. Zero is the root key.
            Negative one is the leaf key.

        Returns:
          Function returning the kind of the key at depth index, or raising
          IndexError if the key is not as deep as described.

    key_type = transform_function(key)

    key_type_n(index)
        Pull out the nth (0-based) key type from a key which has parents.

        This is most useful when paired with key_id_or_name_as_string_n.
        This is a useful complement to create_deep_key.

        Args:
          index: The depth of the id or name to extract. Zero is the root key.
              Negative one is the leaf key.

        Returns:
          Method returning the type ('ID' or 'name') of the key at depth index.
          Returns '' if key is empty (unsaved), otherwise raises IndexError
          if the key is not as deep as described.

    list_from_child_node(xpath, suppress_blank=False)
        Return a list property from child nodes of the current xml node.

        This applies only the simplexml helper, as it assumes __node__, the current
        ElementTree node corresponding to the import record.

        Sample usage for structure:
         <Visit>
          <VisitActivities>
           <Activity>A1</Activity>
           <Activity>A2</Activity>
          </VisitActivities>
         </Visit>

        property: activities
        external_name: VisitActivities # Ignored on import, used on export.
        import_transform: list_from_xml_node('VisitActivities/Activity')
        export_transform: child_node_from_list('Activity')

        Args:
          xpath: XPath to run on the current node.
          suppress_blank: if True, ndoes with no text will be skipped.

        Returns:
          Transform function which works as described in the args.

    list_from_multiproperty(*external_names)
        Create a list from multiple properties.

        Args:
          external_names: List of the properties to use.

        Returns:
          Transform function which returns a list of the properties in external_names.

    none_if_empty(fn)
        A decorator which returns None if its input is empty else fn(x).

        Useful on import.  Can be used in config files
        (e.g. "transform.none_if_empty(int)" or as a decorator.

        Args:
          fn: Single argument transform function.

        Returns:
          Wrapped function.

    property_from_list(index)
        Return the Nth item from a list, or '' if the list is shorter.

        Args:
          index: Item in the list to return.

        Returns:
          Function returning the item from a list, or '' if the list is too short.

    regexp_bool(regexp, flags=0)
        Return a boolean if the expression matches with re.match.

        Note that re.match anchors at the start but not end of the string.

        Args:
          regexp: String, regular expression.
          flags: Optional flags to pass to re.match.

        Returns:
          Method which returns a Boolean if the expression matches.

    regexp_extract(pattern, method=<function match at 0x336270>, group=1)
        Return first group in the value matching the pattern using re.match.

        Args:
          pattern: A regular expression to match on with at least one group.
          method: The method to use for matching; normally re.match or re.search.
          group: The group to use for extracting a value.

        Returns:
          A single argument method which returns the group_arg group matched,
          or None if no match was found or the input was empty.

    regexp_to_list(pattern)
        Return function that returns a list of objects that match the regex.

        Useful on import.  Uses the provided regex to split a string value into a list
        of strings.  Wrapped by none_if_input_or_result_empty, so returns none if
        there are no matches for the regex and none if the input is empty.

        Args:
          pattern: A regular expression pattern to match against the input string.

        Returns:
          None if the input was none or no matches were found, otherwise a list of
          strings matching the input expression.

    split_string(delimeter)
        Split a string using the delimeter into a list.

        This is just a wrapper for string.split.

        Args:
          delimeter: The delimiter to split the string on.

        Returns:
          Method which splits the string into a list along the delimeter.

DATA
    CURRENT_PROPERTY = None
    KEY_TYPE_ID = 'ID'
    KEY_TYPE_NAME = 'name'
    __loader__ = <google.appengine.tools.dev_appserver.HardenedModulesHook...
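
Many of the helpers listed above are factories: calling one in bulkloader.yaml (for example transform.none_if_empty(int)) returns a single-argument function, which is then applied to each value. The sketch below illustrates that pattern. It is written only from the docstrings above; the three functions are simplified stand-ins, not the SDK source, and the real implementations may differ in edge cases.

```python
import re


def none_if_empty(fn):
    """Stand-in: return None for empty input, otherwise fn(value)."""
    def wrapper(value):
        if value is None or value == '' or value == []:
            return None
        return fn(value)
    return wrapper


def split_string(delimiter):
    """Stand-in: return a function that splits a string on delimiter."""
    def split(value):
        return value.split(delimiter)
    return split


def regexp_extract(pattern, method=re.match, group=1):
    """Stand-in: return a function that extracts one group, or None."""
    def extract(value):
        if not value:
            return None
        match = method(pattern, value)
        return match.group(group) if match else None
    return extract


# Each call builds the single-argument function that would be applied
# to one CSV cell.
to_int = none_if_empty(int)
to_tags = none_if_empty(split_string(','))
to_year = regexp_extract(r'(\d{4})-\d{2}-\d{2}')
```

Because every factory returns an ordinary function, they compose: none_if_empty(split_string(',')) first checks for an empty cell and only then splits.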

2)
Exactly: you can use your own transform functions. In this specific case, though, you could use transform.import_date_time directly.

import_date_time(format, _strptime=None)
   A wrapper around strptime. Also returns None if the input is empty.

   Args:
     format: Format string for strptime.

   Returns:
     Single argument method which parses a string into a datetime using format.
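
For the date column in the question, either approach might look roughly like the sketch below. Here import_date_time is a simplified stand-in written from the docstring above (not the SDK implementation), and transform_date is the hypothetical custom function from the question's my_own_transformers module. Both return a datetime.datetime, on the assumption that this is the value type the datastore stores for a date property.

```python
import datetime


def import_date_time(format):
    """Stand-in for transform.import_date_time, based on its docstring."""
    def parse(value):
        if not value:
            return None  # empty CSV cell: store no value
        return datetime.datetime.strptime(value, format)
    return parse


def transform_date(value):
    """Hypothetical custom transform for ISO dates such as 2001-01-01."""
    return import_date_time('%Y-%m-%d')(value)
```

In bulkloader.yaml, the corresponding property_map entries would be import_transform: transform.import_date_time('%Y-%m-%d') for the built-in helper, or import_transform: my_own_transformers.transform_date for the custom one.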
Tigress answered 25/7, 2011 at 19:52 Comment(0)
