How to pass the Step Functions execution ID into a SageMaker batch transform job name?

I have created a Step Functions state machine. Its definition (step-function.json) is used in Terraform and follows the syntax on this page: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTransformJob.html

The first time I execute this state machine, it creates a SageMaker batch transform job named example-jobname. However, I need to execute the state machine every day, and every later execution fails with: "error": "SageMaker.ResourceInUseException", "cause": "Job name must be unique within an AWS account and region, and a job with this name already exists."

The cause is that the job name is hard-coded as example-jobname. Job names must be unique, so every execution after the first one fails. How can I append a string (something like the execution ID) to the end of the job name? Here's what I have tried:

  1. I added "executionId.$": "States.Format('somestring {}', $$.Execution.Id)" to the Parameters section of the JSON file, but executing the task failed with: "error": "States.Runtime", "cause": "An error occurred while executing the state 'SageMaker CreateTransformJob' (entered at the event id #2). The Parameters '{\"BatchStrategy\":\"SingleRecord\",..............\"executionId\":\"somestring arn:aws:states:us-east-1:xxxxx:execution:xxxxx-state-machine:xxxxxxxx72950\"}' could not be used to start the Task: [The field \"executionId\" is not supported by Step Functions]"}

  2. I changed the job name in the JSON file to "TransformJobName": "example-jobname-States.Format('somestring {}', $$.Execution.Id)". Executing the state machine then failed with: "error": "SageMaker.AmazonSageMakerException", "cause": "2 validation errors detected: Value 'example-jobname-States.Format('somestring {}', $$.Execution.Id)' at 'transformJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}; Value 'example-jobname-States.Format('somestring {}', $$.Execution.Id)' at 'transformJobName' failed to satisfy constraint: Member must have length less than or equal to 63
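For reference, the constraint quoted in the second error can be checked locally before deploying. The following is only a minimal sketch in Python: the pattern and the 63-character limit are copied from the error message, while the helper name is_valid_job_name is hypothetical and not part of any AWS SDK.

```python
import re

# Pattern and length limit copied from the SageMaker validation error above.
JOB_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")
MAX_LENGTH = 63


def is_valid_job_name(name: str) -> bool:
    """Return True if the whole name matches the pattern and fits the length limit."""
    return len(name) <= MAX_LENGTH and JOB_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_job_name("example-jobname"))  # True
# The literal text from attempt 2 contains dots, quotes, spaces and "$".
print(is_valid_job_name("example-jobname-States.Format('somestring {}', $$.Execution.Id)"))  # False
```

The second call fails because, without the ".$" suffix on the key, the value is treated as a literal string rather than as an expression to evaluate.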

I have really run out of ideas. Can someone help, please? Many thanks.

Jakie answered 14/1, 2021 at 14:39 Comment(0)

As per the documentation, a parameter whose value comes from the context object needs a key ending in .$ and should be passed in the following format:

        "Parameters": {
            "ModelName.$": "$$.Execution.Name",  
            ....
        },

If you look closely, this .$ suffix is missing from your definition. Your Step Functions definition should contain something like one of the following:

either

      "TransformJobName.$": "$$.Execution.Id",

OR

      "TransformJobName.$": "States.Format('mytransformjob{}', $$.Execution.Id)"

Full state machine definition:

    {
        "Comment": "Defines the statemachine.",
        "StartAt": "Generate Random String",
        "States": {
            "Generate Random String": {
                "Type": "Task",
                "Resource": "arn:aws:lambda:eu-central-1:1234567890:function:randomstring",
                "ResultPath": "$.executionid",
                "Parameters": {
                    "executionId.$": "$$.Execution.Id"
                },
                "Next": "SageMaker CreateTransformJob"
            },
            "SageMaker CreateTransformJob": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
                "Parameters": {
                    "BatchStrategy": "SingleRecord",
                    "DataProcessing": {
                        "InputFilter": "$",
                        "JoinSource": "Input",
                        "OutputFilter": "xxx"
                    },
                    "Environment": {
                        "SAGEMAKER_MODEL_SERVER_TIMEOUT": "300"
                    },
                    "MaxConcurrentTransforms": 100,
                    "MaxPayloadInMB": 1,
                    "ModelName": "${model_name}",
                    "TransformInput": {
                        "DataSource": {
                            "S3DataSource": {
                                "S3DataType": "S3Prefix",
                                "S3Uri": "${s3_input_path}"
                            }
                        },
                        "ContentType": "application/jsonlines",
                        "CompressionType": "Gzip",
                        "SplitType": "Line"
                    },
                    "TransformJobName.$": "$.executionid",
                    "TransformOutput": {
                        "S3OutputPath": "${s3_output_path}",
                        "Accept": "application/jsonlines",
                        "AssembleWith": "Line"
                    },
                    "TransformResources": {
                        "InstanceType": "xxx",
                        "InstanceCount": 1
                    }
                },
                "End": true
            }
        }
    }

In the above definition, the Lambda can be a function that parses the execution ID ARN, which I am passing via the Parameters section:

    def lambda_handler(event, context):
        # The execution name is the last colon-separated field of the execution ARN.
        return event.get('executionId').split(':')[-1]

Or, if you don't want to pass the execution ID, it can simply return a random string, for example:

    import random
    import string

    def lambda_handler(event, context):
        # Return 16 random uppercase letters and digits.
        return ''.join(random.choices(string.ascii_uppercase + string.digits, k=16))

You can generate any kind of random string, or anything else, in the Lambda and pass it to the transform job name.
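The error messages quoted in this thread suggest that an execution name can contain an underscore and be longer than 63 characters, in which case the last ARN field alone is still rejected. A minimal sketch of a handler that also cleans and shortens the name is shown here; the helper name to_job_name and the "transform-" prefix are hypothetical choices, and the prefix is assumed to be a short, valid name fragment.

```python
import re

MAX_JOB_NAME_LENGTH = 63  # length limit quoted in the SageMaker validation error


def to_job_name(execution_id: str, prefix: str = "transform-") -> str:
    """Derive a job name from an execution ARN (hypothetical helper)."""
    budget = MAX_JOB_NAME_LENGTH - len(prefix)
    if budget <= 0:
        raise ValueError("prefix leaves no room for the execution name")
    # The execution name is the last colon-separated field of the ARN.
    execution_name = execution_id.split(":")[-1]
    # Replace anything other than letters, digits and hyphens (e.g. underscores).
    cleaned = re.sub(r"[^a-zA-Z0-9-]", "-", execution_name)
    # Keep the tail so that prefix plus name fits the limit; the pattern
    # requires the name to start and end with a letter or digit.
    tail = cleaned[-budget:].strip("-")
    return (prefix + tail).strip("-")


def lambda_handler(event, context):
    # "executionId" is the key passed in the state's Parameters block.
    return to_job_name(event["executionId"])
```

Truncating the name keeps it within the limit, but it also discards part of the identifier, so uniqueness is then only as good as the part that remains.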

Unbuild answered 14/1, 2021 at 15:13 Comment(23)
Hey, I used "TransformJobName.$": "$$.Execution.Id", but when I executed the state machine it gave me an error: "error": "SageMaker.AmazonSageMakerException", "cause": "2 validation errors detected: Value 'arn:aws:states:us-east-1:xxx:execution:xxx-state-machine:070xxx3-xxxx-xxx-xxxx-xxxxxxxxx_xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx' at 'transformJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}; Value 'arn:xxxxxx' at 'transformJobName' failed to satisfy constraint: Member must have length less than or equal to 63 - Jakie
As we discussed earlier in the thread, and as I tried to describe in this gist, you get the ARN containing the execution ID, so you need to extract the ID from the ARN. - Unbuild
Oh sorry, let me have a second look. I'll get back to you soon. Thanks a lot. - Jakie
Oh right, the solution in the gist is to use a random string instead of extracting the string from the ARN, right? - Jakie
Nope, use a Lambda to extract the ID from the ARN, or use a random string. - Unbuild
Can you paste the solution for extracting the ID based on my Step Functions definition, please? I'm still a bit confused: where should I add "ResultPath": "$.randomstring", and why do we have to use a Lambda? Does that mean I have to create a separate Lambda for this? Is there a way to extract the ID directly in this definition file without setting up a Lambda? - Jakie
I think I know the solution: instead of "TransformJobName.$": "$$.Execution.Id" we can use "TransformJobName.$": "$$.Execution.Name", which returns only the string at the end of the ARN. Let me give it a try. - Jakie
It seems $$.Execution.Name did return only the string after the ARN prefix, but it's too long, so the state machine still complains about the name: 2 validation errors detected: Value '808xxxx-xxxx-xxxx-xxxx-xxxxxxxxx_xxxxxxx-xxxxxxx-xxxx-xxxxxxxxxx' at 'transformJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}. I feel like I'm very close now, any thoughts? - Jakie
Added the full definition. - Unbuild
Thanks for the answer, but honestly I don't like the idea of setting up a separate Lambda only to generate a random string for the job name; it would make the whole infrastructure too complicated. Is there a way to just extract the final part of Execution.Name? I'll accept this answer here; I posted another question here: #65723214 - Jakie
I think you should correct <"TransformJobName": "$$.Execution.Name",>: it should be <"TransformJobName.$": "$$.Execution.Name",>, so that people can help you quickly. - Unbuild
@Cecilia I just tested <"executionName.$": "$$.Execution.Name">; this works too without any trouble. You can remove the Lambda and use this directly in your state machine. Here is the gist. - Unbuild
By "it works", do you mean it returns the right string? It does return a string, but the format doesn't meet the requirements for a SageMaker batch job name. My goal is to add this string to TransformJobName instead of executionName; the error complains that the string is too long for a batch job name. - Jakie
@Cecilia It gives me a string like <f676f5d0-53ff-4e5e-27e2-6a1680fd8bb5>. I think this is the same string we get when we parse <$$.Execution.Id>, so it should work. - Unbuild
@Cecilia To be frank, AFAIK, unless you have something custom like the Lambda I added for parsing or generating the string, you can't slice and dice a string in Step Functions on the fly. I checked the intrinsic functions a few times too, and they can't help there either. Anyway, good luck if someone responds. - Unbuild
Thank you very much @Unbuild. Just a quick one: I'll probably need to try the Lambda solution. What code did you add to this Lambda when you created it? I want to create one manually and test it. - Jakie
@Cecilia In the answer here, I have added the code for both ways <parsing the execution ID ARN and generating the random string as well>. Please take a look. - Unbuild
Something very interesting that you might care about: with $$.Execution.Name, if I start a new execution manually in the AWS console, the name is shorter and the execution succeeds, but if the state machine is triggered by a CloudWatch event, with the same Step Functions definition, it returns the longer string, which causes the task to fail. Do you know why? - Jakie
@Cecilia There is nothing in the documentation about the ID length except the limit, which can be 256 chars. It might also differ between Express Workflows and Standard Workflows. - Unbuild
Hey @Unbuild, I ended up using the Lambda approach. It's simpler than I thought. Thanks for the answer and the discussion. Have a good day. - Jakie
@Cecilia Glad it worked out. Have a nice day ahead! - Unbuild
Hi @Unbuild, just a follow-up question about the first Lambda script, return(event.get('executionId').split(':')[-1]): is there a way to write a unit test for it? - Jakie
@Cecilia You can use moto for writing the unit test cases. - Unbuild
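Since the handler from the answer only manipulates a string and makes no AWS calls, plain test functions may already be enough here; a mocking library would mainly matter if the handler called AWS APIs. A minimal sketch, assuming the tests are run with pytest (the ARN, account ID and test names are made up for illustration):

```python
def lambda_handler(event, context):
    # Handler under test, copied from the answer above.
    return event.get('executionId').split(':')[-1]


def test_returns_last_arn_segment():
    # Hypothetical execution ARN; only its last field should be returned.
    event = {
        "executionId": "arn:aws:states:us-east-1:123456789012:execution:"
                       "my-state-machine:f676f5d0-53ff-4e5e-27e2-6a1680fd8bb5"
    }
    assert lambda_handler(event, None) == "f676f5d0-53ff-4e5e-27e2-6a1680fd8bb5"


def test_missing_execution_id_raises():
    # event.get returns None when the key is absent, so .split raises AttributeError.
    try:
        lambda_handler({}, None)
    except AttributeError:
        pass
    else:
        raise AssertionError("expected AttributeError")
```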

I would like to throw in another idea. If applicable, you can also use another execution ID or other unique identifier from the previous task.

I trigger the batch transform job after a successful Glue job, so I can take its output variables and concatenate them in the batch transform task to form a new TransformJobName:

"TransformJobName.$": "States.Format('scoring-titanic-{}', $.CompletedOn)"

Ley answered 18/10, 2021 at 16:47 Comment(0)
