PowerShell Start-Job scope

I have a long script that contains a function for logging:

function Log ([string]$Content){
    $Date = Get-Date
    Add-Content -Path $LogPath -Value ("$Date : $Content")
}

At some point in the script I need to run jobs in parallel. I have a list of computer names and I need to run psexec against each one of them. This should be done as jobs so that they run in parallel:


        Start-Job -ScriptBlock {
            Log "$line Has called"
            $Program_List = gc $USERS_DB_PATH\$line.txt | select -Skip 1
            if (Test-Connection $line -Quiet) {
                ForEach ($program in $Program_List) {
                    Log "$line $program"
                    #"line $line bot is $bot pwd is $pwd"
                    psexec \\"$line" -u bla.local\"$bot" -p $pwd cmd bla
                }
            }
            else {
                Log "Cannot Connect to $line"
            }
        }
        #Remove-Item "$USERS_DB_PATH\$line.txt"


}

I understand this has something to do with scope, but how can I make this script block see the Log function and all the necessary variables? They all come up empty.

Billiton answered 22/8, 2021 at 12:15 Comment(2)
Think of it as remote variables. So try $using:myvar when referencing variables outside of it. - Entrepreneur
I actually used $using:var but that didn't work; it still came up empty. Also, how can I use this approach to call a function rather than a variable? - Billiton

tl;dr

  • Reference variables from the caller's scope via the $using: scope.

  • Recreate your Log function in the context of the background job, using
    $function:Log = "$using:function:Log"

Start-Job -ScriptBlock {

  # Required in Windows PowerShell only (if needed).
  # Change to the same working directory as the caller.
  Set-Location -LiteralPath ($using:PWD).ProviderPath

  # Recreate the Log function.
  $function:Log = "$using:function:Log"
        
  # All variable values from the *caller*'s scope must be $using: prefixed.
  Log "$using:line Has called"
  # ...        
        
}
  • Read on for an explanation.

  • See the bottom section for better alternatives to Start-Job: Start-ThreadJob and ForEach-Object -Parallel (PowerShell (Core) 7+ only).


A background job runs in an invisible PowerShell child process, i.e. a separate powershell.exe (Windows PowerShell) or pwsh (PowerShell (Core) 7+) process.

Such a child process:

  • does not load $PROFILE files.
  • knows nothing about the caller's state; that is, it doesn't have access to the caller's variables, functions, aliases, ... defined in the session; only environment variables are inherited from the caller.

Conversely, this means that only the following commands are available by default in background jobs:

  • external programs and *.ps1 scripts, via the directories listed in the $env:PATH environment variable.
  • commands in modules available via the module-autoloading feature, from directories listed in the $env:PSModulePath environment variable (which has a default value).
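
As a quick check of the isolation described above, the following minimal sketch defines a function and a variable in the caller's session and then asks a background job whether it can see them. The computer name is a made-up placeholder, and the lookup is restricted to functions so that an unrelated external program that happens to be named log is not picked up.

```powershell
# Define a function and a variable in the caller's session.
function Log ([string] $Content) { "LOG: $Content" }
$line = 'server01'   # hypothetical computer name, for illustration only

# Inside the job, neither the function nor the variable is defined,
# because the job runs in a separate child process.
$result = Start-Job -ScriptBlock {
  [pscustomobject] @{
    LogIsDefined  = [bool] (Get-Command -Name Log -CommandType Function -ErrorAction Ignore)
    LineIsDefined = $null -ne $line
  }
} | Receive-Job -Wait -AutoRemoveJob

$result | Format-List LogIsDefined, LineIsDefined
```

Both properties should come back as False, which matches the empty values described in the question.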

Passing caller-state information to background jobs:

  • Variables:

    • While you cannot pass variables as such to background jobs, you can pass their values, using the $using: scope; in other words: you can get the value of a variable in the caller's scope, but you cannot update it - see the conceptual about_Remote_Variables help topic.

    • Alternatively, pass the value as an argument via Start-Job's -ArgumentList (-Args) parameter, which the -ScriptBlock argument must then access in the usual manner: either via the automatic $args variable or via explicitly declared parameters, using a param() block.

  • Functions:

    • Analogously, you cannot pass a function as such, but only a function's body, and the simplest way to do that is via namespace variable notation; e.g. to get the body of function foo, use $function:foo; to pass it to a background job (or remote call), use "$using:function:foo".

    • Since namespace variable notation can also be used to assign values, assigning to $function:foo creates or updates a function named foo, so that $function:foo = $using:function:foo effectively recreates a foo function in the background session.

      • Note that while $function:foo returns the function body as a [scriptblock] instance, $using:function:foo turns into a string during serialization (see GitHub issue #11698); fortunately, however, you can also create function bodies from strings.

      • As such, enclosing $using:function:foo in "..." isn't strictly necessary for Start-Job; it is, however, required for Start-ThreadJob, because in the absence of serialization in thread-based parallelism, $using:function:foo is a [scriptblock] instance, but is associated with the caller's runspace and must therefore be rebuilt from a string in the job context (otherwise, state corruption can occur).

      • That Start-ThreadJob even allows such script-block references may be an oversight, and the PowerShell v7+ ForEach-Object -Parallel feature (which shares technical underpinning with Start-ThreadJob) explicitly disallows them, necessitating a workaround via a helper variable that first stringifies the script block in the caller's scope - see this answer.

  • Classes:

    • While there is no namespace variable notation for classes, you can work around that via a helper script block: see this answer.
  • Working directory:

    • In Windows PowerShell, background jobs use a fixed working directory: the user's Documents folder. To ensure that the background job uses the same directory as the caller, call
      Set-Location -LiteralPath ($using:PWD).ProviderPath as the first statement from inside the script block passed to -ScriptBlock.

    • In PowerShell (Core) 7+, background jobs now, fortunately, use the same working directory as the caller.
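
The points above can be put together in a minimal sketch. It assumes a Log function like the one in the question; the log file location and the computer name are placeholders chosen for illustration. Note that the function body refers to $LogPath, so that variable has to be defined inside the job as well, since only the function body is carried across.

```powershell
# Caller-side state; the path and the computer name are illustrative only.
$LogPath = Join-Path ([IO.Path]::GetTempPath()) 'start-job-demo.log'
$line    = 'server01'
Remove-Item -LiteralPath $LogPath -ErrorAction Ignore

function Log ([string] $Content) {
  $Date = Get-Date
  Add-Content -Path $LogPath -Value "$Date : $Content"
}

# Variant 1: pass values and the function body via the $using: scope.
$viaUsing = Start-Job -ScriptBlock {
  # Recreate the function from its body, which arrives as a string.
  $function:Log = "$using:function:Log"
  # The function body refers to $LogPath, so define it in the job too.
  $LogPath = $using:LogPath

  Log "$using:line has been called"
  "done: $using:line"
} | Receive-Job -Wait -AutoRemoveJob

# Variant 2: pass a value as an argument and declare a parameter for it.
$viaArgs = Start-Job -ArgumentList $line -ScriptBlock {
  param([string] $ComputerName)
  "received: $ComputerName"
} | Receive-Job -Wait -AutoRemoveJob

$viaUsing
$viaArgs
Get-Content -LiteralPath $LogPath
```

Which variant to use is largely a matter of taste: $using: keeps the script block close to the original code, while a param() block makes the job's inputs explicit.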

Caveat re type fidelity:

  • Since values must be marshaled across process boundaries, serialization and deserialization of values are necessarily involved. Background jobs use the same serialization infrastructure as PowerShell's remoting, which - with the exception of a handful of well-known types, including .NET primitive types - results in loss of type fidelity, both on passing values to background jobs and on receiving output from them - see this answer.
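
The loss of type fidelity can be observed directly. The sketch below compares a process object obtained in the caller with one returned from a background job, and contrasts that with a [datetime] value, which is one of the well-known types that survive serialization.

```powershell
# In the caller, Get-Process returns a live System.Diagnostics.Process object.
$local = Get-Process -Id $PID
$local.GetType().FullName

# The same call made in a job comes back as a deserialized property bag:
# the property values are preserved, but the original type and its methods are not.
$fromJob = Start-Job -ScriptBlock { Get-Process -Id $PID } |
  Receive-Job -Wait -AutoRemoveJob
$fromJob.pstypenames[0]
$fromJob -is [System.Diagnostics.Process]

# A [datetime] value, by contrast, is deserialized as its original type.
$date = Start-Job -ScriptBlock { Get-Date } | Receive-Job -Wait -AutoRemoveJob
$date -is [datetime]
```

The "Deserialized." prefix in the type-name list is the usual sign that an object is an inert copy rather than the original instance.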

Preferable alternative to background jobs: thread jobs, via Start-ThreadJob:

PowerShell (Core) 7+ comes with the ThreadJob module, which offers the Start-ThreadJob cmdlet; in Windows PowerShell you can install it on demand.

  • Additionally, PowerShell (Core) 7+ offers essentially the same functionality as an extension to the ForEach-Object cmdlet, via the -Parallel parameter, which executes a script block passed to it in a separate thread for each input object.

Start-ThreadJob fully integrates with PowerShell's other job-management cmdlets, but uses threads (i.e. in-process concurrency) rather than child processes, which implies:

  • much faster execution
  • use of fewer resources
  • no loss of type fidelity (though you can run into thread-safety issues and explicit synchronization may be required)

Also, the caller's working directory is inherited.

The need for $using: / -ArgumentList equally applies.

  • For ForEach-Object -Parallel an improvement is being considered to allow copying the caller's state to the thread script blocks on an opt-in basis - see GitHub issue #12240.
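
A minimal Start-ThreadJob sketch, mirroring the Start-Job technique, might look as follows. It assumes the ThreadJob module is available (in Windows PowerShell it can be installed with Install-Module ThreadJob); the simplified Log function and the computer name are placeholders for illustration.

```powershell
# Simplified stand-in for the question's Log function: it returns the
# message instead of writing to a file, to keep the example self-contained.
function Log ([string] $Content) { "LOG: $Content" }
$line = 'server01'   # hypothetical computer name

$result = Start-ThreadJob -ScriptBlock {
  # The enclosing "..." matters here: it rebuilds the function from a
  # string instead of reusing a script block tied to the caller's runspace.
  $function:Log = "$using:function:Log"

  Log "$using:line has been called"
} | Receive-Job -Wait -AutoRemoveJob

$result
```

Apart from the cmdlet name, the script block is the same one that would be passed to Start-Job, so switching between the two is mostly a matter of changing the command.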

This answer provides an overview of ForEach-Object -Parallel and compares and contrasts Start-Job and Start-ThreadJob.
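
For ForEach-Object -Parallel (PowerShell (Core) 7+ only), the helper-variable workaround mentioned earlier can be sketched as follows. The variable name $logDef, the simplified Log function, and the computer names are illustrative assumptions.

```powershell
# Simplified stand-in for the question's Log function.
function Log ([string] $Content) { "LOG: $Content" }

# Stringify the function body in the caller's scope first, because
# script-block values cannot be referenced via $using: in -Parallel blocks.
$logDef = ${function:Log}.ToString()

$results = 'server01', 'server02' | ForEach-Object -ThrottleLimit 2 -Parallel {
  # Recreate the function in this thread from the string definition.
  $function:Log = $using:logDef

  Log "$_ has been called"
}

# Output order is not guaranteed with parallel execution, so sort for display.
$results | Sort-Object
```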

Tillfourd answered 22/8, 2021 at 14:13 Comment(0)
