How to pass a custom function inside a ForEach-Object -Parallel
J

5

31

I can't find a way to pass the function. Just variables.

Any ideas without putting the function inside the ForEach loop?

function CustomFunction {
    Param (
        $A
    )
    Write-Host $A
}

$List = "Apple", "Banana", "Grape" 
$List | ForEach-Object -Parallel {
    Write-Host $using:CustomFunction $_
}


Jesher answered 17/4, 2020 at 13:46 Comment(2)
Either package your function in a module, or (re-)define it inside the -Parallel blockIraqi
As an aside: Write-Host is typically the wrong tool to use, unless the intent is to write to the display only, bypassing the success output stream and with it the ability to send output to other commands, capture it in a variable, redirect it to a file. To output a value, use it by itself; e.g., $value instead of Write-Host $value (or use Write-Output $value, though that is rarely needed). See also: the bottom section of https://mcmap.net/q/19876/-powershell-write-output-only-writes-one-objectRollet
R
47

The solution isn't quite as straightforward as one would hope:

# Sample custom function.
function Get-Custom {
  Param ($A)
  "[$A]"
}

# Get the function's definition *as a string*
$funcDef = ${function:Get-Custom}.ToString()

"Apple", "Banana", "Grape"  | ForEach-Object -Parallel {
  # Define the function inside this thread...
  ${function:Get-Custom} = $using:funcDef
  # ... and call it.
  Get-Custom $_
}

Note: This answer contains an analogous solution for using a script block from the caller's scope in a ForEach-Object -Parallel script block.

  • Note: If your function were defined in a module that is placed in one of the locations known to the module-autoloading feature, your function calls would work as-is with ForEach-Object -Parallel, without extra effort - but each thread would incur the cost of (implicitly) importing the module.

  • The above approach is necessary, because - aside from the current location (working directory) and environment variables (which apply process-wide) - the threads that ForEach-Object -Parallel creates do not see the caller's state, notably neither with respect to variables nor functions (and also not custom PS drives and imported modules).

  • As of PowerShell 7.2.x, an enhancement is being discussed in GitHub issue #12240 to support copying the caller's state to the parallel threads on demand, which would make the caller's functions automatically available.
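The isolation described in these notes can be observed directly. The following is a minimal sketch, assuming PowerShell 7 or later; the names $callerVar and Get-CallerOnly are hypothetical and exist only for this illustration. It checks, from inside a parallel thread, whether a caller-scope variable and function are visible, and shows that $using: still passes the variable's value.

```powershell
# Hypothetical caller-scope state, defined only for this illustration.
$callerVar = 'caller value'
function Get-CallerOnly { 'only defined in the caller' }

$seen = 1 | ForEach-Object -Parallel {
  [pscustomobject]@{
    # The caller's variable is not visible by plain name in the thread.
    VariableSeen = $null -ne $callerVar
    # The caller's function is not defined in the thread either.
    FunctionSeen = [bool] (Get-Command Get-CallerOnly -ErrorAction Ignore)
    # $using: copies the variable's value into the thread.
    UsingValue   = $using:callerVar
  }
}

$seen
```

VariableSeen and FunctionSeen should both come back as False, while UsingValue carries the caller's string.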

Note that redefining the function in each thread via a string is crucial, as an attempt to make do without the aux. $funcDef variable and trying to redefine the function with ${function:Get-Custom} = ${using:function:Get-Custom} fails, because ${function:Get-Custom} is a script block, and the use of script blocks with the $using: scope specifier is explicitly disallowed in order to avoid cross-thread (cross-runspace) issues.

  • However, ${function:Get-Custom} = ${using:function:Get-Custom} would work with Start-Job; see this answer for an example.

  • It would not work with Start-ThreadJob, which currently syntactically allows you to do & ${using:function:Get-Custom} $_, because ${using:function:Get-Custom} is preserved as a script block (unlike with Start-Job, where it is deserialized as a string, which is itself surprising behavior - see GitHub issue #11698), even though it shouldn't. That is, direct cross-thread use of [scriptblock] instances causes obscure failures, which is why ForEach-Object -Parallel prevents it in the first place.

  • A similar loophole that leads to cross-thread issues even with ForEach-Object -Parallel is using a command-info object obtained in the caller's scope with Get-Command as the function body in each thread via the $using: scope: this too should be prevented, but isn't as of PowerShell 7.2.7 - see this post and GitHub issue #16461.

${function:Get-Custom} is an instance of namespace variable notation, which allows you to both get a function (its body as a [scriptblock] instance) and to set (define) it, by assigning either a [scriptblock] or a string containing the function body.
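As a minimal sketch of namespace variable notation on its own, outside of any parallel block: the function names Get-Demo and Get-DemoCopy below are hypothetical and used only for illustration. The snippet reads a function body as a script block, then defines a second function by assigning the body as a string.

```powershell
# Hypothetical function used only for illustration.
function Get-Demo { param($A) "[$A]" }

# Get: the function body comes back as a [scriptblock] instance.
$body = ${function:Get-Demo}
$bodyType = $body.GetType().Name

# Set: assigning a string (or a script block) defines a function.
${function:Get-DemoCopy} = $body.ToString()

# The copy behaves like the original.
$result = Get-DemoCopy 'Apple'
$result
```

Assigning the string form is what makes the technique safe to combine with $using:, since only text crosses the thread boundary.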

Rollet answered 17/4, 2020 at 14:4 Comment(3)
Thank you very much. It is not the cleaner solution I was hoping for, but it works. Performance-wise, every iteration is basically instantiating a new function. It is like inserting the function inside the foreach, but visually cleaner, right?Jesher
Glad to hear it was helpful, @smark91. The technique is primarily useful if you have a preexisting function that you want to use in the ForEach-Object -Parallel block; directly inserting the function definition is probably faster, though I'm not sure it makes much difference in practice.Rollet
This is all great for one-offs but if you have several modules imported, more functions defined, variables up in the air, essentially a whole house of cards going, it's too much trouble and too prone to error. Here's to hoping the PowerShell Core crew decide to make runspace copying an option.Fictionalize
C
6

I added a whole set of custom functions to parallel processes via a ps1 file by using an include inside the loop. This keeps things very clean and neat.

ForEach-Object -Parallel {
    # Include custom functions inside parallel scope
    . $using:PSScriptRoot\CustomFunctions.ps1
    # Now you can reference any function defined in the file
    My-CustomFunction
    ....

This does incur the overhead of loading the functions in each parallel process, but in my case this was minuscule relative to the overall processing time.

Casilde answered 28/3, 2023 at 20:14 Comment(0)
B
1

I just figured out another way using get-command, which works with the call operator. $a ends up being a FunctionInfo object.

EDIT: I'm told this isn't thread safe, but I don't understand why.

function hi { 'hi' }
$a = get-command hi
1..3 | foreach -parallel { & $using:a }

hi
hi
hi
Berenice answered 25/11, 2020 at 16:12 Comment(2)
Actually, it turns out there are indeed thread-safety issues, even with function bodies that do not rely on the caller's state; this can lead to obscure failures - see this question.Rollet
See also github.com/PowerShell/PowerShell/issues/16461 .Pretension
S
1

So I figured out another little trick that may be useful for people trying to add the functions dynamically, particularly if you might not know the name of it beforehand, such as when the functions are in an array.

# Store the current function list in a variable
$initialFunctions=Get-ChildItem Function:

# Source all .ps1 files in the current folder and all subfolders
Get-ChildItem . -Recurse | Where-Object { $_.Name -like '*.ps1' } |
     ForEach-Object { . "$($_.FullName)" }

# Get only the functions that were added above, and store them in an array
$functions = @()
Compare-Object $initialFunctions (Get-ChildItem Function:) -PassThru |
    ForEach-Object { $functions = @($functions) + @($_) }

1..3 | ForEach-Object -Parallel {
    # Pull the $functions array from the outer scope and set each function
    # to its definition
    $using:functions | ForEach-Object {
        Set-Content "Function:$($_.Name)" -Value $_.Definition
    }
    # Call one of the functions in the sourced .ps1 files by name
    SourcedFunction $_
}

The main "trick" of this is using Set-Content with Function: plus the function name, since PowerShell essentially treats each entry of Function: as a path.

This makes sense when you consider the output of Get-PSDrive, since each of those entries can be used as a "drive" in the same way (i.e., with the colon).
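A minimal sketch of that drive-style access, separate from the parallel loop: the function name Get-Sample is hypothetical and used only for illustration. It defines a function by writing a string body to a Function: path, then reads the definition back through the same drive.

```powershell
# The Function provider exposes a drive named "Function".
$drive = Get-PSDrive -PSProvider Function

# Define a function (hypothetical name) by writing its body to a Function: path.
Set-Content -Path Function:Get-Sample -Value 'param($A) "[$A]"'

# Read the definition back as a string through the same drive.
$definition = (Get-Item -Path Function:Get-Sample).Definition

# Call the newly defined function.
$output = Get-Sample 'Grape'
$output
```

Because .Definition is a string, passing it via $using: avoids handing a script block to another thread.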

Samuelson answered 14/3, 2022 at 20:58 Comment(1)
I misspoke earlier: You are using strings, given that $_.Definition returns a string. However, it comes with a caveat: Using [System.Management.Automation.FunctionInfo] instances (as obtained via Get-ChildItem Function: or Get-Command) is inherently problematic, because they do contain a script block (.ScriptBlock) and can even be invoked directly, with &. Such invocations can lead to state corruption, as explained in GitHub issue #16461, which advocates for disallowing use of FunctionInfos via $using:Rollet
H
0

This might be a more elegant, low-code option for getting a function into a ForEach-Object -Parallel block:

$m = New-Module -Name MyFunctions -ScriptBlock {
    function Func-Timestamp {
        return [DateTime]::Now.ToString("HH:mm:ss.ff")
    }
}
    
$files | ForEach-Object -Parallel {
    import-module $using:m -DisableNameChecking
    [Console]::WriteLine("Here is the Time! $(Func-TimeStamp)")
}

You store the dynamic module in $m and then import it inside the ForEach-Object -Parallel block via $using:m.

Hectoliter answered 6/9, 2023 at 18:50 Comment(1)
This is tempting, and may sometimes work, but I suspect it is subject to the same state-corruption-due-to-runspace-affinity problems as the simpler option of passing a function-info object, shown in js2010's answer. See GitHub issue #16461Rollet
