Use an array as key in a hashtable

Can an array be used as the key in a hashtable? How can I reference the hashtable item with an array key?

PS C:\> $h = @{}
PS C:\> $h[@(1,2)] = 'a'
PS C:\> $h

Name                           Value
----                           -----
{1, 2}                         a         # looks like the key is a hash

PS C:\> $h[@(1,2)]                       # no hash entry
PS C:\> $h.Keys                          # 
1
2
PS C:\> $h[@(1,2)] -eq 'a'
PS C:\> $h[@(1,2)] -eq 'b'
PS C:\> foreach ($key in $h.Keys) { $key.GetType() }   # this is promising

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     True     Object[]                                 System.Array

PS C:\> $PSVersionTable.PSVersion.ToString()
7.1.4
Jenks answered 26/9, 2021 at 12:55

While you can use arrays as hashtable keys, doing so is impractical:

  • Update: There is a way to make arrays work as hashtable keys, but it requires nontrivial effort during construction of the hashtable - see this answer.

  • You'll have to use the very same array instances both as the keys and for later lookups.

    • The reason is that arrays, which are instances of .NET reference types (as opposed to value types such as integers), use the default implementation of the .GetHashCode() method to return a hash code (as used in hashtables), and this default implementation returns a different code for each instance - even for two array instances that one would intuitively think of as "the same".
    • In other words: you'll run into the same problem trying to use instances of any such .NET reference type as hashtable keys, including other collection types - unless a given type happens to have a custom .GetHashCode() implementation that explicitly considers distinct instances equal based on their content.
  • Additionally, it makes using PowerShell's indexer syntax ([...]) awkward, because the array instance must be nested by prefixing it with the unary form of the array constructor operator (,). Dot notation (property access), however, works as usual.

$h = @{}

# The array-valued key.
$key = 1, 2

$h[$key] = 'a'

# IMPORTANT:
# The following lookups work, but only because
# the *very same array instance* is used for the lookup.

# Nesting required so that PowerShell doesn't think that
# *multiple* keys are being looked up.
$h[, $key] 

# Dot notation works normally.
$h.$key

# Does NOT work, because a *different array instance* is used.
$h.@(1,2)
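
The update above mentions that arrays can be made to work as keys with extra effort when the hashtable is constructed. As a minimal sketch of one way this might be done (an assumption here; the linked answer may take a different route), the hashtable can be created with .NET's structural equality comparer, which hashes and compares arrays by their elements rather than by instance:

```powershell
# Assumption: structural comparison is one possible approach, not necessarily
# the one described in the linked answer.
$comparer = [System.Collections.StructuralComparisons]::StructuralEqualityComparer
$h = [hashtable]::new($comparer)

# Use .Add() / .Item() to sidestep the indexer's array handling.
$h.Add(@(1, 2), 'a')

# A *different* array instance with equal elements now finds the entry.
$h.Item(@(1, 2))          # -> a
$h.ContainsKey(@(2, 1))   # -> False (element order matters)
```

Note that a hashtable literal (@{}) cannot be used in this case, because the comparer has to be passed to the constructor.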

A simple test for whether a given expression results in the same hashtable lookup every time and is therefore suitable as a key is to call the .GetHashCode() method on it repeatedly; only if the same number is returned every time (in a given session) can the expression be used:

# Returns *different* numbers.
@(1, 2).GetHashCode()
@(1, 2).GetHashCode()
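
For contrast, here is a small sketch of expressions that do pass this test. It assumes the default $OFS, so that "$(1, 2)" stringifies as 1 2:

```powershell
# Strings with equal content return the same hash code within a session.
$sameString = '1 2'.GetHashCode() -eq "$(1, 2)".GetHashCode()

# Equal integers hash the same.
$sameInt = (42).GetHashCode() -eq (40 + 2).GetHashCode()

# One and the same array instance is stable across calls.
$a = 1, 2
$sameInstance = $a.GetHashCode() -eq $a.GetHashCode()

$sameString, $sameInt, $sameInstance   # -> True, True, True
```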

To inspect a given object or type for whether it is (an instance of) a .NET reference type vs. value type:

# $false is returned in both cases, confirming that the .NET array 
# type is a *reference type*
@(1, 2).GetType().IsValueType
[Array].IsValueType

Workaround:

A workaround would be to use string representations of arrays, though coming up with unique (enough) ones may be a challenge.

In the simplest case, use PowerShell's string interpolation, which represents arrays as a space-separated list of the elements' (stringified) values; e.g. "$(1, 2)" yields verbatim 1 2:

$h = @{}

# The array to base the key on.
$array = 1, 2

# Use the *stringified* version as the key.
$h["$array"] = 'a'

# Works, because even different array instances with equal-valued
# instances of .NET primitive types stringify the same.
#   '1 2'
$h["$(1, 2)"]

iRon points out that this simplistic approach can lead to ambiguity (e.g., a single '1 2' string would result in the same key as array 1, 2) and recommends the following instead:

a more advanced/explicit way for array keys would be:

  • joining their elements with a non-printable character; e.g.
    $key = $array -join [char]27
  • or, for complex object array elements, serializing the array:
    $key = [System.Management.Automation.PSSerializer]::Serialize($array)

Note that even the XML (string)-based serialization provided by the System.Management.Automation.PSSerializer class (used in PowerShell remoting and background jobs for cross-process marshaling) has its limits with respect to reliably distinguishing instances, because its recursion depth is limited - see this answer for more information; you can increase the depth on demand, but doing so can result in very large string representations.

A concrete example:

using namespace System.Management.Automation

$ht = @{}

# Use serialization on an array-valued key.
$ht[[PSSerializer]::Serialize(@(1, 2))] = 'a'

# Despite using a different array instance, this
# lookup succeeds, because the serialized representation is the same.
$ht[[PSSerializer]::Serialize(@(1, 2))]  # -> 'a'
Multilingual answered 26/9, 2021 at 13:26

The primary cause of your problems here is that PowerShell's index access operator [] supports multi-index access by enumerating any array values passed.

To understand why, let's have a look at how the index accessor [...] actually works in PowerShell. Let's start with a simple hashtable, with 2 entries using scalar keys:

$ht = @{}
$ht['a'] = 'This is value A'
$ht['b'] = 'This is value B'

Now, let's inspect how it behaves!

Passing a scalar argument resolves to the value associated with the key represented by said argument, so far so good:

PS ~> $ht['a']
This is value A

But we can also pass an array argument, and all of a sudden PowerShell will try to resolve all items as individual keys:

PS ~> $ht[@('a', 'b')]
This is value A
This is value B
PS ~> $ht[@('b', 'a')]   # let's try in reverse!
This is value B
This is value A

Now, to understand what happens in your example, let's try to add an entry with an array reference as the key, along with two other entries whose keys are the individual values found in the array:

$ht = @{}
$keys = 1,2
$ht[$keys[0]] = 'Value 1'
$ht[$keys[1]] = 'Value 2'
$ht[$keys]    = 'Value 1,2'

And when we subsequently try to resolve the last entry using our array reference:

PS ~> $ht[$keys]
Value 1
Value 2

Oops! PowerShell unraveled the $keys array, and never actually attempted to resolve the entry associated with the key corresponding to the array reference in $keys.

In other words: the index accessor cannot be used to resolve dictionary entries by key if the key type is enumerable.

So, how does one access an entry by array reference without having PowerShell unravel the array?

Use the IDictionary.Item() parameterized property instead:

PS ~> $ht.Item($keys)
Value 1,2
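
Keep in mind that Item() only avoids the unraveling. As a short sketch, the lookup still depends on passing the very same array instance that was used as the key:

```powershell
$ht = @{}
$keys = 1, 2
$ht[$keys] = 'Value 1,2'

# Same instance: the entry is found.
$ht.Item($keys)               # -> Value 1,2
$ht.ContainsKey($keys)        # -> True

# Different instance with equal elements: no match, $null is returned.
$null -eq $ht.Item(@(1, 2))   # -> True
```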
Careworn answered 26/9, 2021 at 13:28 Comment(3)
Yes, the indexer syntax is tricky, but an easier alternative is to use dot notation ($ht.$Keys). However, the larger issue is that the lookup only works if the very same array instance that was used to create the entry is also used in the lookup, which makes use of arrays as keys impractical. - Multilingual
@Multilingual I originally wrote a paragraph about ref vs. value type equality, but realized it's no different from using any other reference type as a key. The complicating factor with the indexer is solely the fact that the key contains an enumerable value. - Careworn
Good point: It applies to all .NET reference types (without a custom .GetHashCode() implementation); I've updated my answer to make that clearer. The question uses two distinct array instances (@(1, 2) literals), so the lookup will fail even if you work around the syntax problem. That makes me think that clarifying the reference-vs.-value-equality issue is more important, not just for the OP, but for a PowerShell audience in general, which isn't used to thinking in such terms. - Multilingual
