After a lot of searching, I figured the easiest way to manipulate the entries without recreating the managing framework would be through a hashtable. During the process I had to account for two edge cases (both illustrated in the sample rows below):
- additional commas in the values
- empty values
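For illustration, here is a hypothetical pair of input files; only the column names (path, item, value, type) come from the script below, the rows themselves are made up to show both edge cases:
# Hypothetical sample data for illustration only.
@'
path,item,value,type
app\logging,level,info,string
app\cache,size,,int
'@ | Set-Content .\configuration.csv

# Note the empty value above and the quoted "debug,verbose" value below,
# which contains the separator itself.
@'
path,item,value,type
app\logging,level,"debug,verbose",string
'@ | Set-Content .\customization.csv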
The final solution I got is this:
$configuration = Import-Csv .\configuration.csv
$customization = Import-Csv .\customization.csv
$merged = New-Object System.Collections.ArrayList
$hashTable = @{}

# initializing the hashTable with the defaults
foreach ($entry in $configuration)
{
    $hashTable[$entry.path + ',' + $entry.item] = $entry.value + ',' + $entry.type
}

# updating the hashTable with customizations that add new entries or overwrite existing ones
foreach ($entry in $customization)
{
    $hashTable[$entry.path + ',' + $entry.item] = $entry.value + ',' + $entry.type
}

# the regex handles multiple commas and empty values;
# -split returns an empty string before and after the captured groups, so indexing starts at 1
foreach ($key in $hashTable.Keys)
{
    $psObject = [PSCustomObject]@{
        path  = ($key -split '(.*),(.*)')[1]
        item  = ($key -split '(.*),(.*)')[2]
        value = ($hashTable[$key] -split '(.*),(.*)')[1]
        type  = ($hashTable[$key] -split '(.*),(.*)')[2]
    }
    [void] $merged.Add($psObject)
}

Write-Output $merged
Once imported, I transform configuration.csv into a hashtable whose keys are built from path and item, and whose values from value and type. I then do the same with customization.csv using the same hashtable, which overwrites the values of any existing keys or adds new ones.
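A minimal sketch of that overwrite behaviour, using made-up keys and values rather than the real files:
# Made-up keys/values just to show how hashtable assignment behaves.
$demo = @{}
$demo['app\logging,level'] = 'info,string'    # default from configuration.csv
$demo['app\logging,level'] = 'debug,string'   # same key -> the customization wins
$demo['app\cache,size']    = '512,int'        # new key -> simply added
$demo.Count                                   # 2 entries remain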
The third loop converts the hashtable to PSCustomObject entries, similar to what Import-Csv produces. I split each of the key and value strings while accounting for multiple commas and also empty values.
NOTE: the regex will split on the last occurrence of the separator (here it's a comma, but you can select anything, really). If you want to split on the first occurrence instead, you can use (.*?),(.*). In my case only the value column could contain an instance of the separator.
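A few throwaway examples (not taken from the actual files) of how -split behaves with these patterns:
# Greedy pattern: splits on the LAST separator; the captured groups are returned
# with an empty string before and after the full match, hence indexes 1 and 2.
'debug,verbose,string' -split '(.*),(.*)'     # '', 'debug,verbose', 'string', ''
',int' -split '(.*),(.*)'                     # '', '', 'int', ''  (empty value survives)

# Lazy pattern: splits on the FIRST separator instead.
'debug,verbose,string' -split '(.*?),(.*)'    # '', 'debug', 'verbose,string', ''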
If the CSV had a unique column, then a solution similar to this answer could've been used.
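I have not reproduced that answer here, but the general idea would be something like this, assuming a hypothetical unique id column that my files do not have:
# Sketch only: assumes a unique 'id' column. The whole row object is stored
# as the value, so no joining or splitting of strings is needed at all.
$byId = @{}
foreach ($entry in $configuration) { $byId[$entry.id] = $entry }
foreach ($entry in $customization) { $byId[$entry.id] = $entry }
$mergedById = $byId.Values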
Another alternative is to make the keys the concatenation of all columns, which will filter out any duplicate rows in the CSV, but splitting the keys back apart can get tricky, depending on the values in the columns.
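Roughly like this (again just a sketch, not what I ended up using):
# Sketch of the 'all columns as the key' variant: identical rows collapse into
# a single key, which filters out duplicates automatically.
$allColumns = @{}
foreach ($entry in @($configuration) + @($customization))
{
    $key = $entry.path + ',' + $entry.item + ',' + $entry.value + ',' + $entry.type
    $allColumns[$key] = $true
}
# Splitting $key back into four columns is the tricky part, because a comma
# inside the value column makes the boundaries ambiguous.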