Parse a YAML file in PowerShell

I want to parse a YAML file from IBM API Connect in a PowerShell script. I can't use third-party packages or DLLs, since they would not pass our security review.

---
product: "1.0.0"
info:
  name: "api2product"
  title: "API2product"
  version: "1.0.0"
visibility:
  view:
    enabled: true
    type: "public"
    tags: []
    orgs: []
  subscribe:
    enabled: true
    type: "authenticated"
    tags: []
    orgs: []
apis:
  api1:
    $ref: "api1_1.0.0.yaml"
  api2:
    $ref: "api2_1.0.0.yaml"
  api3:
    $ref: "api3_1.0.0.yaml"
  api4:
    $ref: "api4_1.0.0.yaml"
  api5:
    $ref: "api5_1.0.0.yaml"
plans:
  default:
    title: "Default Plan"
    description: "Default Plan"
    approval: false
    rate-limit:
      hard-limit: false
      value: "100/hour"

I only need the API YAML files that the product references. After some searching, I put together the following sample PowerShell code, which does run correctly.

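# Read the file as an array of lines, then join them into a single string
$text = Get-Content -Path "C:\Users\abhi\Desktop\projects\TRYG\GitLab\APIPOC\test.yaml"
$regex = '(?ms)apis:(.+?)plans:'
$text = $text -join "`n"
# Capture everything between "apis:" and "plans:"
# (the -split is effectively a no-op: the captured text no longer contains the pattern)
$OutputText = [regex]::Matches($text, $regex) |
              foreach {$_.Groups[1].Value -split $regex}
# Flatten the nested YAML into key=value lines: drop "$ref:", turn ":" into "=",
# pull each value up onto its key's line, and strip the double quotes
$OutputText = $OutputText.Replace("`$ref:", "")
$OutputText = $OutputText.Replace(":", "=")
$OutputText = $OutputText.Replace("=`n", "=")
$OutputText = $OutputText.Replace("`"", "")
# Parse the key=value lines into a hashtable and list the referenced file names
$AppProps = ConvertFrom-StringData ($OutputText)
$AppProps.GetEnumerator() | ForEach-Object {$_.Value}
[string[]]$l_array = $AppProps.Values | Out-String -Stream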

Is there a simpler way to achieve this than chaining multiple string replacements?

Batholith answered 19/3, 2019 at 14:15 Comment(7)
YAML is a hierarchical format, whereas ConvertFrom-StringData processes a list of simple key/value pairs into a hashtable. Can you get your input data in JSON format instead of YAML? PowerShell comes with a parser for the former. - Practical
Building on this: you can convert the YAML file to JSON and then use PowerShell's built-in parser. python -c 'import sys, yaml, json; json.dump(yaml.safe_load(sys.stdin), sys.stdout, indent=4)' < file.yaml > file.json You will need PyYAML installed, but this should work. - Whalen
@AnsgarWiechers I can't control the format of the input data. It will only ever come as YAML, and I have tens of YAML files to process. - Batholith
@Whalen With that code, I can see that a new JSON file is created, but it is 0 KB. I am new to Python, by the way. I have also installed PyYAML. - Batholith
How come you can install Python and Python modules, but not PowerShell modules? That doesn't make sense. - Practical
If there is a business need to parse YAML from a PowerShell script then you should push for that to be possible. If your security team is blocking that then your problem here isn't technical. - Trinetta
But security would approve you taking advice from strangers ;) I would go with an open source project and, if you're worried, fork it and review every line of it. Phil-Factor/PSYaml, as an example, uses the MIT License. - Navarro
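
To illustrate the JSON route suggested in the comments above: once a file has been converted, the built-in ConvertFrom-Json cmdlet can read it without any extra module. This is only a minimal sketch. The inline JSON is a hand-written stand-in for a converted file (an assumption, not real converter output), trimmed to the apis section.

```powershell
# Hand-written stand-in for the converted file (assumption), apis section only.
$json = @'
{
    "apis": {
        "api1": { "$ref": "api1_1.0.0.yaml" },
        "api2": { "$ref": "api2_1.0.0.yaml" }
    }
}
'@

$doc = $json | ConvertFrom-Json

# Each API becomes a property on the "apis" object. The property name '$ref'
# has to be quoted so PowerShell does not read it as a variable.
$refs = @($doc.apis.PSObject.Properties | ForEach-Object { $_.Value.'$ref' })

# Emit the referenced file names
$refs
```

Going through PSObject.Properties avoids hard-coding the API names, so the same loop should work for a product file with any number of entries.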

I suggest locally forking a more recently available PowerShell module, powershell-yaml: PowerShell CmdLets for YAML format manipulation.

This powershell module is a thin wrapper on top of YamlDotNet that serializes and un-serializes simple powershell objects to and from YAML. It was tested on powershell versions 4 and 5, supports Nano Server and apparently works with powershell on Linux. I suspect it works on Mac as well, but I have not had a chance to test it.

The module provides the ConvertFrom-Yaml command to parse YAML.

$text = @"
product: "1.0.0"
info:
  name: "api2product"
  title: "API2product"
  version: "1.0.0"
visibility:
  view:
    enabled: true
    type: "public"
    tags: []
    orgs: []
  subscribe:
    enabled: true
    type: "authenticated"
    tags: []
    orgs: []
apis:
  api1:
    `$ref: "api1_1.0.0.yaml"
  api2:
    `$ref: "api2_1.0.0.yaml"
  api3:
    `$ref: "api3_1.0.0.yaml"
  api4:
    `$ref: "api4_1.0.0.yaml"
  api5:
    `$ref: "api5_1.0.0.yaml"
plans:
  default:
    title: "Default Plan"
    description: "Default Plan"
    approval: false
    rate-limit:
      hard-limit: false
      value: "100/hour"
"@
Import-Module powershell-yaml

$yaml = ConvertFrom-Yaml $text

foreach ($api in $yaml.apis.GetEnumerator()) {
    Write-Host "$($api.Name) ref: $($api.Value['$ref'])"
}

Ula answered 5/8, 2023 at 4:32 Comment(2)
JBach, a link to a solution is welcome, but please ensure your answer is useful without it: add context around the link so your fellow users will have some idea what it is and why it is there, then quote the most relevant part of the page you are linking to in case the target page is unavailable. Answers that are little more than a link may be deleted. - Privet
I think it's worth mentioning that this method is fine for read operations, but it won't work for updating the YAML data: changing the collection while GetEnumerator() is iterating over it throws an error. - Darbie
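
One common way around the update problem mentioned in the last comment is to loop over a snapshot of the keys instead of over the live enumerator. The sketch below uses a plain hashtable as a stand-in for the parsed YAML (an assumption; the module's actual return type is not checked here), and the version bump is purely illustrative.

```powershell
# Plain hashtable standing in for the parsed "apis" section (assumption).
$apis = @{
    api1 = @{ '$ref' = 'api1_1.0.0.yaml' }
    api2 = @{ '$ref' = 'api2_1.0.0.yaml' }
}

# @(...) copies the keys into a separate array first, so assigning into
# $apis inside the loop does not touch the collection being iterated.
foreach ($name in @($apis.Keys)) {
    $newRef = $apis[$name]['$ref'] -replace '1\.0\.0', '1.1.0'
    $apis[$name] = @{ '$ref' = $newRef }
}

# Show the updated entries
$apis.GetEnumerator() | ForEach-Object { "$($_.Name) ref: $($_.Value['$ref'])" }
```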

I suggest you fork an openly licensed module such as Phil-Factor/PSYaml, as parsing YAML isn't a trivial task.

Then you can keep an isolated version that you maintain yourself.
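
If forking a module is not an option either, the narrow need in the question (only the $ref file names under apis) might be covered without a real parser. The following is a rough sketch, not a YAML parser: it assumes block-style YAML exactly as in the question, with apis: as a top-level key at column 0 and one $ref per line, and it will break on flow style, comments after values, or multi-line scalars.

```powershell
# Trimmed copy of the question's file; the single-quoted here-string keeps $ref literal.
$text = @'
product: "1.0.0"
apis:
  api1:
    $ref: "api1_1.0.0.yaml"
  api2:
    $ref: "api2_1.0.0.yaml"
plans:
  default:
    title: "Default Plan"
'@

# Take everything after the top-level "apis:" line, up to the next line that
# starts at column 0 (the next top-level key) or the end of the text.
$section = [regex]::Match($text, '(?ms)^apis:\s*$(.+?)(?=^\S|\z)').Groups[1].Value

# Within that block, capture the value of every "$ref:" line, with or without quotes.
$refs = @(
    [regex]::Matches($section, '(?m)^\s*\$ref:\s*"?([^"\r\n]+)"?') |
        ForEach-Object { $_.Groups[1].Value.Trim() }
)

$refs
```

Restricting the second match to the apis block keeps any $ref entries elsewhere in the file out of the result.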

Navarro answered 20/9, 2021 at 17:21 Comment(0)
