JSON file to PowerShell and back to JSON file

I am trying to manipulate JSON file data in PowerShell and write it back to the file. Even before any manipulation, when I just read the file, convert it to an object in PowerShell, and write it back to the file, some characters are replaced by escape codes. Here is my code:

$jsonFileData = Get-Content $jsonFileLocation

$jsonObject = $jsonFileData | ConvertFrom-Json

# ... (modify $jsonObject) - commented out so that the same object is written back

$jsonFileDataToWrite = $jsonObject | ConvertTo-Json

$jsonFileDataToWrite | Out-File $jsonFileLocation

Some characters are being replaced by their codes. E.g.:

< is replaced by \u003c
> is replaced by \u003e
' is replaced by \u0027

Sample input:

{
    "$schema": "https://source.com/template.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "accountName": {
            "type": "string",
            "defaultValue": "<sampleAccountName>"
        },
        "accountType": {
            "type": "string",
            "defaultValue": "<sampleAccountType>"
        }
    },
    "variables": {
        "location": "sampleLocation",
        "account": "[parameters('accountName')]",
        "type": "[parameters('accountType')]"
    }
}

Output:

{
    "$schema": "https://source.com/template.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "accountName": {
            "type": "string",
            "defaultValue": "\u003csampleAccountName\u003e"
        },
        "accountType": {
            "type": "string",
            "defaultValue": "\u003csampleAccountType\u003e"
        }
    },
    "variables": {
        "location": "sampleLocation",
        "account": "[parameters(\u0027accountName\u0027)]",
        "type": "[parameters(\u0027accountType\u0027)]"
    }
}

Why is this happening, and how can I stop the characters from being replaced so that they are written back unchanged?

Photodrama answered 5/10, 2015 at 23:25 Comment(0)

Since ConvertTo-Json in Windows PowerShell uses the .NET JavaScriptSerializer under the hood, the question is more or less already answered here.

Here's some shameless copypaste:

The characters are being encoded "properly"! Use a working JSON library to correctly access the JSON data - it is a valid JSON encoding.

Escaping these characters prevents HTML injection via JSON and makes the JSON XML-friendly. That is, even if the JSON is emitted directly into JavaScript (as is done fairly often, since JSON is a valid subset of JavaScript), it cannot be used to terminate the element early, because the relevant characters (e.g. <, >) are encoded within the JSON itself.
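
To illustrate why the escaped form is still "proper" JSON, here is a minimal sketch showing that a JSON parser decodes both spellings to the same value. The sample strings and variable names are made up for illustration:

```powershell
# Two spellings of the same JSON document (illustrative sample data).
$escaped   = '{"defaultValue":"\u003csampleAccountName\u003e"}'
$unescaped = '{"defaultValue":"<sampleAccountName>"}'

# ConvertFrom-Json decodes the \uXXXX escapes while parsing.
$fromEscaped   = ($escaped   | ConvertFrom-Json).defaultValue
$fromUnescaped = ($unescaped | ConvertFrom-Json).defaultValue

# Both values are the plain string <sampleAccountName>.
$fromEscaped -eq $fromUnescaped
```

So any consumer that parses the file as JSON sees identical data; the difference only matters when a person or a non-JSON tool reads the raw text.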


If you really need to turn character codes back to unescaped characters, the easiest way is probably to do a regex replace for each character code. Example:

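# Regex patterns (keys) mapped to the literal characters they should become.
# The doubled backslash matches a single literal backslash in the JSON text.
$dReplacements = @{
    "\\u003c" = "<"
    "\\u003e" = ">"
    "\\u0027" = "'"
}

$sInFile = "infile.json"
$sOutFile = "outfile.json"

# Join the lines into one string so -replace runs over the whole text.
$sRawJson = Get-Content -Path $sInFile | Out-String
foreach ($oEnumerator in $dReplacements.GetEnumerator()) {
    $sRawJson = $sRawJson -replace $oEnumerator.Key, $oEnumerator.Value
}

$sRawJson | Out-File -FilePath $sOutFile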
Prophylactic answered 6/10, 2015 at 5:54 Comment(1)
Except that if you're posting the content as application/json, one would expect ConvertTo-Json to follow the JSON spec, which requires escaping only the control characters, the double quote (U+0022), and the backslash (U+005C). No other character needs to be escaped. There's an open issue on PowerShell's GitHub about the JSON output changing relative to PSv5 when PowerShell Core switched to Newtonsoft.Json. In short, PS Core follows the JSON spec by virtue of using the default Newtonsoft.Json string escaper. - Maddy
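
Following up on the comment above, a minimal sketch of the same round trip, assuming it is run in PowerShell 7 (pwsh) rather than Windows PowerShell 5.1. The sample string is made up for illustration, and -Depth 10 is an arbitrary choice to avoid truncating nested templates:

```powershell
# Assumption: running under PowerShell 7+, where ConvertTo-Json is backed by
# Newtonsoft.Json and by default escapes only what the JSON spec requires.
$json = '{"defaultValue":"<sample>","account":"[parameters(''accountName'')]"}'

# Parse and serialize again; -Compress keeps the output on one line.
$roundTripped = $json | ConvertFrom-Json | ConvertTo-Json -Depth 10 -Compress

# On PowerShell 7+ the <, > and ' characters should come back unescaped;
# on Windows PowerShell 5.1 they come back as \u003c, \u003e and \u0027.
$roundTripped
```

PowerShell 6.2 and later also expose an -EscapeHandling parameter on ConvertTo-Json (values Default, EscapeNonAscii, EscapeHtml) for choosing the escaping behaviour explicitly.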

This one-liner finds any \uXXXX escape sequence in $jsonFileDataToWrite and replaces it with the character it represents:

([regex]'(?i)\\u([0-9a-f]{4})').Replace($jsonFileDataToWrite, {param($Match) "$([char][int64]"0x$($Match.Groups[1].Value)")"})

So the original code would look something like this:

$jsonFileData = Get-Content $jsonFileLocation
$jsonObject = $jsonFileData | ConvertFrom-Json

# ... (modify $jsonObject) - commented out so that the same object is written back

$jsonFileDataToWrite = $jsonObject | ConvertTo-Json
# Decode every \uXXXX escape (hex digits 0-9 and a-f) back to its character.
$jsonFileDataToWrite = ([regex]'(?i)\\u([0-9a-f]{4})').Replace($jsonFileDataToWrite, {param($Match) "$([char][int64]"0x$($Match.Groups[1].Value)")"})
$jsonFileDataToWrite | Out-File $jsonFileLocation
Rubicund answered 9/7, 2022 at 16:55 Comment(1)
This doesn't distinguish between quoted literals or not, and doesn't work for characters that need to be encoded (such as double quotes). - Beatitude
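
One way to address the concern in the comment above is to decode only the escapes that are safe to write literally. The following is a sketch, not a definitive implementation: the function name ConvertFrom-JsonUnicodeEscape is hypothetical, and it assumes the input is valid JSON (where \uXXXX escapes can only occur inside strings). It leaves double quotes, backslashes, and control characters escaped, and it skips sequences where the backslash before the u is itself escaped:

```powershell
# Hypothetical helper: decodes \uXXXX escapes that do not need escaping in JSON.
function ConvertFrom-JsonUnicodeEscape {
    param([Parameter(Mandatory)][string]$Json)

    # Group 1: any run of escaped backslashes (pairs) directly before the escape.
    # Group 2: the four hex digits. The lookbehind anchors at the start of the
    # backslash run, so "\\u0027" (a literal backslash plus "u0027") never matches.
    $pattern = [regex]'(?<!\\)((?:\\\\)*)\\u([0-9a-fA-F]{4})'

    $pattern.Replace($Json, {
        param($match)
        $code = [Convert]::ToInt32($match.Groups[2].Value, 16)
        # Control characters, the double quote and the backslash must stay escaped.
        if ($code -lt 0x20 -or $code -eq 0x22 -or $code -eq 0x5C) {
            return $match.Value
        }
        return $match.Groups[1].Value + [char]$code
    })
}
```

It could then replace the one-liner in the script above, for example: $jsonFileDataToWrite = ConvertFrom-JsonUnicodeEscape -Json ($jsonObject | ConvertTo-Json)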
