tee with utf-8 encoding

5

20

I'm trying to tee a server's output to both the console and a file in Powershell 4. The file is ending up with a UTF-16 encoding, which is incompatible with some other tools I'm using. According to help tee -full:

Tee-Object uses Unicode encoding when it writes to files.
...
To specify the encoding, use the Out-File cmdlet

So tee doesn't support changing the encoding, and neither the help for tee nor the help for Out-File shows an example of splitting a stream and encoding it as UTF-8.

Is there a simple way in PowerShell 4 to tee (or otherwise split a stream) to a file with UTF-8 encoding?

Aplomb answered 2/6, 2015 at 21:27 Comment(1)
It's a shame that Microsoft chose to output in UCS2 (aka UTF-16) by default instead of UTF8...Chiquia
17

One option is to use Add-Content or Set-Content instead of Out-File.

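The *-Content cmdlets do not write UTF-16 by default (in Windows PowerShell they use the system's ANSI code page unless you pass -Encoding, so add -Encoding UTF8 if you need UTF-8), and they have a -PassThru switch so you can write to the file and still have the input pass through to the console: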

Get-Childitem -Name | Set-Content file.txt -Passthru
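
A minimal sketch of the same approach with the encoding spelled out, assuming file.txt in the current directory is only a placeholder name. In Windows PowerShell, -Encoding UTF8 writes a byte order mark at the start of a new file, which may or may not suit the tools reading it.

```powershell
# Overwrite file.txt as UTF-8 and still see the names on the console.
Get-ChildItem -Name | Set-Content -Path file.txt -Encoding UTF8 -PassThru

# Append instead of overwriting. Out-String -Stream renders objects as the
# console would, one line at a time, before they reach Add-Content.
Get-ChildItem | Out-String -Stream | Add-Content -Path file.txt -Encoding UTF8 -PassThru
```

As a comment below notes, the file stays locked while the pipeline is running, which differs from tee.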
Prince answered 2/6, 2015 at 22:5 Comment(7)
Can't append.. not the same.Interlineate
@Interlineate The answer mentions Add-Content, which is for appending. Please read carefully before brushing off an answer.Portable
@AnsgarWiechers, Add-Content is not the same as tee, try both commands before brushing off an answer :) dir | Add-Content out1 vs dir | tee out2 -Append. Also, it's not semantically correct. People use tee for such stuff; when you see it, you know what it is supposed to do.Interlineate
Nobody said it was the same. It can be used to the same end, though. Try dir | Out-String | Add-Content out1 -PassThru.Portable
@Interlineate What exactly is "not semantically correct" about it?Prince
this is the best solution so far, although unlike tee it locks the fileAplomb
Worked for my purposes. ty.Guthrey
7

You would have to use -Variable and then write it out to a file in a separate step.

$data = $null
Get-Process | Tee-Object -Variable data
$data | Out-File -FilePath $path -Encoding Utf8

At first glance it seems like it's easier to avoid tee altogether and just capture the output in a variable, then write it to the screen and to a file.

But because of the way the pipeline works, this method allows a long-running pipeline to display data on screen as it goes along. Unfortunately the same cannot be said for the file, which won't be written until the pipeline has finished.

Doing Both

An alternative is to roll your own tee so to speak:

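# Initialize (truncate) the file with the same encoding used for the appends below
[String]::Empty | Out-File -FilePath $path -Encoding Utf8
Get-Process | ForEach-Object {
    $_ | Out-File -FilePath $path -Append -Encoding Utf8   # write this object to the file
    $_                                                      # and pass it on down the pipeline
}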

That will write to the file and back to the pipeline, and it will happen as it goes along. It's probably quite slow though.
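
If reopening the file for every object turns out to be too slow, one possible variation is to keep a single writer open for the life of the pipeline. The sketch below is only an illustration: Tee-Utf8 is a hypothetical name, not a built-in cmdlet, and it assumes BOM-less UTF-8 is wanted. It uses only .NET classes that are available in PowerShell 4.

```powershell
# Hypothetical helper (not a built-in cmdlet): a streaming tee that writes UTF-8.
function Tee-Utf8 {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, Position = 0)]
        [string] $Path,

        [Parameter(ValueFromPipeline = $true)]
        [psobject] $InputObject,

        [switch] $Append
    )
    begin {
        # .NET does not follow PowerShell's current location, so resolve the path first.
        $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)
        $encoding = New-Object System.Text.UTF8Encoding($false)   # $false = no byte order mark
        $writer = New-Object System.IO.StreamWriter($fullPath, $Append.IsPresent, $encoding)
    }
    process {
        # Render the object as text for the file, then flush so the file
        # keeps pace with what appears on the console.
        $writer.WriteLine(($InputObject | Out-String).TrimEnd())
        $writer.Flush()
        $InputObject   # pass the original object on down the pipeline
    }
    end {
        $writer.Dispose()
    }
}

# Usage, with $path as in the snippets above:
# Get-Process | Tee-Utf8 -Path $path
```

Like the loop above, this renders each object separately, so formatted objects such as processes repeat their table header; plain strings, such as lines of server output, are written as-is. If the pipeline is interrupted, the end block may not run and the writer may stay open until the session ends.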

Kaciekacy answered 2/6, 2015 at 21:36 Comment(1)
-Variable and -Append do not work together in tee. Out-File is an OK solution, but it requires a function since it doesn't have -PassThru.Interlineate
7

Tee-object seems to invoke out-file, so this will make tee output utf8:

$PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'   # adds the default without replacing existing entries

Places to store the setting:

$profile | select *


AllUsersAllHosts       : C:\Windows\System32\WindowsPowerShell\v1.0\profile.ps1
AllUsersCurrentHost    : C:\Windows\System32\WindowsPowerShell\v1.0\Microsoft.PowerShell_profile.ps1
CurrentUserAllHosts    : C:\Users\admin\Documents\WindowsPowerShell\profile.ps1
CurrentUserCurrentHost : C:\Users\admin\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1
Length                 : 78
Choreograph answered 18/11, 2019 at 18:6 Comment(2)
How can this be set as an enforced global default, so we can deploy it on all Windows servers?Jaine
@Jaine Try putting it in $profile.AllUsersAllHosts or $profile.AllUsersCurrentHost.Choreograph
0

Addressed in the GitHub issue #11104.

In PowerShell 7.3.0 and later, Tee-Object supports an -Encoding parameter that accepts one of ASCII, BigEndianUnicode, OEM, Unicode, UTF7, UTF8, UTF8BOM, UTF8NoBOM (the default), and UTF32.

NAME
    Tee-Object

SYNTAX
    Tee-Object [-FilePath] <string> [-InputObject <psobject>] [-Append] [-Encoding <Encoding>] [<CommonParameters>]

    Tee-Object -LiteralPath <string> [-InputObject <psobject>] [-Encoding <Encoding>] [<CommonParameters>]

    Tee-Object -Variable <string> [-InputObject <psobject>] [<CommonParameters>]


ALIASES
    tee


REMARKS
    Get-Help cannot find the Help files for this cmdlet on this computer. It is displaying only partial help.
        -- To download and install Help files for the module that includes this cmdlet, use Update-Help.
        -- To view the Help topic for this cmdlet online, type: "Get-Help Tee-Object -Online" or
           go to https://go.microsoft.com/fwlink/?LinkID=2097034.

Even with that flag, it might still be garbled due to how PowerShell parses pipe outputs. See #17523.

@jbobrean93: PowerShell relies on System.Diagnostics.Process to parse the output from a pipe and in the absence of an explicit setting of the Standard*Encoding property in the start info it will rely on the global setting of Console.OutputEncoding to determine what encoding is used. On Windows the default console encoding is still whatever the OS is configured to, which is typically 437 on English hosts. Unfortunately your only workaround here is to set [Console]::OutputEncoding = [System.Text.Encoding]::UTF8 and then run your command.


The command to use

[Console]::OutputEncoding = [System.Text.Encoding]::UTF8
YOUR_COMMAND_HERE | Tee-Object -FilePath YOUR_OUTPUT_FILE -Encoding UTF8NoBOM
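
If changing the console encoding for the rest of the session is not desirable, the assignment can be wrapped so the previous value is restored afterwards. This is only a sketch: whoami stands in for the real native command, the temp-file path is a placeholder, and it assumes a PowerShell version whose Tee-Object has -Encoding and a host with a console attached.

```powershell
# Placeholder output path for the illustration.
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'tee-demo.txt'

$previous = [Console]::OutputEncoding
try {
    # Decode the native command's output as UTF-8 for this call only.
    [Console]::OutputEncoding = [System.Text.Encoding]::UTF8
    whoami | Tee-Object -FilePath $outFile -Encoding utf8NoBOM
}
finally {
    # Put the console back the way it was, even if the command fails.
    [Console]::OutputEncoding = $previous
}
```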

Notes

You might want to update your PowerShell to use this feature. Use

$PSVersionTable

to check the version of your PowerShell.
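
Instead of comparing version numbers, a script can also ask Tee-Object directly whether the parameter exists. A small sketch, where $teeHasEncoding is just an illustrative variable name:

```powershell
# True when the installed Tee-Object exposes an -Encoding parameter.
$teeHasEncoding = (Get-Command Tee-Object).Parameters.ContainsKey('Encoding')

if ($teeHasEncoding) {
    'Tee-Object supports -Encoding in this session.'
}
else {
    'Tee-Object has no -Encoding here; use one of the workarounds from the other answers.'
}
```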

Littell answered 8/11, 2023 at 20:35 Comment(0)
-1

First create the file using appropriate flags then append to it:

Set-Content  out $null -Encoding Unicode
...
cmd1 | tee out -Append
...
cmdn | tee out -Append
Interlineate answered 3/6, 2015 at 6:15 Comment(2)
this is putting null bytes between each utf8 character in the fileAplomb
That is strange. I checked and it doesn't happen with Unicode or UTF7 encoding.Interlineate