Error: cannot use a string pattern on a bytes-like object
Hi, I am using Python regular expressions to show all the wireless network profiles saved on a computer. There is an error (TypeError: cannot use a string pattern on a bytes-like object) in my second-to-last line. Can anyone please help me identify my mistake? Thanks.

My Program

import subprocess,re
command = "netsh wlan show profile"
output = subprocess.check_output(command, shell=True)  
network_names = re.search("(Profile\s*:\s)(.*)", output)  
print(network_names.group(0))

.....................................................

ERROR

line 8, in <module>
 return _compile(pattern, flags).search(string)
TypeError: cannot use a string pattern on a bytes-like object
Bringingup answered 10/5, 2020 at 23:49 Comment(2)
You could try str(output) in your re.search or output.decode('utf-8') maybe? - Blackstock
output = output.decode()? subprocess returns bytes, and you have to convert them to a string (Unicode) manually, using the default 'utf-8' or another encoding, e.g. decode('latin1'), if the system uses a different encoding than UTF-8. - Wax
L
20

Python 3 distinguishes the bytes and str (string) types; this is especially important for Unicode text, where a single character may take more than one byte, depending on the character and the encoding.

Regular expressions can work on either, but the pattern and the input have to be the same type: search for bytes within bytes, or strings within strings.

Depending on what you need, there are two solutions:

  • Decode the output variable before searching in it; for instance, with: output_text = output.decode('utf-8')

    This depends on the encoding that you are using; UTF-8 is the most common these days.

    The matched group will be a string.

  • Search with bytes by adding a b prefix to the regular expression. A regular expression should also use the r prefix, so it becomes: re.search(br"(Profile\s*:\s)(.*)", output)

    The matched group will be a bytes object.
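Here is a minimal sketch of both options side by side. The sample_output value is a made-up stand-in for what subprocess.check_output might return (the profile names are hypothetical), so the snippet runs without netsh. Note that (.*) also captures the trailing \r from Windows line endings, hence the strip().

```python
import re

# Hypothetical sample standing in for the bytes returned by
# subprocess.check_output("netsh wlan show profile", shell=True).
sample_output = (
    b"    All User Profile     : HomeWifi\r\n"
    b"    All User Profile     : CafeGuest\r\n"
)

# Option 1: decode first, then search with a str pattern.
# UTF-8 is an assumption here; use whatever encoding the command really emits.
text = sample_output.decode("utf-8")
text_match = re.search(r"(Profile\s*:\s)(.*)", text)
text_name = text_match.group(2).strip()      # str

# Option 2: keep the bytes and search with a bytes pattern (note the b prefix).
bytes_match = re.search(br"(Profile\s*:\s)(.*)", sample_output)
bytes_name = bytes_match.group(2).strip()    # bytes

# Mixing the two types reproduces the error from the question.
try:
    re.search(r"(Profile\s*:\s)(.*)", sample_output)
except TypeError as exc:
    error_message = str(exc)

print(text_name, bytes_name, error_message)
```

Since the question asks for all profiles, re.findall with the same pattern would return every match instead of only the first one.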

Lemons answered 11/5, 2020 at 0:15 Comment(1)
Decoding the output variable before searching worked for me. - Rarefy
O
3

From the documentation for Popen.stdout:

If the stdout argument was PIPE, this attribute is a readable stream object as returned by open(). Reading from the stream provides output from the child process. If the encoding or errors arguments were specified or the universal_newlines argument was True, the stream is a text stream, otherwise it is a byte stream. If the stdout argument was not PIPE, this attribute is None.

So without setting these options you get a byte stream.

subprocess.check_output supports an encoding keyword argument. Set this to 'utf8' and the return value will be decoded text (a str) instead of bytes:

output = subprocess.check_output(command, shell=True, encoding='utf8')
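A quick way to see the difference, as a sketch: the child command below is a hypothetical stand-in (a Python one-liner run through sys.executable) so that it works on any platform; with netsh you would pass the original command instead.

```python
import subprocess
import sys

# Stand-in child process that prints one line; it replaces netsh so that
# the example can run anywhere.
command = [sys.executable, "-c", "print('Profile : HomeWifi')"]

raw = subprocess.check_output(command)                     # no encoding: bytes
text = subprocess.check_output(command, encoding="utf-8")  # encoding set: str

print(type(raw).__name__)    # bytes
print(type(text).__name__)   # str
print(text.strip())
```

With encoding set, the result can be passed straight to re.search with an ordinary string pattern.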
Oringas answered 11/5, 2020 at 0:10 Comment(0)
C
0

I tried the same code on my computer with Python 2.7, and it works perfectly.

The output is a str object on my side.

I think you can add the line print(type(output)) right after "output = subprocess.check_output(command, shell=True)".

That will show you the real data type. If it's not str, try output = str(output) to convert it to str.

Chukar answered 11/5, 2020 at 0:7 Comment(3)
Python 2 treats bytes as a string, but Python 3 doesn't. - Wax
That's why I said to use the str method to convert it to str. - Chukar
The downside of using output = str(output) is that (a) it'll add b' and ' marks around the text, and (b) it won't work well for accented characters, emoji, etc. For instance, instead of café it'll print out b'caf\xc3\xa9'. Using the .decode() method will treat all these characters correctly. - Tufts
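To make the difference described in the last comment concrete, here is a small sketch comparing str() with .decode() on the same bytes. It assumes the bytes are UTF-8 encoded.

```python
# "café" encoded as UTF-8 takes five bytes: the é needs two.
raw = "café".encode("utf-8")

as_repr = str(raw)             # the repr of the bytes object, not the text
as_text = raw.decode("utf-8")  # the actual text

print(as_repr)   # b'caf\xc3\xa9'
print(as_text)   # café
```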
M
0

I recently had a similar issue. I was trying to convert an input from a .csv table into a number by removing the '£' prefix.

e.g. £25.68 needs to become 25.68

The .csv file was imported with latin1 encoding because '£' was not readable by pandas otherwise. This meant the values were bytes, and they just needed to be converted to strings.

import re

OutgoingArray = ['£3.13', '£11.50', '£5.90', '£4.72']
for iteration, O in enumerate(OutgoingArray):  # Cycles through the 'Amount' values and removes '£' from each number.
    # print(O)  # Prints the current value being processed
    temp = re.findall(r'\d+\.\d+', str(O))  # Finds the decimal number in the string
    # "str(O)" is the bit that fixed my code
    # print(temp[0])
    OutgoingArray[iteration] = float(temp[0])  # Replaces the '£'-prefixed string with a float value.

Output:

OutgoingArray >> [3.13, 11.5, 5.9, 4.72]
Maryettamaryjane answered 4/4 at 15:46 Comment(0)
