How do I split a string into a list of characters?

How do I split a string into a list of characters? str.split does not work.

"foobar"    →    ['f', 'o', 'o', 'b', 'a', 'r']
Advocation answered 12/2, 2011 at 15:14 Comment(3)
In Python, strings are already arrays of characters for all purposes except replacement. You can slice them, reference or look up items by index, etc.Southland
Link to other directionWowser
See stackoverflow.com/questions/743806 for splitting the string into words.Occipital
P
1210

Use the list constructor:

>>> list("foobar")
['f', 'o', 'o', 'b', 'a', 'r']

list builds a new list using items obtained by iterating over the input iterable. A string is an iterable -- iterating over it yields a single character at each iteration step.
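
As a rough illustration of the explanation above (the variable names are invented for this example), the sketch below shows why str.split is the wrong tool here and that list accepts any iterable, not only strings:

```python
text = "foobar"

# str.split requires a non-empty separator, so it cannot split "between" characters.
split_failed = False
try:
    text.split("")
except ValueError:
    split_failed = True

# With no arguments, split() separates on whitespace; a single word stays whole.
whole_word = text.split()

# list() iterates over its argument and collects every item it yields.
chars = list(text)
from_range = list(range(3))

print(split_failed)
print(whole_word)
print(chars)
```

The same call works for tuples, ranges, generators and any other iterable, which is why no string-specific method is needed.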

Pursuant answered 12/2, 2011 at 15:16 Comment(4)
In my opinion much better than the ruby method, you can convert between sequence types freely, even better, in C level.Wysocki
I want flag here to not do this ... but anyway if you want callable you could escape this behavior using cast_method = lambda x: [x]Lemmuela
@Doogle: Capabilities-wise while String is an object and split() can be called on it, list() is a function so it cannot be called on it.Sidwell
This does not work in the latest versions of R anymore, I think.Reveille
R
93

You take the string and pass it to list()

s = "mystring"
l = list(s)
print(l)
Recept answered 12/2, 2011 at 15:16 Comment(0)
M
85

You can also do it in this very simple way without list():

>>> [c for c in "foobar"]
['f', 'o', 'o', 'b', 'a', 'r']
Meader answered 24/3, 2015 at 6:0 Comment(3)
Welcome to stackoverflow. Would you mind extending the answer a little bit to explain how it solves the problem.Fenton
This is a mere for, there's not much to explain. I think you should read the python tutorial on data structures, especially list comprehension.Botts
This just means list(map(lambda c: c, iter("foobar"))), but more readable and meaningful.Colorado
L
55

If you want to process your string one character at a time, you have various options.

uhello = u'Hello\u0020World'

Using a list comprehension:

print([x for x in uhello])

Output:

['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']

Using map:

print(list(map(lambda c2: c2, uhello)))

Output:

['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']

Calling the built-in list function:

print(list(uhello))

Output:

['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd']

Using for loop:

for c in uhello:
    print(c)

Output:

H
e
l
l
o

W
o
r
l
d
Lactoflavin answered 3/6, 2017 at 14:48 Comment(1)
Are there differences in the performance characteristics of each of these methods?Idolum
G
41

If you just need an array of chars:

arr = list(str)

If you want to split the str by a particular delimiter:

# for str = "temp//temps" the result will be ['temp', 'temps']
arr = str.split("//")
Grimaldi answered 13/12, 2018 at 21:31 Comment(0)
C
25

I explored another two ways to accomplish this task. It may be helpful for someone.

The first one is easy:

In [25]: a = []
In [26]: s = 'foobar'
In [27]: a += s
In [28]: a
Out[28]: ['f', 'o', 'o', 'b', 'a', 'r']
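
A small sketch of why the += form works, assuming standard list semantics: augmented assignment on a list extends it in place from any iterable, while plain + insists on another list. The names chars and combined are only for illustration.

```python
chars = []
chars += "foobar"          # in place: behaves like chars.extend("foobar")

# The non-augmented form is stricter and refuses to mix list and str.
try:
    combined = [] + "foobar"
except TypeError:
    combined = None

print(chars)
print(combined)
```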

The second one uses map and a lambda function. It may be appropriate for more complex tasks. On Python 3, map returns an iterator, so it is wrapped in list() to get the list shown:

In [36]: s = 'foobar12'
In [37]: a = list(map(lambda c: c, s))
In [38]: a
Out[38]: ['f', 'o', 'o', 'b', 'a', 'r', '1', '2']

For example:

# isdigit, isspace or other facilities such as regexp may be used
In [40]: a = list(map(lambda c: c if c.isalpha() else '', s))
In [41]: a
Out[41]: ['f', 'o', 'o', 'b', 'a', 'r', '', '']

See python docs for more methods
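
The filtering idea from the map example can also be written as a list comprehension. This is only a sketch of an alternative, not the answer's own method; letters and replaced are illustrative names.

```python
s = "foobar12"

# Keep only alphabetic characters; everything else is dropped entirely.
letters = [c for c in s if c.isalpha()]

# Same result as the map/lambda version: non-letters become empty strings.
replaced = [c if c.isalpha() else "" for c in s]

print(letters)
print(replaced)
```

The first form changes the length of the result, the second keeps one entry per input character.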

Costrel answered 10/9, 2014 at 19:7 Comment(2)
The first way is very simple. Are there reasons people would want something more complex?Sprinkler
Hello! First option is simple indeed. The second one, though, has better potential for handling more complex processing.Costrel
D
25

The task boils down to iterating over characters of the string and collecting them into a list. The most naïve solution would look like

result = []
for character in string:
    result.append(character)

Of course, it can be shortened to just

result = [character for character in string]

but there still are shorter solutions that do the same thing.

The list constructor can be used to convert any iterable (iterators, lists, tuples, strings, etc.) to a list.

>>> list('abc')
['a', 'b', 'c']

The big plus is that it works the same in both Python 2 and Python 3.

Also, starting from Python 3.5 (thanks to the awesome PEP 448) it's now possible to build a list from any iterable by unpacking it to an empty list literal:

>>> [*'abc']
['a', 'b', 'c']

This is neater, and in some cases more efficient, than calling the list constructor directly.

I'd advise against using map-based approaches, because map does not return a list in Python 3. See How to use filter, map, and reduce in Python 3.
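
To make the map caveat concrete, here is a minimal sketch for Python 3 (str.upper is used only so the mapping does something visible):

```python
mapped = map(str.upper, "abc")

# On Python 3 this is a lazy map object, not a list.
is_list = isinstance(mapped, list)

# Wrapping it in list() materialises the items...
upper_chars = list(mapped)

# ...and the iterator is then exhausted, so a second pass yields nothing.
second_pass = list(mapped)

print(is_list, upper_chars, second_pass)
```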

Desk answered 5/4, 2016 at 17:24 Comment(1)
I think the last proposal is very nice. But I don't see why you revisited some of the other approaches, (most of them) have been posted here already and distract from the amazing python 3.5 solution!Airwaves
D
19

The built-in split() method only separates a string at a delimiter (whitespace by default), so a single word that contains no delimiter comes back as one piece. list() solves this: it iterates over the string and stores each character as a separate list element.

Suppose,

a = "bottle"
a.split()  # returns ['bottle']: the whole word, not the individual characters

a = "bottle"
list(a)  # returns ['b', 'o', 't', 't', 'l', 'e']
Declinometer answered 18/8, 2018 at 6:53 Comment(0)
T
17

Unpack them:

word = "Paralelepipedo"
print([*word])
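
The same star syntax (Python 3.5+, per PEP 448) also works inside tuple and set literals and can combine several iterables. A short sketch, with variable names chosen just for the example:

```python
word = "Paralelepipedo"

as_list = [*word]
as_tuple = (*word,)        # the trailing comma is what makes this a tuple
as_set = {*word}           # duplicates collapse and order is not preserved

# Several iterables can be unpacked into one literal.
joined = [*"ab", *"cd"]

print(as_list)
print(joined)
```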
Torsk answered 26/11, 2019 at 12:47 Comment(0)
S
11

To split a string s, the easiest way is to pass it to list(). So,

s = 'abc'
s_l = list(s) #  s_l is now ['a', 'b', 'c']

You can also use a list comprehension, which works but is not as concise as the above:

s_l = [c for c in s]

There are other ways, as well, but these should suffice. Later, if you want to recombine them, a simple call to "".join(s_l) will return your list to all its former glory as a string...
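
A quick sketch of the round trip described above. The point of going through a list is usually that a list can be modified in place while a string cannot; the edits made here are arbitrary.

```python
s = "abc"
s_l = list(s)

s_l[0] = "X"            # lists are mutable, strings are not
s_l.reverse()

result = "".join(s_l)   # join requires every element to be a str
print(result)           # cbX
```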

Slurry answered 17/12, 2020 at 6:53 Comment(0)
P
6

You can also use the list's extend method.

>>> list1 = []
>>> list1.extend('somestring')
>>> list1
['s', 'o', 'm', 'e', 's', 't', 'r', 'i', 'n', 'g']
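
Note that extend and append behave differently with a string argument. A minimal comparison (extended and appended are illustrative names):

```python
extended = []
extended.extend("abc")     # iterates over the string: one element per character

appended = []
appended.append("abc")     # adds the whole string as a single element

print(extended)
print(appended)
```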
Punchy answered 6/9, 2020 at 19:23 Comment(0)
K
4

If you only need read-only access to the string, you can use index notation directly.

Python 2.7.6 (default, Mar 22 2014, 22:59:38) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> t = 'my string'
>>> t[1]
'y'

Could be useful for testing without using regexp. Does the string contain an ending newline?

>>> t[-1] == '\n'
False
>>> t = 'my string\n'
>>> t[-1] == '\n'
True
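
One caveat with the t[-1] test is that it raises IndexError on an empty string. A hedged alternative is str.endswith, sketched below; ends_with_newline is a hypothetical helper name, not something from the answer.

```python
def ends_with_newline(t):
    # str.endswith is safe on an empty string, whereas t[-1] raises IndexError.
    return t.endswith("\n")

try:
    ""[-1]
    index_failed = False
except IndexError:
    index_failed = True

print(ends_with_newline("my string\n"))
print(ends_with_newline(""))
print(index_failed)
```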
Kokaras answered 17/5, 2014 at 14:28 Comment(0)
H
3

Well, much as I like the list(s) version, here's another more verbose way I found (but it's cool so I thought I'd add it to the fray):

>>> text = "My hovercraft is full of eels"
>>> [text[i] for i in range(len(text))]
['M', 'y', ' ', 'h', 'o', 'v', 'e', 'r', 'c', 'r', 'a', 'f', 't', ' ', 'i', 's', ' ', 'f', 'u', 'l', 'l', ' ', 'o', 'f', ' ', 'e', 'e', 'l', 's']
Highbrow answered 9/2, 2015 at 4:7 Comment(2)
camelcase = ''.join([text[i].upper() if i % 2 else text[i].lower() for i in range(len(text))])Eous
@Eous - that's actually aLtErNaTiNg case. CamelCase looks likeThis or LikeThis.Slurry
S
3
from itertools import chain

string = 'your string'
chain(string)

Similar to list(string), but returns an iterator that is lazily evaluated at the point of use, so it is memory efficient.
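
A small sketch of what lazy evaluation means in practice here. The characters are produced one at a time, and the iterator can only be consumed once:

```python
from itertools import chain

string = "your string"

lazy = chain(string)       # nothing has been copied yet
first = next(lazy)         # pulls a single character
rest = list(lazy)          # consumes the remaining characters
leftover = list(lazy)      # the iterator is now exhausted

print(first)
print(rest)
print(leftover)
```

For a single string, the built-in iter(string) gives the same one-pass behaviour without an import.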

Supertanker answered 16/7, 2018 at 10:19 Comment(1)
Not sure where this would be more useful than the string itself, which is iterable.Individuation
S
3

Since strings are iterables, you can also use iterable unpacking to assign to a list. Below, the characters in my_string are unpacked into the my_list list.

my_string = "foobar"
*my_list, = my_string

print(my_list)   # ['f', 'o', 'o', 'b', 'a', 'r']

This is especially useful if you need to save the first or last character into a separate variable.

first, *rest = "foobar"

print(first)  # f
print(rest)   # ['o', 'o', 'b', 'a', 'r']
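
The starred name can also sit in the middle, and it may end up empty. A short sketch of both cases, plus the failure mode when the string is too short (all names are illustrative):

```python
first, *middle, last = "foobar"

# With a one-character string the starred target is simply an empty list.
head, *tail = "x"

# There must still be enough characters for the non-starred names.
try:
    only, *others = ""
    unpack_failed = False
except ValueError:
    unpack_failed = True

print(first, middle, last)
print(head, tail)
print(unpack_failed)
```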
Selfappointed answered 17/9, 2023 at 19:15 Comment(0)
W
1

Here is a nice script that will help you find which method is most efficient for your case:

import timeit
from itertools import chain

string = "thisisthestringthatwewanttosplitintoalist"

def getCharList(s):
  return list(s)

def getCharListComp(s):
  return [char for char in s]

def getCharListMap(s):
  return list(map(lambda c: c, s))

def getCharListForLoop(s):
  result = []
  for c in s:
    result.append(c)
  return result

def getCharListUnpack(s):
  return [*s]

def getCharListExtend(s):
  result = []
  result.extend(s)  # extend() returns None, so return the list itself
  return result

def getCharListChain(s):
  # lazy: no characters are produced until the iterator is consumed
  return chain(s)
 
time_list = timeit.timeit(stmt='getCharList(string)', globals=globals(), number=1)
time_listcomp = timeit.timeit(stmt='getCharListComp(string)', globals=globals(), number=1)
time_listmap = timeit.timeit(stmt='getCharListMap(string)', globals=globals(), number=1)
time_listforloop = timeit.timeit(stmt='getCharListForLoop(string)', globals=globals(), number=1)
time_listunpack = timeit.timeit(stmt='getCharListUnpack(string)', globals=globals(), number=1)
time_listextend = timeit.timeit(stmt='getCharListExtend(string)', globals=globals(), number=1)
time_listchain = timeit.timeit(stmt='getCharListChain(string)', globals=globals(), number=1)

print(f"Execution time using list constructor is {time_list} seconds")
print(f"Execution time using list comprehension is {time_listcomp} seconds")
print(f"Execution time using map is {time_listmap} seconds")
print(f"Execution time using for loop is {time_listforloop} seconds")
print(f"Execution time using unpacking is {time_listunpack} seconds")
print(f"Execution time using extend is {time_listextend} seconds")
print(f"Execution time using chain is {time_listchain} seconds")
Whortleberry answered 8/12, 2022 at 18:3 Comment(1)
Passing number=1 to timeit.timeit is probably not a good idea as a large number of iterations is required to get a reliable result. The default number=1000000 is a better amount.Austria
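
Following the comment above, here is a hedged sketch of a steadier measurement using timeit.repeat with many calls per run. The choice of 10,000 calls and 3 repeats is arbitrary, and only three of the approaches are included to keep it short.

```python
import timeit

string = "thisisthestringthatwewanttosplitintoalist"

candidates = {
    "list constructor": lambda: list(string),
    "list comprehension": lambda: [c for c in string],
    "unpacking": lambda: [*string],
}

timings = {}
for name, func in candidates.items():
    # Three runs of 10,000 calls each; the minimum is the least noisy figure.
    runs = timeit.repeat(func, number=10_000, repeat=3)
    timings[name] = min(runs)

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.4f} s per 10,000 calls")
```

Absolute numbers depend on the machine and Python version, so only the relative ordering is worth reading.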
S
0

You can use

*var, = othervar

to convert it to a list

The code would look like this:

foo = "foobar"

*foolist, = foo

print(foolist)
['f', 'o', 'o', 'b', 'a', 'r']
Shellyshelman answered 13/4 at 1:23 Comment(0)
