List of dicts to/from dict of lists

S

14

136

I want to change back and forth between a dictionary of (equal-length) lists:

DL = {'a': [0, 1], 'b': [2, 3]}

and a list of dictionaries:

LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

Stoicism answered 5/4, 2011 at 20:57 Comment(1)

It is unclear how you would interpret the order of DL. That is, if you have many elements, they lose their insertion order. If 'a' and 'b' come out of DL in a different order, what should the order of the resulting LD be? – G 7/4, 2011 at 0:35

E

17

Perhaps consider using numpy:

import numpy as np

arr = np.array([(0, 2), (1, 3)], dtype=[('a', int), ('b', int)])
print(arr)
# [(0, 2) (1, 3)]

Here we access columns indexed by names, e.g. 'a', or 'b' (sort of like DL):

print(arr['a'])
# [0 1]

Here we access rows by integer index (sort of like LD):

print(arr[0])
# (0, 2)

Each value in the row can be accessed by column name (sort of like LD):

print(arr[0]['b'])
# 2

Eufemiaeugen answered 5/4, 2011 at 21:15 Comment(3)

Nifty. Could you explain the difference between passing [(0,2),(1,3)] and [[0,2],[1,3]] to np.array? Specifically why does the second not work? – Stoicism 7/4, 2011 at 18:36

@Adam Greenhall: You are asking a very good question. I don't know the complete answer. I know that numpy sometimes makes a much greater distinction between lists and tuples than does Python. The documentation for dtype syntax docs.scipy.org/numpy/docs/numpy.doc.structured_arrays says, when defining a dtype using a "[l]ist argument ... the record structure is defined with a list of tuples." But I don't know why it must be this way. – Eufemiaeugen 7/4, 2011 at 23:14

@Eufemiaeugen thanks, very interesting. I had not heard of structured arrays. The documentation link has now changed: numpy.org/doc/stable/user/basics.rec.html. Also, I note in the docs it says structured arrays "are meant for interfacing with C code and for low-level manipulation of structured buffers... Users looking to manipulate tabular data, such as stored in csv files, may find other pydata projects more suitable, such as xarray, pandas, or DataArray." – Topsyturvydom 11/8, 2020 at 18:53

M

164

For those of you that enjoy clever/hacky one-liners.

Here is DL to LD:

v = [dict(zip(DL,t)) for t in zip(*DL.values())]
print(v)

and LD to DL (assuming every dict has the same keys):

v = {k: [dic[k] for dic in LD] for k in LD[0]}
print(v)

or LD to DL when the dicts do not all share the same keys (only the keys common to every dict are kept):

common_keys = set.intersection(*map(set, LD))
v = {k: [dic[k] for dic in LD] for k in common_keys}
print(v)

Also, please note that I do not condone the use of such code in any kind of real system.
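If the one-liners feel too terse, they can be wrapped in small named functions, which also makes the round trip easy to check. This is only a sketch: the names dl_to_ld and ld_to_dl are made up for illustration, and it assumes equal-length lists and identical keys in every dict.

```python
def dl_to_ld(dl):
    """Dict of equal-length lists -> list of dicts."""
    # zip(*dl.values()) yields one tuple per "row";
    # zip(dl, row) pairs each key with its value in that row.
    return [dict(zip(dl, row)) for row in zip(*dl.values())]


def ld_to_dl(ld):
    """List of dicts sharing the same keys -> dict of lists."""
    if not ld:
        return {}
    return {k: [d[k] for d in ld] for k in ld[0]}


DL = {'a': [0, 1], 'b': [2, 3]}
LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

print(dl_to_ld(DL) == LD)            # True
print(ld_to_dl(LD) == DL)            # True
print(ld_to_dl(dl_to_ld(DL)) == DL)  # True
```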

Meyerbeer answered 9/10, 2015 at 20:48 Comment(5)

LD to DL returns tuples instead of lists, which may or may not be more desirable. BTW, very nice and handy one-liners – Rooke 2/7, 2016 at 16:17

@GillBates You were correct; the LD->DL code relied on all dicts being ordered the same way, which is a horrible assumption to make. I've replaced the bad code. – Horning 5/6, 2018 at 16:12

To handle dicts with different keys: LD[0] can be replaced by reduce(set.union, [set(D.keys()) for D in LD]) then [dic[k] for dic in LD if k in dic], so the resulting one liner is: v = {k: [dic[k] for dic in LD if k in dic] for k in reduce(set.union, [set(D.keys()) for D in LD])} – Thermonuclear 21/6, 2019 at 21:40

I sure do enjoy clever/hacky one-liners. Also I think these are nice solutions that are very pythonic. They use core python idioms that I think python programmers should be familiar with. – Meridel 25/2, 2020 at 7:32

Please - if using a clever hacky one-liner like this, add a comment, and preferably a reference to this SO answer. Your future self will thank you. – Churchwell 27/4, 2021 at 7:44
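For the case raised in the comments, where the dicts have different keys and you want the union of keys rather than the intersection, here is a possible sketch. It swaps the set union for dict.fromkeys so that keys come out in first-seen order (this relies on Python 3.7+ dict ordering). The sample LD is made up for illustration.

```python
LD = [{'a': 0, 'b': 2}, {'b': 3, 'c': 5}]

# dict.fromkeys keeps the first-seen order of keys (Python 3.7+),
# unlike a set, whose iteration order is arbitrary.
all_keys = dict.fromkeys(k for d in LD for k in d)

# Dicts that lack a key are skipped for that key.
v = {k: [d[k] for d in LD if k in d] for k in all_keys}
print(v)
# {'a': [0], 'b': [2, 3], 'c': [5]}
```

Note that the resulting lists can have different lengths, so you can no longer tell which dict a value came from.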

S

19

If you're allowed to use outside packages, Pandas works great for this:

import pandas as pd
pd.DataFrame(DL).to_dict(orient="records")

Which outputs:

[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

You can also use orient="list" to get back the original structure:

{'a': [0, 1], 'b': [2, 3]}

Speciosity answered 8/8, 2014 at 17:43 Comment(1)

This is probably a version issue, but the above returns {'a': [0, 1], 'b': [2, 3]} in pandas 0.18.1. pd.DataFrame(DL).to_dict('records') works as described. – Hereinafter 27/5, 2016 at 18:14

G

14

Going from the list of dictionaries to a dictionary of lists is straightforward.

You can use this form:

DL={'a':[0,1],'b':[2,3], 'c':[4,5]}
LD=[{'a':0,'b':2, 'c':4},{'a':1,'b':3, 'c':5}]

nd={}
for d in LD:
    for k,v in d.items():
        try:
            nd[k].append(v)
        except KeyError:
            nd[k]=[v]

print(nd)
# {'a': [0, 1], 'b': [2, 3], 'c': [4, 5]}  (key order as inserted, on Python 3.7+)

Or use defaultdict:

import collections as cl

nd=cl.defaultdict(list)
for d in LD:
    for key,val in d.items():
        nd[key].append(val)

print(dict(nd))
# {'a': [0, 1], 'b': [2, 3], 'c': [4, 5]}  (key order as inserted, on Python 3.7+)

Going the other way is problematic. You need some information about the order in which the dictionary's keys should be inserted into the list. Recall that, before Python 3.7, the order of keys in a dict was not guaranteed to match the original insertion order.
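One way to sidestep the ordering question is to fix the key order explicitly instead of relying on the dict's own order. A minimal sketch, assuming sorted key order is what you want and that all lists have the same length (the sample DL is made up):

```python
DL = {'b': [2, 3], 'a': [0, 1]}

# Fix the key order explicitly rather than relying on the dict's own order.
keys = sorted(DL)  # assumption: sorted key order is the order we want

# Row i of the result takes element i from every list.
LD = [{k: DL[k][i] for k in keys} for i in range(len(DL[keys[0]]))]
print(LD)
# [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
```

Note that the order of the rows in LD comes from the positions within the lists, so the key order only affects how keys are arranged inside each resulting dict.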

For giggles, assume the insertion order is based on sorted keys. You can then do it this way:

nl=[]
nl_index=[]

for k in sorted(DL.keys()):
    nl.append({k:[]})
    nl_index.append(k)

for key,l in DL.items():
    for item in l:
        nl[nl_index.index(key)][key].append(item)

print(nl)
# [{'a': [0, 1]}, {'b': [2, 3]}, {'c': [4, 5]}]

If your question was based on curiosity, there is your answer. If you have a real-world problem, let me suggest you rethink your data structures. Neither of these seems to be a very scalable solution.

G answered 5/4, 2011 at 22:10 Comment(0)

A

12

Here are the one-line solutions (spread out over multiple lines for readability) that I came up with:

If dl is your original dict of lists:

dl = {"a":[0, 1],"b":[2, 3]}

Then here's how to convert it to a list of dicts:

# min() stops at the shortest list; indexing past it would raise IndexError
ld = [{key:value[index] for key,value in dl.items()}
         for index in range(min(map(len,dl.values())))]

If you assume that all your lists are the same length, you can simplify this and gain a little performance:

ld = [{key:value[index] for key, value in dl.items()}
        for index in range(len(next(iter(dl.values()))))]

Here's how to convert that back into a dict of lists:

import functools

dl2 = {key:[item[key] for item in ld if key in item]
         for key in list(functools.reduce(
             lambda x, y: x.union(y),
             (set(dicts.keys()) for dicts in ld)
         ))
      }

If you're using Python 2 instead of Python 3, you can just use reduce instead of functools.reduce there.

You can simplify this if you assume that all the dicts in your list will have the same keys:

dl2 = {key:[item[key] for item in ld] for key in ld[0].keys() }
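If the lists may differ in length and padding the short ones is acceptable, itertools.zip_longest is another option. This is a sketch under that assumption, not part of the solutions above; the choice of None as the fill value is arbitrary.

```python
from itertools import zip_longest

dl = {"a": [0, 1, 2], "b": [3, 4]}

# zip_longest pads the shorter lists; None as the fill value is an assumption.
ld = [dict(zip(dl, row)) for row in zip_longest(*dl.values(), fillvalue=None)]
print(ld)
# [{'a': 0, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': None}]
```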

Astounding answered 8/5, 2014 at 20:32 Comment(1)

What is the point of the rollback to version 4? Note that you have made the range in the second code snippet wrong, and removing the python code formatting makes the code actively worse. – Achromatize 20/11, 2019 at 11:0

D

6

The pandas library offers an easy-to-understand solution. As a complement to @chiang's answer, the solutions for both directions (DL to LD and LD to DL) are as follows:

import pandas as pd
DL = {'a': [0, 1], 'b': [2, 3]}
out1 = pd.DataFrame(DL).to_dict('records')

Output:

[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

In the other direction:

LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
out2 = pd.DataFrame(LD).to_dict('list')

Output:

{'a': [0, 1], 'b': [2, 3]}

Delaryd answered 1/5, 2018 at 8:45 Comment(0)

D

6

`cytoolz.dicttoolz.merge_with`

Docs

from cytoolz.dicttoolz import merge_with

merge_with(list, *LD)

{'a': [0, 1], 'b': [2, 3]}

Non-cython version

Docs

from toolz.dicttoolz import merge_with

merge_with(list, *LD)

{'a': [0, 1], 'b': [2, 3]}
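If you would rather not add a dependency, roughly the same effect as merge_with(list, *LD) can be had with the standard library. A minimal sketch; merge_with_list is a hypothetical name, not part of toolz or cytoolz.

```python
from collections import defaultdict


def merge_with_list(*dicts):
    """Hypothetical stdlib stand-in for merge_with(list, *dicts)."""
    grouped = defaultdict(list)
    for d in dicts:
        for k, v in d.items():
            grouped[k].append(v)  # collect every value seen for this key
    return dict(grouped)


LD = [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
print(merge_with_list(*LD))
# {'a': [0, 1], 'b': [2, 3]}
```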

Disjunctive answered 9/10, 2018 at 16:21 Comment(1)

Thank you @Disjunctive for introducing me to the world of cytoolz. Where has it been my whole life?! :) – Lawsuit 12/6, 2020 at 19:20

C

2

The cleanest way I can think of on a summer Friday. As a bonus, it supports lists of different lengths (but in that case, DLtoLD(LDtoDL(l)) is no longer the identity).

From list to dict

Actually less clean than @dwerk's defaultdict version.

def LDtoDL (l) :
   result = {}
   for d in l :
      for k, v in d.items() :
         result[k] = result.get(k,[]) + [v] #inefficient
   return result
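The inefficiency flagged in the comment comes from building a new list on every step. One possible alternative is dict.setdefault, which appends in place. A sketch only, with ld_to_dl as an illustrative name:

```python
def ld_to_dl(ld):
    result = {}
    for d in ld:
        for k, v in d.items():
            # setdefault returns the existing list, or stores and returns
            # a new empty one, so the append happens in place.
            result.setdefault(k, []).append(v)
    return result


print(ld_to_dl([{'a': 0, 'b': 2}, {'a': 1}]))
# {'a': [0, 1], 'b': [2]}
```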

From dict to list

def DLtoLD (d) :
   if not d :
      return []
   #reserve as much *distinct* dicts as the longest sequence
   result = [{} for i in range(max (map (len, d.values())))]
   #fill each dict, one key at a time
   for k, seq in d.items() :
      for oneDict, oneValue in zip(result, seq) :
         oneDict[k] = oneValue
   return result

Cristinacristine answered 10/8, 2012 at 15:3 Comment(1)

Does not work for me: DLtoLD({1: [3], 2: [4, 5]}) yields [{1: 3, 2: 4}, {2: 5}] while I'd expect [{1: 3, 2: 4}, {1: 3, 2: 5}]... – Beaverette 27/7, 2021 at 15:35

B

2

I needed a method that works for lists of different lengths (so this is a generalization of the original question). Since I did not find any code here that works the way I expected, here is the code that works for me. It builds one dict for every combination of values (the Cartesian product of the lists):

import itertools
from typing import Dict, List, TypeVar

S = TypeVar("S")
T = TypeVar("T")

def dict_of_lists_to_list_of_dicts(dict_of_lists: Dict[S, List[T]]) -> List[Dict[S, T]]:
    keys = list(dict_of_lists.keys())
    list_of_values = [dict_of_lists[key] for key in keys]
    product = list(itertools.product(*list_of_values))

    return [dict(zip(keys, product_elem)) for product_elem in product]

Examples:

>>> dict_of_lists_to_list_of_dicts({1: [3], 2: [4, 5]})
[{1: 3, 2: 4}, {1: 3, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5]})
[{1: 3, 2: 5}, {1: 4, 2: 5}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6]})
[{1: 3, 2: 5}, {1: 3, 2: 6}, {1: 4, 2: 5}, {1: 4, 2: 6}]
>>> dict_of_lists_to_list_of_dicts({1: [3, 4], 2: [5, 6], 7: [8, 9, 10]})
[{1: 3, 2: 5, 7: 8},
 {1: 3, 2: 5, 7: 9},
 {1: 3, 2: 5, 7: 10},
 {1: 3, 2: 6, 7: 8},
 {1: 3, 2: 6, 7: 9},
 {1: 3, 2: 6, 7: 10},
 {1: 4, 2: 5, 7: 8},
 {1: 4, 2: 5, 7: 9},
 {1: 4, 2: 5, 7: 10},
 {1: 4, 2: 6, 7: 8},
 {1: 4, 2: 6, 7: 9},
 {1: 4, 2: 6, 7: 10}]
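To make the difference from the original question explicit, here is a small side-by-side sketch of row-wise pairing (zip) and the Cartesian product on the same input. The variable names are illustrative only.

```python
import itertools

dl = {1: [3, 4], 2: [5, 6]}
keys = list(dl)

# Row-wise pairing: the i-th dict takes the i-th element of every list.
zipped = [dict(zip(keys, row)) for row in zip(*dl.values())]

# Cartesian product: every combination of one element per list.
combos = [dict(zip(keys, row)) for row in itertools.product(*dl.values())]

print(zipped)  # [{1: 3, 2: 5}, {1: 4, 2: 6}]
print(combos)  # [{1: 3, 2: 5}, {1: 3, 2: 6}, {1: 4, 2: 5}, {1: 4, 2: 6}]
```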

Beaverette answered 29/7, 2021 at 7:56 Comment(0)

P

1

Here is my small script:

a = {'a': [0, 1], 'b': [2, 3]}
elem = {}
result = []

for i in range(len(a['a'])): # (1) iterate over positions, not over the values in a['a']
    for key, value in a.items():
        elem[key] = value[i]
    result.append(elem)
    elem = {}

print(result)

I'm not sure this is the most elegant way.

(1) This assumes that all the lists have the same length.
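Since the script depends on all lists having the same length, it may be worth checking that up front. A minimal sketch; dl_to_ld_checked is a hypothetical helper name, and raising ValueError is just one reasonable choice.

```python
def dl_to_ld_checked(dl):
    """Hypothetical helper: convert only if every list has the same length."""
    lengths = {len(v) for v in dl.values()}
    if len(lengths) > 1:
        raise ValueError(f"lists have different lengths: {sorted(lengths)}")
    n = lengths.pop() if lengths else 0  # an empty dict gives an empty list
    return [{k: v[i] for k, v in dl.items()} for i in range(n)]


print(dl_to_ld_checked({'a': [0, 1], 'b': [2, 3]}))
# [{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]
```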

Ploughboy answered 5/4, 2011 at 21:16 Comment(0)

S

1

Here is a solution without any libraries used:

def dl_to_ld(initial):
    finalList = []
    neededLen = 0

    for key in initial:
        if(len(initial[key]) > neededLen):
            neededLen = len(initial[key])

    for i in range(neededLen):
        finalList.append({})

    for i in range(len(finalList)):
        for key in initial:
            try:
                finalList[i][key] = initial[key][i]
            except IndexError:  # this list is shorter than i + 1 elements
                pass

    return finalList

You can call it as follows:

dl = {'a':[0,1],'b':[2,3]}
print(dl_to_ld(dl))

#[{'a': 0, 'b': 2}, {'a': 1, 'b': 3}]

Stearin answered 22/2, 2019 at 18:35 Comment(0)

M

0

If you don't mind a generator, you can use something like

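def f(dl):
  l = [(k, iter(v)) for k, v in dl.items()]
  while True:
    # On Python 3.7+, a StopIteration escaping a generator expression
    # becomes a RuntimeError (PEP 479), so catch it explicitly.
    d = {}
    for k, i in l:
      try:
        d[k] = next(i)
      except StopIteration:
        break
    if not d:
      break
    yield d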

It's not as "clean" as it could be for Technical Reasons: My original implementation did yield dict(...), but this ends up being the empty dictionary because (in Python 2.5) a for b in c does not distinguish between a StopIteration exception when iterating over c and a StopIteration exception when evaluating a.
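On current Python versions, zip can do the lockstep iteration and stops cleanly at the shortest list, which avoids handling StopIteration by hand. A possible sketch (iter_dicts is a made-up name):

```python
def iter_dicts(dl):
    """Lazily yield one dict per position; stops at the shortest list."""
    keys = list(dl)
    for row in zip(*dl.values()):
        yield dict(zip(keys, row))


gen = iter_dicts({'a': [0, 1], 'b': [2, 3]})
print(next(gen))  # {'a': 0, 'b': 2}
print(list(gen))  # [{'a': 1, 'b': 3}]
```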

On the other hand, I can't work out what you're actually trying to do; it might be more sensible to design a data structure that meets your requirements instead of trying to shoehorn it in to the existing data structures. (For example, a list of dicts is a poor way to represent the result of a database query.)

Maryjanemaryjo answered 5/4, 2011 at 21:44 Comment(0)

A

0

List of dicts ⟶ dict of lists

from collections import defaultdict
from typing import TypeVar

K = TypeVar("K")
V = TypeVar("V")


def ld_to_dl(ld: list[dict[K, V]]) -> dict[K, list[V]]:
    dl = defaultdict(list)
    for d in ld:
        for k, v in d.items():
            dl[k].append(v)
    return dl

defaultdict creates an empty list if one does not exist upon key access.
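A tiny illustration of that behaviour, including one caveat worth knowing: merely reading a missing key also creates it. If that matters, wrapping the result in dict() before returning it would avoid the surprise.

```python
from collections import defaultdict

dl = defaultdict(list)
dl['a'].append(0)  # 'a' is missing, so list() is called to create [] first
dl['a'].append(1)
print(dict(dl))    # {'a': [0, 1]}

# Caveat: merely reading a missing key also creates it.
_ = dl['missing']
print('missing' in dl)  # True
```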

Dict of lists ⟶ list of dicts

Collecting into "jagged" dictionaries

from typing import TypeVar

K = TypeVar("K")
V = TypeVar("V")


def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = []
    for k, vs in dl.items():
        ld += [{} for _ in range(len(vs) - len(ld))]
        for i, v in enumerate(vs):
            ld[i][k] = v
    return ld

This generates a list of dictionaries ld that may be missing items if the lengths of the lists in dl are unequal. It loops over all key-values in dl, and creates empty dictionaries if ld does not have enough.

Collecting into "complete" dictionaries only

(Usually intended only for equal-length lists.)

from typing import TypeVar

K = TypeVar("K")
V = TypeVar("V")


def dl_to_ld(dl: dict[K, list[V]]) -> list[dict[K, V]]:
    ld = [dict(zip(dl.keys(), v)) for v in zip(*dl.values())]
    return ld

This generates a list of dictionaries ld whose length is that of the shortest list in dl.
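To see how the two variants differ, here is a small sketch that runs both on the same unequal-length input. The logic is inlined from the two functions above; the sample data is made up.

```python
dl = {'a': [0, 1, 2], 'b': [3, 4]}

# "Complete" variant: zip stops at the shortest list.
complete = [dict(zip(dl.keys(), v)) for v in zip(*dl.values())]

# "Jagged" variant: later dicts simply lack the keys whose lists ran out.
jagged = []
for k, vs in dl.items():
    jagged += [{} for _ in range(len(vs) - len(jagged))]
    for i, v in enumerate(vs):
        jagged[i][k] = v

print(complete)  # [{'a': 0, 'b': 3}, {'a': 1, 'b': 4}]
print(jagged)    # [{'a': 0, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2}]
```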

Analyzer answered 8/10, 2022 at 6:58 Comment(0)

F

-4

DL={'a':[0,1,2,3],'b':[2,3,4,5]}
LD=[{'a':0,'b':2},{'a':1,'b':3}]
Empty_list = []
Empty_dict = {}
# find the length of the longest list among the dictionary's values
len_list = 0
for i in DL.values():
    if len_list < len(i):
        len_list = len(i)

for k in range(len_list):        
    for i,j in DL.items():
        Empty_dict[i] = j[k]
    Empty_list.append(Empty_dict)
    Empty_dict = {}
LD = Empty_list

Fanchie answered 22/3, 2019 at 7:16 Comment(1)

Hi Anup, can you please elaborate your answer with some explanation? – Moa 22/3, 2019 at 7:21