How to transpose a dataset in a csv file?
Asked Answered
E

7

27

For example, i would like to transform:

Name,Time,Score
Dan,68,20
Suse,42,40
Tracy,50,38

Into:

Name,Dan,Suse,Tracy
Time,68,42,50
Score,20,40,38

Edit: The original question used the term "transpose" incorrectly.

Economical answered 1/2, 2011 at 23:29 Comment(11)
The word you are looking for is pivot, as opposed to transpose.Clipping
What have you tried? Is this homework? WHY do you want to transform it from something usual to something weird?Godthaab
lol. i wish i were back to the times of hwEconomical
@zr: I'm with you. We need a "wish-it-was-homework" tag in SO. :-)Tiemannite
I don't think transpose is incorrect here, personally. See en.wikipedia.org/wiki/TransposeSpringy
@chmullig: ??? matrix transposition is EXACTLY what the OP wants. "pivot" is not the word anyone should be looking for.Godthaab
@John Machin - I agree. I was disagreeing with @Slomojo. Should have phrased it better.Springy
@chmullig: looks like I missed the "in" in incorrect ... I'm not usually fazed by double negatives; I'm the guy who writes not not x instead of bool(x) :-)Godthaab
Nor am I usually not unfazed by double negatives.Springy
Excel calls this operation "transpose". And it is a reflection in the diagonal, top-left to bottom-right, line as Wikipedia states. Why isn't it a transposition?Crystallization
Yes: it is transposition. But WHY is it going from row-orientation to column-orientation... and more importantly: why do all tutorials for CSV data perform this transpose... and then fail to tell you how to get ROWS of data into your model? I have spent 2 days trying to get CSV data into a model - it should not be so hard, or so obscure.Amylopectin
L
45

If the whole file contents fits into memory, you can use

import csv
from itertools import izip
a = izip(*csv.reader(open("input.csv", "rb")))
csv.writer(open("output.csv", "wb")).writerows(a)

You can basically think of zip() and izip() as transpose operations:

a = [(1, 2, 3),
     (4, 5, 6),
     (7, 8, 9)]
zip(*a)
# [(1, 4, 7),
#  (2, 5, 8),
#  (3, 6, 9)]

izip() avoids the immediate copying of the data, but will basically do the same.

Leslileslie answered 1/2, 2011 at 23:36 Comment(10)
That's very slick. You might explain what you're doing though, since it's probably not intuitive to someone unfamiliar with izip and python iterables.Springy
@chmullig: Added and explanatory example.Leslileslie
If the whole file doesn't fit in memory add more memoryTurnbuckle
+1. BTW, it looks like the manual needs updating "If the syntax *expression appears in the function call, expression must evaluate to a sequence" ... csv.reader() doesn't evaluate to a sequence.Godthaab
I've never seen the * symbol being used this way in python before, any links where I can find documentation for it?Fallacious
@Tony: In the Python Tutorial.Leslileslie
@xApple: As long as all your lines have the same number of items, this will work. If you have lines with different numbers of columns, transposition is rather meaningless anyway.Leslileslie
After investigation, it was a buffering and cache issue with the file objects. Sorry.Pearle
Holy wow. Two years on, I learned more about unpacking and zip from reading this answer (found on a google search whildst trying to quick-hack around a stupid excel bug) than in all previous deliberate study. super-slick, much respect.Stuart
numpy with it transpose methodZucchetto
M
18

Transfer from input.csv to output.csv. Pandas can also help.

import pandas as pd
pd.read_csv('input.csv', header=None).T.to_csv('output.csv', header=False, index=False)
Martyrize answered 24/12, 2016 at 0:43 Comment(2)
Excellent one-liner.Humes
Saved the day, @anton-tarasenko ! I still wonder why csv.writerows(my_list) cannot do what I need it to. Instead I am having to do csv.writerow(my_list) then use your stransposing one-line to get my one row to become multiple rows. Crazy!Clarinda
P
5

Same answer of nosklo (all credits to him), but for python3:

from csv import reader, writer 
with open('source.csv') as f, open('destination.csv', 'w') as fw: 
    writer(fw, delimiter=',').writerows(zip(*reader(f, delimiter=',')))
Preform answered 10/5, 2019 at 12:24 Comment(1)
Needs open(output_filename, 'w', newline='') as fw or it's double spaced.Coccidiosis
L
2
from itertools import izip
from csv import reader, writer

with open('source.csv') as f, open('destination.csv', 'w') as fw:
    writer(fw, delimiter=',').writerows(izip(*reader(f, delimiter=',')))
Lancashire answered 1/2, 2011 at 23:44 Comment(1)
-1 If Python 2.x, should use binary mode on both files. If Python 3.x, should use newline='' on both files. Also: delimiter=',' is the default.Godthaab
L
1

If lines is the list of your original text than it should be

for i in range(1,len(lines)):
    lines[i] = lines[i].split(',')

new_lines = []
for i in range(len(lines[0])):
    new_lines.append("%s,%s,%s" % (lines[0][i], lines[1][i], lines[2][i]))

or use csv Python module - http://docs.python.org/library/csv.html

Loppy answered 1/2, 2011 at 23:36 Comment(0)
R
0

The simplest way is:

import numpy as np
import pandas as pd

_mat = pd.read_csv("test.csv")
_mat = _mat[_mat.columns[0:3]].values
_t_mat = np.transpose(_mat)

Result:

  • Input matrix is : [[1 2 3] [4 5 6]]
  • the output is: [[1 4] [2 5] [3 6]]
Restate answered 14/11, 2017 at 20:23 Comment(0)
B
0

Read the CSV into pandas data frame, pandas has build in function for transpose which can be invoked as below.

import pandas as pd

csv = pd.read_csv("test.csv", skiprows=1)
# use skiprows if you want to skip headers
df_csv = pd.DataFrame(data=csv)
transposed_csv = df_csv.T
print(transposed_csv)
Benthos answered 10/4, 2019 at 10:37 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.