Transporting Sparse Matrix from Python to R - McMap



Asked 5/6, 2015 at 21:15 Answered 5/6, 2015 at 21:36

Solved python r sparse-matrix text-analysis


I am doing some text analysis work in Python. Unfortunately, I need to switch to R in order to use a particular package that cannot easily be replicated in Python.

Currently the text is parsed into bigram counts, reduced to a vocabulary of about 11,000 bigrams, and then stored as a dictionary:

{id1: {'bigrams': [(bigram1, count), (bigram2, count), ...]},
 id2: {'bigrams': ...},
 ...}

I need to get this into a dgCMatrix in R, where the rows are id1, id2, ... and the columns are the distinct bigrams, so that each cell holds the count for that id-bigram pair.

Any suggestions? I thought about expanding it just to a massive CSV, but that seems super inefficient plus probably infeasible due to memory constraints.

Suppletory answered 5/6, 2015 at 21:15 Comment(1)

An example with actual values and in greater numbers might be more useful. As it is you are expecting us to do quite a bit of work before even attempting to code. Maybe you fancy Python coders grasp this layout better than this feeble R-coder, but can you please provide more substance? – Perry 5/6, 2015 at 21:27


Could you write out the matrix in MatrixMarket format using scipy's mmwrite (scipy.io.mmwrite) and then read it into R using readMM from the Matrix package?
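A minimal sketch of the Python side, assuming the dictionary layout from the question. The toy data, the file names, and the variable names (docs, row_ids, vocab) are made up for illustration and are not from the original post. It builds a scipy COO matrix from the (bigram, count) pairs, so only nonzero cells are ever stored, and writes it with mmwrite:

```python
import os
import tempfile

from scipy.io import mmread, mmwrite
from scipy.sparse import coo_matrix

# Hypothetical toy data in the layout described in the question.
docs = {
    "id1": {"bigrams": [("new york", 3), ("york city", 1)]},
    "id2": {"bigrams": [("new york", 2), ("machine learning", 5)]},
}

# Fix a row order for the ids and a column order for the bigrams.
row_ids = sorted(docs)
vocab = sorted({bg for d in docs.values() for bg, _ in d["bigrams"]})
col_index = {bg: j for j, bg in enumerate(vocab)}

# Collect (row, col, value) triplets; zero cells are never materialized.
rows, cols, vals = [], [], []
for i, doc_id in enumerate(row_ids):
    for bg, count in docs[doc_id]["bigrams"]:
        rows.append(i)
        cols.append(col_index[bg])
        vals.append(count)

counts = coo_matrix((vals, (rows, cols)), shape=(len(row_ids), len(vocab)))

# Write the matrix in MatrixMarket format (illustrative file names).
out_dir = tempfile.mkdtemp()
mtx_path = os.path.join(out_dir, "bigram_counts.mtx")
mmwrite(mtx_path, counts)

# The .mtx file carries no labels, so save row and column names separately.
with open(os.path.join(out_dir, "row_ids.txt"), "w") as f:
    f.write("\n".join(row_ids))
with open(os.path.join(out_dir, "bigrams.txt"), "w") as f:
    f.write("\n".join(vocab))

# Read the file back in Python as a sanity check before moving to R.
roundtrip = mmread(mtx_path)
```

On the R side, readMM("bigram_counts.mtx") from the Matrix package should return a triplet-form sparse matrix; as far as I know, as(m, "CsparseMatrix") then converts it to the column-compressed form (dgCMatrix for numeric data), and the two text files can be read with readLines to set rownames and colnames.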

Uchish answered 5/6, 2015 at 21:36 Comment(2)

This worked! It isn't a super memory efficient way of doing it (as far as I can tell), but managed to get it to run on my computer just fine. – Suppletory 6/6, 2015 at 0:5

Hopefully it's pretty time efficient! LOL! :) Glad I could help. – Uchish 7/6, 2015 at 2:45
