large-data Questions

6

Solved

I have three large lists. First contains bitarrays (module bitarray 0.8.0) and the other two contain arrays of integers. l1=[bitarray 1, bitarray 2, ... ,bitarray n] l2=[array 1, array 2, ... , ar...
Sikhism asked 2/1, 2013 at 15:28

5

Use the PRNG with seed 4020 (first 3 numbers are -2123524894 961034805 1071375651) to generate 10^10 integers. Print the 10^5th largest element among the numbers generated. Of course if the problem...
Minnie asked 28/11, 2023 at 7:51
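
A minimal sketch of one way to approach the selection part of this question, assuming the generated numbers can be consumed as a stream. The PRNG itself is not specified in the excerpt, so `stream` here stands for any iterable of integers, and `kth_largest` is a hypothetical helper name. The idea is to keep only the k largest values seen so far in a min-heap, so memory stays at O(k) instead of O(n):

```python
import heapq
from typing import Iterable, List


def kth_largest(stream: Iterable[int], k: int) -> int:
    """Return the k-th largest value of `stream` using O(k) memory."""
    heap: List[int] = []  # min-heap holding the k largest values seen so far
    for value in stream:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the smallest of the current top k.
            heapq.heapreplace(heap, value)
    if len(heap) < k:
        raise ValueError("stream has fewer than k elements")
    return heap[0]  # smallest of the k largest values
```

For 10^10 values a pure-Python loop would likely be too slow in practice; the sketch only illustrates the bounded-memory selection, not a tuned solution.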

4

Solved

I'd like to calculate the square root of a number bigger than 10^2000 in Python. If I treat this number like a normal integer, I will always get this result back: Traceback (most recent call last)...
Danie asked 17/12, 2017 at 11:29
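
A short sketch of two standard-library options for this kind of problem, assuming either an exact integer root or a fixed number of significant digits is acceptable. The function names are illustrative and not taken from the question; `math.isqrt` requires Python 3.8 or later:

```python
import math
from decimal import Decimal, localcontext


def integer_sqrt(n: int) -> int:
    """Floor of the square root, computed exactly on arbitrary-size ints."""
    return math.isqrt(n)


def decimal_sqrt(n: int, precision: int = 50) -> Decimal:
    """Square root rounded to `precision` significant digits."""
    with localcontext() as ctx:
        ctx.prec = precision
        return Decimal(n).sqrt()
```

Both avoid converting the number to a float, which cannot represent values of this size. Note that `precision` counts all significant digits, so it must exceed the number of integer digits in the root if fractional digits are needed.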

4

Solved

I'm working on a project which will involve running algorithms on large graphs. The largest two have around 300k and 600k vertices (fairly sparse I think). I'm hoping to find a Java library that ca...
Exploiter asked 29/3, 2012 at 17:33

4

I'm new to R. I'm trying to append new lines to a file containing my existing data. The problem is that my data has about 30,000 rows and 13,000 columns. I already tried to add a line with the wri...
Blurt asked 12/10, 2011 at 14:22

2

Solved

I am planning to create a merchant table, which will have store locations of the merchant. Most merchants are small businesses and they only have a few stores. However, there is the odd multi-chain...
Phelloderm asked 7/5, 2018 at 1:25

4

Solved

I'm trying hard to install Vowpal Wabbit, and it fails when I run the makefile, throwing: cd library; make; cd .. g++ -g -o ezexample temp2.cc -L ../vowpalwabbit -l vw -l allreduce -l boo...
Enenstein asked 11/7, 2012 at 16:31

1

I have a data.frame of hospital data with 11 million rows. Columns: ID (chr), outcome (1|0), 20x ICD-10 codes (chr). Rows: 10.6 million I wish to make the data tidy to allow modelling of diagnosti...
Samphire asked 15/3, 2022 at 11:52

5

I have a large Firestore collection with 10,000 documents. I want to show these documents in a table by paging and filtering the results at 25 at a time. My idea, to limit the "reads" (and theref...

1

I'm doing some data manipulation with dplyr on my huge data frame (b). I have been able to work successfully on smaller subsets of my data. I guess my problem is with the size of my data fram...
Dunkle asked 7/9, 2020 at 23:57

4

I need to run a logistic regression on a relatively large data frame with 480,000 entries and 3 fixed-effect variables. Fixed effect var A has 3233 levels, var B has 2326 levels, var C has 811 lev...
Brubaker asked 20/2, 2015 at 16:47

8

Solved

I would like to know the difference between Laravel's chunk and cursor methods. Which method is more suitable to use? What are the use cases for each of them? I know that you shoul...
Farnham asked 2/8, 2017 at 15:12

8

I am implementing Kosaraju's strongly connected components (SCC) graph search algorithm in Python. The program runs great on small data sets, but when I run it on a super-large graph (more than 800,000...
Haematogenesis asked 5/4, 2012 at 20:28

1

I'm writing a multiplayer card game (like Hearthstone) with a Node.js back end and an Angular front end. I tried to connect the two with Socket.IO, but it turned out that if I send a JSON object over ...
Zanthoxylum asked 27/6, 2018 at 18:29

2

Solved

I have a project in which I have to achieve fast search, insert and delete operations on data ranging from megabytes to terabytes. I had been studying data structures of late and analyzing them. Be...

2

Solved

I have a large dataset comprising 10^5 data points, and I'm considering the following question: is there any efficient way to visualize a very large dataset? In my case I h...
Insubstantial asked 15/8, 2013 at 1:35

16

Solved

I have tried to puzzle out an answer to this question for many months while learning pandas. I use SAS for my day-to-day work and it is great for its out-of-core support. However, SAS is hor...
Asper asked 10/1, 2013 at 16:20

3

Solved

So, I have a couple of system backup image files that are around 1 terabyte each, and I want to quickly calculate the hash of each one (preferably SHA-1). At first I tried to calculate the md5 has...
Crosspollination asked 28/3, 2014 at 23:12
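
A minimal sketch of chunked hashing with Python's standard `hashlib`, assuming the goal is simply to hash a file far larger than memory. The helper name `file_sha1` and the 1 MiB default chunk size are illustrative choices, not details from the question:

```python
import hashlib


def file_sha1(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-1 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Memory use stays bounded by `chunk_size` regardless of file size. For terabyte-scale images the run time is usually dominated by disk read throughput rather than by the hash computation, so this sketch does not by itself make the operation fast.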

2

Solved

I'm creating a chart similar to Mike Bostock's zoomable area chart. For my specific project, I have a bunch of sensors which are recording values every 30 seconds (temperature, light, humidity and...
Logomachy asked 10/1, 2014 at 17:08

2

I am trying to store records with a set of doubles and ints (around 15-20) in MongoDB. The records mostly (99.99%) have the same structure. When I store the data in a root which is a very structur...
Vlada asked 20/11, 2013 at 5:11

3

I have some large files (more than 30 GB) with pieces of information which I need to do some calculations on, like averaging. The pieces I mention are slices of the file, and I know the beginning ...
Gunnar asked 23/4, 2019 at 0:08

4

I have a large dataset that I have to generate CSV and PDF for. With CSV, I use this guide: https://docs.djangoproject.com/en/3.1/howto/outputting-csv/ import csv from django.http import Streaming...
Tva asked 10/8, 2020 at 14:18

1

Goal Implement a Shiny app to efficiently visualize and adjust uploaded data sets. Each set may contain 100000 to 200000 rows. After data adjustments are done, the adjusted data can be downloaded. ...
Tomchay asked 23/7, 2020 at 14:21

2

I am not a survey methodologist or demographer, but am an avid fan of Thomas Lumley's R survey package. I've been working with a relatively large complex survey data set, the Healthcare Cost and Ut...
Lynlyncean asked 4/2, 2016 at 20:13

2

I am working on a C++ project that needs to perform FFT on large 2D raster data (10 to 100 GB). In particular, the performance is quite bad when applying FFT for each column, whose elements are n...
Poona asked 8/8, 2018 at 5:44
