data-partitioning Questions

7

Solved

I would like to convert a single array into a group of smaller arrays, based on a variable. So, [0,1,2,3,4,5,6,7,8,9] would become [0,1,2], [3,4,5], [6,7,8], [9] when the size is 3. My current approach: $ids=...
Tender asked 29/8, 2017 at 21:28
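
A minimal sketch of the chunking this question describes, written in Python rather than the asker's PHP. The function name `chunk` is illustrative and not taken from the question.

```python
def chunk(items, size):
    """Split items into consecutive sublists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the list in strides of `size`; the last slice may be shorter.
    return [items[i:i + size] for i in range(0, len(items), size)]


print(chunk(list(range(10)), 3))  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

The final chunk simply holds whatever is left over, so no padding is needed.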

12

Solved

I'd like to partition a list into a list of lists, by specifying the number of elements in each partition. For instance, suppose I have the list {1, 2, ... 11}, and would like to partition it such...
Sedge asked 8/9, 2009 at 20:6

5

Solved

Say I have a list L. How can I get an iterator over all partitions of K groups? Example: L = [ 2,3,5,7,11, 13], K = 3 List of all possible partitions of 3 groups: [ [ 2 ], [ 3, 5], [ 7,11,13] ] ...
Sparing asked 21/8, 2013 at 9:8
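
The excerpt is cut off, so the exact requirement is unclear. Assuming the groups must be non-empty and must keep the original order of L (as in the one example shown), a possible sketch picks K-1 cut points with `itertools.combinations`. The name `contiguous_partitions` is illustrative.

```python
from itertools import combinations


def contiguous_partitions(items, k):
    """Yield every way to split items into k non-empty contiguous groups."""
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(items)
    # Choosing k-1 distinct cut positions between elements fixes one partition.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        yield [items[a:b] for a, b in zip(bounds, bounds[1:])]


for groups in contiguous_partitions([2, 3, 5, 7, 11, 13], 3):
    print(groups)
```

Because this is a generator, partitions are produced lazily, which matches the request for an iterator.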

2

Solved

I have a large JSON file with, I'm guessing, 4 million objects. Each top level has a few levels nested inside. I want to split that into multiple files of 10000 top level objects each (retaining the ...
Tumefaction asked 13/4, 2018 at 2:52

14

Solved

Let's say I have a list, and a filtering function. Using something like >>> filter(lambda x: x > 10, [1,4,12,7,42]) [12, 42] I can get the elements matching the criterion. Is t...
Vole asked 2/1, 2011 at 13:34
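
The excerpt ends before the actual request. Assuming the goal is to get both the matching and the non-matching elements in a single pass, one possible sketch follows. The helper name `partition` is illustrative and not a built-in.

```python
def partition(predicate, items):
    """Return (matching, rest), preserving the original order in both lists."""
    matching, rest = [], []
    for item in items:
        # Route each element to one of the two lists based on the predicate.
        (matching if predicate(item) else rest).append(item)
    return matching, rest


matching, rest = partition(lambda x: x > 10, [1, 4, 12, 7, 42])
print(matching)  # [12, 42]
print(rest)      # [1, 4, 7]
```

The predicate is evaluated once per element, unlike calling `filter` twice with the condition and its negation.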

3

Solved

I am developing an application that uses IDocumentClient to query CosmosDB. My GenericRepository supports querying by Id and Predicate. I ran into trouble when changing the database from SqlServ...

2

Solved

I searched the web and my algorithms book to find out whether Lomuto's version of the QuickSort partition is stable (I know that Hoare's version is unstable), but I didn't find a precise an...
Kaz asked 10/7, 2011 at 12:30
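
A small counterexample can settle the stability question. The sketch below assumes the textbook Lomuto formulation (last element as pivot, `<=` comparison) and compares records only by their first field. The function and variable names are illustrative.

```python
def lomuto_partition(a, lo, hi, key=lambda x: x):
    """Partition a[lo..hi] in place around a[hi]; return the pivot's final index."""
    pivot = key(a[hi])
    i = lo
    for j in range(lo, hi):
        if key(a[j]) <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    # Move the pivot into its final position.
    a[i], a[hi] = a[hi], a[i]
    return i


# Two records share key 2; "a" starts before "b".
records = [(2, "a"), (2, "b"), (1, "c")]
p = lomuto_partition(records, 0, len(records) - 1, key=lambda r: r[0])
print(p, records)  # 0 [(1, 'c'), (2, 'b'), (2, 'a')]
```

After the partition, the two records with key 2 appear in reversed order, so this formulation does not preserve the relative order of equal keys.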

3

Solved

What is the difference between DataFrame repartition() and DataFrameWriter partitionBy() methods? I assume both are used to "partition data based on dataframe column"? Or is there any difference?
Kinky asked 4/11, 2016 at 6:10

1

Solved

I am trying to optimize a join query between two Spark dataframes, let's call them df1, df2 (join on common column "SaleId"). df1 is very small (5M) so I broadcast it among the nodes of the Spark cluster...
Katlynkatmai asked 2/7, 2019 at 17:28

0

I have a large database that will use partitioned column-store tables in a redesign. Is it possible to specify the partition in the generated SQL with Entity Framework Core 2.2? This is for an Azu...
Lessee asked 25/6, 2019 at 2:49

2

With the caret package, when creating a data partition of 75% training and 25% test, we use: inTrain <- createDataPartition(y=spam$type, p=0.75, list=FALSE) Note: dataset is named spam and target variab...
Ung asked 20/7, 2016 at 20:5

3

I am using Scala on Flink with the DataSet API. I want to re-partition my data across the nodes. Spark has a function that lets the user re-partition the data with a given numberOfPartitions parame...
Mantelpiece asked 14/1, 2019 at 23:1

2

Solved

I have a problem that I feel could be solved using lag/lead + partitions but I can't wrap my head around it. Clients are invited to participate in research projects every two years (approx.). A num...
Pogge asked 14/11, 2018 at 10:49

3

Would it be possible to automatically split a table into several files based on column values if I don't know how many different key values the table contains? Is it possible to put the key value i...
Giacomo asked 6/3, 2017 at 22:32

6

Solved

I have a Set of numbers : Set<Integer> mySet = [ 1,2,3,4,5,6,7,8,9] I want to divide it into 2 sets of odds and evens. My way was to use filter twice : Set<Integer> set1 = mySet.s...
Fibre asked 6/2, 2018 at 17:2

1

Solved

As part of a security product I have a high-scale cloud service (Azure worker role) that reads events from an event hub, batches them to ~2000 and stores them in blob storage. Each event has a MachineId (the...
Euthanasia asked 20/1, 2018 at 10:34

6

Solved

I'm searching for an algorithm that generates all permutations of fixed-length partitions of an integer. Order does not matter. For example, for n=4 and length L=3: [(0, 2, 2), (2, 0, 2), (2, 2, ...
Prenatal asked 10/11, 2010 at 16:27
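
The tuples in the example, (0, 2, 2), (2, 0, 2), (2, 2, ..., suggest that the asker wants every length-L tuple of non-negative integers summing to n. Under that assumption, a recursive generator is one possible sketch. The name `compositions` is illustrative.

```python
def compositions(n, length):
    """Yield every tuple of `length` non-negative integers that sums to n."""
    if length < 1:
        raise ValueError("length must be at least 1")
    if length == 1:
        yield (n,)
        return
    # Fix the first entry, then distribute the remainder over the other slots.
    for first in range(n + 1):
        for rest in compositions(n - first, length - 1):
            yield (first, *rest)


for t in compositions(4, 3):
    print(t)
```

For n=4 and L=3 this yields 15 tuples, each exactly once, including the three shown in the question.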

7

I have a hard time translating QuickSort with Hoare partitioning into C code, and can't find out why. The code I'm using is shown below: void QuickSort(int a[],int start,int end) { int q=HoarePar...
Uncounted asked 25/8, 2011 at 22:51
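
The asker's C code is truncated, so its actual bug cannot be identified from the excerpt. For comparison, here is a sketch of one common formulation of QuickSort with Hoare partitioning (middle element as pivot), written in Python rather than C. With this scheme the recursion covers [lo, p] and [p + 1, hi]; excluding index p, as is done with Lomuto partitioning, would be incorrect here.

```python
def hoare_partition(a, lo, hi):
    """Rearrange a[lo..hi] so that a[lo..j] <= pivot <= a[j+1..hi]; return j."""
    pivot = a[(lo + hi) // 2]
    i, j = lo - 1, hi + 1
    while True:
        # Advance i to the next element that is not smaller than the pivot.
        i += 1
        while a[i] < pivot:
            i += 1
        # Retreat j to the next element that is not larger than the pivot.
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def quicksort(a, lo=0, hi=None):
    """Sort list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = hoare_partition(a, lo, hi)
        quicksort(a, lo, p)       # p is included on the left side
        quicksort(a, p + 1, hi)


data = [5, 3, 8, 1, 9, 2, 7, 3]
quicksort(data)
print(data)  # [1, 2, 3, 3, 5, 7, 8, 9]
```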

1

Solved

From the documentation: For bootstrap samples, simple random sampling is used. For other data splitting, the random sampling is done within the levels of y when y is a factor in an attempt to bala...
Raffaello asked 20/11, 2016 at 21:42

4

Solved

I'm having trouble formulating a query for the following problem: For pair values that have a certain score, how do you group them in a way that will only return distinct pair values with the best r...
Radiancy asked 1/11, 2016 at 17:17

3

Solved

I have an array which I need to divide up into 3-element sub-arrays. I wanted to do this with iterators, but I end up iterating past the end of the array and segfaulting even though I don't derefer...
Lumbar asked 5/4, 2016 at 11:40

2

Solved

I'm trying to query a table in Windows Azure storage and was initially using the TableQuery.CombineFilters in the TableQuery<RecordEntity>().Where function as follows: TableQuery.CombineFilt...
Rheotropism asked 16/1, 2014 at 16:42

6

I'm trying to solve one of the Project Euler problems. As a consequence, I need an algorithm that will help me find all possible partitions of a set, in any order. For instance, given the set 2 3 ...
Brady asked 24/10, 2009 at 18:25
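
The example in the excerpt is cut off. Assuming the asker wants every way to split a set into non-empty, unordered blocks, a recursive generator is one possible sketch. The name `set_partitions` is illustrative.

```python
def set_partitions(items):
    """Yield every partition of items as a list of non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms a block of its own...
        yield [[first]] + partition
        # ...or it joins one of the existing blocks.
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]


for p in set_partitions([2, 3, 5]):
    print(p)
```

The number of results grows as the Bell numbers (5 for three elements, 15 for four), so this is only practical for small sets.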

6

I need a way of storing sets of arbitrary size for fast query later on. I'll be needing to query the resulting data structure for subsets or sets that are already stored. === Later edit: To clarif...
Varus asked 4/6, 2014 at 10:33

1

Solved

Hello, following on from my question "Windows Azure table access latency Partition keys and row keys selection" about the way I have organised data in my Azure storage account. I have a table storage...
Laminous asked 16/1, 2014 at 11:29
