distributed-computing Questions

3

Solved

I would like to see a progress bar in a Jupyter notebook while running a compute task with Dask. I'm counting all values of the id column in a large CSV file (4 GB+). Any ideas? import dask.datafr...
Rhino asked 28/2, 2018 at 22:33

4

This is really dumb, but what does ZooKeeper do that Raft doesn't? I'm not talking about Zab but ZooKeeper itself. I get that Raft does leader election etc. with servers, but what's the point of ZooKeeper? Is ...
Rutilant asked 11/12, 2017 at 19:56

4

I'm trying to learn about the Distributed Hash Table (DHT) paradigm, as it fits into a P2P or fully distributed computing architecture. From a theoretical standpoint, once a cluster is established,...
Sturm asked 23/10, 2012 at 15:59

4

How can a distributed system be consistent and available (CA)? Because I would argue when a network partition occurs, CA cannot be possible in a way where every node of the network, even the partio...
Pean asked 28/11, 2017 at 19:13

7

Solved

I am copying the pyspark.ml example from the official document website: http://spark.apache.org/docs/latest/api/python/pyspark.ml.html#pyspark.ml.Transformer data = [(Vectors.dense([0.0, 0.0]),), ...

2

Solved

There are several similar-yet-different concepts in Spark-land surrounding how work gets farmed out to different nodes and executed concurrently. Specifically, there is: The Spark Driver node (sp...

7

In terms of RDD persistence, what are the differences between cache() and persist() in Spark?
Economics asked 11/11, 2014 at 17:14

4

Solved

I'm developing my insight into distributed systems and how to maintain data consistency across such systems, where business transactions cover multiple services, bounded contexts and network bou...
Viceregent asked 21/2, 2018 at 13:10

3

Solved

Suppose I have 2 machines with 4 GPUs each. Suppose that each instance of the training algorithm requires 2 GPUs. I would like to run 4 processes, 2 for each machine, each process using 2 GPUs. H...
Oleander asked 3/4, 2020 at 21:58

3

Solved

I am very confused between these two consistency models. Please give some timeline examples along with explanation. http://en.wikipedia.org/wiki/Consistency_model
Varicella asked 20/11, 2011 at 7:7

4

Can somebody please explain the following TensorFlow terms: inter_op_parallelism_threads and intra_op_parallelism_threads, or please provide links to the right source of explanation. I have condu...

1

I noticed that the docs do not have that function. Thus, it's unclear where one should be calling that. Does one have to: call it at the end of each worker code (i.e. inside of mp.spawn) or call i...

2

Solved

In my understanding, a leader sends an AppendEntries RPC to the followers, and if a majority of followers return success, the leader will commit this entry. It will commit this entry by applying it to i...
Carillo asked 10/12, 2020 at 8:18

6

I know Golang is very good at concurrency with its built-in support, but it seems to me that this support is not distributed, so what framework/library would allow us to write producer/consumer applicati...
Variegation asked 1/2, 2014 at 4:42

3

Solved

Following up from my previous question: Using "Cursors" for paging in PostgreSQL What is a good way to provide an API client with 1,000,000 database results? We are currently using Pos...
Amari asked 30/10, 2012 at 16:56

3

I have been having trouble finding an example of what use cases are suitable for Vector Clocks and Version Vectors, and how they might differ. I understand that they largely work in the same way, w...

3

Solved

How does a consensus algorithm like Paxos "guarantee safety (freedom from inconsistency)" when two generals prove the "impossibility of designing algorithms to safely agree"? When I consider the s...
Frescobaldi asked 13/2, 2013 at 18:0

2

Found this acronym in the docs of Ray Core, used for its main API server: [..] the head node needs to open several more ports: --port: Port of Ray (GCS server). The head node will start a GCS serv...

5

Solved

I'm developing a distributed application, and I have a SQLite database that must be shared between distributed servers. If I'm on serverA and change a SQLite row, this change must be in ...

5

I am looking for a Python package that can do multiprocessing not just across different cores within a single computer, but also with a cluster distributed across multiple machines. There are a lot...
Microeconomics asked 12/11, 2014 at 0:2

4

I've seen multiple issues about the: RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1614378083779/work/torch/lib/c10d/ProcessGroupNCCL.cpp:825, unhandled cuda error, NCCL version 2.7.8 nc...
Marker asked 25/3, 2021 at 20:28

2

I'm reading how the probabilistic data structure count-min-sketch is used in finding the top k elements in a data stream. But I cannot seem to wrap my head around the step where we maintain a heap ...

20

Solved

According to Learning Spark: "Keep in mind that repartitioning your data is a fairly expensive operation. Spark also has an optimized version of repartition() called coalesce() that allows avoidi...
Lining asked 24/7, 2015 at 12:49

3

Solved

I want to know if it would be possible to run an OpenMP program on multiple hosts. So far I have only heard of programs that can be executed on multiple threads, but all within the same physical computer....

2

My understanding of consistent hashing is that you take a key space, hash the key and then mod by say 360, and place the values in a ring. Then you equally space nodes on that ring. You pick the no...
Highspirited asked 4/11, 2021 at 15:16
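
For readers skimming this last question, the ring layout it describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not the asker's code: the excerpt is cut off before the node-selection rule, so the "first node at or clockwise after the key" rule used here is the common convention and an assumption. The node names, the `user:42` key, the choice of MD5, and all function names are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 360  # ring positions, following the "mod by say 360" in the question


def ring_position(name: str) -> int:
    """Hash a string onto the ring (MD5 is used only as a stable hash)."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


def build_ring(nodes):
    """Space the nodes equally around the ring, as the question describes."""
    step = RING_SIZE // len(nodes)
    return sorted((i * step, node) for i, node in enumerate(nodes))


def node_for_position(ring, position):
    """Assumed rule: pick the first node at or clockwise after the position."""
    positions = [pos for pos, _ in ring]
    idx = bisect.bisect_left(positions, position)
    return ring[idx % len(ring)][1]  # wrap around past the last node


def lookup(ring, key):
    """Map a key to its owning node via its ring position."""
    return node_for_position(ring, ring_position(key))


ring = build_ring(["node-a", "node-b", "node-c"])  # hypothetical node names
owner = lookup(ring, "user:42")                    # hypothetical key
```

Production systems usually hash the node identifiers onto the ring as well (often with several virtual nodes each) rather than spacing them by hand, so that adding or removing a node only moves the keys adjacent to it.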
