distributed Questions
4
Solved
I am developing a Java library for communication via HTTP, and I want to test its reliability and performance in case of network problems such as packet loss, high latency, low bandwidth, and conge...
Crapshooter asked 21/10, 2011 at 11:26
8
Solved
When is it a good idea to use something like a CRDT instead of Paxos or Raft?
Lightyear asked 28/6, 2012 at 23:32
3
Solved
The questions below are intended to be serious rather than frivolous. I lack experience in distributed systems, but I do understand how Basic Paxos works and why leader selection is useful. Unfortu...
Samite asked 22/5, 2014 at 5:49
7
When logging across multiple modules in Vert.x, a basic requirement is being able to correlate all the logs for a single request.
Since Vert.x is asynchronous, what will be the best...
Runt asked 11/7, 2017 at 22:44
2
I am loading my pre-trained Keras model and then trying to parallelize over a large amount of input data using Dask. Unfortunately, I'm running into some issues with this relating to how I'm creating my...
Festival asked 20/5, 2020 at 23:49
3
Solved
Info:
$ julia --version
julia version 1.6.0
$ lscpu
~/root/MyPackage$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits vi...
Jollity asked 20/1, 2022 at 19:6
4
While running distributed training on 4 A6000 GPUs, I get the following error:
[E ProcessGroupNCCL.cpp:630] [Rank 3] Watchdog caught collective operation timeout: WorkNCCL(OpType=BROADCAST, Timeo...
Blagoveshchensk asked 24/10, 2021 at 4:43
1
Solved
I am aware there are a multitude of questions about running parallel for loops in Julia, using @threads, @distributed, and other methods. I have tried to implement the solutions there with no luck....
Evaporation asked 10/12, 2022 at 19:1
6
Solved
I'm trying to set up a distributed load testing environment using JMeter. I need to set up the remote clients using something portable like a Linux Live CD, but whenever I attempt to launch jmeter-...
Malines asked 30/6, 2010 at 14:56
4
Solved
I am currently trying to understand Lamport timestamps. Consider two processes P1 (producing events a1, a2,...) and P2 (producing events b1, b2,...). Let C(e) denote the Lamport timestamp associate...
Cutworm asked 20/6, 2015 at 18:51
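For readers new to the topic, here is a minimal sketch of the standard Lamport clock rules, using two processes like the P1 and P2 in the question above. The class name `LamportClock` and the specific event sequence are illustrative assumptions, not taken from the question.

```python
class LamportClock:
    """Logical clock for one process (hypothetical helper for illustration)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: increment the counter and use it as the timestamp.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # On receipt, jump past both the local clock and the message timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


p1, p2 = LamportClock(), LamportClock()
c_a1 = p1.tick()          # local event a1 on P1
c_a2 = p1.send()          # a2: P1 sends a message to P2
c_b1 = p2.tick()          # local event b1 on P2
c_b2 = p2.receive(c_a2)   # b2: P2 receives the message sent at a2
print(c_a1, c_a2, c_b1, c_b2)
```

In this run a2 happens before b2, and C(a2) < C(b2) as the clock condition requires. The converse does not hold: C(a1) = C(b1) here even though a1 and b1 are concurrent.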
2
Solved
I've launched a lot of tasks, but some of them (763 tasks) haven't finished; they are in a PENDING state, but the system isn't processing anything...
Is it possible to retry these tasks by giving celery the t...
Ptolemaeus asked 28/2, 2011 at 10:20
4
Solved
I've been examining several DLMs, but most of them are written in Java or C++. I can't seem to find any service specifically implemented for .NET. Any ideas or recommendations for distributed synchron...
Behring asked 20/12, 2010 at 12:1
2
I've been trying to implement and understand the working of IPFS and have a few things that aren't clear.
Things I've tried:
Implemented IPFS on my system and stored files on it. Even if I de...
Growler asked 23/11, 2017 at 7:43
1
Assuming data is stored the same way in the database and in a distributed cache (i.e. no join needed), is it still the case that the distributed cache is much faster than accessing the database directly?
As far as I und...
Windbound asked 19/12, 2014 at 8:48
8
Solved
Phase 2. (a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbe...
Bennion asked 26/4, 2015 at 17:27
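The Phase 2(a) rule quoted above is cut off in the excerpt. The sketch below assumes the usual statement of Basic Paxos for the elided part: the proposer uses the value of the highest-numbered proposal reported in the responses, or its own value if none was reported. `Promise` and `choose_accept_value` are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple, Optional


class Promise(NamedTuple):
    """An acceptor's reply to prepare(n): its highest accepted proposal, if any."""
    accepted_n: Optional[int]
    accepted_value: Optional[str]


def choose_accept_value(promises: List[Promise],
                        num_acceptors: int,
                        own_value: str) -> Optional[str]:
    """Return the value to send in accept requests, or None without a majority."""
    # The proposer may only proceed once a majority of acceptors has responded.
    if len(promises) <= num_acceptors // 2:
        return None
    # Keep only the responses that report a previously accepted proposal.
    prior = [p for p in promises if p.accepted_n is not None]
    if not prior:
        # No acceptor in the majority has accepted anything: free choice.
        return own_value
    # Otherwise adopt the value of the highest-numbered accepted proposal.
    return max(prior, key=lambda p: p.accepted_n).accepted_value


print(choose_accept_value([Promise(1, "a"), Promise(2, "b")], 3, "x"))
```

Adopting the highest-numbered accepted value is what keeps a later proposer from overwriting a value that may already have been chosen.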
2
As this decentralisation wave is taking place around the digital world, I was wondering how you can remove some content that you just uploaded on a decentralized network.
As I understand, more and ...
Ilowell asked 3/11, 2021 at 11:13
2
Solved
So I have a text file bigger than my RAM, and I would like to create a dataset in PyTorch that reads it line by line, so I don't have to load it all into memory at once. I found pytorch IterableDatas...
Teleran asked 30/10, 2021 at 9:32
1
What is the difference between the two? The protocol on the surface looks different, but I would like to understand what is really different between the two and why they are not equivalent.
Deadening asked 3/8, 2021 at 17:16
1
I was asked this question in an interview and was unable to answer it.
How does FB Messenger order messages on the user side when two messages are concurrent, in order to avoid view differences in di...
Pythian asked 29/1, 2021 at 11:22
3
Solved
There are several resources about distributed systems, such as the MongoDB documentation, that recommend an odd number of nodes in a cluster.
What are the benefits of having an odd number of nodes?
Graiggrail asked 12/11, 2019 at 16:59
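A quick way to see the arithmetic behind that recommendation, assuming the cluster needs a strict majority of nodes to make progress (the function names here are illustrative, not from any particular database):

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return n - majority(n)


for n in range(3, 8):
    print(n, majority(n), tolerated_failures(n))
```

Under this assumption, going from 3 to 4 nodes (or from 5 to 6) raises the majority size without raising the number of failures the cluster can survive, so the extra even node adds cost but no fault tolerance.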
4
Solved
As the paper says:
Election Safety: at most one leader can be elected in a given term. §5.2
However, there may be more than one leader in the system. Raft can only promise that there is only o...
Theron asked 10/7, 2014 at 16:14
2
Solved
I have a distributed application that uses ZooKeeper for leader election. Only the elected leader can commit to the database. I recently discovered that there is a potential situation which could l...
Dehydrogenase asked 27/5, 2016 at 8:14
5
Solved
I'm using a Microsoft Azure Service Bus queue to process calculations and my program runs fine for a few hours but then I start to get this exception for every message that I process from then on. ...
Eldrida asked 24/1, 2015 at 15:23
1
I'm having some trouble with the new TensorFlow option that allows us to run distributed TensorFlow.
I would just like to run 2 tf.constant with 2 tasks, but my code never ends. It looks like ...
Wondering asked 18/5, 2016 at 8:31
7
Solved
I am currently working on a project using Hadoop DFS.
I notice there is no search or find command in Hadoop Shell. Is there a way to search and find a file (e.g. testfile.doc) in Hadoop DFS?
Do...
Bobbysoxer asked 9/6, 2011 at 18:31