Role cannot be reached by the host system (Azure Worker Role)

I'm using Worker Role machines (Medium: 2 cores with 3.5 GB of RAM) to do heavy processing, and I'm able to use 100% of the CPU (both cores) and 85% of the RAM.

During this work, where each job takes around 20 to 40 minutes, Azure decides the machine is unhealthy and stops all my work. In the Portal I see my worker instances showing the message "Waiting for the status (Role cannot be reached by the host system)".

Does anyone know a workaround that doesn't involve: 1) using a more powerful role with cores that I will not use, or 2) reducing the CPU usage of my application (100% CPU usage is what we want)?

Thanks in advance, Rui

Sapor answered 17/7, 2013 at 8:48 Comment(5)
This sounds like some kind of deadlock. Could it be that Azure makes some callback to your instances, and those calls block because of the high load? - Hubblebubble
Deadlock where? It doesn't seem like a deadlock. It seems the machine is using all of its resources to run my big task and has none left for its own applications (the ones checking connectivity, or other measurement tools...). - Sapor
I dunno where exactly. Something like a "status check" handler taking too long. By the way, you can get the Azure internal logs and try to read them. - Hubblebubble
Hi, thanks for your comment. This is probably nothing that I can control... I think this is something MS should take care of. I'm using a machine with 2 cores and I will try to use its resources at 100%; if this doesn't leave resources for MS to run the processes that monitor the health of the machine, that should be handled by MS, for instance by running those processes with higher priority... I would just need a confirmation of this "bug" from MS ;) - Sapor
They will request logs anyway, which means you could just as well get those logs and read them yourself. - Hubblebubble

Try this:

Thread.CurrentThread.Priority = ThreadPriority.BelowNormal

Maybe some other things (processes, threads) need a lower priority as well, but this should keep the CPU utilization at 100%.

For (external) processes, start them with the following code (this is VB, but you should be able to convert it to your language):

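' Start the external process, then lower its priority class.
' PriorityClass can only be set once the process has been started.
Dim myprocess As New System.Diagnostics.Process()
myprocess.StartInfo.FileName = "C:\the\path\to\the\the\process.exe"
myprocess.Start()
myprocess.PriorityClass = System.Diagnostics.ProcessPriorityClass.BelowNormal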

You could set the priority of the current process of the worker role, but this might depend on other processes, so watch out. It's better to lower the priority of the demanding process; this won't slow it down unless there is other work to be performed.

Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.AboveNormal
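
If the CPU-heavy process is not started directly by your own code (for example, it is launched by another executable), one possible approach is to look it up by name once it is running and lower its priority from the .NET side. The sketch below is in VB to match the code above. The module name, the LowerPriority helper, and the process name "other" are hypothetical and only for illustration, and it assumes the role has sufficient rights to change the priority of the target process.

```vb
Imports System
Imports System.Diagnostics

Module LowerPriorityByName
    ' Lowers the priority class of every running process with the given name
    ' (the name is given without the ".exe" extension).
    ' Returns the number of processes whose priority was changed.
    Function LowerPriority(processName As String) As Integer
        Dim changed As Integer = 0
        For Each p As Process In Process.GetProcessesByName(processName)
            Try
                p.PriorityClass = ProcessPriorityClass.BelowNormal
                changed += 1
            Catch ex As Exception
                ' The process may have exited already, or access may be denied; skip it.
            Finally
                p.Dispose()
            End Try
        Next
        Return changed
    End Function

    Sub Main()
        ' "other" is a hypothetical process name used for illustration.
        Console.WriteLine(LowerPriority("other"))
    End Sub
End Module
```

As far as I know, on Windows a child process started without an explicit priority flag also inherits the Below Normal (or Idle) priority class of its parent, so lowering the priority of the launcher right after starting it may already be enough. This is worth verifying on the role itself.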
Mcinerney answered 18/7, 2013 at 17:35 Comment(5)
Hi, thanks for your comment, but I can't: the .exe that I'm running, which consumes 100% CPU, is a native app and I can't change the code... :( - Sapor
Your solution might work, thanks, but as I said in the post, I would like a solution that doesn't take CPU away from my application. The best option would be for the cloud host's host_heart_beat.exe to run with higher priority... - Sapor
The heartbeat is mainly handled in the current process, but it can depend on other processes, so it's better to lower the priority of the working process (see edit), as it will only slow down if there is other work to do (this should keep CPU utilization at 100%). - Mcinerney
Hi, thanks very much for your comment. Do you know any way to set the PriorityClass using Fortran? What is happening is that the .NET code launches a Fortran app.exe, and that one launches other.exe, which consumes 100% CPU. From the .NET code I don't have access to other.exe. Thanks in advance. - Sapor
I'm happy to help. I don't know Fortran; you should open a new question named 'How to change the priority of a child process in Fortran?'. You didn't tag the question with Fortran or mention it, so please mark the answer as accepted if you are pleased with the solution. - Mcinerney

This is something that is affecting a service I'm running in Windows Azure as well. I have just tried manually setting the priority of WaAppAgent to High. Hopefully that helps.

But really, this shouldn't be my problem. Sometimes my database is running at 100% CPU, and that is the WORST possible time for a restart.

I really don't want to overprovision resources just so some heartbeat will be happy. Do the VM instances have a heartbeat event as well? Maybe the solution is to switch to using a VM instead of a PaaS role?

Nasa answered 17/11, 2013 at 23:35 Comment(2)
I have never tested with a VM; it could be a good idea... I hope MS can fix this, if they haven't fixed it already. Today I'm testing the same scenario I had before, and I see that my application can't reach the same 100% CPU but stays around 98%, which is excellent. I don't know if they already fixed this "situation" or if it was just luck :) - Sapor
Remember that a VM will likely cost you more and require more maintenance than a cloud service performing the exact same work. - Embolectomy
