The session for this agent already exists

I am using TFS to execute a nightly build that includes several steps that use the TFS Test Agent. I am running the latest version of TFS/Test Agent (2015 Update 3), and no other builds are running at that time. Often (maybe half the time), when the nightly job runs, the step "Visual Studio Test Agent Deployment" fails with the following error:

The job has been abandoned because agent Agent-XXX did not renew the lock. Ensure agent is running, not sleeping, and has not lost communication with the service.

This appears to be caused by the following error in the Test Agent's log file (under _diag):

The session for this agent already exists. Sleeping for 30 seconds before next retry.

Microsoft.TeamFoundation.DistributedTask.WebApi.TaskAgentSessionConflictException: The task agent Agent-XXX already has an active session for owner XXX.

This issue is directly referenced here, and indirectly talked about here.

The solution I've found is to restart the server that the test agent is running on. This clears any dead sessions, and after the server starts back up, the tests run just fine. I think this is effectively what is being done in the previously mentioned post: the result of resetting the configs is that the service is restarted.

While this is presented as a solution in the linked article, it is only temporary. Even after the server has been restarted and the build runs successfully, the issue reappears the next day, necessitating manual intervention to get the build to run.

I could schedule a task to reset the service or even restart the server directly before the nightly build is run, but it strikes me as a bandage rather than a fix. Has anyone experienced this issue before, and if so is there any way to prevent it from occurring in the first place?

Update 1

I set up a build that runs 5 minutes before my main tests and executes a batch script to restart all the servers hosting my test agents. This is a workaround, but one that seems to resolve the issue. Hopefully someday someone can come up with a better solution than this, but for now, it's how I have to run automated testing in TFS.
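For anyone wanting to reproduce the workaround, here is a minimal sketch of the restart step in Python rather than a batch file. The host names are hypothetical placeholders, and it assumes the account running the script has permission to restart the remote machines with the standard Windows `shutdown` command. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List, Sequence


def build_restart_commands(hosts: Sequence[str], delay_seconds: int = 0) -> List[List[str]]:
    """Build one Windows `shutdown` command per test agent host.

    /r restarts, /f forces running applications to close,
    /m \\\\HOST targets a remote machine, /t sets the delay in seconds.
    """
    return [
        ["shutdown", "/r", "/f", "/m", "\\\\" + host, "/t", str(delay_seconds)]
        for host in hosts
    ]


def restart_hosts(hosts: Sequence[str], dry_run: bool = True) -> List[List[str]]:
    """Print the restart commands, or execute them when dry_run is False."""
    commands = build_restart_commands(hosts)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical host names; replace with the servers hosting the test agents.
    restart_hosts(["TESTAGENT01", "TESTAGENT02"], dry_run=True)
```

Scheduling this a few minutes before the nightly build mirrors the setup described above; the lead time has to be long enough for the servers to finish booting.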

Update 2

I have three servers now, and all three exhibit the same issue, though it is hard to pin down exactly when it occurs. Scaling up the workaround without creating downtime is proving to be quite challenging.

Update 3

A better day came: I upgraded TFS to 2018 and the build agent to the latest version, and this issue no longer occurs. I think it's a bug in the old build agent. I still don't have a solution for the original version of the build agent...

Personally answered 25/1, 2017 at 18:11 Comment(5)
What's the result if there isn't a Visual Studio Test Agent Deployment step? The build agent can be run as a service or in interactive mode; does the issue occur in both modes? Try to set up a new build agent on the same or another machine and check whether the issue still persists. - Ree
@starain-MSFT Those are excellent suggestions/questions and probably should've been addressed in my original post. The build runs fine if there is no "Test Agent Deployment" step (excluding testing). I tried running the agent in both modes when I first set it up, but the issue is a little hard to pin down: it worked fine once or twice in interactive mode, but that doesn't necessarily indicate the issue is not present. Right now I don't have any other servers to install another agent on :(, maybe I'll try with my desktop, though seeing the issue may take a few days. - Personally
Multiple build agents can be installed on the same machine, so also try to set up a new build agent on the same machine and check whether the issue persists. - Ree
@starain-MSFT Yes, it's been a challenge to get time on this server, as it has been our only build server, and when I'm testing it I can't predictably run our nightly builds. However, I have a new build server now, and I will verify whether the issue occurs on it (so far it has not). - Personally
@starain-MSFT I've tried this on multiple servers and the issue seems to persist, so for now I am using the workaround proposed in my Update. - Personally

It sounds like an Agent.Listener.exe process was already running somewhere on the machine, maybe as a service (not in a logged-in user session).
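One way to check for such a stray process is to list every `Agent.Listener.exe` instance together with the session it runs in. The sketch below is an assumption about how one might do that from Python with the standard Windows `tasklist` command; a session name of `Services` would point at an agent running as a service. The function names are made up for illustration.

```python
import csv
import io
import subprocess
from typing import List, Tuple

IMAGE_NAME = "Agent.Listener.exe"


def parse_tasklist_csv(output: str) -> List[Tuple[int, str]]:
    """Extract (pid, session_name) pairs from `tasklist /FO CSV /NH` output.

    Lines that are not process rows, such as the "INFO: No tasks..."
    message printed when nothing matches, are skipped.
    """
    found = []
    for row in csv.reader(io.StringIO(output)):
        # Columns: Image Name, PID, Session Name, Session#, Mem Usage
        if len(row) >= 3 and row[0].lower() == IMAGE_NAME.lower():
            found.append((int(row[1]), row[2]))
    return found


def find_listeners() -> List[Tuple[int, str]]:
    """Run tasklist (Windows only) and return the matching agent processes."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH", "/FI", f"IMAGENAME eq {IMAGE_NAME}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_tasklist_csv(result.stdout)


if __name__ == "__main__":
    for pid, session in find_listeners():
        print(f"{IMAGE_NAME} pid={pid} session={session}")
```

More than one row for the same agent, or a row in an unexpected session, would be consistent with the session conflict described in the question.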

Note: if an agent process is abruptly terminated while it has an active session, the session will eventually time out (after 5 minutes, I think). On startup, if an agent encounters a session conflict, it will retry for up to 5.5 minutes (I think) before giving up, which is enough time for an abruptly terminated session to expire.
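To make the timing argument concrete, here is a small simulation. It is not the agent's actual code: the 30-second retry interval comes from the log message in the question, and the 5-minute expiry and 5.5-minute retry budget are the approximate figures mentioned above, all treated as assumptions.

```python
from typing import Optional


def first_successful_attempt(
    stale_session_ttl: float,
    retry_interval: float = 30.0,   # "Sleeping for 30 seconds before next retry"
    retry_budget: float = 330.0,    # roughly 5.5 minutes of retries
) -> Optional[float]:
    """Return the seconds after startup at which session creation succeeds.

    stale_session_ttl is how long the conflicting session stays alive after
    the new agent starts. Returns None if the retry budget runs out first.
    """
    elapsed = 0.0
    while elapsed <= retry_budget:
        if elapsed >= stale_session_ttl:
            return elapsed  # the old session has expired, so no conflict
        elapsed += retry_interval
    return None


if __name__ == "__main__":
    # Session left behind by a process killed just before startup (5 min to live).
    print(first_successful_attempt(300.0))
    # Session held by a process that is still running and keeps renewing it.
    print(first_successful_attempt(float("inf")))
```

Under these assumptions an orphaned session clears within the retry window, so a conflict that outlasts the window suggests the other session is still being actively held.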

I'm going to go ahead and close this and assume a process was running somewhere. We haven't had any issues in this area and haven't heard any other reports, so I don't think there is an issue here with the agent. If you find a repro, or it looks like I'm wrong, then please reopen.

Pyromancy answered 23/7, 2022 at 4:43 Comment(0)
