Azure Selenium Testing - "The STDIO streams did not close within 10 seconds of the exit event from process"

I have an Azure DevOps CI release that runs a massive number of Selenium tests on the same server at the same time. Typically it works great, but occasionally my Selenium test task will time out with this error:

2020-05-07T15:47:37.0692681Z Completed TestExecution Model...
2020-05-07T15:47:48.6637501Z The STDIO streams did not close within 10 seconds of the exit event from process 'C:\TFSAgent5_work_tasks\VSTest_ef087383-ee5e-42c7-9a53-ab56c98420f9\2.153.9\Modules\DTAExecutionHost.exe'. This may indicate a child process inherited the STDIO streams and has not yet exited.
2020-05-07T16:08:50.9254238Z ##[error]The task has timed out.

This typically occurs on a test rerun; I see it maybe once every 100 test runs. It's a killer issue because it locks up the test agent for the full task timeout (30 minutes in my case). A number of other posts point out that this can occur if you're not properly closing your Selenium driver. However, I believe I am, and in my case it works fine 99 times out of 100. This is the code I use to close my Selenium driver:

[AssemblyCleanup]
public static void Cleanup()
{
    try
    {
        driver.Close();
        driver.Quit();
    }
    catch (Exception e)
    {
        Debug.WriteLine(e.Message);
    }
}

There really aren't a lot of useful suggestions floating around. I think this issue is related to the load the test agent server (or test server) is under. When I run a smaller CI release (nightly), I never see this issue.

Has anyone experienced this issue under high load before? I wonder where that "10 seconds" comes from and whether it can be adjusted somehow. Is there an issue with the code I use to close the driver? Is there a better way of closing it that ensures I can still kill it even when it's locked up, maybe something I could add to my catch statement?
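
As a rough illustration of the kind of fallback being asked about, here is a minimal sketch, not a confirmed fix for the STDIO timeout. It assumes the driver process ID was recorded when the driver was started (for example from a driver service's ProcessId property, if the driver is created through a driver service). The class name DriverCleanup, the method QuitThenKill, and its parameters are hypothetical. The sketch gives Quit() a bounded amount of time, then kills the recorded process by ID if it is still running. Killing by ID rather than by process name matters here because several agents share one server, and a name-based kill would also hit the other agents' drivers.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper; none of these names come from the original post.
public static class DriverCleanup
{
    // quit: the normal shutdown call, e.g. () => driver.Quit()
    // driverProcessId: PID of the driver executable, captured at startup (assumption)
    // quitTimeout: how long to wait for the normal shutdown before giving up
    public static void QuitThenKill(Action quit, int driverProcessId, TimeSpan quitTimeout)
    {
        try
        {
            // Run Quit() on a worker thread so a hung call cannot block cleanup forever.
            bool finished = Task.Run(quit).Wait(quitTimeout);
            if (!finished)
            {
                Debug.WriteLine("Quit did not finish within " + quitTimeout);
            }
        }
        catch (AggregateException e)
        {
            // Quit() itself threw; log it and fall through to the kill step.
            Debug.WriteLine(e.InnerException?.Message ?? e.Message);
        }

        try
        {
            // GetProcessById throws ArgumentException if the process is no longer running.
            using (var process = Process.GetProcessById(driverProcessId))
            {
                process.Kill();
                process.WaitForExit(5000);
            }
        }
        catch (ArgumentException)
        {
            // The driver process already exited, so the normal shutdown worked.
        }
        catch (Exception e)
        {
            // For example, the process exited between lookup and Kill(), or access was denied.
            Debug.WriteLine(e.Message);
        }
    }
}
```

In the cleanup method above, this could be called as DriverCleanup.QuitThenKill(() => driver.Quit(), driverProcessId, TimeSpan.FromSeconds(30)), where driverProcessId is a value saved when the driver was created. Note that this kills only the driver process itself. Browser child processes may survive it; on .NET Core 3.0 and later, Process.Kill(true) terminates the whole process tree.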

Aftereffect answered 7/5, 2020 at 16:59 Comment(7)
What's the result if you remove driver.Close() and use a single Quit()/Dispose()? Also, it seems that you're using TFS; what's the TFS version? And how long do your tests usually take to execute? Sorry, but it's hard for me to reproduce the high-load scenario on my side. - Legation
@LanceLi-MSFT Actually, I used to do just a single Quit() (same issue). I've been switching it up multiple ways; recently I split the Close and Quit into separate try/catch blocks, but that doesn't seem to have helped (same issue still). As for naming, a year ago I upgraded TFS (on-premises) to the latest version and it now says "Azure DevOps Server Dev17.M153.5". Executions take about 20 minutes in total, and Selenium is about 9 minutes of that. I'm running 25 sets of tests simultaneously, and I have 3 build agent servers, each with 10 agents. Other than this issue, it works well. - Aftereffect
I am facing the same issue. It is intermittent. Have you found any solution for this? - Alveolus
Does this answer your question? Azure DevOps Pipeline test showing partially succeeded - Sedan
@Sedan No, it doesn't, and no, I haven't found a solution to this issue. It still occurs semi-regularly. - Aftereffect
I'm facing the same issue with a self-hosted runner; it works fine with a Microsoft-hosted runner. - Entomology
I'm running into a similar issue, but we're using gulp, not a Selenium test driver. This used to happen somewhat intermittently, but with a recent change it seems to be happening 9 times out of 10. Hopefully I can find something on this. - Squire

I had a similar issue with C# and Selenium WebDriver. The problem appeared on only one VM agent, which had been installed around 2 years ago as version 3.218.0. I upgraded it to 3.220.5, which didn't help. Two other agents with the same version on the same Azure VM worked fine (but they had been freshly installed as 3.220.5). So I removed the problematic agent and reinstalled it from scratch. After the reinstallation the problem disappeared. That's interesting because the version didn't change, and I have no clue why this helped. I also tried different NuGet timeout settings and TASKLIB_TEST_TOOLRUNNER_EXITDELAY, unfortunately without any effect.

Bacterin answered 13/7, 2023 at 10:45 Comment(0)

What helped me here was using a plain 'powershell' task instead of DotNetCoreCLI@2.

Problem disappeared with that approach:

- powershell: |
    dotnet test your.dll --no-restore --logger trx
  displayName: Tests
Bacterin answered 23/8, 2023 at 13:12 Comment(0)
