Intermittent issues with ClamAV clamd INSTREAM on socket

I've got an AWS Lambda function running NodeJS code to stream files from S3 to ClamAV running on an EC2 instance.
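
For context, here is a minimal sketch of what the Lambda side sends to clamd. It is not my exact code: the helper names (frameChunk, buildInstream, scanBuffers), the host/port arguments, and the 30 second timeout are made up for illustration. The framing follows the clamd documentation: the INSTREAM command, then chunks each prefixed with a 4-byte length in network byte order, then a zero-length chunk to end the stream.

```javascript
const net = require('net');

// Prefix one chunk with its length as a 4-byte big-endian unsigned integer.
function frameChunk(data) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(data.length, 0);
  return Buffer.concat([header, data]);
}

// Build the full byte sequence for one INSTREAM request.
// 'z' means the command is terminated by a NUL byte.
function buildInstream(chunks) {
  const parts = [Buffer.from('zINSTREAM\0', 'latin1')];
  for (const chunk of chunks) {
    // Skip empty chunks: a zero length would end the stream early.
    if (chunk.length > 0) parts.push(frameChunk(chunk));
  }
  parts.push(Buffer.alloc(4)); // zero-length chunk terminates the stream
  return Buffer.concat(parts);
}

// Hypothetical helper: send the chunks to clamd and resolve with its reply.
// The timeout value is an assumption, not something taken from my config.
function scanBuffers(host, port, chunks, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port });
    let reply = '';
    socket.setTimeout(timeoutMs, () => socket.destroy(new Error('clamd timed out')));
    socket.on('connect', () => socket.write(buildInstream(chunks)));
    socket.on('data', (d) => { reply += d.toString('utf8'); });
    socket.on('error', reject);
    // clamd closes the connection after replying to a single command.
    socket.on('close', () => resolve(reply.replace(/\0+$/, '')));
  });
}
```

In the real function the chunks come from the S3 read stream rather than an in-memory array, but the bytes on the wire should look the same.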

Generally (about 75% of the time) the system works, but often (especially when multiple files are being scanned from different Lambda containers) clamd threads get stuck on INSTREAM.

Once a thread has been in INSTREAM for 25-30 seconds it does not seem to be able to recover. When it has been QUEUEDSINCE 350 seconds it is killed off. I can't figure out how either of these numbers relate to any value in my config.

I'm struggling to find any sign of an error in the logs - the number of INSTREAM requests matches the number of complete scans:

$ sudo grep -c "got command INSTREAM" /var/log/clamav/clamav.log
129
$ sudo grep -c "Chunks complete" /var/log/clamav/clamav.log
129
$ sudo grep -c "Scanthread: connection shut down" /var/log/clamav/clamav.log
129

...okay, now that I look a little more deeply into the logs, it just takes a lot longer for some files to be scanned. When I do a batch of 16 files with Lambda concurrency restricted to 7, the first 7 files are scanned within a few seconds. The next file begins scanning soon after and gets to "Chunks complete" within a second, but takes 23 seconds to reach "Scanthread: connection shut down". From here on it just gets worse (1:24, 1:45...) and then the 3rd batch of 7 files takes over 3 minutes to scan.

[Image: performance on AWS m3.medium]

If I give the system a few minutes to settle down and let all the threads die off, the same files that took over 3 minutes now take about 5-7 seconds.

If I run the same test on a faster machine the performance improves, but the issue is still there:

[Image: performance on AWS m4.xlarge]

When threads get stuck at INSTREAM I can see that the files are still there:

$ ls -al /tmp
drwx------  2 clamav clamav     4096 Aug 29 16:52 clamav-493bdf893ce4d8d7763c00fee22d9d69.tmp
-rwx------  1 clamav clamav 25683921 Aug 29 16:52 clamav-5cdefd83d5531a03c7cf22fda37d133f.tmp
Abney answered 29/8, 2018 at 8:26

Comment (1)
One possible solution is to run everything in a Lambda container: engineering.upside.com/… (Abney)
