Antivirus detecting compiled C++ files as trojans

I installed a C++ compiler for Windows with MinGW and tried to make a simple program:

#include <iostream>
using namespace std;

int main() {
   cout << "Hello World!";
   return 0;
}

I saved it as try.cc, then opened cmd in that folder and ran g++ try.cc -o some.exe. It generated some.exe, but my antivirus (Avast) recognized it as malware. I thought it could be a false positive, but it specifically said it's a trojan.

I removed the file from the virus chest and uploaded it to "https://www.virustotal.com/". The result: https://static.mcmap.net/file/mcmap/ZG-AbGLDKwfpKnMxcF_AZVLQamyA/jC2oz.png

24 out of 72 engines detected it as malware and a lot of them as a trojan.

Is this a false positive? Why would it get detected as a trojan? If it is, how do I avoid getting this warning every time I make a new program?

Edit:

Thanks all for the help. I ran a full scan of my computer with two antivirus programs and everything seemed clean. I also scanned the MinGW folder and found nothing.

The problem keeps appearing each time I make a new C++ program. I tried modifying the code and the name, but the AV kept detecting it as a virus. Funny thing is that changing the code changed the type of virus the AV reported.

I'm still not 100% sure that the compiler is clean, so I don't know if I should ignore it and run the programs anyway. I downloaded MinGW from "https://osdn.net/projects/mingw/releases/".

If anyone knows how to be completely sure that the executables created are only false positives and not viruses, I would be glad if they shared it.

Edit 2:

It occurred to me that if the compiler is infected and is adding code, I might be able to see it by feeding the executable to a decompiler/disassembler. I downloaded a C++ decompiler I found, "snowman", and used it on the file. The problem is that the code went from 7 lines in the original source to 5265 in the decompiled output, and it is a bit hard to make sense of. If someone has some experience with reverse engineering, a link to the original file is in the comments below.

Ascendancy answered 10/11, 2020 at 13:1 Comment(9)
Unless your toolchain has been infected to make it produce trojans, I wouldn't worry about it. Your exe just happens to share the same fingerprint as some trojans. See if g++ -O3 -g0 try.cc -o some.exe makes a difference.Gladisgladney
Looks like a true positive (multiple unrelated engines detect it). Maybe you downloaded some dodgy bundle of mingw with extra malware included for free. Scan your computer now.Siren
How exactly did you install mingw?Jasik
Avast is not a usable product on a dev machine.Botheration
Also, spurious false positives are not unknown, e.g. mingw-users.1079350.n2.nabble.com/…. As an aside, run some non-mingw executables through the online scanner, and perform a scan of your computer with a free offline scanner, because the compromise could be unrelated to mingw.Jasik
Also, security.stackexchange.com/questions/229576/….Jasik
Can you share the compiled .exe?Spacecraft
@Spacecraft Uh, I uploaded it to this site, I don't know how long the link will work: filedropper.com/some. Worst case scenario it's a trojan, so I guess that unless you open it you are safe.Ascendancy
This is why Microsoft recommends not using loaders to activate Windows. Always use the digital license through Microsoft servers.Merchantman
A
3

Update:

It actually was some kind of hash collision; the compiler wasn't infected. I did change the string in the print function, as suggested, several times, even adding line breaks, but every time my AV detected it as malware. I also tried deleting some lines of code (the includes and the print) and it was still detected as malware.

Funnily enough, when I added more lines to the code, the AV stopped recognizing it as a virus. Makes you wonder how the hash function used works, and how it relates to the actual content of the programs.

So it's solved, and everything was fine, just some AV sloppiness (which I guess has its reasons).

Ascendancy answered 14/11, 2020 at 0:23 Comment(4)
Interesting, and good to know. But are you sure it's a hash collision? Hashes are sensitive to the smallest variations in the code. I'd think it's some other form of pattern recognition. For example, there may be the idea that useful programs cannot be smaller than some minimum size, which makes the AV programs biased against small programs.Jasik
Yes, you could be right, but remember that deleting lines of code changed the type of virus detected, so it can't be only the size of the executable. I was wondering how hard it could be to ignore the strings of the program after compilation, before calculating the hash function. That could explain why the changes in the print didn't have an effect (though I guess that could potentially be exploited by inserting code in strings), but I'm not sure.Ascendancy
In case of a hash collision there would be only one matching AV. The chance of a collision is very, very low. @Peter Reinstate Monica's guess is much better.Spacecraft
Each time code written in C++ is compiled, MinGW adds fixed code to it. This code is just a small part of the executable, but as the size of the executable decreases, the proportion of original code decreases as well. That means that if there is any small-sized malware that uses MinGW, then, as my program gets smaller, the percentage of its code that is identical to the malware increases. If any AV applies a hash function to sections of the code (which could help detect polymorphic viruses) and then checks for matches, the high similarity could cause a false positive.Ascendancy
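The reasoning in that last comment can be sketched in C++. This is only an illustration of the hypothesis, not how any real AV engine is known to work: the names chunk_hashes and shared_fraction, the fixed block size, and the use of std::hash are all assumptions made for the example.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hash each non-overlapping block of `chunk` bytes in `data`.
// A trailing partial block is ignored. `chunk` must be non-zero.
// (Hypothetical helper; std::hash stands in for an unknown AV hash.)
std::unordered_set<std::size_t> chunk_hashes(const std::string& data,
                                             std::size_t chunk) {
    std::unordered_set<std::size_t> hashes;
    for (std::size_t i = 0; i + chunk <= data.size(); i += chunk) {
        hashes.insert(std::hash<std::string>{}(data.substr(i, chunk)));
    }
    return hashes;
}

// Fraction of `sample`'s blocks whose hash also occurs in `known`.
// Returns 0.0 when `sample` has no complete block.
double shared_fraction(const std::string& sample, const std::string& known,
                       std::size_t chunk) {
    const auto known_hashes = chunk_hashes(known, chunk);
    std::size_t total = 0;
    std::size_t hits = 0;
    for (std::size_t i = 0; i + chunk <= sample.size(); i += chunk) {
        ++total;
        const auto h = std::hash<std::string>{}(sample.substr(i, chunk));
        if (known_hashes.count(h) != 0) {
            ++hits;
        }
    }
    return total == 0 ? 0.0
                      : static_cast<double>(hits) / static_cast<double>(total);
}
```

If a 32-byte string stands in for the shared runtime, a program that adds only 8 bytes of its own shares 4 of its 5 blocks with a made-up "malware" sample built on the same runtime, while a program that adds 96 bytes shares only 4 of 16 (barring accidental hash collisions). Under this model, the smaller the program, the closer it looks to anything else built with the same toolchain.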
J
5

The issue has come up before. Programs compiled with mingw tend to trigger the occasional snake oil (i.e., antivirus program) alarm. That's probably because mingw is a popular tool chain for virus authors and thus its output matches generic patterns occurring in true positives. This has come up over and over again, also on SE (e.g. https://security.stackexchange.com/questions/229576/program-compiled-with-mingw32-is-reported-as-infected). [rant] In my opinion that's true evidence of incapacity for the AV companies because it would be easy to fix and makes you wonder whether the core functions of their programs are better implemented. [/rant]

Your case is a bit suspicious though because the number of triggered AV programs is so large. While I have never heard of a compromised mingw, and a cursory google search did not change that, it's not impossible. Compromising compilers is certainly an efficient method to spread a virus; the most famous example with an added level of indirection is the Ken Thompson hack.

It is also certainly possible that your computer is infected with a non-mingw-originating virus which simply inserts itself into new executables it finds on disk. That should be easy to find out by the usual means. A starting point could be to subject a few other (non-mingw) new executables to the online examination; if your machine is infected in this way, they should trigger the same AV programs.

Note that while I have some general IT experience I have no special IT security knowledge; take everything I say just as a starting point for your own research and actions.

Jasik answered 10/11, 2020 at 18:10 Comment(0)
Z
3

This could be caused by one of two things:

  1. It really is a trojan: you downloaded your mingw from some place where its code was altered to add a virus to each program you create. This is done for almost all commercial compilers; all "free" (cracked) versions have that code inside them, and each time you compile your code the virus is added to your exe.

  2. The hash of your exe for some reason matched an existing virus. You can check this by altering one character in your code, for example "hello world!" to "hello world?", and seeing if it is still considered a virus. If yes, there is a very high chance that your compiler adds viruses to your programs.
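The one-character test in point 2 rests on the fact that a whole-content hash changes when any byte changes. A minimal sketch of that property, using 64-bit FNV-1a purely as a stand-in (which hash, if any, an AV engine uses is not known here, and same_fingerprint is a name invented for the example):

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a, a simple non-cryptographic hash. It is used here only
// as a stand-in for whatever fingerprint an AV engine might compute.
std::uint64_t fnv1a(const std::string& bytes) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char c : bytes) {
        h ^= c;                                 // mix in the next byte
        h *= 1099511628211ULL;                  // FNV 64-bit prime
    }
    return h;
}

// True if two byte strings have the same whole-content fingerprint.
// (Hypothetical helper for illustration.)
bool same_fingerprint(const std::string& a, const std::string& b) {
    return fnv1a(a) == fnv1a(b);
}
```

Here same_fingerprint("hello world!", "hello world?") is false, so an exact whole-file match would not survive the edit. If detection persists after such a change, that suggests the match is not on a whole-file hash; it does not by itself show that the compiler is infected.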

Zawde answered 10/11, 2020 at 13:10 Comment(5)
Can you point to reports of specific examples for such compromised compilers?Jasik
Both points are hard to believe. Did you have any references?Spacecraft
The first one is easy to do. For instance, if your virus is a function, then at compilation time, in the background, you rename main to oldmain, and your virus main calls the original main after its execution. You can also run two threads (the main program and the virus), or create custom libs containing the virus (e.g. replace printf with callVirusCode, then printf). Those are just examples. For the second, the theory is called "collision" (when two hashes match).Zawde
For 1.: Yes, but I never heard of this kind of attack. I think that this kind is not really valuable. Is that only a guess? For 2.: Yes, but more than 24 collisions for every compiled program? Hash collisions are very rare ...Spacecraft
For 1: that "you" never heard of it does not mean it does not exist. For 2: it depends on the hash size used and the number of viruses out there.Zawde

© 2022 - 2024 — McMap. All rights reserved.