Are there any downsides to using UPX to compress a Windows executable?

13

46

I've used UPX before to reduce the size of my Windows executables, but I must admit that I am unaware of any negative side effects this could have. What's the downside to all of this packing/unpacking?

Are there scenarios in which anyone would recommend NOT UPX-ing an executable (e.g. when writing a DLL, Windows Service, or when targeting Vista or Win7)? I write most of my code in Delphi, but I've used UPX to compress C/C++ executables as well.

On a side note, I'm not running UPX in some attempt to protect my exe from disassemblers, only to reduce the size of the executable and prevent cursory tampering.

Assist answered 9/12, 2008 at 17:53 Comment(1)
+1 reason for not using UPX: some anti-virus software identifies Delphi+UPX as a potential virus. It is preferable to use our LVCL, which makes a 30 KB exe with Delphi, with no compression (instead of a 300 KB/800 KB exe with the regular VCL). That's enough e.g. for a setup program, which will uncompress its content. See synopse.info/forum/viewtopic.php?id=30 You can also take a look at KOL, which is more difficult to use, but also more powerful.Lavadalavage
53

... there are downsides to using EXE compressors. Most notably:

  • Upon startup of a compressed EXE/DLL, all of the code is decompressed from the disk image into memory in one pass, which can cause disk thrashing if the system is low on memory and is forced to access the swap file. In contrast, with uncompressed EXE/DLLs, the OS allocates memory for code pages on demand (i.e. when they are executed).

  • Multiple instances of a compressed EXE/DLL create multiple instances of the code in memory. If you have a compressed EXE that contains 1 MB of code (before compression) and the user starts 5 instances of it, approximately 4 MB of memory is wasted. Likewise, if you have a DLL that is 1 MB and it is used by 5 running applications, approximately 4 MB of memory is wasted. With uncompressed EXE/DLLs, code is only stored in memory once and is shared between instances.
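The arithmetic behind the second point can be sketched in a few lines. This is only a back-of-the-envelope model that assumes every instance of a packed image keeps one full private copy of the code, while an unpacked image is shared as a single copy; the function name is hypothetical.

```python
def wasted_code_memory_mb(code_size_mb: float, instances: int) -> float:
    """Extra memory used when code pages cannot be shared between instances."""
    if instances < 1:
        return 0.0
    unshared_total = code_size_mb * instances  # packed: one private copy each
    shared_total = code_size_mb                # unpacked: one copy for everyone
    return unshared_total - shared_total


# The example from the quote: 1 MB of code, 5 instances.
print(wasted_code_memory_mb(1.0, 5))  # 4.0
```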

http://www.jrsoftware.org/striprlc.php#execomp

Bioclimatology answered 9/12, 2008 at 18:11 Comment(5)
Not all software is in a situation where VM page sharing between multiple-instances is an issue though, and some compressors allow skipping of shared sections. PECompact skips them by default, for instance. Also, concerning 'all the memory being loaded', that is true, but that memory will be paged back out if unused and needed elsewhere. In some situations it creates faster loads because the storage medium (usually HDD) overhead is lower and everything gets loaded 'at once' and is 'already there'. So, there are exceptions, and to each his own. DISCLAIMER: I am the author of PECompact.Skinny
Jeremy: Shared sections are not so much the problem as the actual code sections. Code is normally memory mapped read-execute, which means that if you launch 10 programs, they will all use the same, identical physical memory pages, and the physical data is only loaded from disk once. The OS can also discard pages that contain code which has never been called if it runs low on memory, knowing that it can always trivially reload them from the image. None of that is the case with code that came from an exe packer; the respective pages must be backed by the swap file.Fredrick
I agree with you Damon, so the question then becomes - does your application tend to have multiple instances running? This is one of the caveats I've long warned against. Of course, the code sections are usually pretty small anyway - look at the sizes we are talking about in comparison to today's systems. I would argue MUCH more memory tends to be dynamically allocated in MOST applications.Skinny
I also expect this slows down start-up time a fair bit -- which may not be noticeable for a single run, but if you have the kind of executable that gets run repeatedly (like grep) then it might make a fairly noticeable difference, right?Daredevil
Does this mean that an executable compressed with upx original.exe -o compressed.exe and then decompressed with upx -d compressed.exe -o original-decompressed.exe will behave like a normal exe file, where the OS shares code between instances and there is no disk thrashing?Despair
27

I'm surprised this hasn't been mentioned yet but using UPX-packed executables also increases the risk of producing false-positives from heuristic anti-virus software because statistically a lot of malware also uses UPX.
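For anyone curious how a tool can even tell that a file is UPX-packed: a stock UPX-packed PE file normally carries sections named UPX0 and UPX1. Below is a minimal sketch that reads the section names straight from the PE headers. It does not claim to be how any particular anti-virus engine works, the function names are my own, and a modified packer can rename the sections.

```python
import struct


def pe_section_names(data: bytes) -> list[str]:
    """Return the section names of a PE image held in memory."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = pe_offset + 4  # 20-byte COFF file header
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (optional_size,) = struct.unpack_from("<H", data, coff + 16)
    table = coff + 20 + optional_size  # section table follows the optional header
    names = []
    for i in range(num_sections):
        start = table + i * 40  # each section header is 40 bytes
        raw = data[start:start + 8]  # the name is the first 8 bytes
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(data: bytes) -> bool:
    """Naive check: any section whose name starts with 'UPX'."""
    return any(name.startswith("UPX") for name in pe_section_names(data))
```

Usage would be something like looks_upx_packed(open(path, "rb").read()), where path is whatever file you want to inspect.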

Loreenlorelei answered 10/12, 2008 at 10:4 Comment(4)
UPX is not for protection. Just packing.Smithsonite
Sorry, I slurred that a bit. Changed the wording now.Loreenlorelei
Fully agree. At Free Pascal we get a few such reports a year (that one or the other binary in the distro triggers some antivirus). We then stopped routinely upxing.Sarracenia
@JohnSmith You are right.. but just FYI, you can actually (pretty easily) shift the UPX-compressed file's entrypoint so default "standard" UPX unpacking method won't be able to unpack it, essentially making it protected (look for UPX Scrambler which does that automatically)Tombaugh
20

There are three drawbacks:

  1. The whole code will be fully uncompressed in virtual memory, while in a regular EXE or DLL, only the code actually used is loaded in memory. This is especially relevant if only a small portion of the code in your EXE/DLL is used at each run.
  2. If there are multiple instances of your DLL and EXE running, their code can't be shared across the instances, so you'll be using more memory.
  3. If your EXE/DLL is already in cache, or on a very fast storage medium, or if the CPU you're running on is slow, you will experience reduced startup speed as decompression will still have to take place, and you won't benefit from the reduced size. This is especially true for an EXE that will be invoked multiple times repeatedly.
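To get a feel for the third point, here is a rough sketch that times one LZMA decompression of a buffer. It uses Python's lzma module as a stand-in, not UPX's own unpacking stub, and the sample data is an arbitrary, highly repetitive placeholder, so treat the numbers as illustrative only.

```python
import lzma
import time


def decompression_cost(payload: bytes) -> tuple[int, int, float]:
    """Return (original size, compressed size, seconds to decompress once)."""
    packed = lzma.compress(payload)
    start = time.perf_counter()
    unpacked = lzma.decompress(packed)
    elapsed = time.perf_counter() - start
    assert unpacked == payload  # the round trip must be lossless
    return len(payload), len(packed), elapsed


# Hypothetical stand-in for an executable's contents.
sample = b"push ebp; mov ebp, esp; " * 50_000
original, packed, seconds = decompression_cost(sample)
print(f"{original} -> {packed} bytes, {seconds * 1000:.2f} ms to decompress")
```

The point is that the decompression time is paid on every start, whether or not the smaller file saved any I/O.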

Thus the above drawbacks are more of an issue if your EXE or DLLs contain lots of resources, but otherwise, they may not be much of a factor in practice, given the relative size of executables and available memory, unless you're talking of DLLs used by lots of executables (like system DLLs).

To dispel some incorrect information in other answers:

  • UPX will not affect your ability to run on DEP-protected machines.
  • UPX will not affect the ability of major anti-virus software to scan your executable, as it supports UPX-compressed executables (as well as other executable compression formats).
  • UPX has been able to use LZMA compression for some time now (7zip's compression algorithm), use the --lzma switch.
Eisenach answered 11/12, 2008 at 13:22 Comment(0)
11

The only time size matters is during download off the Internet. If you are using UPX then you actually get worse compression than if you use 7-Zip (based on my testing 7-Zip is twice as good as UPX). Then when it is actually left compressed on the target computer your performance is decreased (see Lars' answer). So UPX is not a good solution for file size. Just 7zip the whole thing.

As for preventing tampering, it is a FAIL as well. UPX supports decompressing too. If someone wants to modify the EXE then they will see it is compressed with UPX and then uncompress it. The percentage of possible crackers you might slow down does not justify the effort and performance loss.

A better solution would be to use binary signing or at least just a hash. A simple hash verification system is to take a hash of your binary and a secret value (usually a guid). Only your EXE knows the secret value, so when it recalculates the hash for verification it can use it again. This isn't perfect (the secret value can be retrieved). The ideal situation would be to use a certificate and a signature.
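A minimal sketch of the hash-plus-secret idea, assuming HMAC-SHA256 as the way to combine the binary with the secret (the answer does not name a specific construction, so that choice is mine). The function names and the sample secret are hypothetical.

```python
import hashlib
import hmac


def compute_tag(binary: bytes, secret: bytes) -> str:
    """Keyed hash of the binary; only someone with the secret can recompute it."""
    return hmac.new(secret, binary, hashlib.sha256).hexdigest()


def verify(binary: bytes, secret: bytes, expected_tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(compute_tag(binary, secret), expected_tag)


# Hypothetical values for illustration only.
secret = b"3f2504e0-4f89-11d3-9a0c-0305e82c3301"
original = b"\x4d\x5a pretend this is the executable image"
tag = compute_tag(original, secret)

print(verify(original, secret, tag))            # True
print(verify(original + b"\x90", secret, tag))  # False
```

As the answer says, the secret has to live inside the EXE, so it can be extracted; a proper code signature avoids that weakness.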

Assiduity answered 9/12, 2008 at 18:39 Comment(5)
Size matters on USB stick. Portable apps are one case where it'd make sense.Benzene
Given the current prices of multi-gigabyte sticks, I'm not sure size matters there anymore either. Unless you are storing HD video, of course.Selfsatisfaction
UPX's compression algorithm is at least equal to LZMA for compressing an individual executable file. 7-zip is only better if you have several exe/dll files to compress at once: in this case, 7-zip will compress all those files at once, therefore gaining some ratio.Lavadalavage
Size also matters when using files NOT on a local hard disk, like on a network share or an external hard disk.Lysozyme
Size matters in Docker containers, too.Strahan
6

The final size of the executable on disk is largely irrelevant these days. Your program may load a few milliseconds faster, but once it starts running the difference is indistinguishable.

Some people may be more suspicious of your executable just because it is compressed with UPX. Depending on your end users, this may or may not be an important consideration.

Daryldaryle answered 9/12, 2008 at 18:5 Comment(3)
+1 I'm always suspicious of packed executables / DLLs, especially if I can't unpack them.Ezequiel
Size matters in Docker containers.Strahan
Size matters if you're reading from slow flash memory, which can be the case in an embedded system.Burushaski
2

The last time I tried to use it on a managed assembly, it munged it so bad that the runtime refused to load it. That's the only time I can think of that you wouldn't want to use it (and, really, it's been so long since I tried that that the situation may even be better now). I've used it extensively in the past on all types of unmanaged binaries, and never had an issue.

Colloquy answered 9/12, 2008 at 18:0 Comment(0)
2

If your only interest is in decreasing the size of the executables, then have you tried comparing the size of the executable with and without runtime packages? Granted you will have to also include the sizes of the packages overall along with your executable, but if you have multiple executables which use the same base packages, then your savings would be rather high.

Another thing to look at would be the graphics/glyphs you use in your program. You can save quite a bit of space by consolidating them into a single TImageList included in a global data module rather than having them repeated on each form. I believe each image is stored in the form resource as hex, so that would mean that each byte takes up two bytes. You can shrink this a bit by loading the image from an RCData resource using a TResourceStream.
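The "each byte takes up two bytes" arithmetic is easy to check. The sketch below only demonstrates the size of a hex text encoding versus the raw bytes; it makes no claim about how Delphi actually stores form resources, and the sample data is arbitrary.

```python
import binascii


def hex_text_size(raw: bytes) -> int:
    """Size in bytes of the data once written out as hexadecimal text."""
    return len(binascii.hexlify(raw))


sample = bytes(range(256)) * 4  # 1024 arbitrary bytes standing in for a glyph
print(len(sample), hex_text_size(sample))  # 1024 2048
```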

Possibility answered 9/12, 2008 at 21:32 Comment(0)
2

There are no drawbacks.

But just FYI, there is a very common misconception regarding UPX as--

resources are NOT just being compressed

Essentially you are building a new executable that has a "loader" duty, and the "real" executable is section-stripped, compressed, and placed as a binary-data resource of the loader executable (regardless of the types of resources that were in the original executable).

Using reverse-engineering methods and tools either for education purposes or other will show you the information regarding the "loader executable", and not variable information regarding the original executable.

[Image: executable uncompressed by UPX]

[Image: executable compressed by UPX]

Tombaugh answered 21/10, 2015 at 18:24 Comment(4)
"There are no drawbacks." - is simply a wrong statement.Linguistic
@Linguistic although "no drawbacks" should probably be changed to "no practical drawbacks", which essentially (for a practical developer on today's architectures) means the same, I've decided that it requires no editing, in favor of a more clear, realistic answer. Deal with that :)Tombaugh
As the author of one of UPX's classical competitors, PECompact, I can say there definitely are practical drawbacks. To name three: False positives, potential future interoperability issues, and the inability of the OS to load specific pages from disk on-demand (only applies to really large modules).Skinny
Your explanation about how the compression works is a bit off too... Native PE compression actually does 'in-place' decompression. So, it's not like you described, where it just decompresses the whole original EXE from a compressed resource, then runs that. Instead, the OS is actually running the compressed module. The loader decompresses the virtual pages in-place. Now, there ARE bad/lame compressors that act like you described, and .NET assembly compressors would be such, but that is not how it's usually done. I apologize for being critical, I just mean to provide accuracy.Skinny
1

IMHO routinely UPXing is pointless, and the reasons are spelled out above; mostly, memory is more expensive than disk.

Erik: the LZMA stub might be bigger. Even if the algorithm is better, it is not always a net plus.

Specter answered 25/4, 2009 at 22:32 Comment(1)
It seems memory and disk space aren't impacted very much either way on modern hardware, making UPX just an extra complication. In my experience I only save around 500 KB from a 3 MB app, especially on Mac where dylib files aren't supported.Macedo
1

Virus scanners that look for 'unknown' viruses can flag UPX compressed executables as having a virus. I have been told this is because several viruses use UPX to hide themselves. I have used UPX on software and McAfee will flag the file as having a virus.

Aussie answered 2/5, 2009 at 0:47 Comment(2)
McAfee? It's the most terrible antivirus software I've ever used. Stay away!Ops
In fact, stay away from any virus scanner that is stupid enough to mark every upx compressed file as a virus.Amok
1

The reason UPX has so many false alarms is because its open licensing allows malware authors to use and modify it with impunity. Of course, this issue is inherent to the industry, but sadly the great UPX project is plagued by this problem.

UPDATE: Note that as the Taggant project is completed, the ability to use UPX (or anything else) without causing false positives will be enhanced, assuming UPX supports it.

Skinny answered 26/9, 2010 at 16:34 Comment(0)
0

I believe there is a possibility that it might not work on computers that have DEP (Data Execution Prevention) turned on.

Whereto answered 9/12, 2008 at 18:48 Comment(4)
As the author of a commercial PE compressor (PECompact), I can say that every packer should support DEP fully - including UPX. I can't say for sure that it does, but surely it does -- that was an issue from years ago.Skinny
Could you give any details on how this is achieved? (source/literature)Peculate
@kn0x DEP only prevents execution of memory that is not marked as executable, for which VirtualProtect(Ex) is used to switch memory pages between writeable and executable.Peeler
That’s a throwback :) Thx @Suchiman, my question originally addressed the internals of upx, not DEP.Peculate
0

When Windows loads a binary, the first thing it does is Import/Export Table resolution. That is, for whatever APIs and DLLs are indicated in the Import Table, it first loads each DLL at a randomly generated base address. Then, using the base address plus the offset of each function within the DLL, it updates the Import Table with the resolved addresses.

An EXE typically does not have an Export Table.

All of this happens even before jumping to the original entry point for execution.

Then, after it starts executing from the entry point, the EXE runs a small piece of code that starts the decompression algorithm. Because this piece of code is small, it needs very few Windows APIs, resulting in a small Import Table.

But after the binary is decompressed, if it starts to use any Windows API that was not resolved before, it is likely to crash. So it is essential that the decompression routine resolves and updates the Import Table for all the Windows APIs referenced inside the decompressed code before executing that code.
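The "base address plus offset" step can be modelled with a couple of dictionaries. This is a toy model only, not the real Windows loader or UPX's stub, and every module name, base address and offset in it is made up for illustration.

```python
def resolve_imports(import_table, module_bases, export_offsets):
    """Return {(dll, function): absolute address} for every import.

    A missing DLL or function raises KeyError, the toy equivalent of
    the crash you get when code calls an import nobody resolved.
    """
    resolved = {}
    for dll, function in import_table:
        base = module_bases[dll]                # where the DLL was loaded
        offset = export_offsets[dll][function]  # function's offset inside the DLL
        resolved[(dll, function)] = base + offset
    return resolved


# Hypothetical load address and export offsets.
module_bases = {"kernel32.dll": 0x7FF800000000}
export_offsets = {
    "kernel32.dll": {"LoadLibraryA": 0x1F0A0, "GetProcAddress": 0x1B5C0},
}

# The small import table of the unpacking stub.
stub_imports = [("kernel32.dll", "LoadLibraryA"), ("kernel32.dll", "GetProcAddress")]
table = resolve_imports(stub_imports, module_bases, export_offsets)
print(hex(table[("kernel32.dll", "LoadLibraryA")]))
```

After decompression, the stub would have to repeat the same kind of lookup for the original program's much larger import list before handing over control.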

References:


https://malwaretips.com/threads/malware-analysis-2-pe-imports-static-analysis.62135/

Disconnection answered 22/4, 2020 at 16:17 Comment(0)
