I'm trying to decompress a data file that was originally compressed with an extension for AMOS Pro, the old Amiga BASIC language, that shipped with the AMOS Pro compiler. I've still got the programming language and have access to the compressor and decompressor, but I'm trying to decompress the files using C. I ultimately want to be able to view these files on modern hardware without having to resort to using an Amiga emulator first.
However, there's no documentation as to how the compressor worked, so I'm trying to reverse-engineer it solely from watching its behaviour. Here's what I've got so far.
This is a raw file (ASCII):
AABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZAABCDEFGHIJKLMNOPQRSTUVWXYZ
Here's the compressed version (hex):
D802C6B5
05048584
4544C5C4
2524A5A4
6564E5E4
15149594
5554D5D4
3534B591
00000007
AD763363
00000051
Testing with various files has given me a few insights:
- The last 4 bytes are the size of the original file.
- The file seems to function as a bit stream, so byte boundaries aren't important (I say this because I've seen ASCII codes appear in a few files and they aren't aligned to byte boundaries).
- All of the bits in the file are stored in reverse.
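The first two observations can be sketched in C as below. This is only a rough sketch under my current assumptions, not a description of how the real decompressor works: `read_be32`, `original_size`, `BitReader` and `read_bits` are names made up for illustration, the size field is assumed to be big-endian (as is usual on the Amiga's 68000), and "stored in reverse" is taken to mean that bits are consumed least-significant-bit first within each byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from four consecutive bytes. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Assumption: the last 4 bytes of the compressed file hold the
 * original (uncompressed) size. Returns 0 if the file is too short. */
static uint32_t original_size(const uint8_t *file, size_t len)
{
    return len >= 4 ? read_be32(file + len - 4) : 0;
}

/* Hypothetical bit reader that ignores byte boundaries. */
typedef struct {
    const uint8_t *data;
    size_t len;     /* length of data in bytes */
    size_t bitpos;  /* index of the next bit to read */
} BitReader;

/* Read the next n bits (n <= 32). Within each byte the lowest bit is
 * consumed first (assumed meaning of "reversed"); the first bit read
 * becomes the most significant bit of the result. If the stream runs
 * out, only the bits that were available are returned. */
static uint32_t read_bits(BitReader *br, unsigned n)
{
    uint32_t v = 0;
    while (n-- > 0 && br->bitpos < br->len * 8) {
        unsigned bit = (br->data[br->bitpos >> 3] >> (br->bitpos & 7)) & 1u;
        v = (v << 1) | bit;
        br->bitpos++;
    }
    return v;
}
```

Fed the 44 bytes of the example above, `original_size` should give 0x51 (81, the length of the raw file), and reading 8 bits from the start of the stream should give 27.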
The first byte seems to represent a sequence length. In the above example, the value 0xD8 is 11011000 in binary; mirror it (bits are in reverse) and you'll get 00011011, which is 0x1B in hex or 27 in decimal. That matches the length of the repeating sequence (27 characters).
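For reference, the mirroring step can be written in C like this. It's a minimal sketch; `reverse_bits8` is just a name I've picked, and the only thing it assumes is that the reversal applies within a single 8-bit byte.

```c
#include <stdint.h>

/* Mirror the bit order of one byte: bit 7 <-> bit 0, bit 6 <-> bit 1, ... */
static uint8_t reverse_bits8(uint8_t b)
{
    b = (uint8_t)((b & 0xF0) >> 4 | (b & 0x0F) << 4); /* swap nibbles */
    b = (uint8_t)((b & 0xCC) >> 2 | (b & 0x33) << 2); /* swap bit pairs */
    b = (uint8_t)((b & 0xAA) >> 1 | (b & 0x55) << 1); /* swap adjacent bits */
    return b;
}
```

With this, `reverse_bits8(0xD8)` gives 0x1B, i.e. 27.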
However, I'm not making any more progress. Does this look like a standard compression algorithm? What do I try next?
+header.s, unfortunately it is in assembly language, undocumented (except for rudimentary calling convention information), so it's going to be hard to figure out. Still, not impossible. – Ce