Byte conversion

I have two img files: Origin (2 GB) and Destination (4 GB). They are related by some sort of encoding that I'm trying to identify and revert. To confirm that I can revert the encoding, I need to check whether I can obtain Origin again from the Destination file.

I've built a table showing that Origin contains 256 distinct byte values and Destination contains 256 distinct byte pairs. Here is the list of Origin's byte values (in hex) with their occurrence counts:

FF=24575615
FE=3242667
FD=3009202
FC=3063146
FB=3003652
FA=3025947
F9=3005543
F8=7684326
F7=4554041
F6=2933185
F5=3373967
F4=5597006
F3=2906784
F2=3789554
9F=3102630
9E=3005388
F1=3557574
F0=4365911
9D=3078506
9C=2840242
9B=2763692
9A=2804976
EF=2941117
EE=3025616
99=2877085
ED=2902961
98=3028895
EC=2817617
97=2752245
EB=3333926
96=2789702
EA=2850121
95=2989513
94=3031653
93=2911830
92=2658657
91=2728002
90=3419534
E9=2887403
E8=3208952
E7=3285198
E6=2644790
E5=4609467
E4=2650016
E3=4372245
8F=2991145
E2=3368100
E1=5113630
8E=2575537
E0=9155599
8D=3578967
8C=3038052
8B=2921954
8A=2675041
DF=2917213
DE=2560516
89=2736502
DD=2625394
88=3270888
DC=2599744
87=3366265
DB=2698959
86=2899131
DA=2673989
85=3330569
84=3367665
83=3421457
82=3444192
81=3864339
80=6354686
D9=2792340
D8=3572281
D7=2917209
D6=2502705
D5=2726792
D4=2599407
D3=2526731
7F=3667154
D2=2594634
D1=3798179
7E=2752138
D0=5792504
7D=2931975
7C=2876880
7B=3192909
7A=3348958
CF=2842460
CE=2904295
79=4933142
CD=2468499
78=4201043
CC=2551223
77=4251200
CB=2410778
76=5307097
CA=2417649
75=7217741
74=15428931
73=12268233
72=14409973
71=4741548
70=9798438
C9=2359024
C8=2549326
C7=2608153
C6=2524731
C5=2483222
C4=2848155
C3=3696683
6F=15455489
C2=2971749
6E=14311776
C1=2383297
6D=8538221
C0=3270606
6C=10639469
6B=4601490
6A=3337833
BF=3527482
BE=3305589
69=15717960
BD=3364649
68=6544569
BC=2989446
67=7873918
BB=2867947
66=5310067
BA=2996525
65=22005763
64=10819109
63=10271386
62=5649243
61=17118578
60=3714590
B9=2931805
B8=3617901
B7=2980605
B6=2841578
B5=3470008
B4=3329220
B3=2808383
5F=7462619
B2=3022737
5E=2545337
B1=3328536
B0=4808034
5D=3011851
5C=2786455
5B=3763489
5A=3363499
AF=3138318
AE=3058472
59=3023985
AD=2753771
58=3200666
AC=2718493
57=3198750
AB=2727749
56=3157681
AA=3016716
55=3625987
54=7058037
53=6318637
52=5403634
51=2927288
50=5225038
A9=2758574
A8=3190446
A7=2891160
A6=2873612
A5=3024935
A4=3732070
A3=2715548
4F=4252264
A2=2423484
4E=4458144
A1=2799897
4D=4589889
A0=4347937
4C=5262566
4B=4257717
4A=3099467
49=5937076
48=3346052
47=3830489
46=6790552
45=6137365
44=5804764
43=5414206
42=4114199
41=5409554
40=4442287
3F=3156472
3E=3225065
3D=4457800
3C=3929336
3B=4066190
3A=9022387
39=6277213
38=8240388
37=6495438
36=5451005
35=6141671
34=7080579
33=7806046
32=9798066
31=11882632
30=15283799
2F=6985857
2E=8044627
2D=6636208
2C=4805977
2B=3220182
2A=3167464
29=4090111
28=5709938
27=3502804
26=2929070
25=3358752
24=3916999
23=4057819
22=5124209
21=5277533
20=42872703
1F=3987784
1E=3484472
1D=3643916
1C=4174216
1B=3662986
1A=4933323
19=3677299
18=4216614
17=4043968
16=3582845
15=3683685
14=4540186
13=4812066
12=6464885
11=6488640
10=12415842
0F=4932667
0E=6787886
0D=4760047
0C=8731063
0B=7069143
0A=12241413
09=10858120
08=13149164
07=8219751
06=6926974
05=7701026
04=12557557
03=14887136
02=20154437
01=29508103
00=835691837

And here is the list of byte pairs in the Destination file, with their occurrence counts:

0E,00=6791835
2C,00=4806159
4A,00=3099823
FF,00=3030567
80,25=2915869
B3,00=3061678
D1,00=3024917
7E,00=2752043
14,00=4543724
32,00=9800493
50,00=5226411
C9,00=3419606
E7,00=3367141
48,00=3344687
66,00=5308554
BA,00=2890612
1B,00=3662605
EE,00=3039868
51,25=2996741
4F,00=4251746
A2,00=3364659
6D,00=8535725
C0,00=2980676
03,00=14884374
21,00=5277874
B8,00=4554035
1C,25=3697411
D6,00=2878193
F4,00=2911302
19,00=3677995
37,00=6496436
55,00=3621900
73,00=12268699
0A,00=16849664
BF,00=3191022
DD,00=2901038
FB,00=2790679
3E,00=3226874
5C,00=2785989
7A,00=3348851
10,00=12415134
92,25=3328216
A7,00=3374104
C5,00=2992633
E3,00=2524591
FF,FE=1
08,00=13152284
26,00=2927651
44,00=5803368
62,00=5647266
F9,00=2750935
5D,25=2990402
AE,00=2758502
78,00=4211254
CC,00=2560487
EA,00=3271925
0F,00=4934398
2D,00=6635518
4B,00=4257690
63,25=2931269
B4,00=2940342
D2,00=4371679
7F,00=3667613
F0,00=5791943
15,00=3684778
33,00=7806422
51,00=2927325
E8,00=2675786
49,00=5937427
67,00=7873266
BB,00=3134171
00,25=2849189
1C,00=4180501
3A,00=9021107
34,25=2382740
EF,00=2921132
A3,00=2840826
6E,00=14310898
C1,00=3469735
04,00=12561200
22,00=5125096
40,00=4443771
B9,00=3002742
D7,00=3005298
F5,00=2649810
38,00=8247159
56,00=3158613
AA,00=2874152
74,00=15429251
92,01=3102012
0B,00=7069820
DE,00=3208144
FC,00=3865021
3F,00=3156189
B0,00=7683525
5D,00=3011343
7B,00=3193376
57,25=2867868
11,00=6490582
93,25=3022263
A8,00=3006278
0C,25=2674557
C6,00=2658527
E4,00=3366420
09,00=10858792
27,00=3506948
45,00=6136951
63,00=10272001
AF,00=3026422
79,00=4934274
CD,00=2502816
EB,00=2734596
2E,00=8048888
4C,00=5263799
6A,00=3337574
00,00=835408324
B5,00=2644329
D3,00=9153408
F1,00=3732278
16,00=3583727
34,00=7080805
52,00=5404524
70,00=9797831
E9,00=3443955
A0,25=3241460
68,00=6544647
BC,00=2721172
DA,00=2887297
1D,00=3644022
3B,00=4065122
17,20=3790204
A4,00=2842361
6F,00=15455236
C2,00=2841458
E0,00=3329927
05,00=7700764
69,25=2417913
23,00=4057297
41,00=5410631
D8,00=3078746
F6,00=3032474
3C,25=2483865
5A,25=2550298
39,00=6276359
AB,00=3058994
57,00=3198862
75,00=7216338
0C,00=8731202
2A,00=3167488
DF,00=5114400
24,25=3329540
FD,00=2819394
60,25=2551483
B1,00=3556946
5E,00=2545159
7C,00=2883942
12,00=6465795
30,00=15283965
A9,00=3617885
C7,00=6356230
E5,00=2898862
28,00=5709988
46,00=6790071
64,00=10820537
CE,00=2917633
EC,00=3579490
2F,00=6986708
A0,00=24806546
4D,00=4589771
6B,00=4601028
01,00=29501552
B6,00=5596464
D4,00=3367061
F2,00=2990103
17,00=4044395
35,00=6142063
53,00=6317913
71,00=4740237
6C,25=2904343
69,00=15721818
BD,00=2728255
02,25=2808648
DB,00=2849348
1E,00=3485468
3C,00=3929076
5A,00=3363435
18,25=2793271
54,25=2359093
A5,00=3305528
C3,00=2608966
E1,00=4348666
06,00=6927361
24,00=3917870
88,25=2699985
42,00=4114151
60,00=3715031
D9,00=3334096
F7,00=2933573
AC,00=3016682
58,00=3203765
76,00=5306969
CA,00=2594516
0D,00=16849664
2B,00=3219425
FE,00=3284535
5F,00=7459837
B2,00=3008654
D0,00=3798597
7D,00=2932025
13,00=4813318
31,01=2726914
31,00=11882336
C8,00=2599419
E6,00=2728324
2C,25=2971956
29,00=4090622
47,00=3830335
65,00=22005256
1A,00=4933435
CF,00=3572248
14,25=3268735
ED,00=2800139
50,25=2468708
4E,00=4457687
A1,00=2753982
6C,00=10638547
02,00=20156417
66,25=2411484
20,00=42875581
84,25=2599935
B7,00=3025065
D5,00=4608195
F3,00=2423749
18,00=4217479
36,00=5451070
54,00=7055997
72,00=14410154
BE,00=2907783
DC,00=2804396
FA,00=2715354
1F,00=3988798
3D,00=4458111
5B,00=3762930
91,25=4806083
A6,00=2624919
C4,00=2576697
E2,00=3421114
07,00=8216863
25,00=3358496
43,00=5414386
61,00=17120598
F8,00=2763409
AD,00=4364881
59,00=3024233
77,00=4249782
CB,00=2526456
10,25=3526973
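
For readers who want to rebuild tables like the two above, here is a minimal Python sketch, not the tool that was actually used for these counts. The function names are hypothetical, and it assumes the data fits in memory; for multi-gigabyte images the same counters would need to be fed chunk by chunk, with an even chunk size so the pairs stay aligned.

```python
from collections import Counter


def byte_histogram(data: bytes) -> Counter:
    """Count how often each single byte value occurs."""
    return Counter(data)


def pair_histogram(data: bytes) -> Counter:
    """Count aligned 2-byte units (offsets 0, 2, 4, ...).

    A trailing odd byte, if any, is ignored.
    """
    usable = len(data) - len(data) % 2
    return Counter(data[i:i + 2] for i in range(0, usable, 2))


def format_table(hist: Counter) -> list:
    """Render counts as 'XX=count' or 'XX,YY=count', most frequent first."""
    lines = []
    for key, count in hist.most_common():
        # Single-byte histograms have int keys, pair histograms have bytes keys.
        raw = bytes([key]) if isinstance(key, int) else key
        lines.append(",".join(f"{b:02X}" for b in raw) + f"={count}")
    return lines
```

Calling `format_table(pair_histogram(data))` on the first bytes of the Destination file would list the `FF,FE` pair once, followed by whatever pairs come after it.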

At the beginning of the Destination file I have FF,FE, which is the UTF-16 little-endian BOM, followed by lots of zeroes. I have tried reading the Destination file as UTF-16 and saving the result as UTF-8, but the output is 2.5 GB and contains unwanted transformations, for example in this sequence:

ORIGIN CD       7C 78 38 81 7C 78 7C 38 06 00 FF FF 53 EF

ORIGIN REBUILD  7C 78 38 C3 BC 7C 78 38 06 00 C2 A0 C2 A0

I later tried reading the stream as UTF-16 and then converting it to IBM850. This conversion looks promising (the reverted file resembles the origin a little more), but the copy contains some additions and some inexplicable conversions, which are then (correctly) carried over into the reverted file, making it unreadable.

For example, in the original file I have:
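
The UTF-16 to IBM850 attempt described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, Python's `cp850` codec stands in for IBM850, and code units with no IBM850 equivalent are replaced with `?` instead of raising an error. It does nothing about the inserted or altered bytes mentioned above.

```python
import codecs


def utf16le_to_cp850(data: bytes, errors: str = "replace") -> bytes:
    """Decode UTF-16LE text and re-encode each character as one IBM850 byte."""
    # The utf-16-le codec does not strip a leading BOM, so drop it by hand.
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[len(codecs.BOM_UTF16_LE):]
    text = data.decode("utf-16-le", errors=errors)
    # Characters that IBM850 cannot represent become b"?" with errors="replace".
    return text.encode("cp850", errors=errors)
```

For instance, the pair FC 00 (U+00FC, which UTF-8 writes as C3 BC) comes back as the single byte 81, and A0 00 (U+00A0, C2 A0 in UTF-8) comes back as FF, which matches the ORIGIN CD / ORIGIN REBUILD sample above.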

7F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 6C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 01 20

In the file copied with netcat I have:

7F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 F5 00 FE 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 01 00 20 00

The first impression is a byte-to-UTF-16LE conversion. But I'm wondering why the single byte 6C would be converted to the three byte pairs F5 00 FE 00 01 00. What kind of conversion has happened? Have you ever seen it? I did the import with dd over netcat. I do not have the original file anymore, which is why I want to revert the conversion.
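
One way to locate spots like the 6C example systematically is to narrow the copy back to one byte per pair and align it against a known original. The sketch below uses Python's `difflib.SequenceMatcher`; the helper names are hypothetical, and `narrow` simply discards the second byte of every pair, so it is lossy wherever that byte is not 00. `SequenceMatcher` is slow on large inputs, so this is only meant for small windows such as the samples above.

```python
from difflib import SequenceMatcher


def narrow(copy: bytes) -> bytes:
    """Keep only the first (low) byte of each little-endian 2-byte unit."""
    return copy[0::2]


def differing_runs(original: bytes, candidate: bytes) -> list:
    """Return the non-matching opcodes as (tag, i1, i2, j1, j2) tuples.

    original[i1:i2] is the run in the original, candidate[j1:j2] the run
    that replaced it (or was inserted or deleted) in the candidate.
    """
    matcher = SequenceMatcher(None, original, candidate, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]
```

Running `differing_runs(original_window, narrow(copy_window))` on a short excerpt reports each place where one original byte turned into several, which makes it easier to collect all such substitutions and look for a pattern.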

The netcat command I used:

SHELL1

adb forward tcp:9999 tcp:9999
adb shell
su
dd if=/dev/block/nandxxx | nc -l -p 9999

SHELL2

nc localhost 9999 >nandxxx.img

To be clear about what I mean by "copy" and "reverted": original physical image -> copy (or copied image) -> reverted. What I would like to obtain is, of course, original physical image = reverted. If you need any more information, please tell me in the comments and I will update this question with the requested details. Thanks in advance.

UPDATE 29-03-2017:

I have reproduced the steps on a known ISO image, which is not the original one (that's where the statistics and the observations come from). I still observe the byte manipulation/insertion. Obviously, applying the IBM850 encoding to the copy does not yield a reverted version of the original Android image that is good enough for any recovery operation (carved images are far too blurred, too few system files are recovered, etc.).

Balsamic answered 20/3, 2017 at 16:28 Comment(6)
Isn't it more common to pipe into "netcat host port" and then have the listening netcat write its output to the file? - Buhl
I am struggling to understand what you have done there. Is my assumption correct that you opened a remote shell connection to an Android device and tried to back up one of the device's disk partitions over the very same port that also carries the client/server communication? I do not think that either dd (without the conv parameter) or nc does any character set conversion; both should leave the input unchanged. I think you really messed up because you used the exact same port as for the shell communication. So I guess there is no conversion problem as such, just trash characters. - Frigidarium
Sorry, I am not an Android developer and have never used adb before. I have re-read the documentation and think that the client/server communication for the interactive shell session does not use the forwarded port after all. So my guess was probably wrong. But now I am even more puzzled, because I just cannot imagine that any of the shell commands would have altered the binary partition dump. I would assume that your commands were maybe slightly different, or that there was another step you did not mention, such as accidentally opening the dump in a text editor and re-saving it. Just speculating. - Frigidarium
A couple of questions. First, you say you do not have the original, so how did you obtain those frequency counts? Second, nc does not do any character set conversion, so what is missing? Third, you still seem to be able to reproduce this problem; how are you doing that? - Embezzle
Thanks for your comment, john16384. I've edited the question, adding that I have reproduced the steps on a known Android ISO. - Balsamic
@Gorgia666, unless I'm missing something obvious, you still haven't provided any details on how exactly you reproduce this encoding. This is crucial, because if this is a known process, it might be much easier to reverse-engineer that process than to guess from the data. - Hide

The following commands were used to analyze the possibility of a byte-to-byte-pair cipher:

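# Extract only the occurrence counts from each table and de-duplicate them.
sed -rn "s/^.*=//p" <originByteOccurance |sort -u >/tmp/qqq1
sed -rn "s/^.*=//p" <destianationBytePairOccurance |sort -u >/tmp/qqq2
wc -l /tmp/qqq1 /tmp/qqq2
# /tmp/qqq1 256
# /tmp/qqq2 256
# Print any count that appears in both tables (nothing is printed here).
cat /tmp/qqq1 /tmp/qqq2 |sort |uniq -d
# Count the distinct first bytes of the destination pairs
# (read from the pair table itself, since /tmp/qqq2 holds only counts).
sed -rn "s/,.*//p" <destianationBytePairOccurance |sort -u |wc -l
# 230
# List the distinct second bytes of the destination pairs.
sed -rn "s/^..,//; s/=.*//p" <destianationBytePairOccurance |sort -u
# 00
# 01
# 20
# 25
# FE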

Although wc -l reveals that there are indeed 256 unique counts for both

byte occurrences in the origin and
byte pair occurrences in the destination,

there is no overlap between the origin and destination sets of occurrence counts (uniq -d prints nothing). The transformation is therefore not a plain one-to-one encoding of bytes to byte pairs, and an inverse transformation using a reverse map is not possible.

Block encryption is unlikely because of the sparse coverage of the most significant byte in the destination byte pair set: it takes only five distinct values.
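
The same two checks (shared occurrence counts, and coverage of the second byte of each pair) can be sketched in Python for readers who prefer it to the sed pipelines. The function names are hypothetical, and the input is assumed to be lines in the `XX=count` and `XX,YY=count` formats used in the question.

```python
def parse_table(lines) -> dict:
    """Parse 'KEY=count' lines into {key: count}; blank lines are skipped."""
    table = {}
    for line in lines:
        line = line.strip()
        if line:
            key, _, count = line.partition("=")
            table[key] = int(count)
    return table


def shared_counts(origin: dict, destination: dict) -> set:
    """Occurrence counts that appear in both tables (what `uniq -d` looks for)."""
    return set(origin.values()) & set(destination.values())


def high_bytes(destination: dict) -> list:
    """Distinct second bytes of the 'LO,HI' pair keys, sorted."""
    return sorted({key.split(",")[1] for key in destination})
```

With a few rows taken from the question, such as `FF=24575615` on the origin side and `A0,00=24806546`, `80,25=2915869` and `FF,FE=1` on the destination side, `shared_counts` returns an empty set and `high_bytes` returns `['00', '25', 'FE']`.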

If the origin images entering the transform can be controlled, then creating an experimental fixture that allows the sending of a series of 2 x 2 pixel images with pastel colors (sparse 1s among 0s) through the transform may be helpful in revealing more about the transform (and improve the likelihood of SO assistance).

These may be a good first set of pixel colors to try in these micro-images:

#000000
#00000f
#000f00
#0f0000
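
A minimal Python sketch of such a fixture follows. It assumes the simplest possible layout, raw 24-bit RGB pixel data with no header or container format, and the function names are hypothetical. Each payload would then be sent through the same transfer path and compared with what arrives.

```python
# The four suggested colours, as hex RGB triplets.
COLORS = ["000000", "00000f", "000f00", "0f0000"]


def raw_rgb_image(hex_color: str, width: int = 2, height: int = 2) -> bytes:
    """Return width*height identical pixels as raw RGB bytes (3 bytes each)."""
    pixel = bytes.fromhex(hex_color)
    if len(pixel) != 3:
        raise ValueError("expected a 6-digit hex colour such as '00000f'")
    return pixel * (width * height)


def build_fixtures(colors=COLORS) -> dict:
    """Map each colour to its 2 x 2 raw payload (12 bytes per image)."""
    return {color: raw_rgb_image(color) for color in colors}
```

The same helper covers the larger sizes mentioned below, for example `raw_rgb_image("0f0000", 16, 16)` for a 16 x 16 image.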

Upon examination of simpler results, several hypotheses may come to mind. Hypothetical models that produce the experimental results perfectly can then be tested with 16 x 16 pixel images to gain evidence for those hypotheses. The ideas that pass 16 x 16 can then be tried with 1600 x 900 HD images until a high level of confidence is established.

Eleemosynary answered 26/3, 2017 at 4:39 Comment(0)
