All right, I will answer my own question to spare others the same pain.
0. WARNING
If you are doing a recovery, ALWAYS COPY YOUR DATA and work
on the copy. Do NOT alter the original 'broken' data. That
said, keep reading.
1. Your partition looks like ...
Install The Sleuth Kit and testdisk. Hopefully there are packages for your distro :)
# mmls -t gpt LUN01
GUID Partition Table (EFI)
Offset Sector: 0
Units are in 512-byte sectors
     Slot    Start        End          Length       Description
00:  Meta    0000000000   0000000000   0000000001   Safety Table
01:  -----   0000000000   0000000033   0000000034   Unallocated
02:  Meta    0000000001   0000000001   0000000001   GPT Header
03:  Meta    0000000002   0000000033   0000000032   Partition Table
04:  00      0000000034   0000002081   0000002048   LDM metadata partition
05:  01      0000002082   0000262177   0000260096   Microsoft reserved partition
06:  02      0000262178   1048576966   1048314789   LDM data partition
07:  -----   1048576967   1048576999   0000000033   Unallocated
Note: testdisk gives you the same information in less detail:
# testdisk /list LUN01
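If you want to reuse the start and length of the LDM data partition in later commands, here is a minimal sketch that picks them out of the mmls output. The function name ldm_data_extent is made up for this example, and it assumes the column layout shown above (slot, tag, start, end, length, description).

```shell
# Print "START LENGTH" (in 512-byte sectors) of the LDM data partition,
# reading mmls output on stdin. Hypothetical helper, not part of Sleuth Kit.
ldm_data_extent() {
    while read -r _slot _tag _start _end _len _desc; do
        if [ "$_desc" = "LDM data partition" ]; then
            # Strip the zero padding so later shell arithmetic does not
            # treat the values as octal.
            _start=${_start#"${_start%%[!0]*}"}
            _len=${_len#"${_len%%[!0]*}"}
            echo "${_start:-0} ${_len:-0}"
        fi
    done
}

# Usage (LUN01 is the image name used in this post):
#   mmls -t gpt LUN01 | ldm_data_extent
```

With the listing above this should print 262178 and 1048314789, the two numbers used in section 3.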
2. Extract the disk metadata
All the information about the disk order, data size and other cryptic attributes
of the partition is stored in the LDM metadata partition. W2k8 has not
changed much since this document [2], although some sizes are different and some
attributes are new (and obviously unknown)...
# dd if=LUN01 skip=33 count=2048 |xxd -a > lun01.metadata
# less lun01.metadata
At line 0002410 you should see the name of the server. Reassuring? But we are
after the disk order and the disk IDs. Scroll down.
2.1. Disks Order
At line 0003210 you should see 'Disk1' followed by a long string.
0003200: 5642 4c4b 0000 001c 0000 0006 0000 0001 VBLK............
0003210: 0000 0034 0000 003a 0102 0544 6973 6b31 ...4...:...Disk1
0003220: 2437 3965 3830 3239 332d 3665 6231 2d31 $79e80293-6eb1-1
0003230: 3164 662d 3838 6463 2d30 3032 3662 3938 1df-88dc-0026b98
0003240: 3335 6462 3300 0000 0040 0000 0000 0000 35db3....@......
0003250: 0048 0000 0000 0000 0000 0000 0000 0000 .H..............
This means that the first disk of this volume is identified by the
following Unique ID (UID): 79e80293-6eb1-11df-88dc-0026b9835db3
But at the moment, we don't know which of the disks has this UID!
So move to the Disk2 entry and take note of its UID, and so on for
all the disks you had in your volume. Note: In my experience
only the first 8 characters change, the rest stays the same.
Indeed, W2k8 seems to increment the ID by 6. The $ in front of the UID
is not part of it: it appears to be the byte 0x24 (36), the length of
the string that follows, just as 0x05 precedes 'Disk1'.
E.g.:
Windows Disk1 UID : 79e80293-6eb1-11df-88dc-0026b9835db3
Windows Disk2 UID : 79e80299-...
Windows Disk3 UID : 79e8029f-...
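As a quick sanity check on the "+6" pattern, here is a small sketch that lists the UID prefixes you would expect for the following disks. The function name is made up, and the step of 6 is only what I observed, so treat the output as a hint and confirm each UID against the PRIVHEAD of the disk.

```shell
# Guess the first 8 hex characters of each disk UID from the first one.
# Assumes the +6 step observed above; it may not hold on other systems.
next_uid_prefixes() {
    _first=$1   # first 8 hex characters of Disk1's UID, e.g. 79e80293
    _count=$2   # number of disks in the volume
    _i=0
    while [ "$_i" -lt "$_count" ]; do
        printf '%08x\n' $(( 0x$_first + 6 * $_i ))
        _i=$(( $_i + 1 ))
    done
}

next_uid_prefixes 79e80293 3
# prints 79e80293, 79e80299 and 79e8029f, one per line
```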
2.2. Find Disk UID
Go to line 00e8200 (lun01.metadata). You should find 'PRIVHEAD'.
00e8200: 5052 4956 4845 4144 0000 2c41 0002 000c PRIVHEAD..,A....
00e8210: 01cc 6d37 2a3f c84e 0000 0000 0000 0007 ..m7*?.N........
00e8220: 0000 0000 0000 07ff 0000 0000 0000 0740 ...............@
00e8230: 3739 6538 3032 3939 2d36 6562 312d 3131 79e80299-6eb1-11
00e8240: 6466 2d38 3864 632d 3030 3236 6239 3833 df-88dc-0026b983
00e8250: 3564 6233 0000 0000 0000 0000 0000 0000 5db3............
00e8260: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00e8270: 3162 3737 6461 3230 2d63 3731 372d 3131 1b77da20-c717-11
00e8280: 6430 2d61 3562 652d 3030 6130 6339 3164 d0-a5be-00a0c91d
00e8290: 6237 3363 0000 0000 0000 0000 0000 0000 b73c............
00e82a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00e82b0: 3839 3164 3065 3866 2d64 3932 392d 3131 891d0e8f-d929-11
00e82c0: 6530 2d61 3861 372d 3030 3236 6239 3833 e0-a8a7-0026b983
00e82d0: 3564 6235 0000 0000 0000 0000 0000 0000 5db5............
00e82e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
What we are after is the disk UID of this particular disk. We see:
- Disk Id : 79e80299-6eb1-11df-88dc-0026b9835db3
- Host Id : 1b77da20-c717-11d0-a5be-00a0c91db73c
- Disk Group Id : 891d0e8f-d929-11e0-a8a7-0026b9835db5
So the disk with the UID 79e80299-... is Windows Disk2, although for us
it was Physical Disk 1. To confirm, look this UID up in the disk order
you found above.
Note: Windows decides the disk order, not you, so do not expect your
first physical disk to be Disk1 or the order to follow any human
logic. I recommend going through the LDM metadata of every disk
and extracting its UID. (You can use the following command to
extract just the PRIVHEAD info: dd if=LUNXX skip=1890 count=1 |xxd -a)
E.g.:
(Windows) Disk1 : 79e80293-... == Physical disk 2
(Windows) Disk2 : 79e80299-... == Physical disk 1
(Windows) Disk3 : 79e8029f-... == Physical disk 3
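To build this mapping without reading hex dumps by eye, something like the sketch below could print the Disk Id of each image. The function name is hypothetical. It assumes what the dump above shows: PRIVHEAD sits in sector 1890 and the Disk Id is a 36-character string starting 0x30 (48) bytes into that sector. Check both on your own disks first.

```shell
# Print the Disk Id stored in a PRIVHEAD sector.
# $1 = image or device, $2 = sector holding PRIVHEAD (default 1890, the
# value from this post; an assumption for any other disk).
privhead_disk_id() {
    dd if="$1" bs=512 skip="${2:-1890}" count=1 2>/dev/null |
        dd bs=1 skip=48 count=36 2>/dev/null
    echo
}

# Usage, with the image names used in this post:
#   for lun in LUN01 LUN02 LUN03; do
#       printf '%s  %s\n' "$lun" "$(privhead_disk_id "$lun")"
#   done
```

Comparing each printed UID with the Disk1, Disk2, ... entries from section 2.1 gives the Windows order of your physical disks.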
I am sure that somewhere in the LDM metadata you can find the type
of volume (spanned, RAID0, RAIDX) and the associated stripe sizes.
However, I haven't dug into it. I used a 'try and retry' method to find
my data. So if you know how your configuration was set up before
the drama, you will save yourself a lot of time.
3. Find the NTFS filesystem and your data
Now we are interested in the big chunk of data we want to restore.
In my case it's ~512GB of data, so we won't convert the whole thing to
ASCII. I haven't really looked into how Windows finds the beginning of
its NTFS partition. But what I found is that it logically starts
with the following keyword: R.NTFS. Let's find it, along with the
offset we will have to apply later to see our NTFS FS.
06: 02 0000262178 1048576966 1048314789 LDM data partition
In this example, the data starts at sector 262178 and is 1048314789 sectors long.
We found above that Disk1 (of the volume group) is actually the 2nd
physical disk. We will extract some of its data to find
where the NTFS partition starts.
# dd if=LUN02 skip=262178 count=4096 |xxd -a > lun02.DATASTART-4k
# less lun02.DATASTART-4k
0000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
*
00fbc00: eb52 904e 5446 5320 2020 2000 0208 0000 .R.NTFS .....
00fbc10: 0000 0000 00f8 0000 3f00 ff00 0008 0400 ........?.......
00fbc20: 0000 0000 8000 8000 ffaf d770 0200 0000 ...........p....
Here we can see that NTFS starts at offset 00fbc00 in the dump. Knowing that,
we can start extracting our data from sector 262178 plus 0xfbc00 bytes. Let's
convert from hexadecimal to decimal, and from bytes to sectors.
0xfbc00 bytes = 1031168 bytes = 1031168/512 sectors = 2014 sectors
So our NTFS partition starts at sector 262178 + 2014 = 264192.
This value is an offset we will use later on all disks.
Let's call it the NTFS offset.
Obviously the total size is reduced by the same 2014 sectors. So the new size is:
1048314789 - 2014 = 1048312775 sectors
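The same arithmetic can be left to the shell, which avoids slips in the hex conversion. This is only a sketch: the function name is made up, and it assumes the offset read from the xxd dump is a whole number of 512-byte sectors.

```shell
# Derive the NTFS offset and the remaining size, both in 512-byte sectors.
# $1 = start of the LDM data partition (sectors)
# $2 = length of the LDM data partition (sectors)
# $3 = xxd offset of the R.NTFS line, in hex without the 0x prefix
ntfs_geometry() {
    _skip=$(( 0x$3 / 512 ))
    echo "$(( $1 + $_skip )) $(( $2 - $_skip ))"
}

ntfs_geometry 262178 1048314789 fbc00
# prints: 264192 1048312775
```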
4. Try to mount/see the data
From now on, either it works out of the box because your NTFS partition is
healthy, or it doesn't because you are doing this to recover some data.
The process below is the same in both cases. All of the following is
based on [1] (see Links at the bottom).
A spanned volume fills one disk after another, whereas a striped (RAID0)
volume spreads chunks of data over several disks (i.e. a single file is spread
across many disks). In my case, I didn't know whether it was a spanned or a
striped volume. If your volume is not full, the easiest way to tell is to check
whether there are a lot of zeroes at the end of all your disks. If so, it is
striped, because a spanned volume fills the first disk, then the second, and
so on. I am not 100% sure of that but that's what I observed. So dd a bunch of
sectors from the end of the LDM data partition.
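Rather than paging through a hex dump, you can count the non-zero bytes in the tail of each data partition. A minimal sketch, with a made-up function name, that takes the partition start and length in sectors (as reported by mmls) and the number of sectors to sample:

```shell
# Count the non-zero bytes in the last N sectors of the LDM data partition.
# $1 = image or device, $2 = partition start, $3 = partition length,
# $4 = number of sectors to sample at the end (all in 512-byte sectors)
tail_nonzero_bytes() {
    dd if="$1" bs=512 skip=$(( $2 + $3 - $4 )) count="$4" 2>/dev/null |
        tr -d '\000' | wc -c | tr -d ' '
}

# Usage, with the numbers from this post:
#   tail_nonzero_bytes LUN01 262178 1048314789 2048
```

A result of 0 on every disk would point to a striped volume that is not full. If the first disks show data and only the last one is empty, spanned is more likely. This is the same observation as above, so the same caveat applies.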
4.0 Preparations to access your data
First attach your dd file or your device to a loopback device, using the NTFS
offset and the size we calculated above. Note that losetup expects the offset
and size in bytes, not in sectors.
offset = 264192*512 = 135266304
size = 1048312775*512 = 536736140800
# losetup -o 135266304 --sizelimit 536736140800 /dev/loop2 DDFILE_OR_DEVICE
# blockdev --getsize /dev/loop2
1048312775 <---- total size in sectors, same number as before
Note: you can add '-r' to set up the loop device read-only.
Do the above for all the physical disks that are part of your volume. Display the
result with: losetup -a
Note: If you don't have enough loop devices you can easily create more with:
# mknod -m0660 /dev/loopNUMBER b 7 NUMBER && chown root:disk /dev/loopNUMBER
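Since the same offset and size apply to every disk, the losetup commands can be generated instead of typed. The sketch below only prints the commands so you can review them before running anything. The function name is made up, loop device numbers simply count up from 1 (an assumption, adapt them to your setup), and it uses -r and --sizelimit as documented by util-linux losetup.

```shell
# Print (do not run) one read-only losetup command per image.
# $1 = NTFS offset in sectors, $2 = size in sectors, rest = images or devices
print_losetup_cmds() {
    _off=$(( $1 * 512 ))     # losetup wants bytes, not sectors
    _size=$(( $2 * 512 ))
    shift 2
    _n=1
    for _img in "$@"; do
        echo "losetup -r -o $_off --sizelimit $_size /dev/loop$_n $_img"
        _n=$(( $_n + 1 ))
    done
}

print_losetup_cmds 264192 1048312775 LUN01 LUN02 LUN03
# first line printed:
# losetup -r -o 135266304 --sizelimit 536736140800 /dev/loop1 LUN01
```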
Check your alignment by opening the first disk of the group (e.g. physical disk 2
in this example) to see if the first line is R.NTFS. If not, then your alignment
is wrong: verify your calculations above and try again. Or you are not looking at
the first Windows disk.
E.g.:
The first disk of the volume has been attached to /dev/loop2
# xxd /dev/loop2 |head
0000000: eb52 904e 5446 5320 2020 2000 0208 0000 .R.NTFS .....
0000010: 0000 0000 00f8 0000 3f00 ff00 0008 0400 ........?.......
All good. Let's move to the annoying part :)
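If you are trying several offsets or several disks, the same check can be scripted. A small sketch with a hypothetical function name: it relies only on what the dump above shows, namely that bytes 3 to 10 of the first sector read "NTFS" followed by four spaces.

```shell
# Succeed if bytes 3..10 of the device or image spell "NTFS    ".
has_ntfs_signature() {
    _sig=$(dd if="$1" bs=1 skip=3 count=8 2>/dev/null)
    [ "$_sig" = "NTFS    " ]
}

# Usage:
#   has_ntfs_signature /dev/loop2 && echo "aligned" || echo "wrong offset or wrong disk"
```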
4.1 Spanned
A spanned volume is simply a chain of disks: the first one is filled, then the
second one, and so on. Create a file which looks like this, e.g.:
# Offset into   Size of this   Raid type   Device       Start sector
# volume        device                                  of device
0               1048312775     linear      /dev/loop2   0
1048312775      1048312775     linear      /dev/loop1   0
2096625550      1048312775     linear      /dev/loop3   0
Notes:
- Remember to use the correct disk order (the one you found before), e.g. Physical Disk2
followed by Physical Disk1 and Physical Disk3.
- 2096625550 = 2*1048312775, and obviously if you have a fourth disk its offset
is 3 times the size.
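The table is regular enough to be generated, which avoids typos in the running offsets. A minimal sketch with a made-up function name: it assumes every disk contributes the same number of sectors, as in this example.

```shell
# Print a dmsetup "linear" table chaining the given devices.
# $1 = size of each device in sectors, rest = devices in Windows disk order
linear_table() {
    _size=$1
    shift
    _start=0
    for _dev in "$@"; do
        echo "$_start $_size linear $_dev 0"
        _start=$(( $_start + $_size ))
    done
}

linear_table 1048312775 /dev/loop2 /dev/loop1 /dev/loop3
# prints:
# 0 1048312775 linear /dev/loop2 0
# 1048312775 1048312775 linear /dev/loop1 0
# 2096625550 1048312775 linear /dev/loop3 0
```

Redirect the output to a file to get the config file used in 4.3.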
4.2 Striped
The problem with striped mode (aka RAID0) is that you must know your stripe
size. Apparently the default is 64k (in my case it was 128k, but I don't know if it
was tuned by the Windows sysadmin :). Anyway, if you don't know it, you just have to
try all the standard values and see which one gives you a viable
NTFS filesystem.
Create a file like the following for 3 disks with a chunk size of 128. Note that
dmsetup expresses the chunk size in 512-byte sectors, so 128 here means 64 KiB;
a 128 KiB stripe would be written as 256.
                     .-+--> 3 disks, chunk size 128 (in sectors)
0 3144938112 striped 3 128 /dev/loop2 0 /dev/loop3 0 /dev/loop1 0
  `---> total size of the volume `----------+-----------+---> disk order
/!\ : Size of the volume is not exactly the size we calculated before. dmsetup needs
a volume size divisible by the chunk size (aka stripe size) AND by the number
of disks in the volume. In our case, we have 3 disks of 1048312775 sectors.
So the 'normal' size is 1048312775*3=3144938325 sectors, but due to the above
constraint we round it down to a multiple of 128*3:
# echo "3144938325/(128*3)*(128*3)" | bc
3144938112
So 3144938112 sectors is the size of your volume in a striped scenario with 3 disks
and a chunk size of 128 (aka stripe size).
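Because you may have to try several stripe sizes, it helps to generate the table line. The sketch below uses a made-up function name and rounds the total down to a multiple of (chunk size x number of disks), which satisfies both divisibility constraints at once; rounding to a multiple of the chunk size alone can leave a total that is not divisible by the number of disks. The chunk size is passed in 512-byte sectors, which is how dmsetup reads it (its man page describes a chunk of 32 as 16k).

```shell
# Print a one-line dmsetup "striped" table.
# $1 = per-disk size (sectors), $2 = chunk size (sectors), rest = devices in order
striped_table() {
    _per_disk=$1
    _chunk=$2
    shift 2
    _ndisks=$#
    # Round the total down to a multiple of chunk * number of disks.
    _unit=$(( $_chunk * $_ndisks ))
    _total=$(( $_per_disk * $_ndisks / $_unit * $_unit ))
    _line="0 $_total striped $_ndisks $_chunk"
    for _dev in "$@"; do
        _line="$_line $_dev 0"
    done
    echo "$_line"
}

striped_table 1048312775 128 /dev/loop2 /dev/loop3 /dev/loop1
# prints: 0 3144938112 striped 3 128 /dev/loop2 0 /dev/loop3 0 /dev/loop1 0
```

To try other stripe sizes, pass twice the size in KiB as the chunk, e.g. 256 for 128 KiB, and test each resulting table in turn.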
4.3 Mount it.
Now let's aggregate everything together with dmsetup:
# dmsetup create myldm /path/myconfigfile
# dmsetup ls
myldm (253, 1)
# mount -t ntfs -o ro /dev/mapper/myldm /mnt
If it does not mount, you can use testdisk:
# testdisk /dev/mapper/myldm
--> Analyse
----> Quick search
------> You should see the volume name (if any). If not it seems compromised :)
--------> Press 'P' to see files and copy with 'c'
5. Conclusion
The above worked for me. Your mileage may vary, and there may be a better and
easier way to do it. If so, share it so nobody else has to go through this
hassle :) Also, it may look hard but it is not. As long as you have copied your
data somewhere, just try and retry until you can see something. It took me 3 days
to understand how to put all the bits together. Hopefully the above will save you
those 3 days.
Note: All examples above have been made up. There may be some inconsistencies
between the examples despite my thoroughness ;)
Good luck.
6. Links