I'm trying to merge multiple .bed files by matching on the first two columns, chr and start, following this question: "Merging multiple files with two common columns, and replace the blank to 0".
However, I'm wondering how to make each file name the header of its newly added column.
$ cat combineFWPS_02.sh
BEGIN {
    for (k=1; k<ARGC; ++k)
        s = s " " 0
}
FNR == 1 {
    ++ARGIND
}
{
    key=$1 OFS $2
    if (!(key in map))
        map[key] = s
    split(map[key], a)
    a[ARGIND] = $3
    v = ""
    for (k=1; k<ARGC; ++k)
        v = v " " a[k]
    map[key]=v
}
END {
    for (k in map)
        print k map[k]
}
$ cat comRwps_02.sh
awkCOM="$HOME/scripts/combineFWPS_02.sh"
## Run the jobs
time awk -f "$awkCOM" *.xyz.bed | sort -k1 > 13jLiC.xyz.txt
The input files look like this:
FF85561.xyz.bed:
chr1 111001 234
chr2 22099 108
chr5 463100 219
FF85574.xyz.bed:
chr1 111001 42
chr1 430229 267
chr5 663800 319
FF85631.xyz.bed:
chr1 111001 92
chr3 22099 144
chr5 663800 311
FF85717.xyz.bed:
chr1 111001 129
chr1 157901 79
chr2 22099 442
The expected output file would be:
$ head 13jLiC.xyz.txt
chr start FF85561 FF85574 FF85631 FF85717
chr1 111001 234 42 92 129
chr1 157901 0 0 0 79
chr1 430229 0 267 0 0
chr2 22099 108 0 0 442
chr3 22099 0 0 144 0
chr5 463100 219 0 0 0
chr5 663800 0 319 311 0
An awk program file named something.sh is a bit strange. – Oblivious
Use argind instead of ARGIND so that if it's ever run on a system that happens to have GNU awk it doesn't break (or need to be changed). – Estragon
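One possible way to get the header row, offered as a sketch rather than a definitive answer. It assumes the sample ID is everything before the first dot in the file name (FF85561.xyz.bed gives FF85561) and that each file is passed only once. The function name merge_bed is hypothetical. The header is printed by the shell before the sorted rows, so sort never sees it, and the column index is looked up from ARGV instead of relying on ARGIND.

```shell
#!/bin/sh
# merge_bed (hypothetical name): merge 3-column files on columns 1 and 2,
# using each file's base name up to the first dot as the column header.
merge_bed() (
    # Header row: "chr start" followed by one ID per input file.
    printf 'chr start'
    for f in "$@"; do
        b=${f##*/}                # strip any leading directory
        printf ' %s' "${b%%.*}"   # assumption: ID is the text before the first dot
    done
    printf '\n'

    # Data rows: one output column per input file, 0 where a key is absent.
    awk '
    BEGIN {
        # Map each file argument to its column position (portable, no ARGIND).
        for (i = 1; i < ARGC; i++)
            col[ARGV[i]] = i
        n = ARGC - 1
    }
    {
        key = $1 OFS $2
        keys[key]                         # remember every chr/start pair
        val[key, col[FILENAME]] = $3
    }
    END {
        for (key in keys) {
            line = key
            for (i = 1; i <= n; i++)
                line = line OFS (((key, i) in val) ? val[key, i] : 0)
            print line
        }
    }' "$@" | sort -k1,1 -k2,2n           # sort rows only, header stays on top
)
```

With the four sample files it could be called as merge_bed *.xyz.bed > 13jLiC.xyz.txt. Files with no lines still get a column of zeros, because the column positions come from the argument list rather than from a counter bumped on each file's first line.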