When I run this script I receive the error message: "sort: write failed: standard output: Broken pipe"
If someone can help me it would be awesome; I am going crazy with this error.
The input is a list of files that all contain DNA sequences in a FASTA format. Each file has several sequences, one sequence per line, with the fields: $1 (identifier), $2 to $8 (more values), $9 (the DNA sequence).
Then I want to select sequences according to the number of sequences ($common_hits) in each file (this number is not a fixed value, but I set 6 for the example):
- All the files with fewer than 6 sequences must be removed.
- Files with 6 sequences are OK.
- Files with more than 6 sequences have to be reduced to 6 sequences (these are selected by the highest values of field $5).
The output files must all have 6 sequences, and the sequence (field $9) has to be on the line after the identifier.
I am not removing the original files with more than 6 sequences for now, because I want to be sure it works.
    par_list=`ls -1 *BR`
    common_hits="6"
    for i in ${par_list}
    do
        if [ "`cat ${i} | wc -l`" -lt "${common_hits}" ]
        then
            rm -f ${i}
        elif [ "`cat ${i} | wc -l`" -gt "${common_hits}" ]
        then
            cat ${i} | sort -nr -k 5 | head -n ${common_hits} | \
                awk '{print $1" " $2" " $3" " $4" " $5" " $6" " $7" "$8 ; print $9}' > ${i}.ph
        fi
    done
The `cat` program calls are unnecessary; `wc` and `sort` both take an input filename as their rightmost parameter. – Quintero
You don't need `head | awk` either -- `awk` can do the work of picking out which lines to read itself. `awk '{ print ... } NR > 5 { exit }'` – Willson
No need for `ls`. The loop is better written (prone to fewer unexpected misbehaviors) as `for i in *BR`. – Willson
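
Putting the suggestions from these comments together, the loop might look roughly like the sketch below. It has not been run against the real data: `filter_hits` is a made-up helper name, and the script assumes whitespace-separated fields with exactly one sequence per line, as described in the question. The "Broken pipe" message most likely appears because `head` exits after reading its lines while `sort` is still writing. Here `awk` reads all of `sort`'s output and only prints the first N records, so the pipe is never closed early.

```shell
#!/bin/sh
# Sketch only: filter_hits is a hypothetical helper, not part of the original script.
# Usage: filter_hits N   (run from the directory that holds the *BR files)
filter_hits() {
    hits=$1
    for f in *BR; do
        [ -f "$f" ] || continue            # the glob matched nothing
        # Redirect into wc so it prints only the number, not "number filename";
        # $(( )) also strips the leading blanks some wc versions emit.
        count=$(( $(wc -l < "$f") ))
        if [ "$count" -lt "$hits" ]; then
            rm -f -- "$f"
        elif [ "$count" -gt "$hits" ]; then
            # sort reads the file itself; awk consumes every line but prints
            # only the first $hits records: fields 1-8, then field 9 on its own line.
            sort -k5,5nr "$f" |
                awk -v n="$hits" 'NR <= n { print $1, $2, $3, $4, $5, $6, $7, $8; print $9 }' > "$f.ph"
        fi
    done
}
```

Calling `filter_hits 6` would then do the same job as the original loop. As in the original, files with exactly 6 sequences are left untouched and get no `.ph` file.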