bash/unix toolchain binary stream processing/slicing

I have a binary stream on standard input in a fixed-size format: a continuous stream of packets, each with a header of length X and a body of length Y.

So if X=2 Y=6 then it's something like 00abcdef01ghijkl02mnopqr03stuvwx, but it's binary and both the header and data can contain any "characters" (including '\0' and newline), the example is just for readability.

I want to get rid of the header data so the output looks like this: abcdefghijklmnopqrstuvwx.

Are there any commands in the unix toolchain that allow me to do this? And in general are there any tools for handling binary data? The only tool I could think of is od/hexdump but how do you convert the result back to binary?

Euphrates answered 16/8, 2011 at 15:36 Comment(1)
Are these network packets? What about tcpdump? - Hallucinate
M
4

Use xxd, which converts to and from a hex dump.

xxd -c 123 -ps

will output your stream with 123 bytes per line. To reverse use

xxd -r -p

You should now be able to put this together with cut to drop characters since you can do something like

cut -c 3-

to get all characters from 3 to the end of a line. Each byte becomes two hex characters, so an X-byte header occupies the first 2X characters of each line and the cut has to start at character 2X+1.

So something along the lines of

xxd -c X+Y -ps | cut -c 2X+1- | xxd -r -p

where X+Y and 2X+1 are replaced with actual numerical values. You'll need to feed your data stream into the start of this pipeline.
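
As a concrete illustration of the pipeline above, here is a minimal sketch with the arithmetic filled in by the shell. The function name strip_headers and its two arguments are made up for this example, and it assumes an xxd that accepts the packet size as its -c value.

```shell
# strip_headers X Y: remove the X-byte header from each (X+Y)-byte packet on stdin.
# (strip_headers is a hypothetical helper name, not part of any tool.)
strip_headers() {
    hdr=$1
    body=$2
    xxd -p -c "$((hdr + body))" |      # one packet per line, two hex digits per byte
        cut -c "$((2 * hdr + 1))-" |   # skip the 2*hdr hex digits of the header
        xxd -r -p                      # turn the remaining hex back into bytes
}

# Sample data from the question (X=2, Y=6); prints abcdefghijklmnopqrstuvwx
printf '00abcdef01ghijkl02mnopqr03stuvwx' | strip_headers 2 6
```

Because the data travels through the middle of the pipeline as hex text, NUL bytes and newlines in the packets should not confuse cut.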

Meliorism answered 16/8, 2011 at 15:53 Comment(1)
So the idea was OK. I had never heard of xxd, thanks! Note: the cut starts at 2X+1. - Euphrates
C
1

Perl is a pretty standard Unix tool, so pipe the stream to perl. If the format is fixed-length and byte-aligned, a simple substr operation should work. Here is a Perl sample that should do it.

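#!/usr/bin/env perl

use strict;
use warnings;

my $len = 8;    # packet size in bytes (header + body)
my $off = 2;    # header size in bytes
my $buf = '';

binmode STDIN;
binmode STDOUT;

while (1) {
  # sysread may return fewer bytes than requested, so append until a packet is complete
  my $n = sysread(STDIN, $buf, $len - length($buf), length($buf));
  die "read failed: $!" unless defined $n;
  last if $n == 0;                  # end of input
  next if length($buf) < $len;      # packet still incomplete
  print substr($buf, $off);
  $buf = '';
}
# emit the body of a truncated final packet, if any
print substr($buf, $off) if length($buf) > $off;

exit 0;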

Coston answered 16/8, 2011 at 16:4 Comment(1)
True, but at least it's configurable ;) - Coston
C
1

As a one-liner, I'd write:

perl -00 -ne 'chomp; while (/(?:..)(......)/sg) {print $1}'

example:

echo '00abcdef01ghijkl02mnopqr03stuvw
00abcdef01ghi
kl02mnopqr' | perl -00 -ne 'chomp; while (/(?:..)(......)/sg) {print $1}' | od -c

produces

0000000   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p
0000020   q   r   s   t   u   v   w  \n   a   b   c   d   e   f   g   h
0000040   i  \n   k   l   m   n   o   p   q   r
0000052
Cenac answered 16/8, 2011 at 16:23 Comment(2)
This eats leading newlines, e.g. with echo -e "\n00abcdef01..". Any idea? - Euphrates
@yi_H, yeah, go with bot403's "sysread" answer. - Cenac
R
0

There's also bbe, the binary block editor, a sed-like tool for handling binary data the Unix way.

http://bbe-.sourceforge.net

Rosenblum answered 16/8, 2011 at 16:15 Comment(0)
I
0

The binary stream editor is a tool written in Java for handling streams. It can be used from Java as well as from the command line. https://sourceforge.net/projects/bistreameditor/

DISCLAIMER: I am the author of this tool.

Unlike newline-based tools such as sed, it allows custom traversal and data storage via the traverser and buffer. Binary data can be treated as one-byte characters, which allows string operations and matches. It can write to multiple outputs and use different encodings. Because of this flexibility, the command line currently takes a lot of parameters, which need to be simplified.

The bse.zip file should be downloaded and used. For the above example, we would simply need to do a substr(2) on the input of len 8. The full command line is

java -classpath "./bin:$CMN_LIB_PATH/commons-logging-1.1.1.jar:$CMN_LIB_PATH/commons-io-2.1.jar:$CMN_LIB_PATH/commons-jexl-2.1.1.jar:$CMN_LIB_PATH/commons-lang3-3.1.jar" 
-Dinputsrc=file:/fullpathtofile|URL|System.in 
-Dtraverser=org.milunsagle.io.streameditor.FixedLengthTraverser 
-Dtraversercons=size -Dtraverserconsarg0=8 
-Dbuffer=org.milunsagle.io.streameditor.CircularBuffer 
-Dbuffercons=size -Dbufferconsarg0=8 
-Dcommands='PRN V $$__INPUT.substring(2)' 
org.milunsagle.io.streameditor.BinaryStreamEditorInvoker
Immobility answered 29/8, 2016 at 6:29 Comment(0)
