How to process a 24-bit 3 channel color image with SSE2/SSE3/SSE4?

Asked 13/3, 2013 at 3:55 Answered 13/3, 2013 at 4:58

optimization opencv image-processing instructions sse2

I have just started using SSE2 to optimize image processing, but I have no idea how to handle 3-channel, 24-bit color images. My pixel data is arranged as BGR BGR BGR ..., with one 8-bit unsigned char per channel. If I want to implement Color2Gray with the SSE2/SSE3/SSE4 intrinsic C/C++ functions, how would I do it? Does my pixel data need to be aligned (4/8/16 bytes)? I have read this article: http://supercomputingblog.com/windows/image-processing-with-sse/ but it deals with ARGB, 4-channel, 32-bit color, so it processes exactly 4 pixels at a time. Thanks!

// Assume the original pixels:
unsigned char* pDataColor = (unsigned char*)malloc(src.width * src.height * 3); // 3 channels

// init every pixel value in pDataColor

// The dst pixels:
unsigned char* pDataGray = (unsigned char*)malloc(src.width * src.height * 1); // 1 channel

// RGB->Gray: Y = 0.212671*R + 0.715160*G + 0.072169*B
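
For reference, the formula above can be written as a plain scalar loop over the interleaved BGR data. This is only a baseline sketch to compare a SIMD version against; the function name bgr_to_gray_scalar and the round-to-nearest step are assumptions, not part of the original question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference: interleaved BGR in, one gray byte per pixel out.
   The function name and the rounding are illustrative assumptions. */
void bgr_to_gray_scalar(const uint8_t *bgr, uint8_t *gray, size_t pixel_count)
{
    assert(pixel_count == 0 || (bgr != NULL && gray != NULL));
    for (size_t i = 0; i < pixel_count; ++i) {
        const uint8_t b = bgr[3 * i + 0];
        const uint8_t g = bgr[3 * i + 1];
        const uint8_t r = bgr[3 * i + 2];
        const double y = 0.212671 * r + 0.715160 * g + 0.072169 * b;
        /* The weights sum to 1, so y stays within 0..255; round to nearest. */
        gray[i] = (uint8_t)(y + 0.5);
    }
}
```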

Parturient answered 13/3, 2013 at 3:55 Comment(0)

I have slides on de-interleaving of 24-bit RGB pixels, which explain how to do it with SSE2 and SSSE3.

Enforcement answered 13/3, 2013 at 4:58 Comment(2)

Yes, thank you. I have completed the conversion from RGB...RGB to RR...BB...GG, but I run into problems when I try to compute RGB->Gray: Y=0.212671*R + 0.715160*G + 0.072169*B. SSE registers are 128 bits wide, and there is no SSE intrinsic function that works like _mm_mul_xxx(a0,b0) with r0=a0*b0, r1=a1*b0, ..., r15=a15*b0 (where a0 holds 16 uchar values and b0 holds 2 double values). – Parturient 17/3, 2013 at 4:11

For color conversion typically fixed-point arithmetic is used instead of floating-point. Unpack 8-bit values into 16-bit values (using _mm_unpacklo_epi8/_mm_unpackhi_epi8), and then use integer multiplication of 16-bit values. – Enforcement 18/3, 2013 at 0:3
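
The fixed-point approach described in this comment could look roughly like the sketch below. It assumes the channels are already de-interleaved into separate R, G and B planes, and it assumes 8-bit fractional weights 54, 183 and 19 (the coefficients times 256, with B nudged up from 18 so the weights sum to 256). The function names weigh8 and planes_to_gray_sse2 are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h> /* SSE2 intrinsics */

/* Assumed fixed-point weights, scaled by 256; they sum to exactly 256. */
enum { WR = 54, WG = 183, WB = 19 };

/* Weighted sum for 8 pixels held as 16-bit lanes.
   The largest possible sum is 256*255 + 128 = 65408, so it fits in 16 bits. */
static __m128i weigh8(__m128i r, __m128i g, __m128i b)
{
    __m128i sum = _mm_mullo_epi16(r, _mm_set1_epi16(WR));
    sum = _mm_add_epi16(sum, _mm_mullo_epi16(g, _mm_set1_epi16(WG)));
    sum = _mm_add_epi16(sum, _mm_mullo_epi16(b, _mm_set1_epi16(WB)));
    sum = _mm_add_epi16(sum, _mm_set1_epi16(128)); /* round to nearest */
    return _mm_srli_epi16(sum, 8);                 /* drop the fraction bits */
}

/* Planar R, G, B in; gray out. Handles 16 pixels per iteration. */
void planes_to_gray_sse2(const uint8_t *r, const uint8_t *g, const uint8_t *b,
                         uint8_t *y, size_t n)
{
    assert(WR + WG + WB == 256);
    const __m128i zero = _mm_setzero_si128();
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        const __m128i vr = _mm_loadu_si128((const __m128i *)(r + i));
        const __m128i vg = _mm_loadu_si128((const __m128i *)(g + i));
        const __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* Widen each 8-bit channel to 16 bits by interleaving with zero. */
        const __m128i lo = weigh8(_mm_unpacklo_epi8(vr, zero),
                                  _mm_unpacklo_epi8(vg, zero),
                                  _mm_unpacklo_epi8(vb, zero));
        const __m128i hi = weigh8(_mm_unpackhi_epi8(vr, zero),
                                  _mm_unpackhi_epi8(vg, zero),
                                  _mm_unpackhi_epi8(vb, zero));
        /* Narrow the two 16-bit halves back to 16 bytes. */
        _mm_storeu_si128((__m128i *)(y + i), _mm_packus_epi16(lo, hi));
    }
    for (; i < n; ++i) /* scalar tail for the remaining pixels */
        y[i] = (uint8_t)((WR * r[i] + WG * g[i] + WB * b[i] + 128) >> 8);
}
```

With only 8 fractional bits the result can differ slightly from the floating-point formula. More fractional bits would need wider intermediates, or a different multiply such as _mm_mulhi_epu16.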

Here are some answers to your questions:

For how to use the SSE2 intrinsic C/C++ functions: these references may be helpful.
For the alignment: 16-byte alignment is required by the aligned load/store intrinsics such as _mm_load_si128 and _mm_store_si128, so when you use them you must make sure the memory address is 16-byte aligned. The unaligned variants, _mm_loadu_si128 and _mm_storeu_si128, accept any address. To align a variable with MSVC, use __declspec(align(16)); with GCC, use __attribute__((aligned(16))). These qualifiers apply to declared variables, not to the address returned by malloc.
- The reason why alignment matters can be found here: Why does instruction/data alignment exist?
For the 3-channel RGB conversion: I am not an image-processing expert, so I cannot give advice. There are also some open source image processing libraries that may already contain the code you want.
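
To make the alignment point concrete, here is a small sketch of checking an address before choosing between the aligned and unaligned SSE2 loads. The helper names (is_aligned16, load16_aligned, load16_any, sum_bytes) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h> /* SSE2 intrinsics */

/* Nonzero when p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Aligned load: only valid for 16-byte-aligned addresses. */
__m128i load16_aligned(const uint8_t *p)
{
    assert(is_aligned16(p));
    return _mm_load_si128((const __m128i *)p);
}

/* Unaligned load: valid for any address. */
__m128i load16_any(const uint8_t *p)
{
    return _mm_loadu_si128((const __m128i *)p);
}

/* Sum of the 16 bytes in v, used here only to inspect what was loaded. */
unsigned sum_bytes(__m128i v)
{
    /* _mm_sad_epu8 against zero yields one byte-sum per 64-bit half. */
    const __m128i s = _mm_sad_epu8(v, _mm_setzero_si128());
    return (unsigned)(_mm_cvtsi128_si32(s) + _mm_extract_epi16(s, 4));
}
```

For 24-bit pixels this matters in practice: a row of width*3 bytes is often not a multiple of 16, so the start of each row may be unaligned even when the buffer itself is aligned. In that case the unaligned loads are the simpler choice.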

Owlet answered 13/3, 2013 at 4:29 Comment(0)
