How do you implement a circular buffer in C?
Asked Answered
M

9

82

I have a need for a fixed-size (selectable at run-time when creating it, not compile-time) circular buffer which can hold objects of any type and it needs to be very high performance. I don't think there will be resource contention issues since, although it's in a multi-tasking embedded environment, it's a co-operative one so the tasks themselves can manage that.

My initial thought was to store a simple struct in the buffer which would contain the type (simple enum/define) and a void pointer to the payload but I want this to be as fast as possible so I'm open to suggestions that involve bypassing the heap.

Actually I'm happy to bypass any of the standard library for raw speed - from what I've seen of the code, it's not heavily optimized for the CPU : it looks like they just compiled C code for things like strcpy() and such, there's no hand-coded assembly.

Any code or ideas would be greatly appreciated. The operations required are:

  • create a buffer with specific size.
  • put at the tail.
  • get from the head.
  • return the count.
  • delete a buffer.
Minium answered 6/5, 2009 at 1:52 Comment(5)
Do you need a circular buffer or a queue? The required operations make it sound like queue. I admit that with the requirement of a fixed size using a circular buffer make sense, but I'm not sure the question title reflects your actual question.Ericson
I'm open to other data structures if you think they can be faster, but I'm reasonably certain a fixed-in-memory circular buffer will outperform malloc/free of the items in the queue. Although I guess I'm having to do malloc/free of the payload anyway: if I could do one malloc for the item and payload, that could be worth it.Minium
"if you think they can be faster"? - I'd suggest you would have to benchmark. BTW, what do you classify as 'very high performance'?Blame
I'll be benchmarking all the ideas (I have test data generation capabilities based on actual throughput). "High performance" will be anything that allows the current CPU to handle this new recently-increased load that the customer has seen fit to inflict on it :-)Minium
I'll clarify that. I don't need the performance of a quad-way latest-Intel-screamer CPU. It's running on an 8051 variant which isn't the fastest so I'm really just looking for optimization ideas to test out. If none of them pan out, client will have to make new hardware based on a different CPU and that's not going to be cheap. The item handling in the current queues has been identified as the primary bottleneck.Minium
P
12

Can you enumerate the types needed at the time you code up the buffer, or do you need to be able to add types at run time via dynamic calls? If the former, then I would create the buffer as a heap-allocated array of n structs, where each struct consists of two elements: an enum tag identifying the data type, and a union of all the data types. What you lose in terms of extra storage for small elements, you make up in terms of not having to deal with allocation/deallocation and the resulting memory fragmentation. Then you just need to keep track of the start and end indices that define the head and tail elements of the buffer, and make sure to compute mod n when incrementing/decrementing the indices.

Paralogism answered 6/5, 2009 at 2:45 Comment(3)
The types needed are enumerated, yes. There's only about six of them and that will never change. But the queues that hold the items must be able to hold all six types. I'm not sure about the storing of a whole item in the queue rather than a pointer - that means copying items (several hundred bytes) rather than a pointer. But I like the idea of a union - my primary concern here is speed rather than memory (we have enough of the latter, but the CPU is a shocker :-).Minium
Your answer's given me a good idea tho' - the memory for the items could be pre-allocated from malloc() and handed out from a mymalloc() purpose-built to handle just those memory blocks. And I could still just use pointers. +1 for that.Minium
You might or might not need to do extra copying, depending on the access pattern to the data. If you could build the items in place, and reference them while they are still in the buffer before popping, there might not be any extra copying. But it's certainly safer and more flexible to hand them out from your own allocator, and use a separate array of pointers (or indices) as the buffer.Paralogism
R
96

The simplest solution would be to keep track of the item size and the number of items, and then create a buffer of the appropriate number of bytes:

typedef struct circular_buffer
{
    void *buffer;     // data buffer
    void *buffer_end; // end of data buffer
    size_t capacity;  // maximum number of items in the buffer
    size_t count;     // number of items in the buffer
    size_t sz;        // size of each item in the buffer
    void *head;       // pointer to head
    void *tail;       // pointer to tail
} circular_buffer;

void cb_init(circular_buffer *cb, size_t capacity, size_t sz)
{
    cb->buffer = malloc(capacity * sz);
    if(cb->buffer == NULL)
        // handle error
    cb->buffer_end = (char *)cb->buffer + capacity * sz;
    cb->capacity = capacity;
    cb->count = 0;
    cb->sz = sz;
    cb->head = cb->buffer;
    cb->tail = cb->buffer;
}

void cb_free(circular_buffer *cb)
{
    free(cb->buffer);
    // clear out other fields too, just to be safe
}

void cb_push_back(circular_buffer *cb, const void *item)
{
    if(cb->count == cb->capacity){
        // handle error
    }
    memcpy(cb->head, item, cb->sz);
    cb->head = (char*)cb->head + cb->sz;
    if(cb->head == cb->buffer_end)
        cb->head = cb->buffer;
    cb->count++;
}

void cb_pop_front(circular_buffer *cb, void *item)
{
    if(cb->count == 0){
        // handle error
    }
    memcpy(item, cb->tail, cb->sz);
    cb->tail = (char*)cb->tail + cb->sz;
    if(cb->tail == cb->buffer_end)
        cb->tail = cb->buffer;
    cb->count--;
}
Roundish answered 6/5, 2009 at 2:18 Comment(4)
Very standard solution - exactly to the spec. that the OP included as what he was trying to avoid. :PAgone
Adam, this solution assumes that items of the same size must always occupy the same position in the buffer. I believe the OP required that any given input, stored at any location in the buffer, be any of 6 differently sized data-types. This leaves 2 alternatives. Use calloc() and realloc() space for each newly-arrived datum, or allocate a buffer large enough to hold the largest datum type at every position in the buffer. The latter approach, if feasible, would be faster and cleaner.Dielle
Could you please provide a way to initiate a circular buffer? Using this code : circular_buffer *cb; cb_init(cb, 20, 4); i get this error Process finished with exit code 139 (interrupted by signal 11: SIGSEGV) Thanks in advance and sorry for writing code in the comments.Clipfed
Just wanted to add a note that this implementation will not work for multi-threaded case i.e. producer and consumer are separate threads since both are accessing and modifying "count". We will need to protect "count".Megasporangium
P
16
// Note power of two buffer size
#define kNumPointsInMyBuffer 1024 

typedef struct _ringBuffer {
    UInt32 currentIndex;
    UInt32 sizeOfBuffer;
    double data[kNumPointsInMyBuffer];
} ringBuffer;

// Initialize the ring buffer
ringBuffer *myRingBuffer = (ringBuffer *)calloc(1, sizeof(ringBuffer));
myRingBuffer->sizeOfBuffer = kNumPointsInMyBuffer;
myRingBuffer->currentIndex = 0;

// A little function to write into the buffer
// N.B. First argument of writeIntoBuffer() just happens to have the
// same as the one calloc'ed above. It will only point to the same
// space in memory if the calloc'ed pointer is passed to
// writeIntoBuffer() as an arg when the function is called. Consider
// using another name for clarity
void writeIntoBuffer(ringBuffer *myRingBuffer, double *myData, int numsamples) {
    // -1 for our binary modulo in a moment
    int buffLen = myRingBuffer->sizeOfBuffer - 1;
    int lastWrittenSample = myRingBuffer->currentIndex;

    int idx;
    for (int i=0; i < numsamples; ++i) {
        // modulo will automagically wrap around our index
        idx = (i + lastWrittenSample) & buffLen; 
        myRingBuffer->data[idx] = myData[i];
    }

    // Update the current index of our ring buffer.
    myRingBuffer->currentIndex += numsamples;
    myRingBuffer->currentIndex &= myRingBuffer->sizeOfBuffer - 1;
}

As long as your ring buffer's length is a power of two, the incredibly fast binary "&" operation will wrap around your index for you. For my application, I'm displaying a segment of audio to the user from a ring buffer of audio acquired from a microphone.

I always make sure that the maximum amount of audio that can be displayed on screen is much less than the size of the ring buffer. Otherwise you might be reading and writing from the same chunk. This would likely give you weird display artifacts.

Prostatectomy answered 20/11, 2009 at 16:22 Comment(3)
I like the use of the & and using ringBuffer->sizeOfBuffer as a bit mask. I assume the problem with "weird display artifacts" is due to writing to the FIFO without checking to see if you're going to write over the tail before you do the write?Dielle
I've done some testing, and I think both ways of % and & are the same with this snippet: uint8_t tmp1,tmp2; tmp1 = (34 + 1) & 31; tmp2 = (35 ) % 32; printf("%d %d",tmp1,tmp2); So, what's the actual difference or it's just coding style?Kannry
@Kannry & and % achieve the same result only when the operand (32) is a power of two.Parlay
P
12

Can you enumerate the types needed at the time you code up the buffer, or do you need to be able to add types at run time via dynamic calls? If the former, then I would create the buffer as a heap-allocated array of n structs, where each struct consists of two elements: an enum tag identifying the data type, and a union of all the data types. What you lose in terms of extra storage for small elements, you make up in terms of not having to deal with allocation/deallocation and the resulting memory fragmentation. Then you just need to keep track of the start and end indices that define the head and tail elements of the buffer, and make sure to compute mod n when incrementing/decrementing the indices.

Paralogism answered 6/5, 2009 at 2:45 Comment(3)
The types needed are enumerated, yes. There's only about six of them and that will never change. But the queues that hold the items must be able to hold all six types. I'm not sure about the storing of a whole item in the queue rather than a pointer - that means copying items (several hundred bytes) rather than a pointer. But I like the idea of a union - my primary concern here is speed rather than memory (we have enough of the latter, but the CPU is a shocker :-).Minium
Your answer's given me a good idea tho' - the memory for the items could be pre-allocated from malloc() and handed out from a mymalloc() purpose-built to handle just those memory blocks. And I could still just use pointers. +1 for that.Minium
You might or might not need to do extra copying, depending on the access pattern to the data. If you could build the items in place, and reference them while they are still in the buffer before popping, there might not be any extra copying. But it's certainly safer and more flexible to hand them out from your own allocator, and use a separate array of pointers (or indices) as the buffer.Paralogism
D
10

First, the headline. You don't need modulo arithmetic to wrap the buffer if you use bit ints to hold the head & tail "pointers", and size them so they are perfectly in synch. IE: 4096 stuffed into a 12-bit unsigned int is 0 all by itself, unmolested in any way. Eliminating modulo arithmetic, even for powers of 2, doubles the speed - almost exactly.

10 million iterations of filling and draining a 4096 buffer of any type of data elements takes 52 seconds on my 3rd Gen i7 Dell XPS 8500 using Visual Studio 2010's C++ compiler with default inlining, and 1/8192nd of that to service a datum.

I'd RX rewriting the test loops in main() so they no longer control the flow - which is, and should be, controlled by the return values indicating the buffer is full or empty, and the attendant break; statements. IE: the filler and drainer should be able to bang against each other without corruption or instability. At some point I hope to multi-thread this code, whereupon that behavior will be crucial.

The QUEUE_DESC (queue descriptor) and initialization function forces all buffers in this code to be a power of 2. The above scheme will NOT work otherwise. While on the subject, note that QUEUE_DESC is not hard-coded, it uses a manifest constant (#define BITS_ELE_KNT) for its construction. (I'm assuming a power of 2 is sufficient flexibility here)

To make the buffer size run-time selectable, I tried different approaches (not shown here), and settled on using USHRTs for Head, Tail, EleKnt capable of managing a FIFO buffer[USHRT]. To avoid modulo arithmetic I created a mask to && with Head, Tail, but that mask turns out to be (EleKnt -1), so just use that. Using USHRTS instead of bit ints increased performance ~ 15% on a quiet machine. Intel CPU cores have always been faster than their buses, so on a busy, shared machine, packing your data structures gets you loaded and executing ahead of other, competing threads. Trade-offs.

Note the actual storage for the buffer is allocated on the heap with calloc(), and the pointer is at the base of the struct, so the struct and the pointer have EXACTLY the same address. IE; no offset required to be added to the struct address to tie up registers.

In that same vein, all of the variables attendant with servicing the buffer are physically adjacent to the buffer, bound into the same struct, so the compiler can make beautiful assembly language. You'll have to kill the inline optimization to see any assembly, because otherwise it gets crushed into oblivion.

To support the polymorphism of any data type, I've used memcpy() instead of assignments. If you only need the flexibility to support one random variable type per compile, then this code works perfectly.

For polymorphism, you just need to know the type and it's storage requirement. The DATA_DESC array of descriptors provides a way to keep track of each datum that gets put in QUEUE_DESC.pBuffer so it can be retrieved properly. I'd just allocate enough pBuffer memory to hold all of the elements of the largest data type, but keep track of how much of that storage a given datum is actually using in DATA_DESC.dBytes. The alternative is to reinvent a heap manager.

This means QUEUE_DESC's UCHAR *pBuffer would have a parallel companion array to keep track of data type, and size, while a datum's storage location in pBuffer would remain just as it is now. The new member would be something like DATA_DESC *pDataDesc, or, perhaps, DATA_DESC DataDesc[2^BITS_ELE_KNT] if you can find a way to beat your compiler into submission with such a forward reference. Calloc() is always more flexible in these situations.

You'd still memcpy() in Q_Put(),Q_Get, but the number of bytes actually copied would be determined by DATA_DESC.dBytes, not QUEUE_DESC.EleBytes. The elements are potentially all of different types/sizes for any given put or get.

I believe this code satisfies the speed and buffer size requirements, and can be made to satisfy the requirement for 6 different data types. I've left the many test fixtures in, in the form of printf() statements, so you can satisfy yourself (or not) that the code works properly. The random number generator demonstrates that the code works for any random head/tail combo.

enter code here
// Queue_Small.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <stdio.h>
#include <time.h>
#include <limits.h>
#include <stdlib.h>
#include <malloc.h>
#include <memory.h>
#include <math.h>

#define UCHAR unsigned char
#define ULONG unsigned long
#define USHRT unsigned short
#define dbl   double
/* Queue structure */
#define QUEUE_FULL_FLAG 1
#define QUEUE_EMPTY_FLAG -1
#define QUEUE_OK 0
//  
#define BITS_ELE_KNT    12  //12 bits will create 4.096 elements numbered 0-4095
//
//typedef struct    {
//  USHRT dBytes:8;     //amount of QUEUE_DESC.EleBytes storage used by datatype
//  USHRT dType :3; //supports 8 possible data types (0-7)
//  USHRT dFoo  :5; //unused bits of the unsigned short host's storage
// }    DATA_DESC;
//  This descriptor gives a home to all the housekeeping variables
typedef struct  {
    UCHAR   *pBuffer;   //  pointer to storage, 16 to 4096 elements
    ULONG Tail  :BITS_ELE_KNT;  //  # elements, with range of 0-4095
    ULONG Head  :BITS_ELE_KNT;  //  # elements, with range of 0-4095
    ULONG EleBytes  :8;     //  sizeof(elements) with range of 0-256 bytes
    // some unused bits will be left over if BITS_ELE_KNT < 12
    USHRT EleKnt    :BITS_ELE_KNT +1;// 1 extra bit for # elements (1-4096)
    //USHRT Flags   :(8*sizeof(USHRT) - BITS_ELE_KNT +1);   //  flags you can use
    USHRT   IsFull  :1;     // queue is full
    USHRT   IsEmpty :1;     // queue is empty
    USHRT   Unused  :1;     // 16th bit of USHRT
}   QUEUE_DESC;

//  ---------------------------------------------------------------------------
//  Function prototypes
QUEUE_DESC *Q_Init(QUEUE_DESC *Q, int BitsForEleKnt, int DataTypeSz);
int Q_Put(QUEUE_DESC *Q, UCHAR *pNew);
int Q_Get(UCHAR *pOld, QUEUE_DESC *Q);
//  ---------------------------------------------------------------------------
QUEUE_DESC *Q_Init(QUEUE_DESC *Q, int BitsForEleKnt, int DataTypeSz)    {
    memset((void *)Q, 0, sizeof(QUEUE_DESC));//init flags and bit integers to zero
    //select buffer size from powers of 2 to receive modulo 
    //                arithmetic benefit of bit uints overflowing
    Q->EleKnt   =   (USHRT)pow(2.0, BitsForEleKnt);
    Q->EleBytes =   DataTypeSz; // how much storage for each element?
    //  Randomly generated head, tail a test fixture only. 
    //      Demonstrates that the queue can be entered at a random point 
    //      and still perform properly. Normally zero
    srand(unsigned(time(NULL)));    // seed random number generator with current time
    Q->Head = Q->Tail = rand(); // supposed to be set to zero here, or by memset
    Q->Head = Q->Tail = 0;
    //  allocate queue's storage
    if(NULL == (Q->pBuffer = (UCHAR *)calloc(Q->EleKnt, Q->EleBytes)))  {
        return NULL;
    }   else    {
        return Q;
    }
}
//  ---------------------------------------------------------------------------
int Q_Put(QUEUE_DESC *Q, UCHAR *pNew)   
{
    memcpy(Q->pBuffer + (Q->Tail * Q->EleBytes), pNew, Q->EleBytes);
    if(Q->Tail == (Q->Head + Q->EleKnt)) {
        //  Q->IsFull = 1;
        Q->Tail += 1;   
        return QUEUE_FULL_FLAG; //  queue is full
    }
    Q->Tail += 1;   //  the unsigned bit int MUST wrap around, just like modulo
    return QUEUE_OK; // No errors
}
//  ---------------------------------------------------------------------------
int Q_Get(UCHAR *pOld, QUEUE_DESC *Q)   
{
    memcpy(pOld, Q->pBuffer + (Q->Head * Q->EleBytes), Q->EleBytes);
    Q->Head += 1;   //  the bit int MUST wrap around, just like modulo

    if(Q->Head == Q->Tail)      {
        //  Q->IsEmpty = 1;
        return QUEUE_EMPTY_FLAG; // queue Empty - nothing to get
    }
    return QUEUE_OK; // No errors
}
//
//  ---------------------------------------------------------------------------
int _tmain(int argc, _TCHAR* argv[])    {
//  constrain buffer size to some power of 2 to force faux modulo arithmetic
    int LoopKnt = 1000000;  //  for benchmarking purposes only
    int k, i=0, Qview=0;
    time_t start;
    QUEUE_DESC Queue, *Q;
    if(NULL == (Q = Q_Init(&Queue, BITS_ELE_KNT, sizeof(int)))) {
        printf("\nProgram failed to initialize. Aborting.\n\n");
        return 0;
    }

    start = clock();
    for(k=0; k<LoopKnt; k++)    {
        //printf("\n\n Fill'er up please...\n");
        //Q->Head = Q->Tail = rand();
        for(i=1; i<= Q->EleKnt; i++)    {
            Qview = i*i;
            if(QUEUE_FULL_FLAG == Q_Put(Q, (UCHAR *)&Qview))    {
                //printf("\nQueue is full at %i \n", i);
                //printf("\nQueue value of %i should be %i squared", Qview, i);
                break;
            }
            //printf("\nQueue value of %i should be %i squared", Qview, i);
        }
        //  Get data from queue until completely drained (empty)
        //
        //printf("\n\n Step into the lab, and see what's on the slab... \n");
        Qview = 0;
        for(i=1; i; i++)    {
            if(QUEUE_EMPTY_FLAG == Q_Get((UCHAR *)&Qview, Q))   {
                //printf("\nQueue value of %i should be %i squared", Qview, i);
                //printf("\nQueue is empty at %i", i);
                break;
            }
            //printf("\nQueue value of %i should be %i squared", Qview, i);
        }
        //printf("\nQueue head value is %i, tail is %i\n", Q->Head, Q->Tail);
    }
    printf("\nQueue time was %5.3f to fill & drain %i element queue  %i times \n", 
                     (dbl)(clock()-start)/(dbl)CLOCKS_PER_SEC,Q->EleKnt, LoopKnt);
    printf("\nQueue head value is %i, tail is %i\n", Q->Head, Q->Tail);
    getchar();
    return 0;
}
Dielle answered 15/12, 2012 at 0:29 Comment(7)
There's an even faster approach to curling the flat vector into a ring. Use a sentinel value at the end of the buffer, and when pHead, or pTail are == that pointer, reset each, in turn, back to buff[0]. About 70% of the time for powers-of-two, and 33% of the time for non powers-of-two. Time permitting, I'll put up some code.Gybe
Your code is broken for several reasons. First, the comparison Q->Tail == (Q->Head + Q->EleKnt) in your Q_Put method will never return true since Q->Head + Q->EleKnt isn't a modulo addition, meaning that you will simply overwrite the head on next write. If EleKnt is 4096, that's a value your Tail will never reach. Next, using this as a producer/consumer queue would wreak havoc, since your Q_Put "writes first, asks questions later" and updates the Tail even after realizing that the queue is full. Next call to Q_Put will simply overwrite the head like nothing ever happened.Croissant
You should analyze various algorithms presented at the Wikipedia page for Circular buffer, i.e. the full-or-empty buffer distinction problem. With your current approach, you need to keep one element less in your buffer to know the difference between full and empty.Croissant
@RocketRoy: I just did, Visual Studio 2012. And here, you can also check the results online (stdout is at the bottom), so that we are sure we're looking at the same code. The output of the test program is at the bottom of the page, and confirms what I wrote: it "works perfectly" as long as you don't try to fill it. Which is why it's useless as a FIFO buffer. :)Croissant
Ironically, I got to this post though your comment on this other answer, where you claimed that the snippet by @AdamDavis didn't work, and it's actually the other way around. Note also how Adam returns from Put as soon as he detects that it's full, while you nevertheless copy the data and then do the check.Croissant
Ironically, his code has a bad bug in it, one you'd find if you put this code up on a debugger and watch the ring, instead of insisting your logic is correct. Your bad.Dielle
@RocketRoy: so you are actually telling me you still don't agree that your code is broken? Yes, I am pretty sure your code doesn't "move the gap" around, since it simply works perfectly and overwrites the head without a way to detect when full. I hope you don't use this in any life critical systems, you might get in serious trouble.Croissant
U
9

Here is a simple solution in C. Assume interrupts are turned off for each function. No polymorphism & stuff, just common sense.


#define BUFSIZE 128
char buf[BUFSIZE];
char *pIn, *pOut, *pEnd;
char full;

// init
void buf_init()
{
    pIn = pOut = buf;       // init to any slot in buffer
    pEnd = &buf[BUFSIZE];   // past last valid slot in buffer
    full = 0;               // buffer is empty
}

// add char 'c' to buffer
int buf_put(char c)
{
    if (pIn == pOut  &&  full)
        return 0;           // buffer overrun

    *pIn++ = c;             // insert c into buffer
    if (pIn >= pEnd)        // end of circular buffer?
        pIn = buf;          // wrap around

    if (pIn == pOut)        // did we run into the output ptr?
        full = 1;           // can't add any more data into buffer
    return 1;               // all OK
}

// get a char from circular buffer
int buf_get(char *pc)
{
    if (pIn == pOut  &&  !full)
        return 0;           // buffer empty  FAIL

    *pc = *pOut++;              // pick up next char to be returned
    if (pOut >= pEnd)       // end of circular buffer?
        pOut = buf;         // wrap around

    full = 0;               // there is at least 1 slot
    return 1;               // *pc has the data to be returned
}
Uncoil answered 26/12, 2012 at 22:28 Comment(1)
pIn > pEnd should be used instead of pIn >= pEnd otherwise you'll never fill the last slot of the buf; same goes for pOut >= pEndShedd
S
2

A simple implementation could consist of:

  • A buffer, implemented as an array of size n, of whatever type you need
  • A read pointer or index (whichever is more efficient for your processor)
  • A write pointer or index
  • A counter indicating how much data is in the buffer (derivable from the read and write pointers, but faster to track it separately)

Every time you write data, you advance the write pointer and increment the counter. When you read data, you increase the read pointer and decrement the counter. If either pointer reaches n, set it to zero.

You can't write if counter = n. You can't read if counter = 0.

Semilunar answered 6/5, 2009 at 12:14 Comment(3)
How is the counter derivable from the pointers for the case when both the read and the write pointer point to the same location? In that case the buffer could be either empty or full and a counter would be needed (or a flag storing whether the buffer is full or not).Bove
@DimitriosDedoussis: either of those suggestions - or you say that the buffer is empty if the pointers point to the same location, and the buffer is full if the write pointer points to the location immediately before the read pointer (and you don't permit the write pointer to advance to match the read pointer).Semilunar
Exactly. In that case the buffer has to be of length n+1, in order to allow a capacity of n. What I wanted to highlight in the first place was that the counter is not derivable from the pointers, unless one does some workarounds in their buffer mechanics (such as increasing the length of the buffer), or adding a stateful boolean flag.Bove
R
2

C style, simple ring buffer for integers. First use init than use put and get. If buffer does not contain any data it returns "0" zero.

//=====================================
// ring buffer address based
//=====================================
#define cRingBufCount   512
int     sRingBuf[cRingBufCount];    // Ring Buffer
int     sRingBufPut;                // Input index address
int     sRingBufGet;                // Output index address
Bool    sRingOverWrite;

void    GetRingBufCount(void)
{
int     r;
`       r= sRingBufPut - sRingBufGet;
        if ( r < cRingBufCount ) r+= cRingBufCount;
        return r; 
}

void    InitRingBuffer(void)
{
        sRingBufPut= 0;
        sRingBufGet= 0;
}       

void    PutRingBuffer(int d)
{
        sRingBuffer[sRingBufPut]= d;
        if (sRingBufPut==sRingBufGet)// both address are like ziro
        {
            sRingBufPut= IncRingBufferPointer(sRingBufPut);
            sRingBufGet= IncRingBufferPointer(sRingBufGet);
        }
        else //Put over write a data
        {
            sRingBufPut= IncRingBufferPointer(sRingBufPut);
            if (sRingBufPut==sRingBufGet)
            {
                sRingOverWrite= Ture;
                sRingBufGet= IncRingBufferPointer(sRingBufGet);
            }
        }
}

int     GetRingBuffer(void)
{
int     r;
        if (sRingBufGet==sRingBufPut) return 0;
        r= sRingBuf[sRingBufGet];
        sRingBufGet= IncRingBufferPointer(sRingBufGet);
        sRingOverWrite=False;
        return r;
}

int     IncRingBufferPointer(int a)
{
        a+= 1;
        if (a>= cRingBufCount) a= 0;
        return a;
}
Rascal answered 12/10, 2015 at 22:13 Comment(0)
B
1

@Adam Rosenfield's solution, although correct, could be implemented with a more lightweight circular_buffer structure that does not involve count and capacity.

The structure could only hold the following 4 pointers:

  • buffer: Points to the start of the buffer in memory.
  • buffer_end: Points to the end of the buffer in memory.
  • head: Points to the end of stored data.
  • tail: Points to the start of stored data.

We could keep the sz attribute to allow the parametrisation of the unit of storage.

Both the count and the capacity values should be derive-able using the above pointers.

Capacity

capacity is straight forward, as it can be derived by dividing the distance between the buffer_end pointer and the buffer pointer by the unit of storage sz (snippet below is pseudocode):

capacity = (buffer_end - buffer) / sz

Count

For count though, things get a bit more complicated. For example, there is no way to determine whether the buffer is empty or full, in the scenario of head and tail pointing to the same location.

To tackle that, the buffer should allocate memory for an additional element. For example, if the desired capacity of our circular buffer is 10 * sz, then we need to allocate 11 * sz.

Capacity formula will then become (snippet below is pseudocode):

capacity_bytes = buffer_end - buffer - sz
capacity = capacity_bytes / sz

This extra element semantic allows us to construct conditions that evaluate whether the buffer is empty or full.

Empty state conditions

In order for the buffer to be empty, the head pointer points to the same location as the tail pointer:

head == tail

If the above evaluates to true, the buffer is empty.

Full state conditions

In order for the buffer to be full, the head pointer should be 1 element behind the tail pointer. Thus, the space needed to cover in order to jump from the head location to the tail location should be equal to 1 * sz.

if tail is larger than head:

tail - head == sz

If the above evaluates to true, the buffer is full.

if head is larger than tail:

  1. buffer_end - head returns the space to jump from the head to the end of the buffer.
  2. tail - buffer returns the space needed to jump from the start of the buffer to the `tail.
  3. Adding the above 2 should equal to the space needed to jump from the head to the tail
  4. The space derived in step 3, should not be more than 1 * sz
(buffer_end - head) + (tail - buffer) == sz
=> buffer_end - buffer - head + tail == sz
=> buffer_end - buffer - sz == head - tail
=> head - tail == buffer_end - buffer - sz
=> head - tail == capacity_bytes

If the above evaluates to true, the buffer is full.

In practice

Modifying @Adam Rosenfield's to use the above circular_buffer structure:

#include <string.h>

#define CB_SUCCESS 0        /* CB operation was successful */
#define CB_MEMORY_ERROR 1   /* Failed to allocate memory */
#define CB_OVERFLOW_ERROR 2 /* CB is full. Cannot push more items. */
#define CB_EMPTY_ERROR 3    /* CB is empty. Cannot pop more items. */

typedef struct circular_buffer {
  void *buffer;
  void *buffer_end;
  size_t sz;
  void *head;
  void *tail;
} circular_buffer;

int cb_init(circular_buffer *cb, size_t capacity, size_t sz) {
  const int incremented_capacity = capacity + 1; // Add extra element to evaluate count
  cb->buffer = malloc(incremented_capacity * sz);
  if (cb->buffer == NULL)
    return CB_MEMORY_ERROR;
  cb->buffer_end = (char *)cb->buffer + incremented_capacity * sz;
  cb->sz = sz;
  cb->head = cb->buffer;
  cb->tail = cb->buffer;
  return CB_SUCCESS;
}

int cb_free(circular_buffer *cb) {
  free(cb->buffer);
  return CB_SUCCESS;
}

const int _cb_length(circular_buffer *cb) {
  return (char *)cb->buffer_end - (char *)cb->buffer;
}

int cb_push_back(circular_buffer *cb, const void *item) {
  const int buffer_length = _cb_length(cb);
  const int capacity_length = buffer_length - cb->sz;

  if ((char *)cb->tail - (char *)cb->head == cb->sz ||
      (char *)cb->head - (char *)cb->tail == capacity_length)
    return CB_OVERFLOW_ERROR;

  memcpy(cb->head, item, cb->sz);

  cb->head = (char*)cb->head + cb->sz;
  if(cb->head == cb->buffer_end)
    cb->head = cb->buffer;

  return CB_SUCCESS;
}

int cb_pop_front(circular_buffer *cb, void *item) {
  if (cb->head == cb->tail)
    return CB_EMPTY_ERROR;

  memcpy(item, cb->tail, cb->sz);

  cb->tail = (char*)cb->tail + cb->sz;
  if(cb->tail == cb->buffer_end)
    cb->tail = cb->buffer;

  return CB_SUCCESS;
}

Bove answered 23/12, 2020 at 14:44 Comment(0)
M
0

Extending adam-rosenfield's solution, i think the following will work for multithreaded single producer - single consumer scenario.

int cb_push_back(circular_buffer *cb, const void *item)
{
  void *new_head = (char *)cb->head + cb->sz;
  if (new_head == cb>buffer_end) {
      new_head = cb->buffer;
  }
  if (new_head == cb->tail) {
    return 1;
  }
  memcpy(cb->head, item, cb->sz);
  cb->head = new_head;
  return 0;
}

int cb_pop_front(circular_buffer *cb, void *item)
{
  void *new_tail = cb->tail + cb->sz;
  if (cb->head == cb->tail) {
    return 1;
  }
  memcpy(item, cb->tail, cb->sz);
  if (new_tail == cb->buffer_end) {
    new_tail = cb->buffer;
  }
  cb->tail = new_tail;
  return 0;
}
Megasporangium answered 2/8, 2020 at 19:3 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.