ArraySegment - Returning the actual segment C#
Asked Answered
R

4

19

I have been looking around on ways to return the segment which is basically held by ArraySegment in terms of offset and count. Although ArraySegment holds the complete and original array, it just delimits it with the fact that any changes to the segment are reflected into the original. The problem or say the limitation with ArraySegment is that it will not return the segment itself as a whole and I have to traverse the values. What would be the best way to return the segment as a whole?

 byte[] input = new byte[5]{1,2,3,4,5};
 ArraySegment<byte> delimited = new ArraySegment<byte>(input,0,2);
 byte[] segment = HERE I NEED SOMETHING THAT WILL RETURN THE SEGMENT i.e. [0,1,2]

The most important point, segment must not be a copy but should refer to the original array. If any changes to segment are done, they must be reflected in the original array.

Any tips are highly appreciated, thanks!

ASSIGNMENT BENCHMARKS : After some answers from Thomas and digEmAll

Ok, I ran some benchmarks against the code from digEmAll and Thomas, and to my surprise the code is overwhelmingly faster. Just what I was desperately looking for. Here are the results.

Construct             Size    Elements assigned    Iterations       Time
_______________________________________________________________________________

ArraySegmentWrapper   1500        1500              1000000       396.3 ms
Array.Copy            1500        1500              1000000       4389.04 ms

As you can see the whopping difference, it is very clear to me that I shall be using the code for ArraySegment. Below is the benchmarking code. Please note that this might be a bit biased as people would argue why "new" has been put inside a loop. I am just trying to reproduce the situation I currently have at hand at resolve it as much as possible without moving much of the code. This just made my day!

namespace ArraySegmentWrapped
{
    class Program
    {

        public static Stopwatch stopWatch = new Stopwatch();
        public static TimeSpan span = new TimeSpan();
        public static double totalTime = 0.0;
        public static int iterations = 1000000;

        static void Main(string[] args)
        {
            int size = 1500;
            int startIndex = 0;
            int endIndex = 1499;
            byte[] array1 = new byte[size];
            byte[] array2 = null;

            for (int index = startIndex; index < size; index++)
            {
                array1[index] = (byte)index;
            }

            ArraySegmentWrapper<byte> arraySeg;

            for (int index = 0; index < iterations; index++)
            {
                stopWatch.Start();
                arraySeg = new ArraySegmentWrapper<byte>(array1, startIndex, endIndex);            
                stopWatch.Stop();
                totalTime += stopWatch.Elapsed.TotalMilliseconds;
            }

            Console.WriteLine("ArraySegment:{0:F6}", totalTime / iterations);
            stopWatch.Reset();
            totalTime = 0.0;

            for (int index = 0; index < iterations; index++)
            {
                stopWatch.Start();
                array2 = new byte[endIndex - startIndex + 1];
                Array.Copy(array1, startIndex, array2, 0, endIndex);
                stopWatch.Stop();
                totalTime += stopWatch.Elapsed.TotalMilliseconds;
            }
            Console.WriteLine("Array.Copy:{0:F6}", totalTime / iterations);                        


        }
    }
// Code for ArraySegmentWrapper goes here    

}

ACCESS BENCHMARKS (Updated) So after what Thomas had pointed about the benchmarks and said that access to simple arrays would be faster as compared to ArraySegment, he was totally right. But with digEmAll pointing out that I should test in Release mode (sorry for the old mistake of testing in debug mode), I left the code almost the same as above(iterations reduced by two zeros - could not wait very long for the output to come, sorry) and some modifications to access the same number of elements, Below is what I got.

Construct             Size    Elements accessed    Iterations       Time
_______________________________________________________________________________

ArraySegmentWrapper   1500        1500              1000000       5268.3 ms
Array.Copy            1500        1500              1000000       4812.4 ms

Concluded that although assingment is very fast, access is slow via ArraySegments.

Rummage answered 22/4, 2011 at 15:0 Comment(17)
Your benchmark doesn't really make sense: you're only creating an instance of ArraySegmentWrapper, which doesn't perform any actual work, so of course it's faster than copying data between two arrays... What you should measure is the performance of access to the data via the ArraySegmentWrapper vs directly in the copied array. The array will probably be faster, but of course you have to copy the data first, which isn't free. So you have to make a choice between creation time and access time...Eraeradiate
@Thomas, well when I am creating an instance of ArraySegmentWrapper, isn't the segment being created as well? With the segment being put in arraySeg. That is where the work is being done as far as I see it.Rummage
@Wajih, the segment costs nothing to create: it just stores a reference to the array, an offset and a count... It doesn't actually do anything.Eraeradiate
@Thomas, exactly, spot on and that is what was needed :)Rummage
So your benchmark isn't really conclusive: what you should test is access to the data. The ArraySegmentWrapper is cheap to create, but (slightly) slower to access the data. The array copy is more expensive to create, but faster to access the data.Eraeradiate
@Thomas, well I might need to test that as well.Rummage
@Thomas is right, the benchmark is obviously biased to the wrapper because it does basically nothing vs an Array.Copy(). Of course, if you need to reference always the same array (or a part of it) this can simplify the job... just keep in mind that you're exploiting mutability to the nth degree (so no thread safety etc...)Stour
@Thomas and digEmAll, thanks guys for your valuable input. Got to learn something new today from you.Rummage
@Wajih: You're welcome. BTW, segment wrapper access should be slower but not so slower (i guess)... did you test it in release mode detached from VS ?Stour
@digEmAll, oh, it was debug mode. Let me do it in Release.Rummage
I agree with digEmAll, the difference shouldn't be that big... Did you just enumerate the content, or access it by index?Eraeradiate
@Thomas, I accessed it by index.Rummage
@Thomas and digEmAll. Well the results are much sensible now. I guess there is some difference. But not as much that I would have thought.Rummage
@Wajih: well, optimizer and jitter make their work after all :DStour
@digEmAll. Yup sure they do make them really work!Rummage
I created an ArraySegment clone (not a wrapper) and added an indexer and the performance was only slightly less than with Array.Copy. As I increased the size of the array though, eventually the ArraySegment clone was faster. In either case - the Array.Copy method does use more memory (at least in the short term). Eventually the old array will be garbage collected(if there are no references to it anywhere back up the call stack). You may have to consider how much memory your server has(IF this is a web app). In most cases this is a micro optimization either way so use whatever is more elegant.Manx
@Manx - Mind sharing the benchmarks? Post your answer and benchmarks as an answer. May be you can put a side by side benchmark compared with mine?Rummage
S
8

Starting from Thomas Levesque's suggestion I've built a simple ArraySegmentWrapper<T> class to use in this way:

static void Main(string[] args)
{
    int[] arr = new int[10];
    for (int i = 0; i < arr.Length; i++)
        arr[i] = i;

    // arr = 0,1,2,3,4,5,6,7,8,9

    var segment = new ArraySegmentWrapper<int>(arr, 2, 7);
    segment[0] = -1;
    segment[6] = -1;
    // now arr = 0,1,-1,3,4,5,6,7,-1,9


    // this prints: -1,3,4,5,6,7,-1
    foreach (var el in segment)
        Console.WriteLine(el);
}

Implementation:

public class ArraySegmentWrapper<T> : IList<T>
{
    private readonly ArraySegment<T> segment;

    public ArraySegmentWrapper(ArraySegment<T> segment)
    {
        this.segment = segment;
    }

    public ArraySegmentWrapper(T[] array, int offset, int count)
        : this(new ArraySegment<T>(array, offset, count))
    {
    }

    public int IndexOf(T item)
    {
        for (int i = segment.Offset; i < segment.Offset + segment.Count; i++)
            if (Equals(segment.Array[i], item))
                return i;
        return -1;
    }

    public void Insert(int index, T item)
    {
        throw new NotSupportedException();
    }

    public void RemoveAt(int index)
    {
        throw new NotSupportedException();
    }

    public T this[int index]
    {
        get
        {
            if (index >= this.Count)
                throw new IndexOutOfRangeException();
            return this.segment.Array[index + this.segment.Offset];
        }
        set
        {
            if (index >= this.Count)
                throw new IndexOutOfRangeException();
            this.segment.Array[index + this.segment.Offset] = value;
        }
    }

    public void Add(T item)
    {
        throw new NotSupportedException();
    }

    public void Clear()
    {
        throw new NotSupportedException();
    }

    public bool Contains(T item)
    {
        return this.IndexOf(item) != -1;
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        for (int i = segment.Offset; i < segment.Offset + segment.Count; i++)
        {
            array[arrayIndex] = segment.Array[i];
            arrayIndex++;
        }
    }

    public int Count
    {
        get { return this.segment.Count; }
    }

    public bool IsReadOnly
    {
        get { return false; }
    }

    public bool Remove(T item)
    {
        throw new NotSupportedException();
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = segment.Offset; i < segment.Offset + segment.Count; i++)
            yield return segment.Array[i];
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

EDIT :

As pointed out by @JeppeStigNielsen in the comments, since .NET 4.5 ArraySegment<T> implements IList<T>

Stour answered 22/4, 2011 at 15:42 Comment(7)
Well, its a gem for me, I will do some profiling now and see what I can come up with.Rummage
Why are you using an ArraySegment as your backing class? That class is only like 20 lines of code - you are causing a lot of performance loss with all the de-referencing that has to be done. Also you have redundant code checking the array indexes and such. I modified your code to just use an array reference and accessing it was very close to as fast as just using a plain array since it doesn't have to do doouble and triple de-referencing and constantly access property getters. I know that these are micro optimizations, but they do matter in some instances.Manx
@RepDbg: that was just an example. Of course, since ArraySegment is really simple you can easily implement it directly inside the wrapper. Anyway, I harldy think that performances will change so much, because compiler and jitter usually inline a lot of method-calls. Probably considering only the relative performance gain the difference will be big, but in an absolute view, we're talking about crumbs of seconds... :)Stour
I really like this wrapper! I'm trying to find the best way to pass an ArraySegment to a Sha-256 hashing class. That means I can either convert this to an Array or to a Stream. I'm not sure which has less overhead... or if they end up being the same...Vezza
@makerofthings7: Well, converting to an array you will have the overhead for the array creation, while if you build an ad-hoc stream class for ArraySegmentWrapper you won't. On the other hand, implement a custom stream is not as straightforward as copying an array...Stour
Note: Since .NET 4.5 from Aug 2012, the ArraySegment<> struct implements IList<> in itself.Treponema
@JeppeStigNielsen: thanks, added a note in the answerStour
E
7

I use the following set of extension methods to work with array segments:

    #region ArraySegment related methods

    public static ArraySegment<T> GetSegment<T>(this T[] array, int from, int count)
    {
        return new ArraySegment<T>(array, from, count);
    }

    public static ArraySegment<T> GetSegment<T>(this T[] array, int from)
    {
        return GetSegment(array, from, array.Length - from);
    }

    public static ArraySegment<T> GetSegment<T>(this T[] array)
    {
        return new ArraySegment<T>(array);
    }

    public static IEnumerable<T> AsEnumerable<T>(this ArraySegment<T> arraySegment)
    {
        return arraySegment.Array.Skip(arraySegment.Offset).Take(arraySegment.Count);
    }

    public static T[] ToArray<T>(this ArraySegment<T> arraySegment)
    {
        T[] array = new T[arraySegment.Count];
        Array.Copy(arraySegment.Array, arraySegment.Offset, array, 0, arraySegment.Count);
        return array;
    }

    #endregion

You can use them as follows:

byte[] input = new byte[5]{1,2,3,4,5};
ArraySegment<byte> delimited = input.GetSegment(0, 2);
byte[] segment = delimited.ToArray();
Eraeradiate answered 22/4, 2011 at 15:3 Comment(6)
Will this not be a copy of the segment? Which means any changes to segment are not reflected in the original array?Rummage
Yes, it will be a copy. Isn't it what you want? There is no way to have a different array that points to the same data... However you could create an implementation of IList that accesses a segment of the array.Eraeradiate
Hmmm, i guess I am going to dearly miss C! Thanks anyway!Rummage
@Wajih: C# is different from C, you can't have an array that actually is a sub-array of another, because you can't (actually shouldn't) use pointers. You could create a custom IList<> implementation keeping the original array instance inside as suggested by Thomas (N.B. all T[] arrays implement IList<T>).Stour
@Wajih: have a look at my answer for an example ;)Stour
@Stour and @Thomas, thanks indeed for the answers guys. I will try out the answers. I highly appreciate your input, thanks!Rummage
L
2

C# (and .NET in general) doesn't allow you to create a standard array reference that 'points to' the inside of another array. As such, you either need to change your consuming APIs so that they can deal with ArraySegment instances or you need to create a copy of the data and then copy the changes back after operating on the copy. This is generally a safer approach anyway, as passing around references to an array breaks insulation and makes it more difficult to track down bugs as the number of consumers of the array increases. Constructing new array instances and copying values is relatively cheap in .NET, as long as the arrays are not extremely large in size, so the performance impact here is generally negligible.

If you're running into performance problems and you need to micro-optimize, I would recommend using either unsafe C# code (where you can fix the array reference and pass around pointers) or pulling out the performance-critical code to a C++/CLI assembly where you can do the computations with unmanaged memory. I would recommend profiling the code first to verify that this is really your bottleneck. I can't stress enough that you shouldn't fear allocating new memory in .NET, as the nature of the compacting GC heap means that frequent small allocations are cheaper than they would be in C (where memory allocation must accommodate for possible heap fragmentation.)

Lohengrin answered 22/4, 2011 at 15:14 Comment(1)
I did some profiling and found out that ArraySegment out performs Array.Copy by a factor of 2x. I used Array.Copy to create a segment. I need some fast code with reference to original data. That's why I explored ArraySegment. The only issue was the segment reference, Array.Copy had to be pulled out of the code due to its slowness and non-referenced array slice. Nevertheless, thanks for the answer. I will explore a few more things.Rummage
P
1

Check out the answer I posted on this topic here.

Basically all you have to do is cast ArraySegment to IList to get the functionality you expect.

Pinkney answered 19/5, 2015 at 23:43 Comment(1)
Note: This thread is from 2011, from before .NET 4.5. In those day, the ArraySegment<> struct did not implement any interfaces. That changed in Aug. 2012 when .NET 4.5 was released.Treponema

© 2022 - 2024 — McMap. All rights reserved.