Is there anything like a restrict keyword for C++ to indicate that _iterators_ are not aliased
Asked Answered
H

2

10

g++ does implement __restrict__ for pointers, but I could not find anything about iterators. My overall intent is to encourage the compiler to vectorize stl loops.

Edit:

Even if the compiler is unable to vectorize, the __restrict__ keyword should be able to tell the compiler that no unnecessary reloads are necessary inside a loop.

Honeysuckle answered 13/2, 2011 at 10:52 Comment(1)
__restrict__ is officially not specified in the C++ standard AFAIK, although many compilers support it. Here is a paper on this issue: N3635Jotham
W
6

I don't know the answer to your direct question. However, the compiler would only ever be able to vectorize a loop for std::vector, as it's the only container (I think) that has contiguous storage, and no dependencies between successive storage locations (unlike e.g. std::list). I don't know how to make it do so, though.

Update

After some experimentation (which may or may not be relevant to the overall goal), I discovered that in ICC, the following does not vectorise:

typedef std::vector<float> V;

V vec(4096);

for (V::iterator it = vec.begin(); it != vec.end(); ++it)
{
    *it *= *it;
}

whereas the following does:

V vec(4096);

V::iterator it2 = vec.end();
for (V::iterator it = vec.begin(); it != it2; ++it)
{
    *it *= *it;
}

So apparently, the problem is not so much iterators, but the call to vec.end() inside the loop construct, which apparently cannot be factored out, even though it's clear that the loop body doesn't affect the vector bounds.

In GCC, I couldn't get anything to vectorise. This isn't surprising, because GCC is much worse than ICC at spotting SSE opportunities.

Waterbuck answered 13/2, 2011 at 10:55 Comment(20)
Yes definitely. Given that iterators are an abstraction of pointers I would think restrict applies to pointers as much as they do to iterators (into vectors). But I am not sure.Honeysuckle
Vectorization of random access iterators on contiguous storage would already be a lot of help, actually.Proletariat
@Konrad: It seems that even given a trivial for (std::vector<float>::iterator ...) loop, even ICC (which is normally much better than GCC at vectorisation) isn't able to vectorise.Waterbuck
@Konrad,@Oli I guess the v[1] object may refer to v[2] as a data member. So unless there is a mechanism to tell the compiler that no such aliasing takes place, the compiler wouldn't be able to vectorize, except may be if the functor that for takes is a constant functor.Honeysuckle
@srean: See my comment reply to @Konrad. Even with a vector<float> (i.e. a vector of primitives where dependencies are impossible), the compiler was unable to vectorise ("loop was not vectorized: unsupported loop structure.").Waterbuck
@Oli Oh! I did not notice they were vector of floats. That is indeed strange.Honeysuckle
@Oli: that’s my point though. It would be nice if the compiler did this and in principle it should be possible if we can make the compiler realize that the iterators here resolve to a simple pointer (or index or however you implement them) and are never aliased.Proletariat
@Konrad: It does seem to realise this, as the generated assembler for an iterator-based loop (for a vector) is essentially identical to that for a (non-vectorised) C-style loop. I guess the problem is that the vectorisation pass of the compiler has been heuristically tuned to only work with a small subset of possible loop constructs, as the "rules" for the general case are probably horrifically complex!Waterbuck
@Oli: “It does seem to realise this” – exactly. I believe vectorization fails for the sole reason that srean has pointed out, i.e. aliasing.Proletariat
@Konrad: At least in the case of ICC, I don't think that aliasing is the reason (well, at least not the only reason). The vectorisation report simply states "unsupported loop structure"; the vectorisation optimiser simply hasn't been written to recognise anything other than very simple loop constructs.Waterbuck
@Oli: hm. But once the code is optimized it is a simple loop structure. Of course, it’s possible that the vectorization optimization is applied before the iterator is optimized away. But I would assume that vectorization is an optimization that is applied very late, and inlining the iterator class should be applied very early.Proletariat
@Konrad: After some more experimentation, I've found that this loop construct will not vectorise: for (vec_t::iterator it = vec.begin(); it != vec.end(); ++it), whereas this one will: vec_t::iterator itEnd = vec.end(); for (vec_t::iterator it = vec.begin(); it != itEnd; ++it). So the problem seem to be not so much iterators, but simply the call to vector<T>::end() that for whatever reason it cannot optimise away.Waterbuck
@Oli: interesting. Thanks for making the effort. Now put it in the answer so we can vote this up.Proletariat
@Oli: I always read as a guideline that recomputing end() on each loop iteration was BAD (tm) but I thought (naively) that compilers would be able to recognize it as a loop invariant and hoist it out. Apparently it is not so...Couldst
@Oli that makes sense, because the vec.end() could change because of addition or deletion in the loop. But STL algorithm, like foreach etc should vectorize because they have no way of changing the number of elements in the container. I totally second @Konrad's suggestion of editing the answer. I am curious of something else, does it do any loop-unrolling when it does not vectorize. Unrolling and avoiding unnecessary reloads is still better than nothing. Thanks for all your experiments.Honeysuckle
@srean: Answer updated! Looking at the resulting assembler, I didn't observe any loop unrolling, unfortunately. Having thought about the overall problem some more, I can now see that your question about restrict is a pertinent one, and unfortunately I don't believe there is an equivalent for iterators.Waterbuck
When you say that GCC sucks in vectorizing loops, did you even try with the option enabled gcc -O2 -ftree-vectorize -msse[2]?Loreeloreen
@rubenvb: That's a good point; I'd forgotten that GCC doesn't do this by default! You're right, with this flag enabled, I get vectorised assembler, albeit considerably more bloated than ICC's attempt.Waterbuck
@rubenvb: Nevertheless, I stand by my assertion that GCC takes fewer SSE opportunities than ICC does...Waterbuck
@OliverCharlesworth, presumable this means to that C++11's range-for loop (for std::vector) will vectorize, because that is equivalent to store the end() iterator.Curhan
L
-2

Take this C++20-solution:

#pragma once
#include <cstddef>
#include <memory>
#include <iterator>
#include <type_traits>
#include <concepts>

template<typename Iterator>
concept restrict_it_concept = requires() { { *Iterator() }; } || requires() { { (Iterator())[(std::ptrdiff_t)1] }; };

template<restrict_it_concept Iterator>
struct restrict_it final
{
    using value_type = std::remove_reference_t<std::conditional_t<requires() { { *Iterator() }; }, decltype(*Iterator()), decltype((Iterator())[(ptrdiff_t)1])>>;
    constexpr restrict_it() noexcept
        requires requires() { { Iterator() }; };
    constexpr restrict_it( Iterator it ) noexcept
        requires requires( Iterator it ) { { Iterator( it ) }; };
    constexpr restrict_it( restrict_it const &other ) noexcept
        requires requires( Iterator it ) { { Iterator( it ) }; };
    constexpr restrict_it &operator =( Iterator it ) noexcept
        requires requires( Iterator it ) { { it = it }; };
    constexpr restrict_it &operator =( restrict_it const &other ) noexcept
        requires requires( Iterator it ) { { it = it }; };
    constexpr bool operator <( restrict_it const &other ) const noexcept
        requires requires( Iterator it ) { { it < it } -> std::convertible_to<bool>; };
    constexpr bool operator <=( restrict_it const &other ) const noexcept
        requires requires( Iterator it ) { { it <= it } -> std::convertible_to<bool>; };
    constexpr bool operator >( restrict_it const &other ) const noexcept
        requires requires( Iterator it ) { { it > it } -> std::convertible_to<bool>; };
    constexpr bool operator >=( restrict_it const &other ) const noexcept
        requires requires( Iterator it ) { { it >= it } -> std::convertible_to<bool>; };
    constexpr bool operator ==( restrict_it const &other ) const noexcept
        requires requires( Iterator it ) { { it == it } -> std::convertible_to<bool>; };
    constexpr bool operator !=( restrict_it const &other ) const noexcept
        requires requires( Iterator it ) { { it != it } -> std::convertible_to<bool>; };
    constexpr std::ptrdiff_t operator -( restrict_it const &other ) const noexcept
        requires requires( Iterator it ) { { it - it } -> std::convertible_to<std::ptrdiff_t>; };
    constexpr restrict_it &operator ++() noexcept
        requires requires( Iterator it ) { { ++it }; };
    constexpr restrict_it &operator --() noexcept
        requires requires( Iterator it ) { { --it }; };
    constexpr restrict_it operator ++( int ) const noexcept
        requires requires( Iterator it ) { { it++ } -> std::convertible_to<Iterator>; };
    constexpr restrict_it operator --( int ) const noexcept
        requires requires( Iterator it ) { { it-- } -> std::convertible_to<Iterator>; };
    constexpr restrict_it operator +( std::ptrdiff_t offset ) const noexcept
        requires requires( Iterator it, std::ptrdiff_t offset ) { { it + offset } -> std::convertible_to<Iterator>; };
    friend constexpr restrict_it operator +( std::ptrdiff_t offset, restrict_it it ) noexcept
        requires requires( std::ptrdiff_t offset, Iterator it ) { { offset + it } -> std::convertible_to<Iterator>; }
    {
        return restrict_it( offset + it.m_it );
    }
    constexpr restrict_it operator -( std::ptrdiff_t offset ) const noexcept
        requires requires( Iterator it, std::ptrdiff_t offset ) { { it - offset } -> std::convertible_to<Iterator>; };
    constexpr value_type &__restrict operator *() const noexcept
        requires requires( Iterator it ) { { *it } -> std::convertible_to<value_type &>; };
    constexpr value_type &__restrict operator []( std::ptrdiff_t index ) const noexcept
        requires requires( Iterator it, std::ptrdiff_t index ) { { it[index] } -> std::convertible_to<value_type &>; };
    constexpr value_type *__restrict operator ->() const noexcept
        requires requires( Iterator it ) { { it.operator ->() } -> std::convertible_to<value_type *>; };
    constexpr restrict_it &operator +=( std::ptrdiff_t offset ) const noexcept
        requires requires( Iterator it, std::ptrdiff_t offset ) { { it += offset }; };
    constexpr restrict_it &operator -=( std::ptrdiff_t offset ) const noexcept
        requires requires( Iterator it, std::ptrdiff_t offset ) { { it -= offset }; };
private:
    Iterator m_it;
};

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator>::restrict_it() noexcept
    requires requires() { { Iterator() }; }
    : m_it()
{
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator>::restrict_it( Iterator it ) noexcept
    requires requires( Iterator it ) { { Iterator( it ) }; }
    : m_it( it )
{
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator>::restrict_it( restrict_it const &other ) noexcept
    requires requires( Iterator it ) { { Iterator( it ) }; }
    : m_it( other.m_it )
{
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> &restrict_it<Iterator>::operator =( Iterator it ) noexcept
    requires requires( Iterator it ) { { it = it }; }
{
    m_it = it;
    return *this;
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> &restrict_it<Iterator>::operator =( restrict_it const &other ) noexcept
    requires requires( Iterator it ) { { it = it }; }
{
    m_it = other.m_it;
    return *this;
}

template<restrict_it_concept Iterator>
constexpr bool restrict_it<Iterator>::operator <( restrict_it const &other ) const noexcept
    requires requires( Iterator it ) { { it < it } -> std::convertible_to<bool>; }
{
    return m_it < other.m_it;
}

template<restrict_it_concept Iterator>
constexpr bool restrict_it<Iterator>::operator <=( restrict_it const &other ) const noexcept
    requires requires( Iterator it ) { { it <= it } -> std::convertible_to<bool>; }
{
    return m_it <= other.m_it;
}

template<restrict_it_concept Iterator>
constexpr bool restrict_it<Iterator>::operator >( restrict_it const &other ) const noexcept
    requires requires( Iterator it ) { { it > it } -> std::convertible_to<bool>; }
{
    return m_it > other.m_it;
}

template<restrict_it_concept Iterator>
constexpr bool restrict_it<Iterator>::operator >=( restrict_it const &other ) const noexcept
    requires requires( Iterator it ) { { it >= it } -> std::convertible_to<bool>; }
{
    return m_it >= other.m_it;
}

template<restrict_it_concept Iterator>
constexpr bool restrict_it<Iterator>::operator ==( restrict_it const &other ) const noexcept
    requires requires( Iterator it ) { { it == it } -> std::convertible_to<bool>; }
{
    return m_it == other.m_it;
}

template<restrict_it_concept Iterator>
constexpr bool restrict_it<Iterator>::operator !=( restrict_it const &other ) const noexcept
    requires requires( Iterator it ) { { it != it } -> std::convertible_to<bool>; }
{
    return m_it != other.m_it;
}

template<restrict_it_concept Iterator>
constexpr std::ptrdiff_t restrict_it<Iterator>::operator -( restrict_it const &other ) const noexcept
    requires requires( Iterator it ) { { it - it } -> std::convertible_to<std::ptrdiff_t>; }
{
    return m_it - other.m_it;
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> &restrict_it<Iterator>::operator ++() noexcept
    requires requires( Iterator it ) { { ++it }; }
{
    ++m_it;
    return *this;
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> &restrict_it<Iterator>::operator --() noexcept
    requires requires( Iterator it ) { { --it }; }
{
    --m_it;
    return *this;
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> restrict_it<Iterator>::operator ++( int ) const noexcept
    requires requires( Iterator it ) { { it++ } -> std::convertible_to<Iterator>; }
{
    return restrict_it( m_it++ );
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> restrict_it<Iterator>::operator --( int ) const noexcept
    requires requires( Iterator it ) { { it-- } -> std::convertible_to<Iterator>; }
{
    return restrict_it( m_it-- );
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> restrict_it<Iterator>::operator +( std::ptrdiff_t offset ) const noexcept
    requires requires( Iterator it, std::ptrdiff_t offset ) { { it + offset } -> std::convertible_to<Iterator>; }
{
    return restrict_it( m_it + offset );
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> restrict_it<Iterator>::operator -( std::ptrdiff_t offset ) const noexcept
    requires requires( Iterator it, std::ptrdiff_t offset ) { { it - offset } -> std::convertible_to<Iterator>; }
{
    return restrict_it( m_it - offset );
}

template<restrict_it_concept Iterator>
constexpr typename restrict_it<Iterator>::value_type &__restrict restrict_it<Iterator>::operator *() const noexcept
    requires requires( Iterator it ) { { *it } -> std::convertible_to<value_type &>; }
{
    return *m_it;
}

template<restrict_it_concept Iterator>
constexpr typename restrict_it<Iterator>::value_type &__restrict restrict_it<Iterator>::operator []( std::ptrdiff_t index ) const noexcept
    requires requires( Iterator it, std::ptrdiff_t index ) { { it[index] } -> std::convertible_to<value_type &>; }
{
    return m_it[index];
}

template<restrict_it_concept Iterator>
constexpr typename restrict_it<Iterator>::value_type *__restrict restrict_it<Iterator>::operator ->() const noexcept
    requires requires( Iterator it ) { { it.operator ->() } -> std::convertible_to<value_type *>; }
{
    return m_it.operator ->();
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> &restrict_it<Iterator>::operator +=( std::ptrdiff_t offset ) const noexcept
    requires requires( Iterator it, std::ptrdiff_t offset ) { { it += offset }; }
{
    m_it += offset;
    return *this;
}

template<restrict_it_concept Iterator>
constexpr restrict_it<Iterator> &restrict_it<Iterator>::operator -=( std::ptrdiff_t offset ) const noexcept
    requires requires( Iterator it, std::ptrdiff_t offset ) { { it -= offset }; }
{
    m_it -= offset;
    return *this;
}

#if !defined(NDEBUG)
#include <vector>
template<>
struct restrict_it<typename std::vector<char>::const_iterator>;
#endif

With that you could simply write:

restrict_it it( vec.cbegin() );

Works with any container.

Lollipop answered 30/6, 2022 at 10:36 Comment(7)
Perhaps you could explain what this actually does.Brianbriana
It helps you to have __restrict-semantics with any type of iterator.Lollipop
Hmm I see the code uses __restrict on function return types. Does it really work? "Restrict semantics apply to lvalue expressions only; for example, a cast to restrict-qualified pointer or a function call returning a restrict-qualified pointer are not lvalues and the qualifier has no effect." (cppteference)Brianbriana
The above code doesn't compile with clang++ 13 because the concepts implementation has mistakes. MSVC doesn't honor the __restrict-qualfiers correctly, but g++ 11.1.0 does.Lollipop
There is no such thing as "the" correct handling of __restrict. It is a non-standard extension, therefore anything implementation does is correct by (its own) definition.Brianbriana
Sorry wanted to ask a question but got distracted. The question is, does this code result in optimisations being done (by any compiler) that are normally not done?Brianbriana
clang 13 does Optimizations according to &__restrict references, g++ also, but clang 13 is too stupid to compile the above code. MSVC compiles the above code but doesn't honor the __restrict with that.Lollipop

© 2022 - 2024 — McMap. All rights reserved.