How far to go with a strongly typed language?
Asked Answered
H

14

42

Let's say I am writing an API, and one of my functions take a parameter that represents a channel, and will only ever be between the values 0 and 15. I could write it like this:

void Func(unsigned char channel)
{
    if(channel < 0 || channel > 15)
    { // throw some exception }
    // do something
}

Or do I take advantage of C++ being a strongly typed language, and make myself a type:

class CChannel
{
public:
    CChannel(unsigned char value) : m_Value(value)
    {
        if(channel < 0 || channel > 15)
        { // throw some exception }
    }
    operator unsigned char() { return m_Value; }
private:
    unsigned char m_Value;
}

My function now becomes this:

void Func(const CChannel &channel)
{
    // No input checking required
    // do something
}

But is this total overkill? I like the self-documentation and the guarantee it is what it says it is, but is it worth paying the construction and destruction of such an object, let alone all the additional typing? Please let me know your comments and alternatives.

Hilarius answered 5/7, 2010 at 19:32 Comment(5)
FWIW, since the parameter type is unsigned, there's no purpose in testing channel < 0.Remediless
Random interesting facts: Ada supports this natively, as type CChannel is range 0..15;, and there was a paper a few years ago about adding a similar capability to CSpotlight
@Michael: ...and Ada got it from Pascal, from which it changed the syntax only minutely (type CChannel = 0 .. 15;).Lineation
Now that you've got the answer to the question you asked, have you considered whether you actually ought to have a class Channel with a member Channel::Func()?Cormick
Why make Func a member ? This is neither Java nor C#!Flautist
R
60

If you wanted this simpler approach generalize it so you can get more use out of it, instead of tailor it to a specific thing. Then the question is not "should I make a entire new class for this specific thing?" but "should I use my utilities?"; the latter is always yes. And utilities are always helpful.

So make something like:

template <typename T>
void check_range(const T& pX, const T& pMin, const T& pMax)
{
    if (pX < pMin || pX > pMax)
        throw std::out_of_range("check_range failed"); // or something else
}

Now you've already got this nice utility for checking ranges. Your code, even without the channel type, can already be made cleaner by using it. You can go further:

template <typename T, T Min, T Max>
class ranged_value
{
public:
    typedef T value_type;

    static const value_type minimum = Min;
    static const value_type maximum = Max;

    ranged_value(const value_type& pValue = value_type()) :
    mValue(pValue)
    {
        check_range(mValue, minimum, maximum);
    }

    const value_type& value(void) const
    {
        return mValue;
    }

    // arguably dangerous
    operator const value_type&(void) const
    {
        return mValue;
    }

private:
    value_type mValue;
};

Now you've got a nice utility, and can just do:

typedef ranged_value<unsigned char, 0, 15> channel;

void foo(const channel& pChannel);

And it's re-usable in other scenarios. Just stick it all in a "checked_ranges.hpp" file and use it whenever you need. It's never bad to make abstractions, and having utilities around isn't harmful.

Also, never worry about overhead. Creating a class simply consists of running the same code you would do anyway. Additionally, clean code is to be preferred over anything else; performance is a last concern. Once you're done, then you can get a profiler to measure (not guess) where the slow parts are.

Remediless answered 5/7, 2010 at 19:44 Comment(8)
+1. Considering how similar this is to what I posted, it'd be almost hypocritical if I didn't give it an up-vote! :-)Lineation
+1, but, you could use some kind of explicit typedef (I know boost has one), so you get compile time errors if you happen to mix it up with some other ranged_value <0,15>. Ofcourse the arguable operator value_type overload is removed in this case.Jenette
@Viktor: What do you mean mix up?Remediless
If you have another typedef ranged_value<unsigned char, 0, 15> potatoe;, there is a risk you pass a potatoe instance instead of a channel instance. Boost has Strong Typedef for this. Otherwise there is always the solution to use a 4th discriminant template parameter to pass in a tag from an anonymous namespace.Flautist
@Matt: Oh, right. I thought he meant other channels. I'll leave that as an exercise for the reader.Remediless
@GMan: haha, love this phrase. In this case it's quite easy to tackle though :)Flautist
@Remediless could you explain why you copy the 'min' and 'max' variables to 'minimum' and 'maximum'?Purgatory
@MartinDrozdik: Just makes it easier to get the template arguments from outside the class.Remediless
L
28

Yes, the idea is worthwhile, but (IMO) writing a complete, separate class for each range of integers is kind of pointless. I've run into enough situations that call for limited range integers that I've written a template for the purpose:

template <class T, T lower, T upper>
class bounded { 
    T val;
    void assure_range(T v) {
        if ( v < lower || upper <= v)
            throw std::range_error("Value out of range");
    }
public:
    bounded &operator=(T v) { 
        assure_range(v);
        val = v;
        return *this;
    }

    bounded(T const &v=T()) {
        assure_range(v);
        val = v;
    }

    operator T() { return val; }
};

Using it would be something like:

bounded<unsigned, 0, 16> channel;

Of course, you can get more elaborate than this, but this simple one still handles about 90% of situations pretty well.

Lineation answered 5/7, 2010 at 19:49 Comment(5)
+1, I'm adding this question to my favourites for this answer!Multiplechoice
+1. Considering how similar this is to what I posted, it'd be almost hypocritical if I didn't give it an up-vote! :-) :PRemediless
Now that I look at it again, how can it be used? I cannot do channel c; c = 5; because it has no default constructor, nor can I do channel c(5); because it has no conversion constructor.Remediless
Though now you can get rid of the copy assignment operator, since the default with use the constructor to make the object of which to assign from. (Then our comments will truly match up. :P)Remediless
You can, but I prefer not to -- the assignment operator is a fairly significant optimization (it avoids range checking when assigning a value we know is in range).Lineation
Y
14

No, it is not overkill - you should always try to represent abstractions as classes. There are a zillion reasons for doing this and the overhead is minimal. I would call the class Channel though, not CChannel.

Yuen answered 5/7, 2010 at 19:37 Comment(0)
E
11

Can't believe nobody mentioned enum's so far. Won't give you a bulletproof protection, but still better than a plain integer datatype.

Erective answered 5/7, 2010 at 19:51 Comment(4)
+1 - was about to write exactly the same thing (almost word for word).Epicurean
Speaking about objects, it's not the same thing a range from 0 to 15 than 16 defined values.Ultraviolet
Ok, now I want a type that represents numbers greater than 11. Show me the enum that represents that.Implacable
enum SomeNaturalNumbers {One=1, Twelve=12};Erective
F
6

Looks like overkill, especially the operator unsigned char() accessor. You're not encapsulating data, you're making evident things more complicated and, probably, more error-prone.

Data types like your Channel are usually a part of something more abstracted.

So, if you use that type in your ChannelSwitcher class, you could use commented typedef right in the ChannelSwitcher's body (and, probably, your typedef is going to be public).

// Currently used channel type
typedef unsigned char Channel;
Frustrate answered 5/7, 2010 at 19:34 Comment(3)
I think the typedef is a good way to handle this, unless a Channel will have its own channel-specific behaviours (such as open, connect, send...), or if there is a likely future possibility to need subclasses if Channel.Multiplechoice
@Frustrated I actually think that the second case makes class usage pretty obvious :)Frustrate
@Multiplechoice - The Channel class will only likely ever hold this value. @Kotti - I will still have to limit check the typedef, but yes, that does seem preferable to my first solution.Hilarius
I
6

Whether you throw an exception when constructing your "CChannel" object or at the entrance to the method that requires the constraint makes little difference. In either case you're making runtime assertions, which means the type system really isn't doing you any good, is it?

If you want to know how far you can go with a strongly typed language, the answer is "very far, but not with C++." The kind of power you need to statically enforce a constraint like, "this method may only be invoked with a number between 0 and 15" requires something called dependent types--that is, types which depend on values.

To put the concept into pseudo-C++ syntax (pretending C++ had dependent types), you might write this:

void Func(unsigned char channel, IsBetween<0, channel, 15> proof) {
    ...
}

Note that IsBetween is parameterized by values rather than types. In order to call this function in your program now, you must provide to the compiler the second argument, proof, which must have the type IsBetween<0, channel, 15>. Which is to say, you have to prove at compile-time that channel is between 0 and 15! This idea of types which represent propositions, whose values are proofs of those propositions, is called the Curry-Howard Correspondence.

Of course, proving such things can be difficult. Depending on your problem domain, the cost/benefit ratio can easily tip in favor of just slapping runtime checks on your code.

Implacable answered 30/7, 2010 at 10:43 Comment(2)
Thanks, this is an excellent comment. Do you think C++ will ever see such a feature?Hilarius
No, I think that's highly unlikely.Implacable
V
4

Whether something is overkill or not often depends on lots of different factors. What might be overkill in one situation might not in another.

This case might not be overkill if you had lots of different functions that all accepted channels and all had to do the same range checking. The Channel class would avoid code duplication, and also improve readability of the functions (as would naming the class Channel instead of CChannel - Neil B. is right).

Sometimes when the range is small enough I will instead define an enum for the input.

Vinaigrette answered 5/7, 2010 at 19:45 Comment(0)
I
1

If you add constants for the 16 different channels, and also a static method that fetches the channel for a given value (or throws an exception if out of range) then this can work without any additional overhead of object creation per method call.

Without knowing how this code is going to be used, it's hard to say if it's overkill or not or pleasant to use. Try it out yourself - write a few test cases using both approaches of a char and a typesafe class - and see which you like. If you get sick of it after writing a few test cases, then it's probably best avoided, but if you find yourself liking the approach, then it might be a keeper.

If this is an API that's going to be used by many, then perhaps opening it up to some review might give you valuable feedback, since they presumably know the API domain quite well.

Idolatry answered 5/7, 2010 at 19:45 Comment(0)
E
1

In my opinion, I don't think what you are proposing is a big overhead, but for me, I prefer to save the typing and just put in the documentation that anything outside of 0..15 is undefined and use an assert() in the function to catch errors for debug builds. I don't think the added complexity offers much more protection for programmers who are already used to C++ language programming which contains alot of undefined behaviours in its specs.

Emulsifier answered 5/7, 2010 at 20:31 Comment(0)
H
1

You have to make a choice. There is no silver bullet here.

Performance

From the performance perspective, the overhead isn't going to be much if at all. (unless you've got to counting cpu cycles) So most likely this shouldn't be the determining factor.

Simplicity/ease of use etc

Make the API simple and easy to understand/learn. You should know/decide whether numbers/enums/class would be easier for the api user

Maintainability

  1. If you are very sure the channel type is going to be an integer in the foreseeable future , I would go without the abstraction (consider using enums)

  2. If you have a lot of use cases for a bounded values, consider using the templates (Jerry)

  3. If you think, Channel can potentially have methods make it a class right now.

Coding effort Its a one time thing. So always think maintenance.

Hierogram answered 5/7, 2010 at 20:38 Comment(0)
H
1

The channel example is a tough one:

  • At first it looks like a simple limited-range integer type, like you find in Pascal and Ada. C++ gives you no way to say this, but an enum is good enough.

  • If you look closer, could it be one of those design decisions that are likely to change? Could you start referring to "channel" by frequency? By call letters (WGBH, come in)? By network?

A lot depends on your plans. What's the main goal of the API? What's the cost model? Will channels be created very frequently (I suspect not)?

To get a slightly different perspective, let's look at the cost of screwing up:

  • You expose the rep as int. Clients write a lot of code, the interface is either respected or your library halts with an assertion failure. Creating channels is dirt cheap. But if you need to change the way you're doing things, you lose "backward bug-compatibility" and annoy authors of sloppy clients.

  • You keep it abstract. Everybody has to use the abstraction (not so bad), and everybody is futureproofed against changes in the API. Maintaining backwards compatibility is a piece of cake. But creating channels is more costly, and worse, the API has to state carefully when it is safe to destroy a channel and who is responsible for the decision and the destruction. Worse case scenario is that creating/destroying channels leads to a big memory leak or other performance failure—in which case you fall back to the enum.

I'm a sloppy programmer, and if it were for my own work, I'd go with the enum and eat the cost if the design decision changes down the line. But if this API were to go out to a lot of other programmers as clients, I'd use the abstraction.


Evidently I'm a moral relativist.

Hoist answered 6/7, 2010 at 3:19 Comment(0)
G
1

An integer with values only ever between 0 and 15 is an unsigned 4-bit integer (or half-byte, nibble. I imagine if this channel switching logic would be implemented in hardware, then the channel number might be represented as that, a 4-bit register). If C++ had that as a type you would be done right there:

void Func(unsigned nibble channel)
{
    // do something
}

Alas, unfortunately it doesn't. You could relax the API specification to express that the channel number is given as an unsigned char, with the actual channel being computed using a modulo 16 operation:

void Func(unsigned char channel)
{
    channel &= 0x0f; // truncate
    // do something
}

Or, use a bitfield:

#include <iostream>
struct Channel {
    // 4-bit unsigned field
    unsigned int n : 4;
};
void Func(Channel channel)
{
    // do something with channel.n
}
int main()
{
    Channel channel = {9};
    std::cout << "channel is" << channel.n << '\n';
    Func (channel); 
}

The latter might be less efficient.

Gradation answered 20/7, 2015 at 16:31 Comment(2)
For this particular case, using a nibble might be OK, but is is tying value (Channel 13) rather tightly to representation (0xE). In another arbitrary example, what if range checking needed to be done to ensure volume is in the range 0-11?Socratic
@Socratic Yes, this was intended to be a very specific answer valid in this particular case only, under the assumption there will always be 15 channels. I usually try to solve a problem in its most specific form, and then generalize as and when needed.Gradation
S
0

I vote for your first approach, because it's simpler and easier to understand, maintain, and extend, and because it is more likely to map directly to other languages should your API have to be reimplemented/translated/ported/etc.

Shane answered 5/7, 2010 at 20:48 Comment(0)
I
0

This is abstraction my friend! It's always neater to work with objects

Impignorate answered 5/7, 2010 at 20:55 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.