Dealing with IDs in entity object design

Asked 17/5, 2011 at 14:20 Answered 17/5, 2011 at 15:41

For a while I have been thinking about how to deal with objects which are assigned identifiers by the database.

A typical object representing a table entity may look like:

public class Test
{
    public int Id { get; private set; }
    public string Something { get; set; }
}

Suppose we would like to use this object for inserting, retrieving and updating objects in the database. As for retrieving and updating we have no problems, since the Id field always has a value.

However, if we want to insert a new object of type Test into the database, the Id field will still need to have a value. We could just use "0", as it is unlikely to be used as a database key, but really this is not a good design.

Alternatively, if we make the Id property nullable, we could use null for objects which have not yet been assigned an identifier by the database. However, it then becomes possible for an object retrieved from the database to have no identifier (this is allowed by the class design, but not by the database design).

Any good ideas on how to make a good design for this problem?

Progestin answered 17/5, 2011 at 14:20 Comment(5)

it's not possible to get an object without an identifier, if it was retrieved from the database.. as you insert it, it will get an id immediately (unless you have some reaaally strange database) – Corbel 17/5, 2011 at 14:27

@Corbel Obviously. However, looking strictly at the object type definition this is possible. – Progestin 17/5, 2011 at 14:28

Well, you say the opposite in your question.. But anyhow, what do you want to do with these objects? Unless you explain which operations will be done with them, it's unclear what the problem is.. – Corbel 17/5, 2011 at 14:32

Are you using Entity Framework or did you just mean "entity" in the general sense? – Elle 17/5, 2011 at 14:45

I mean entity in the general sense, although the issue obviously exists when using Entity Framework as well. – Progestin 17/5, 2011 at 14:48

If you treat the id as a way to identify objects and guarantee their uniqueness within your application, assigning it should be handled by the database (unless, of course, you have other ways to assign identifiers to objects).

If it's not (as in, it's a property of the object driven by business needs), whether 0 is a valid value or good design depends purely on those business needs. Is 0 a valid value from, say, the end user's point of view? Will your user go "Oh, snap! What is that?" or "Ok.. I know what to do." when presented with id = 0?

You can always wrap your object's properties in a separate class if you feel that having objects without ids floating around in your application is problematic. You'd use such a class essentially only for carrying the parameters of a not-yet-created object (the creation process is finalized by the database insert). Once the object has been inserted and its id assigned, you can work with your regular entity.
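A minimal sketch of the separate-class idea, under some assumptions: `NewTest` and `InMemoryTestStore` are hypothetical names that do not appear in the question, and the in-memory store only stands in for whatever code really performs the insert. The point is that a `Test` cannot be constructed without an id, so neither 0 nor null is needed as a placeholder.

```csharp
using System;

// Hypothetical type: carries the values for a Test that has not been inserted yet.
public sealed class NewTest
{
    public string Something { get; set; }
}

// A Test can only be constructed once an id has been assigned.
public sealed class Test
{
    public Test(int id, string something)
    {
        if (id <= 0)
            throw new ArgumentOutOfRangeException(nameof(id), "A persisted Test needs an assigned id.");
        Id = id;
        Something = something;
    }

    public int Id { get; }
    public string Something { get; set; }
}

// Hypothetical stand-in for the real data access code; it hands out ids itself
// so that the sketch runs without a database.
public sealed class InMemoryTestStore
{
    private int _nextId = 1;

    public Test Insert(NewTest draft)
    {
        if (draft == null)
            throw new ArgumentNullException(nameof(draft));
        return new Test(_nextId++, draft.Something);
    }
}
```

Calling `store.Insert(new NewTest { Something = "abc" })` returns a `Test` whose `Id` is already set. The cost is a second type and a mapping step for every entity.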

Edit

This question (or rather my answer about wrapping parameters) reminded me of a fact that once surprised me. When a child is born, she doesn't exist in the system until her parents officially register her and she gets a personal identification number assigned. So technically, without an id, the child doesn't exist (at least from the system's point of view), even though everybody knows she was born. It's the same with your database/app: an object without an id (one that the database cannot identify) doesn't exist; it's just a set of parameters. A bit bizarre, but I hope my point is clear :)

Underside answered 17/5, 2011 at 15:21 Comment(0)

There is nothing wrong with designing a class so that an ID of 0 indicates that the entity has not yet been serialized. I have built systems in the past that successfully used this approach. Just make sure that these semantics are well defined in your API, and that this meaning is respected in all of the code.

One trap to watch out for is using the ID to define an equality relationship (such as for generating hash codes for a dictionary). This must only be done for non-zero IDs. For testing equality with two IDs of zero, reference equality may be used.

However, since unstored entities may have their ID change at some point in the future, it is very important that such objects are never stored in a Dictionary. Or, at the very least, such items must be removed from any dictionaries before saving and then restored afterwards using the new ID.

With that one reservation, this design should work fine.

public class Test : IEquatable<Test>
{ 
    /// <summary>
    /// The unique identifier for this Test entity, or zero if this
    /// Test entity has not yet been serialized to the database.
    /// </summary>
    public int Id { get; private set; } 

    public string Something { get; set; }

    public override bool Equals(object obj)
    {
        return Equals(obj as Test);
    }

    public bool Equals(Test other)
    {
        if (other == null)
            return false;
        // Distinct entities may exist with the Id value of zero.
        if (Id == 0)
            return object.ReferenceEquals(this, other);
        return Id == other.Id;
    }

    /// <summary>
    /// Gets a hash code for this Test entity. Warning: an instance with
    /// an Id of zero will change its identity when saved to the DB. Use with care.
    /// </summary>
    /// <returns>The hash code.</returns>
    public override int GetHashCode()
    {
        return Id;
    }
}
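The dictionary warning can be made concrete with a small sketch. It assumes a hypothetical `AssignId` method (the class in this answer only has a private setter) and uses a separate `TrackedTest` type with the same equality rules so that it does not clash with `Test`. A `HashSet<T>` is used here, but a `Dictionary` key behaves the same way.

```csharp
using System;
using System.Collections.Generic;

public sealed class TrackedTest : IEquatable<TrackedTest>
{
    public int Id { get; private set; }

    // Hypothetical hook, called by the data access code after the insert.
    public void AssignId(int id) { Id = id; }

    public override bool Equals(object obj) { return Equals(obj as TrackedTest); }

    public bool Equals(TrackedTest other)
    {
        if (other == null) return false;
        if (Id == 0) return ReferenceEquals(this, other);
        return Id == other.Id;
    }

    public override int GetHashCode() { return Id; }
}

public static class StaleHashDemo
{
    // Returns false: the set filed the entity under hash code 0,
    // but after the save the lookup uses hash code 42.
    public static bool FoundAfterSave()
    {
        var entity = new TrackedTest();
        var set = new HashSet<TrackedTest> { entity };
        entity.AssignId(42);
        return set.Contains(entity);
    }

    // Returns true: the entity is removed before the id changes
    // and added back afterwards, as the answer recommends.
    public static bool FoundAfterReinsert()
    {
        var entity = new TrackedTest();
        var set = new HashSet<TrackedTest> { entity };
        set.Remove(entity);
        entity.AssignId(42);
        set.Add(entity);
        return set.Contains(entity);
    }
}
```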

Mesoglea answered 17/5, 2011 at 15:41 Comment(3)

I realize that an id of 0 indicating that an object has no id is certainly a valid working solution; I was just wondering whether it would be feasible to do something better. I believe it is always better to make "incorrect" data impossible than to allow it and document your way out of it. – Progestin 17/5, 2011 at 21:58

@Progestin - Yes, that was one of the options in your original question. I was just saying that in my opinion it is fine. I see no distinction between defining 0 as unsaved vs. defining null that way, except that the first is slightly less trouble than the second. In my opinion, properly documenting and implementing a special value is a good solution. – Mesoglea 17/5, 2011 at 22:5

@Progestin - Think of it this way: in a double the first bit is interpreted to indicate whether the numeric value is positive or negative. It is, however, just a bit like any other, and can be interpreted other ways in other contexts. Defining certain combinations of bits as meaning one thing as opposed to another thing is a basic part of programming, and does not indicate bad design. – Mesoglea 17/5, 2011 at 22:14

So, you have a class that contains the database table's identifier as a field, so for retrieval, deletion and updates your entity's identifier is aligned with the database record's identifier. For insertion, you could retrieve the identity value of the record just inserted and update your entity with that value. So you could have separate stored procedures for insertion, update, retrieval and deletion. The insertion SP could return the id of the record just inserted as an output parameter. I hope this answers your question.
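As a rough sketch of that flow using the provider-neutral `System.Data` interfaces: the stored procedure name `InsertTest`, the parameter names `@Something` and `@NewId`, and the `TestRecord`/`TestGateway` types are all assumptions made for illustration, not something taken from the question. The caller is expected to pass an already opened connection.

```csharp
using System;
using System.Data;

public class TestRecord
{
    public int Id { get; private set; }   // stays zero until the insert has run
    public string Something { get; set; }

    // Hypothetical hook used by the gateway once the database has assigned an id.
    internal void AssignId(int id) { Id = id; }
}

public static class TestGateway
{
    // Calls a hypothetical stored procedure "InsertTest" that inserts the row
    // and reports the generated id through the output parameter "@NewId".
    public static void Insert(IDbConnection openConnection, TestRecord record)
    {
        using (IDbCommand command = openConnection.CreateCommand())
        {
            command.CommandType = CommandType.StoredProcedure;
            command.CommandText = "InsertTest";

            IDbDataParameter something = command.CreateParameter();
            something.ParameterName = "@Something";
            something.DbType = DbType.String;
            something.Value = (object)record.Something ?? DBNull.Value;
            command.Parameters.Add(something);

            IDbDataParameter newId = command.CreateParameter();
            newId.ParameterName = "@NewId";
            newId.DbType = DbType.Int32;
            newId.Direction = ParameterDirection.Output;
            command.Parameters.Add(newId);

            command.ExecuteNonQuery();

            // Copy the database-assigned id back onto the entity.
            record.AssignId(Convert.ToInt32(newId.Value));
        }
    }
}
```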

Trstram answered 17/5, 2011 at 15:18 Comment(3)

I agree with Abdul's assertion, that it's the db's responsibility to determine new ID values and set whatever other defaults are appropriate for new records. You need to request a new row from the db, then allow the user to modify the new row, if they want. – Ewer 17/5, 2011 at 15:57

Certainly the object would be assigned an id after being inserted into the database and this id could be set on the object being inserted. However, I still see a problem representing the object before an identifier is assigned. – Progestin 17/5, 2011 at 21:54

So what's the problem you see? – Trstram 18/5, 2011 at 5:18
