MongoDb: Benefit of using ObjectID vs a string containing an Id?

3

29

Is there any benefit to storing an id to a related document as an ObjectId versus storing it as a string literal?

Using ObjectID:

{
   "_id": ObjectId("522bb79455449d881b004d27"),
   "username": "admin",
   "folder": ObjectId("522bb79455449d881b004d23")
}

versus a string:

{
   "_id": ObjectId("522bb79455449d881b004d27"),
   "username": "admin",
   "folder": "522bb79455449d881b004d23"
}

For my API where I'm sending data back to a client... using the string means I don't have to "cleanup" the data... and as we have to do a second query to get the folder document anyway... is it worth using ObjectId? (and if so why?)

Thanks

Photodynamics answered 8/9, 2013 at 4:37 Comment(2)
If you make sure the string you use is unique (as ObjectId generation does), it might not make much of a difference. Also, the ObjectId type is limited to 12 bytes in BSON, but you can use bigger strings if required. However, ObjectId comes in handy if you work with the data from the Mongo shell, because you can print the timestamp from an ObjectId, which you cannot do with a string. Does that answer your question? Entrepreneur
@Entrepreneur Thanks. The string will always be an ObjectID, which I can use in PHP to look up another object: new MongoID('myIDstring'); I get the space saving, although I'm not sure where the crossover point is: the cost of converting ObjectIds to strings and back for the clients versus the space saved by using ObjectIds rather than strings. Photodynamics
C
34

The biggest reason is that ObjectIDs are 12 bytes, whereas the equivalent hex string is 24 bytes. Over a large enough collection, those 12 bytes saved per ID really add up! Smaller IDs also mean fewer bytes transferred over the wire when reading or writing the document.
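To make the size difference concrete, here is a minimal sketch in Python that uses only the standard library (no MongoDB driver). It compares the raw 12 bytes of the `folder` ID from the question with its 24-character hex form. The extra figure for the BSON string value assumes the BSON layout of a string (a 4-byte length prefix plus a trailing NUL byte).

```python
# The folder ID from the question, as a 24-character hex string.
hex_id = "522bb79455449d881b004d23"

# An ObjectId stores the same value as 12 raw bytes.
raw_bytes = bytes.fromhex(hex_id)

# Stored as a string, every hex character takes one byte.
string_bytes = hex_id.encode("utf-8")

# Assumption: a BSON string value also carries a 4-byte length
# prefix and a trailing NUL byte on top of the characters.
bson_string_value_size = 4 + len(string_bytes) + 1

print(len(raw_bytes))            # 12
print(len(string_bytes))         # 24
print(bson_string_value_size)    # 29
```

So the gap per stored reference is at least 12 bytes, and somewhat more once the string's own framing is counted.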

Additionally, some ODMs expect ObjectIDs for external document references, and may be confused by string versions of the ID. I am not familiar enough with PHP ODMs to say if this might affect you specifically, though.

Regarding the API, you should probably normalize the data before sending it to the client anyway. Mongo doesn't enforce a schema, so a given field can hold literally any sort of data. You might have some documents with string IDs and others with BSON IDs, and your API would happily send both through to the client, where one or the other might cause breakage. In this particular case, you should use BSON ObjectIDs in your documents and cast them to strings in your API output.
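A minimal sketch of that "store ObjectIDs, cast to strings on output" step, written in Python. The `ObjectId` class here is a hypothetical stand-in so the example runs without a driver, and `to_api` is an illustrative helper name, not part of any library. With a real driver you would test against its own ObjectId type instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectId:
    """Hypothetical stand-in for a driver's ObjectId type (sketch only)."""
    raw: bytes

    def __str__(self) -> str:
        # Render the 12 raw bytes as a 24-character hex string.
        return self.raw.hex()


def to_api(value):
    """Recursively replace ObjectId values with their hex strings."""
    if isinstance(value, ObjectId):
        return str(value)
    if isinstance(value, dict):
        return {key: to_api(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_api(item) for item in value]
    return value


# The document from the question, with the reference stored as an ObjectId.
doc = {
    "_id": ObjectId(bytes.fromhex("522bb79455449d881b004d27")),
    "username": "admin",
    "folder": ObjectId(bytes.fromhex("522bb79455449d881b004d23")),
}

api_doc = to_api(doc)
print(api_doc)
```

Doing the cast in one place at the API boundary means the client always sees strings, whatever mix of types the stored documents contain.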

Comminate answered 8/9, 2013 at 7:35 Comment(3)
Thanks Chris. The space saving isn't an issue for this system. I estimate about a 3% saving by using ObjectIds rather than strings, and I could easily save that by shortening some field names too. I think not having to convert back and forth between ObjectIDs and strings in my application will make for a speedier API. Photodynamics
The 24 bytes contain only digits and lowercase letters. That doesn't seem like much of a saving over strings. Selwin
Strings are 1 byte per character, even if you only use a subset of the characters they can represent. String entropy is not the same as storage cost. The string version of an ID will always take up twice as much space as the same ID represented as an ObjectID. This is true for any binary value represented as a hex string, since each byte becomes two characters. Comminate
S
14

Briefly, for example, if you shorten the field name last_name to lname, you save 4 bytes per document (9 characters down to 5), because the field name is stored in every document. This really makes a difference if you have millions of documents in your collection.

Shifra answered 8/9, 2013 at 8:20 Comment(1)
Thanks. Good point about saving space with shorter field names.Photodynamics
E
3

In addition, ObjectId() has the following attribute and methods that you can use.

  1. str - Returns the hexadecimal string representation of the object.

  2. ObjectId.toString() - Returns the JavaScript representation of the object as a string.

  3. ObjectId.getTimestamp() - Returns the timestamp portion of the object as a Date.

  4. ObjectId.valueOf() - Returns the representation of the object as a hexadecimal string.
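The timestamp in that list comes from the first four bytes of the ObjectId, which hold the creation time as seconds since the Unix epoch. A minimal sketch in Python, standard library only, that recovers it from the hex form; `objectid_timestamp` is an illustrative helper name, not a driver API.

```python
from datetime import datetime, timezone


def objectid_timestamp(hex_id: str) -> datetime:
    """Return the creation time embedded in a 24-character ObjectId hex string."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    # The first 4 bytes are a big-endian count of seconds since the Unix epoch.
    seconds = int.from_bytes(raw[:4], "big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# The _id from the question.
created = objectid_timestamp("522bb79455449d881b004d27")
print(created.isoformat())
```

A plain string field holds the same characters, so the information is still there, but you have to parse it yourself as above rather than calling a method on the value.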

Ecumenism answered 9/12, 2020 at 1:10 Comment(1)
Do they really come in handy, though? Do you have a use case where you would use these methods? For example, do you usually use the ObjectId timestamp, or do you rely on another datetime field that you update yourself? Shiva
