How can I serialize a 3rd party type using protobuf-net or other serializers?

2

3

I have a List<HtmlAgilityPack.HtmlNode>, but protobuf-net gives me an error saying it doesn't have a contract. How can I specify a contract for it when I don't have the source? It actually said it couldn't infer the type, but I assume that's because I didn't use its attributes, right?

The default binary serializer also complains because the type is not marked as serializable.

EDIT: The error message is:

Type is not expected, and no contract can be inferred: HtmlAgilityPack.HtmlNode
Hols answered 23/10, 2011 at 21:54 Comment(0)
L
5

Frankly, in the case of HTML I'd just store... the html - it is kinda pre-serialised! However, to answer the question:

In protobuf-net v2, you can configure a TypeModel at runtime, which allows everything you can do via attributes and a few other tricks (in v2 the attributes just help steer the model if nothing else is specified). And because you can do all this at runtime, you don't need to change the type, and hence can apply it to types outside your control. The default model instance is RuntimeTypeModel.Default; you can add types to the model and configure each MetaType (which maps to a Type) individually. This allows you to tell it which members (properties/fields), sub-types, callbacks, etc. to apply.
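As a rough sketch of that runtime configuration (not tested against any particular protobuf-net release): `ThirdPartyLink` below is a hypothetical stand-in for a type from an assembly you cannot edit, and the field numbers 1 and 2 are arbitrary choices made for illustration.

```csharp
using System;
using System.IO;
using ProtoBuf;
using ProtoBuf.Meta;

// Hypothetical stand-in for a type from an assembly you cannot edit:
// it carries no [ProtoContract] / [ProtoMember] attributes.
public class ThirdPartyLink
{
    public string Href { get; set; }
    public string Text { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // 'false' = do not apply the default (attribute-driven) behaviour;
        // the members to serialize are listed by hand instead.
        MetaType meta = RuntimeTypeModel.Default.Add(typeof(ThirdPartyLink), false);
        meta.Add(1, "Href"); // field number 1 -> Href
        meta.Add(2, "Text"); // field number 2 -> Text

        var original = new ThirdPartyLink { Href = "http://example.com/", Text = "Example" };

        using (var stream = new MemoryStream())
        {
            // Serializer.* works against RuntimeTypeModel.Default,
            // so it picks up the configuration above.
            Serializer.Serialize(stream, original);
            stream.Position = 0;

            ThirdPartyLink copy = Serializer.Deserialize<ThirdPartyLink>(stream);
            Console.WriteLine(copy.Href + " " + copy.Text);
        }
    }
}
```

The configuration has to run once, before the first serialization of the type, for example at application startup.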

If that gets too complex, you can also specify a "surrogate", which allows you to configure a simple DTO, and use a standard conversion operator (explicit or implicit) to change between the complex model and the simple DTO model.
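A minimal sketch of the surrogate approach, assuming it is acceptable to persist only each node's markup: `HtmlNodeSurrogate` is a hypothetical DTO invented for this example, and it relies on HtmlAgilityPack's `OuterHtml` property and `HtmlNode.CreateNode` to convert in each direction. Nodes restored this way are freshly parsed fragments, so they will not have their original parent or siblings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using HtmlAgilityPack;
using ProtoBuf;
using ProtoBuf.Meta;

// Hypothetical DTO: the only state kept for a node is its markup.
[ProtoContract]
public class HtmlNodeSurrogate
{
    [ProtoMember(1)]
    public string OuterHtml { get; set; }

    // Complex model -> simple DTO.
    public static implicit operator HtmlNodeSurrogate(HtmlNode node)
    {
        return node == null ? null : new HtmlNodeSurrogate { OuterHtml = node.OuterHtml };
    }

    // Simple DTO -> complex model, by re-parsing the stored markup.
    public static implicit operator HtmlNode(HtmlNodeSurrogate surrogate)
    {
        if (surrogate == null || string.IsNullOrEmpty(surrogate.OuterHtml))
            return null;
        return HtmlNode.CreateNode(surrogate.OuterHtml);
    }
}

public static class Program
{
    public static void Main()
    {
        // Tell the default model to serialize HtmlNode via the surrogate.
        RuntimeTypeModel.Default
            .Add(typeof(HtmlNode), false)
            .SetSurrogate(typeof(HtmlNodeSurrogate));

        var nodes = new List<HtmlNode>
        {
            HtmlNode.CreateNode("<a href=\"http://example.com/\">Example</a>")
        };

        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, nodes);
            stream.Position = 0;

            List<HtmlNode> restored = Serializer.Deserialize<List<HtmlNode>>(stream);
            foreach (HtmlNode node in restored)
                Console.WriteLine(node.GetAttributeValue("href", string.Empty));
        }
    }
}
```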

For info, the significance of the default model is: that is what Serializer.* uses. However, if you use the TypeModel instance to perform serialization/deserialization you can have multiple differently configured models for the same types.
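To illustrate having several models for the same type, here is a hedged sketch with two independent models: one serializes both members, the other only `Href`. `ThirdPartyLink`, the model variable names, and the field numbers are all assumptions made for the example.

```csharp
using System;
using System.IO;
using ProtoBuf.Meta;

// Hypothetical stand-in for a type you do not control.
public class ThirdPartyLink
{
    public string Href { get; set; }
    public string Text { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Model A: both members are serialized.
        RuntimeTypeModel full = RuntimeTypeModel.Create();
        full.Add(typeof(ThirdPartyLink), false).Add(1, "Href").Add(2, "Text");

        // Model B: same type, but only Href is serialized.
        RuntimeTypeModel hrefOnly = RuntimeTypeModel.Create();
        hrefOnly.Add(typeof(ThirdPartyLink), false).Add(1, "Href");

        var link = new ThirdPartyLink { Href = "http://example.com/", Text = "Example" };

        using (var stream = new MemoryStream())
        {
            // Calling the model directly bypasses Serializer.* and RuntimeTypeModel.Default.
            hrefOnly.Serialize(stream, link);
            stream.Position = 0;

            var copy = (ThirdPartyLink)hrefOnly.Deserialize(stream, null, typeof(ThirdPartyLink));
            // Text was never written by this model, so it stays at its default value.
            Console.WriteLine(copy.Href + " / " + (copy.Text ?? "(not serialized)"));
        }
    }
}
```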

I can't remember the full details of HTML-agility-pack, but those are the main options available for your scenario via protobuf-net.

Luzon answered 23/10, 2011 at 22:39 Comment(3)
Thanks Mark. The reason I was trying to do it this way is that I am trying to collect all the links in 3000 pages, but I get a timeout after 400s. So I want to save, exit the app, and then bring my links back into memory on startup. Once I have all the links from the 3000 pages (20 links per page), I parse each link's info, all in parallel. I thought doing it in 2 passes would be better than doing all these operations at once. In the second part I use Agility Pack methods like GetAttributeValue, which is why I wanted to store the HtmlNode values. – Hols
@Joan can't you just store the HTML? For emphasis: DOMs are complex beasts, typically with multi-directional navigation (parent, child, sibling). Recreating one outside of their carefully constructed factory methods may be ambitious. Much easier to simply re-parse. – Luzon
You are right, I guess I could do that; I forgot I could just load an offline HTML stream. By the way, do you know how I can access the HTML from the HtmlDocument so I can store it? – Hols
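One possible way to do the store-and-re-parse round trip, as a sketch only: it reads the markup from `DocumentNode.OuterHtml`, writes it to a file, and parses it again on the next run. The file name `page.html` and the sample markup are made up for the example.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

public static class Program
{
    public static void Main()
    {
        // First pass: a document obtained from somewhere (inline markup here, for brevity).
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><a href=\"http://example.com/\">Example</a></body></html>");

        // Store the raw markup; the path is a hypothetical choice.
        string path = Path.Combine(Path.GetTempPath(), "page.html");
        File.WriteAllText(path, doc.DocumentNode.OuterHtml);

        // Second pass (e.g. after a restart): re-parse instead of deserializing a DOM.
        var reloaded = new HtmlDocument();
        reloaded.LoadHtml(File.ReadAllText(path));

        // SelectNodes returns null when nothing matches, so guard before iterating.
        HtmlNodeCollection links = reloaded.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
                Console.WriteLine(link.GetAttributeValue("href", string.Empty));
        }
    }
}
```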
U
1

For BSON you can specify your own serializer for any class; see http://www.mongodb.org/display/DOCS/CSharp+Driver+Serialization+Tutorial#CSharpDriverSerializationTutorial-Writeacustomserializer

Here's an example using it to serialize C# dynamic variables.

Unisexual answered 23/10, 2011 at 22:4 Comment(0)
