HTML5 local datastore, and sync across devices
O

3

8

I am building a full-featured web application. Naturally, users can save to the local datastore when they are in 'offline' mode. I also want to be able to sync across devices, so people can work on one machine, save, then get on another machine and load their stuff.

The questions are:

1) Is it a bad idea to store JSON on the server? Why parse the JSON on the server into model objects when it is just going to be passed back to the (other) client(s) as JSON?

2) I'm not sure whether I want to try a NoSQL technology for this. I am not breaking the JSON down; for now the only relationships in the db would be from a user account to their entries. Other than the user data, the domain model would be a String, which is the JSON. Advice welcome.

In theory, in the future I might want to do some processing on the server or set up more complicated relationships. In other words, right now I would just be saving the JSON, but in the future I might want a more traditional relational system. Would a NoSQL approach get in the way of this?

3) Are there any security concerns with this? JS injection for example? In theory, for this use case, the user doesn't get to enter anything, at least right now.

Thank you in advance.

EDIT - Thanks for the answers. I chose the answer I did because it went into the most detail on the advantages and disadvantages of NoSQL.

Oculus answered 5/11, 2010 at 14:45 Comment(8)
I don't think you need tons of data to consider a NoSQL solution. I think you should pick the tool that's right for the job based on its features. In this case CouchDB could be perfect because of its powerful replication and offline approach.Suribachi
@Suribachi -- yes, I agree. My question is: 'Is a NoSQL store the correct technology' for storing JSON, among other questions.Oculus
JSON is format CouchDB uses to store its documents, so I'd say it's definitely the correct technology to store JSON :PSuribachi
@rwilliams, don't feel like writing a detailed answer?Oculus
Added an answer. Ideally, I'd like to know a lot more about the app you're building and the audience it's for.Suribachi
@rwilliams, the link is on my profile.Oculus
@rwilliams, what happens if I change my mind and want to process on the server.Oculus
If you want to process on the server then CouchDB would be a fine choice as well. Either way it's more optimized for storing JSON than a blob in a MySQL table.Suribachi
S
3

JSON on the SERVER

It's not a bad idea at all to store JSON on the server, especially if you go with a NoSQL solution like MongoDB or CouchDB. Both use JSON as their native format (MongoDB actually uses BSON, a binary encoding that is quite similar).

NoSQL Approach: Assuming CouchDB as the storage engine

  • Baked in replication and concurrency handling
  • Very simple REST API: you talk to the database over HTTP.
  • Stores data as JSON natively, not in blobs or text fields
  • Powerful View/Query engine that will allow you to continue to grow the complexity of your documents
  • Offline Mode. You can talk to CouchDB directly using JavaScript and have the entire app continue to run on the client if the internet isn't available.
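
To make the "talk to the database over HTTP" point concrete, here is a minimal sketch of the request you would send to save one document. CouchDB creates or updates a document with PUT /{db}/{docid}; everything else here (the helper name buildSaveRequest, the database name, the document fields, the local URL) is an assumption made for illustration.

```javascript
// Sketch only: builds the HTTP request CouchDB expects for saving a document.
// The base URL, database name, and document shape are hypothetical.
function buildSaveRequest(baseUrl, dbName, doc) {
  if (!doc || typeof doc._id !== "string" || doc._id === "") {
    throw new Error("document needs a non-empty string _id");
  }
  return {
    // PUT /{db}/{docid} creates the document, or updates it when the body
    // carries the document's current _rev.
    method: "PUT",
    url: baseUrl.replace(/\/+$/, "") + "/" + encodeURIComponent(dbName) +
         "/" + encodeURIComponent(doc._id),
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(doc)
  };
}

const request = buildSaveRequest("http://localhost:5984/", "entries", {
  _id: "user42-puzzle7",
  owner: "user42",
  cells: [5, 3, 0, 0, 7]
});
console.log(request.method, request.url);
// PUT http://localhost:5984/entries/user42-puzzle7
```

The returned object only describes the request; you would hand its fields to XMLHttpRequest or whatever HTTP client your app already uses.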

Security

Make sure you're parsing the JSON documents with the browser's JSON.parse or a JavaScript library that is safe (json2.js), rather than evaluating them as code.
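
A small sketch of what that looks like in practice. safeParse is a hypothetical helper name, not part of any library; the point is only that JSON.parse rejects anything that is not plain JSON instead of running it.

```javascript
// Sketch: parse untrusted JSON text without ever evaluating it as code.
// safeParse is a hypothetical helper name, not part of any library.
function safeParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // JSON.parse throws a SyntaxError on anything that is not valid JSON,
    // including script-like payloads that eval() would have executed.
    return { ok: false, value: null };
  }
}

console.log(safeParse('{"cells":[1,2,3]}').ok); // true
console.log(safeParse('alert("pwned")').ok);    // false
```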

Conclusion

I think the reason I'd suggest going with NoSQL here, CouchDB in particular, is that it's going to handle most of the hard stuff for you. Replication is going to be a snap to set up, and concurrent updates are handled by CouchDB's revision tracking rather than by code you write yourself.

That said, I don't know what kind of App you're building. I don't know what your relationship is going to be to the clients and how easy it'll be to get them to put CouchDB on their machines.
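
As a rough illustration of how little is needed to set up replication: CouchDB exposes a /_replicate endpoint that takes a JSON body naming a source and a target database. The helper name and both database locations below are hypothetical.

```javascript
// Sketch: the request for CouchDB's /_replicate endpoint.
// The server URL and both database names/URLs are hypothetical.
function buildReplicationRequest(serverUrl, source, target) {
  return {
    method: "POST",
    url: serverUrl.replace(/\/+$/, "") + "/_replicate",
    headers: { "Content-Type": "application/json" },
    // source and target may be local database names or full remote URLs.
    body: JSON.stringify({ source: source, target: target })
  };
}

const replication = buildReplicationRequest(
  "http://localhost:5984/",
  "entries",
  "http://example.org:5984/entries"
);
console.log(replication.url); // http://localhost:5984/_replicate
```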

Links

  1. CouchDB @ Apache
  2. CouchOne
  3. CouchDB the definitive guide
  4. MongoDB

Update:

After looking at the app, I don't think CouchDB will be a good client-side option, as you're not going to require folks to install a database engine to play sudoku. That said, I still think it'd be a great server-side option. If you wanted to sync the server CouchDB instance with the client, you could use something like BrowserCouch, which is a JavaScript implementation of CouchDB for local storage.

Suribachi answered 14/11, 2010 at 1:41 Comment(6)
@rwilliams, what is the process like running a server side batch job to analyze things, given that the data is stored as json? Do you just load and parse and go on your way? I was going to use html5 localstorage to save client side data. How would that work in addition to couchdb? Isn't it one or the other?Oculus
To analyze things you'd write a map or map/reduce function with CouchDB's view engine. This function is applied to every document in the database as it is created or updated, and builds a B-tree index with the keys and values you choose. Unless you really want folks to be able to play the game offline, I'd say it's probably going to be one or the other. If you wanted to use local storage in addition to CouchDB, you could use BrowserCouch and then sync the data to the server to process. The map function I talked about earlier could be done in BrowserCouch as well.Suribachi
@rwilliams, what about downsides of nosql approach? give some tradeoffs for a more complete answer.Oculus
Slow ad hoc queries/views. Smaller community for support/questions. Flat namespace, no 'tables'. Larger database files, as it uses copy-on-write (this can be alleviated with regular compaction). No true relationships between documents; you'll have to relate the documents via the view engine.Suribachi
@rwilliams, hmm it seems that if one stores json on server, it is better to use nosql because of the 'view' into the data. With RDBMS one would have to process the json to analyze it. Is that correct?Oculus
Correct. With an RDBMS you'd have to deserialize the JSON into objects and then start analyzing it. Unless you stored the analysis results somewhere else, you'd have to repeat this process every time you wanted to analyze the data. By contrast, the CouchDB view engine incrementally updates your view as each document changes, so your analysis queries will be extremely fast with almost no processing at all.Suribachi
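
The map function described in the comments above might look roughly like this. Inside CouchDB, emit() is supplied by the view server; here a stand-in collects the rows so the sketch runs on its own. The field names (owner, solved) and the sample documents are assumptions, not anything from the actual app.

```javascript
// Stand-in for the emit() that CouchDB's view server provides to map functions.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// A map function in the shape CouchDB expects: called once per document,
// emitting zero or more key/value rows. Field names are hypothetical.
function mapSolvedByOwner(doc) {
  if (doc.owner && doc.solved === true) {
    emit(doc.owner, 1);
  }
}

// Stand-in for the documents stored in the database.
const docs = [
  { _id: "a", owner: "alice", solved: true },
  { _id: "b", owner: "bob", solved: false },
  { _id: "c", owner: "alice", solved: true }
];
docs.forEach(mapSolvedByOwner);

// Equivalent of a summing reduce, grouped by key.
const solvedPerOwner = {};
rows.forEach(function (row) {
  solvedPerOwner[row.key] = (solvedPerOwner[row.key] || 0) + row.value;
});
console.log(solvedPerOwner); // { alice: 2 }
```

In a real database only mapSolvedByOwner would live in the design document; the loop and the grouping are what the view engine does for you, incrementally.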
R
3
  1. If most of your processing is going to be done on the client side using JavaScript, I don't see any problem in storing JSON directly on the server.

  2. If you just want to play around with new technologies, you're most welcome to try something different, but for most applications, there isn't a real reason to depart from traditional databases, and SQL makes life simple.

  3. You're safe as long as you use the standard JSON.parse function to parse JSON strings - some browsers (Firefox 3.5 and above, for example) already have a native version, while Crockford's json2.js can replicate this functionality in others.

Reine answered 8/11, 2010 at 0:47 Comment(3)
@hvgotcodes: When you've got tons of data and you actually notice that a traditional database is affecting your performance. This almost never happens unless, for example, you're Google. Even Facebook uses only MySQL databases coupled with memcached to improve performance.Reine
any updates before the end of the bounty, based on other answers?Oculus
@hvgotcodes: I guess the only thing that remains questionable is part (2), and that, at least in this case, is somewhat subjective / based on personal preference. The "answer" is whichever you find best for you. :)Reine
B
2

Just read your post and I have to say I quite like your approach. It heralds the way many web applications will probably work in the future, with both an element of local storage (for the disconnected state) and online storage (the master database, which keeps all customers' records in one place and syncs them to other client devices).

Here are my answers:

1) Storing JSON on server: I'm not sure I would store the objects as JSON. It's possible to do so if your application is quite simple, but it will hamper efforts to use the data (running reports and emailing them in a batch job, for example). I would prefer to use JSON for TRANSFERRING the information and a SQL database for storing it.

2) NoSQL Approach: I think you've answered your own question there. My preferred approach would be to set up a SQL database now (if the extra resource needed is not a problem). That way you'll save yourself the work of setting up a data access layer for NoSQL that you will probably have to remove in the future. SQLite is a good choice if you don't want a fully featured RDBMS.

If writing a schema is too much hassle and you still want to save JSON on the server, then you can hack up a JSON object management system with a single table and some parsing on the server side to return relevant records. Doing this will be easier and require less permissioning than saving/deleting files.
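
A minimal sketch of that single-table idea, with an array standing in for the database table so it runs on its own. The column names, helper names, and sample entries are all hypothetical; in a real setup the filter on userId would be a WHERE clause and only the JSON parsing would happen in application code.

```javascript
// Sketch of the single-table idea: each row holds an owner and an opaque JSON
// string. An array stands in for the database table; names are hypothetical.
const entriesTable = [];
let nextId = 1;

function saveEntry(userId, jsonText) {
  JSON.parse(jsonText); // throws if the payload is not valid JSON
  const row = { id: nextId++, userId: userId, json: jsonText };
  entriesTable.push(row);
  return row.id;
}

// "Some parsing on the server side": load a user's rows, parse each one, and
// keep only those matching a predicate.
function findEntries(userId, predicate) {
  return entriesTable
    .filter(function (row) { return row.userId === userId; })
    .map(function (row) { return JSON.parse(row.json); })
    .filter(predicate);
}

saveEntry("alice", '{"puzzle":7,"solved":true}');
saveEntry("alice", '{"puzzle":8,"solved":false}');
saveEntry("bob", '{"puzzle":7,"solved":true}');

const solved = findEntries("alice", function (e) { return e.solved; });
console.log(solved.length); // 1
```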

3) Security: You mentioned there is no user input at the moment:

"for this use case, the user doesn't get to enter anything"

However, at the beginning of the question you also mentioned that the user can

"work on one machine, save, then get on another machine and load their stuff"

If this is the case, then your application will be storing user data. It doesn't matter that you haven't provided a nice GUI for them to do so: you will have to worry about security from more than one standpoint, and JSON.parse or similar tools only solve half of the problem (the client side).

Basically, you will also have to check the contents of your POST request on the server to determine whether the data being sent is valid and realistic. The integrity of the JSON object (or any data you are trying to save) needs to be validated on the server (using PHP or a similar language) BEFORE saving to your data store. Someone can easily bypass your JavaScript-layer "security" and tamper with the POST request, even if you didn't intend them to, and then your application will be sending the evil input out to the client anyway.

If you have the server side of things tidied up, then JSON.parse becomes a bit obsolete in terms of preventing JS injection. Still, it's not bad to have the extra layer, especially if you are relying on remote website APIs to get some of your data.
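
A sketch of what such a server-side check could look like. It is written in JavaScript here only to keep one language throughout; the same checks apply in PHP or anything else. The entry shape (a puzzle id plus 81 cells holding 0-9) and the size limit are assumptions for illustration, since the real data format isn't described.

```javascript
// Sketch of server-side validation run BEFORE anything is saved. The entry
// shape and the size limit are assumptions made for illustration.
const MAX_BODY_LENGTH = 10000;

function validateEntry(rawBody) {
  if (typeof rawBody !== "string" || rawBody.length > MAX_BODY_LENGTH) {
    return { valid: false, reason: "missing or oversized body" };
  }
  let entry;
  try {
    entry = JSON.parse(rawBody);
  } catch (err) {
    return { valid: false, reason: "not valid JSON" };
  }
  if (entry === null || typeof entry !== "object" || Array.isArray(entry)) {
    return { valid: false, reason: "expected a JSON object" };
  }
  if (!Number.isInteger(entry.puzzle) || entry.puzzle < 0) {
    return { valid: false, reason: "bad puzzle id" };
  }
  const cellsOk = Array.isArray(entry.cells) && entry.cells.length === 81 &&
    entry.cells.every(function (c) {
      return Number.isInteger(c) && c >= 0 && c <= 9;
    });
  if (!cellsOk) {
    return { valid: false, reason: "bad cells" };
  }
  // Re-serialize only the known fields so unexpected properties are dropped.
  return {
    valid: true,
    clean: JSON.stringify({ puzzle: entry.puzzle, cells: entry.cells })
  };
}

const good = JSON.stringify({
  puzzle: 7,
  cells: new Array(81).fill(0),
  extra: "<script>alert(1)</script>"
});
console.log(validateEntry(good).valid);           // true
console.log(validateEntry('{"puzzle":7}').valid); // false
```

Storing the re-serialized `clean` string rather than the raw request body means only fields you have checked ever reach the data store or other clients.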

Hope this is useful to you.

Bonnice answered 8/11, 2010 at 20:6 Comment(2)
any updates before the end of the bounty, based on other answers?Oculus
@hvgotcodes, I also proposed using a simple RDBMS solution with one table for storing JSON.Bonnice
