When building a web based IMAP client, should I make the web server talk directly to the IMAP server or should I have a database in the middle that gets synchronized when required?
The fact that you are asking the question implies that you are worried that the webmail front end will not work effectively with the IMAP backend. I can think of a few reasons; please correct me if I'm wrong:
- The webmail client, being stateless, will make a lot of calls on the IMAP server, and a lot of them will be essentially repeats (e.g. when the user refreshes the screen). You are concerned that the number of calls, and the strictly unnecessary nature of most of them, will either:
  - Overwhelm the IMAP server, or
  - Rack up large network/data/SLA bills with a third-party IMAP provider, or
  - Be much slower than direct DB access
- The IMAP server is external and may sometimes be unavailable to your datacenter, and you need to ensure the webmail client continues to provide a service to your customers.
There is a key decision to make here, and I think it depends on whether the IMAP server you need to connect to is internal or external to your organisation. The concerns in each case are:
- Internal:
  - Performance of the webmail component may suffer from latency to the IMAP server
- External:
  - Network bandwidth or IMAP usage is monetarily costly
  - Performance as above
  - You need to mirror the mail data while the main IMAP server is offline.
Against this backdrop, you are considering whether you can get away without a database, because you know that adding this layer will:
- Be expensive
- Add considerably to the project timeline (for design, development and testing)
- Add technical and delivery risks to the project, and
- Add a major architectural component that you might otherwise do without.
Internal IMAP Server - Performance
Before anything else, it's probably a good idea to benchmark the system to see if there really is a performance hit. My gut says that there doesn't have to be one, because there are many highly responsive webmail systems out there.
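As a rough illustration of that benchmark, here is a minimal sketch in Python that times a full IMAP round trip using the standard `imaplib` module. The helper names `time_calls` and `list_inbox` are made up for this example, and the host and credentials in the usage note are placeholders, not real values.

```python
import imaplib
import statistics
import time
from typing import Callable, Dict


def time_calls(operation: Callable[[], object], repeats: int = 20) -> Dict[str, float]:
    """Run `operation` several times and summarise wall-clock latency in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def list_inbox(host: str, user: str, password: str) -> int:
    """One stateless request: connect, log in, select INBOX, search, log out."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)
        status, data = conn.search(None, "ALL")
        # data[0] is a space-separated list of message numbers
        return len(data[0].split()) if status == "OK" else 0
    finally:
        conn.logout()
```

Something like `time_calls(lambda: list_inbox("imap.example.com", "user", "secret"))` would then give a first impression of how much each uncached page load costs. Comparing that with the same measurement on an already-open connection shows how much of the time is connection setup and login.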
First, there are a number of very good IMAP proxy servers that keep IMAP connections alive, which decreases latency considerably. Examples include:
Secondly, if you look at the IMAP server and webmail web application as a system, it probably makes sense that you don't cache the IMAP data in yet another database. You will introduce data latencies from IMAP server to your database, have data and database management headaches and introduce system complexities with many new points of failure.
Instead, can you optimize your IMAP server for use with the webmail app? This might involve buying an additional server or upgrading your current one - but at the same time, your webmail server will be that much smaller and you won't have to buy a database server for it.
The IMAP server almost certainly has internal caches for performance, and almost certainly uses a database (with its own caching etc.). It has been tuned and debugged over many years by many hands. You can use that experience and maturity.
Let's imagine a hypothetical problem - the system grows large and starts to suffer performance problems. Is it easier to tune and scale a custom application with custom DB tables, or is it easier to scale a widely used commercial or open source IMAP server with commercial support available and good, tested documentation?
External IMAP Server - minimizing traffic and maximizing performance
With this concern, the aim is to minimize IMAP protocol calls because they are expensive in time (network latency) or money.
First, you can use an IMAPProxy (as described above) to ensure connections stay alive and users logged in.
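To make the idea of keeping connections alive concrete, here is a small in-process sketch in Python. It is not how any particular IMAP proxy is implemented; `ConnectionCache`, the `connect` factory and the idle limit are all assumptions chosen for illustration. The point is only that a logged-in connection is reused across stateless web requests instead of being re-established each time.

```python
import time
from typing import Callable, Dict, Tuple


class ConnectionCache:
    """Keep one open connection per user and reuse it until it has idled too long."""

    def __init__(
        self,
        connect: Callable[[str], object],
        max_idle_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._connect = connect          # hypothetical factory: logs in and returns a connection
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._open: Dict[str, Tuple[object, float]] = {}

    def get(self, user: str) -> object:
        now = self._clock()
        entry = self._open.get(user)
        if entry is not None and now - entry[1] <= self._max_idle:
            conn = entry[0]              # still fresh: skip connect and login
        else:
            # A fuller version would also log out the stale connection here.
            conn = self._connect(user)
        self._open[user] = (conn, now)   # record the time of last use
        return conn
```

A dedicated proxy does the same job outside the web application, which has the advantage that the webmail code does not need to change at all.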
Additionally, I would argue that you need to use a database, but in the mode of a cache rather than a full data model. For example, you could use a NoSQL DB (either a key-value or object DB) rather than a SQL DB:
- Store objects (messages, folder metadata, attachments, etc.) rather than normalized relational data
- ACID behaviour is probably not needed - it's a cache
- Most lookups by object id or class, not by complex WHERE clauses
Implementing in this way will make the task very use-case or user-story specific, reduce cost and risk and make the system more testable. If there is a serious issue in your implementation, the whole cache can be flushed and service will be restored.
Data and database management will also be much easier, and the full user's mailbox will not be stored.
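The cache approach described above can be sketched roughly as follows, in Python. This is an in-memory stand-in for whatever key-value store you would actually use; `ObjectCache`, the TTL and the key layout are assumptions for illustration. The key shown in the comment uses the IMAP UIDVALIDITY and UID values so that a cached message cannot be confused with a different one after the server renumbers a folder.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class ObjectCache:
    """Look-aside cache: lookups by key only, no transactions, safe to flush."""

    def __init__(
        self,
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[Hashable, Tuple[object, float]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], object]) -> object:
        """Return the cached object, or call `fetch` (the IMAP round trip) on a miss."""
        now = self._clock()
        hit = self._items.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        value = fetch()
        self._items[key] = (value, now + self._ttl)
        return value

    def flush(self) -> None:
        """Drop everything; the IMAP server remains the source of truth."""
        self._items.clear()


# Example key (assumed layout): (user, folder, UIDVALIDITY, UID)
# cache.get_or_fetch(("alice", "INBOX", 1700000000, 42), lambda: fetch_message(...))
```

Because nothing lives only in the cache, `flush()` is always a safe recovery step, which is what keeps the risk of this design low.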
External IMAP Server - disconnected model
This presupposes that you are providing a webmail client for an external IMAP service, and that you know the external IMAP service periodically goes offline, yet you still need to provide email to the users.
Here, you clearly need to mirror the users' email in a local database. I'd suggest that you look at an architectural solution where you have a local IMAP server and you use any one of many open source IMAP synchronization tools to mirror the third-party IMAP server. The benefits here are:
- To your webmail app, the local IMAP server looks exactly the same as any other IMAP server simplifying the webmail client
- Synchronization is a tough problem with many edge cases; all these have been solved before
- By having a full local IMAP server, you have a fully manageable component with optional support with no development cost
- You can have some users connecting direct to external IMAP servers, and some to cached ones using this architecture - it's just a URL
The disadvantages are:
- Potentially a huge cost in storage of duplicate emails
- A longer delay before the user sees new emails, since they have to be detected first on the external IMAP server, then again locally.
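The "it's just a URL" point from the list of benefits could look something like this in the webmail configuration. This is a Python sketch only; the host names, the `ImapEndpoint` type and the `endpoint_for` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import AbstractSet


@dataclass(frozen=True)
class ImapEndpoint:
    host: str
    port: int = 993  # standard IMAP-over-TLS port


# Placeholder host names, not real services.
EXTERNAL = ImapEndpoint("imap.provider.example")
LOCAL_MIRROR = ImapEndpoint("imap-mirror.internal.example")


def endpoint_for(user: str, mirrored_users: AbstractSet[str]) -> ImapEndpoint:
    """Users whose mail is mirrored talk to the local server; everyone else goes direct."""
    return LOCAL_MIRROR if user in mirrored_users else EXTERNAL
```

The webmail code speaks plain IMAP in both cases, so moving a user between the two modes is a configuration change rather than a code change.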
You can use one of many IMAP servers locally, and for synchronization, here are some possibilities:
(Disclaimer: I have not used these tools myself)
This depends entirely on the requirements, the criticality of the service to be provided and the amount of time you have ;-)
I have seen cases with Yahoo and Rediffmail IMAP where mails were re-downloaded at random times on my BlackBerry. (For your information, the BlackBerry mail service first downloads all mail from the IMAP servers to its own servers and then distributes it to the devices.)
Using an intermediate store is a good idea since it can shield users from server-related issues such as mail fetch errors, random bugs or server crashes. Mail activities such as reading, deleting or moving a message to a different folder would also be faster, and could be handled more conveniently even if the destination IMAP server is down.
I would think that the user would benefit greatly from the database approach and perhaps storage cost could be cut by limiting the number of recent messages stored in the database. This would provide the user with a snappy initial load so they could get in, read the latest emails and get out without waiting for the IMAP server sync. The database approach may be a solution to the folder problem as well because you can cache what you downloaded the last time you connected then update in the background preventing the user from waiting around for the server sync. In short the user would be happier with a database approach and that's what matters.
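A minimal sketch of the "keep only the most recent messages" idea, in Python. `RecentMessages` and its limit are assumptions for illustration, and an in-memory dict stands in for the database. It relies on the IMAP property that, within one folder and UIDVALIDITY, newer messages get higher UIDs.

```python
from typing import Dict, List, Tuple

Summary = Tuple[int, str]  # (UID, subject)


class RecentMessages:
    """Keep only the newest `limit` message summaries per user."""

    def __init__(self, limit: int = 50) -> None:
        self._limit = limit
        self._by_user: Dict[str, Dict[int, str]] = {}

    def merge(self, user: str, fetched: List[Summary]) -> None:
        """Fold in summaries from a background sync, then trim to the limit."""
        known = dict(self._by_user.get(user, {}))
        known.update(fetched)
        newest = sorted(known, reverse=True)[: self._limit]
        self._by_user[user] = {uid: known[uid] for uid in newest}

    def latest(self, user: str) -> List[Summary]:
        """What to show immediately on login, newest first."""
        return sorted(self._by_user.get(user, {}).items(), reverse=True)
```

The page would render `latest(user)` straight away and call `merge` once the background sync with the IMAP server returns, so storage stays bounded while the first load stays fast.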