I've used Alloy on a few projects and have found it helpful; on some but not all of those projects I have been able to persuade others involved to use Alloy as well, or at least to work with the Alloy models I wrote. These projects may or may not be what you have in mind in asking for 'real-world' projects, but they certainly took place in the part of the real world I work in.
In 2006 and 2007 I created a partial Alloy model for the then-current draft of the W3C XProc specification. As far as I could tell, most members of the working group never read the paper I wrote (at http://www.w3.org/XML/XProc/2006/12/alloy-models/models.html); their response was "Oh, we changed that part of the spec last week, so what the model says is no longer relevant". But the paper did manage to persuade the editor of the spec that the abstract 'component' level described in the first draft of the spec was woefully underspecified and needed to be either fully specified or dropped. He dropped it, with (I think) good results for the readability and usability of the spec.
In 2010 I made an Alloy model of the XPath 1.0 data model, which uncovered some glitches in the specification. The reaction of most interested parties (including the W3C working group responsible for maintaining the XPath 1.0 spec) has, unfortunately, not been encouraging.
A research project I'm involved with has used Alloy to model the MLCD Overlap Corpus, a collection of sample documents and related information we are creating (hyperlinks suppressed at SO's insistence); the Alloy model found a couple of errors in our initial design for the corpus catalog, so it was well worth the effort.
We have also used Alloy to formalize some modeling work we have done on the nature of transcription and on the extension of the type/token distinction to document structure (for our paper, look for the 2010 proceedings of Balisage: The Markup Conference). This lies a little outside Alloy's usual area of application, as it has nothing to do with software design, but Alloy's ability to check models for consistency and generate instances has been invaluable in showing us some of the logical consequences of this or that possible axiom for our model.
To answer your specific questions: yes, Alloy has helped me specify cleaner domain models, and yes, it has found errors and glitches. The errors have often been small, for two reasons Daniel Jackson explains in his book Software Abstractions. First, if you use models during design, you catch errors early, when everything is still small. Second (in Jackson's words), "In hindsight, most software design issues are trivial."
He continues: "But if you don't address them head-on, trivial issues have a nasty habit of becoming nontrivial." My experience amply confirms this. Much better to head off such problems early. So yes, I will use Alloy again.