Inlining CSS in C#

I need to inline css from a stylesheet in c#.

Like how this works.

http://www.mailchimp.com/labs/inlinecss.php

The css is simple, just classes, no fancy selectors.

I was contemplating using a regex, (?<rule>(?<selector>[^{}]+){(?<style>[^{}]+)})+, to pull the rules out of the CSS, and then attempting simple string replacements where the classes are used, but some of the HTML elements already have a style attribute, so I'd have to account for that as well.

Is there a simpler approach? Or something already written in c#?

UPDATE - Sep 16, 2010

I've been able to come up with a simple CSS inliner provided your html is also valid xml. It uses a regex to get all the styles in your <style /> element. Then converts the css selectors to xpath expressions, and adds the style inline to the matching elements, before any pre-existing inline style.

Note, that the CssToXpath is not fully implemented, there are some things it just can't do... yet.

CssInliner.cs

using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;
using System.Xml.XPath;

namespace CssInliner
{
    public class CssInliner
    {
        private static Regex _matchStyles = new Regex("\\s*(?<rule>(?<selector>[^{}]+){(?<style>[^{}]+)})",
                                                RegexOptions.IgnoreCase
                                                | RegexOptions.CultureInvariant
                                                | RegexOptions.IgnorePatternWhitespace
                                                | RegexOptions.Compiled
                                            );

        public List<Match> Styles { get; private set; }
        public string InlinedXhtml { get; private set; }

        private XElement XhtmlDocument { get; set; }

        public CssInliner(string xhtml)
        {
            XhtmlDocument = ParseXhtml(xhtml);
            Styles = GetStyleMatches();

            foreach (var style in Styles)
            {
                // Skip a failed match instead of returning, which would leave InlinedXhtml unset.
                if (!style.Success)
                    continue;

                var cssSelector = style.Groups["selector"].Value.Trim();
                var xpathSelector = CssToXpath.Transform(cssSelector);
                var cssStyle = style.Groups["style"].Value.Trim();

                foreach (var element in XhtmlDocument.XPathSelectElements(xpathSelector))
                {
                    var inlineStyle = element.Attribute("style");

                    var newInlineStyle = cssStyle + ";";
                    if (inlineStyle != null && !string.IsNullOrEmpty(inlineStyle.Value))
                    {
                        newInlineStyle += inlineStyle.Value;
                    }

                    element.SetAttributeValue("style", newInlineStyle.Trim().NormalizeCharacter(';').NormalizeSpace());
                }
            }

            XhtmlDocument.Descendants("style").Remove();
            InlinedXhtml = XhtmlDocument.ToString();
        }

        private List<Match> GetStyleMatches()
        {
            var styles = new List<Match>();

            var styleElements = XhtmlDocument.Descendants("style");
            foreach (var styleElement in styleElements)
            {
                var matches = _matchStyles.Matches(styleElement.Value);

                foreach (Match match in matches)
                {
                    styles.Add(match);
                }
            }

            return styles;
        }

        private static XElement ParseXhtml(string xhtml)
        {
            return XElement.Parse(xhtml);
        }
    }
}

CssToXpath.cs

using System.Text.RegularExpressions;

namespace CssInliner
{
    public static class CssToXpath
    {
        public static string Transform(string css)
        {
            #region Translation Rules
            // References:  http://ejohn.org/blog/xpath-css-selectors/
            //              http://code.google.com/p/css2xpath/source/browse/trunk/src/css2xpath.js
            var regexReplaces = new[] {
                                          // add @ for attribs
                                          new RegexReplace {
                                              Regex = new Regex(@"\[([^\]~\$\*\^\|\!]+)(=[^\]]+)?\]", RegexOptions.Multiline),
                                              Replace = @"[@$1$2]"
                                          },
                                          //  multiple queries
                                          new RegexReplace {
                                              Regex = new Regex(@"\s*,\s*", RegexOptions.Multiline),
                                              Replace = @"|"
                                          },
                                          // , + ~ >
                                          new RegexReplace {
                                              Regex = new Regex(@"\s*(\+|~|>)\s*", RegexOptions.Multiline),
                                              Replace = @"$1"
                                          },
                                          //* ~ + >
                                          new RegexReplace {
                                              Regex = new Regex(@"([a-zA-Z0-9_\-\*])~([a-zA-Z0-9_\-\*])", RegexOptions.Multiline),
                                              Replace = @"$1/following-sibling::$2"
                                          },
                                          new RegexReplace {
                                              Regex = new Regex(@"([a-zA-Z0-9_\-\*])\+([a-zA-Z0-9_\-\*])", RegexOptions.Multiline),
                                              Replace = @"$1/following-sibling::*[1]/self::$2"
                                          },
                                          new RegexReplace {
                                              Regex = new Regex(@"([a-zA-Z0-9_\-\*])>([a-zA-Z0-9_\-\*])", RegexOptions.Multiline),
                                              Replace = @"$1/$2"
                                          },
                                          // all unescaped stuff escaped
                                          new RegexReplace {
                                              Regex = new Regex(@"\[([^=]+)=([^'|""][^\]]*)\]", RegexOptions.Multiline),
                                              Replace = @"[$1='$2']"
                                          },
                                          // all descendant or self to //
                                          new RegexReplace {
                                              Regex = new Regex(@"(^|[^a-zA-Z0-9_\-\*])(#|\.)([a-zA-Z0-9_\-]+)", RegexOptions.Multiline),
                                              Replace = @"$1*$2$3"
                                          },
                                          new RegexReplace {
                                              Regex = new Regex(@"([\>\+\|\~\,\s])([a-zA-Z\*]+)", RegexOptions.Multiline),
                                              Replace = @"$1//$2"
                                          },
                                          new RegexReplace {
                                              Regex = new Regex(@"\s+\/\/", RegexOptions.Multiline),
                                              Replace = @"//"
                                          },
                                          // :first-child
                                          new RegexReplace {
                                              Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):first-child", RegexOptions.Multiline),
                                              Replace = @"*[1]/self::$1"
                                          },
                                          // :last-child
                                          new RegexReplace {
                                              Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):last-child", RegexOptions.Multiline),
                                              Replace = @"$1[not(following-sibling::*)]"
                                          },
                                          // :only-child
                                          new RegexReplace {
                                              Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):only-child", RegexOptions.Multiline),
                                              Replace = @"*[last()=1]/self::$1"
                                          },
                                          // :empty
                                          new RegexReplace {
                                              Regex = new Regex(@"([a-zA-Z0-9_\-\*]+):empty", RegexOptions.Multiline),
                                              Replace = @"$1[not(*) and not(normalize-space())]"
                                          },
                                          // |= attrib
                                          new RegexReplace {
                                              Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\|=([^\]]+)\]", RegexOptions.Multiline),
                                              Replace = @"[@$1=$2 or starts-with(@$1,concat($2,'-'))]"
                                          },
                                          // *= attrib
                                          new RegexReplace {
                                              Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\*=([^\]]+)\]", RegexOptions.Multiline),
                                              Replace = @"[contains(@$1,$2)]"
                                          },
                                          // ~= attrib
                                          new RegexReplace {
                                              Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)~=([^\]]+)\]", RegexOptions.Multiline),
                                              Replace = @"[contains(concat(' ',normalize-space(@$1),' '),concat(' ',$2,' '))]"
                                          },
                                          // ^= attrib
                                          new RegexReplace {
                                              Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\^=([^\]]+)\]", RegexOptions.Multiline),
                                              Replace = @"[starts-with(@$1,$2)]"
                                          },
                                          // != attrib
                                          new RegexReplace {
                                              Regex = new Regex(@"\[([a-zA-Z0-9_\-]+)\!=([^\]]+)\]", RegexOptions.Multiline),
                                              Replace = @"[not(@$1) or @$1!=$2]"
                                          },
                                          // ids
                                          new RegexReplace {
                                              Regex = new Regex(@"#([a-zA-Z0-9_\-]+)", RegexOptions.Multiline),
                                              Replace = @"[@id='$1']"
                                          },
                                          // classes
                                          new RegexReplace {
                                              Regex = new Regex(@"\.([a-zA-Z0-9_\-]+)", RegexOptions.Multiline),
                                              Replace = @"[contains(concat(' ',normalize-space(@class),' '),' $1 ')]"
                                          },
                                          // normalize multiple filters
                                          new RegexReplace {
                                              Regex = new Regex(@"\]\[([^\]]+)", RegexOptions.Multiline),
                                              Replace = @" and ($1)"
                                          },

                                      };
            #endregion

            foreach (var regexReplace in regexReplaces)
            {
                css = regexReplace.Regex.Replace(css, regexReplace.Replace);
            }

            return "//" + css;
        }
    }

    struct RegexReplace
    {
        public Regex Regex;
        public string Replace;
    }
}

And some tests

    [TestMethod]
    public void TestCssToXpathRules()
    {
        var translations = new Dictionary<string, string>
                               {
                                   { "*", "//*" }, 
                                   { "p", "//p" }, 
                                   { "p > *", "//p/*" }, 
                                   { "#foo", "//*[@id='foo']" }, 
                                   { "*[title]", "//*[@title]" }, 
                                   { ".bar", "//*[contains(concat(' ',normalize-space(@class),' '),' bar ')]" }, 
                                   { "div#test .note span:first-child", "//div[@id='test']//*[contains(concat(' ',normalize-space(@class),' '),' note ')]//*[1]/self::span" }
                               };

        foreach (var translation in translations)
        {
            var expected = translation.Value;
            var result = CssInliner.CssToXpath.Transform(translation.Key);

            Assert.AreEqual(expected, result);
        }
    }

    [TestMethod]
    public void HtmlWithMultiLineClassStyleReturnsInline()
    {
        #region var html = ...
        var html = XElement.Parse(@"<html>
                                        <head>
                                            <title>Hello, World Page!</title>
                                            <style>
                                                .redClass { 
                                                    background: red; 
                                                    color: purple; 
                                                }
                                            </style>
                                        </head>
                                        <body>
                                            <div class=""redClass"">Hello, World!</div>
                                        </body>
                                    </html>").ToString();
        #endregion

        #region const string expected ...
        var expected = XElement.Parse(@"<html>
                                            <head>
                                                <title>Hello, World Page!</title>
                                            </head>
                                            <body>
                                                <div class=""redClass"" style=""background: red; color: purple;"">Hello, World!</div>
                                            </body>
                                        </html>").ToString();
        #endregion

        var result = new CssInliner.CssInliner(html);

        Assert.AreEqual(expected, result.InlinedXhtml);
    }

There are more tests, but, they import html files for the input and expected output and I'm not posting all that!

But I should post the Normalize extension methods!

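using System.Text.RegularExpressions;

// Extension methods must be declared in a non-nested static class (the class name here is arbitrary).
public static class StringExtensions
{
    private static readonly Regex NormalizeSpaceRegex = new Regex(@"\s{2,}", RegexOptions.None);

    // Collapses runs of two or more whitespace characters into a single space.
    public static string NormalizeSpace(this string data)
    {
        return NormalizeSpaceRegex.Replace(data, @" ");
    }

    // Collapses repeated occurrences of the given character into one.
    public static string NormalizeCharacter(this string data, char character)
    {
        // Regex.Escape stops characters such as '.' or '*' being read as regex syntax.
        var normalizeCharacterRegex = new Regex(Regex.Escape(character.ToString()) + "{2,}", RegexOptions.None);
        return normalizeCharacterRegex.Replace(data, character.ToString());
    }
}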
Wilburnwilburt answered 9/9, 2010 at 17:56 Comment(5)
Added a bounty, hoping someone has something already in .NET. - Wilburnwilburt
I hope you get some bites, I don't like my answer. - Chee
@Greg, same here! I'm attempting to write something simple...and it's not going to be so simple... - Wilburnwilburt
Added the code I used for my solution. Feel free to improve upon it. The CssToXpath class could definitely use a few more enhancements, but it serves my purposes currently. - Wilburnwilburt
Hey - I just blogged about my solution to this, PreMailer.Net: martinnormark.com/move-css-inline-premailer-net - Cristinecristiona
K
9

Since you're already 90% of the way there with your current implementation, why don't you use your existing framework but replace the XML parsing with an HTML parser instead? One of the more popular ones out there is the HTML Agility Pack. It supports XPath queries and even has a LINQ interface similar to the standard .NET interface provided for XML so it should be a fairly straightforward replacement.

Koine answered 21/9, 2010 at 2:50 Comment(3)
If I run into any issues with the current implementation I'll look into that. Right now, I know my "html" is valid XML, so XML parsing is fine, and it's working in production. - Wilburnwilburt
Also, the HTML Agility Pack's parser doesn't deal well even with valid HTML such as <ul><li>x<li>y<li>z</ul> - it's just not robust enough to deal with real-world HTML. - Alister
Check out this thread (https://mcmap.net/q/194243/-what-is-the-best-way-to-parse-html-in-c-closed) for discussion of other HTML parsers as well as the HTML Agility Pack. - Koine
C
17

I have a project on GitHub that makes CSS inline. It's very simple, and supports mobile styles. Read more on my blog: http://martinnormark.com/move-css-inline-premailer-net

Cristinecristiona answered 10/6, 2011 at 11:6 Comment(4)
Nice. Seems to work mostly, but I get the following warning when I try it with Zurb "Ink basic template": "PreMailer.Net is unable to process the pseudo class/element 'a:active' due to a limitation in CsQuery." - I guess it's up to CsQuery to address that. BTW, this post says that CsQuery does support them now: github.com/milkshakesoftware/PreMailer.Net/issues/34 - Kuhlman
@MatthewLock Yes, there's an issue at the moment as described in the GitHub issue you linked to. - Cristinecristiona
Dear @Cristinecristiona, thanks for the kind project. I have an issue: when I use MoveCssInline I get this exception: (Unexpected character found at position 34: ".. rol-solid::>>-<<moz-placeholder). Any help, please? - Lizarraga
@Younisbarznji Can you paste your source into a new question here on SO? - Cristinecristiona
Y
9

As this option is not very clear in the other replies, I think it deserves a straightforward answer.

Use PreMailer.Net.

All you have to do is:

  1. Install PreMailer.Net via NuGet.
  2. Type this:

    var inlineStyles = PreMailer.Net.PreMailer.MoveCssInline(htmlSource, false);
    destination = inlineStyles.Html;
    

And you are done!

BTW, you may want to add a using directive to shorten that line.

More usage info in the link above, of course.

Yordan answered 5/5, 2017 at 22:32 Comment(0)
C
4

Excellent question.

I have no idea if there is a .NET solution, but I found a Ruby program called Premailer that claims to inline CSS. If you want to use it you have a couple options:

  1. Rewrite Premailer in C# (or any .NET language you are familiar with)
  2. Use IronRuby to run Ruby in .NET
Chee answered 9/9, 2010 at 18:28 Comment(1)
PreMailer has a .NET equivalent called PreMailer.Net: github.com/milkshakesoftware/PreMailer.Net - Kuhlman
A
3

I'd recommend using an actual CSS parser rather than regexes. You don't need to parse the full language since you're interested mostly in reproduction, but in any case such parsers are available (and for .NET too). For example, take a look at ANTLR's list of grammars, specifically a CSS 2.1 grammar or a CSS3 grammar. You can possibly strip large parts of both grammars if you don't mind sub-optimal results wherein inline styles may include duplicate definitions, but to do this well, you'll need some idea of internal CSS logic to be able to resolve shorthand attributes.

In the long run, however, this will certainly be a lot less work than a never-ending series of ad hoc regex fixes.
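
To make the "duplicate definitions" point concrete, here is a minimal sketch of merging declaration blocks so that a later declaration of the same property replaces an earlier one. It is not a CSS parser and is only meant as an illustration: StyleMerger and Merge are hypothetical names, the splitting on ';' and ':' is naive, and shorthand properties are not resolved (margin and margin-top are treated as unrelated).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library mentioned in this thread.
public static class StyleMerger
{
    // Merges declaration blocks in order; a later value for the same property wins.
    // Naive: a ';' or ':' inside url(...) or a quoted string will be split incorrectly.
    public static string Merge(params string[] declarationBlocks)
    {
        var order = new List<string>(); // properties in first-seen order
        var values = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        foreach (var block in declarationBlocks)
        {
            foreach (var declaration in block.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries))
            {
                var colon = declaration.IndexOf(':');
                if (colon <= 0) continue; // skip blank or malformed entries

                var name = declaration.Substring(0, colon).Trim();
                var value = declaration.Substring(colon + 1).Trim();
                if (name.Length == 0 || value.Length == 0) continue;

                if (!values.ContainsKey(name)) order.Add(name);
                values[name] = value; // later declarations overwrite earlier ones
            }
        }

        return string.Join(" ", order.Select(n => n + ": " + values[n] + ";"));
    }
}
```

For example, StyleMerger.Merge("color: red; background: blue", "color: green;") returns "color: green; background: blue;". Passing the stylesheet rule first and the element's existing style attribute last keeps the usual behaviour of inline styles overriding stylesheet rules.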

Alister answered 22/9, 2010 at 9:10 Comment(1)
I'll likely do this if it becomes an issue. - Wilburnwilburt
W
1

Here is an idea: why don't you make a POST call to http://www.mailchimp.com/labs/inlinecss.php using C#? From analysis using Firebug, it looks like the POST call needs two parameters: html, and strip, which takes the values on/off. The result is in a parameter called text.

Here is a sample of how to make a POST call using C#.
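
The linked sample is not reproduced here, so as a stand-in, this is a minimal sketch of such a POST call using HttpClient. The endpoint and the field names html and strip come from the description above and are unverified; the service may have changed or been retired. RemoteInliner, BuildForm and PostAsync are hypothetical names, and the sketch simply returns the raw response body because the exact shape of the text result is not known.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical wrapper; endpoint and field names are assumptions taken from the answer.
public static class RemoteInliner
{
    private const string Endpoint = "http://www.mailchimp.com/labs/inlinecss.php";

    // Builds the form body: "html" carries the markup, "strip" is "on" or "off".
    public static FormUrlEncodedContent BuildForm(string html, bool strip)
    {
        return new FormUrlEncodedContent(new[]
        {
            new KeyValuePair<string, string>("html", html),
            new KeyValuePair<string, string>("strip", strip ? "on" : "off")
        });
    }

    // Posts the form and returns the raw response body.
    // Extracting the "text" result is left out because its format is not known.
    public static async Task<string> PostAsync(HttpClient client, string html, bool strip)
    {
        using (var form = BuildForm(html, strip))
        using (var response = await client.PostAsync(Endpoint, form))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A caller would typically reuse one HttpClient instance and call await RemoteInliner.PostAsync(client, htmlSource, true).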

Warford answered 16/9, 2010 at 7:48 Comment(1)
This app is too critical to rely on another site being online, fast, and not changing. - Wilburnwilburt
R
1

Chad, do you necessarily have to add the CSS inline? Or would you maybe be better off adding a <style> block to your <head>? This would in essence remove the need for a reference to a CSS file, and it keeps the rule that actual inline styles override the ones set in the header or in a referenced CSS file.

(sorry, forgot to add the quotes for code)

Reeva answered 16/9, 2010 at 12:48 Comment(2)
Yes, the CSS is already in a <style/> element, but the BB email client doesn't appear to support it. It does however support inline styles. - Wilburnwilburt
There are multiple reasons why CSS inlining is used for HTML emails. The main reason is that many web email clients strip existing style blocks out of the HTML email source before including it into their own HTML output. But also there is Outlook, which even in 2019 is a terrible HTML email client which you will never tame without inlining the CSS styles. - Benedict
R
1

I would recommend a dictionary like this:

private Dictionary<string, Dictionary<string, string>> cssDictionary = new Dictionary<string, Dictionary<string, string>>();

I would parse the CSS to fill this cssDictionary (adding 'style-type', 'style-property', 'value'). For example:

Dictionary<string, string> bodyStyleDictionary = new Dictionary<string, string>();
bodyStyleDictionary.Add("background", "#000000");
cssDictionary.Add("body", bodyStyleDictionary);

After that I would preferably convert the HTML to an XmlDocument.

You can recursively run through the document's nodes via their children and also look up their parents (this would even let you support selectors).

On each element you check the element type, the id, and the class. You then browse through the cssDictionary to add any styles for this element to the style attribute. (Granted, you might want to place them in order of occurrence if they have overlapping properties, and add the existing inline styles last.)

When you're done, you emit the XmlDocument as a string and remove the first line (<?xml version="1.0"?>). This should leave you with a valid HTML document with inline CSS.

Sure, it might half look like a hack, but in the end I think it's a pretty solid solution that ensures stability and does what you seem to be looking for.
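
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: DictionaryInliner and its methods are hypothetical names, the input must be well-formed XML, only bare element, #id and .class selectors are matched, and neither specificity nor rule order is honoured (a Dictionary does not guarantee enumeration order, so an ordered list would be needed if the cascade matters). It returns the root element's OuterXml, which sidesteps having to remove an XML declaration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml;

// Hypothetical sketch of the dictionary-based approach described above.
public static class DictionaryInliner
{
    private static readonly Regex RuleRegex = new Regex(@"(?<selector>[^{}]+)\{(?<style>[^{}]+)\}");

    // Parses flat rules into selector -> (property -> value). No comments or @media support.
    public static Dictionary<string, Dictionary<string, string>> ParseCss(string css)
    {
        var result = new Dictionary<string, Dictionary<string, string>>();
        foreach (Match rule in RuleRegex.Matches(css))
        {
            var selector = rule.Groups["selector"].Value.Trim();
            Dictionary<string, string> properties;
            if (!result.TryGetValue(selector, out properties))
            {
                properties = new Dictionary<string, string>();
                result[selector] = properties;
            }
            foreach (var declaration in rule.Groups["style"].Value.Split(';'))
            {
                var parts = declaration.Split(new[] { ':' }, 2);
                if (parts.Length == 2 && parts[0].Trim().Length > 0)
                    properties[parts[0].Trim()] = parts[1].Trim();
            }
        }
        return result;
    }

    // Loads the markup as XML, applies matching rules to every element, returns the markup.
    public static string Inline(string xhtml, string css)
    {
        var rules = ParseCss(css);
        var document = new XmlDocument();
        document.LoadXml(xhtml);
        Apply(document.DocumentElement, rules);
        return document.DocumentElement.OuterXml;
    }

    private static void Apply(XmlElement element, Dictionary<string, Dictionary<string, string>> rules)
    {
        var id = element.GetAttribute("id"); // empty string when the attribute is absent
        var classes = element.GetAttribute("class")
                             .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var merged = new Dictionary<string, string>();

        foreach (var rule in rules)
        {
            var selector = rule.Key;
            var matches = selector == element.Name
                          || (id.Length > 0 && selector == "#" + id)
                          || classes.Any(c => selector == "." + c);
            if (!matches) continue;
            foreach (var property in rule.Value) merged[property.Key] = property.Value;
        }

        if (merged.Count > 0)
        {
            var style = string.Join(" ", merged.Select(p => p.Key + ": " + p.Value + ";"));
            var existing = element.GetAttribute("style");
            // Existing inline styles go last so they still override the stylesheet rules.
            element.SetAttribute("style", existing.Length > 0 ? style + " " + existing : style);
        }

        foreach (var child in element.ChildNodes.OfType<XmlElement>().ToList())
            Apply(child, rules);
    }
}
```

Calling DictionaryInliner.Inline(xhtml, css) with the contents of the <style> block as css gives back the document with style attributes filled in; removing the <style> element itself is left to the caller.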

Reeva answered 16/9, 2010 at 14:5 Comment(1)
I actually have something similar written that's working pretty well. It's far from flawless, but it does the job. I'll try and package it up and post it here. - Wilburnwilburt
