XslCompiledTransform !♥ the UTF-8 BOM : System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.

Recently I came across a rather annoying issue with the XslCompiledTransform class. Namely, it really doesn’t like having a BOM (byte order mark) shoved down its Load method.

I got the following error message and stack trace:

System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.Xsl.Xslt.XsltInput.ReadNextSiblingHelper()
at System.Xml.Xsl.Xslt.XsltInput.ReadNextSibling()
at System.Xml.Xsl.Xslt.XsltInput.MoveToNextSibling()
at System.Xml.Xsl.Xslt.XsltInput.Start()
at System.Xml.Xsl.Xslt.XsltLoader.LoadDocument()
at System.Xml.Xsl.Xslt.XsltLoader.LoadStylesheet(XmlReader reader, Boolean include)
System.Xml.Xsl.XslLoadException: XSLT compile error.
at System.Xml.Xsl.Xslt.XsltLoader.LoadStylesheet(XmlReader reader, Boolean include)
at System.Xml.Xsl.Xslt.XsltLoader.Load(Compiler compiler, Object stylesheet, XmlResolver xmlResolver)
at System.Xml.Xsl.Xslt.Compiler.Compile(Object stylesheet, XmlResolver xmlResolver, ref QilExpression qil)
at System.Xml.Xsl.XslCompiledTransform.CompileXsltToQil(Object stylesheet, XsltSettings settings, XmlResolver stylesheetResolver)
at System.Xml.Xsl.XslCompiledTransform.LoadInternal(Object stylesheet, XsltSettings settings, XmlResolver stylesheetResolver)
at System.Xml.Xsl.XslCompiledTransform.Load(XmlReader stylesheet)

Viewing the input data to the XslCompiledTransform in the debugger showed a perfectly valid and lovely XSLT file, because Visual Studio does not display the BOM in its visual representation of the string. Saving the string to disk and examining it in a hex editor revealed that the first 3 bytes were indeed the BOM for UTF-8: EF BB BF.
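Here is a small sketch of why the debugger view can be misleading. It assumes the stylesheet bytes were turned into a string with Encoding.UTF8.GetString, which is a guess about how the string was produced; BomProbe and DecodeNaively are made-up names for illustration. GetString keeps the BOM as the zero-width character U+FEFF, so the string looks fine but is one character longer than it appears.

```csharp
using System;
using System.Text;

public static class BomProbe
{
    // Hypothetical helper: decodes raw UTF-8 bytes without any BOM handling.
    // Encoding.UTF8.GetString does not strip the BOM; it becomes U+FEFF.
    public static string DecodeNaively(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        // EF BB BF followed by the four ASCII bytes of "<a/>".
        byte[] raw = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'a', (byte)'/', (byte)'>' };
        string text = DecodeNaively(raw);

        Console.WriteLine(text.Length);          // 5, not 4
        Console.WriteLine(text[0] == '\uFEFF');  // True
    }
}
```

Checking the first character against '\uFEFF' is a quicker test than saving to disk and opening a hex editor.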

I wrote a little extension method for byte arrays (byte[]) to get rid of the BOM if it’s present:

/// <summary>
/// Removes the byte order mark.
/// </summary>
/// <param name="bytes">The bytes.</param>
/// <returns>byte array without the BOM.</returns>
public static byte[] RemoveByteOrderMark(this byte[] bytes)
{
    if (!bytes.StartsWithByteOrderMark())
    {
        return bytes;
    }

    byte[] results = new byte[bytes.Length - 3];
    Array.Copy(bytes, 3, results, 0, bytes.Length - 3);

    return results;
}

/// <summary>
/// Determines if the byte array starts with a byte order mark.
/// </summary>
/// <param name="bytes">The bytes.</param>
/// <returns><c>true</c> if the byte array starts with a byte order mark; otherwise, <c>false</c>.</returns>
public static bool StartsWithByteOrderMark(this byte[] bytes)
{
    if (bytes == null)
    {
        return false;
    }

    if (bytes.Length < 3)
    {
        return false;
    }

    return
        bytes[0] == 0xEF &&
        bytes[1] == 0xBB &&
        bytes[2] == 0xBF;
}
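To show how the helper might be wired in, here is a minimal sketch that strips the BOM before the stylesheet reaches XslCompiledTransform.Load. StylesheetLoader and its Load method are hypothetical names, and the sketch assumes the stylesheet arrives as UTF-8 bytes and is decoded to a string before loading. The extension method is restated in condensed form, inside the static class that extension methods require, so the block compiles on its own.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

public static class ByteArrayExtensions
{
    // Condensed restatement of RemoveByteOrderMark from the post.
    public static byte[] RemoveByteOrderMark(this byte[] bytes)
    {
        bool hasBom = bytes != null && bytes.Length >= 3 &&
                      bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
        if (!hasBom)
        {
            return bytes;
        }

        byte[] results = new byte[bytes.Length - 3];
        Array.Copy(bytes, 3, results, 0, results.Length);
        return results;
    }
}

public static class StylesheetLoader
{
    // Hypothetical wrapper (not from the post): strips the BOM, decodes the
    // remaining bytes as UTF-8, and compiles the stylesheet.
    public static XslCompiledTransform Load(byte[] stylesheetBytes)
    {
        string xslt = Encoding.UTF8.GetString(stylesheetBytes.RemoveByteOrderMark());

        var transform = new XslCompiledTransform();
        using (XmlReader reader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(reader);
        }

        return transform;
    }
}
```

Another option worth testing is to skip the string entirely and hand the raw bytes to XmlReader.Create as a MemoryStream, since the reader does its own encoding detection when it reads from a stream.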