XslCompiledTransform !♥ the UTF-8 BOM : System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.

Recently I came across a rather annoying issue with the XslCompiledTransform class: it really doesn’t like having a BOM (byte order mark) shoved down its Load method.

I got the following error message and stack trace:

System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.Xsl.Xslt.XsltInput.ReadNextSiblingHelper()
at System.Xml.Xsl.Xslt.XsltInput.ReadNextSibling()
at System.Xml.Xsl.Xslt.XsltInput.MoveToNextSibling()
at System.Xml.Xsl.Xslt.XsltInput.Start()
at System.Xml.Xsl.Xslt.XsltLoader.LoadDocument()
at System.Xml.Xsl.Xslt.XsltLoader.LoadStylesheet(XmlReader reader, Boolean include)
System.Xml.Xsl.XslLoadException: XSLT compile error.
at System.Xml.Xsl.Xslt.XsltLoader.LoadStylesheet(XmlReader reader, Boolean include)
at System.Xml.Xsl.Xslt.XsltLoader.Load(Compiler compiler, Object stylesheet, XmlResolver xmlResolver)
at System.Xml.Xsl.Xslt.Compiler.Compile(Object stylesheet, XmlResolver xmlResolver, ref QilExpression qil)
at System.Xml.Xsl.XslCompiledTransform.CompileXsltToQil(Object stylesheet, XsltSettings settings, XmlResolver stylesheetResolver)
at System.Xml.Xsl.XslCompiledTransform.LoadInternal(Object stylesheet, XsltSettings settings, XmlResolver stylesheetResolver)
at System.Xml.Xsl.XslCompiledTransform.Load(XmlReader stylesheet)

Viewing the input data to the XslCompiledTransform in the debugger showed a perfectly valid and lovely XSLT file, because Visual Studio does not display the BOM in its visual representation of the string. Saving the string to disk and examining it in a hex editor revealed that the first three bytes were indeed the BOM for UTF-8: EF BB BF
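To make the invisible character easier to spot without a hex editor, here is a minimal sketch. It assumes the stylesheet arrives as UTF-8 bytes and is decoded with Encoding.UTF8.GetString, which keeps the BOM as a leading U+FEFF character. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical demo class; not part of the original code.
public static class BomDemo
{
    // Builds a UTF-8 payload that starts with EF BB BF and decodes it to a string.
    public static string DecodeWithBom()
    {
        byte[] bom = { 0xEF, 0xBB, 0xBF };
        byte[] body = Encoding.UTF8.GetBytes("<xsl:stylesheet/>");

        byte[] payload = new byte[bom.Length + body.Length];
        Array.Copy(bom, 0, payload, 0, bom.Length);
        Array.Copy(body, 0, payload, bom.Length, body.Length);

        // GetString does not strip the preamble, so the BOM survives as U+FEFF.
        return Encoding.UTF8.GetString(payload);
    }

    // True when the first character of the text is the BOM character.
    public static bool HasLeadingBomChar(string text)
    {
        return !string.IsNullOrEmpty(text) && text[0] == '\uFEFF';
    }
}
```

Calling BomDemo.HasLeadingBomChar(BomDemo.DecodeWithBom()) returns true even though the string looks like plain XML in the debugger.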

I wrote a couple of little extension methods for byte arrays to get rid of the BOM if it’s present:

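using System;

// Extension methods have to live in a static class; the class name is arbitrary.
public static class ByteArrayExtensions
{
    /// <summary>
    /// Removes the byte order mark.
    /// </summary>
    /// <param name="bytes">The bytes.</param>
    /// <returns>byte array without the BOM.</returns>
    public static byte[] RemoveByteOrderMark(this byte[] bytes)
    {
        // Also covers null and arrays shorter than three bytes, which are returned unchanged.
        if (!bytes.StartsWithByteOrderMark())
        {
            return bytes;
        }

        // Copy everything after the three BOM bytes.
        byte[] results = new byte[bytes.Length - 3];
        Array.Copy(bytes, 3, results, 0, bytes.Length - 3);

        return results;
    }

    /// <summary>
    /// Determines if the byte array starts with a byte order mark.
    /// </summary>
    /// <param name="bytes">The bytes.</param>
    /// <returns>true if the byte array starts with a byte order mark; otherwise false.</returns>
    public static bool StartsWithByteOrderMark(this byte[] bytes)
    {
        if (bytes == null)
        {
            return false;
        }

        if (bytes.Length < 3)
        {
            return false;
        }

        // UTF-8 BOM: EF BB BF
        return bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }
}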
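For completeness, here is a rough sketch of how BOM-free bytes might then be handed to XslCompiledTransform.Load. It assumes the stylesheet is UTF-8 and is read through a StringReader, which is presumably the path that choked in the stack trace above. To keep the block compilable on its own it skips the BOM by offset instead of calling RemoveByteOrderMark. StylesheetLoader, LoadFromUtf8 and Apply are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper class; not part of the original code.
public static class StylesheetLoader
{
    // Decodes UTF-8 stylesheet bytes, skipping a leading BOM if present, and compiles them.
    public static XslCompiledTransform LoadFromUtf8(byte[] rawBytes)
    {
        if (rawBytes == null)
        {
            throw new ArgumentNullException(nameof(rawBytes));
        }

        bool hasBom = rawBytes.Length >= 3
            && rawBytes[0] == 0xEF && rawBytes[1] == 0xBB && rawBytes[2] == 0xBF;
        int offset = hasBom ? 3 : 0;

        // Decode only the bytes after the BOM so the string has no leading U+FEFF.
        string xslt = Encoding.UTF8.GetString(rawBytes, offset, rawBytes.Length - offset);

        var transform = new XslCompiledTransform();
        using (XmlReader reader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(reader);
        }

        return transform;
    }

    // Runs the compiled transform over an XML string and returns the output text.
    public static string Apply(XslCompiledTransform transform, string inputXml)
    {
        var output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        {
            transform.Transform(input, null, output);
        }

        return output.ToString();
    }
}
```

With the extension methods from this post in scope, the offset arithmetic could be replaced by Encoding.UTF8.GetString(rawBytes.RemoveByteOrderMark()) at the cost of one extra array copy.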

One thought on “XslCompiledTransform !♥ the UTF-8 BOM : System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.”

  1. DrKov says:

    Very useful! I did the same. Thanks for posting about it!