Encode and decode field names from display name to internal name

In this article you will learn how SharePoint encodes the display name of a field into an internal field name. The problem with the difference between internal field names and display names is that SharePoint has no built-in function for this conversion that you can use in your code. If your field names are only defined in English, you might not run into this problem. My mother tongue is German, and we have some special characters that need special treatment and encoding.

The basics

Let’s create a field called “Project Number”. The internal name for this field will then be “Project_x0020_Number”. The “_x0020_” in the field name is the encoded representation of the space between the two words. To be more precise, it is a form of Unicode URL encoding. The Unicode URL encoding of the space is “%u0020”, but fields require “_x0020_”. To get a proper SharePoint internal field name, we need to transform the URL-encoded space into this form.
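As a rough illustration, the escape for a single character can be derived from its UTF-16 character code. The helper name toSharePointEscape is made up for this sketch and is not part of any SharePoint API.

```javascript
// Hypothetical helper: builds the "_xHHHH_" escape for one character
function toSharePointEscape(ch) {
    // charCodeAt returns the UTF-16 code unit, e.g. 32 for a space
    var hex = ch.charCodeAt(0).toString(16);
    // Pad to four lowercase hex digits: "20" becomes "0020"
    while (hex.length < 4) {
        hex = "0" + hex;
    }
    return "_x" + hex + "_";
}

// Replace the space in the display name with its escape
var internalName = "Project Number".split(" ").join(toSharePointEscape(" "));
// internalName is "Project_x0020_Number"
```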

Another point to consider is that SharePoint 2010 limits internal field names to a maximum length of 32 characters. Encoding produces a string that is much longer than the display name. SharePoint handles this by truncating the field name, and if several fields would end up with the same truncated name, an additional index is added as character number 33.
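To make the truncation rule more concrete, here is a minimal JavaScript sketch of how a 32-character limit with a trailing index could be applied. The function name truncateInternalName and the existingNames array are assumptions made for this illustration, and the exact index SharePoint picks may differ.

```javascript
// Hypothetical sketch of the SharePoint 2010 truncation described above
var MAX_LENGTH = 32;

function truncateInternalName(encodedName, existingNames) {
    // Cut the encoded name down to the maximum length
    var candidate = encodedName.substring(0, MAX_LENGTH);
    if (existingNames.indexOf(candidate) === -1) {
        return candidate;
    }
    // Name collision: append an index as character number 33
    var index = 0;
    while (existingNames.indexOf(candidate + index) !== -1) {
        index++;
    }
    return candidate + index;
}
```

Note that a cut at 32 characters can fall in the middle of an escape sequence, so a truncated internal name cannot always be decoded back to the full display name.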

The good news is that this has been improved in SharePoint 2013. Now you can use display names with a length of up to 255 characters. The 32-character limitation of internal field names doesn’t exist anymore. It seems that the limit for the internal name has been raised to 1024 characters. In some test cases I got fields with an internal name length of more than 600 characters.

From display name to internal field name in C#

As mentioned before, SharePoint doesn’t have an out-of-the-box function to convert a display name to an internal field name. At least as far as I know.

There is one method that works for field names in English, but it doesn’t work for non-ASCII characters and punctuation marks. The method, which was recommended by Hugh Wood (@HughAJWood), is to use XmlConvert.EncodeName for encoding and XmlConvert.DecodeName for decoding the field names. My experience is that it cannot be considered a safe approach to encoding display name strings. To give you an example: “Österreich”, the German name of the country where I live, is left unchanged as “Österreich” by XmlConvert.EncodeName, but a field with this label in SharePoint gets the internal name “_x00d6_sterreich”. This is because the letter “Ö” is a valid character in an XML name, but somehow it’s not safe enough for SharePoint.

My approach is a little bit different. It might not be the fastest, but it is safe as far as SharePoint is concerned. To create a valid encoding, I encode every single character of the display name using the method HttpUtility.UrlEncodeUnicode.

// Requires: using System.Text; using System.Web;
private string EncodeToInternalField(string toEncode)
{
    if (toEncode == null)
    {
        return null;
    }

    StringBuilder encodedString = new StringBuilder();

    foreach (char chr in toEncode)
    {
        string encodedChar = HttpUtility.UrlEncodeUnicode(chr.ToString());

        if (encodedChar.StartsWith("%u"))
        {
            // Non-ASCII character: "%u00d6" becomes "_x00d6_"
            encodedString.AppendFormat("_x{0}_", encodedChar.Substring(2));
        }
        else if (encodedChar.StartsWith("%"))
        {
            // Escaped ASCII character: "%23" becomes "_x0023_"
            encodedString.AppendFormat("_x00{0}_", encodedChar.Substring(1));
        }
        else if (!char.IsLetterOrDigit(chr) && chr != '_')
        {
            // Space (returned as "+") and punctuation that UrlEncodeUnicode
            // leaves untouched, such as '.' which becomes "_x002e_"
            encodedString.AppendFormat("_x{0:x4}_", (int)chr);
        }
        else
        {
            // Letters, digits and underscores are kept as they are
            encodedString.Append(chr);
        }
    }

    return encodedString.ToString();
}

I tested this helper with the Japanese word for Austria, which is written “オーストリア”. In this case the XmlConvert.EncodeName method fails too, but mine does the encoding right. I cannot read this Japanese word, but I think you get the point. The only thing the helper does not support is the truncation of field names, but I think this won’t be that hard to do.

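Decoding the field name is pretty straightforward. All that needs to be done is to replace every “_xHHHH_” escape sequence with “%uHHHH” and pass the result to HttpUtility.UrlDecode. Underscores that are not part of an escape sequence have to stay untouched, otherwise a name like “My_Field” would lose its underscore. Here is the code.

// Requires: using System.Text.RegularExpressions; using System.Web;
private string DecodeInternalField(string toDecode)
{
    if (toDecode == null)
    {
        return null;
    }

    // Turn every "_xHHHH_" sequence into "%uHHHH" and leave other underscores alone
    string decodedString = Regex.Replace(toDecode, "_x([0-9a-fA-F]{4})_", "%u$1");
    return HttpUtility.UrlDecode(decodedString);
}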

From display name to internal field name using JavaScript

If you use SPServices or the JavaScript Client Object Model, you might need the same functionality that I presented in C# in JavaScript as well. The encoding and decoding can be used in CAML queries, for example. The JavaScript encoder and decoder follow the same pattern as the C# methods. I encode every single character by using the “escape” function for encoding, and use “unescape” for decoding. The finished code looks like this.

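// Encode Fields
var SPEncode = function(toEncode){

    var charToEncode = toEncode.split('');
    var encodedString = "";

    for(var i = 0; i < charToEncode.length; i++)
    {
        var chr = charToEncode[i];
        var encodedChar = escape(chr).toLowerCase();

        if(encodedChar.length == 3)
        {
            // "%20" becomes "_x0020_"
            encodedString += encodedChar.replace("%", "_x00") + "_";
        }
        else if(encodedChar.length == 6)
        {
            // "%u30aa" becomes "_x30aa_"
            encodedString += encodedChar.replace("%u", "_x") + "_";
        }
        else if(/[^A-Za-z0-9_]/.test(chr))
        {
            // Punctuation that escape() leaves untouched, such as "." or "-"
            encodedString += "_x00" + chr.charCodeAt(0).toString(16) + "_";
        }
        else
        {
            // Append the original character so that its case is preserved
            encodedString += chr;
        }
    }

    return encodedString;

}

// Decode Fields
var SPDecode = function(toDecode){

    // Turn every "_xHHHH_" sequence into "%uHHHH"; other underscores stay untouched
    var decodedString = toDecode.replace(/_x([0-9a-fA-F]{4})_/g, "%u$1");

    return unescape(decodedString);

}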

To use this, you just need to call the functions. In your code you should also consider the length limits of SharePoint field names.
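A usage sketch might look like the following. To keep it self-contained and runnable outside the browser, it does not call the functions above. The names encodeFieldName and decodeFieldName are stand-ins written for this example, and they build the escapes from character codes instead of using escape and unescape. The assumption here is that only ASCII letters, digits and the underscore stay unescaped.

```javascript
// Stand-in encoder (hypothetical): escapes everything except ASCII letters, digits and "_"
function encodeFieldName(displayName) {
    return displayName.replace(/[^A-Za-z0-9_]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16);
        // Pad the character code to four hex digits
        return "_x" + ("0000" + hex).slice(-4) + "_";
    });
}

// Stand-in decoder (hypothetical): turns each "_xHHHH_" sequence back into its character
function decodeFieldName(internalName) {
    return internalName.replace(/_x([0-9a-fA-F]{4})_/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

var internalName = encodeFieldName("Project Number");

// Use the internal name in a CAML query fragment
var camlOrderBy = "<OrderBy><FieldRef Name='" + internalName + "' /></OrderBy>";

// Convert back for display purposes
var displayName = decodeFieldName(internalName);
```

The decoder only touches complete escape sequences, so an underscore that belongs to the name itself survives the round trip.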

Conclusion

In SharePoint you cannot prevent end users from creating field names with special characters. I also think that it is not useful to deploy any column by using code or the declarative approach. One big advantage of SharePoint is that end users are able to customize the platform to their needs without asking a developer.

Both approaches, JavaScript and C#, show how to handle the SharePoint encoded field names, no matter if you use SharePoint 2007, SharePoint 2010 or SharePoint 2013. I also think that the JavaScript solution will become more and more important in future implementations. One reason is the new JavaScript API available in SharePoint 2013. Another is one of my all-time favorites, SPServices, if you like to access SharePoint via JavaScript and Web Services.

Finally, I would like to thank Marc D. Anderson (@sympmarc), James Love (@jimmywim) and Hugh Wood (@HughAJWood) for a great discussion on Twitter prior to this blog post.

If you have any enhancements to the provided code or would like to give me general feedback, please feel free to add a comment.