Encode and decode field names from display name to internal name

In this article you will learn how a SharePoint field's display name is encoded into its internal field name. The problem with the difference between internal field names and display names is that SharePoint has no built-in function you can use in your code to convert between them. If field names are only defined in English, you might not run into this problem. My mother tongue is German, and we have some special characters that need special treatment and encoding.

The basics

Let’s create a field called “Project Number”. The internal name for this field will then be “Project_x0020_Number”. The “_x0020_” in the field name is the encoded representation of the space between the two words. To be more precise, it is a form of Unicode URL encoding. The Unicode URL encoding of the space is “%u0020”, but for fields “_x0020_” is required. To turn the encoded field name into a proper SharePoint internal field name, we need to transform the encoded space.
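
As a minimal sketch of that transformation, the JavaScript below turns a single character into the “_xHHHH_” form by taking its UTF-16 code unit and padding the hexadecimal value to four digits. The function name toFieldEscape is made up for this illustration and is not part of SharePoint.

```javascript
// Hypothetical helper (not a SharePoint API):
// returns the _xHHHH_ escape for one UTF-16 code unit.
function toFieldEscape(chr) {
    var hex = chr.charCodeAt(0).toString(16);
    // Pad to four lowercase hex digits, e.g. "20" becomes "0020".
    return "_x" + ("0000" + hex).slice(-4) + "_";
}

// The space between the two words is the only character that needs escaping here.
var internalName = "Project" + toFieldEscape(" ") + "Number";
```

The same helper applied to “Ö” yields the “_x00d6_” prefix that shows up later in this article.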

Another point to consider is that SharePoint 2010 limits internal field names to 32 characters. Encoding the string produces a much longer string than the display name. SharePoint handles this by truncating the field name, and if several fields end up with the same truncated name, an additional index is added as character number 33.
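
The truncation step on its own could be sketched like this. It only cuts an already encoded name to 32 characters and does not try to reproduce how SharePoint numbers colliding names. Both the constant and the function name are assumptions made for this example.

```javascript
// Assumed limit for internal field names, taken from the text above.
var MAX_INTERNAL_NAME_LENGTH = 32;

// Hypothetical helper: cuts an already encoded name down to the limit.
// Collisions between truncated names are deliberately not handled here.
function truncateInternalName(encodedName) {
    return encodedName.slice(0, MAX_INTERNAL_NAME_LENGTH);
}

var truncated = truncateInternalName("Meeting_x0020_Scheduled_x0020_Date");
// truncated is "Meeting_x0020_Scheduled_x0020_Da"
```

Note that a plain cut like this can end in the middle of an “_xHHHH_” sequence, so a truncated name cannot always be decoded back to the display name.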

The good news is that this has been improved in SharePoint 2013. Now you can use display names with a length of up to 255 characters. The 32-character limitation of internal field names doesn’t exist anymore. It seems that the limit for the internal name has been raised to 1024 characters. In some test cases I got fields with an internal name longer than 600 characters.

From display name to internal field name in C#

As mentioned before, SharePoint doesn’t have an out-of-the-box function to convert a display name to an internal field name. At least as far as I know.

There is one method that works for field names in English, but it doesn’t work for non-ASCII characters and punctuation marks. The method, which was recommended by Hugh Wood (@HughAJWood), is to use XmlConvert.EncodeName for encoding and XmlConvert.DecodeName for decoding the field names. In my experience it cannot be considered a safe approach to encoding display name strings. To give you an example, “Österreich”, the German name of the country I live in, stays “Österreich” after encoding, but a field with this label in SharePoint is encoded as “_x00d6_sterreich”. This is because the letter “Ö” is a valid character in an XML name, but somehow it’s not safe enough for SharePoint.

My approach is a little bit different. It might not be the fastest, but it is safe as far as SharePoint is concerned. To create a valid encoding, I encode every single character of the display name using the method HttpUtility.UrlEncodeUnicode.

// Requires: using System.Text; using System.Web;
private string EncodeToInternalField(string toEncode)
{
    if (toEncode == null)
    {
        return null;
    }

    StringBuilder encodedString = new StringBuilder();

    foreach (char chr in toEncode)
    {
        string encodedChar = HttpUtility.UrlEncodeUnicode(chr.ToString());

        // UrlEncodeUnicode returns "%uXXXX" or "%XX" for encoded characters
        // and "+" for a space; "." is not encoded and is handled explicitly.
        if (encodedChar.StartsWith("%") || encodedChar == "+" || chr == '.')
        {
            // Always write four hex digits, so a two-digit "%XX" result
            // also ends up in the "_x00XX_" form that SharePoint uses.
            encodedString.AppendFormat("_x{0:x4}_", (int)chr);
        }
        else
        {
            encodedString.Append(chr);
        }
    }

    return encodedString.ToString();
}

I tested this helper with the Japanese word for Austria, which is written “オーストリア”. In this case the XmlConvert.EncodeName method fails too, but mine does the encoding right. I cannot read this Japanese word, but I think you get the point. The only thing the helper does not support is the truncation of the field names, but I think this won’t be that hard to do.

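Decoding the field name is pretty straightforward. All that needs to be done is to replace every “_xHHHH_” sequence with “%uHHHH” and pass the result to HttpUtility.UrlDecode. Underscores that are not part of such a sequence must stay untouched, because they can be a regular part of the name. Here is the code.

// Requires: using System.Text.RegularExpressions; using System.Web;
private string DecodeInternalField(string toDecode)
{
    if (toDecode != null)
    {
        // Only complete _xHHHH_ sequences are converted; plain underscores such as in "car_colour" are kept.
        string decodedString = Regex.Replace(toDecode, "_x([0-9a-fA-F]{4})_", "%u$1");
        return HttpUtility.UrlDecode(decodedString);
    }
    else
    {
        return null;
    }
}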

From display name to internal field name using JavaScript

If you use SPServices or the JavaScript Client Object Model, you might need the same functionality in JavaScript that I previously presented in C#. The encoding and decoding can be used in CAML queries, for example. The JavaScript encoder and decoder follow the same pattern as the C# methods. I do the same thing: every single character is checked using the “escape” function for encoding, and “unescape” is used for decoding. The finished code looks like this.

// Encode Fields
var SPEncode = function(toEncode){

    var encodedString = "";

    for(var i = 0; i < toEncode.length; i++)
    {
        var chr = toEncode.charAt(i);

        // escape() leaves letters, digits and a few symbols untouched;
        // "." is encoded explicitly, as in the C# version
        if(escape(chr) !== chr || chr === ".")
        {
            // always write four lowercase hex digits, e.g. " " becomes "_x0020_"
            var hex = chr.charCodeAt(0).toString(16);
            encodedString += "_x" + ("0000" + hex).slice(-4) + "_";
        }
        else
        {
            // unencoded characters are appended unchanged, including their case
            encodedString += chr;
        }
    }

    return encodedString;

}

// Decode Fields
var SPDecode = function(toDecode){

    // turn every complete _xHHHH_ sequence into %uHHHH and keep all other underscores
    var decodedString = toDecode.replace(/_x([0-9a-fA-F]{4})_/g, "%u$1");

    return unescape(decodedString);

}

To use this you just need to call the functions. In your code you should also consider the length limit of SharePoint field names.
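
As an illustration of the CAML use case, here is a rough sketch that builds a query filtering on a field identified by its initial display name. To keep the example self-contained it ships its own small encoder, encodeFieldName, which simply treats every character other than ASCII letters, digits and the underscore as needing encoding. That is a simplification of what SharePoint does, and both function names are hypothetical.

```javascript
// Hypothetical, simplified encoder following the _xHHHH_ pattern from this article.
function encodeFieldName(displayName) {
    return displayName.replace(/[^A-Za-z0-9_]/g, function (chr) {
        return "_x" + ("0000" + chr.charCodeAt(0).toString(16)).slice(-4) + "_";
    });
}

// Builds a CAML query that compares a text field with a value.
// The value is inserted as-is; real code would need to XML-escape it.
function buildCamlQuery(displayName, value) {
    return "<Query><Where><Eq>" +
        "<FieldRef Name='" + encodeFieldName(displayName) + "'/>" +
        "<Value Type='Text'>" + value + "</Value>" +
        "</Eq></Where></Query>";
}

var caml = buildCamlQuery("Project Number", "4711");
```

This only works as long as the internal name was generated from the display name you pass in and was neither truncated nor changed by collision handling.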

Conclusion

In SharePoint you cannot prevent end users from creating field names with special characters. I also think that it is not useful to deploy every column by using code or the declarative approach. One big advantage of SharePoint is that end users are able to customize the platform to their needs without asking a developer.

Both approaches, JavaScript and C#, show how to handle SharePoint-encoded field names, no matter whether you use SharePoint 2007, SharePoint 2010 or SharePoint 2013. I also think that the JavaScript solution will become more and more important in future implementations. One reason is the new JavaScript API available in SharePoint 2013; another is SPServices, one of my all-time favorites, if you like to access SharePoint via JavaScript and web services.

Finally, I’d like to thank Marc D. Anderson (@sympmarc), James Love (@jimmywim) and Hugh Wood (@HughAJWood) for a great discussion on Twitter prior to this blog post.

If you have any enhancements to the provided code or would like to give me general feedback, please feel free to add a comment.

21 Comments

  1. Stefan:

    Good work! All of this assumes that the StaticName and the DisplayName actually match. I can’t tell you how many columns I see that have been renamed so many times that their actual use has nothing to do with the StaticName. Since the StaticName can never be changed, this is bound to happen, but it is a challenge.

    M.

    • Marc, you are absolutely right about renaming the fields and that this could lead to a problem when using the DisplayName to get the StaticName.
      I don’t see the StaticName as such a big problem, because you are allowed to sync DisplayName and StaticName using code. The StaticName should then be set to the same encoded value that SharePoint produces when you initially create a new column. In this case you can use those functions for encoding the display name as well. In real life nobody would do this sync, but technically you can.
      A bigger problem is the InternalName, which always stays the same, and you are not allowed to set a new InternalName. When you write code and use the original DisplayName to get the created InternalName, you are much more flexible, because you don’t need to care whether a user changes the display name once again.

      One thing that has been on my mind for a long time: What is the safest way to get the fields? I’d like to change my coding to work only with the internal field names. Doing so makes it possible to use the initial display name in code to get the internal name, especially if you build something on top of user customizations. This also helps to avoid the “customer changed the display name again” situation.
      /Stefan

      • That’s why I ended up building SPGetStaticFromDisplay and SPGetDisplayFromStatic into SPServices. Each requires a read from GetList, but it allows for looser code which is more user-focused.

        You’re right that if you’re working in server-side code, you have more options. But that will be the case less and less with the move to 2013’s app model.

        M.

        • Oh wow! I think I have completely overlooked those functions you mentioned. Thank you for that update, I’ll need to check them.
          /Stefan

  2. As you guys mention above, after users rename the display name, there is no deducible mapping from display name to static name. There is never a completely deducible mapping from static name to internal name, I believe, due to the collision avoidance algorithms depending on which name was there first.

    Therefore, I think that it is not computationally possible to map display names to static names 100% reliably, nor static names to internal names 100% reliably, so I think the robust solution requires looking up the SPField.

  3. easier:

    string itemInternalName = item.Fields["Field Display Name"].InternalName;

    • Yes it’s easier but in case someone renamed the display name this will fail. The internal name cannot be renamed.

  4. Even without the renaming issue, try this: create a new column called “Author”. You’ll find out that SharePoint will give it “Author0” as internal name.

    The foolproof way is to use the Fields collection.

    • Author is the internal name of the “Created By” field. This has nothing to do with the encoding. If you define an Author column or any other out-of-the-box column again, this will always lead to a field name like fieldname0 or, in the worst case, fieldname1. Walking the fields collection might impact performance and should be avoided.

  5. hej,

    this is a powershell rewrite of the original c# version:

    Function ConvertTo-InternalField {
        Param ( [String] $toEncode )
        if ($toEncode -ne $null) {
            [String] $encodedString
            foreach ($chr in $toEncode.GetEnumerator()) {
                [String] $encodedChar = [System.Web.HttpUtility]::UrlEncodeUnicode($chr.ToString());
                if ($encodedChar.StartsWith("%")) {
                    $encodedChar = $encodedChar.Replace("u", "x");
                    $encodedChar = $encodedChar.Substring(1, $encodedChar.Length - 1);
                    $encodedChar = ("_{0}_" -f $encodedChar);
                    $encodedString += $encodedChar;
                }
                elseif ($encodedChar -eq "+" -or $encodedChar -eq " ") {
                    $encodedString += "_x0020_";
                }
                elseif ($encodedChar -eq ".") {
                    $encodedString += ("_x002e_");
                }
                else {
                    $encodedString += ($chr);
                }
            }
            $encodedString.ToString();
        }
        $null;
    }
    ConvertTo-InternalField "Field: Display Name"

  6. Thanks for the help, but I have a potential bug: in the C# version, it encodes ‘ as _27_. But in the JavaScript version, it encodes ‘ as _x0027_ which I think is correct.

    By the way, arrow keys navigate to the next/previous blog entry, which is a little frustrating when editing a comment :-S

    • Thank you for your feedback about the possible bug. I will check this out.
      I think I will disable the navigation when the focus is on the comment or name field. I never checked this issue, but thanks again for the feedback!

      • I’ve just re-discovered this blog and my comment 3 years later! This line:

        encodedChar = String.Format("_{0}_", encodedChar);
        should be:
        encodedChar = String.Format("_x00{0}_", encodedChar);

        which will fix the bug. Obviously I didn’t learn my lesson in 2014 😀

        Cheers!!

  7. oops. Poor choice for the group name…

    DecodeInternalField looks incorrect as it mistakenly removed valid underscores.

    E.g., car_colour --> carcolour

    I replaced

    string decodedString = toDecode.Replace("_x", "%u").Replace("_", "");

    with

    Regex regEx = new Regex("_x(?<fourletters>[a-z0-9]{4})_");
    string decodedString = regEx.Replace(toDecode, match =>
    {
        return "%u" + match.Groups["fourletters"];
    });

    which seems to work better.

    • Hi Matthew,
      Thank you for your reply. I mostly tested this with special German characters.
      /Stefan

  8. In SharePoint 2013, it seems like the internal names are truncated if they are >32 chars.

    For example, instead of this…

    FieldName="Meeting_x0020_Scheduled_x0020_Date"

    …SP is truncating it to this…

    FieldName="Meeting_x0020_Scheduled_x0020_Da"

    …at least that’s what it looks like when I view EditForm.

    Anyway, I do not see that 32-max-len in your script, but maybe I missed it?

    Thanks.

    Mark Kamoski

    • Thank you for your feedback. I think I should take a closer look at this issue.

    • That 32 character limit has always been there. If you have multiple columns with the same first 32 characters, the first one keeps the 32nd character, the second one gets “0” as its 32nd character, the third gets “1”, etc.

      M.

  9. Here’s another wacky one. Field names containing a dash (-) and added to a *site* will not have the - encoded. When you then add that field to a list, you get the internal name on the list with the - encoded to _x002d_. The HttpUtility.Encode doesn’t encode the - character.
