
The FIDO2 PIN

The FIDO2 standards place some special requirements on the PIN. One constraint is that the PIN must be supplied as "... the UTF-8 representation of" the "Unicode characters in Normalization Form C". Another constraint is that the PIN must meet a minimum length measured in "code points" (the standard declares, "This specification attempts to count code points as an approximation of Unicode characters") and a maximum length measured in bytes (both described further below).

What does that mean? How does one build such a PIN?

Unicode characters

First, let's look at "Unicode characters". The Unicode standard assigns a number to each character it supports. For example, the number for cap-A is U+0041, or 0x000041. The number for the lower case Greek letter pi (π) is U+03C0. There is no logical limit on these numbers, but currently the maximum Unicode number is 0x10FFFF (21 bits, which fits in 3 bytes).

Unfortunately, it is also possible to create "combinations". There is a "block" of Unicode numbers that are "combining diacritical marks", meaning that when one appears in an array of characters, software that can render Unicode knows to combine it with the previous character. For example, the Unicode number for lower case e is U+0065, and the Unicode number for "acute accent" is U+0301 (the acute accent is a small, diagonal line above a letter, sort of like a single quote or forward slash). Combine the two

    char[] eWithAcute = new char[] { '\u0065', '\u0301' };

and the result is a lower case e with an acute accent: é.

There is also a Unicode number for an e with an acute accent: U+00E9. In other words, there are two ways to represent this letter in Unicode.

    char[] eWithAcute = new char[] { '\u0065', '\u0301' };
    char[] sameCharacter = new char[] { '\u00E9' };
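
Even though both arrays render as the same letter, an ordinal, element-by-element comparison sees two different sequences. A quick demonstration (SequenceEqual requires the System.Linq namespace):

    // Requires: using System.Linq;
    // Both arrays render as é, but as char sequences they differ.
    bool sameSequence = eWithAcute.SequenceEqual(sameCharacter);
    Console.WriteLine(sameSequence); // False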

Normalization

In order to use a PIN, there has to be one and only one way to encode the characters. Otherwise, someone could enter the correct PIN and if the underlying platform encodes it differently than the original one, then it would not authenticate. So the second element of the PIN is normalization. There is a standard that specifies how to "convert" most of the combinations into single numbers. For example, normalization can convert 0065 0301 into 00E9.

Hence if your PIN is normalized, then there is only one set of numbers to represent it. The standard specifies a number of ways to normalize, and FIDO2 has chosen the technique described as "Form C".
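
You can see Form C composition at work with a quick, throwaway snippet. This is for demonstration only (it places the characters in strings, which, as discussed below, you should not do with a real PIN):

    // Requires: using System.Text;
    // Form C composes U+0065 U+0301 into the single code point U+00E9.
    string decomposed = "\u0065\u0301";
    string composed = decomposed.Normalize(NormalizationForm.FormC);

    Console.WriteLine(decomposed.Length);                 // 2
    Console.WriteLine(composed.Length);                   // 1
    Console.WriteLine(((int)composed[0]).ToString("X4")); // 00E9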

UTF-8

Once the PIN has been normalized, it is in essence an array of Unicode numbers. It would be possible to specify that each character in the PIN be a 3-byte (big-endian) number. It would also be possible to specify that only 16-bit characters be allowed in a PIN and encode it as an array of 2-byte values. However, the standard specifies encoding it as UTF-8. In this encoding scheme, many characters can be expressed as a single byte, rather than two or three. In addition, no character other than U+0000 itself produces a 00 byte in UTF-8. For example, cap-C is U+0043 and in UTF-8, it is 0x43. The letter pi is U+03C0, and is encoded in UTF-8 as 0xCF 0x80. In this way, it is possible to save space by "eliminating" many of the 00 bytes.

Actually, the encoding scheme is efficient only in that it treats ASCII characters as single bytes. There are non-ASCII Unicode characters that fit in one byte (U+0080 through U+00FF) yet are UTF-8 encoded as two bytes, some two-byte Unicode characters (U+0800 and above) that are encoded using three bytes, and three-byte Unicode characters (U+10000 and above) that are encoded in four bytes. However, because ASCII characters are the most-used characters, the efficiencies usually outweigh the inefficiencies.
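
A quick, throwaway check of those sizes (again using strings for demonstration only):

    // Requires: using System.Text;
    byte[] asciiBytes = Encoding.UTF8.GetBytes("C");   // one byte:  0x43
    byte[] piBytes = Encoding.UTF8.GetBytes("\u03C0"); // two bytes: 0xCF 0x80

    Console.WriteLine(asciiBytes.Length); // 1
    Console.WriteLine(piBytes.Length);    // 2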

C# and Unicode

Your PIN collection code will likely include some code that does something like this.

    while (someCheck)
    {
        // Pass true (intercept) so the typed character is not echoed
        // to the console.
        ConsoleKeyInfo currentKeyInfo = Console.ReadKey(true);
        if (currentKeyInfo.Key == ConsoleKey.Enter)
        {
            break;
        }

        // Append the character to a growable char[] buffer.
        inputData = AppendChar(currentKeyInfo.KeyChar, inputData, ref dataLength);
    }

You read each character in the PIN as a char and append it to a char[]. You could use the string class, but Microsoft recommends against using it to hold sensitive data. This is because:

Because System.String instances are immutable, operations that appear to modify an existing instance actually create a copy of it to manipulate. Consequently, if a String object contains sensitive information such as a password, credit card number, or personal data, there is a risk the information could be revealed after it is used because your application cannot delete the data from computer memory.
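
The AppendChar method in the loop above is not a library call; it stands in for whatever buffer management your application uses. A minimal sketch, assuming inputData is a char array and dataLength tracks how many of its slots are in use, might look like this. Note that when the buffer grows, the old copy is zeroed so stale copies of the PIN characters do not linger in memory.

    // Hypothetical helper: append one character to the buffer, growing
    // it when full and zeroing the old copy of the sensitive data.
    public static char[] AppendChar(char keyChar, char[] inputData, ref int dataLength)
    {
        if (dataLength >= inputData.Length)
        {
            char[] newBuffer = new char[Math.Max(8, inputData.Length * 2)];
            Array.Copy(inputData, newBuffer, dataLength);
            Array.Clear(inputData, 0, inputData.Length);
            inputData = newBuffer;
        }

        inputData[dataLength] = keyChar;
        dataLength++;

        return inputData;
    }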

By reading each PIN character as a char, you are limiting the characters you support to those that can be represented as a 16-bit number in the Unicode space. You would not support U+10000 to U+10FFFF. This will almost certainly be no problem, because these numbers almost exclusively represent emojis and other figures (e.g., U+1F994 is a hedgehog: 🦔), along with rare scripts (e.g., U+14400 to U+14646 are Anatolian hieroglyphs).

You now have a char array to represent the PIN.

C# and Normalization

At this point, you need to normalize. For example, suppose someone has a German keyboard and originally set a FIDO2 PIN that included a lower case u with an umlaut (ü). That keyboard represented the character as U+00FC. But now this person is using a keyboard that has no umlaut, so they use the keystrokes Option-U followed by u. Maybe the platform reads that as U+00FC, but maybe it reads it as U+0075, U+0308.

If the char array is normalized, U+00FC will stay U+00FC, but U+0075, U+0308 will be converted to U+00FC.

How does one normalize in C#? Unfortunately, there is no ideal solution. Here are three possibilities: ignore the problem and assume no one will use a PIN that really needs normalization; write your own normalization code (or obtain something from a vendor); or use the String.Normalize method, which stores the PIN in a new immutable string instance.

Assume PINs will not need normalization

This is not as unsafe as it might sound. While it is possible for a PIN as entered to differ from its normalized version, it is not likely.

First of all, a PIN that consists of only ASCII characters is normalized. Second, most people will choose a PIN that does not contain unusual characters. And third, there is a good chance that the keyboard or PIN-reading software will return the normalized version of a character even if some other form is possible.
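
If you take this approach, you can still verify the assumption at run time with a cheap check. A sketch (the IsAsciiOnly name is hypothetical, not an SDK method): since an all-ASCII PIN is already in Form C, reject any PIN containing a non-ASCII character.

    // Hypothetical check: an all-ASCII PIN is already in Form C, so no
    // normalization is needed.
    public static bool IsAsciiOnly(char[] pinChars)
    {
        foreach (char currentChar in pinChars)
        {
            if (currentChar > '\u007F')
            {
                return false;
            }
        }

        return true;
    }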

Write your own normalization code

To do so, you will likely reference the Unicode standard along with its Normalization Annex (UAX #15) to develop a class that can read a char array and convert those values to normalization Form C. For example, your program might read all the characters and determine whether any come from the "combining diacritical marks" block. If so, combine each with the appropriate prior character and map the pair to its normalized value.

Alternatively, you might want to use open source normalization code, or find a vendor with a module that can perform the appropriate operations.

    char[] pinChars = CollectPin();
    char[] normalizedPinChars = PerformNormalization(pinChars);
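
One building block for such code is detecting whether normalization is needed at all. This sketch (the ContainsCombiningMarks name is hypothetical) uses the built-in CharUnicodeInfo class to look for non-spacing combining marks, the category that covers the combining diacritical marks block:

    // Requires: using System.Globalization;
    // Hypothetical helper: report whether any character is a non-spacing
    // combining mark and therefore a candidate for composition.
    public static bool ContainsCombiningMarks(char[] pinChars)
    {
        foreach (char currentChar in pinChars)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(currentChar) == UnicodeCategory.NonSpacingMark)
            {
                return true;
            }
        }

        return false;
    }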

Normalization using the string class

As we saw above, holding sensitive data in a string carries some risk. Whether or not this is an acceptable risk for your application is something that you will need to determine. If your application's risk profile would allow the use of the string class, here's what you can do.

    char[] pinChars = CollectPin();
    char[] normalizedPinChars = PerformNormalization(pinChars);
      . . .

    public char[] PerformNormalization(char[] pinChars)
    {
        string pinAsString = new string(pinChars);

        // With no argument, Normalize uses NormalizationForm.FormC,
        // the form FIDO2 requires.
        string normalizedPin = pinAsString.Normalize();
        return normalizedPin.ToCharArray();
    }

C# and UTF-8

Once you have an array of characters, you can convert it into UTF-8 using the Encoding class (in the System.Text namespace).

    byte[] utf8Pin = Encoding.UTF8.GetBytes(normalizedPinChars);

This byte array is what you pass to the SetPinCommand.

If you are using the string class to normalize, your code could look something like this.

    char[] pinChars = CollectPin();
    string pinAsString = new string(pinChars);
    string normalizedPin = pinAsString.Normalize();
    byte[] utf8Pin = Encoding.UTF8.GetBytes(normalizedPin);
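
Whichever route you take, you may want to zero the intermediate arrays once the PIN has been delivered to the YubiKey. The strings created for normalization, unfortunately, cannot be cleared; that is the risk discussed above. For example:

    // Zero the sensitive arrays once they are no longer needed. The
    // intermediate strings cannot be cleared.
    Array.Clear(utf8Pin, 0, utf8Pin.Length);
    Array.Clear(pinChars, 0, pinChars.Length);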

Length restrictions

The standard specifies that a PIN must be at least four code points. Remember, the standard declares, "This specification attempts to count code points as an approximation of Unicode characters".

The standard also specifies that a PIN can be no more than 63 bytes. That is, after the PIN has been converted to "... the UTF-8 representation of" the "Unicode characters in Normalization Form C", the resulting byte array's length must be less than or equal to 63.

It is possible for a YubiKey to be manufactured with a longer minimum length (the standard allows that), and on some YubiKeys it is possible to programmatically increase the minimum length. You can find the minimum PIN length of any YubiKey in the AuthenticatorInfo's MinimumPinLength property.

The standard does not allow increasing or decreasing the maximum PIN length.
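
Putting the two limits together, a validation sketch might look like the following. It assumes you have already read minimumPinLength from the AuthenticatorInfo, and it counts code points rather than chars so that a surrogate pair counts once (if you collect PINs as chars, as above, surrogates will not appear anyway):

    // Count code points: a surrogate pair is two chars but one code point.
    int codePointCount = 0;
    for (int index = 0; index < normalizedPinChars.Length; index++)
    {
        codePointCount++;
        if (char.IsHighSurrogate(normalizedPinChars[index]))
        {
            index++;
        }
    }

    bool isValidPinLength =
        (codePointCount >= minimumPinLength) && (utf8Pin.Length <= 63);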
