What is the regex to extract all the emojis from a string?
I have a String encoded in UTF-8. For example:
Thats a nice joke
13 Answers
The pdf that you just mentioned says Range: 1F300-1F5FF for Miscellaneous Symbols and Pictographs. So let's say I want to capture any character lying within this range. Now what to do?
Okay, but I will just note that the emoji in your question are outside that range! :-)
The fact that these are above 0xFFFF complicates things, because Java strings store UTF-16. So we can't just use one simple character class for it. We're going to have surrogate pairs. (More: http://www.unicode.org/faq/utf_bom.html)
U+1F300 in UTF-16 ends up being the pair \uD83C\uDF00; U+1F5FF ends up being \uD83D\uDDFF. Note that the first char went up, so we cross at least one boundary. So we have to know what ranges of surrogate pairs we're looking for.
Not being steeped in knowledge about the inner workings of UTF-16, I wrote a program to find out (source at the end; I'd double-check it if I were you, rather than trusting me). It tells me we're looking for \uD83C followed by anything in the range \uDF00-\uDFFF (inclusive), or \uD83D followed by anything in the range \uDC00-\uDDFF (inclusive).
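The pair values quoted above can be checked directly. This is a minimal sketch and not part of the original answer: the class name SurrogateDemo and the helper toUtf16Escapes are made up for illustration, while Character.toChars is the standard JDK call that returns the UTF-16 code units of a code point.

```java
class SurrogateDemo {
    // Hypothetical helper: formats each UTF-16 code unit of a code point
    // as a backslash-u escape with four uppercase hex digits.
    static String toUtf16Escapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First and last code points of the 1F300-1F5FF range
        System.out.println(toUtf16Escapes(0x1F300)); // units D83C, DF00
        System.out.println(toUtf16Escapes(0x1F5FF)); // units D83D, DDFF
        // A BMP character needs only one code unit
        System.out.println(toUtf16Escapes(0x2603));  // unit 2603
    }
}
```

The high surrogate changes from D83C to D83D partway through the range, which is why two separate sub-ranges are needed.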
So armed with that knowledge, in theory we could now write a pattern:
// This is wrong, keep reading
Pattern p = Pattern.compile("(?:\uD83C[\uDF00-\uDFFF])|(?:\uD83D[\uDC00-\uDDFF])");
That's an alternation of two non-capturing groups, the first group for the pairs starting with \uD83C, and the second group for the pairs starting with \uD83D.
But that fails (doesn't find anything). I'm fairly sure it's because we're trying to specify half of a surrogate pair in various places:
Pattern p = Pattern.compile("(?:\uD83C[\uDF00-\uDFFF])|(?:\uD83D[\uDC00-\uDDFF])");
// Half of a pair --------------^------^------^-----------^------^------^
We can't just split up surrogate pairs like that; they're called surrogate pairs for a reason. :-)
Consequently, I don't think we can use regular expressions (or indeed, any string-based approach) for this. I think we have to search through char arrays.
char arrays hold UTF-16 values, so we can find those half-pairs in the data if we look for them the hard way:
String s = new StringBuilder()
    .append("Thats a nice joke ")
    .appendCodePoint(0x1F606)
    .appendCodePoint(0x1F606)
    .appendCodePoint(0x1F606)
    .append(" ")
    .appendCodePoint(0x1F61B)
    .toString();
char[] chars = s.toCharArray();
int index;
char ch1;
char ch2;
index = 0;
while (index < chars.length - 1) { // -1 because we're looking for two-char-long things
    ch1 = chars[index];
    if ((int)ch1 == 0xD83C) {
        ch2 = chars[index+1];
        if ((int)ch2 >= 0xDF00 && (int)ch2 <= 0xDFFF) {
            System.out.println("Found emoji at index " + index);
            index += 2;
            continue;
        }
    }
    else if ((int)ch1 == 0xD83D) {
        ch2 = chars[index+1];
        if ((int)ch2 >= 0xDC00 && (int)ch2 <= 0xDDFF) {
            System.out.println("Found emoji at index " + index);
            index += 2;
            continue;
        }
    }
    ++index;
}
Obviously, that's just debug-level code, but it does the job. (In your given string, with its emoji, of course it won't find anything as they're outside the range. But if you change the upper bound on the second pair to 0xDEFF instead of 0xDDFF, it will. No idea whether that would also include non-emoji, though.)
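The same scan can be written in terms of code points rather than individual surrogate halves. The following is only a sketch under stated assumptions: CodePointScan and findInRange are hypothetical names, and the wider upper bound 0x1F6FF is simply the code point that corresponds to the \uD83D\uDEFF limit mentioned above. String.codePointAt and Character.charCount are standard JDK methods.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointScan {
    // Returns the char indices at which a code point in [lo, hi] starts.
    static List<Integer> findInRange(String s, int lo, int hi) {
        List<Integer> hits = new ArrayList<Integer>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);        // joins a surrogate pair into one value
            if (cp >= lo && cp <= hi) {
                hits.add(i);
            }
            i += Character.charCount(cp);     // advance by 1 or 2 chars
        }
        return hits;
    }

    public static void main(String[] args) {
        String s = new StringBuilder("Thats a nice joke ")
            .appendCodePoint(0x1F606)
            .appendCodePoint(0x1F606)
            .appendCodePoint(0x1F606)
            .append(" ")
            .appendCodePoint(0x1F61B)
            .toString();
        System.out.println(findInRange(s, 0x1F300, 0x1F5FF)); // prints []
        System.out.println(findInRange(s, 0x1F300, 0x1F6FF)); // prints [18, 20, 22, 25]
    }
}
```

Working with code points avoids having to know where the high surrogate changes, at the cost of one extra method call per character.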
Source of my program to find out what the surrogate ranges were:
public class FindRanges {
    public static void main(String[] args) {
        char last0 = '\0';
        char last1 = '\0';
        for (int x = 0x1F300; x <= 0x1F5FF; ++x) {
            char[] chars = new StringBuilder().appendCodePoint(x).toString().toCharArray();
            if (chars[0] != last0) {
                if (last0 != '\0') {
                    System.out.println("-\\u" + Integer.toHexString((int)last1).toUpperCase());
                }
                System.out.print("\\u" + Integer.toHexString((int)chars[0]).toUpperCase() + " \\u" + Integer.toHexString((int)chars[1]).toUpperCase());
                last0 = chars[0];
            }
            last1 = chars[1];
        }
        if (last0 != '\0') {
            System.out.println("-\\u" + Integer.toHexString((int)last1).toUpperCase());
        }
    }
}
Output:
\uD83C \uDF00-\uDFFF
\uD83D \uDC00-\uDDFF
Using emoji-java I've written a simple method that removes all emojis, including Fitzpatrick modifiers. It requires an external library, but it is easier to maintain than those monster regexes.
Use:
String input = "A string with a \uD83D\uDC66\uD83C\uDFFFfew emojis!";
String result = EmojiParser.removeAllEmojis(input);
emoji-java installation with Maven:
<dependency>
    <groupId>com.vdurmont</groupId>
    <artifactId>emoji-java</artifactId>
    <version>3.1.3</version>
</dependency>
Gradle:
compile 'com.vdurmont:emoji-java:3.1.3'
EDIT: the previously submitted answer was pulled into the emoji-java source code.
Had a similar problem. The following served me well and matches surrogate pairs:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitByUnicode {
    public static void main(String[] argv) throws Exception {
        String string = "Thats a nice joke ";
        System.out.println("Original String:"+string);
        String regexPattern = "[\uD83C-\uDBFF\uDC00-\uDFFF]+";
        // Round-trip through UTF-8 bytes, as the input is said to be UTF-8 encoded
        byte[] utf8 = string.getBytes("UTF-8");
        String string1 = new String(utf8, "UTF-8");
        Pattern pattern = Pattern.compile(regexPattern);
        Matcher matcher = pattern.matcher(string1);
        List<String> matchList = new ArrayList<String>();
        while (matcher.find()) {
            matchList.add(matcher.group());
        }
        for(int i=0;i<matchList.size();i++){
            System.out.println(i+":"+matchList.get(i));
        }
    }
}
The output is:
Original String:Thats a nice joke
0:
1:
Found the regex from https://stackoverflow.com/a/24071599/915972
This worked for me in Java 8:
public static String mysqlSafe(String input) {
    if (input == null) return null;
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < input.length(); i++) {
        if (i < (input.length() - 1)) { // Emojis are two characters long in java, e.g. a rocket emoji is "\uD83D\uDE80";
            if (Character.isSurrogatePair(input.charAt(i), input.charAt(i + 1))) {
                i += 1; //also skip the second character of the emoji
                continue;
            }
        }
        sb.append(input.charAt(i));
    }
    return sb.toString();
}
You can do it like this:
String s="Thats a nice joke ";
Pattern pattern = Pattern.compile("[\ud83c\udc00-\ud83c\udfff]|[\ud83d\udc00-\ud83d\udfff]|[\u2600-\u27ff]",
        Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(s);
List<String> matchList = new ArrayList<String>();
while (matcher.find()) {
    matchList.add(matcher.group());
}
for(int i=0;i<matchList.size();i++){
    System.out.println(matchList.get(i));
}
Assuming you are asking about standard Unicode emoji ranges (there are various vendor-specific blocks), you can consider these three ranges:
- 0x20a0-0x32ff
- 0x1f000-0x1ffff
- 0xfe4e5-0xfe4ee
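As a rough illustration of how such a range list might be applied, here is a sketch that drops every code point falling in one of the three ranges. The class EmojiRanges and its methods are hypothetical names introduced for this example; the ranges are copied verbatim from the list above and are not checked against the Unicode emoji data files.

```java
class EmojiRanges {
    // Inclusive {low, high} code point pairs, taken as-is from the list above.
    private static final int[][] RANGES = {
        {0x20A0, 0x32FF},
        {0x1F000, 0x1FFFF},
        {0xFE4E5, 0xFE4EE}
    };

    static boolean inRanges(int codePoint) {
        for (int[] r : RANGES) {
            if (codePoint >= r[0] && codePoint <= r[1]) {
                return true;
            }
        }
        return false;
    }

    // Rebuilds the string, keeping only code points outside the ranges.
    static String strip(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!inRanges(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = new StringBuilder("ok ").appendCodePoint(0x1F606).append(" \u2603").toString();
        System.out.println("[" + strip(s) + "]"); // prints [ok  ]
    }
}
```

Note that the first range is very broad: it also covers non-emoji such as currency signs, arrows, and the Hiragana and Katakana blocks, so a filter like this would remove Japanese kana as well.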
Besides the thoughtful explanation that T. J. Crowder shared with us, it should be said that as of Java 7 it is possible to match UTF-16 encoded surrogate pairs easily.
Take a look at the docs:
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
A Unicode character can also be represented in a regular-expression by using its Hex notation (hexadecimal code point value) directly as described in construct \x{...}, for example a supplementary character U+2011F can be specified as \x{2011F}, instead of two consecutive Unicode escape sequences of the surrogate pair \uD840\uDD1F.
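A short sketch of the \x{...} construct in use, assuming Java 7 or later. The class name CodePointRegex, the extract helper, and the chosen range U+1F300 through U+1F6FF are illustrative assumptions, not something prescribed by the documentation quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodePointRegex {
    // Character class written with code point values instead of surrogate halves.
    // The range is an assumption chosen for this example.
    static final Pattern EMOJI = Pattern.compile("[\\x{1F300}-\\x{1F6FF}]");

    // Collects every match as its own string (each is one code point, two chars).
    static List<String> extract(String s) {
        List<String> found = new ArrayList<String>();
        Matcher m = EMOJI.matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String s = new StringBuilder("Thats a nice joke ")
            .appendCodePoint(0x1F606)
            .appendCodePoint(0x1F606)
            .appendCodePoint(0x1F606)
            .append(" ")
            .appendCodePoint(0x1F61B)
            .toString();
        System.out.println(extract(s).size()); // prints 4
    }
}
```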
Nevertheless, if you can't switch to Java 7, you can extend the valuable UnicodeEscaper provided by Guava.
Here is a sample implementation:
import com.google.common.escape.UnicodeEscaper;

public class SimpleEscaper extends UnicodeEscaper
{
    @Override
    protected char[] escape(int codePoint)
    {
        // Replace code points in 0x1f000-0x1ffff with their hex value
        if (0x1f000 <= codePoint && codePoint <= 0x1ffff)
        {
            return Integer.toHexString(codePoint).toCharArray();
        }
        return Character.toChars(codePoint);
    }
}
The best regex to extract all emoji is this:
(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|[\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c[\ude32-\ude3a]|[\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])
It identifies many single-char emoji that the other answers do not account for. For more information about how this regex works, take a look at this post: https://medium.com/@thekevinscott/emojis-in-javascript-f693d0eb79fb#.enomgcu63
You can also use the emoji4j library.
String emojiText = "A , and a became friends. For 's birthday party, they all had s, s, s and .";
EmojiUtils.removeAllEmojis(emojiText);//returns "A , and a became friends. For 's birthday party, they all had s, s, s and .
Emoji regex
public static final String sEmojiRegex = "(?:[\\u2700-\\u27bf]|" +
"(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|" +
"[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?" +
"(?:\\u200d(?:[^\\ud800-\\udfff]|" +
"(?:[\\ud83c\\udde6-\\ud83c\\uddff]){2}|" +
"[\\ud800\\udc00-\\uDBFF\\uDFFF]|[\\u2600-\\u26FF])[\\ufe0e\\ufe0f]?(?:[\\u0300-\\u036f\\ufe20-\\ufe23\\u20d0-\\u20f0]|[\\ud83c\\udffb-\\ud83c\\udfff])?)*|" +
"[\\u0023-\\u0039]\\ufe0f?\\u20e3|\\u3299|\\u3297|\\u303d|\\u3030|\\u24c2|[\\ud83c\\udd70-\\ud83c\\udd71]|[\\ud83c\\udd7e-\\ud83c\\udd7f]|\\ud83c\\udd8e|[\\ud83c\\udd91-\\ud83c\\udd9a]|[\\ud83c\\udde6-\\ud83c\\uddff]|[\\ud83c\\ude01-\\ud83c\\ude02]|\\ud83c\\ude1a|\\ud83c\\ude2f|[\\ud83c\\ude32-\\ud83c\\ude3a]|[\\ud83c\\ude50-\\ud83c\\ude51]|\\u203c|\\u2049|[\\u25aa-\\u25ab]|\\u25b6|\\u25c0|[\\u25fb-\\u25fe]|\\u00a9|\\u00ae|\\u2122|\\u2139|\\ud83c\\udc04|[\\u2600-\\u26FF]|\\u2b05|\\u2b06|\\u2b07|\\u2b1b|\\u2b1c|\\u2b50|\\u2b55|\\u231a|\\u231b|\\u2328|\\u23cf|[\\u23e9-\\u23f3]|[\\u23f8-\\u23fa]|\\ud83c\\udccf|\\u2934|\\u2935|[\\u2190-\\u21ff]";
Some emojis (1627)
// count = 1627
public static final String sEmojiTest = "☺️☹️☠️✌️☝️✍️♀♀♀♀♀️♀️⚕⚕✈✈⚖⚖♀♂♂♂♂♀♂♀♂♂♂♂♂♂♀♀❤️❤️❤️❤️☂️☘️️⚡️☄☀️️☁️☃️️❄️☔️☕️️️️️♀️♀♂♀♂️♀️♀♂️♀️♀♀♀♂♀♀♀♀♂✈️️⚓️️️️️⌚️⌨️☎️⌛️⚖️⚒⚙️⚔️⚰️⚱️⚗️✉️✂️✒️✏️❤️❣️☮️✝️☪️☸️✡️☯️☦️♈️♉️♊️♋️♌️♍️♎️♏️♐️♑️♒️♓️⚛️☢️☣️️️✴️㊙️㊗️️️️️️♨️️‼️⁉️〽️⚠️⚜️♻️️❇️✳️Ⓜ️♿️️️ℹ️0️⃣1️⃣2️⃣3️⃣4️⃣5️⃣6️⃣7️⃣8️⃣9️⃣#️⃣*️⃣▶️◀️➡️⬅️⬆️⬇️↗️↘️↙️↖️↕️↔️↪️↩️⤴️⤵️✖️™️©️®️〰️✔️☑️⚪️⚫️▪️▫️◾️◽️◼️◻️️️♠️♣️♥️♦️️️️️️️️♀️♀️♀️♀️♀️♀️️♀️♂️♀️♀️♀️♀️♀️♀️♂️♂️♂️♂️♂️♂️️♀️♀️♀️♀️♀️♀️️♀️♀️♀️♀️♀️♀️♂️♂️♂️♂️♂️♂️️♀️♀️♀️♀️♀️♀️️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♂️♂️♂️♂️♂️♂️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♀️♂️";
Function to test the emojis
public void checkMatchingEmojis() {
    final Pattern pattern = Pattern.compile(sEmojiRegex);
    final Matcher matcher = pattern.matcher(sEmojiTest);
    int foundEmojiCount = 0;
    while (matcher.find()) {
        System.out.println("Full match: " + matcher.group(0));
        foundEmojiCount++;
    }
    System.out.println("*******************************************");
    System.out.println("Input Emoji count = 1627");
    System.out.println("Captured Emoji count = " + foundEmojiCount);
    System.out.println("*******************************************");
}
Here is the gist, tested on all Unicode 10 emojis
Thanks to Kevin Scott for writing a great example
This is what I use to remove emojis, and so far it has been shown to allow all other alphabets.
private static String remove_Emojis(String name)
{
    //we will store all the letters in this array
    ArrayList<Character> nonEmoji = new ArrayList<>();
    // and when we rebuild the name we will put it in here
    String newName = "";
    // we are going to loop through checking each character to see if its an emoji or not
    for (int i = 0; i < name.length(); i++)
    {
        if (Character.isLetterOrDigit(name.charAt(i)))
        {
            nonEmoji.add(name.charAt(i));
        }
        else
        {
            // this is just a 2nd check in case the other method didn't allow some letter
            if (Build.VERSION.SDK_INT > 18)
            {
                if (Character.isAlphabetic(name.charAt(i)))
                {
                    nonEmoji.add(name.charAt(i));
                }
            }
        }
        if (name.charAt(i) == ' ')// may want to consider adding or '-' or '\''
        {
            nonEmoji.add(name.charAt(i));// just add it (the character, not the index)
        }
        if (name.charAt(i) == '@' && !name.contains(" "))// I put this in for email addresses
        {
            nonEmoji.add('@');
        }
    }
    // finally just loop through building it back out
    for (int i = 0; i < nonEmoji.size(); i++) {
        newName += nonEmoji.get(i);
    }
    return newName;
}
There are two ways to solve this sticky problem.
The first is to use third-party libraries such as emoji-java and emoji4j, which are mentioned above. You can easily use methods like containsEmoji or removesEmoji, etc. But in your own applications you then have to keep up to date with these libs.
As for me, I wanted to find a simple solution to this problem.
After a whole day of research, I found a magic regex:
"(?:[\uD83C\uDF00-\uD83D\uDDFF]|[\uD83E\uDD00-\uD83E\uDDFF]|[\uD83D\uDE00-\uD83D\uDE4F]|[\uD83D\uDE80-\uD83D\uDEFF]|[\u2600-\u26FF]\uFE0F?|[\u2700-\u27BF]\uFE0F?|\u24C2\uFE0F?|[\uD83C\uDDE6-\uD83C\uDDFF]{1,2}|[\uD83C\uDD70\uD83C\uDD71\uD83C\uDD7E\uD83C\uDD7F\uD83C\uDD8E\uD83C\uDD91-\uD83C\uDD9A]\uFE0F?|[\u0023\u002A\u0030-\u0039]\uFE0F?\u20E3|[\u2194-\u2199\u21A9-\u21AA]\uFE0F?|[\u2B05-\u2B07\u2B1B\u2B1C\u2B50\u2B55]\uFE0F?|[\u2934\u2935]\uFE0F?|[\u3030\u303D]\uFE0F?|[\u3297\u3299]\uFE0F?|[\uD83C\uDE01\uD83C\uDE02\uD83C\uDE1A\uD83C\uDE2F\uD83C\uDE32-\uD83C\uDE3A\uD83C\uDE50\uD83C\uDE51]\uFE0F?|[\u203C\u2049]\uFE0F?|[\u25AA\u25AB\u25B6\u25C0\u25FB-\u25FE]\uFE0F?|[\u00A9\u00AE]\uFE0F?|[\u2122\u2139]\uFE0F?|\uD83C\uDC04\uFE0F?|\uD83C\uDCCF\uFE0F?|[\u231A\u231B\u2328\u23CF\u23E9-\u23F3\u23F8-\u23FA]\uFE0F?)"
I have tested it OK in Java. It solved my problem perfectly.
You can see this on the GitHub page:
https://github.com/zly394/EmojiRegex
Notes:
The answer provided by @Eric Nakagawa contains some errors, so it cannot be used correctly.
Just use a regex to solve it:
s = s.replaceAll("\\p{So}+", "");
You can find it in
http://www.regular-expressions.info/unicode.html
https://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#OTHER_SYMBOL
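To make the behaviour of the \p{So} category concrete, here is a small hedged sketch. OtherSymbolDemo and stripOtherSymbols are hypothetical names. The expected results rely on the general categories of the specific characters used: U+1F606, U+2603 and U+00A9 are Other Symbol, while the keycap sequence is built from a digit and two combining marks.

```java
class OtherSymbolDemo {
    // Removes every run of characters whose general category is So (Other Symbol).
    static String stripOtherSymbols(String s) {
        return s.replaceAll("\\p{So}+", "");
    }

    public static void main(String[] args) {
        String joke = new StringBuilder("nice joke ")
            .appendCodePoint(0x1F606)   // emoticon, category So
            .append(" ")
            .appendCodePoint(0x2603)    // snowman, category So
            .append("!")
            .toString();
        System.out.println(stripOtherSymbols(joke));           // prints "nice joke  !"

        // The copyright sign is also So, so it is removed as well.
        System.out.println(stripOtherSymbols("\u00A9 2020"));  // prints " 2020"

        // A keycap sequence (digit, variation selector, enclosing keycap)
        // contains no So character, so it is left untouched.
        String keycap = "1\uFE0F\u20E3";
        System.out.println(stripOtherSymbols(keycap).equals(keycap)); // prints true
    }
}
```

So the category is convenient but does not line up exactly with "emoji": it removes some symbols that are not emoji and leaves some emoji sequences in place.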
You can generate your own regex each time the specification changes.
This tool (screenshot here).
For UTF-8/32 mode (string), expanded mode:
" # Use the 'Mega-Conversion' tool to change into other syntaxes"
" # -------------------------------------------------------------"
" "
" [#*0-9] \\x{FE0F} \\x{20E3}"
" | [\\x{A9}\\x{AE}\\x{203C}\\x{2049}\\x{2122}\\x{2139}\\x{2194}-\\x{2199}\\x{21A9}\\x{21AA}\\x{231A}\\x{231B}\\x{2328}\\x{23CF}\\x{23E9}-\\x{23F3}\\x{23F8}-\\x{23FA}\\x{24C2}\\x{25AA}\\x{25AB}\\x{25B6}\\x{25C0}\\x{25FB}-\\x{25FE}\\x{2600}-\\x{2604}\\x{260E}\\x{2611}\\x{2614}\\x{2615}\\x{2618}]"
" | \\x{261D} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{2620}\\x{2622}\\x{2623}\\x{2626}\\x{262A}\\x{262E}\\x{262F}\\x{2638}-\\x{263A}\\x{2640}\\x{2642}\\x{2648}-\\x{2653}\\x{265F}\\x{2660}\\x{2663}\\x{2665}\\x{2666}\\x{2668}\\x{267B}\\x{267E}\\x{267F}\\x{2692}-\\x{2697}\\x{2699}\\x{269B}\\x{269C}\\x{26A0}\\x{26A1}\\x{26AA}\\x{26AB}\\x{26B0}\\x{26B1}\\x{26BD}\\x{26BE}\\x{26C4}\\x{26C5}\\x{26C8}\\x{26CE}\\x{26CF}\\x{26D1}\\x{26D3}\\x{26D4}\\x{26E9}\\x{26EA}\\x{26F0}-\\x{26F5}\\x{26F7}\\x{26F8}]"
" | \\x{26F9}"
" (?:"
" \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{26FA}\\x{26FD}\\x{2702}\\x{2705}\\x{2708}\\x{2709}]"
" | [\\x{270A}-\\x{270D}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{270F}\\x{2712}\\x{2714}\\x{2716}\\x{271D}\\x{2721}\\x{2728}\\x{2733}\\x{2734}\\x{2744}\\x{2747}\\x{274C}\\x{274E}\\x{2753}-\\x{2755}\\x{2757}\\x{2763}\\x{2764}\\x{2795}-\\x{2797}\\x{27A1}\\x{27B0}\\x{27BF}\\x{2934}\\x{2935}\\x{2B05}-\\x{2B07}\\x{2B1B}\\x{2B1C}\\x{2B50}\\x{2B55}\\x{3030}\\x{303D}\\x{3297}\\x{3299}\\x{1F004}\\x{1F0CF}\\x{1F170}\\x{1F171}\\x{1F17E}\\x{1F17F}\\x{1F18E}\\x{1F191}-\\x{1F19A}]"
" | \\x{1F1E6} [\\x{1F1E8}-\\x{1F1EC}\\x{1F1EE}\\x{1F1F1}\\x{1F1F2}\\x{1F1F4}\\x{1F1F6}-\\x{1F1FA}\\x{1F1FC}\\x{1F1FD}\\x{1F1FF}]"
" | \\x{1F1E7} [\\x{1F1E6}\\x{1F1E7}\\x{1F1E9}-\\x{1F1EF}\\x{1F1F1}-\\x{1F1F4}\\x{1F1F6}-\\x{1F1F9}\\x{1F1FB}\\x{1F1FC}\\x{1F1FE}\\x{1F1FF}]"
" | \\x{1F1E8} [\\x{1F1E6}\\x{1F1E8}\\x{1F1E9}\\x{1F1EB}-\\x{1F1EE}\\x{1F1F0}-\\x{1F1F5}\\x{1F1F7}\\x{1F1FA}-\\x{1F1FF}]"
" | \\x{1F1E9} [\\x{1F1EA}\\x{1F1EC}\\x{1F1EF}\\x{1F1F0}\\x{1F1F2}\\x{1F1F4}\\x{1F1FF}]"
" | \\x{1F1EA} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}\\x{1F1EC}\\x{1F1ED}\\x{1F1F7}-\\x{1F1FA}]"
" | \\x{1F1EB} [\\x{1F1EE}-\\x{1F1F0}\\x{1F1F2}\\x{1F1F4}\\x{1F1F7}]"
" | \\x{1F1EC} [\\x{1F1E6}\\x{1F1E7}\\x{1F1E9}-\\x{1F1EE}\\x{1F1F1}-\\x{1F1F3}\\x{1F1F5}-\\x{1F1FA}\\x{1F1FC}\\x{1F1FE}]"
" | \\x{1F1ED} [\\x{1F1F0}\\x{1F1F2}\\x{1F1F3}\\x{1F1F7}\\x{1F1F9}\\x{1F1FA}]"
" | \\x{1F1EE} [\\x{1F1E8}-\\x{1F1EA}\\x{1F1F1}-\\x{1F1F4}\\x{1F1F6}-\\x{1F1F9}]"
" | \\x{1F1EF} [\\x{1F1EA}\\x{1F1F2}\\x{1F1F4}\\x{1F1F5}]"
" | \\x{1F1F0} [\\x{1F1EA}\\x{1F1EC}-\\x{1F1EE}\\x{1F1F2}\\x{1F1F3}\\x{1F1F5}\\x{1F1F7}\\x{1F1FC}\\x{1F1FE}\\x{1F1FF}]"
" | \\x{1F1F1} [\\x{1F1E6}-\\x{1F1E8}\\x{1F1EE}\\x{1F1F0}\\x{1F1F7}-\\x{1F1FB}\\x{1F1FE}]"
" | \\x{1F1F2} [\\x{1F1E6}\\x{1F1E8}-\\x{1F1ED}\\x{1F1F0}-\\x{1F1FF}]"
" | \\x{1F1F3} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}-\\x{1F1EC}\\x{1F1EE}\\x{1F1F1}\\x{1F1F4}\\x{1F1F5}\\x{1F1F7}\\x{1F1FA}\\x{1F1FF}]"
" | \\x{1F1F4} \\x{1F1F2}"
" | \\x{1F1F5} [\\x{1F1E6}\\x{1F1EA}-\\x{1F1ED}\\x{1F1F0}-\\x{1F1F3}\\x{1F1F7}-\\x{1F1F9}\\x{1F1FC}\\x{1F1FE}]"
" | \\x{1F1F6} \\x{1F1E6}"
" | \\x{1F1F7} [\\x{1F1EA}\\x{1F1F4}\\x{1F1F8}\\x{1F1FA}\\x{1F1FC}]"
" | \\x{1F1F8} [\\x{1F1E6}-\\x{1F1EA}\\x{1F1EC}-\\x{1F1F4}\\x{1F1F7}-\\x{1F1F9}\\x{1F1FB}\\x{1F1FD}-\\x{1F1FF}]"
" | \\x{1F1F9} [\\x{1F1E6}\\x{1F1E8}\\x{1F1E9}\\x{1F1EB}-\\x{1F1ED}\\x{1F1EF}-\\x{1F1F4}\\x{1F1F7}\\x{1F1F9}\\x{1F1FB}\\x{1F1FC}\\x{1F1FF}]"
" | \\x{1F1FA} [\\x{1F1E6}\\x{1F1EC}\\x{1F1F2}\\x{1F1F3}\\x{1F1F8}\\x{1F1FE}\\x{1F1FF}]"
" | \\x{1F1FB} [\\x{1F1E6}\\x{1F1E8}\\x{1F1EA}\\x{1F1EC}\\x{1F1EE}\\x{1F1F3}\\x{1F1FA}]"
" | \\x{1F1FC} [\\x{1F1EB}\\x{1F1F8}]"
" | \\x{1F1FD} \\x{1F1F0}"
" | \\x{1F1FE} [\\x{1F1EA}\\x{1F1F9}]"
" | \\x{1F1FF} [\\x{1F1E6}\\x{1F1F2}\\x{1F1FC}]"
" | [\\x{1F201}\\x{1F202}\\x{1F21A}\\x{1F22F}\\x{1F232}-\\x{1F23A}\\x{1F250}\\x{1F251}\\x{1F300}-\\x{1F321}\\x{1F324}-\\x{1F384}]"
" | \\x{1F385} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F386}-\\x{1F393}\\x{1F396}\\x{1F397}\\x{1F399}-\\x{1F39B}\\x{1F39E}-\\x{1F3C1}]"
" | \\x{1F3C2} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F3C3}\\x{1F3C4}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F3C5}\\x{1F3C6}]"
" | \\x{1F3C7} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F3C8}\\x{1F3C9}]"
" | \\x{1F3CA}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F3CB}\\x{1F3CC}]"
" (?:"
" \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F3CD}-\\x{1F3F0}]"
" | \\x{1F3F3}"
" (?: \\x{FE0F} \\x{200D} \\x{1F308} )?"
" | \\x{1F3F4}"
" (?:"
" \\x{200D} \\x{2620} \\x{FE0F}"
" | \\x{E0067} \\x{E0062}"
" (?:"
" \\x{E0065} \\x{E006E} \\x{E0067}"
" | \\x{E0073} \\x{E0063} \\x{E0074}"
" | \\x{E0077} \\x{E006C} \\x{E0073}"
" )"
" \\x{E007F}"
" )?"
" | [\\x{1F3F5}\\x{1F3F7}-\\x{1F440}]"
" | \\x{1F441}"
" (?: \\x{FE0F} \\x{200D} \\x{1F5E8} \\x{FE0F} )?"
" | [\\x{1F442}\\x{1F443}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F444}\\x{1F445}]"
" | [\\x{1F446}-\\x{1F450}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F451}-\\x{1F465}]"
" | [\\x{1F466}\\x{1F467}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F468}"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | \\x{2764} \\x{FE0F} \\x{200D}"
" (?: \\x{1F48B} \\x{200D} )?"
" \\x{1F468}"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}]"
" | \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" | [\\x{1F468}\\x{1F469}] \\x{200D}"
" (?:"
" \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" )"
" | [\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" )?"
" )?"
" | \\x{1F469}"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | \\x{2764} \\x{FE0F} \\x{200D}"
" (?: \\x{1F48B} \\x{200D} )?"
" [\\x{1F468}\\x{1F469}]"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}]"
" | \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" | \\x{1F469} \\x{200D}"
" (?:"
" \\x{1F466}"
" (?: \\x{200D} \\x{1F466} )?"
" | \\x{1F467}"
" (?: \\x{200D} [\\x{1F466}\\x{1F467}] )?"
" )"
" | [\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?:"
" \\x{200D}"
" (?:"
" [\\x{2695}\\x{2696}\\x{2708}] \\x{FE0F}"
" | [\\x{1F33E}\\x{1F373}\\x{1F393}\\x{1F3A4}\\x{1F3A8}\\x{1F3EB}\\x{1F3ED}\\x{1F4BB}\\x{1F4BC}\\x{1F527}\\x{1F52C}\\x{1F680}\\x{1F692}\\x{1F9B0}-\\x{1F9B3}]"
" )"
" )?"
" )?"
" | [\\x{1F46A}-\\x{1F46D}]"
" | \\x{1F46E}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F46F}"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" | \\x{1F470} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F471}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F472} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F473}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F474}-\\x{1F476}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F477}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F478} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F479}-\\x{1F47B}]"
" | \\x{1F47C} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F47D}-\\x{1F480}]"
" | [\\x{1F481}\\x{1F482}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F483} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F484}"
" | \\x{1F485} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F486}\\x{1F487}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F488}-\\x{1F4A9}]"
" | \\x{1F4AA} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F4AB}-\\x{1F4FD}\\x{1F4FF}-\\x{1F53D}\\x{1F549}-\\x{1F54E}\\x{1F550}-\\x{1F567}\\x{1F56F}\\x{1F570}\\x{1F573}]"
" | \\x{1F574} [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F575}"
" (?:"
" \\x{FE0F} \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F576}-\\x{1F579}]"
" | \\x{1F57A} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F587}\\x{1F58A}-\\x{1F58D}]"
" | [\\x{1F590}\\x{1F595}\\x{1F596}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F5A4}\\x{1F5A5}\\x{1F5A8}\\x{1F5B1}\\x{1F5B2}\\x{1F5BC}\\x{1F5C2}-\\x{1F5C4}\\x{1F5D1}-\\x{1F5D3}\\x{1F5DC}-\\x{1F5DE}\\x{1F5E1}\\x{1F5E3}\\x{1F5E8}\\x{1F5EF}\\x{1F5F3}\\x{1F5FA}-\\x{1F644}]"
" | [\\x{1F645}-\\x{1F647}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F648}-\\x{1F64A}]"
" | \\x{1F64B}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F64C} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F64D}\\x{1F64E}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F64F} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F680}-\\x{1F6A2}]"
" | \\x{1F6A3}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F6A4}-\\x{1F6B3}]"
" | [\\x{1F6B4}-\\x{1F6B6}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F6B7}-\\x{1F6BF}]"
" | \\x{1F6C0} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F6C1}-\\x{1F6C5}\\x{1F6CB}]"
" | \\x{1F6CC} [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F6CD}-\\x{1F6D2}\\x{1F6E0}-\\x{1F6E5}\\x{1F6E9}\\x{1F6EB}\\x{1F6EC}\\x{1F6F0}\\x{1F6F3}-\\x{1F6F9}\\x{1F910}-\\x{1F917}]"
" | [\\x{1F918}-\\x{1F91C}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F91D}"
" | [\\x{1F91E}\\x{1F91F}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | [\\x{1F920}-\\x{1F925}]"
" | \\x{1F926}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F927}-\\x{1F92F}]"
" | [\\x{1F930}-\\x{1F936}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F937}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F938}\\x{1F939}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | \\x{1F93A}"
" | \\x{1F93C}"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" | [\\x{1F93D}\\x{1F93E}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F940}-\\x{1F945}\\x{1F947}-\\x{1F970}\\x{1F973}-\\x{1F976}\\x{1F97A}\\x{1F97C}-\\x{1F9A2}\\x{1F9B0}-\\x{1F9B4}]"
" | [\\x{1F9B5}\\x{1F9B6}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F9B7}"
" | [\\x{1F9B8}\\x{1F9B9}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F9C0}-\\x{1F9C2}\\x{1F9D0}]"
" | [\\x{1F9D1}-\\x{1F9D5}] [\\x{1F3FB}-\\x{1F3FF}]?"
" | \\x{1F9D6}"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F9D7}-\\x{1F9DD}]"
" (?:"
" \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F}"
" | [\\x{1F3FB}-\\x{1F3FF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" )?"
" | [\\x{1F9DE}\\x{1F9DF}]"
" (?: \\x{200D} [\\x{2640}\\x{2642}] \\x{FE0F} )?"
" | [\\x{1F9E0}-\\x{1F9FF}]"
For UTF-16 mode (string), compressed mode:
"[#*0-9]\\uFE0F\\u20E3|[\\u00A9\\u00AE\\u203C\\u2049\\u2122\\u2139\\u2"
"194-\\u2199\\u21A9\\u21AA\\u231A\\u231B\\u2328\\u23CF\\u23E9-\\u23F3\\"
"u23F8-\\u23FA\\u24C2\\u25AA\\u25AB\\u25B6\\u25C0\\u25FB-\\u25FE\\u260"
"0-\\u2604\\u260E\\u2611\\u2614\\u2615\\u2618]|\\u261D(?:\\uD83C[\\uDF"
"FB-\\uDFFF])?|[\\u2620\\u2622\\u2623\\u2626\\u262A\\u262E\\u262F\\u26"
"38-\\u263A\\u2640\\u2642\\u2648-\\u2653\\u265F\\u2660\\u2663\\u2665\\u"
"2666\\u2668\\u267B\\u267E\\u267F\\u2692-\\u2697\\u2699\\u269B\\u269C\\"
"u26A0\\u26A1\\u26AA\\u26AB\\u26B0\\u26B1\\u26BD\\u26BE\\u26C4\\u26C5\\"
"u26C8\\u26CE\\u26CF\\u26D1\\u26D3\\u26D4\\u26E9\\u26EA\\u26F0-\\u26F5"
"\\u26F7\\u26F8]|\\u26F9(?:\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640"
"\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uFE0F)?|[\\u26FA\\u"
"26FD\\u2702\\u2705\\u2708\\u2709]|[\\u270A-\\u270D](?:\\uD83C[\\uDFF"
"B-\\uDFFF])?|[\\u270F\\u2712\\u2714\\u2716\\u271D\\u2721\\u2728\\u273"
"3\\u2734\\u2744\\u2747\\u274C\\u274E\\u2753-\\u2755\\u2757\\u2763\\u27"
"64\\u2795-\\u2797\\u27A1\\u27B0\\u27BF\\u2934\\u2935\\u2B05-\\u2B07\\u"
"2B1B\\u2B1C\\u2B50\\u2B55\\u3030\\u303D\\u3297\\u3299]|\\uD83C(?:[\\u"
"DC04\\uDCCF\\uDD70\\uDD71\\uDD7E\\uDD7F\\uDD8E\\uDD91-\\uDD9A]|\\uDDE"
"6\\uD83C[\\uDDE8-\\uDDEC\\uDDEE\\uDDF1\\uDDF2\\uDDF4\\uDDF6-\\uDDFA\\u"
"DDFC\\uDDFD\\uDDFF]|\\uDDE7\\uD83C[\\uDDE6\\uDDE7\\uDDE9-\\uDDEF\\uDD"
"F1-\\uDDF4\\uDDF6-\\uDDF9\\uDDFB\\uDDFC\\uDDFE\\uDDFF]|\\uDDE8\\uD83C"
"[\\uDDE6\\uDDE8\\uDDE9\\uDDEB-\\uDDEE\\uDDF0-\\uDDF5\\uDDF7\\uDDFA-\\u"
"DDFF]|\\uDDE9\\uD83C[\\uDDEA\\uDDEC\\uDDEF\\uDDF0\\uDDF2\\uDDF4\\uDDF"
"F]|\\uDDEA\\uD83C[\\uDDE6\\uDDE8\\uDDEA\\uDDEC\\uDDED\\uDDF7-\\uDDFA]"
"|\\uDDEB\\uD83C[\\uDDEE-\\uDDF0\\uDDF2\\uDDF4\\uDDF7]|\\uDDEC\\uD83C["
"\\uDDE6\\uDDE7\\uDDE9-\\uDDEE\\uDDF1-\\uDDF3\\uDDF5-\\uDDFA\\uDDFC\\uD"
"DFE]|\\uDDED\\uD83C[\\uDDF0\\uDDF2\\uDDF3\\uDDF7\\uDDF9\\uDDFA]|\\uDD"
"EE\\uD83C[\\uDDE8-\\uDDEA\\uDDF1-\\uDDF4\\uDDF6-\\uDDF9]|\\uDDEF\\uD8"
"3C[\\uDDEA\\uDDF2\\uDDF4\\uDDF5]|\\uDDF0\\uD83C[\\uDDEA\\uDDEC-\\uDDE"
"E\\uDDF2\\uDDF3\\uDDF5\\uDDF7\\uDDFC\\uDDFE\\uDDFF]|\\uDDF1\\uD83C[\\u"
"DDE6-\\uDDE8\\uDDEE\\uDDF0\\uDDF7-\\uDDFB\\uDDFE]|\\uDDF2\\uD83C[\\uD"
"DE6\\uDDE8-\\uDDED\\uDDF0-\\uDDFF]|\\uDDF3\\uD83C[\\uDDE6\\uDDE8\\uDD"
"EA-\\uDDEC\\uDDEE\\uDDF1\\uDDF4\\uDDF5\\uDDF7\\uDDFA\\uDDFF]|\\uDDF4\\"
"uD83C\\uDDF2|\\uDDF5\\uD83C[\\uDDE6\\uDDEA-\\uDDED\\uDDF0-\\uDDF3\\uD"
"DF7-\\uDDF9\\uDDFC\\uDDFE]|\\uDDF6\\uD83C\\uDDE6|\\uDDF7\\uD83C[\\uDD"
"EA\\uDDF4\\uDDF8\\uDDFA\\uDDFC]|\\uDDF8\\uD83C[\\uDDE6-\\uDDEA\\uDDEC"
"-\\uDDF4\\uDDF7-\\uDDF9\\uDDFB\\uDDFD-\\uDDFF]|\\uDDF9\\uD83C[\\uDDE6"
"\\uDDE8\\uDDE9\\uDDEB-\\uDDED\\uDDEF-\\uDDF4\\uDDF7\\uDDF9\\uDDFB\\uDD"
"FC\\uDDFF]|\\uDDFA\\uD83C[\\uDDE6\\uDDEC\\uDDF2\\uDDF3\\uDDF8\\uDDFE\\"
"uDDFF]|\\uDDFB\\uD83C[\\uDDE6\\uDDE8\\uDDEA\\uDDEC\\uDDEE\\uDDF3\\uDD"
"FA]|\\uDDFC\\uD83C[\\uDDEB\\uDDF8]|\\uDDFD\\uD83C\\uDDF0|\\uDDFE\\uD8"
"3C[\\uDDEA\\uDDF9]|\\uDDFF\\uD83C[\\uDDE6\\uDDF2\\uDDFC]|[\\uDE01\\uD"
"E02\\uDE1A\\uDE2F\\uDE32-\\uDE3A\\uDE50\\uDE51\\uDF00-\\uDF21\\uDF24-"
"\\uDF84]|\\uDF85(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDF86-\\uDF93\\uDF9"
"6\\uDF97\\uDF99-\\uDF9B\\uDF9E-\\uDFC1]|\\uDFC2(?:\\uD83C[\\uDFFB-\\u"
"DFFF])?|[\\uDFC3\\uDFC4](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\"
"uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDFC5\\uDFC6"
"]|\\uDFC7(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDFC8\\uDFC9]|\\uDFCA(?:\\"
"u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2"
"640\\u2642]\\uFE0F)?)?|[\\uDFCB\\uDFCC](?:\\uD83C[\\uDFFB-\\uDFFF]("
"?:\\u200D[\\u2640\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uF"
"E0F)?|[\\uDFCD-\\uDFF0]|\\uDFF3(?:\\uFE0F\\u200D\\uD83C\\uDF08)?|\\u"
"DFF4(?:\\u200D\\u2620\\uFE0F|\\uDB40\\uDC67\\uDB40\\uDC62\\uDB40(?:\\"
"uDC65\\uDB40\\uDC6E\\uDB40\\uDC67|\\uDC73\\uDB40\\uDC63\\uDB40\\uDC74"
"|\\uDC77\\uDB40\\uDC6C\\uDB40\\uDC73)\\uDB40\\uDC7F)?|[\\uDFF5\\uDFF7"
"-\\uDFFF])|\\uD83D(?:[\\uDC00-\\uDC40]|\\uDC41(?:\\uFE0F\\u200D\\uD8"
"3D\\uDDE8\\uFE0F)?|[\\uDC42\\uDC43](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\"
"uDC44\\uDC45]|[\\uDC46-\\uDC50](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDC"
"51-\\uDC65]|[\\uDC66\\uDC67](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC68(?"
":\\u200D(?:[\\u2695\\u2696\\u2708]\\uFE0F|\\u2764\\uFE0F\\u200D\\uD83"
"D(?:\\uDC8B\\u200D\\uD83D)?\\uDC68|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDF"
"A4\\uDFA8\\uDFEB\\uDFED]|\\uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?"
"|\\uDC67(?:\\u200D\\uD83D[\\uDC66\\uDC67])?|[\\uDC68\\uDC69]\\u200D\\"
"uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?|\\uDC67(?:\\u200D\\uD83D["
"\\uDC66\\uDC67])?)|[\\uDCBB\\uDCBC\\uDD27\\uDD2C\\uDE80\\uDE92])|\\uD"
"83E[\\uDDB0-\\uDDB3])|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D(?:[\\u2695"
"\\u2696\\u2708]\\uFE0F|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uD"
"FEB\\uDFED]|\\uD83D[\\uDCBB\\uDCBC\\uDD27\\uDD2C\\uDE80\\uDE92]|\\uD8"
"3E[\\uDDB0-\\uDDB3]))?)?|\\uDC69(?:\\u200D(?:[\\u2695\\u2696\\u2708"
"]\\uFE0F|\\u2764\\uFE0F\\u200D\\uD83D(?:\\uDC8B\\u200D\\uD83D)?[\\uDC"
"68\\uDC69]|\\uD83C[\\uDF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uDFEB\\uDFED]"
"|\\uD83D(?:\\uDC66(?:\\u200D\\uD83D\\uDC66)?|\\uDC67(?:\\u200D\\uD83"
"D[\\uDC66\\uDC67])?|\\uDC69\\u200D\\uD83D(?:\\uDC66(?:\\u200D\\uD83D"
"\\uDC66)?|\\uDC67(?:\\u200D\\uD83D[\\uDC66\\uDC67])?)|[\\uDCBB\\uDCB"
"C\\uDD27\\uDD2C\\uDE80\\uDE92])|\\uD83E[\\uDDB0-\\uDDB3])|\\uD83C[\\u"
"DFFB-\\uDFFF](?:\\u200D(?:[\\u2695\\u2696\\u2708]\\uFE0F|\\uD83C[\\u"
"DF3E\\uDF73\\uDF93\\uDFA4\\uDFA8\\uDFEB\\uDFED]|\\uD83D[\\uDCBB\\uDCB"
"C\\uDD27\\uDD2C\\uDE80\\uDE92]|\\uD83E[\\uDDB0-\\uDDB3]))?)?|[\\uDC6"
"A-\\uDC6D]|\\uDC6E(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-"
"\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDC6F(?:\\u200D[\\u2"
"640\\u2642]\\uFE0F)?|\\uDC70(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC71(?"
":\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\"
"u2640\\u2642]\\uFE0F)?)?|\\uDC72(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC"
"73(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u20"
"0D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDC74-\\uDC76](?:\\uD83C[\\uDFFB-\\"
"uDFFF])?|\\uDC77(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\"
"uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDC78(?:\\uD83C[\\uDF"
"FB-\\uDFFF])?|[\\uDC79-\\uDC7B]|\\uDC7C(?:\\uD83C[\\uDFFB-\\uDFFF])"
"?|[\\uDC7D-\\uDC80]|[\\uDC81\\uDC82](?:\\u200D[\\u2640\\u2642]\\uFE0"
"F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uD"
"C83(?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDC84|\\uDC85(?:\\uD83C[\\uDFFB-"
"\\uDFFF])?|[\\uDC86\\uDC87](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C"
"[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDC88-\\uD"
"CA9]|\\uDCAA(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDCAB-\\uDCFD\\uDCFF-\\"
"uDD3D\\uDD49-\\uDD4E\\uDD50-\\uDD67\\uDD6F\\uDD70\\uDD73]|\\uDD74(?:"
"\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD75(?:\\uD83C[\\uDFFB-\\uDFFF](?:\\u2"
"00D[\\u2640\\u2642]\\uFE0F)?|\\uFE0F\\u200D[\\u2640\\u2642]\\uFE0F)?"
"|[\\uDD76-\\uDD79]|\\uDD7A(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDD87\\uD"
"D8A-\\uDD8D]|[\\uDD90\\uDD95\\uDD96](?:\\uD83C[\\uDFFB-\\uDFFF])?|["
"\\uDDA4\\uDDA5\\uDDA8\\uDDB1\\uDDB2\\uDDBC\\uDDC2-\\uDDC4\\uDDD1-\\uDD"
"D3\\uDDDC-\\uDDDE\\uDDE1\\uDDE3\\uDDE8\\uDDEF\\uDDF3\\uDDFA-\\uDE44]|"
"[\\uDE45-\\uDE47](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\"
"uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDE48-\\uDE4A]|\\uDE"
"4B(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u20"
"0D[\\u2640\\u2642]\\uFE0F)?)?|\\uDE4C(?:\\uD83C[\\uDFFB-\\uDFFF])?|"
"[\\uDE4D\\uDE4E](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\u"
"DFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDE4F(?:\\uD83C[\\uDFF"
"B-\\uDFFF])?|[\\uDE80-\\uDEA2]|\\uDEA3(?:\\u200D[\\u2640\\u2642]\\uF"
"E0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|["
"\\uDEA4-\\uDEB3]|[\\uDEB4-\\uDEB6](?:\\u200D[\\u2640\\u2642]\\uFE0F|"
"\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDE"
"B7-\\uDEBF]|\\uDEC0(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDEC1-\\uDEC5\\u"
"DECB]|\\uDECC(?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDECD-\\uDED2\\uDEE0-"
"\\uDEE5\\uDEE9\\uDEEB\\uDEEC\\uDEF0\\uDEF3-\\uDEF9])|\\uD83E(?:[\\uDD"
"10-\\uDD17]|[\\uDD18-\\uDD1C](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD1D|"
"[\\uDD1E\\uDD1F](?:\\uD83C[\\uDFFB-\\uDFFF])?|[\\uDD20-\\uDD25]|\\uD"
"D26(?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u2"
"00D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDD27-\\uDD2F]|[\\uDD30-\\uDD36]("
"?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDD37(?:\\u200D[\\u2640\\u2642]\\uFE0"
"F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\u"
"DD38\\uDD39](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFF"
"F](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|\\uDD3A|\\uDD3C(?:\\u200D[\\"
"u2640\\u2642]\\uFE0F)?|[\\uDD3D\\uDD3E](?:\\u200D[\\u2640\\u2642]\\u"
"FE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|"
"[\\uDD40-\\uDD45\\uDD47-\\uDD70\\uDD73-\\uDD76\\uDD7A\\uDD7C-\\uDDA2\\"
"uDDB0-\\uDDB4]|[\\uDDB5\\uDDB6](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDDB"
"7|[\\uDDB8\\uDDB9](?:\\u200D[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-"
"\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uDDC0-\\uDDC2\\uDDD"
"0]|[\\uDDD1-\\uDDD5](?:\\uD83C[\\uDFFB-\\uDFFF])?|\\uDDD6(?:\\u200D"
"[\\u2640\\u2642]\\uFE0F|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u"
"2642]\\uFE0F)?)?|[\\uDDD7-\\uDDDD](?:\\u200D[\\u2640\\u2642]\\uFE0F"
"|\\uD83C[\\uDFFB-\\uDFFF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?)?|[\\uD"
"DDE\\uDDDF](?:\\u200D[\\u2640\\u2642]\\uFE0F)?|[\\uDDE0-\\uDDFF])"