From WikiChip
Editing mirc/identifiers/$isutf

Latest revision Your text
Line 1: Line 1:
{{mirc title|$isutf Identifier}}'''$isutf''' returns a status code for the given text: 0 = not UTF8 (contains an invalid UTF8 sequence), 1 = seems to be plain text, 2 = seems to contain valid UTF8. In older 6.x versions, the result can change depending on which UTF8 setting is active in the /font dialog.
+
{{mirc title|$isutf Identifier}}'''$isutf''' returns a status code for the given text: 0 = not UTF8 (contains an invalid UTF8 sequence), 1 = seems to be plain text, 2 = seems to contain valid UTF8.
 +
 
  
 
== Synopsis ==
 
== Synopsis ==
Line 5: Line 6:
  
 
== Parameters ==
 
== Parameters ==
 +
 
* '''text''' - The text you want the status of
 
* '''text''' - The text you want the status of
  
Line 13: Line 15:
 
<source lang="mIRC">//echo -a $isutf(é) $isutf($utfencode(é)) $isutf(plain)</source>
 
<source lang="mIRC">//echo -a $isutf(é) $isutf($utfencode(é)) $isutf(plain)</source>
  
Note that $isutf indicates whether the text contains the codepoints of a UTF8 byte sequence, not whether the input itself is a UTF8 string, which all %strings in a Unicode-aware client should be. This is why the next command returns 0 for the first call and 2 for the second.
 
 
<source lang="mIRC">//echo -a $isutf($chr(233)) vs $isutf($chr(195) $+ $chr(169))</source>
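
As a reading aid, the three status codes can be mapped to labels. The following is a minimal sketch, not part of mIRC: the alias name utfstatus and the label strings are hypothetical, and only $isutf and its return values of 0, 1 and 2 come from this page.

<source lang="mIRC">
; Hypothetical helper: translate the $isutf status code into a label.
alias utfstatus {
  var %status = $isutf($1-)
  if (%status == 0) return invalid UTF8
  if (%status == 1) return plain text
  if (%status == 2) return valid UTF8
  ; Fallback in case $isutf returns something other than 0, 1 or 2.
  return unknown
}
</source>

Once loaded as a remote script, it can be called as an identifier, for example:

<source lang="mIRC">//echo -a $utfstatus($chr(233)) vs $utfstatus($chr(195) $+ $chr(169))</source>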
 
 
To test whether a &binvar contains a UTF8 string, you can take advantage of the $regsubex feature that writes its output string into a binvar. If the input is $bvar(&var1,1-).text, you can test whether the output &var2 is created as an exact replica of &var1. In the example below, $isutf returns 0 for the text of both binvars. The isbinvarutf alias, on the other hand, returns 2 for &v1, which contains a UTF8 byte sequence, but returns 0 for &v2, because the cloned UTF8 output from $regsubex is not the same bytes as the original. Note that there is a limit to how long a binvar can be tested using this method, because $regsubex permits the $2 string to contain no more than approximately $maxlenl *bytes*, even when that string has fewer than 4000 UTF8 *characters*.
 
 
<source lang="mIRC">
 
//bset &v1 1 195 169 | bset &v2 1 233 | var -s %a1 $bvar(&v1,1-).text , %a2 $bvar(&v2,1-).text | echo -a $isutf(%a1) $isutf(%a2) vs $isbinvarutf(&v1) $isbinvarutf(&v2)
 
 
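alias isbinvarutf {
  ; Return 0 for an empty or missing binvar. Otherwise $v1, the left side of the comparison, holds its size.
  if ($bvar($1,0) == 0) return 0 | var %len1 $v1
  ; Round-trip the bytes: read them as text and let $regsubex write that text out to &tempvar2.
  noop $regsubex(foo,$bvar($1,1-).text,,,&tempvar2)
  ; If the sizes differ, the round-trip changed the bytes, so return 0.
  var %len2 $bvar(&tempvar2,0) | if (%len1 != %len2) return 0
  ; Short binvars: compare the two byte lists directly as strings.
  if ($calc(%len1 + %len2) < 2000) { if ($bvar($1,1-) == $bvar(&tempvar2,1-)) return 2 | else return 0 }
  ; Longer binvars: compare the SHA-256 hashes of the two binvars instead.
  else { if ($sha256($1,1) == $sha256(&tempvar2,1)) return 2 | else return 0 }
}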
 
</source>
 
 
== Compatibility ==
 
== Compatibility ==
 
{{mIRC compatibility|6.17}}
 
{{mIRC compatibility|6.17}}
Line 36: Line 21:
 
{{mIRC|$utfencode}}
 
{{mIRC|$utfencode}}
 
{{mIRC|$utfdecode}}
 
{{mIRC|$utfdecode}}
 +
 +
[[Category:mIRC identifiers]]
