Is there a standardized translation of Unicode character names?

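Every code point in the Unicode Standard has a unique English name attached to it. I need to translate these names (for a small subset of the code points) into languages such as German, French, and Japanese. I have access to professional translators, so I could have the names translated one by one, but the result would not necessarily reflect the intent of the Unicode Standard. Has the Unicode Consortium already made an effort to standardize code point names in languages other than English, so that I could simply refer to its translations? I could not find anything other than English on unicode.org, but I hope I have missed something. Thanks in advance!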

Source

2017-12-05 vschoech

.NET / PowerShell example (the type is defined by the Add-Type call in the script below): [Microsofts.CharMap.UName]::Get('č')

Windows OS: localized Unicode properties (at least the names) are stored in the localized library getuname.dll. Use the following script directly, or take inspiration from it:

<#
Origin by: http://poshcode.org/5234
Improved by: https://stackoverflow.com/users/3439404/josefz

Use this like this:
    "ábč",([char]'x'),0xBF | Get-CharInfo

Activate dot-sourced like this (apply a real path instead of .\):
    . .\_get-CharInfo_1.1.ps1
#>
Set-StrictMode -Version latest
Add-Type -Name UName -Namespace Microsofts.CharMap -MemberDefinition $(
    switch ("$([System.Environment]::SystemDirectory -replace '\\', '\\')\\getuname.dll") {
    {Test-Path -LiteralPath $_ -PathType Leaf} {@"
[DllImport("${_}", ExactSpelling=true, SetLastError=true)]
private static extern int GetUName(ushort wCharCode,
    [MarshalAs(UnmanagedType.LPWStr)] System.Text.StringBuilder buf);

public static string Get(char ch) {
    var sb = new System.Text.StringBuilder(300);
    UName.GetUName(ch, sb);
    return sb.ToString();
}
"@
    }
    default {'public static string Get(char ch) { return "???"; }'}
    })

function Get-CharInfo {
    [CmdletBinding()]
    [OutputType([System.Management.Automation.PSCustomObject],[System.Array])]
    param(
        [Parameter(Position=0, Mandatory=$true, ValueFromPipeline=$true)]
        $InputObject
    )
    begin {
        function out {
            param(
                [Parameter(Position=0, Mandatory=$true)] $ch,
                [Parameter(Position=1, Mandatory=$false)]$nil=''
            )
            if (0 -le $ch -and 0xFFFF -ge $ch) {
                [pscustomobject]@{
                    Char        = [char]$ch
                    CodePoint   = 'U+{0:X4}' -f $ch
                    Category    = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($ch)
                    Description = [Microsofts.CharMap.UName]::Get($ch)
                }
            } elseif (0 -le $ch -and 0x10FFFF -ge $ch) {
                $s = [char]::ConvertFromUtf32($ch)
                [pscustomobject]@{
                    Char        = $s
                    CodePoint   = 'U+{0:X}' -f $ch
                    Category    = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($s, 0)
                    Description = '???' + $nil
                }
            } else {
                Write-Warning ('Character U+{0:X} is out of range' -f $ch)
            }
        }
    }
    process {
        if ($PSBoundParameters['Verbose']) {
            Write-Warning "InputObject type = $($InputObject.GetType().Name)"}
        if ($null -cne ($InputObject -as [char])) {
            #Write-Verbose "A $([char]$InputObject) InputObject character"
            out $([int][char]$InputObject) ''
        } elseif ($InputObject -isnot [string] -and $null -cne ($InputObject -as [int])) {
            #Write-Verbose "B $InputObject InputObject"
            out $([int]$InputObject) ''
        } else {
            $InputObject = [string]$InputObject
            #Write-Verbose "C $InputObject InputObject.Length $($InputObject.Length)"
            for ($i = 0; $i -lt $InputObject.Length; ++$i) {
                if ( [char]::IsHighSurrogate($InputObject[$i]) -and
                     (1+$i) -lt $InputObject.Length -and
                     [char]::IsLowSurrogate($InputObject[$i+1])) {
                    $aux = ' 0x{0:x4},0x{1:x4}' -f [int]$InputObject[$i],
                                                   [int]$InputObject[$i+1]
                    Write-Verbose "surrogate pair $aux at position $i"
                    out $([char]::ConvertToUtf32($InputObject[$i], $InputObject[1+$i])) $aux
                    $i++
                } else {
                    out $([int][char]$InputObject[$i]) ''
                }
            }
        }
    }
}

Example. The latter (semi-localized) output comes from the command shown in the note at the end:

PS D:\PShell> "ábč",([char]'x'),0xBF | Get-CharInfo

Char CodePoint Category         Description
---- --------- --------         -----------
á    U+00E1    LowercaseLetter  Latin Small Letter A With Acute
b    U+0062    LowercaseLetter  Latin Small Letter B
č    U+010D    LowercaseLetter  Latin Small Letter C With Caron
x    U+0078    LowercaseLetter  Latin Small Letter X
¿    U+00BF    OtherPunctuation Inverted Question Mark

PS D:\PShell> Get-Content .\DataFiles\getcharinfoczech.txt

Char CodePoint Category         Description
---- --------- --------         -----------
á    U+00E1    LowercaseLetter  Malé písmeno latinky a s čárkou nad vpravo
b    U+0062    LowercaseLetter  Malé písmeno latinky b
č    U+010D    LowercaseLetter  Malé písmeno latinky c s háčkem
x    U+0078    LowercaseLetter  Malé písmeno latinky x
¿    U+00BF    OtherPunctuation Znak obráceného otazníku

PS D:\PShell>

Note (run as a localized user on the same computer):

"ábč",([char]'x'),0xBF | Get-CharInfo | Out-File .\DataFiles\getcharinfoczech.txt

Source

2017-12-27 09:37:12 JosefZ
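
The question asks about a small subset of code points, and the answer captures localized names by running Get-CharInfo under a localized user. Those two points could be combined roughly as follows. This is only a sketch and not part of the original script: the function name Get-CharNameTable, its -NameLookup parameter, and the Culture column are hypothetical, and labelling rows with the current UI culture assumes that getuname.dll follows the UI language of the user running it, which the answer suggests but does not state.

```powershell
# Hypothetical helper (not from the original answer): builds one row per code point.
# By default it calls Get-CharInfo, which must already be dot-sourced from the script above.
function Get-CharNameTable {
    param(
        [Parameter(Mandatory=$true)]
        [int[]] $CodePoint,

        # Replaceable lookup so the helper can be tried without getuname.dll.
        [scriptblock] $NameLookup = { param($cp) ($cp | Get-CharInfo).Description }
    )
    # Assumption: the localized names follow the current user's UI language.
    $culture = [System.Globalization.CultureInfo]::CurrentUICulture.Name
    foreach ($cp in $CodePoint) {
        [pscustomobject]@{
            CodePoint = 'U+{0:X4}' -f $cp
            Culture   = $culture
            Name      = & $NameLookup $cp
        }
    }
}
```

Run once per localized user, it could produce one file per language for translators to review, for example:

```powershell
Get-CharNameTable -CodePoint 0xE1, 0x10D, 0xBF |
    Export-Csv -Path .\charnames.csv -Encoding UTF8 -NoTypeInformation
```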
