gSOAP DOM parser problem

I am trying to parse a simple XML document containing UTF-8 encoded Cyrillic characters with the gSOAP 2.8.10 DOM parser. I created a VC++ console application and added soapC.cpp and soapns.cpp to the project.

soapns.cpp:

#include <soap.nsmap>

soap.nsmap:

#include "soapH.h" 
SOAP_NMAC struct Namespace namespaces[] = 
{ 
    {"SOAP-ENV", "http://schemas.xmlsoap.org/soap/envelope/", "http://www.w3.org/*/soap-envelope", NULL}, 
    {"SOAP-ENC", "http://schemas.xmlsoap.org/soap/encoding/", "http://www.w3.org/*/soap-encoding", NULL}, 
    {"xsi", "http://www.w3.org/2001/XMLSchema-instance", "http://www.w3.org/*/XMLSchema-instance", NULL}, 
    {"xsd", "http://www.w3.org/2001/XMLSchema", "http://www.w3.org/*/XMLSchema", NULL}, 
    {"ns2", "http://schemas.microsoft.com/2003/10/Serialization/", NULL, NULL}, 
    {"ns1", "http://asp.net/ApplicationServices/v200", NULL, NULL}, 
    {"ns3", "http://tempuri.org/", NULL, NULL}, 
    {NULL, NULL, NULL, NULL} 
};

soapC.cpp, soapH.h, and soap.nsmap are generated with the soapcpp2.exe utility.

main.cpp:

#include <stdsoap2.h> 
#include <string> 
#include <sstream> 
#include <iomanip> 
#include <iostream> 
#include <tchar.h> 

// Print every byte of str as a two-digit uppercase hex value.
void print_in_hex(const std::string& str)
{
    std::string::const_iterator ch;
    for (ch = str.begin(); ch != str.end(); ++ch)
    {
        // Cast through unsigned char so bytes >= 0x80 are not sign-extended.
        std::cout << std::hex
                  << std::setw(2) << std::setfill('0') << std::uppercase
                  << static_cast<unsigned int>(static_cast<unsigned char>(*ch)) << " ";
    }
    std::cout << std::endl;
}

// Sample XML content 

// Adjacent string literals are concatenated by the compiler, which avoids
// fragile backslash line continuations inside a literal.
const std::string Xml =
    "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
    "<entry>"
    "<properties>"
    "<Id>a8a4cf87-9497-4078-9166-0737a55ca7fc</Id>"
    "<Name>\xD0\x9D\xD0\xBE\xD0\xB2\xD0\xB0\xD1\x8F\x20\xD0\xBA"
    "\xD0\xBE\xD0\xBB\xD0\xBB\xD0\xB5\xD0\xBA\xD1\x86\xD0\xB8\xD1\x8F</Name>"
    "</properties>"
    "</entry>";

// Expected UTF-8 bytes of the <Name> element content.
const std::string correctName =
    "\xD0\x9D\xD0\xBE\xD0\xB2\xD0\xB0\xD1\x8F\x20\xD0\xBA"
    "\xD0\xBE\xD0\xBB\xD0\xBB\xD0\xB5\xD0\xBA\xD1\x86\xD0\xB8\xD1\x8F";

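int _tmain(int argc, _TCHAR* argv[])
{
    // Feed the sample XML to the DOM parser through a stream.
    std::stringstream inputStream;
    inputStream.str(Xml);
    struct soap_dom_element entry(soap_new());
    // Build a DOM tree and keep character data as UTF-8 strings.
    soap_set_mode(entry.soap, SOAP_DOM_TREE | SOAP_C_UTFSTRING);
    inputStream >> entry;
    // Locate the <Name> element in any namespace.
    soap_dom_element_iterator it = entry.find(NULL, "Name");
    if (it != entry.end())
    {
        std::cout << "Original content:" << std::endl;
        print_in_hex(correctName);
        // Guard against a NULL data pointer before building a std::string.
        std::string name = (*it).data ? (*it).data : "";
        std::cout << "Parsed content:" << std::endl;
        print_in_hex(name);
    }
    return 0;
}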

Output (the XML is read from a stream):

Original content: 
D0 9D D0 BE D0 B2 D0 B0 D1 8F 20 D0 BA D0 BE D0 BB D0 BB D0 B5 D0 BA D1 86 D0 B8 D1 8F 
Parsed content: 
C3 90 9D D0 BE D0 B2 D0 B0 D1 8F 20 D0 BA D0 BE D0 BB D0 BB D0 B5 D0 BA D1 86 D0 B8 D1 8F

gSOAP puts the two bytes 0xC3 0x90 in place of the first byte, 0xD0, of the original <Name> content. As a result, when the text is converted from UTF-8 to Windows-1251, '??овая коллекция' is displayed instead of 'Новая коллекция'. Does anyone know how to fix this? Thanks!
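A note on the byte pattern: 0xC3 0x90 is exactly the UTF-8 encoding of the code point U+00D0, which is what you get if the single byte 0xD0 is treated as a Latin-1 character and then encoded as UTF-8 again. Whether that is what actually happens inside gSOAP 2.8.10 is an assumption, not something confirmed here. The sketch below only demonstrates the arithmetic; `encode_utf8` and `reencode_byte_as_latin1` are hypothetical helper names, not gSOAP functions.

```cpp
#include <string>

// Encode a single Unicode code point (up to U+10FFFF) as UTF-8.
std::string encode_utf8(unsigned int cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: treat one raw byte as a Latin-1 character
// (code point == byte value) and encode that code point as UTF-8.
std::string reencode_byte_as_latin1(unsigned char byte)
{
    return encode_utf8(byte);
}
```

Calling `reencode_byte_as_latin1(0xD0)` yields the two bytes C3 90, matching the start of the parsed content, while the remaining bytes of the output are unchanged from the input.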

Source

2013-08-29 Yury Rudakou

This issue was fixed in gSOAP 2.8.16.
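If it is unclear whether a particular build is affected, one possible check is to test whether the parsed text is still structurally well-formed UTF-8. In the output above, the corrupted sequence C3 90 9D ... contains a continuation byte (0x9D) where a lead byte is expected, so a simple structural check rejects it. The following is a minimal sketch; `is_well_formed_utf8` is a hypothetical helper, not part of gSOAP, and it deliberately skips stricter rules such as rejecting overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <string>

// Returns true if every lead byte in s is followed by the expected number
// of continuation bytes (10xxxxxx). Structural check only.
bool is_well_formed_utf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra;
        if (lead < 0x80)                extra = 0; // ASCII
        else if ((lead & 0xE0) == 0xC0) extra = 1; // 2-byte sequence
        else if ((lead & 0xF0) == 0xE0) extra = 2; // 3-byte sequence
        else if ((lead & 0xF8) == 0xF0) extra = 3; // 4-byte sequence
        else return false;                         // stray continuation or invalid lead

        // The sequence must not run past the end of the string.
        if (extra > s.size() - i - 1) return false;

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

Passing the string held in `(*it).data` from the question's program to this function after parsing gives a quick pass/fail signal without comparing against a known expected value.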

Source

2013-08-30 11:53:29
