Solving the Chinese Encoding Problem When an XMLHTTP Client Exchanges XML with ASP

Tested on: Windows XP Professional (Chinese edition), MSXML 4.0 SP2, C#

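  Any programmer who has used XMLHTTP as a client and then tried to talk to a server-side ASP page has, I think, the right idea (which is of course also a way of praising myself :)). The biggest headache, though, is probably garbled Chinese text. I went through a lot of material on MSDN and around the Internet and tried many approaches without much success. Fortunately I did not give up, and here is the newest and simplest fix: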

Change the header of the XML that the client sends from:

<?xml version="1.0" encoding="gb2312" ?>

to:

<?xml version="1.0" encoding="utf-8" ?>

When the server-side ASP page sends its XML result back to the client, it needs to add:

Response.ContentType = "text/xml"
Response.CharSet = "gb2312"

On the client, reading the returned result with XmlDom.loadXml(xmlhttp.responseText) is enough.
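The client-side half of the fix amounts to making the declared encoding and the actual bytes agree. The following is a minimal sketch in Python rather than C# or ASP; the helper name `encode_with_declaration` and the sample `<query>` document are hypothetical and only illustrate the idea.

```python
import re

# Matches the encoding="..." part of a leading XML declaration.
_DECL = re.compile(r'^(<\?xml\b[^>]*?\bencoding=)(["\'])([^"\']*)\2')


def encode_with_declaration(xml_text: str, encoding: str) -> bytes:
    """Rewrite the declared encoding, then encode the text with that same encoding."""
    rewritten = _DECL.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{encoding}{m.group(2)}",
        xml_text,
        count=1,
    )
    return rewritten.encode(encoding)


# Hypothetical request document that still declares gb2312.
original = '<?xml version="1.0" encoding="gb2312" ?><query><name>中文</name></query>'

# After the rewrite, the declaration says utf-8 and the bytes really are UTF-8.
body = encode_with_declaration(original, "utf-8")
print(body[:40])
```

Because the declaration and the byte encoding are changed together, a receiver that trusts the declaration decodes the body correctly.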

 ============================================================================

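An analysis of the possible causes follows:

One possibility is that our operating system itself uses UTF-8 encoding.

If you write every entry of Request.ServerVariables to a text file, you will find something like this: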

ALL_HTTP:HTTP_ACCEPT:*/*
HTTP_ACCEPT_LANGUAGE:zh-cn
HTTP_CONNECTION:Keep-Alive
HTTP_HOST:localhost
HTTP_USER_AGENT:Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)
HTTP_COOKIE:ASPSESSIONIDAQBCSQRA=FNEHNOCCMHECCOPIOKKECEFM
HTTP_CONTENT_LENGTH:94
HTTP_CONTENT_TYPE:text/xml;charset=gb2312
HTTP_ACCEPT_ENCODING:gzip, deflate
HTTP_CACHE_CONTROL:no-cache

ALL_RAW:Accept: */*
Accept-Language: zh-cn
Connection: Keep-Alive
Host: localhost
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)
Cookie: ASPSESSIONIDAQBCSQRA=FNEHNOCCMHECCOPIOKKECEFM
Content-Length: 94
Content-Type: text/xml;charset=gb2312
Accept-Encoding: gzip, deflate
Cache-Control: no-cache

APPL_MD_PATH:/LM/W3SVC/1/Root/zdqs
APPL_PHYSICAL_PATH:C:\Inetpub\systems\ZDS\qry\
AUTH_PASSWORD:
AUTH_TYPE:
AUTH_USER:
CERT_COOKIE:
CERT_FLAGS:
CERT_ISSUER:
CERT_KEYSIZE:
CERT_SECRETKEYSIZE:
CERT_SERIALNUMBER:
CERT_SERVER_ISSUER:
CERT_SERVER_SUBJECT:
CERT_SUBJECT:
CONTENT_LENGTH:94
CONTENT_TYPE:text/xml;charset=gb2312
GATEWAY_INTERFACE:CGI/1.1
HTTPS:off
HTTPS_KEYSIZE:
HTTPS_SECRETKEYSIZE:
HTTPS_SERVER_ISSUER:
HTTPS_SERVER_SUBJECT:
INSTANCE_ID:1
INSTANCE_META_PATH:/LM/W3SVC/1
LOCAL_ADDR:127.0.0.1
LOGON_USER:
PATH_INFO:/zdqs/QURY.asp
PATH_TRANSLATED:C:\Inetpub\systems\ZDS\qry\QURY.asp
QUERY_STRING:
REMOTE_ADDR:127.0.0.1
REMOTE_HOST:127.0.0.1
REMOTE_USER:
REQUEST_METHOD:POST
SCRIPT_NAME:/zdqs/QURY.asp
SERVER_NAME:localhost
SERVER_PORT:80
SERVER_PORT_SECURE:0
SERVER_PROTOCOL:HTTP/1.1
SERVER_SOFTWARE:Microsoft-IIS/5.1

HTTP_ACCEPT:*/*
HTTP_ACCEPT_LANGUAGE:zh-cn
HTTP_CONNECTION:Keep-Alive
HTTP_HOST:localhost
HTTP_USER_AGENT:Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1)
HTTP_COOKIE:ASPSESSIONIDAQBCSQRA=FNEHNOCCMHECCOPIOKKECEFM
HTTP_CONTENT_LENGTH:94
HTTP_CONTENT_TYPE:text/xml;charset=gb2312
HTTP_ACCEPT_ENCODING:gzip, deflate
HTTP_CACHE_CONTROL:no-cache

Conjecture 1: the encoding used during network transmission is gb2312.

Next, consider this help topic from the MSXML4 SDK:
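Conjecture 1 rests on the charset parameter of the Content-Type request header. As a rough illustration, here is how that parameter can be pulled out of the header value with Python's standard library. The function name and the `utf-8` fallback are assumptions for this sketch. Note also that the header only states what the sender claims; it does not prove what the body bytes actually are.

```python
from email.message import Message


def charset_from_content_type(value: str, default: str = "utf-8") -> str:
    """Return the charset parameter of a Content-Type value, or a fallback."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


# Header value taken from the ServerVariables dump above.
declared = charset_from_content_type("text/xml;charset=gb2312")
print(declared)

# A header without a charset parameter falls back to the assumed default.
fallback = charset_from_content_type("text/xml")
print(fallback)
```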

 

Enforcing Character Encoding with DOM

In some cases, an XML document is passed to and processed by an application—for example, an ASP page—that cannot properly decode rare or new characters. When this happens, you might be able to work around the problem by relying on DOM to handle the character encoding. This bypasses the incapable application.

For example, the following XML document contains the character entity ("&#8364;") that corresponds to the Euro currency symbol (€). The ASP page, incapable.asp, cannot process currency.xml.

XML Data (currency.xml)

<?xml version="1.0" encoding="utf-8"?>
<currency>
   <name>Euro</name>
   <symbol>&#8364;</symbol>
   <exchange>
      <base>US$</base>
      <rate>1.106</rate>
   </exchange>
</currency>

ASP Page (incapable.asp)

<%@ language = "javascript" %>
<%
   var doc = new ActiveXObject("Msxml2.DOMDocument.4.0");
   doc.async = false;
   if (doc.load(Server.MapPath("currency.xml"))==true) {
      Response.ContentType = "text/xml";
      Response.Write(doc.xml);
   }
%>

When incapable.asp is opened from a Web browser, an error such as the following results:

An invalid character was found in text content. Error processing resource 'http://MyWebServer/MyVirtualDirectory/incapable.asp'. Line 4, Position 10

This error is caused by the use of the Response.Write(doc.xml) instruction in the incapable.asp code. Because it calls upon ASP to encode/decode the Euro currency symbol character found in currency.xml, it fails.

However, you can fix this error. To do so, replace this Response.Write(doc.xml) instruction in incapable.asp with the following line:

doc.save(Response);

With this line, the error does not occur. The ASP code does produce the correct output in a Web browser, as follows:

  <?xml version="1.0" encoding="utf-8" ?>
  <currency>
    <name>Euro</name>
    <symbol>€</symbol>
    <exchange>
      <base>US$</base>
      <rate>1.106</rate>
    </exchange>
  </currency>

The effect of the change in the ASP page is to let the DOM object (doc)—instead of the Response object on the ASP page—handle the character encoding.

Note the last sentence: the change to the ASP page lets the DOM object (doc), rather than the ASP Response object, handle the character encoding.
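The contrast between Response.Write(doc.xml) and doc.save(Response) can be imitated outside MSXML. The sketch below is Python, with the standard `xml.etree.ElementTree` module standing in for the DOM object. Using GB2312 as the code page of the outer layer is an assumption chosen for illustration, not a statement about how ASP is configured on any particular server.

```python
import xml.etree.ElementTree as ET

# Same shape as currency.xml: the Euro sign arrives as a character reference.
root = ET.fromstring("<currency><symbol>&#8364;</symbol></currency>")

# Path 1 (like Response.Write(doc.xml)): the serializer hands over a string,
# and some outer layer must encode it. If that layer uses a legacy code page
# without the Euro sign (GB2312 is assumed here), the encoding step fails.
text = ET.tostring(root, encoding="unicode")
try:
    text.encode("gb2312")
    outer_layer_failed = False
except UnicodeEncodeError:
    outer_layer_failed = True

# Path 2 (like doc.save(Response)): the serializer produces the bytes itself,
# in an encoding that can represent every character in the document.
data = ET.tostring(root, encoding="utf-8")

print(outer_layer_failed)
print("€".encode("utf-8") in data)
```

In the second path no intermediate layer ever has to guess how to turn the string into bytes, which is the point the SDK topic makes.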

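This leads to:

Conjecture 2: when you use the DOM object's load and save methods, you can treat the Request or Response object as a file handle.

Conjectures 1 and 2 together give:

Conjecture 3: the strings used by the system on which the client was compiled are themselves GB2312-encoded, and when XMLHTTP transmits the data it is automatically converted to GB2312. When the server loads it with the DOM object's load, that is equivalent to loading a byte stream. load looks at the encoding in the XML header, sees GB2312, and therefore performs no conversion, treating the byte stream directly as a string. What it overlooks is that my system treats this string as UTF-8 when displaying it, so all that is needed is to force the XML to be converted. I seem to recall that other people's solutions also included a hand-written gb2312-to-utf-8 conversion function.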
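Whatever the exact mechanism inside XMLHTTP and the DOM, the symptom in Conjecture 3 is the usual one: bytes written under one encoding are read under another. A small Python sketch of that mismatch follows. It uses the two-character string "中文" as sample data and makes no claim about which side of the real exchange performs which step.

```python
text = "中文"

utf8_bytes = text.encode("utf-8")    # b'\xe4\xb8\xad\xe6\x96\x87'
gb_bytes = text.encode("gb2312")     # b'\xd6\xd0\xce\xc4'

# Matching encoder and decoder: both round-trip cleanly.
assert utf8_bytes.decode("utf-8") == text
assert gb_bytes.decode("gb2312") == text

# Mismatch 1: GB2312 bytes read as UTF-8 are not valid UTF-8 at all.
try:
    gb_bytes.decode("utf-8")
    gb_as_utf8_ok = True
except UnicodeDecodeError:
    gb_as_utf8_ok = False

# Mismatch 2: UTF-8 bytes read leniently as GB2312 yield garbled text.
garbled = utf8_bytes.decode("gb2312", errors="replace")

print(gb_as_utf8_ok)
print(repr(garbled))
```

This is why changing only the declaration, or only the byte encoding, is not enough: the two have to agree at every hop.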

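Finally, I tried it in practice, and it works.

To sum up in one sentence: every XML document the client sends to the server declares its encoding as utf-8, every response the server sends to the client specifies its charset as gb2312, and everything is OK.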
