Rare characters from the extra-large character set still won't display

Posted 2007-10-7 21:41:34
  When a post contains a rare character, everything after that character fails to display once the post is submitted. Is there any way to fix this?
Posted 2007-10-8 00:02:01

There is a fix in the thread below, but it is complicated and risky. That's what you get with free software; nothing to be done about it.

http://bbs.cantonese.asia/viewthread.php?tid=7978&extra=page%3D1

Posted 2007-10-8 22:00:20
I'm not familiar with this forum software. Would it work to switch the font to SimSun (宋體) and add 海峰's large character set?
Posted 2007-10-10 00:30:12
Hopefully Microsoft will release a Cantonese-language operating system soon.
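Posted 2007-12-9 18:06:56

Originally posted by Ultra on 2007-10-8 00:02: There is a fix in the thread below, but it is complicated and risky. That's what you get with free software; nothing to be done about it. http://bbs.cantonese.asia/viewthread.php?tid=7978&extra=page%3D1

I wonder whether the 漢典 forum uses this method? 漢典 also runs Discuz! 6.0.0.

If they have some other approach, we could go and ask them.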

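Posted 2007-12-14 12:32:14

I found this on the official MySQL site at http://bugs.mysql.com/bug.php?id=14052. My English isn't great, so could everyone take a look and see whether it offers a way to fix the forum not displaying Unicode Han characters? Some people say the display failure is a MySQL bug, not a Discuz bug.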

 

Description:
At Wikipedia we have some data which contains high Unicode characters beyond the Basic
Multilingual Plane (code points of 65536 and above). In UTF-8 encoding these take up 4 bytes;
in UTF-16 they would be stored as a "surrogate pair" of two 16-bit pseudocharacters.
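
To make the size difference concrete, here is a minimal PHP sketch (assuming PHP 7.0 or later and valid UTF-8 input) that reports how many bytes each character of a string takes and whether any character lies outside the BMP. The function names `utf8_char_sizes` and `has_supplementary_char` are hypothetical helpers, not part of the report.

```php
<?php
// Split a UTF-8 string into characters and report each one's encoded size.
// Assumes valid UTF-8; preg_split() returns false on malformed input.
function utf8_char_sizes(string $s): array {
    $sizes = [];
    foreach (preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        $sizes[] = strlen($ch); // strlen() counts bytes, not characters
    }
    return $sizes;
}

// True if the string contains any code point above U+FFFF (outside the BMP).
function has_supplementary_char(string $s): bool {
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $s) === 1;
}

// An ASCII letter, a BMP Han character (U+4E2D), and the report's test character.
$sample = "a\u{4E2D}\xf0\xa8\xa7\x80";
echo implode(",", utf8_char_sizes($sample)), "\n";           // 1,3,4
echo has_supplementary_char($sample) ? "yes" : "no", "\n";  // yes
```

A check like this could run before a post is saved, to warn that the text contains characters the storage layer may not handle.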

Currently MySQL's Unicode charset support doesn't seem to allow storing these characters
in text-encoded fields when using UTF-8 to communicate to the server:
* ucs2 stores four question marks "????" in place of the char
* utf8 truncates the string at the point the char appears
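
The two symptoms can be imitated in plain PHP without a database. The sketch below is illustrative only: it mimics the observable result described in the list above, not MySQL's internal logic, and the function names are made up for this example.

```php
<?php
// Matches any code point outside the BMP (encoded as 4 bytes in UTF-8).
const SUPPLEMENTARY = '/[\x{10000}-\x{10FFFF}]/u';

// utf8 column symptom: the string is cut at the first 4-byte character.
function simulate_utf8_truncation(string $s): string {
    if (preg_match(SUPPLEMENTARY, $s, $m, PREG_OFFSET_CAPTURE)) {
        return substr($s, 0, $m[0][1]); // the captured offset is in bytes
    }
    return $s;
}

// ucs2 column symptom: each 4-byte character is replaced by "????".
function simulate_ucs2_replacement(string $s): string {
    return preg_replace(SUPPLEMENTARY, '????', $s);
}

$post = "abc\xf0\xa8\xa7\x80def";
echo simulate_utf8_truncation($post), "\n";   // abc
echo simulate_ucs2_replacement($post), "\n";  // abc????def
```

The truncation case matches what the forum thread describes: everything after the rare character is lost once the post is stored.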

Some quick testing indicates that I can store surrogate pseudocharacters if I explicitly
code for them in pseudo-UTF-8, but this complicates communicating with the server.
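
One plausible reading of "pseudo-UTF-8" is that each UTF-16 surrogate is encoded separately as a 3-byte sequence, the scheme Unicode calls CESU-8. That interpretation is an assumption; the report does not name the scheme. A minimal sketch for a single well-formed 4-byte character, with hypothetical helper names:

```php
<?php
// Encode one 16-bit code unit (0x0800..0xFFFF) as a 3-byte UTF-8-style sequence.
function encode_3byte(int $unit): string {
    return chr(0xE0 | ($unit >> 12))
         . chr(0x80 | (($unit >> 6) & 0x3F))
         . chr(0x80 | ($unit & 0x3F));
}

// Re-encode one 4-byte UTF-8 character as two 3-byte sequences, one per surrogate.
// Assumes $bytes is exactly one well-formed 4-byte sequence (no validation).
function to_pseudo_utf8(string $bytes): string {
    $cp = ((ord($bytes[0]) & 0x07) << 18)
        | ((ord($bytes[1]) & 0x3F) << 12)
        | ((ord($bytes[2]) & 0x3F) << 6)
        |  (ord($bytes[3]) & 0x3F);
    $v    = $cp - 0x10000;          // 20-bit offset into the supplementary planes
    $high = 0xD800 + ($v >> 10);    // top 10 bits
    $low  = 0xDC00 + ($v & 0x3FF);  // bottom 10 bits
    return encode_3byte($high) . encode_3byte($low);
}

echo bin2hex(to_pseudo_utf8("\xf0\xa8\xa7\x80")), "\n"; // eda1a2edb780
```

The result is six bytes made of two 3-byte sequences, so a column limited to 3-byte characters can hold it. The cost is that every reader and writer of the data has to apply the same conversion, which is the complication the report mentions.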

How to repeat:
Tested with PHP 5.1.0RC1:
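<?php
// Reproduction from the report. It uses the legacy mysql_* extension, which was
// current for PHP 5.1 and was removed in PHP 7.
mysql_connect("localhost", "unitest");
mysql_select_db("unitest");
// Tell the server that the client sends and expects UTF-8.
mysql_query("SET NAMES utf8");
// One column per storage strategy: UCS-2 text, UTF-8 text, and raw bytes.
mysql_query("CREATE TABLE demo(
    wide VARCHAR(50) CHARACTER SET ucs2,
    utf VARCHAR(50) CHARACTER SET utf8,
    raw VARBINARY(50)
)");
// U+289C0 encoded as 4-byte UTF-8 (a character outside the BMP).
$char = "\xf0\xa8\xa7\x80";
mysql_query("INSERT INTO demo(wide,utf,raw) VALUES ('$char','$char','$char')");
$result = mysql_query("SELECT * FROM demo");
$row = mysql_fetch_array($result, MYSQL_ASSOC);
// Compare each column read back against the bytes that were sent.
foreach($row as $field => $val) {
    $match = ($val == $char) ? "OK" : "FAILED";
    print "$field: $match ($val)\n";
}
?>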

Outputs:
wide: FAILED (????)
utf: FAILED ()
raw: OK (𨧀)

Suggested fix:
A sufficient compromise for our purposes would be for the ucs2 charset to be enhanced (or
a second utf16 charset made available) to do encoding and decoding of UTF-16 surrogate
pairs when communicating with the server in UTF-8.

So if we send the UTF-8 string: "\xf0\xa8\xa7\x80"
It should interpret it as the UTF-16 string: "\ud862\uddc0"
And on select we should get back UTF-8: "\xf0\xa8\xa7\x80"
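
The mapping in the three lines above can be checked by hand. This sketch (the function name `surrogates_to_utf8` is hypothetical, and the inputs are assumed to be a valid high/low surrogate pair) performs the "on select" direction, turning the UTF-16 pair back into 4-byte UTF-8:

```php
<?php
// Combine a UTF-16 surrogate pair into a code point and encode it as 4-byte UTF-8.
// Assumes $high is in 0xD800..0xDBFF and $low is in 0xDC00..0xDFFF (not validated).
function surrogates_to_utf8(int $high, int $low): string {
    $cp = 0x10000 + (($high - 0xD800) << 10) + ($low - 0xDC00);
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(surrogates_to_utf8(0xD862, 0xDDC0)), "\n"; // f0a8a780
```

Both forms encode the same code point, U+289C0, so the conversion the report asks for loses no information.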

Continuing to use UCS-2 collation semantics would be good enough for what we need; this
would allow us to use collatable Unicode text fields for page titles and usernames without
rare but existing data getting corrupted.
