Ticket #4 (assigned defect)

Opened 5 years ago

Last modified 3 years ago

unicode (utf8 actually) characters not handled correctly

Reported by: Arkadiusz Miskiewicz
Owned by: thommey
Priority: blocker
Milestone: 1.8.0
Component: Core
Version: 1.8.0 CVS
Keywords: utf unicode
Cc:

Description

My setup:

eggdrop 1.6.17
tcl 8.5a6
Linux 2.6 with the latest versions of all libraries
freenode network
irssi client

Testing is simple with a Tcl script that echoes data back to the channel:
bind pub - "utf" pub_proc
proc pub_proc { nick idx handle channel szoveg } {
  # Echo the text that follows the "utf" trigger back to the channel
  putmsg $channel $szoveg
}

On the original 1.6.17, entering UTF-8 characters produces this:
21:39 < arekm8> utf ó&#261;&#322;&#324;&#347;
21:39 < utftest> &#65533;EBD[
(some crap is echoed)

After patching src/tcl.c utf_convert() with:

- byteptr = (char *) Tcl_GetByteArrayFromObj(objv[i], &len);
+ byteptr = (char *) Tcl_GetStringFromObj(objv[i], &len);

I get:
21:57 < arekm8> utf óąłńś
21:57 < utftest> óąłńś

It works - proper characters are echoed back.

I have no idea why Tcl_GetByteArrayFromObj() is used in utf_convert(), so I'm
not sure whether the fix is correct. Is it?
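
A possible explanation for the garbage, sketched in plain C without linking against Tcl: in Tcl 8.x, converting a string object to a byte array keeps only the low 8 bits of each character, whereas Tcl_GetStringFromObj() returns the UTF-8 representation. The helpers below (low_bytes and utf8_bytes are hypothetical names, not eggdrop or Tcl functions) only simulate those two behaviours on the test input U+00F3 U+0105 U+0142 U+0144 U+015B; they are an illustration under that assumption, not the actual utf_convert() code.

```c
#include <assert.h>
#include <stddef.h>

/* Simulates the byte-array path: each code point is cut down to its
 * low 8 bits, so anything above U+00FF loses information. */
size_t low_bytes(const unsigned int *cps, size_t n, unsigned char *out)
{
  size_t i;

  for (i = 0; i < n; i++)
    out[i] = (unsigned char) (cps[i] & 0xFF);
  return n;
}

/* Simulates the string path: each code point is encoded as UTF-8.
 * Only code points below U+10000 are handled, which covers the example.
 * The caller must provide room for up to 3 bytes per code point. */
size_t utf8_bytes(const unsigned int *cps, size_t n, unsigned char *out)
{
  size_t i, len = 0;

  for (i = 0; i < n; i++) {
    unsigned int c = cps[i];

    assert(c < 0x10000);
    if (c < 0x80) {
      out[len++] = (unsigned char) c;
    } else if (c < 0x800) {
      out[len++] = (unsigned char) (0xC0 | (c >> 6));
      out[len++] = (unsigned char) (0x80 | (c & 0x3F));
    } else {
      out[len++] = (unsigned char) (0xE0 | (c >> 12));
      out[len++] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
      out[len++] = (unsigned char) (0x80 | (c & 0x3F));
    }
  }
  return len;
}
```

For the five test characters, low_bytes() gives F3 05 42 44 5B: an invalid lone byte, a control character, then "BD[", which looks consistent with the garbage echoed by the unpatched bot. utf8_bytes() gives C3 B3 C4 85 C5 82 C5 84 C5 9B, the byte sequence an IRC client expecting UTF-8 would display correctly.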

Change History

comment:1 Changed 5 years ago by simple

  • Milestone set to 1.6.20

comment:2 Changed 5 years ago by simple

  • Owner set to thommey
  • Status changed from new to assigned
  • Version set to 1.6.20
  • Milestone changed from 1.6.20 to UTF-8

comment:3 Changed 5 years ago by simple

  • Milestone changed from 1.6.20 to 1.6.21

comment:4 Changed 5 years ago by simple

  • Version 1.6.20 CVS deleted

comment:5 Changed 5 years ago by simple

  • Version set to 1.6.20 CVS

comment:6 Changed 4 years ago by pseudo

  • Priority changed from major to blocker
  • Version changed from 1.6.20 to 1.8.0 CVS

comment:7 Changed 3 years ago by simple

putlog may handle UTF-8 separately and may need additional changes.
