A question that comes up frequently is whether epic supports UTF-8, and if it
does not, when it will. The simple answer is that it does not, because nobody
in the epic community has experience converting programs to use UTF-8.
Interested contributors are therefore having to learn the Unicode way of doing
things as they go along, which is much slower than if someone who had done
this before stepped in and helped us write the code to implement the many
design changes.

Converting from ASCII to Unicode is very invasive to a program, and there
are important questions to consider when you ask what it really means to
support UTF-8. This is not an exhaustive list, but it gives you an idea of the
size of the effort.

- UTF-8 breaks the longstanding assumption that one byte equals one
glyph equals one column on the screen. This affects things like column
counting, which matters for the input line and for line wrapping.
Much code has to be rewritten for this.
- The historical way of handling national character sets is to use code
pages, which map 128 glyphs into code points 128-255. Normally this is handled
by the user's terminal emulator, so epic has never had to worry about the
details. There will always be IRC users who aren't using UTF-8 clients, so the
client will always have to support a remote target (channel or user) who
can't do UTF-8. If you exchange messages and you're using UTF-8 and the other
person isn't, everything will be garbled. To really support UTF-8, the client
must be able to convert FROM UTF-8 TO any other encoding, and vice versa.
- Additionally, there will always be epic users who aren't using UTF-8
terminal emulators, but who would like to join UTF-8 channels and have
everything Just Work. For these users, epic must be able to convert FROM any
input encoding TO UTF-8 and back again.
- Finally, once you open the door to Unicode, you're talking about being
able to support any encoding. How will this impact things like scripts?
What will the /echo output of your script look like if you encode the script
in UTF-8 but the person who uses it doesn't have a UTF-8 terminal emulator? We
see this problem today when people write scripts using the default VGA code
page on the Linux console, and those scripts look all weird to anyone using a
Latin-1 font. So there needs to be some way for scripts to convert between
encodings.
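
To make the first three points concrete, here is a minimal sketch in Python.
It is not epic code and not a proposed design; the function names
display_width and transcode are made up for this illustration, and the width
rule (wide East Asian characters take two columns, combining marks take none)
is a simplifying assumption that ignores control characters and
terminal-specific behaviour.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate how many terminal columns `text` occupies.

    Assumption for this sketch: wide/fullwidth characters use two columns,
    combining marks use none, and everything else uses one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining mark: drawn on top of the previous glyph
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2  # wide or fullwidth character
        else:
            width += 1
    return width


def transcode(raw: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one encoding to another.

    Bytes that are invalid in `source` and characters that `target` cannot
    represent are replaced rather than raising an error.
    """
    text = raw.decode(source, errors="replace")
    return text.encode(target, errors="replace")


if __name__ == "__main__":
    for word in ("naive", "na\u00efve", "\u65e5\u672c\u8a9e"):
        encoded = word.encode("utf-8")
        # bytes, code points, and columns are three different counts
        print(len(encoded), len(word), display_width(word))

    # A UTF-8 user talking to a Latin-1 user, and the reverse direction.
    print(transcode("\u00e9".encode("utf-8"), "utf-8", "latin-1"))
    print(transcode(b"\xe9", "latin-1", "utf-8"))
```

Even this toy version shows why column counting and per-target conversion
touch so much code: the byte count, the character count, and the column count
of the same string can all differ, and any character the target encoding
lacks has to be replaced with something.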

I'm not trying to suggest that epic will never have proper Unicode support,
but to help you understand that this is not a simple matter. There is a large
amount of code to be written, and without outside assistance the work will be
slow and steady. Eventually it will happen, but the only way to make it happen
sooner is to help us write the code or recruit someone who will.

Thanks for understanding!
Jeremy