Bug 129761 - mysql-4.1.14-r1 screws up character encodings
Summary: mysql-4.1.14-r1 screws up character encodings
Status: VERIFIED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Linux MySQL bugs team
Duplicates: 129762 130661
Reported: 2006-04-12 17:23 UTC by Attila Tóth
Modified: 2006-05-05 15:58 UTC
Description Attila Tóth 2006-04-12 17:23:26 UTC
Using stable apache2 and php5.
Emerging the recent stable mysql ebuild makes all special latin2 characters appear garbled (misconverted as UTF-8), which breaks the LAMP pages on the box. It makes no difference what I specify in my.cnf for default-character-set, character-set-server, character-sets-dir, default-collation, or any combination of these, with values of utf8, latin1 or latin2.
Downgrading to 4.1.14 solves the problem.
I tried to investigate the ebuild itself. I could understand the 4.1.14 ebuild, but I could not make sense of the 4.1.14-r1 ebuild, which lacks the basic components. It seems to be similar to the later masked versions.
I don't know how to tune the compilation options with this new ebuild style.
Has anybody else experienced the same problem with character translations?
Comment 1 Walter Wandra 2006-04-12 19:17:59 UTC
(In reply to comment #0)
> Using stable apache2 and php5.
> After emerging recent stable mysql ebuild makes all special latin2 characters
> to appear strange (utf-8 misconverted) screwing up LAMP pages on box. No matter

yes same with me ... 
file do downgrade ... yours, harry
Comment 2 Walter Wandra 2006-04-12 19:31:06 UTC
I would like to downgrade, but how do I do that?
mysql-4.1.14 no longer seems to be available to emerge ...
Comment 3 Jakub Moc (RETIRED) gentoo-dev 2006-04-12 23:59:52 UTC
*** Bug 129762 has been marked as a duplicate of this bug. ***
Comment 4 Luca Longinotti (RETIRED) gentoo-dev 2006-04-13 03:14:54 UTC
All MySQL ebuilds from 4.1 up should now correctly use UTF8 as the default charset (the compiled-in one). You can override that setting with whatever you want in /etc/mysql/my.cnf, and the tools that correctly read my.cnf for options will respect it, such as mysqld itself and the mysql* tools. PHP isn't one of those: PHP always uses only the compiled-in character set, which is UTF8 now (and this is correct, as it's upstream's default). We are currently working on a patch to PHP so that it will also check my.cnf and take the default-character-set setting from there. I'll update the bug once those fixed PHP ebuilds are ready, with directions to find them.
Best regards, CHTEKK.
Comment 5 Jan Bruvoll 2006-04-13 03:51:27 UTC
There are so many other applications that will break because of this. Frankly, I'm quite disappointed that such a major functionality change was allowed to slip into a minor upgrade like this, and -without- a huge, fat warning on top.

This is the -stable- branch - people expect it to be stable.
Comment 6 Walter Wandra 2006-04-13 04:18:51 UTC
Is there any interim quick solution,
such as going back to mysql-4.1.14 in the meantime?

But it has vanished from the Gentoo servers.

Urgent call ... I have severe trouble with my production sites !!

yours, ww

> PHP ebuilds are ready, with directions to find them.
> Best regards, CHTEKK.
> 

Comment 7 Luca Longinotti (RETIRED) gentoo-dev 2006-04-13 04:55:44 UTC
(In reply to comment #6)
> any intermediate quick solution ?

4.1.14 has been removed from the servers because it has other problems and security issues. As a temporary solution, you can do the following:

1) Open /usr/portage/eclass/mysql.eclass

2) Search for the following lines:

myconf="${myconf} --with-charset=utf8"
myconf="${myconf} --with-collation=utf8_general_ci"

and substitute them with:

myconf="${myconf} --with-charset=latin1"
myconf="${myconf} --with-collation=latin1_swedish_ci"

3) Save the file.

4) Emerge mysql-4.1.14-r1 again.

This should restore the old behavior for now if you need it. Remember that the changes to mysql.eclass will be overwritten by your next emerge --sync.
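
For anyone who prefers not to edit the file by hand, steps 1 to 3 above can be sketched as a small shell function. This is only a sketch under assumptions: the name switch_charset_to_latin1 is made up for illustration, and it assumes the two myconf lines appear in the eclass exactly as quoted. It rewrites the file in place, so trying it on a copy first is probably wise.

```shell
#!/bin/sh
# Hypothetical helper (not part of Portage): swap the compiled-in utf8 defaults
# for latin1 in the eclass file given as $1.
switch_charset_to_latin1() {
    eclass="${1:-/usr/portage/eclass/mysql.eclass}"
    tmp="$(mktemp)" || return 1
    # Rewrite only the two configure options quoted in the steps above.
    if sed -e 's/--with-charset=utf8/--with-charset=latin1/' \
           -e 's/--with-collation=utf8_general_ci/--with-collation=latin1_swedish_ci/' \
           "$eclass" > "$tmp"; then
        # cat instead of mv keeps the original owner and permissions.
        cat "$tmp" > "$eclass"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (as root, followed by re-emerging mysql-4.1.14-r1 as in step 4):
#   switch_charset_to_latin1 /usr/portage/eclass/mysql.eclass
```

As with the manual edit, the next emerge --sync will undo the change.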
Best regards, CHTEKK.
Comment 8 Walter Wandra 2006-04-13 06:50:49 UTC
great, this worked for me with the old, saved my.cnf

> > any intermediate quick solution ?
> 
> 4.1.14 is out of the servers as it has other problems and security issues. As 
Comment 9 Attila Tóth 2006-04-13 08:58:03 UTC
(In reply to comment #4)
> options will respect that, such as the mysqld itself, the mysql* tools. PHP
> isn't one of those, PHP always only uses the compiled in character set, which
> is UTF8 now (and this is correct as it's upstream's default). We are atm
> working on a patch to PHP so that it will also check my.cnf and get the
> default-character-set settings from there. I'll update the bug once those fixed
> PHP ebuilds are ready, with directions to find them.
Damned PHP! PHP5's html_entity_decode function was promised to handle iso-8859-2 correctly, unlike the failing PHP4 one. It came out and proved to be equally incapable... Why is it so hard (for PHP) to make some minor effort to support character encodings other than utf8 or latin1?
The summary of the bug is somewhat misleading...

Regards,
Dwokfur
Comment 10 Attila Tóth 2006-04-13 09:10:28 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > any intermediate quick solution ?
> 
> 4.1.14 is out of the servers as it has other problems and security issues. As a
> temporary solution, you can do the following:
> 
> 1) Open /usr/portage/eclass/mysql.eclass
> 
> 2) Search for the following lines:
> 
> myconf="${myconf} --with-charset=utf8"
> myconf="${myconf} --with-collation=utf8_general_ci"
> 
> and sobstitute them with:
> 
> myconf="${myconf} --with-charset=latin1"
> myconf="${myconf} --with-collation=latin1_swedish_ci"
I had searched for the responsible eclass, but failed to find it... (it was too late - or too early?)

I have an excellent idea, which is Gentoo-ish by nature:
What if this encoding issue were handled by a USE flag? What I mean: if utf8/unicode is set, MySQL would be compiled with utf8 as the default. If it is not set, it would fall back to latin2 (I'm too selfish to say latin1). Maybe latin1 and latin2 flags could also be introduced for different fallback encodings, and they could also trigger the planned PHP patch in the future.

Regards,
Dwokfur
Comment 11 Luca Longinotti (RETIRED) gentoo-dev 2006-04-13 14:33:29 UTC
OK, fixed PHP versions are in Portage now; please emerge --sync in an hour or so. The fixed versions are dev-lang/php-4.4.2-r1 and dev-lang/php-5.1.2-r1. Both are in ~arch (unstable), so you'll have to do "echo dev-lang/php >> /etc/portage/package.keywords".
These new versions respect the charset you set in the [client] section of /etc/mysql/my.cnf. You can even create a [php-$SAPI] section and add settings there that will only apply to PHP (where $SAPI can be cli, cgi, or apache2handler).

The upgrade path that should fix all your problems is quite simple:

1) emerge --sync

2) emerge the new PHP versions and the latest MySQL (the -r1 revisions), this time without changing anything in the eclass or the ebuild

3) set the correct charset in /etc/mysql/my.cnf for your uses. If your server uses latin1 in the [mysqld] section, simply add "default-character-set=latin1" to the [client] section, and PHP should go back to displaying your characters as it always did.
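
Step 3 can also be scripted. The sketch below is an illustration only: the function name set_client_charset is hypothetical, and it assumes the file already contains a [client] section header at the start of a line (if there is none, the file is left unchanged). It drops any older default-character-set line inside [client] and adds the new one right after the header, leaving the other sections alone.

```shell
#!/bin/sh
# Hypothetical helper: make sure the [client] section of a my.cnf-style file
# carries exactly one default-character-set line with the given value.
# $1 = path to the config file, $2 = character set (defaults to latin1)
set_client_charset() {
    cnf="$1"
    charset="${2:-latin1}"
    tmp="$(mktemp)" || return 1
    if awk -v cs="$charset" '
        /^\[/ { in_client = ($0 ~ /^\[client\]/) }             # track the current section
        in_client && /^default-character-set[ \t]*=/ { next }  # drop an older value
        { print }
        /^\[client\]/ { print "default-character-set=" cs }    # add the new value
    ' "$cnf" > "$tmp"; then
        # cat instead of mv keeps the original owner and permissions.
        cat "$tmp" > "$cnf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (as root):
#   set_client_charset /etc/mysql/my.cnf latin1
```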

Have fun, and _please_ report back if this worked ok for you or not, thanks!
Best regards, CHTEKK.
Comment 12 kahler 2006-04-14 17:18:53 UTC
Seems to work as long as your tables' charset is the same as the one set in my.cnf.
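
One way to compare the two is to ask the server directly. The sketch below only prints the SQL to run; the function name charset_check_sql and the database name in the usage line are placeholders, and it assumes a MySQL 4.1 or later server, where SHOW TABLE STATUS should report a Collation column per table.

```shell
#!/bin/sh
# Hypothetical helper: print the statements that show which character sets
# the connection and the tables of database $1 are using.
charset_check_sql() {
    db="$1"
    if [ -z "$db" ]; then
        echo "usage: charset_check_sql DATABASE" >&2
        return 1
    fi
    cat <<EOF
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
SHOW TABLE STATUS FROM \`$db\`;
EOF
}

# Usage (mydb is a placeholder database name):
#   charset_check_sql mydb | mysql -u root -p
```

The mysql client reads the [client] section of my.cnf, so the character_set_client and character_set_connection values it prints should reflect what is set there.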
Comment 13 Alexander 'E-Razor' Krause 2006-04-16 12:21:06 UTC
it's not working here :-(

i synced, emerged =mysql-4.1.14-r1 and restarted mysql... no success.

My config contains something like:
[client]
#password                                       = your_password
port                                            = 3306
socket                                          = /var/run/mysqld/mysqld.sock
character-sets-dir=latin1
default-character-set=latin1

[mysql]
character-sets-dir=latin1
default-character-set=latin1

[php-cgi]
character-sets-dir=latin1
default-character-set=latin1

I'm using the mysql-4.1.14 now, which works.
Comment 14 Luca Longinotti (RETIRED) gentoo-dev 2006-04-16 12:59:12 UTC
(In reply to comment #13)
> it's not working here :-(
> 
> i synced, emerged =mysql-4.1.14-r1 and restarted mysql... no success.

And you emerged the new PHP packages too? 4.4.2-r1 and/or 5.1.2-r1 ???
Note that it's not [php-cgi] but [php-cgi-fcgi] in the my.cnf, my fault, sorry. ;)
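
To avoid guessing the section name, it can be built from the SAPI string that PHP itself reports through php_sapi_name() (for example cli, cgi-fcgi or apache2handler). The sketch below just prints a my.cnf fragment for a given SAPI; the function name php_mycnf_fragment is made up, and the "php-" plus SAPI naming rule is taken from this thread rather than from any upstream documentation.

```shell
#!/bin/sh
# Hypothetical helper: print a my.cnf fragment for one PHP SAPI.
# $1 = SAPI name as PHP reports it (for example cli, cgi-fcgi, apache2handler)
# $2 = character set (defaults to latin1)
php_mycnf_fragment() {
    sapi="$1"
    charset="${2:-latin1}"
    if [ -z "$sapi" ]; then
        echo "usage: php_mycnf_fragment SAPI [CHARSET]" >&2
        return 1
    fi
    printf '[php-%s]\ndefault-character-set=%s\n' "$sapi" "$charset"
}

# Usage (as root): append the fragment for the FastCGI binary to my.cnf
#   php_mycnf_fragment cgi-fcgi latin1 >> /etc/mysql/my.cnf
```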
Best regards, CHTEKK.
Comment 15 Alexander 'E-Razor' Krause 2006-04-17 06:26:34 UTC
no, still the same...

and yes, I'm running php-5.1.2-r1. (synced and compiled it again though)

I'm using those lines in my config now:
[php-cgi-fcgi]
character-sets-dir=latin1
default-character-set=latin1

It looks like php fully ignores these settings, because setting them to utf8 should display the characters like the -r1 does, right?
Comment 16 Luca Longinotti (RETIRED) gentoo-dev 2006-04-17 07:12:58 UTC
PHP should _not_ ignore those settings in PHP 5.1.2-r1, that's what was fixed...
Please take a look at https://forums.gentoo.org/viewtopic-p-3256496.html#3256496 and run that script so we can see how exactly your PHP sees MySQL's charsets.
And you're sure you're using the CGI version of PHP (php-cgi binary)?
Best regards, CHTEKK.
Comment 17 Alexander 'E-Razor' Krause 2006-04-17 08:49:19 UTC
It is the cgi-fcgi version:
http://web0.erazor-zone.de/phpinfo.php

That's the output of that php-script:
http://web0.erazor-zone.de/sql-test.php

No matter what I set up in [client] or [php-cgi-fcgi], the output is still the same (I'm currently using mysql-4.1.14, not the -r1).

That's my config:
http://web0.erazor-zone.de/cat-my.php

However, it's not really a problem for me, but I thought it might be helpful for others to solve that thing.
Comment 18 Luca Longinotti (RETIRED) gentoo-dev 2006-04-17 09:43:29 UTC
(In reply to comment #17)
> It is the cgi-fcgi version:
> http://web0.erazor-zone.de/phpinfo.php

"PHP Version 5.1.2-gentoo" that is not a 5.1.2-r1 installation, that would show up as "PHP Version 5.1.2-pl1-gentoo".

> Thats the output of that php-script:
> http://web0.erazor-zone.de/sql-test.php

This looks correct??? I don't get what you're trying to accomplish. Are your databases originally latin1 or UTF8? I thought they were latin1, and in that case it all looks good, but you'd have to change "character-sets-dir=utf8" to "character-sets-dir=latin1". Also, as already mentioned, for all this to work correctly, you *must* use mysql-4.1.14-r1 and php-5.1.2-r1.
Best regards, CHTEKK.
Comment 19 Alexander 'E-Razor' Krause 2006-04-17 11:14:56 UTC
>"PHP Version 5.1.2-gentoo" that is not a 5.1.2-r1 installation, that would show up as "PHP Version 5.1.2-pl1-gentoo".
damn, you're right - silly me :-(
somehow my ~php was masked... I upgraded and now it reads the [php-cgi-fcgi] section.

>This looks correct??? I don't get what you're trying to accomplish, are your databases originally latin1 or UTF8?
They are latin1, but i went the way backward ;-)
I was using the mysql-4.1.14 (not the -r1 as i wrote) and tried to force that character problem again by setting the encoding to utf8 just to test if php uses the settings of my.cnf . (changing 2 lines is faster than emerging the mysql package again)

Thanks again... and i would call this one SOLVED :-)
Comment 20 Jakub Moc (RETIRED) gentoo-dev 2006-04-20 14:22:06 UTC
*** Bug 130661 has been marked as a duplicate of this bug. ***
Comment 21 Brad Fish 2006-05-05 15:33:57 UTC
I've noticed that php-5.1.2-r1 was just removed from portage. Does php-5.1.4 include this fix?
Comment 22 Luca Longinotti (RETIRED) gentoo-dev 2006-05-05 15:58:16 UTC
Yes, PHP 5.1.4, as well as PHP 4.4.2-r2, now include this fix, and both are targeted to become the stable PHP versions on Gentoo. Future PHP versions will also include this patch. If we ever change that, it will be announced with all due warnings, but for now no change is planned; we'll maintain the patch.
Best regards, CHTEKK.