Hi,

I recently updated the minimal MySQL installation on our production webservers from 5.0.26-r1 to 5.0.26-r2. After the update finished, the following problem occurred:

Data from the database (which is inserted UTF-8 encoded) is displayed on our website just as it is stored in the DB (Ä,Ö,Ü,...).

After downgrading to r1 again (from a binpkg, because it disappeared from portage) everything works as expected.

Reproducible: Always

Steps to Reproduce:
1. Install mysql-5.0.26-r2
2. Visit website

Actual Results:
UTF-8 characters are messed up

Expected Results:
Well encoded characters

dev-lang/php-5.1.6-r6 USE="apache2 berkdb crypt gdbm iconv ipv6 mysql ncurses nls pcre readline reflection session spl ssl unicode xml zlib (-adabas) -apache -bcmath (-birdstep) -bzip2 -calendar -cdb -cgi -cjk -cli -concurrentmodphp -ctype -curl -curlwrappers -db2 -dbase (-dbmaker) -debug -discard-path -doc (-empress) (-empress-bcs) (-esoob) -exif -fastbuild (-fdftk) (-filepro) -firebird -flatfile -force-cgi-redirect (-frontbase) -ftp -gd -gd-external -gmp -hardenedphp -hash -hyperwave-api -imap (-informix) -inifile -interbase -iodbc -java-external -kerberos -ldap -libedit -mcve -memlimit -mhash -ming -msql -mssql -mysqli -oci8 (-oci8-instant-client) -odbc -pcntl -pdo -pdo-external -pic -posix -postgres -qdbm -recode -sapdb -sasl -sharedext -sharedmem -simplexml -snmp -soap -sockets (-solid) -spell -sqlite (-sybase) (-sybase-ct) -sysvipc -threads -tidy -tokenizer -truetype -vm-goto -vm-switch -wddx -xmlreader -xmlrpc -xmlwriter -xpm -xsl -yaz -zip"

---------------------------------------------------

equery u =dev-db/mysql-5.0.26-r1
[ Searching for packages matching =dev-db/mysql-5.0.26-r1... ]
[ Colour Code : set unset ]
[ Legend : Left column  (U) - USE flags from make.conf ]
[        : Right column (I) - USE flags packages was installed with ]
[ Found these USE variables for dev-db/mysql-5.0.26-r1 ]
 U I
 + + berkdb      : Adds support for sys-libs/db (Berkeley DB for MySQL)
 - - big-tables  : Make tables contain up to 1.844E+19 rows
 - - cluster     : Add support for NDB clustering.
 - - debug       : Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see http://www.gentoo.org/proj/en/qa/backtraces.xml .
 - - embedded    : Build embedded server (libmysqld)
 - - extraengine : Add support for alternative storage engines.
 - + latin1      : Use LATIN1 encoding instead of UTF8.
 - - max-idx-128 : Raise the max index per table limit from 64 to 128
 + - minimal     : Install a very minimal build (disables, for example, plugins, fonts, most drivers, non-critical features)
 - - perl        : Adds support/bindings for the Perl language.
 - - selinux     : !!internal use only!! Security Enhanced Linux support, this must be set by the selinux profile or breakage will occur
 - - srvdir      : Add support for GLEP 20
 + + ssl         : Adds support for Secure Socket Layer connections
 - - static      : !!do not set this during bootstrap!! Causes binaries to be statically linked instead of dynamically
It sounds like your tables don't have the correct character sets marked on them (e.g. they contain UTF8 data but are marked as LATIN1). Please check your 'show create table ...' output.
The tables are configured correctly. It works with MySQL version 5.0.26-r1. Only r2 is unusable with PHP. I think it's a problem with libmysqlclient.
Ok, that's really weird. I checked 5.0.26-r1 and found that there are no changes in our patches between -r1 (as of the revision immediately before it was deleted) and -r2.

# for i in mysql-5.0.26-r1.ebuild mysql-5.0.26-r2.ebuild ; do ebuild $i unpack 1>/dev/null ; done ; diff -Nuar /var/tmp/portage/dev-db/mysql-5.0.26-r1/work/mysql /var/tmp/portage/dev-db/mysql-5.0.26-r2/work/mysql
(no output, so they are identical)
#

vivo: what did you do?
That's really, really weird. Can I give you more information on this?
Please find out exactly which revision of dev-db/mysql-5.0.26-r1 you were using. Look for the ebuild under /var/db/pkg and check its header.
I suspect the problem is in one of these two places:
- 105_all_mysql_config_cleanup.patch
  This patch modifies the behaviour of the "mysql_config" executable, which is used by php and many others to gather information on how to link against libmysql. It was introduced 2007-01-04/05.
- php must use /etc/mysql/my.cnf
  Check with strace (the cli version is easier) that it reads that config file.
Also check the obvious: my.cnf.
> - 105_all_mysql_config_cleanup.patch
> this patch modify the behaviour of "mysql_config" executable, used by php

Reference: bug #156301 "mysql_config wrongly retains too much info from CFLAGS"
I'm not sure whether it has anything to do with /etc/mysql/my.cnf. The staging server, which runs -r1, has a my.cnf where latin1 is specified. This is because there's a database running on it that uses latin1. For the webapp, the staging server uses the production database, which is UTF-8.

Here's the my.cnf from the staging server:

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysql]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1

[mysqladmin]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1

[mysqlcheck]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1

[mysqldump]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1

[mysqlimport]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1

[mysqlshow]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1

[myisamchk]
character-sets-dir=/usr/share/mysql/charsets

[myisampack]
character-sets-dir=/usr/share/mysql/charsets

[mysqld_safe]
err-log = /var/log/mysql/mysql.err

[mysqld]
character-set-server = latin1
default-character-set = latin1
skip-character-set-client-handshake
innodb_buffer_pool_size = 16M
innodb_additional_mem_pool_size = 2M
innodb_log_file_size = 5M
innodb_log_buffer_size = 8M
set-variable = innodb_log_files_in_group=2
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 50
innodb_data_file_path = ibdata1:1024M:autoextend
innodb_file_per_table
innodb_flush_log_at_trx_commit = 1

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]

[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout
user = mysql
port = 3306
socket = /var/run/mysqld/mysqld.sock
pid-file = /var/run/mysqld/mysqld.pid
log-error = /var/log/mysql/mysqld.err
basedir = /usr
datadir = /var/lib/mysql
skip-locking
key_buffer = 16M
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
language = /usr/share/mysql/english
log-bin
server-id = 1
tmpdir = /tmp/
Hello? Is there anything I can do to get this bug solved??
You're not the only one who was stuck with this. It had me going crazy for the last three months or so, and I finally stumbled onto a solution this morning. Don't ask how; it was a nightmare involving strace, a lot of screaming and some luck.

In /etc/mysql/my.cnf you have this:

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysql]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1

This implies that all clients get the right port and socket, but only the mysql command-line client gets the right charset settings. Copying those charset lines from the [mysql] section into the [client] section fixes it.

Could we please get this done by default in the shipped my.cnf files?
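To illustrate the point about the [client] and [mysql] groups, here is a minimal Python sketch of how option groups combine: a client program picks up [client] plus the group named after itself, so a charset set only under [mysql] never reaches other clients. The function name `effective_options` and the sample text are made up for this illustration. It only models the grouping rule and is not how libmysqlclient is implemented; an application linked against libmysqlclient reads option files only if it asks the library to.

```python
import configparser

# Hypothetical excerpt modelled on the my.cnf quoted in this thread.
SAMPLE_CNF = """\
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysql]
character-sets-dir=/usr/share/mysql/charsets
default-character-set=latin1
"""


def effective_options(cnf_text, program):
    """Merge the [client] group with the group named after `program`.

    Later groups override earlier ones, so program-specific settings win.
    """
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare flags such as "quick"
        strict=False,          # my.cnf may repeat sections and keys
        interpolation=None,    # treat "%" literally
    )
    parser.read_string(cnf_text)
    options = {}
    for group in ("client", program):
        if parser.has_section(group):
            options.update(parser.items(group))
    return options


for program in ("mysql", "mysqldump"):
    opts = effective_options(SAMPLE_CNF, program)
    print(program, opts.get("default-character-set", "<not set>"))
```

With the sample above, only the `mysql` program ends up with `default-character-set`; moving the two charset lines into [client] would make them visible to every program that reads that group.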
This doesn't work for me. I tried all the versions in portage (with USE=minimal) but the problem stays the same. I won't be able to update MySQL anymore :-(
Did you restart the client?
Check this forum post:
https://forums.gentoo.org/viewtopic-p-3946384.html#3946384

The configuration was moved from my.cnf to php.ini in the newest PHP releases, and MySQL itself really has nothing to do with this at all. It only has its own "latin1" USE flag if you want latin1 instead of the default utf8 charset.

Best regards, CHTEKK.
I configured php and mysql according to the thread but the problem still exists :-(

Here's how the content on the website looks:

Sie benötigen einen SCART-Anschluss an Ihrem TV-Gerät. Außerdem

/etc/php/apache2-php5/php.ini
===============================
; Local Variables:
; tab-width: 4
; End:

; MySQL extensions default connection charset settings
mysql.connect_charset = utf8
mysqli.connect_charset = utf8
pdo_mysql.connect_charset = utf8

[ebuild R ] dev-db/mysql-5.0.26-r2 USE="berkdb minimal ssl -big-tables -cluster -debug -embedded -extraengine -latin1 -max-idx-128 -perl (-selinux) -static" 0 kB
[ebuild R ] dev-lang/php-5.2.1-r3 USE="apache2 berkdb cli crypt curl gd gdbm iconv ipv6 mysql ncurses nls pcre readline reflection session soap spl ssl unicode xml zlib (-adabas) -apache -bcmath (-birdstep) -bzip2 -calendar -cdb -cgi -cjk -concurrentmodphp -ctype -curlwrappers -db2 -dbase (-dbmaker) -debug -discard-path -doc (-empress) (-empress-bcs) (-esoob) -exif -fastbuild (-fdftk) -filter (-firebird) -flatfile -force-cgi-redirect (-frontbase) -ftp -gd-external -gmp -hash -imap -inifile -interbase -iodbc -java-external -json -kerberos -ldap -ldap-sasl -libedit -mcve -mhash -msql -mssql -mysqli -oci8 (-oci8-instant-client) -odbc -pcntl -pdo -pdo-external -pic -posix -postgres -qdbm -recode -sapdb -sharedext -sharedmem -simplexml -snmp -sockets (-solid) -spell -sqlite -suhosin (-sybase) (-sybase-ct) -sysvipc -threads -tidy -tokenizer -truetype -wddx -xmlreader -xmlrpc -xmlwriter -xpm -xsl -yaz -zip -zip-external" 0 kB

Any more ideas??
Ok, that looks about right if you want everything to be UTF8...

Still, are you sure the tables themselves and the data are correctly in UTF8 in the database? Also, what kind of charset header do you send out in your pages' HTML code? Try forcing the page to be viewed with UTF8 as charset (in Firefox, for example, go to View -> Character Encoding and select "Unicode (UTF-8)").

Generally, charset display issues in a webpage can be tracked down by checking:
- the page itself (the charset header must be defined, and naturally set to the charset you want)
- the PHP<->MySQL connection (that one you can set with the php.ini's mysql.connection_charset, remember to restart Apache!)
- the MySQL database itself (be sure it uses UTF8 for its databases, tables and data)

Best regards, CHTEKK.
(In reply to comment #15)
> Ok that looks about right if you want everything to be UTF8...
> Still, are you sure the tables themselves and the data are correctly in UTF8 in
> the database?

I'm sure. If I go back to the -r1 version, the website uses the right encoding everywhere.

> Also, what kind of charset header do you send out in your pages'
> HTML code? Try forcing to view the page with UTF8 as charset (in FireFox fex.
> go to View -> Character Encoding and select "Unicode (UTF-8)".

The pages are sent as UTF-8, and again, it works with the earlier version of mysql.

> Generally charset viewing issues in a webpage can be tracked like this:
> -the page itself (the charset header must be defined, and naturally to the
> charset you want)

Should be ok.

> -the PHP<->MySQL connection (that one you can set with the php.ini's
> mysql.connection_charset, remember to restart Apache!)

I updated the config and restarted Apache. It doesn't change anything.

> -the MySQL database itself (be sure it uses UTF8 itself, for it's databases,
> tables and data)

Is ok. I can see the 3 byte UTF-8 representation for the encoded chars.

Regards
Daniel
(In reply to comment #16)
> (In reply to comment #15)
> > Ok that looks about right if you want everything to be UTF8...
> > Still, are you sure the tables themselves and the data are correctly in UTF8 in
> > the database?
>
> I'm sure. If i go back to -r1 version the website uses all the right encoding
>

To prove this, please provide the output of the following statement:

show table status from DB_NAME like 'TABLE_NAME'

Replace DB_NAME and TABLE_NAME appropriately.

> > Also, what kind of charset header do you send out in your pages'
> > HTML code? Try forcing to view the page with UTF8 as charset (in FireFox fex.
> > go to View -> Character Encoding and select "Unicode (UTF-8)".
>
> The pages are sent as UTF-8 and again it works with the earlier version of
> mysql.
>
> > Generally charset viewing issues in a webpage can be tracked like this:
> > -the page itself (the charset header must be defined, and naturally to the
> > charset you want)
>
> Should be ok
>

The suggested way of changing the Firefox settings does not work if the displayed website has mixed-up character encodings. Please check whether this tag in the HTML header is exactly like this one:

<meta http-equiv='content-type' content='text/html; charset=UTF-8'>

> > -the PHP<->MySQL connection (that one you can set with the php.ini's
> > mysql.connection_charset, remember to restart Apache!)
>
> I updated the config and restarted apache. Doesn't change anything
>
> > -the MySQL database itself (be sure it uses UTF8 itself, for it's databases,
> > tables and data)
>
> Is ok. I can see the 3 byte UTF-8 representaion for the encoded chars
>
> Regards
>
> Daniel
>

What is the output of this script?

<?php
$mysql_connection_id = mysql_connect('localhost', 'mysql_user', 'mysql_password');
print mysql_client_encoding($mysql_connection_id);
?>

If you have utf-8 in all three cases, then the charset has probably been converted twice.

Try opening the output of your website in a hex editor. If the umlaut ü does not show up as #c3bc but as a 4 byte code like #c383c2bc, then (taking #c383c2bc as the example) UTF-8 multibyte characters were saved earlier, the two bytes of each were treated as two one-byte iso-8859-1 characters in your database, and those were then converted to UTF-8 again, resulting in the erroneous #c383c2bc representation.

This happened, for example, in the German documentation for the mysql_client_encoding() function (look at the description, open the output of the website in a hex editor):
http://de.php.net/manual/de/function.mysql-client-encoding.php
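The byte values mentioned above can be reproduced with a short Python sketch. It is only an illustration of the double-conversion mechanism, not a diagnosis of this particular setup; the helper names `double_encode` and `undo_double_encoding` are invented for the example.

```python
def double_encode(text):
    """Simulate the double conversion: UTF-8 bytes are misread as
    ISO-8859-1 characters and then encoded to UTF-8 a second time."""
    utf8_bytes = text.encode("utf-8")
    misread = utf8_bytes.decode("iso-8859-1")  # each byte becomes one character
    return misread.encode("utf-8")


def undo_double_encoding(data):
    """Reverse the damage: decode the outer UTF-8 layer, then map the
    characters back to single bytes, which are the original UTF-8 bytes."""
    return data.decode("utf-8").encode("iso-8859-1")


u_umlaut = "\u00fc"  # the umlaut ü

print(u_umlaut.encode("utf-8").hex())    # c3bc
print(double_encode(u_umlaut).hex())     # c383c2bc
print(undo_double_encoding(double_encode(u_umlaut)) == u_umlaut.encode("utf-8"))  # True
```

If the hex dump of the page shows the four-byte form, the data went through one conversion too many somewhere between the database, the client library and the page output.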
Please respond to comment #17.