Bug 447976 - app-dicts/myspell-de does not recognize spellings containing ß
Summary: app-dicts/myspell-de does not recognize spellings containing ß
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Library
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Spell checking utilities and dictionaries -- related bugs (OBSOLETE)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-12-20 17:48 UTC by Stephen Bosch
Modified: 2013-06-04 13:10 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
libreoffice spell-checking as you type does not recognize correct spellings containing ß. (bug_libreoffice_German_dictionary_de_DE.png,15.84 KB, image/png)
2012-12-20 17:51 UTC, Stephen Bosch
Status bar showing language settings of text. (bug_libreoffice_German_dictionary_de_DE_2.png,2.49 KB, image/png)
2012-12-20 17:52 UTC, Stephen Bosch

Description Stephen Bosch 2012-12-20 17:48:53 UTC
The German word for big is spelled "groß" in Germany and Austria, and "gross" in Switzerland. When libreoffice-l10n is installed with LINGUAS="de", spellings of any words containing the ß character are marked as incorrect.

A general German dictionary should contain ALL the spellings. Perhaps this is an encoding problem?

Reproducible: Always

Steps to Reproduce:
1. Set LANG=de_DE.UTF-8
2. Set LINGUAS="de"
3. Set the keyboard layout to "de"
4. emerge libreoffice-l10n, libreoffice-bin
5. Start libreoffice. Make sure spell-checking as you type is enabled.
6. Type the word "groß". On a physical US keyboard with the "de" layout active, press the "-" key to get "ß" (i.e. type "gro-").
Actual Results:  
The word will be underlined red. The word "gross", by contrast, is recognized.

Expected Results:  
The word "groß" is correct in standard German and the spell-checker should recognize it as such.

# emerge --info app-office/libreoffice-l10n
Portage 2.1.11.31 (default/linux/amd64/10.0, gcc-4.5.4, glibc-2.15-r3, 3.4.9-gentoo x86_64)
=================================================================
                         System Settings
=================================================================
System uname: Linux-3.4.9-gentoo-x86_64-Intel-R-_Core-TM-2_Duo_CPU_T9400_@_2.53GHz-with-gentoo-2.1
Timestamp of tree: Thu, 20 Dec 2012 13:30:01 +0000
ld GNU ld (GNU Binutils) 2.22
app-shells/bash:          4.2_p37
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.7.3-r2, 3.2.3
dev-util/cmake:           2.8.9
dev-util/pkgconfig:       0.27.1
sys-apps/baselayout:      2.1-r1
sys-apps/openrc:          0.11.5
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.13, 2.68
sys-devel/automake:       1.9.6-r3, 1.11.6
sys-devel/binutils:       2.22-r1
sys-devel/gcc:            4.5.4
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4-r1
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 3.6 (virtual/os-headers)
sys-libs/glibc:           2.15-r3
Repositories: gentoo science kde
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=native -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=native -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="ftp://ftp.wh2.tu-dresden.de/pub/mirrors/gentoo ftp://ftp.uni-erlangen.de/pub/mirrors/gentoo http://ftp.uni-erlangen.de/pub/mirrors/gentoo ftp://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/ rsync://ftp-stud.hs-esslingen.de/gentoo/ http://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/"
LANG="de_DE.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/science /var/lib/layman/kde"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X aac acl acpi alsa amd64 amr bash-completion berkdb bzip2 cairo cli consolekit cracklib crypt cups cxx dbus device-mapper dri encode exif extras faac flac fortran gdbm gimp gpm hwdb iconv icu ipv6 jpeg kde laptop lcms lm_sensors mmx modules mp3 mudflap multilib ncurses nls nptl nsplugin opengl openmp pam parport pcre pdf perl png policykit pppd python qt3support readline rtmp semantic-desktop session sqlite sse sse2 sse3 ssl ssse3 startup-notification taglib tcpd theora threads tiff truetype udev unicode usb v4l video vorbis x264 xml zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev keyboard mouse synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console 
presenter-minimizer" LINGUAS="de en fr nb nb_NO en_GB en_CA" PHP_TARGETS="php5-3" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_2" RUBY_TARGETS="ruby18 ruby19" USERLAND="GNU" VIDEO_CARDS="intel vesa radeon fglrx" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON

=================================================================
                        Package Settings
=================================================================

app-office/libreoffice-l10n-3.6.4.3 was built with the following:
USE="(multilib) -offlinehelp" LINGUAS="de en en_GB fr nb -af -am -ar -as -ast -be -bg -bn -bn_IN -bo -br -brx -bs -ca -ca_XV -cs -cy -da -dgo -dz -el -en_ZA -eo -es -et -eu -fa -fi -ga -gd -gl -gu -he -hi -hr -hu -id -is -it -ja -ka -kk -km -kn -ko -kok -ks -ku -lb -lo -lt -lv -mai -mk -ml -mn -mni -mr -my -ne -nl -nn -nr -nso -oc -om -or -pa_IN -pl -pt -pt_BR -ro -ru -rw -sa_IN -sat -sd (-sh) -si -sk -sl -sq -sr -ss -st -sv -sw_TZ -ta -te -tg -th -tn -tr -ts -tt -ug -uk -uz -ve -vi -xh -zh_CN -zh_TW -zu"
Comment 1 Stephen Bosch 2012-12-20 17:51:29 UTC
Created attachment 332808 [details]
libreoffice spell-checking as you type does not recognize correct spellings containing ß.

Correctly spelled words are marked as incorrect if those words contain the ß character.
Comment 2 Stephen Bosch 2012-12-20 17:52:44 UTC
Created attachment 332810 [details]
Status bar showing language settings of text.
Comment 3 Stephen Bosch 2012-12-20 17:54:17 UTC
There's a grammatical error in the first image. Please ignore that :)
Comment 4 Tomáš Chvátal (RETIRED) gentoo-dev 2013-01-01 10:47:09 UTC
The spell check is provided by the system package app-dicts/myspell-de. Technically it has nothing to do with LibreOffice itself, so maybe our dictionary is broken.

I see there is one testing version in the tree, could you try that one and report if that one works?
Comment 5 Stephen Bosch 2013-01-03 21:10:27 UTC
(In reply to comment #4)
> The spell check is system package app-dicts/myspell-de It has technically
> nothing to do with the libreoffice itself, so maybe our dictionary is broken
> there.
> 
> I see there is one testing version in the tree, could you try that one and
> report if that one works?

I merged myspell-de-2012.06.17, but the problem persists.

I can confirm that this issue is present with one other program that depends on myspell-de (specifically, texmaker) so this does indeed look like a problem with the dictionary. Does the bug need to be changed or moved?

(For comparison: the problem is *not* present with the dictionary extensions for firefox-17.)
Comment 6 Uwe Breidenbach 2013-04-30 14:35:59 UTC
I can confirm the issue on ~amd64. And the same happens with gedit's spell checking.
But neither the hunspell nor the myspell-de package seems to be the issue. If I call hunspell manually, it checks the words just fine:

hunspell -i utf-8 -d de_DE_frami
Hunspell 1.3.2
groß
*

gross
& gross 5 0: groß, goss, kross, Tross, -ross

With my stable x86 gentoo I don't have this issue.
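
The interactive session above can also be run non-interactively. The following is a minimal sketch, assuming hunspell is installed and the de_DE_frami dictionary is on its search path; the function name rejected_words is made up for illustration. It relies on hunspell's -l option, which prints only the words it considers misspelled.

```shell
# rejected_words DICT WORD...
# Print the words that hunspell does not accept for the given dictionary.
# (Hypothetical helper name; -i, -d and -l are regular hunspell options.)
rejected_words() {
    rw_dict=$1
    shift
    # One word per line on stdin; -l lists only the rejected ones.
    printf '%s\n' "$@" | hunspell -i utf-8 -d "$rw_dict" -l
}

# Example call (not run here):
#   rejected_words de_DE_frami groß gross
```

Going by the session above, the example call would be expected to list "gross" and not "groß" when the de_DE_frami dictionary is the one being used.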
Comment 7 Uwe Breidenbach 2013-04-30 15:15:23 UTC
Issue is not present:
www-client/firefox-20.0.1
net-im/pidgin-2.10.7-r1
app-text/hunspell-1.3.2-r3
app-dicts/myspell-de-2012.06.17

Issue is present:
app-office/libreoffice-bin-3.6.4.3 (with app-office/libreoffice-l10n-3.6.6.2)
mail-client/evolution-3.6.4
app-editors/gedit-3.6.2-r1
Comment 8 Stephen Bosch 2013-04-30 22:05:54 UTC
The problem persists with libreoffice here also.

(I should note that texmaker now approves of 'groß' and marks 'gross' as incorrect. This may be intended if the dictionary adheres to the strict German (as in DE) orthography.)

Then perhaps it is an encoding problem after all.
Comment 9 Michael Hofmann 2013-06-02 18:25:19 UTC
I remember I also had this problem. Today, after finding this bug thread, I tried to reproduce the error. But spell checking works fine with LibreOffice 3.6.6.2 as well as with LibreOffice 4.0.3.3.

I've been thinking about this for a while and I'm pretty sure that this bug has something to do with the strange filenames of German dictionary files.
Can someone who still has the problem ("groß" is marked as error and "gross" is OK) try the steps below, please:

1) Log in as root
2) cd /usr/share/myspell
3) rm *frami*.dic *frami*.aff
4) ln -s /usr/share/hunspell/de_DE_frami.aff de_DE.aff
5) ln -s /usr/share/hunspell/de_DE_frami.dic de_DE.dic
6) Restart LibreOffice and retry spell checking.

After that you should do the steps below:
1) cd /usr/share/myspell
2) rm de_DE.aff de_DE.dic
3) emerge myspell-de
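
The first set of steps deletes the *frami* entries outright. A more cautious variant is sketched below: it moves them into a backup directory and then creates the same two de_DE symlinks. This is only a sketch under assumptions: the function name, its arguments and the default backup path are hypothetical and not part of myspell-de, and it has to be run as root against the real directories.

```shell
# apply_frami_workaround [MYSPELL_DIR] [HUNSPELL_DIR] [BACKUP_DIR]
# Same effect as steps 2-5 above, but the *frami* entries are moved
# to BACKUP_DIR instead of being deleted.
apply_frami_workaround() {
    fw_myspell=${1:-/usr/share/myspell}
    fw_hunspell=${2:-/usr/share/hunspell}
    fw_backup=${3:-/var/tmp/myspell-frami-backup}   # hypothetical default

    mkdir -p "$fw_backup" || return 1

    # Move every *frami* .dic/.aff entry aside (regular file or symlink).
    for fw_f in "$fw_myspell"/*frami*.dic "$fw_myspell"/*frami*.aff; do
        # -e is false for a dangling symlink, so check -L as well;
        # an unmatched glob pattern fails both tests and is skipped.
        if [ -e "$fw_f" ] || [ -L "$fw_f" ]; then
            mv "$fw_f" "$fw_backup"/ || return 1
        fi
    done

    # Point the plain de_DE names at the frami dictionary files.
    for fw_ext in aff dic; do
        ln -sf "$fw_hunspell/de_DE_frami.$fw_ext" \
               "$fw_myspell/de_DE.$fw_ext" || return 1
    done
}

# Example call with the default paths from this thread (run as root):
#   apply_frami_workaround
```

Restart LibreOffice afterwards, as in step 6. To undo the change, remove de_DE.aff and de_DE.dic again and either move the backed-up entries back or re-emerge myspell-de.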

I suspect that it's not an encoding error. LibreOffice just uses the wrong dictionary: although it displays "Deutsch/Deutschland", it uses the Swiss German dictionary.
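
To see which German dictionary files an application can actually find, it may help to list the de_* entries in the dictionary directory together with their symlink targets. A small sketch follows; the function name list_german_dicts is made up for illustration, and /usr/share/myspell is taken from the steps above.

```shell
# list_german_dicts [DIR]
# Print each de_*.dic / de_*.aff entry in DIR; for symlinks, also print
# the link target.
list_german_dicts() {
    ld_dir=${1:-/usr/share/myspell}
    for ld_f in "$ld_dir"/de_*.dic "$ld_dir"/de_*.aff; do
        # Skip unmatched glob patterns, but keep dangling symlinks.
        [ -e "$ld_f" ] || [ -L "$ld_f" ] || continue
        if [ -L "$ld_f" ]; then
            printf '%s -> %s\n' "${ld_f##*/}" "$(readlink "$ld_f")"
        else
            printf '%s\n' "${ld_f##*/}"
        fi
    done
}

# Example call (not run here):
#   list_german_dicts /usr/share/myspell
```

If only names containing "_frami" show up for de_DE, that would be consistent with the suspicion that the application does not match them to "Deutsch/Deutschland".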

See also: https://bugs.gentoo.org/show_bug.cgi?id=430468 and https://bugs.gentoo.org/show_bug.cgi?id=430468. This all seems to be the same bug: LibreOffice is confused by '_frami' in the filenames of the dictionaries.
Comment 10 Michael Hofmann 2013-06-02 18:29:20 UTC
Sorry: the second link should have been: https://bugs.gentoo.org/show_bug.cgi?id=458772
Comment 11 Uwe Breidenbach 2013-06-03 21:28:05 UTC
You thought right. After following (the first part of) your steps, the spell checking in LibreOffice and gedit worked.
Comment 12 Stephen Bosch 2013-06-03 22:19:21 UTC
(In reply to Michael from comment #9)
> I remember I also had this problem. Today, after finding this bug thread, I
> tried to reproduce the error. But spell checking works fine with LibreOffice
> 3.6.6.2 as
> well as with LibreOffice 4.0.3.3.
> 
> I've been thinking about this for a while and I'm pretty sure that this bug
> has something to do with the strange filenames of German dictionary files.
> Can someone who still has the problem ("groß" is marked as error and "gross"
> is OK) try the steps below, please:
> 
> 1) Log in as root
> 2) cd /usr/share/myspell
> 3) rm *frami*.dic *frami*.aff
> 4) ln -s /usr/share/hunspell/de_DE_frami.aff de_DE.aff
> 5) ln -s /usr/share/hunspell/de_DE_frami.dic de_DE.dic
> 6) Restart LibreOffice an retry spell checking.

This worked for me as well.

> After that you should do the steps below:
> 1) cd /usr/share/myspell
> 2) rm de_DE.aff de_DE.dic
> 3) emerge myspell-de

If the first set of instructions is an effective workaround, why is the remerge necessary? It will just end up restoring the old symlinks.
 
> I suspect that it's not an encoding error - LibreOffice just uses the wrong
> dictionary - although it displays "Deutsch/Deutschland", it uses Swiss
> German dictionary. 
> 
> See also: https://bugs.gentoo.org/show_bug.cgi?id=430468 and
> https://bugs.gentoo.org/show_bug.cgi?id=430468. This all seems to be the
> same bug: LibreOffice is confused by '_frami' in the filenames of the
> dictionaries.

So where does this need to be fixed, then?
Comment 13 Michael Hofmann 2013-06-03 22:32:05 UTC
The second part with the "emerge" is only there to restore the current configuration. I didn't know whether the proposed steps in part 1 would really solve your problem, and I didn't want to leave you with a garbled system...

I'll write an email to the package maintainer of myspell-de and propose
to omit "_frami" from the filenames of German dictionaries in /usr/share/myspell. 
I'm pretty sure now that this is the right way to go... Thanks for reporting your results!
Comment 14 Tomáš Chvátal (RETIRED) gentoo-dev 2013-06-04 13:10:45 UTC
Removed the frami part in the -r1 bump of the dicts, as suggested.

Let's see how many regressions that causes.