Bug 176908

Summary: Missing 'gofast' patch for sys-apps/grep
Product: Gentoo Linux
Reporter: Michal Morávek <michal.moravek>
Component: [OLD] Core system
Assignee: Gentoo's Team for Core System packages <base-system>
Status: RESOLVED WONTFIX    
Severity: normal    
Priority: High    
Version: unspecified   
Hardware: x86   
OS: Linux   
Whiteboard:
Package list:
Runtime testing required: ---

Description Michal Morávek 2007-05-03 10:46:02 UTC
Hello, 
 I'm not very familiar with portage development, so sorry for any inaccuracies. I've noticed that grep runs very slowly on my system and that its speed depends on the locale: with a multibyte locale (cs_CZ.utf-8) it is about 100x slower than with iso-8859-2 or POSIX, for example. I've found that this is a well-known 'bug' of grep and that various distributions solve it with patches for grep (Fedora - http://download.fedora.redhat.com/pub/fedora/linux/core/updates/1/SRPMS/, Debian - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=181378). I think that Gentoo also had a patch for this behaviour - I've found it in the ChangeLog: 
 
 *grep-2.5.1-r5 (19 Aug 2004) 
 
 19 Aug 2004; Heinrich Wendel <lanius@gentoo.org> grep-2.5.1-r5.ebuild, 
 files/grep-2.5.1-fgrep.patch.bz2, files/grep-2.5.1-gofast.patch.bz2, 
 files/grep-2.5.1-i18n.patch.bz2, files/grep-2.5.1-oi.patch.bz2: 
 better performance on utf8 systems 
 
 but now there is no such patch (gofast) in portage for grep, and it is very slow on multibyte locales. I use version 2.5.1a-r1, but IMHO (judging by the grep ebuild) it uses the same patches as 2.5.1. And yes, I've already tried installing an earlier version (2.5.1-r8) and it has the same problem.
 
 Other Linux distributions (Fedora, CentOS, Debian) have no such problems with grep performance in multibyte locales. 
 
 Thanks in advance 
 Michal Moravek

Reproducible: Always

Steps to Reproduce:
1. LC_ALL=cs_CZ.utf-8 time grep 'something' long_file
2. LC_ALL=C time grep 'something' long_file
3. Compare measured times
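
The steps above could be wrapped in a small script so both runs are timed the same way. This is only a sketch, not part of the original report: the helper names `time_grep_ms` and `compare_locales` are made up for illustration, and it assumes GNU date (for `%N` nanoseconds) and that the UTF-8 locale is actually installed.

```shell
# Print elapsed wall-clock milliseconds for one grep run under the given locale.
# Assumption: GNU date is available (the %N nanosecond format is a GNU extension).
time_grep_ms() {
    locale=$1; pattern=$2; file=$3
    start=$(date +%s%N)
    # Only the timing matters here, so grep's output and exit status are ignored.
    env LC_ALL="$locale" grep -- "$pattern" "$file" >/dev/null 2>&1 || true
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Run the same search under a UTF-8 locale and under C, and print both timings.
# The third argument is optional and defaults to the locale from the report.
compare_locales() {
    pattern=$1; file=$2; utf8_locale=${3:-cs_CZ.utf-8}
    echo "$utf8_locale: $(time_grep_ms "$utf8_locale" "$pattern" "$file") ms"
    echo "C: $(time_grep_ms C "$pattern" "$file") ms"
}
```

Calling `compare_locales 'something' long_file` prints one timing per locale. If the requested locale is not installed, grep will most likely fall back to the C locale, so both numbers come out about the same; `locale -a` lists the locales that are available.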

Actual Results:  
grep launched with utf-8 locales is about 100x slower.

Expected Results:  
On other systems (e.g. Fedora) the difference is about 20%.

Comment 1 SpanKY gentoo-dev 2007-05-03 15:03:20 UTC
it caused problems and was abandoned ... even Fedora has punted it:
* Fri Jun 27 2003 Tim Waugh <twaugh@redhat.com> 2.5.1-16
- Finally give up on making grep go fast. :-(