Bug 126733 - beagle crawl changes
Summary: beagle crawl changes
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High enhancement
Assignee: Luis Medinas (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-03-18 15:26 UTC by Pat Double
Modified: 2006-04-10 12:09 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
beagle 2.3.0 ebuild (beagle-0.2.3.ebuild, 3.26 KB, text/plain)
2006-03-18 15:26 UTC, Pat Double
diff to original ebuild (beagle-0.2.3-ebuild.patch, 645 bytes, patch)
2006-03-18 18:28 UTC, Arif Lukito
beagle-0.2.3-crawl-path.patch (beagle-0.2.3-crawl-path.patch, 1.14 KB, patch)
2006-03-18 18:30 UTC, Arif Lukito
update for patch to include KDE dirs (beagle-0.2.3-crawl-path.patch, 1.19 KB, patch)
2006-03-18 19:58 UTC, Pat Double
beagle-0.2.3.ebuild.patch (beagle-0.2.3.ebuild.patch, 1.08 KB, patch)
2006-03-19 06:25 UTC, Pat Double
beagle-0.2.3.ebuild.patch (beagle-0.2.3.ebuild.patch, 713 bytes, patch)
2006-03-20 02:38 UTC, Pat Double
crawl-portage - index /usr/portage (crawl-portage, 208 bytes, text/plain)
2006-03-22 13:50 UTC, Pat Double
beagle-0.2.3.ebuild.patch (beagle-0.2.3.ebuild.patch, 771 bytes, patch)
2006-03-22 13:51 UTC, Pat Double
beagle-0.2.3.ebuild.patch (beagle-0.2.3.ebuild.patch, 983 bytes, patch)
2006-03-25 16:26 UTC, Pat Double
FilterEbuild.cs (FilterEbuild.cs, 3.53 KB, text/plain)
2006-03-25 16:26 UTC, Pat Double
FilterEbuild.cs (FilterEbuild.cs, 3.52 KB, text/plain)
2006-03-29 03:41 UTC, Pat Double
beagle-0.2.4.ebuild (beagle-0.2.4.ebuild, 3.73 KB, text/plain)
2006-04-07 12:30 UTC, Pat Double
/etc/beagle/crawl-portage (crawl-portage, 219 bytes, text/plain)
2006-04-07 12:31 UTC, Pat Double
beagle-0.2.4.ebuild (beagle-0.2.4.ebuild, 3.69 KB, text/plain)
2006-04-07 14:15 UTC, Pat Double
beagle-0.2.4-CVE-2006-1296.patch (beagle-0.2.4-CVE-2006-1296.patch, 556 bytes, patch)
2006-04-07 14:16 UTC, Pat Double

Description Pat Double 2006-03-18 15:26:07 UTC
Updated ebuild for beagle 0.2.3. Note the changes from the 2.2.1 ebuild:

1. Added user and group for 'beagleindex' for the global static index program.
2. Patches /etc/beagle/crawl-* files for correct KDE location.
3. Keep directory for global static indexes.
Comment 1 Pat Double 2006-03-18 15:26:51 UTC
Created attachment 82499 [details]
beagle 2.3.0 ebuild

This is the ebuild for beagle 0.2.3. Note this is for app-misc/beagle, the desktop search engine.
Comment 2 Pat Double 2006-03-18 16:42:00 UTC
Sorry for the confusion: this is really app-misc/beagle, the desktop search engine. I keep forgetting there are two beagle packages.
Comment 3 Arif Lukito 2006-03-18 18:28:43 UTC
Created attachment 82510 [details, diff]
diff to original ebuild

Oh, you're faster than me. I was working on the same thing.
Here is my version. The main difference is this:
-enewuser -1 -1 /var/lib/cache/beagle beagleindex
+enewuser -1 /bin/bash /var/lib/cache/beagle beagleindex
I think beagleindex needs a shell; otherwise it will not run.
Comment 4 Arif Lukito 2006-03-18 18:30:27 UTC
Created attachment 82511 [details, diff]
beagle-0.2.3-crawl-path.patch
Comment 5 Pat Double 2006-03-18 18:34:01 UTC
I thought the default shell for the user would be sufficient. Perhaps I didn't read enough; what is the default? You are correct though, it does need a shell to run.

Regarding the crawl path patch, why remove the KDE and GNOME application and doc search paths? I just reviewed my system (which has only KDE) and there is something in /usr/kde/3.5/share/applications and /usr/kde/3.5/share/doc. I sure would like to have those included in the search.
Comment 6 Arif Lukito 2006-03-18 19:39:39 UTC
enewuser sets the shell to /usr/sbin/nologin if you don't specify one.

My system is GNOME only so feel free to add KDE specific paths.
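To illustrate the shell question discussed above, here is a minimal sketch that checks whether a passwd-style entry carries a usable login shell. The sample entries, the UID/GID 999, and the helper names login_shell_of and has_usable_shell are made up for this example; only the /usr/sbin/nologin default and the /bin/bash alternative come from the comments above.

```shell
#!/bin/sh
# Print the login shell (7th colon-separated field) of a passwd-style entry.
login_shell_of() {
    printf '%s\n' "$1" | cut -d: -f7
}

# Succeed only if the entry's shell is not a "no login" placeholder.
has_usable_shell() {
    case "$(login_shell_of "$1")" in
        */nologin|*/false) return 1 ;;
        *) return 0 ;;
    esac
}

# Hypothetical entries: one with the default shell, one with /bin/bash.
default_entry='beagleindex:x:999:999::/var/lib/cache/beagle:/usr/sbin/nologin'
bash_entry='beagleindex:x:999:999::/var/lib/cache/beagle:/bin/bash'

for entry in "$default_entry" "$bash_entry"; do
    if has_usable_shell "$entry"; then
        echo "$(login_shell_of "$entry"): usable shell"
    else
        echo "$(login_shell_of "$entry"): no usable shell"
    fi
done
```

On a real system the entry could be fetched with getent passwd beagleindex instead of being hard-coded.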
Comment 7 Pat Double 2006-03-18 19:44:42 UTC
Are you suggesting that the user add them after install? The user could do this, but beagle ignores directories in the crawl files that do not exist, so I think it is OK to leave them in there. They do not cause a dependency, error or warning. The ebuild modifies the files to use the correct KDE path for Gentoo. What is the GNOME path? Surely not /opt, as beagle has configured.

I'm still wondering why you are patching the crawl files to remove the GNOME and KDE paths. The user will have more of an "install and it's done" experience if the paths are left in.
Comment 8 Arif Lukito 2006-03-18 19:53:28 UTC
No, I was suggesting that you modify my patch to include the KDE paths. GNOME puts all application shortcuts under /usr/share/applications. I didn't know KDE had different paths when I made the patch.
Comment 9 Pat Double 2006-03-18 19:58:54 UTC
Created attachment 82513 [details, diff]
update for patch to include KDE dirs
Comment 10 Olivier Fisette (RETIRED) gentoo-dev 2006-03-18 21:39:26 UTC
app-misc/beagle is not taken care of by the sci herd. (We maintain the other beagle :)
Comment 11 Luis Medinas (RETIRED) gentoo-dev 2006-03-19 06:08:40 UTC
This has been in portage since yesterday...
Comment 12 Pat Double 2006-03-19 06:24:35 UTC
The version in portage lacks the changes necessary to make the global static indexes work. These include indexing of package documentation from /usr/share/doc, KDE share/doc and GNOME share/doc, and the application files. The cron job is installed by beagle but it will not run unless the user manually adds the beagleindex group and user, and creates the directory. The ebuild I attached here has the changes necessary to make this work. Please consider incorporating those changes and if not, please let us know why they should not be. I have attached a patch to the ebuild.
Comment 13 Pat Double 2006-03-19 06:25:26 UTC
Created attachment 82549 [details, diff]
beagle-0.2.3.ebuild.patch

Patch for beagle-0.2.3.ebuild to make global static indices work.
Comment 14 Arif Lukito 2006-03-19 08:10:08 UTC
(In reply to comment #13)
> Created an attachment (id=82549) [edit]
> beagle-0.2.3.ebuild.patch
> 
> Patch for beagle-0.2.3.ebuild to make global static indices work.
> 

Pat, you forgot the user shell. Red Hat has a patch for this, but it doesn't seem to work: http://bugzilla.gnome.org/show_bug.cgi?id=332955
Comment 15 Pat Double 2006-03-20 02:38:54 UTC
Created attachment 82643 [details, diff]
beagle-0.2.3.ebuild.patch

Fixed patch to set beagleindex shell.
Comment 16 Pat Double 2006-03-22 13:50:59 UTC
Created attachment 82879 [details]
crawl-portage - index /usr/portage

This file in /etc/beagle will cause beagle to index /usr/portage ebuilds, ChangeLog and metadata.xml. Patch to ebuild to include this will follow.

Opinions: should this file set CRAWL_CACHE_TEXT="yes" to cache snippet text from the ebuilds and related files? That way the search results would show the context in which the search terms match. I'm not sure, since the stock crawl-applications and crawl-documentation files do not set it, but the crawl-windows file does.
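As a rough sketch of how such a setting could be inspected, the helper below reads one variable from a crawl file. It assumes the crawl files are plain shell-style assignments, as the CRAWL_CACHE_TEXT="yes" line quoted above suggests. The helper name crawl_setting, the fallback values, and the demo file contents are hypothetical and are not the attached crawl-portage.

```shell
#!/bin/sh
# Print one setting from a crawl file, or a fallback when the file does not set it.
# The file is sourced in a subshell so its variables do not leak into the caller.
# Hypothetical helper; assumes the file contains only shell-style assignments.
crawl_setting() {
    (
        setting_name=$2
        fallback=$3
        . "$1"
        eval "printf '%s\n' \"\${$setting_name:-\$fallback}\""
    )
}

# Demo with a throwaway file; the contents are sample values only.
demo=$(mktemp)
cat > "$demo" <<'EOF'
CRAWL_ENABLED="yes"
CRAWL_CACHE_TEXT="yes"
EOF
echo "enabled:    $(crawl_setting "$demo" CRAWL_ENABLED no)"
echo "cache text: $(crawl_setting "$demo" CRAWL_CACHE_TEXT no)"
rm -f "$demo"
```

Sourcing executes the file, so this is only appropriate for trusted, root-owned configuration such as the files under /etc/beagle.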
Comment 17 Pat Double 2006-03-22 13:51:37 UTC
Created attachment 82880 [details, diff]
beagle-0.2.3.ebuild.patch

Patch to install crawl-portage. Put crawl-portage into files/.
Comment 18 Pat Double 2006-03-25 16:26:11 UTC
Created attachment 83118 [details, diff]
beagle-0.2.3.ebuild.patch

Patch to ebuild that makes the following changes:
1. Keep the /var/lib/beagle/Backends directory; beagle expects it and emits a warning if it is missing.
2. Include a filter for ebuilds that extracts the package name, version, description, home page and license from the ebuild and adds them as properties in the index. /etc/beagle/crawl-portage (attached) crawls /usr/portage and this filter will extract the properties. I have a patch for kerry that will then display the title, version, description and home page in the search results.
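The kind of extraction described in item 2 can be sketched in shell. This is not FilterEbuild.cs (the attached filter is C#) and makes no claim about how it works internally. The helper names and the sample ebuild contents are invented for illustration, and only single-line VAR="value" assignments are handled.

```shell
#!/bin/sh
# Illustration only: this is not FilterEbuild.cs, just the same idea in shell.

# Package name from a file name like beagle-0.2.3.ebuild. The version is taken
# to start at the last "-" that is followed by a digit.
ebuild_name() {
    basename "$1" .ebuild | sed -E 's/^(.*)-([0-9].*)$/\1/'
}

# Version part of the same file name.
ebuild_version() {
    basename "$1" .ebuild | sed -E 's/^(.*)-([0-9].*)$/\2/'
}

# Value of a single-line VAR="value" assignment; multi-line values are not handled.
ebuild_var() {
    sed -n "s/^$2=\"\(.*\)\"[[:space:]]*\$/\1/p" "$1" | head -n 1
}

# Demo on a made-up ebuild; the field values are placeholders.
workdir=$(mktemp -d)
sample="$workdir/beagle-0.2.3.ebuild"
cat > "$sample" <<'EOF'
DESCRIPTION="Example desktop search tool"
HOMEPAGE="http://example.org/"
LICENSE="MIT"
EOF
echo "name: $(ebuild_name "$sample")"
echo "version: $(ebuild_version "$sample")"
for field in DESCRIPTION HOMEPAGE LICENSE; do
    echo "$field: $(ebuild_var "$sample" "$field")"
done
rm -rf "$workdir"
```

Real ebuilds often compute these variables or spread them over several lines, which a line-based sketch like this one would miss.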
Comment 19 Pat Double 2006-03-25 16:26:58 UTC
Created attachment 83119 [details]
FilterEbuild.cs

Beagle filter to extract info from ebuilds.
Comment 20 Luis Medinas (RETIRED) gentoo-dev 2006-03-28 10:12:02 UTC
Do we really need this filter in the ebuild? I don't think it is a good idea to add this filter to our ebuild. Of course, you can provide the filter for users who want to use it, and we can help you improve it and host it on our devspaces if you need.
Comment 21 Daniel Drake (RETIRED) gentoo-dev 2006-03-28 10:38:25 UTC
The ebuild filter is a great idea, I've had it on my todo-list for quite some time. Pat, many thanks for producing that. Rather than patch it in, I'd normally suggest that you get it applied upstream so that it will appear in the next release, but I see you have already done that - perfect!

I'll revisit the idea of getting the beagle-crawl-system stuff working by default sometime soon (bit low on time at the moment). Thanks again.
Comment 22 Luis Medinas (RETIRED) gentoo-dev 2006-03-28 10:46:28 UTC
Daniel, are you suggesting we should apply the filter in our ebuild?
About the crawl, I'll see what I can do with it.
Comment 23 Daniel Drake (RETIRED) gentoo-dev 2006-03-28 11:08:47 UTC
Pat's code is a generic beagle filter. Comparable to other generic beagle filters, such as the text filter, the PDF filter, ...

We will unconditionally be getting the ebuild filter in the next release since Pat got it reviewed and committed to the main beagle tree.

I'm not suggesting we patch it in, unless the next release gets heavily delayed.

BTW I think this makes ours the first distro package format which can be indexed by beagle :)
Comment 24 Luis Medinas (RETIRED) gentoo-dev 2006-03-28 11:48:37 UTC
Awesome!! I asked because maintaining a patch like this could be a burden if it weren't upstream. But since it will be, it's awesome.

Good work.
Comment 25 Pat Double 2006-03-28 11:49:32 UTC
(In reply to comment #21)
> The ebuild filter is a great idea, I've had it on my todo-list for quite some
> time. Pat, many thanks for producing that. Rather than patch it in, I'd
> normally suggest that you get it applied upstream so that it will appear in the
> next release, but I see you have already done that - perfect!
> 
> I'll revisit the idea of getting the beagle-crawl-system stuff working by
> default sometime soon (bit low on time at the moment). Thanks again.
> 

You're welcome. I get a large benefit from open source, so it's nice to give something back. What exactly are the issues with the crawl? The patch to the ebuild that is here also fixes that, at least on my system. There is a previous patch, without the filter, that only fixes the crawl.
Comment 26 Daniel Drake (RETIRED) gentoo-dev 2006-03-28 12:20:36 UTC
I just haven't had time to look over or think about it yet. The main questions are: should it be enabled by default (probably not)? And if not, what should the mechanism for enabling it be?
You may have addressed these already, I just haven't spent time looking at it yet. Stay tuned :)
Comment 27 Pat Double 2006-03-28 12:31:00 UTC
The difficulty I ran into (and others as well) was that beagle-build-index cannot be run as root and was being run as the user 'beagleindex', which did not exist, and the necessary path for the indices was not there. I addressed those. Funny thing is the cron job was installed, so it failed every time. I would agree that it should probably not be enabled by default. The crawl files have this line in them:

CRAWL_ENABLED="yes"

If the value is "no", the crawl file is ignored. This seems like the best way to enable or disable crawling: change the default from "yes" to "no" with a patch or sed, and use einfo to tell the user how to turn it back on if desired. That way no cron job or other setup needs to change, and the user can also control the individual indices.

My 2 cents ;)
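A minimal sketch of the sed approach suggested in this comment, assuming each crawl file carries a single CRAWL_ENABLED="..." line as quoted above. The function name set_crawl_enabled and the demo file are hypothetical. In an ebuild the same substitution would more likely be a one-line sed over the installed files.

```shell
#!/bin/sh
# Set CRAWL_ENABLED to "yes" or "no" in one crawl file, leaving other lines alone.
# Writes through a temporary file and copies back so the original file keeps
# its ownership and permissions.
set_crawl_enabled() {
    file=$1
    value=$2
    tmp=$(mktemp) || return 1
    sed "s/^CRAWL_ENABLED=.*/CRAWL_ENABLED=\"$value\"/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demo on a throwaway file standing in for a crawl file (sample contents only).
demo=$(mktemp)
cat > "$demo" <<'EOF'
CRAWL_ENABLED="yes"
CRAWL_CACHE_TEXT="no"
EOF
set_crawl_enabled "$demo" no
cat "$demo"
rm -f "$demo"
```

Because each crawl file has its own switch, the user can re-enable one index (for example documentation) without touching the others.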
Comment 28 Pat Double 2006-03-29 03:41:37 UTC
Created attachment 83362 [details]
FilterEbuild.cs

The FilterEbuild.cs that was committed to the beagle repository. It has style changes and a few other fixes suggested by the beagle devs. Attached here in case upstream takes a while to release and the devs decide to include it as a patch.
Comment 29 Pat Double 2006-04-07 12:29:04 UTC
New version of beagle available, 0.2.4. I'm not sure if I should update this bug or submit a new one, so I'll update this one.
Comment 30 Pat Double 2006-04-07 12:30:57 UTC
Created attachment 84153 [details]
beagle-0.2.4.ebuild

New beagle 0.2.4 ebuild. Removes the compile step for FilterEbuild.cs because the filter is now included in beagle 0.2.4. Makes the /etc/beagle/crawl-* files disabled by default. Please consider this ebuild, as it makes the crawling work again, and the latest release of beagle makes the crawling much faster.
Comment 31 Pat Double 2006-04-07 12:31:45 UTC
Created attachment 84154 [details]
/etc/beagle/crawl-portage

Crawls portage, disabled by default. This file adds "/var/db/pkg".
Comment 32 Pat Double 2006-04-07 14:15:45 UTC
Created attachment 84165 [details]
beagle-0.2.4.ebuild

With the security patch.
Comment 33 Pat Double 2006-04-07 14:16:05 UTC
Created attachment 84166 [details, diff]
beagle-0.2.4-CVE-2006-1296.patch

The security patch.
Comment 34 Arif Lukito 2006-04-07 19:50:55 UTC
What about disabling bludgeon by default?
Comment 35 Arif Lukito 2006-04-07 20:14:36 UTC
Oh, beagle now uses cron.daily instead of cron.d.
I think it's a good idea to delete /etc/cron.d/beagle-crawl-system.crontab automatically.
Comment 36 Luis Medinas (RETIRED) gentoo-dev 2006-04-08 06:44:22 UTC
I bumped beagle to 0.2.4 in portage. It's time to change the summary.
Thanks
Comment 37 Luis Medinas (RETIRED) gentoo-dev 2006-04-09 13:18:01 UTC
I added a patch to fix the crawl application and docs paths. Thanks, Pat, for your work; it's in portage.
Comment 38 Pat Double 2006-04-10 02:23:27 UTC
There are a couple of problems with the crawl patch.

1. /etc/beagle/crawl-applications: this will crawl the entire /usr/kde/* directories, and this crawl is meant to crawl .desktop files. Wouldn't it be better to crawl /usr/kde/*/share/applications ?

2. /etc/beagle/crawl-documentation: KDE documentation is left out entirely. Please add /usr/kde/*/share/doc

I can file another bug if you wish.

Thanks.
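To show what the narrower patterns requested in this comment would select, here is a small sketch that expands a /usr/kde/*/share/... style pattern against a directory tree. Whether the crawl files themselves expand wildcards is not established in this report, so this only illustrates the intended scope. The function name versioned_subdirs and the temporary tree are made up for the demo.

```shell
#!/bin/sh
# Print each existing directory matching <root>/*/<subdir>, one per line.
# An unmatched glob stays literal in POSIX sh, so the -d test filters that case too.
versioned_subdirs() {
    root=$1
    subdir=$2
    for dir in "$root"/*/"$subdir"; do
        [ -d "$dir" ] && printf '%s\n' "$dir"
    done
    return 0
}

# Demo tree standing in for /usr/kde with one version directory.
tree=$(mktemp -d)
mkdir -p "$tree/3.5/share/applications" "$tree/3.5/share/doc" "$tree/3.5/bin"
versioned_subdirs "$tree" share/applications
versioned_subdirs "$tree" share/doc
rm -rf "$tree"
```

Only the share/applications and share/doc directories are listed; the sibling bin directory, which a crawl of the whole version directory would also visit, is left out.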
Comment 39 Arif Lukito 2006-04-10 05:53:44 UTC
metalgod, if you're going to bump another revision, please include this patch: http://cvs.gnome.org/viewcvs/beagle/search/Search.cs?r1=1.19&r2=1.20&makepatch=1&diff_format=h
It fixes a crash in beagle-search.
Comment 40 Luis Medinas (RETIRED) gentoo-dev 2006-04-10 12:09:14 UTC
I made those changes in CVS. Next time, please open a new bug.
Yesterday I didn't add those paths because KDE users will probably want to use kerry instead of beagle, and I was not sure about those paths.
Thanks, both.