Bug 314799 - >=app-dicts/wordnet-3.0-r2: morphstr() unable to handle lemmatization of irregular words
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High major
Assignee: No maintainer - Look at https://wiki.gentoo.org/wiki/Project:Proxy_Maintainers if you want to take care of it
 
Reported: 2010-04-12 12:15 UTC by Edgar Gonzàlez i Pellicer
Modified: 2016-07-26 22:38 UTC


Attachments
Patch for file lib/morph.c, fixing the indexing of structure exc_fps in function exc_lookup (wordnet-3.0-exc.patch, 757 bytes, patch)
2010-04-12 12:17 UTC, Edgar Gonzàlez i Pellicer
Description Edgar Gonzàlez i Pellicer 2010-04-12 12:15:28 UTC
The morphstr() function is unable to lemmatize irregular words.

Reproducible: Always

Steps to Reproduce:
1. Call morphstr() with an inflected form of any irregular noun or verb (e.g. 'knives').

Actual Results:
morphstr() returns NULL, meaning it failed to lemmatize the word.

Expected Results:
The return value should be the base form (in this case, 'knife').

The error does not occur in the unpatched version of WordNet-3.0 available from Princeton. After checking the patches that the ebuild applies, I have traced the error to this patch:

wordnet-3.0-CVE-2008-3908.patch

The patch changes the indexing of the array

(lib/morph.c:71)
static FILE *exc_fps[NUMPARTS];

from 1-based to 0-based. The indices in function do_init() are changed accordingly.

(lib/morph.c:156-158)
for (i = 0; i < NUMPARTS; i++) {
    snprintf(fname, sizeof(fname), EXCFILE, searchdir, partnames[i+1]);
    if ((exc_fps[i] = fopen(fname, "r")) == NULL) {
...

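However, exc_lookup() still indexes exc_fps with its parameter pos, which is 1-based. Besides producing wrong results (each lookup consults the exception file of the next part of speech), this can cause a segfault, since a pos equal to NUMPARTS reads one element past the end of the array.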

(lib/morph.c:373)
static char *exc_lookup(char *word, int pos)

(lib/morph.c:378-379)
    if (exc_fps[pos] == NULL)
	return(NULL);

(lib/morph.c:385)
	if ((excline = bin_search(word, exc_fps[pos])) != NULL) {
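
To make the mismatch concrete, here is a minimal, self-contained model of it. It is only a sketch: the array holds names instead of FILE* handles, NUMPARTS is assumed to be 4, and init_exc_files() and buggy_slot() are hypothetical stand-ins for do_init() and the indexing done in exc_lookup().

```c
#include <assert.h>
#include <stddef.h>

#define NUMPARTS 4  /* assumed value: noun, verb, adj, adv */

/* 1-based part-of-speech names (slot 0 unused), standing in for partnames[]. */
static const char *partnames[NUMPARTS + 1] = { "", "noun", "verb", "adj", "adv" };

/* Stand-in for exc_fps[]: holds names rather than FILE* handles. */
static const char *exc_files[NUMPARTS];

/* Mirrors the patched do_init(): slot i is filled from partnames[i + 1]. */
static void init_exc_files(void)
{
    int i;
    for (i = 0; i < NUMPARTS; i++)
        exc_files[i] = partnames[i + 1];
}

/* Mirrors the unfixed indexing: the 1-based pos is used as-is.
 * The assert keeps this model from reading out of bounds; the quoted
 * snippet from lib/morph.c shows no such check. */
static const char *buggy_slot(int pos)
{
    assert(pos >= 0 && pos < NUMPARTS);
    return exc_files[pos];
}
```

In this model, calling init_exc_files() and then buggy_slot(1) (the noun position) returns the "verb" entry, which matches the symptom that an irregular noun such as 'knives' is not found.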

The bug can be fixed by replacing both occurrences of exc_fps[pos] with exc_fps[pos-1]. I have tested this and it seems to work. A patch containing the change is attached.
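
For illustration, the same pos-1 conversion can be paired with a range check. This is a sketch only, not what the attached patch does (the patch changes the index and nothing else): exc_index() and exc_name_for() are hypothetical helpers, NUMPARTS is assumed to be 4, and the table holds names rather than FILE* handles.

```c
#include <stddef.h>

#define NUMPARTS 4  /* assumed value: noun, verb, adj, adv */

/* Stand-in for exc_fps[], indexed 0-based as after the CVE patch. */
static const char *exc_names[NUMPARTS] = { "noun", "verb", "adj", "adv" };

/* Hypothetical helper: map a 1-based pos to a 0-based array index,
 * or return -1 when pos is outside 1..NUMPARTS. */
static int exc_index(int pos)
{
    if (pos < 1 || pos > NUMPARTS)
        return -1;
    return pos - 1;
}

/* Hypothetical lookup: returns NULL for an out-of-range pos
 * instead of reading outside the array. */
static const char *exc_name_for(int pos)
{
    int idx = exc_index(pos);
    return idx < 0 ? NULL : exc_names[idx];
}
```

With this mapping, exc_name_for(1) yields the noun entry, and an out-of-range pos yields NULL rather than an out-of-bounds read.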
Comment 1 Edgar Gonzàlez i Pellicer 2010-04-12 12:17:02 UTC
Created attachment 227489
Patch for file lib/morph.c, fixing the indexing of structure exc_fps in function exc_lookup
Comment 2 Michael Orlitzky gentoo-dev 2016-07-26 22:38:20 UTC
Thank you for the patch! I have applied it to a new revision -- I hope this solves the problem.

commit a31726101fbd16c0520916c2e075012f36f13b93
Author: Michael Orlitzky <mjo@gentoo.org>
Date:   Tue Jul 26 18:35:15 2016 -0400

    app-dicts/wordnet: new revision fixing two bugs.

    This package is unmaintained and has two open bugs. The first has a
    patch, thanks to Edgar Gonzàlez i Pellicer, which fixes a problem
    introduced by an earlier patch. It is now applied. The second bug
    reports that the package's SRC_URI is no longer valid, so I have
    updated it from the homepage.

    The ebuild was updated to EAPI=6 in the process. This allowed the
    removal of multilib.eclass in exchange for a call to eapply_user.

    Gentoo-Bug: 314799
    Gentoo-Bug: 543946

    Package-Manager: portage-2.2.28