In scikit-learn 0.23.2, calling the predict() method of a maliciously crafted SVM model can result in a segmentation fault. Such models can be introduced via pickle, JSON, or any other model persistence format. The behaviour is triggered when libsvm.predict() is called and one of the members of the _n_support array has a very large value, for example 1000000.
Upstream appears not to care:
This is where it's out of scope here: we can't guard against everything. We have a responsibility to provide safe code when that code is used under the limits of what's a normal use-case, but that's pretty much it. Private attributes shouldn't be modified, and it's up to users to make sure that the estimator isn't maliciously altered.
I might go on a limb and use a poor analogy but when I buy a car, I can't complain that it breaks if I replace the steering wheel by a potato.
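For anyone who wants to do the user-side check upstream is alluding to, here is a minimal sketch. It assumes that, for a fitted SVM classifier, the entries of _n_support are non-negative and sum to the number of rows in support_vectors_. The helper names are hypothetical and not part of scikit-learn, and the objects at the bottom are plain stand-ins rather than real estimators. It only catches the specific inconsistency described in this bug; unpickling untrusted data can still execute arbitrary code.

```python
from types import SimpleNamespace


def n_support_is_consistent(n_support, n_support_vectors):
    """Return True if the per-class counts are plausible.

    Assumption: the counts are non-negative and add up to the total
    number of support vectors stored in the model.
    """
    counts = [int(c) for c in n_support]
    if any(c < 0 for c in counts):
        return False
    return sum(counts) == int(n_support_vectors)


def looks_untampered(model):
    """Hypothetical pre-predict() check for an SVM-like object."""
    n_support = getattr(model, "_n_support", None)
    support_vectors = getattr(model, "support_vectors_", None)
    if n_support is None or support_vectors is None:
        # Not a fitted SVM-like object; refuse rather than guess.
        return False
    return n_support_is_consistent(n_support, len(support_vectors))


# Stand-in objects that mimic only the two attributes inspected above.
good = SimpleNamespace(_n_support=[2, 3], support_vectors_=[[0.0]] * 5)
bad = SimpleNamespace(_n_support=[2, 1000000], support_vectors_=[[0.0]] * 5)

print(looks_untampered(good), looks_untampered(bad))
```

With a real estimator, the idea would be to call looks_untampered(model) right after loading and before any predict() call.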
The bug has been referenced in the following commit(s):
Author: Andrew Ammerlaan <firstname.lastname@example.org>
AuthorDate: 2021-05-29 17:41:14 +0000
Commit: Andrew Ammerlaan <email@example.com>
CommitDate: 2021-05-29 17:41:48 +0000
sci-libs/scikit-learn: drop 0.23.2
Package-Manager: Portage-3.0.19, Repoman-3.0.3
Signed-off-by: Andrew Ammerlaan <firstname.lastname@example.org>
sci-libs/scikit-learn/Manifest | 1 -
sci-libs/scikit-learn/scikit-learn-0.23.2.ebuild | 66 ------------------------
2 files changed, 67 deletions(-)
Andrew, is this vulnerability fixed by the versions now in tree?
(In reply to John Helmert III from comment #3)
> Andrew, is this vulnerability fixed by the versions now in tree?
It is, according to Repology: https://repology.org/project/python:scikit-learn/cves
Repology relies on CVE data for that, and CVE data isn't always trustworthy. Upstream didn't seem to have any interest in patching it, so let's assume the vulnerability is still present unless there are patches upstream.
Package list is empty or all packages have requested keywords.