Bug 528960 - dev-lang/php - segmentation fault in preg_replace()
Summary: dev-lang/php - segmentation fault in preg_replace()
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Development
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Duplicates: 481216
Depends on:
Blocks:
 
Reported: 2014-11-11 18:03 UTC by Zoltán Halassy
Modified: 2023-05-16 19:48 UTC (History)
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Adds disable-stack-for-recursion USE flag (disable_stack.patch,1.17 KB, patch)
2014-11-13 10:15 UTC, Zoltán Halassy

Description Zoltán Halassy 2014-11-11 18:03:32 UTC
Box A is a desktop Linux box with PHP 5.6.1 and libpcre 8.33.
Box B is a Hardened Linux box with PHP 5.5.18 and libpcre 8.35.

This little test script SegFaults on both:

<?php
$str = '/*'.str_repeat(' ',17000);
echo strlen($str)."\n";
preg_replace('@/\\*([^*]|\\*[^/])*\\*+/@','',$str);

If I replace preg_replace with mb_ereg_replace (and remove the @ delimiters), the script does not SegFault.

Reproducible: Always

Steps to Reproduce:
Run the script above.
Actual Results:  
Segmentation Fault

Expected Results:  
Shouldn't SegFault.

I don't think a ~17k string should cause havoc in preg_replace().

I will provide a stack trace later.
Comment 1 Zoltán Halassy 2014-11-11 18:19:42 UTC
Judging by the stack trace, libpcre recurses too deeply:

#0  0x00007ffff4f01c56 in ?? () from /lib64/libpcre.so.1
#1  0x00007ffff4f0f2c2 in ?? () from /lib64/libpcre.so.1
#2  0x00007ffff4f0f993 in ?? () from /lib64/libpcre.so.1
...
#10893 0x00007ffff4f0f2c2 in ?? () from /lib64/libpcre.so.1
#10894 0x00007ffff4f0f993 in ?? () from /lib64/libpcre.so.1
#10895 0x00007ffff4f03d5e in ?? () from /lib64/libpcre.so.1
#10896 0x00007ffff4f135b7 in pcre_exec () from /lib64/libpcre.so.1
#10897 0x00000000004cc30d in php_pcre_replace_impl ()
#10898 0x00000000004cd233 in ?? ()
#10899 0x00000000004cd7f9 in ?? ()
#10900 0x00000000007f6a22 in ?? ()
#10901 0x00000000007bd098 in execute_ex ()
#10902 0x0000000000755c09 in zend_execute_scripts ()
#10903 0x00000000006f32ff in php_execute_script ()
#10904 0x00000000007f9c59 in ?? ()
#10905 0x000000000048376f in main ()

Is this normal? Shouldn't an application - like this one - prefer iteration over recursion?
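
For what it's worth, the recursion depth also depends on how the pattern is written. PCRE's pcrestack documentation suggests replacing a group that consumes one character per iteration with a possessive repeat inside the group, so that a long run of characters is consumed in a single pass through the group. The sketch below applies that idea to the comment-stripping pattern from the description. The rewritten pattern is an assumption for illustration, not something taken from this report, and it treats "*" slightly differently from the original (it uses a lookahead instead of consuming the character after the star).

```php
<?php
// Hypothetical rewrite of the pattern from the description (illustration only).
// [^*]++ swallows a whole run of non-star characters in one group iteration,
// and the possessive *+ stops the group from being backtracked into.
$pattern = '@/\\*(?:[^*]++|\\*(?!/))*+\\*/@';

// Same input as in the description: an unterminated comment plus 17000 spaces.
$str = '/*' . str_repeat(' ', 17000);
$result = preg_replace($pattern, '', $str);

// There is no complete comment in $str, so nothing should be removed.
var_dump($result === $str);

// A terminated comment is still stripped.
echo preg_replace($pattern, '', '/* a */ b'), "\n";
```

This only sidesteps the deep recursion for this particular pattern. It does not change how libpcre itself uses the stack.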
Comment 2 Zoltán Halassy 2014-11-11 18:42:45 UTC
Oh, okay, to sum it up, this is the default behaviour because it's faster. Cool. So, we have a fast implementation that crashes, and a slower one that wouldn't, but which is not supported by the build system.

There seems to be a configure option:

--disable-stack-for-recursion

I think it would be nice to be able to pass that option to libpcre. But it's not that essential, since mb_ereg_replace seems to work that way, so the developer can choose between a fast solution and a safe one.
Comment 3 Michael Orlitzky gentoo-dev 2014-11-12 16:49:36 UTC
Upstream bug, for the curious:

  https://bugs.php.net/bug.php?id=64046

I tried emerging libpcre with USE=jit (per bug #514454) but it didn't seem to help.
Comment 4 Zoltán Halassy 2014-11-13 09:19:13 UTC
Well, to me this looks like a libpcre "bug" rather than a PHP bug. A disable-stack-for-recursion USE flag would be nice to have (and perhaps enabled by default in the hardened profile). There are a lot of cases where preg_match/preg_replace are used to filter *user* input. I don't think *any* user input should affect the stack depth in this way (at least not linearly; logarithmic growth should be fine).
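
Until libpcre is built differently, a script that filters user input can at least try to turn the crash into an error it can handle. PHP has a pcre.recursion_limit ini setting, and preg_last_error() reports PREG_RECURSION_LIMIT_ERROR when that limit is hit. The following is a minimal sketch under assumptions: the helper name guarded_preg_replace and the limit of 1000 are made up for illustration, and whether a given limit keeps the stack small enough depends on the stack size of the machine.

```php
<?php
// Hypothetical helper (not part of PHP or of this report): run preg_replace()
// under a lowered pcre.recursion_limit and return null on any PCRE error.
// The default limit of 1000 is an arbitrary assumption.
function guarded_preg_replace($pattern, $replacement, $subject, $limit = 1000)
{
    // ini_set() returns the previous value as a string, or false on failure.
    $old = ini_set('pcre.recursion_limit', (string) $limit);
    $result = preg_replace($pattern, $replacement, $subject);
    $error = preg_last_error();
    if ($old !== false) {
        ini_set('pcre.recursion_limit', $old);  // restore the previous limit
    }
    if ($result === null || $error !== PREG_NO_ERROR) {
        return null;  // caller can inspect preg_last_error() for the reason
    }
    return $result;
}

// Usage with the input from the description.
$str = '/*' . str_repeat(' ', 17000);
$clean = guarded_preg_replace('@/\\*([^*]|\\*[^/])*\\*+/@', '', $str);
if ($clean === null) {
    echo "pattern gave up, PCRE error code ", preg_last_error(), "\n";
}
```

The trade-off is that legitimate long inputs may now be rejected, so the caller has to decide what a null result means for its input filter.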
Comment 5 Zoltán Halassy 2014-11-13 10:09:10 UTC
Compiled libpcre with --disable-stack-for-recursion. The engine is now 2.8 times slower with the regular expression above. This is sad, because mb_ereg_replace handles the same expression 4 times faster than the stack-using libpcre, and does not even crash with bigger inputs.
Comment 6 Zoltán Halassy 2014-11-13 10:15:20 UTC
Created attachment 389224
Adds disable-stack-for-recursion USE flag

Adds the option to build libpcre so that it does not rely on the stack size. This slows down PCRE severely.
Comment 7 Michael Orlitzky gentoo-dev 2016-07-12 15:39:21 UTC
Here's a simple test case that should print "OK":

  <?php
  $input='<span>'.str_repeat('X', 10500).'</span>';
  $output = preg_replace("/<span>(((?!(<\/span>)).)*)<\/span>/",
  		       "BEGIN \\1 END"  ,$input);

  echo "OK\n";
  ?>

(run it with "php" on the command line).

Compiling libpcre with --disable-stack-for-recursion fixes the segfault. Would something like that be appropriate behind USE=hardened? It does slow things down, but a lot of the time USE=hardened means "make this safe and slow."
Comment 8 Michael Orlitzky gentoo-dev 2016-07-14 15:57:51 UTC
*** Bug 481216 has been marked as a duplicate of this bug. ***
Comment 9 Michael Orlitzky gentoo-dev 2023-05-16 19:48:43 UTC
PHP is now using libpcre2, and I can no longer reproduce the bug. Without a reproducible test case, there's not much we could do at this point, so let's be optimistic and agree to assume that the bug was fixed.