Bug 951737 - >=dev-ruby/commonmarker-2.1.1: uses bundled dev-libs/oniguruma (crashes with system copy)
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Ruby Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: bundled-libs
Reported: 2025-03-21 22:48 UTC by Sam James
Modified: 2025-03-21 22:49 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Description Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:02 UTC
This comes from https://github.com/gentoo/gentoo/pull/41130. Filing this for completeness and for future reference.

commonmarker-2.1.1 has been RIIR'd (rewritten in Rust), and with RUSTONIG_SYSTEM_LIBONIG=1 the tests segfault.

I'm going to recreate all my comments on GH here.
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:06 UTC
First go:
```
/var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1 $ gdb --args ruby -Ilib:test:. -e 'Dir["test/*_test.rb"].each {|f| require f}'
...............
Thread 1 "ruby" received signal SIGSEGV, Segmentation fault.
0x00007ffff760ab01 in main_arena () from /usr/lib64/libc.so.6
(gdb) bt
#0  0x00007ffff760ab01 in main_arena () from /usr/lib64/libc.so.6
#1  0x0000000000000001 in ?? ()
#2  0x0000000000000001 in ?? ()
#3  0x0000000000000004 in ?? ()
#4  0x0000000000000001 in ?? ()
#5  0x0000000000000003 in ?? ()
#6  0x00007fffdbd8bbc0 in OnigEncodingGB18030 () from /usr/lib64/libonig.so.5
#7  0x0000000000000001 in ?? ()
#8  0x00007fffdba431b6 in core::core_arch::x86::m128iExt::as_i8x16 () from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#9  0x00005555558fcb10 in ?? ()
#10 0x0000000000000004 in ?? ()
#11 0x00007fffdb9acd91 in <Q as hashbrown::Equivalent<K>>::equivalent ()
   from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#12 0x00007fffdb9b5a44 in hashbrown::map::equivalent_key::{{closure}} ()
   from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#13 0x00007fffdb979647 in hashbrown::raw::RawTable<T,A>::find::{{closure}} ()
   from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#14 0x00007fffdb979381 in hashbrown::raw::RawTable<T,A>::find () from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#15 0x00007fffdb9b64d6 in hashbrown::map::HashMap<K,V,S,A>::get_inner ()
   from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#16 0x00007fffdb979eca in std::collections::hash::map::HashMap<K,V,S>::get ()
   from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#17 0x00007fffdb98cd5d in syntect::parsing::scope::ScopeRepository::atom_to_index ()
   from /var/tmp/portage/dev-ruby/commonmarker-2.1.1/work/ruby32/commonmarker-2.1.1/lib/commonmarker/commonmarker.so
#18 0x0000000000000000 in ?? ()
```

Rebuilding with more debug symbols.
Comment 2 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:14 UTC
OK, reproduced it manually:
```
~/bugs/commonmarker $ valgrind --vgdb-error=1 --suppressions=/tmp/supp ruby34 --disable-jit -Ilib:test:. -e 'Dir["test/*_test.rb"].each {|f| require f}'
==2680191== Memcheck, a memory error detector
==2680191== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==2680191== Using Valgrind-3.25.0.GIT and LibVEX; rerun with -h for copyright info
==2680191== Command: ruby34 --disable-jit -Ilib:test:. -e Dir["test/*_test.rb"].each\ {|f|\ require\ f}
==2680191==
==2680191==
==2680191== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==2680191==   /path/to/gdb ruby34
==2680191== and then give GDB the following command
==2680191==   target remote | /usr/libexec/valgrind/../../bin/vgdb --pid=2680191
==2680191== --pid is optional if only one valgrind process is running
==2680191==
==2680191== Warning: set address range perms: large range [0x6e7e000, 0x1ee7e000) (defined)
Run options: --seed 48819

# Running:

.................................................==2680191== Use of uninitialised value of size 8
==2680191==    at 0x22753C7D: match_at (regexec.c:3069)
==2680191==    by 0x227590D6: search_in_range (regexec.c:5713)
==2680191==    by 0x2275A58E: onig_search_with_param (regexec.c:5839)
==2680191==    by 0x224422F6: onig::Regex::search_with_param (lib.rs:723)
==2680191==    by 0x223F1804: syntect::parsing::regex::regex_impl::Regex::search (regex.rs:174)
==2680191==    by 0x223D7750: syntect::parsing::regex::Regex::search (regex.rs:64)
==2680191==    by 0x22419015: syntect::parsing::parser::ParseState::search (parser.rs:449)
==2680191==    by 0x22418850: syntect::parsing::parser::ParseState::find_best_match (parser.rs:374)
==2680191==    by 0x224172FA: syntect::parsing::parser::ParseState::parse_next_token (parser.rs:274)
==2680191==    by 0x22416F5B: syntect::parsing::parser::ParseState::parse_line (parser.rs:240)
==2680191==    by 0x223C2101: syntect::easy::HighlightLines::highlight_line (easy.rs:68)
==2680191==    by 0x2231115C: comrak::plugins::syntect::SyntectAdapter::highlight_html (syntect.rs:46)
==2680191==
==2680191== (action on error) vgdb me ...
```

Then w/ (v)gdb:
```
#0  0x0000000022753c7d in match_at (reg=reg@entry=0x2199f300, str=str@entry=0x240f5580 "puts 'wow'\nd", end=end@entry=0x240f558b "d", in_right_range=in_right_range@entry=0x240f558b "d", sstart=sstart@entry=0x240f5580 "puts 'wow'\nd", msa=msa@entry=0x1ffeff7540) at /usr/src/debug/dev-libs/oniguruma-6.9.10/onig-6.9.10/src/regexec.c:3069
#1  0x00000000227590d7 in search_in_range (reg=0x2199f300, str=0x240f5580 "puts 'wow'\nd", end=0x240f558b "d", start=<optimized out>, range=0x240f558b "d", data_range=0x240f558b "d", region=0x1ffeff90e0, option=0, mp=0x21461e40) at /usr/src/debug/dev-libs/oniguruma-6.9.10/onig-6.9.10/src/regexec.c:5713
#2  0x000000002275a58f in onig_search_with_param (reg=<optimized out>, str=<optimized out>, end=<optimized out>, start=<optimized out>, range=<optimized out>, region=<optimized out>, option=0, mp=0x21461e40) at /usr/src/debug/dev-libs/oniguruma-6.9.10/onig-6.9.10/src/regexec.c:5839
#3  0x00000000224422f7 in onig::Regex::search_with_param<&str> (self=0x231fd840, chars="puts 'wow'\n", from=0, to=11, options=..., region=..., match_param=...) at src/lib.rs:723
#4  0x00000000223f1805 in syntect::parsing::regex::regex_impl::Regex::search (self=0x231fd840, text="puts 'wow'\n", begin=0, end=11, region=...) at src/parsing/regex.rs:174
#5  0x00000000223d7751 in syntect::parsing::regex::Regex::search (self=0x231fd820, text="puts 'wow'\n", begin=0, end=11, region=...) at src/parsing/regex.rs:64
#6  0x0000000022419016 in syntect::parsing::parser::ParseState::search (self=0x1ffeff9660, line="puts 'wow'\n", start=0, match_pat=0x231fd800, captures=..., search_cache=0x1ffeff9100, regions=0x1ffeff90e0) at src/parsing/parser.rs:449
#7  0x0000000022418851 in syntect::parsing::parser::ParseState::find_best_match (self=0x1ffeff9660, line="puts 'wow'\n", start=0, syntax_set=0x1ffeffd8b0, search_cache=0x1ffeff9100, regions=0x1ffeff90e0, check_pop_loop=false) at src/parsing/parser.rs:374
#8  0x00000000224172fb in syntect::parsing::parser::ParseState::parse_next_token (self=0x1ffeff9660, line="puts 'wow'\n", syntax_set=0x1ffeffd8b0, start=0x1ffeff8fe0, search_cache=0x1ffeff9100, regions=0x1ffeff90e0, non_consuming_push_at=0x1ffeff9120, ops=0x1ffeff8fe8) at src/parsing/parser.rs:274
#9  0x0000000022416f5c in syntect::parsing::parser::ParseState::parse_line (self=0x1ffeff9660, line="puts 'wow'\n", syntax_set=0x1ffeffd8b0) at src/parsing/parser.rs:240
#10 0x00000000223c2102 in syntect::easy::HighlightLines::highlight_line (self=0x1ffeff9628, line="puts 'wow'\n", syntax_set=0x1ffeffd8b0) at src/easy.rs:68
#11 0x000000002231115d in comrak::plugins::syntect::SyntectAdapter::highlight_html (self=0x1ffeffd8b0, code="puts 'wow'\n", syntax=0x235fff10) at src/plugins/syntect.rs:46
#12 0x0000000022311563 in comrak::plugins::syntect::{impl#1}::write_highlighted (self=0x1ffeffd8b0, output=..., lang=..., code="puts 'wow'\n") at src/plugins/syntect.rs:94
#13 0x00000000222de262 in comrak::html::HtmlFormatter::format_node (self=0x1ffeffd418, node=0x220e88b0, entering=true) at src/html.rs:635
#14 0x00000000222da6f9 in comrak::html::HtmlFormatter::format (self=0x1ffeffd418, node=0x1ffeffdaa8, plain=false) at src/html.rs:403
#15 0x00000000222d6a53 in comrak::html::format_document_with_plugins (root=0x1ffeffdaa8, options=0x1ffeffd648, output=..., plugins=0x1ffeffd890) at src/html.rs:40
#16 0x0000000022236288 in commonmarker::node::CommonmarkerNode::to_html (self=0x218c3cf0, args=&[magnus::value::Value](size=1) = {...}) at ext/commonmarker/src/node.rs:1028
#17 0x0000000022290a33 in core::ops::function::Fn::call<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, (&commonmarker::node::CommonmarkerNode, &[magnus::value::Value])> () at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/ops/function.rs:79
#18 0x0000000022226ee2 in magnus::method::MethodCAry::call_convert_value<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, &commonmarker::node::CommonmarkerNode, core::result::Result<alloc::string::String, magnus::error::Error>> (self=0x53983d000, argc=1, argv=0x5398ac8, rb_self=...) at /home/sam/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/magnus-0.7.1/src/method.rs:601
#19 0x000000002228ec33 in magnus::method::MethodCAry::call_handle_error::{closure#0}<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, &commonmarker::node::CommonmarkerNode, core::result::Result<alloc::string::String, magnus::error::Error>> () at /home/sam/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/magnus-0.7.1/src/method.rs:607
#20 0x00000000222a6f30 in core::panic::unwind_safe::{impl#23}::call_once<core::result::Result<magnus::value::Value, magnus::error::Error>, magnus::method::MethodCAry::call_handle_error::{closure_env#0}<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, &commonmarker::node::CommonmarkerNode, core::result::Result<alloc::string::String, magnus::error::Error>>> (self=...) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:272
#21 0x000000002225ffd0 in std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<magnus::method::MethodCAry::call_handle_error::{closure_env#0}<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, &commonmarker::node::CommonmarkerNode, core::result::Result<alloc::string::String, magnus::error::Error>>>, core::result::Result<magnus::value::Value, magnus::error::Error>> (data=0x1ffeffe110) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/std/src/panicking.rs:584
#22 0x00000000222a6a0b in __rust_try () from /home/sam/bugs/commonmarker/lib/commonmarker/commonmarker.so
#23 0x00000000222a321d in std::panicking::try<core::result::Result<magnus::value::Value, magnus::error::Error>, core::panic::unwind_safe::AssertUnwindSafe<magnus::method::MethodCAry::call_handle_error::{closure_env#0}<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, &commonmarker::node::CommonmarkerNode, core::result::Result<alloc::string::String, magnus::error::Error>>>> (f=...) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/std/src/panicking.rs:547
#24 std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<magnus::method::MethodCAry::call_handle_error::{closure_env#0}<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, &commonmarker::node::CommonmarkerNode, core::result::Result<alloc::string::String, magnus::error::Error>>>, core::result::Result<magnus::value::Value, magnus::error::Error>> (f=...) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/std/src/panic.rs:358
#25 0x0000000022226bd4 in magnus::method::MethodCAry::call_handle_error<fn(&commonmarker::node::CommonmarkerNode, &[magnus::value::Value]) -> core::result::Result<alloc::string::String, magnus::error::Error>, &commonmarker::node::CommonmarkerNode, core::result::Result<alloc::string::String, magnus::error::Error>> (self=0x1ffeffe30000, argc=1, argv=0x5398ac8, rb_self=...) at /home/sam/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/magnus-0.7.1/src/method.rs:606
#26 0x0000000022239297 in commonmarker::node::init::anon (argc=1, argv=0x5398ac8, rb_self=...) at /home/sam/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/magnus-0.7.1/src/method.rs:850
#27 0x0000000004bfe637 in vm_call_cfunc_with_frame_ (ec=0x53983d0, reg_cfp=0x5498278, calling=<optimized out>, argc=<optimized out>, argv=0x5398ac8, stack_bottom=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_insnhelper.c:3801
#28 0x0000000004bfd51f in vm_sendish.constprop.0 (ec=<optimized out>, reg_cfp=<optimized out>, cd=<optimized out>, block_handler=<optimized out>, method_explorer=mexp_search_method) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_insnhelper.c:5968
#29 0x0000000004c09daf in vm_exec_core (ec=0x227ce200 <FinishCode.1>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/insns.def:898
#30 0x0000000004c1f23c in vm_exec_loop (ec=<optimized out>, state=<optimized out>, tag=<optimized out>, result=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:2622
#31 rb_vm_exec (ec=0x53983d0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:2598
#32 0x0000000004c0ffc7 in vm_yield_with_cref (ec=<optimized out>, argc=1, argv=0x1ffeffe758, kw_splat=0, cref=0x0, is_lambda=0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:1676
#33 vm_yield (ec=<optimized out>, argc=1, argv=0x1ffeffe758, kw_splat=0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:1684
#34 rb_yield_0 (argc=1, argv=0x1ffeffe758) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_eval.c:1344
#35 rb_yield (val=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_eval.c:1360
#36 0x00000000048ce8cd in rb_ary_each (ary=679977560) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/array.c:2641
#37 0x0000000004bfe637 in vm_call_cfunc_with_frame_ (ec=0x53983d0, reg_cfp=0x54984e0, calling=<optimized out>, argc=<optimized out>, argv=0x5398900, stack_bottom=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_insnhelper.c:3801
#38 0x0000000004bfd51f in vm_sendish.constprop.0 (ec=<optimized out>, reg_cfp=<optimized out>, cd=<optimized out>, block_handler=<optimized out>, method_explorer=mexp_search_method) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_insnhelper.c:5968
#39 0x0000000004c0a79b in vm_exec_core (ec=0x227ce200 <FinishCode.1>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/insns.def:851
#40 0x0000000004c1f23c in vm_exec_loop (ec=<optimized out>, state=<optimized out>, tag=<optimized out>, result=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:2622
#41 rb_vm_exec (ec=0x53983d0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:2598
#42 0x0000000004c0ffc7 in vm_yield_with_cref (ec=<optimized out>, argc=1, argv=0x1ffeffebd8, kw_splat=0, cref=0x0, is_lambda=0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:1676
#43 vm_yield (ec=<optimized out>, argc=1, argv=0x1ffeffebd8, kw_splat=0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:1684
#44 rb_yield_0 (argc=1, argv=0x1ffeffebd8) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_eval.c:1344
#45 rb_yield (val=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_eval.c:1360
#46 0x00000000048cec6c in rb_ary_collect (ary=599776640) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/array.c:3645
#47 0x0000000004bfe637 in vm_call_cfunc_with_frame_ (ec=0x53983d0, reg_cfp=0x5498630, calling=<optimized out>, argc=<optimized out>, argv=0x53987e0, stack_bottom=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_insnhelper.c:3801
#48 0x0000000004bfd51f in vm_sendish.constprop.0 (ec=<optimized out>, reg_cfp=<optimized out>, cd=<optimized out>, block_handler=<optimized out>, method_explorer=mexp_search_method) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm_insnhelper.c:5968
#49 0x0000000004c0a79b in vm_exec_core (ec=0x227ce200 <FinishCode.1>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/insns.def:851
#50 0x0000000004c1f23c in vm_exec_loop (ec=<optimized out>, state=<optimized out>, tag=<optimized out>, result=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:2622
#51 rb_vm_exec (ec=0x53983d0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:2598
#52 0x00000000049afa4e in rb_vm_invoke_proc (ec=<optimized out>, proc=<optimized out>, argc=<optimized out>, argv=<optimized out>, kw_splat=0, passed_block_handler=0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/vm.c:1770
#53 rb_proc_call_kw (self=<optimized out>, args=<optimized out>, kw_splat=0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/proc.c:988
#54 rb_proc_call (self=568030360, args=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/proc.c:998
#55 rb_call_end_proc (data=568030360) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/eval_jump.c:13
#56 0x00000000049b51ce in exec_end_procs_chain (procs=0x4f77d88 <end_procs.lto_priv>, errp=0x5398440) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/eval_jump.c:105
#57 rb_ec_exec_end_proc (ec=ec@entry=0x53983d0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/eval_jump.c:121
#58 0x00000000049b72b4 in rb_ec_teardown (ec=0x53983d0) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/eval.c:155
#59 0x00000000049b8882 in rb_ec_cleanup (ec=0x53983d0, ex=RUBY_TAG_NONE) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/eval.c:207
#60 0x00000000049b94f0 in ruby_run_node (n=<optimized out>) at /usr/src/debug/dev-lang/ruby-3.4.2/ruby-3.4.2/eval.c:319
#61 0x00000000001083f1 in rb_main (argc=5, argv=0x1ffefff6d8) at ./main.c:43
#62 main (argc=<optimized out>, argv=<optimized out>) at ./main.c:68
```
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:24 UTC
I don't get where the uninitialised use is supposed to be.

Poking at each of the variables in `frame 0`, they all look fine. The only funky thing is `end`, where Valgrind's `monitor` can't tell me anything useful:
```
(gdb) monitor xb 0x240f558b 8
                  __      __      __      __      __      __      __      __
0x240F558B:     0x??    0x??    0x??    0x??    0x??    0x??    0x??    0x??
Address 0x240F558B len 8 has 8 bytes unaddressable
(gdb) p 0x240f558b
$30 = 604984715
(gdb) p *0x240f558b
$31 = 100
(gdb) p (char*)0x240f558b
$32 = 0x240f558b "d"
```
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:32 UTC
Maybe it's `msa`?

```
(gdb) p msa
$33 = (MatchArg *) 0x1ffeff7540
(gdb) p *msa
$34 = {
  stack_p = 0x0,
  stack_n = 11,
  options = 0,
  region = 0x1ffeff90e0,
  ptr_num = 2,
  start = 0x240f5580 "puts 'wow'\nd",
  match_stack_limit = 0,
  retry_limit_in_match = 10000000,
  retry_limit_in_search = 0,
  retry_limit_in_search_counter = 0,
  mp = 0x21461e40,
  best_len = -1,
  best_s = 0xb <error: Cannot access memory at address 0xb>,
  subexp_call_in_search_counter = 0,
  skip_search = 0x240f5580 "puts 'wow'\nd"
}
(gdb) p sizeof(*msa)
$36 = 112
(gdb) monitor xb 0x1ffeff7540 112
                  00      00      00      00      00      00      00      00
0x1FFEFF7540:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  ff      ff      ff      ff      00      00      00      00
0x1FFEFF7548:   0x0b    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF7550:   0xe0    0x90    0xff    0xfe    0x1f    0x00    0x00    0x00
                  00      00      00      00      ff      ff      ff      ff
0x1FFEFF7558:   0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF7560:   0x80    0x55    0x0f    0x24    0x00    0x00    0x00    0x00
                  00      00      00      00      ff      ff      ff      ff
0x1FFEFF7568:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF7570:   0x80    0x96    0x98    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF7578:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF7580:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF7588:   0x40    0x1e    0x46    0x21    0x00    0x00    0x00    0x00
                  00      00      00      00      ff      ff      ff      ff
0x1FFEFF7590:   0xff    0xff    0xff    0xff    0x00    0x00    0x00    0x00
                  ff      ff      ff      ff      ff      ff      ff      ff
0x1FFEFF7598:   0x0b    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF75A0:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF75A8:   0x80    0x55    0x0f    0x24    0x00    0x00    0x00    0x00
(gdb)
```
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:39 UTC
Carrying on from there:
```
==2680191== Continuing ...
==2680191== Jump to the invalid address stated on the next line
==2680191==    at 0x10000E003B0001: ???
==2680191==    by 0x227590D6: search_in_range (regexec.c:5713)
==2680191==    by 0x2275A58E: onig_search_with_param (regexec.c:5839)
==2680191==    by 0x224422F6: onig::Regex::search_with_param (lib.rs:723)
==2680191==    by 0x223F1804: syntect::parsing::regex::regex_impl::Regex::search (regex.rs:174)
==2680191==    by 0x223D7750: syntect::parsing::regex::Regex::search (regex.rs:64)
==2680191==    by 0x22419015: syntect::parsing::parser::ParseState::search (parser.rs:449)
==2680191==    by 0x22418850: syntect::parsing::parser::ParseState::find_best_match (parser.rs:374)
==2680191==    by 0x224172FA: syntect::parsing::parser::ParseState::parse_next_token (parser.rs:274)
==2680191==    by 0x22416F5B: syntect::parsing::parser::ParseState::parse_line (parser.rs:240)
==2680191==    by 0x223C2101: syntect::easy::HighlightLines::highlight_line (easy.rs:68)
==2680191==    by 0x2231115C: comrak::plugins::syntect::SyntectAdapter::highlight_html (syntect.rs:46)
==2680191==  Address 0x10000e003b0001 is not stack'd, malloc'd or (recently) free'd
==2680191==
==2680191== (action on error) vgdb me ...
```

```

Thread 1 received signal SIGTRAP, Trace/breakpoint trap.
0x0010000e003b0001 in ?? ()
(gdb) bt
#0  0x0010000e003b0001 in ?? ()
#1  0x0000001ffeff6564 in ?? ()
#2  0x0000000000000004 in ?? ()
#3  0x0000000000000003 in ?? ()
#4  0x00000000227ce200 in RetryLimitInMatch () from /usr/lib64/libonig.so.5
#5  0x0000001ffeff6564 in ?? ()
#6  0x0000000000000004 in ?? ()
#7  0x00000000247caad3 in ?? ()
#8  0x0000001ffeff65f7 in ?? ()
#9  0x0000000000000001 in ?? ()
#10 0x0000000000000001 in ?? ()
#11 0x0000000000000001 in ?? ()
#12 0x0000000000000001 in ?? ()
#13 0x0000000000000001 in ?? ()
#14 0x0000000000000001 in ?? ()
#15 0x0000000000000001 in ?? ()
#16 0x00000000224b6476 in core::slice::index::{impl#4}::get_unchecked_mut<u8> (slice=...) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/ub_checks.rs:75
#17 core::slice::index::{impl#7}::get_unchecked_mut<u8> (slice=...) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/slice/index.rs:555
#18 core::slice::index::{impl#7}::index_mut<u8> (self=..., slice=&mut [u8](size=1) = {...}) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/slice/index.rs:573
#19 0x00000000224b9831 in core::slice::index::{impl#1}::index_mut<u8, core::ops::range::RangeFrom<usize>> (self=&mut [u8](size=576437034) = {...}, index=...)
    at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/slice/index.rs:27
#20 miniz_oxide::inflate::stream::push_dict_out (state=0x247c7e90, next_out=0x1ffeff6150) at src/inflate/stream.rs:371
#21 0x00000000224b9131 in miniz_oxide::inflate::stream::inflate (state=0x247c7e90, input=&[u8](size=0), output=&mut [u8](size=1) = {...}, flush=miniz_oxide::MZFlush::Finish)
    at src/inflate/stream.rs:272
#22 0x00000000224a9ad4 in flate2::ffi::rust::{impl#2}::decompress (self=0x1ffeff8cf8, input=&[u8](size=0), output=&mut [u8](size=1) = {...}, flush=flate2::mem::FlushDecompress::Finish)
    at src/ffi/rust.rs:72
#23 0x00000000224c32d2 in core::slice::index::{impl#4}::get_unchecked<u8> (slice=&[u8](size=0)) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/ub_checks.rs:75
#24 core::slice::index::{impl#7}::get_unchecked<u8> (slice=&[u8](size=0)) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/slice/index.rs:549
#25 core::slice::index::{impl#7}::index<u8> (self=..., slice=&[u8](size=0)) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/slice/index.rs:564
#26 0x000000002241c241 in core::slice::index::{impl#0}::index<u8, core::ops::range::RangeFrom<usize>> (self=&[u8](size=0), index=...)
    at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/core/src/slice/index.rs:16
#27 std::io::impls::{impl#9}::consume (self=0x1ffeff8ce8, amt=0) at /usr/lib/rust/1.85.1/lib/rustlib/src/rust/library/std/src/io/impls.rs:352
#28 0x00000000223fa6e9 in flate2::zio::read<&[u8], flate2::mem::Decompress> (obj=0x1, data=0x1, dst=&mut [u8](size=575366262) = {...})
    at /home/sam/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/flate2-1.0.35/src/zio.rs:139
#29 0x0000000000000001 in ?? ()
#30 0x0000001ffeff65f7 in ?? ()
#31 0x0000000000000001 in ?? ()
#32 0x0000000000000000 in ?? ()
```

It jumps from:
```
    while (1 == 1) {
      MATCH_AND_RETURN_CHECK(data_range); /* <-- here */
      if (s >= range) break;
      s += enclen(reg->enc, s);
```
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:47 UTC
```
(gdb) frame 0
#0  0x0000000022762ed0 in match_at (reg=0x20e1e730, str=0x5726340 "puts \"hello\"\nllo\"\n", end=0x572634d "llo\"\n", in_right_range=0x572634d "llo\"\n",
    sstart=0x5726340 "puts \"hello\"\nllo\"\n", msa=0x1ffeff7720) at /usr/src/debug/dev-libs/oniguruma-6.9.10/onig-6.9.10/src/regexec.c:3078
3078      BYTECODE_INTERPRETER_START {
(gdb) p p
$29 = (Operation *) 0x20e5f3c0
(gdb) p p.opaddr
$30 = (const void *) 0x300000001
(gdb) p opcode_to_label[95]
$42 = (const void *) 0x500000001
```

```
#ifdef USE_DIRECT_THREADED_CODE
  if (IS_NULL(msa)) {
    for (i = 0; i < reg->ops_used; i++) {
       const void* addr;
       addr = opcode_to_label[reg->ocs[i]];
       p->opaddr = addr;
       p++;
    }
    return ONIG_NORMAL;
  }
#endif
```

On another run:
```
(gdb) frame 0
#0  0x0000000022812ed0 in match_at (reg=0x222825a0, str=0x2223b470 "puts \"hello\"\nllo\"\n#\"", end=0x2223b47d "llo\"\n#\"", in_right_range=0x2223b47d "llo\"\n#\"",
    sstart=0x2223b470 "puts \"hello\"\nllo\"\n#\"", msa=0x1ffeff4be0) at /usr/src/debug/dev-libs/oniguruma-6.9.10/onig-6.9.10/src/regexec.c:3078
3078      BYTECODE_INTERPRETER_START {
(gdb) macro exp BYTECODE_INTERPRETER_START
expands to: goto *(p->opaddr);
$58 = (Operation *) 0x2368da20
(gdb) p p.opaddr
$59 = (const void *) 0x1
```
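
For future reference, here is a minimal sketch of how direct-threaded dispatch works. It shows why a bogus `p->opaddr` turns straight into a jump to an invalid address. This is not oniguruma's code: the opcodes, the `Op` struct, and the `run`/`run_program` names are made up for illustration. It relies on the GNU C "labels as values" extension (GCC/Clang). As in the snippet above, a setup pass maps each opcode to a label address and the interpreter then does `goto *(p->opaddr)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for this sketch; not oniguruma's real opcode set. */
enum { OP_INC = 0, OP_DOUBLE = 1, OP_END = 2, OP_COUNT = 3 };

typedef struct {
  int opcode;          /* what the compiler front end emitted */
  const void *opaddr;  /* label address, filled in by the setup pass */
} Op;

/*
 * One function is both the setup pass and the interpreter, because label
 * addresses (&&label) are only usable inside the function defining them.
 * With setup_only != 0 it only resolves opcode -> label address.
 */
static int run(Op *ops, size_t n, int acc, int setup_only)
{
  static const void *opcode_to_label[OP_COUNT] = { &&L_INC, &&L_DOUBLE, &&L_END };
  Op *p = ops;

  if (setup_only) {
    for (size_t i = 0; i < n; i++) {
      assert(ops[i].opcode >= 0 && ops[i].opcode < OP_COUNT);
      ops[i].opaddr = opcode_to_label[ops[i].opcode];
    }
    return 0;
  }

  /* If the setup pass never ran (or wrote somewhere else), opaddr is
     garbage and this jumps to an arbitrary address. */
  goto *(p->opaddr);

L_INC:
  acc += 1;
  p++;
  goto *(p->opaddr);
L_DOUBLE:
  acc *= 2;
  p++;
  goto *(p->opaddr);
L_END:
  return acc;
}

/* Resolve label addresses first, then interpret the program. */
static int run_program(Op *ops, size_t n, int start)
{
  run(ops, n, 0, 1);
  return run(ops, n, start, 0);
}
```

With a program of `OP_INC, OP_DOUBLE, OP_END` and a start value of 4, `run_program` returns 10. Skipping the setup call leaves every `opaddr` unset, which is the same shape of failure as the `0x1` / `0x300000001` values seen in `p.opaddr` above.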
Comment 7 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:48:53 UTC
Operation is at https://github.com/kkos/oniguruma/blob/master/src/regint.h#L739:
```
(gdb) p &p
$66 = (Operation **) 0x1ffeff4928
(gdb) monitor xb 0x1ffeff4928 21
                  00      00      00      00      00      00      00      00
0x1FFEFF4928:   0x20    0xda    0x68    0x23    0x00    0x00    0x00    0x00
                  00      00      00      00      00      00      00      00
0x1FFEFF4930:   0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                  ff      ff      ff      ff      ff
0x1FFEFF4938:   0x02    0x00    0x00    0x00    0x00
```
but it's a union so it's not immediately obvious how much of that is a problem. I still don't really see where the uninitialised read is.
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:49:00 UTC
Using the ebuild to build 08d36110c5670c815ad6d6f969e578049d209080, which should match the crate, _also_ fails.
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:49:06 UTC
I started to look at how the crate builds it. They define a bunch of `ONIG_DEBUG_*` at https://github.com/rust-onig/rust-onig/blob/main/onig_sys/build.rs#L70.

If I pass those in `CFLAGS` with the ebuild:
```
 * CMP: =dev-libs/oniguruma-6.9.10 with dev-libs/oniguruma-6.9.10/image
 *    ABI: libonig.so.5(32) func(+2,~1)
 * Functions changes summary: 0 Removed, 1 Changed (1 filtered out), 2 Added functions
 * Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
 *
 * 2 Added functions:
 *
 *   [A] 'function void onig_print_compiled_byte_code_list(FILE*, regex_t*)'    {onig_print_compiled_byte_code_list}
 *   [A] 'function int onig_print_names(FILE*, regex_t*)'    {onig_print_names}
 *
 * 1 function with some indirect sub-type change:
 *
 *   [C] 'function int onig_parse_tree(Node**, const OnigUChar*, const OnigUChar*, regex_t*, ParseEnv*)' at regparse.c:9390:1 has some indirect sub-type changes:
 *     parameter 5 of type 'ParseEnv*' has sub-type changes:
 *       in pointed to type 'typedef ParseEnv' at regparse.h:455:1:
 *         underlying type 'struct ParseEnv' at regparse.h:423:1 changed:
 *           type size changed from 1312 to 1344 (in bits)
 *           1 data member insertion:
 *             'unsigned int max_parse_depth', at offset 1280 (in bits) at regparse.h:452:1
 *           1 data member change:
 *             'unsigned int flags' offset changed from 1280 to 1312 (in bits) (by +32 bits)
 *    ABI: libonig.so.5(64) func(+2,~1)
 * Functions changes summary: 0 Removed, 1 Changed (1 filtered out), 2 Added functions
 * Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
 *
 * 2 Added functions:
 *
 *   [A] 'function void onig_print_compiled_byte_code_list(FILE*, regex_t*)'    {onig_print_compiled_byte_code_list}
 *   [A] 'function int onig_print_names(FILE*, regex_t*)'    {onig_print_names}
 *
 * 1 function with some indirect sub-type change:
 *
 *   [C] 'function int onig_parse_tree(Node**, const OnigUChar*, const OnigUChar*, regex_t*, ParseEnv*)' at regparse.c:9390:1 has some indirect sub-type changes:
 *     parameter 5 of type 'ParseEnv*' has sub-type changes:
 *       in pointed to type 'typedef ParseEnv' at regparse.h:455:1:
 *         underlying type 'struct ParseEnv' at regparse.h:423:1 changed:
 *           type size changed from 2176 to 2240 (in bits)
 *           1 data member insertion:
 *             'unsigned int max_parse_depth', at offset 2144 (in bits) at regparse.h:452:1
 *           1 data member change:
 *             'unsigned int flags' offset changed from 2144 to 2176 (in bits) (by +32 bits)
 *   SIZE: 3.07MiB -> 3.38MiB, 28 -> 28 files
 * ------> ABI(+4,~2) SIZE(+10.15%)
```

They affect a bunch of structures: https://github.com/kkos/oniguruma/blob/05bb130c9ad54877e73d1caf08dd95e6ff199d99/src/regparse.h#L451.
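
To make the layout change in the abidiff output concrete, here is a minimal sketch in C. The two structs are hypothetical, cut-down stand-ins (not the real `ParseEnv`), and the helper names are made up for illustration. It only shows the general mechanism: inserting a member before `flags` moves `flags` by the size of an `unsigned int`, so code that still assumes the old layout reads a different field.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* offsetof, size_t */
#include <string.h>   /* memcpy */

/* Hypothetical layout as built WITHOUT the extra debug member. */
struct env_plain {
    unsigned int parse_depth;
    unsigned int flags;
};

/* Hypothetical layout as built WITH the extra debug member. */
struct env_debug {
    unsigned int parse_depth;
    unsigned int max_parse_depth;  /* inserted member */
    unsigned int flags;            /* pushed back by one unsigned int */
};

static_assert(offsetof(struct env_debug, flags) > offsetof(struct env_plain, flags),
              "inserting a member must push flags back");

/* Number of bits by which `flags` moves between the two layouts. */
size_t flags_shift_bits(void)
{
    return (offsetof(struct env_debug, flags)
            - offsetof(struct env_plain, flags)) * CHAR_BIT;
}

/* Read `flags` from a debug-layout object using the plain layout's offset,
 * as a caller built against the other definition effectively would. */
unsigned int read_flags_with_plain_layout(const struct env_debug *e)
{
    unsigned int v;
    memcpy(&v, (const unsigned char *)e + offsetof(struct env_plain, flags), sizeof v);
    return v;
}
```

On a platform with a 32-bit `unsigned int`, `flags_shift_bits()` gives 32, the same "+32 bits" shift abidiff reports. `read_flags_with_plain_layout()` returns whatever was stored in `max_parse_depth`, not the value of `flags`.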
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:49:14 UTC
But I think that's a red herring, unless it used bundled headers and always passed those flags while the system library wasn't built with them.

Also, with the debug flags added (gdb is attached separately):
```
.match_at: str: 0x216170d0, end: 0x216170db, start: 0x216170d0
size: 11, start offset: 0
==3343695== Use of uninitialised value of size 8
==3343695==    at 0x2281661F: match_at (regexec.c:3078)
==3343695==    by 0x22838A59: search_in_range (regexec.c:5713)
==3343695==    by 0x228390BB: onig_search_with_param (regexec.c:5839)
==3343695==    by 0x224F22F6: onig::Regex::search_with_param (lib.rs:723)
==3343695==    by 0x224A1804: syntect::parsing::regex::regex_impl::Regex::search (regex.rs:174)
==3343695==    by 0x22487750: syntect::parsing::regex::Regex::search (regex.rs:64)
==3343695==    by 0x224C9015: syntect::parsing::parser::ParseState::search (parser.rs:449)
==3343695==    by 0x224C8850: syntect::parsing::parser::ParseState::find_best_match (parser.rs:374)
==3343695==    by 0x224C72FA: syntect::parsing::parser::ParseState::parse_next_token (parser.rs:274)
==3343695==    by 0x224C6F5B: syntect::parsing::parser::ParseState::parse_line (parser.rs:240)
==3343695==    by 0x22472101: syntect::easy::HighlightLines::highlight_line (easy.rs:68)
==3343695==    by 0x223C115C: comrak::plugins::syntect::SyntectAdapter::highlight_html (syntect.rs:46)
==3343695==
==3343695== (action on error) vgdb me ...
==3343695== Continuing ...
==3343695== Jump to the invalid address stated on the next line
==3343695==    at 0x1: ???
==3343695==    by 0x22838A59: search_in_range (regexec.c:5713)
==3343695==    by 0x228390BB: onig_search_with_param (regexec.c:5839)
==3343695==    by 0x224F22F6: onig::Regex::search_with_param (lib.rs:723)
==3343695==    by 0x224A1804: syntect::parsing::regex::regex_impl::Regex::search (regex.rs:174)
==3343695==    by 0x22487750: syntect::parsing::regex::Regex::search (regex.rs:64)
==3343695==    by 0x224C9015: syntect::parsing::parser::ParseState::search (parser.rs:449)
==3343695==    by 0x224C8850: syntect::parsing::parser::ParseState::find_best_match (parser.rs:374)
==3343695==    by 0x224C72FA: syntect::parsing::parser::ParseState::parse_next_token (parser.rs:274)
==3343695==    by 0x224C6F5B: syntect::parsing::parser::ParseState::parse_line (parser.rs:240)
==3343695==    by 0x22472101: syntect::easy::HighlightLines::highlight_line (easy.rs:68)
==3343695==    by 0x223C115C: comrak::plugins::syntect::SyntectAdapter::highlight_html (syntect.rs:46)
==3343695==  Address 0x1 is not stack'd, malloc'd or (recently) free'd
```

That's from stepping immediately after the Valgrind trap on the uninitialised read at the `jump ...`, so if it uses 0x1, it must be that `p.opaddr` is corrupted.
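
For anyone unfamiliar with why a bad `opaddr` turns into a wild jump: the traces are consistent with a direct-threaded interpreter, where each operation stores the address of the label that handles it and dispatch is an indirect `goto`. The sketch below is a hypothetical miniature (the names `run_ops`, `struct operation` and the opcodes are mine, not oniguruma's) and relies on the GNU C "labels as values" extension, so it assumes GCC or Clang. It is only meant to illustrate the mechanism, not how oniguruma is actually written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy interpreter. */
enum opcode { OP_INC, OP_DOUBLE, OP_END, OP_COUNT };

struct operation {
    enum opcode opcode;   /* what the compiler emitted */
    const void *opaddr;   /* handler label address, resolved before running */
};

/* With resolve != 0: fill in opaddr for every operation and return 0.
 * With resolve == 0: execute the operations, starting from acc = 1. */
long run_ops(struct operation *ops, size_t n, int resolve)
{
    /* Label addresses are only visible inside this function, which is why
     * resolving and running share one function in this sketch. */
    static const void *const labels[OP_COUNT] = { &&do_inc, &&do_double, &&do_end };
    struct operation *p = ops;
    long acc = 1;

    if (resolve) {
        for (size_t i = 0; i < n; i++) {
            assert(ops[i].opcode < OP_COUNT);
            ops[i].opaddr = labels[ops[i].opcode];
        }
        return 0;
    }

    /* If opaddr was never resolved (or was overwritten), this jumps to
     * whatever bytes happen to be stored there. */
    goto *(p->opaddr);

do_inc:
    acc += 1;
    p++;
    goto *(p->opaddr);
do_double:
    acc *= 2;
    p++;
    goto *(p->opaddr);
do_end:
    return acc;
}
```

Calling `run_ops(prog, n, 1)` once and then `run_ops(prog, n, 0)` executes the program normally. Skipping the first call leaves `opaddr` holding whatever was in memory, and the first `goto` jumps there, which matches the kind of jump to `0x1` or into `main_arena` seen in the traces.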

Using `rr` (where the invalid value is a bit different but still bogus):

At the start of `match_at`:
```
(rr) p p.opaddr
$38 = (const void *) 0x7cac8120ab01 <main_arena+65>
(rr) x/i $pc
=> 0x7cac66c68618 <match_at+1404>:      mov    rax,QWORD PTR [rbp-0xca8]
(rr) n
0x00007cac8120ab01 in main_arena () from /usr/lib64/libc.so.6
(rr) x/i $pc
=> 0x7cac8120ab01 <main_arena+65>:      add    BYTE PTR [rax],al
(rr) n
Single stepping until exit from function main_arena,
which has no line number information.

Thread 1 received signal SIGSEGV, Segmentation fault.
0x00007cac8120ab01 in main_arena () from /usr/lib64/libc.so.6
```

Then:
```
(rr) watch p.opaddr
Hardware watchpoint 6: p.opaddr
(rr) reverse-continue
Continuing.

Thread 1 hit Hardware watchpoint 6: p.opaddr

Old value = (const void *) 0x7cac8120ab01 <main_arena+65>
New value = (const void *) 0x7ffc5c2d0340
0x00007cac66c680f9 in match_at (reg=0x60d0f7510b90, str=0x60d0f74e87c0 "puts 'wow'\n\201\254|", end=0x60d0f74e87cb "\201\254|", in_right_range=0x60d0f74e87cb "\201\254|",
    sstart=0x60d0f74e87c0 "puts 'wow'\n\201\254|", msa=0x7ffc5c2cecc0) at /usr/src/debug/dev-libs/oniguruma-6.9.10/onig-6.9.10/src/regexec.c:3009
3009      Operation* p = reg->ops;
(rr) p p
$76 = (Operation *) 0x7ffc5c2ce058
(rr) p *p
$77 = {
  opaddr = 0x7ffc5c2d0340,
[...]
(rr) p p.opaddr
$78 = (const void *) 0x7ffc5c2d0340
(rr) watch p
Hardware watchpoint 7: p
(rr) watch p.opaddr
Hardware watchpoint 8: p.opaddr
(rr) c
Continuing.

Thread 1 hit Hardware watchpoint 6: p.opaddr

Old value = (const void *) 0x7ffc5c2d0340
New value = (const void *) 0x7cac8120ab01 <main_arena+65>

Thread 1 hit Hardware watchpoint 7: p

Old value = (Operation *) 0x7ffc5c2ce058
New value = (Operation *) 0x60d0f7510d60

Thread 1 hit Hardware watchpoint 8: p.opaddr

Old value = (const void *) 0x7ffc5c2d0340
New value = (const void *) 0x7cac8120ab01 <main_arena+65>
```
where we have
```
(rr) list .
3005      unsigned long subexp_call_counters[MAX_SUBEXP_CALL_COUNTERS];
3006    #endif
3007
3008      OnigOptionType options;
3009      Operation* p = reg->ops;
3010      OnigEncoding encode = reg->enc;
3011      OnigCaseFoldType case_fold_flag = reg->case_fold_flag;
3012
3013    #ifdef USE_CALL
3014      unsigned long subexp_call_nest_counter = 0;

(rr) p reg->ops
$87 = (Operation *) 0x60d0f7510d60
```

That comes ultimately from:
```
#3  0x00007cac652222f7 in onig::Regex::search_with_param<&str> (self=0x60d0f7522160, chars=..., from=0, to=11, options=..., region=..., match_param=...) at src/lib.rs:723
[...]
```
which is in the crate:
```
        let r = unsafe {
            let start = beg.add(from);
            let range = beg.add(to);
            if start > end {
                return Err(Error::custom("Start of match should be before end"));
            }
            if range > end {
                return Err(Error::custom("Limit of match should be before end"));
            }
            onig_sys::onig_search_with_param(
                self.raw,
                beg,
                end,
                start,
                range,
                match region {
                    Some(region) => region as *mut Region as *mut onig_sys::OnigRegion,
                    None => std::ptr::null_mut(),
                },
                options.bits(),
                match_param.as_raw(),
            )
```
so `self.raw` is garbage (I think), though I can't print it at that point in gdb. I guess this means it's probably an onig crate bug. I know absolutely zero Rust and don't think I can go further.
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-03-21 22:49:22 UTC
I give up for now, but here's a reproducer for commonmarker from git:
```
#!/bin/bash
set -x

#(cd ext/commonmarker/ && ruby extconf.rb)

export RUSTONIG_SYSTEM_LIBONIG=1
#export CFLAGS="-Og -ggdb3"
export CFLAGS="-Og -ggdb3 -DONIG_DEBUG_PARSE -DONIG_DEBUG_COMPILE -DONIG_DEBUG_MATCH"
export RUSTFLAGS="-C opt-level=0 -C strip=none -C debuginfo=full"

#rm -rf ./ext/commonmarker/target/release/deps
#rm -rf ./ext/commonmarker/target/release/build/onig_sys*

make -C ext/commonmarker -Onone CFLAGS="-Og -ggdb3 -std=gnu17" || exit 1
cp {ext,lib}/commonmarker/commonmarker.so || exit 1

#exec ruby34 --disable-jit -Ilib:test:. -e 'Dir["test/*_test.rb"].each {|f| require f}'
exec ruby34 --disable-jit -Ilib:test:. test/node_test.rb
```

with `test/node_test.rb` being modified to just:
```
# frozen_string_literal: true

require "test_helper"

class NodeTest < Minitest::Test
  def setup
    @document = Commonmarker.parse("Hi *there*. This has __many nodes__!")
  end

  class FenceInfoTest < Minitest::Test
    def setup
      @document = Commonmarker.parse("``` ruby\nputs 'wow'\n```")
      @fence_node = @document.first_child
    end

    def test_has_fence_info
      assert_equal("ruby", @fence_node.fence_info)
    end

    def test_can_set_fence_info
      assert_match(/<pre lang=\"ruby\"/, @document.to_html)

      @fence_node.fence_info = "perl"

      assert_equal("perl", @fence_node.fence_info)
      assert_match(/<pre lang=\"perl\"/, @document.to_html)
    end
  end
end
```

In summary:
* It only crashes with the system oniguruma
* Using the same version that the crate bundles (via oniguruma-9999 + override) still crashes
* Valgrind reports an uninitialised memory read but it's not clear to me where
* I _think_ `p.opaddr` is corrupted, but it gets used in a table so I'm not completely sure (EDIT: see https://github.com/gentoo/gentoo/pull/41130#issuecomment-2742137819)

I think the next steps are (not necessarily for me):
* Try the oniguruma crate test suite
* Try asan+ubsan on ruby/oniguruma/the crate (will need some special flags for the crate)
* Reduce the Ruby testcase further
* Try to transform the Ruby testcase into just using Ruby's regex engine (which should be the same as oniguruma)
* Try to replicate it in a Rust testcase (bleh) using the crate
* Try to replicate it in a pure C oniguruma testcase
* Report it to commonmarker upstream and see what they say (it may well still be a bug there, if it passes something invalid down)