It's written in the summary, but simdjson has many primitives in it returning AVX512 types, while being compiled on an *AVX2 system*.

Reproducible: Always

Steps to Reproduce:
1. Attempt to build
2. Watch it crash
3. Wallow in your misery

Actual Results: It failed to build

Expected Results: It should've built

Major rating bc NodeJS is a major part of many people's systems, including mine.
Created attachment 892318 [details] emerge --info
Created attachment 892319 [details] emerge -pqv
Created attachment 892320 [details] build.log
Can you report this upstream (https://github.com/simdutf/simdutf but I imagine https://github.com/simdjson/simdjson suffers from the same) please?
I have confirmed that simdjson and simdutf actually build outside of the Portage build staging area
(In reply to zachariah.cabelly from comment #5)
> I have confirmed that simdjson and simdutf actually build outside of the
> Portage build staging area

That being /var/tmp/portage/package-category/${P}
I don't think we're doing anything specific or magic here, just that it's likely when building manually, you don't get CFLAGS="... -march=native ..." applied. (i.e. The ebuild isn't applying any sort of patches in this area.) It's also possible that the nodejs build system mangles things in a way where building standalone simd{utf,json} doesn't work, as it's kind of convoluted.
(In reply to Sam James from comment #7)
> I don't think we're doing anything specific or magic here, just that it's
> likely when building manually, you don't get CFLAGS="... -march=native ..."
> applied.

Well, that's a clue. I'll try a workaround. In other words, "fuck it, I'll try something"

> (i.e. The ebuild isn't applying any sort of patches in this area.)
>
> It's also possible that the nodejs build system mangles things in a way
> where building standalone simd{utf,json} doesn't work, as it's kind of
> convoluted.
God speed. I spent an hour or so earlier on a related problem with node...
lmao and godspeed
(In reply to zachariah.cabelly from comment #8)
> (In reply to Sam James from comment #7)
> > I don't think we're doing anything specific or magic here, just that it's
> > likely when building manually, you don't get CFLAGS="... -march=native ..."
> > applied.
>
> Well, that's a clue. I'll try a workaround. In other words, "fuck it, I'll
> try something"

My workaround worked! It was deleting -march=native.

> > (i.e. The ebuild isn't applying any sort of patches in this area.)
> >
> > It's also possible that the nodejs build system mangles things in a way
> > where building standalone simd{utf,json} doesn't work, as it's kind of
> > convoluted.
I've reported this to nodejs https://github.com/nodejs/node/issues/52876
I think this is a clang issue:

1705848001: ::: completed emerge (11 of 19) net-libs/nodejs-20.11.0 to /
1708299962: ::: completed emerge (23 of 27) net-libs/nodejs-20.11.1 to /
1709762580: ::: completed emerge (4 of 18) sys-devel/clang-18.1.0 to /
1709797272: ::: completed emerge (701 of 1264) net-libs/nodejs-20.11.1 to /
1709808101: ::: completed emerge (976 of 1264) sys-devel/clang-18.1.0 to /
1709842715: ::: completed emerge (2 of 3) sys-devel/clang-18.1.0 to /
1710814187: ::: completed emerge (7 of 33) sys-devel/clang-17.0.6 to /
1710945456: ::: completed emerge (8 of 15) sys-devel/clang-18.1.2 to /
1712284321: ::: completed emerge (6 of 12) sys-devel/clang-18.1.3 to /
1713263296: ::: completed emerge (2 of 5) net-libs/nodejs-20.12.1 to /
1713485952: ::: completed emerge (6 of 15) sys-devel/clang-18.1.4 to /
1715029866: ::: completed emerge (21 of 39) sys-devel/clang-18.1.5 to /

Above are the successful builds of clang and nodejs on my system. Nodejs 20.12.1 built fine with clang 18.1.3, but that version no longer compiles with clang 18.1.5, and the newer nodejs 22.1.0 fails as well.
I've now successfully built it with GCC 13.2.1
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=032d6c63320058503aaba7cc0945965a667b1e79

commit 032d6c63320058503aaba7cc0945965a667b1e79
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2024-05-07 17:44:51 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-05-07 17:46:20 +0000

    net-libs/nodejs: fix build with GCC 14 (update simdjson to 3.9.1)

    Bug: https://bugs.gentoo.org/931267
    Closes: https://bugs.gentoo.org/931150
    Signed-off-by: Sam James <sam@gentoo.org>

 net-libs/nodejs/Manifest             | 1 +
 net-libs/nodejs/nodejs-22.1.0.ebuild | 6 ++++++
 2 files changed, 7 insertions(+)
> It's written in the summary, but simdjson has many primitives in it
> returning AVX512 types, while being compiled on an *AVX2 system*.

The way simdjson and simdutf work is that they rely on runtime dispatching. When the libraries are first called, they check the CPU and use the best possible kernel by default. Yes, that remains true even with `-march=native`: it may still check whether the binary could use more advanced instructions. You can manually disable runtime dispatching but Node.js won't do it (by default).

So, to be clear, it will definitely compile an AVX-512 kernel *even* if the current system you are on does not support AVX-512. That is very much by design.

That is, if you target, say, AMD Zen2, then the library will know that it does not need to bother with anything less than what AMD Zen2 supports. But it will still allow for the fact that the binary could be running on AMD Zen4, in which case it will use more advanced instructions.

Thus, to repeat, it is very much not a bug that you see AVX-512 instructions being generated.

What happened is that GCC 14 (and maybe LLVM 18) changed the way macros related to instruction sets work, changing the value of a macro like __AVX2__ in the middle of the same translation unit, something that did not happen previously. This broke some libraries. The issue was easily patched, but Node.js takes a few weeks to update its dependencies.

This current issue is likely a duplicate of 931150.

I recommend fixing by updating the dependencies through a patch *or* waiting for Node.js to update its dependencies.
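For readers who have not seen runtime dispatching before, here is a minimal C++ sketch of the idea. It is not simdjson or simdutf code: the names (`count_quotes`, `active_kernel`, `self_check`) and the `DEMO_X86_DISPATCH` macro are made up for illustration, and the AVX2 path assumes GCC or Clang on x86-64 (anything else just uses the scalar kernel). The point it tries to show is that the AVX2 kernel gets compiled into the binary whatever `-march` says, and the choice is made once, at first call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical switch for this sketch: AVX2 path only on x86-64 with GCC/Clang.
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define DEMO_X86_DISPATCH 1
#else
#define DEMO_X86_DISPATCH 0
#endif

// Portable kernel: always compiled, works on any CPU.
static std::size_t count_quotes_fallback(const char* data, std::size_t len) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < len; ++i) n += (data[i] == '"');
    return n;
}

#if DEMO_X86_DISPATCH
// AVX2 kernel: the target attribute enables AVX2 for this function only,
// so it is compiled even if the build flags do not enable AVX2.
__attribute__((target("avx2")))
static std::size_t count_quotes_avx2(const char* data, std::size_t len) {
    const __m256i quote = _mm256_set1_epi8('"');
    std::size_t n = 0, i = 0;
    for (; i + 32 <= len; i += 32) {
        const __m256i chunk =
            _mm256_loadu_si256(reinterpret_cast<const __m256i*>(data + i));
        const unsigned mask = static_cast<unsigned>(
            _mm256_movemask_epi8(_mm256_cmpeq_epi8(chunk, quote)));
        n += static_cast<std::size_t>(__builtin_popcount(mask));
    }
    for (; i < len; ++i) n += (data[i] == '"');  // scalar tail
    return n;
}
#endif

struct kernel {
    const char* name;
    std::size_t (*fn)(const char*, std::size_t);
};

// Ask the CPU what it supports and pick the best kernel available.
static kernel pick_kernel() {
#if DEMO_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) return {"avx2", count_quotes_avx2};
#endif
    return {"fallback", count_quotes_fallback};
}

// The selection happens once, on first use.
static const kernel& active_kernel() {
    static const kernel k = pick_kernel();
    return k;
}

// Public entry point: callers never see which kernel runs.
static std::size_t count_quotes(const std::string& s) {
    return active_kernel().fn(s.data(), s.size());
}

// Whatever kernel is selected must agree with the portable one.
static void self_check(const std::string& s) {
    assert(count_quotes(s) == count_quotes_fallback(s.data(), s.size()));
}
```

The indirect call through `active_kernel().fn` is also why such a function cannot be inlined into its callers, which is the cost discussed in the later comments.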
(In reply to Daniel Lemire from comment #16)
> > It's written in the summary, but simdjson has many primitives in it
> > returning AVX512 types, while being compiled on an *AVX2 system*.
> >
> [...]
> This current issue is likely a duplicate of 931150
>
> I recommend fixing by updating the dependencies through a patch *or* waiting
> for Node.js to update its dependencies.

Thank you for your help Daniel. I've gone ahead and updated through a patch for now. I can reopen this bug if people continue to hit it but I suspect it's a dupe too.

(I'm not sure what change in Clang exposed this - I know what the GCC change was, though.)

*** This bug has been marked as a duplicate of bug 931150 ***
(In reply to Daniel Lemire from comment #16)
> So, to be clear, it will definitively compile an AVX-512 kernel *even* if
> the current system you are on does not support AVX-512. That is very much by
> design.

As someone who is shipping a product, runtime CPU dispatching is a really great feature -- a single binary can work and properly utilize hardware without relying on more drastic measures like plugin libraries.

But for those loving the idea of -march=native and building from source for a specific hardware, runtime CPU dispatching is kinda sad -- more unused code is generated, some inlining opportunities are missed. And sometimes there is more interesting stuff, like the need for vzeroupper to avoid AVX-to-SSE switch penalties (I suppose no longer an issue since processors without AVX are rare these days), which is not necessary otherwise.

I've seen other troubles with runtime dispatching, like some versions of GCC with LTO being confused by different translation units compiled with different microarchitecture options and upgrading SSE to AVX everywhere; and, if not careful, C++ templates for vocabulary types being instantiated in translation units with advanced instructions enabled.

Overall, the ability to disable CPU runtime dispatching is nice, but I understand that the use case for that is not particularly large.
> As someone who is shipping a product, runtime CPU dispatching is a really great
> feature -- a single binary can work and properly utilize hardware without relying on
> more drastic measures like plugin libraries.

Though you know this, let me state it for the benefit of people reading this: if one is going to release a binary package, runtime dispatching is necessary on some processor families (e.g., x64) to use advanced CPU features. Indeed, when Debian makes the binary libsimdjson-dev available, they cannot assume to know whether the machine will be an AMD Zen4 or an AMD Zen2. Thankfully, it is not a concern with 64-bit ARM, at least as far as simdjson and simdutf are concerned (no runtime dispatching needed). It is also not a concern if you build for one of the most advanced CPUs (AMD Zen4, Ice Lake...) as then runtime dispatching is automagically disabled.

> But for those loving the idea of -march=native and building from source for a
> specific hardware, runtime CPU dispatching is kinda sad (...)
> Overall, the ability to disable CPU runtime dispatching is nice, but I understand
> that the use case for that is not particularly large.

There are two layers in simdjson: an indexing layer and the front-end. The front-end, by default, does not use "much" runtime dispatching. I think one or two functions use runtime dispatching, but that is all. So if you build Node.js with -march=native, you will get some benefits right there that an Ubuntu user won't get. That is, the front-end will be optimized and some JSON processing in Node.js will get (slightly) faster. I doubt that the benefit is measurable, but maybe... As for the indexing layer, it is one call per JSON document, so there is not much concern.

Regarding simdutf, you are correct that the price to pay is that there is no inlining possible. But in Node.js, we take this into account by providing an inlined function for some small inputs.

As I stated elsewhere, it would be possible for advanced users to just disable runtime dispatching entirely. But I don't think you would see much benefit. This being said, if someone is interested, it would not be too hard to make it easy and documented to disable runtime dispatching in these libraries. Contributions invited! It is genuinely not difficult. Merely some macro logic.
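As a rough idea of what "some macro logic" could look like, here is a hedged C++ sketch. Everything in it is hypothetical: `DEMO_DISABLE_RUNTIME_DISPATCH`, the `demo` namespace and its functions are not real simdjson or simdutf names, and the real libraries may structure this differently. It assumes GCC or Clang on x86-64 for the runtime check. With the opt-out macro set, the kernel is chosen purely from the compiler's predefined macros (which is where `-march=native` would show up), otherwise the CPU is queried once at run time.

```cpp
#include <cassert>

// Hypothetical opt-out switch for this sketch. Build with
// -DDEMO_DISABLE_RUNTIME_DISPATCH=1 to select the kernel at compile time only.
#ifndef DEMO_DISABLE_RUNTIME_DISPATCH
#define DEMO_DISABLE_RUNTIME_DISPATCH 0
#endif

namespace demo {

// Ordered from least to most capable.
enum class isa { fallback = 0, avx2 = 1, avx512 = 2 };

// What the compiler was told it may assume (e.g. via -march=native).
constexpr isa compile_time_isa() {
#if defined(__AVX512F__) && defined(__AVX512BW__)
    return isa::avx512;
#elif defined(__AVX2__)
    return isa::avx2;
#else
    return isa::fallback;
#endif
}

// What the CPU we are actually running on reports.
inline isa runtime_isa() {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512bw"))
        return isa::avx512;
    if (__builtin_cpu_supports("avx2")) return isa::avx2;
#endif
    return isa::fallback;
}

// The single place where the dispatching policy is decided.
inline isa selected_isa() {
#if DEMO_DISABLE_RUNTIME_DISPATCH
    return compile_time_isa();  // no CPU query, no function pointer needed
#else
    static const isa best = runtime_isa();
    // A binary built for a given ISA should not be running on a lesser CPU.
    assert(static_cast<int>(best) >= static_cast<int>(compile_time_isa()));
    return best;
#endif
}

inline const char* to_string(isa i) {
    switch (i) {
        case isa::avx512: return "avx512";
        case isa::avx2:   return "avx2";
        default:          return "fallback";
    }
}

}  // namespace demo
```

In a real library the compile-time branch would presumably also let the unused kernels be left out of the build entirely, which is the part that recovers the code size and inlining.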