Bug 433563

Summary:	Provide config.site with cached values for common system tests
Product:	Portage Development	Reporter:	Michał Górny <mgorny>
Component:	Enhancement/Feature Requests	Assignee:	Portage team <dev-portage>
Status:	CONFIRMED ---
Severity:	enhancement	CC:	ferringb, sam
Priority:	Normal
Version:	unspecified
Hardware:	All
OS:	Linux
Whiteboard:
Package list:		Runtime testing required:	---

Description Michał Górny archtester

2012-08-31 19:19:35 UTC

There are a bunch of common (standard) autotools tests which always return the same results in Gentoo. To make configure checks a little faster, we could record the cached results of those tests and store them in a config.site file which portage would provide to the builds.

I'm mostly thinking of the C standards compliance tests and other deprecated tests where we can safely assume that any toolchain and system in Gentoo (at least the non-Prefix variant) will pass (or give the same results for).

Comment 1 Fabian Groffen gentoo-dev

2012-08-31 19:23:54 UTC

I recall 'confcache', which was axed by diego (flameeyes) and brian (ferringb).  They'll know the exact details, but I guess this won't work very well, because not all upstreams use the same tests for the same macros.

Comment 2 Michał Górny archtester

2012-08-31 20:46:11 UTC

(In reply to comment #1)
> I recall 'confcache', which was axed by diego (flameeyes) and brian
> (ferringb).  They'll know the exact details, but I guess this won't work
> very well, because not all upstreams use the same tests for the same macros.

AFAIU the main problem with 'confcache' is that there's no good way of 'refreshing' the information, i.e. removing 'no' results after relevant libraries were installed etc.

I was more thinking about grabbing just those few 'safe' values which wouldn't change from 'yes' to 'no' for a long time.

Comment 3 Fabian Groffen gentoo-dev

2012-08-31 21:06:03 UTC

I'm afraid that, due to the nature of autoconf files, this set would be so small that it's pointless to bother with it for most systems.

Think of the scenario where the configure file does some flag trickery (can be home-brewed, or e.g. AC_EXTENSIONS) and then tries to detect some functions.  If you'd cache that, you'd generate false negatives/positives for configure files that don't do the same (or similar) flag trickery.  It also gets tricky very quickly, because the environment might change: _XOPEN_SOURCE={500,600} (defined for g++) changing in GCC's spec-files (since 4.6) for instance.

Comment 4 Michał Górny archtester

2012-08-31 21:18:05 UTC

(In reply to comment #3)
> I'm afraid due to the nature of autoconf files, this set would be so small,
> that it's pointless to bother with it for most systems.

But it would contain the most common tests which are done for almost every package using autotools. (small * very many) = large.

> Think of the scenario where the configure file does some flag trickery (can
> be home-brewed, or e.g. AC_EXTENSIONS) and then tries to detect some
> functions.  If you'd cache that, you'd generate false negatives/positives
> for configure files that don't do the same (or similar) flag trickery.  It's
> also getting quickly very tricky, because the environment might change:
> _XOPEN_SOURCE={500,600} (defined for g++) changing in GCC's spec-files
> (since 4.6) for instance.

I don't think that should actually cause the environment to become non-compliant with the C standard, which is mostly what I was considering. And if people shoot themselves in the foot, I believe that they can also work around our config.site.

Comment 5 Fabian Groffen gentoo-dev

2012-09-01 07:20:22 UTC

(In reply to comment #4)
> (In reply to comment #3)
> > I'm afraid due to the nature of autoconf files, this set would be so small,
> > that it's pointless to bother with it for most systems.
> 
> But it would contain the most common tests which are done for almost every
> package using autotools. (small * very many) = large.

Which ones are you thinking of?  I really doubt you're right here (and path lookups are as cheap as variable caches).

> > Think of the scenario where the configure file does some flag trickery (can
> > be home-brewed, or e.g. AC_EXTENSIONS) and then tries to detect some
> > functions.  If you'd cache that, you'd generate false negatives/positives
> > for configure files that don't do the same (or similar) flag trickery.  It's
> > also getting quickly very tricky, because the environment might change:
> > _XOPEN_SOURCE={500,600} (defined for g++) changing in GCC's spec-files
> > (since 4.6) for instance.
> 
> I don't think that should actually cause the environment to become
> non-compliant with the C standard which I was considering mostly. And if
> people shoot themselves in the feet, I believe that they can also
> work-around our config.site.

I'm sorry, but I think you have the wrong idea about what systems look like.  Also, with this attitude, I think you demonstrate you don't understand the idea behind autoconf so well.

Just to give you some more to think of:
- how about versions of autoconf macros (updates, bugfixes, different output)
- how about system headers changing (or gcc's fix-included headers differing per release)

Comment 6 Michał Górny archtester

2012-09-01 07:47:57 UTC

Well, right now I'm testing the following:

checking for ANSI C header files... (cached) yes
checking for sys/types.h... (cached) yes
checking for sys/stat.h... (cached) yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for memory.h... (cached) yes
checking for strings.h... (cached) yes
checking for inttypes.h... (cached) yes
checking for stdint.h... (cached) yes
checking for unistd.h... (cached) yes

I don't think they can really diverge anytime in Gentoo. They are pulled in by libtool for some reason and the results are almost always unused.

There are also many 'obsolescent' macros if you look through autoconf docs; like:

     This macro is obsolescent, as `closedir' returns a meaningful value
     on current systems.  New programs need not use this macro.

We could just cache those results to avoid running those tests.
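
As a rough illustration of the proposal above (a sketch, not something posted in this bug), a config.site pre-seeding exactly the checks listed in this comment could look like the fragment below. The variable names follow autoconf's usual `ac_cv_header_*` / `ac_cv_func_*` cache naming; the choice of values and the idea that portage would ship such a file are assumptions taken from the discussion, not an existing portage feature.

```shell
# Hypothetical config.site fragment: pre-seed results that are assumed to be
# identical on every (non-Prefix) Gentoo system.
# ": ${var=value}" only assigns when the variable is unset, so a value that
# the user or an ebuild already put in the environment is left alone.

# "checking for ANSI C header files..."
: "${ac_cv_header_stdc=yes}"

# The default header checks listed in the comment above
: "${ac_cv_header_sys_types_h=yes}"
: "${ac_cv_header_sys_stat_h=yes}"
: "${ac_cv_header_stdlib_h=yes}"
: "${ac_cv_header_string_h=yes}"
: "${ac_cv_header_memory_h=yes}"
: "${ac_cv_header_strings_h=yes}"
: "${ac_cv_header_inttypes_h=yes}"
: "${ac_cv_header_stdint_h=yes}"
: "${ac_cv_header_unistd_h=yes}"

# Obsolescent check: closedir returns a meaningful value on current systems,
# so the "closedir is void" test is assumed to answer "no".
: "${ac_cv_func_closedir_void=no}"
```

A configure script reads such a file when the CONFIG_SITE environment variable points at it (or when it is installed as share/config.site or etc/config.site under the configured prefix) and then reports the matching checks as "(cached)".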

Comment 7 Fabian Groffen gentoo-dev

2012-09-01 08:19:34 UTC

OK, so you want each package that installs headers in /usr/include to basically set ac_cv_header_x_y_h=yes.  Portage could do that, I guess, and maintain it without any package intervention.  That sounds cool at first sight, but...

I don't really like going as deep as testing if functions exist, or work, or accept parameters, etc, since those depend heavily on flags, versions, etc.

One problem with this approach is that you don't know what the if-logic in a configure script is.  By pre-setting availability, you might actually break something, e.g. defining both HAVE_SYS_TIMES_H and HAVE_TIMES_H with this code:

#ifdef HAVE_SYS_TIMES_H
..
#elif HAVE_TIMES_H
...
#endif

while the configure script would first check times.h and fall back to sys/times.h if not available.  There are cases like this.

What *does* usually work fine is re-using the cache for the same configure run, which greatly speeds things up if you're testing an ebuild and configuring it over and over again.
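
To make the "re-use the cache for the same configure run" case concrete, here is a toy sketch of the mechanism. It is not autoconf's code: `check_header` and the temporary cache file are hypothetical stand-ins for a single configure check and for config.cache.

```shell
#!/bin/sh
# Toy model of a configure cache; all names here are illustrative only.
cache_file=$(mktemp)

# Mimic one configure check: consult the cache variable first and only run
# the (here: faked) test when no cached answer exists.
check_header() {
    var=$1
    header=$2
    eval "cached=\${$var-}"
    if [ -n "$cached" ]; then
        echo "checking for $header... (cached) $cached"
    else
        result=yes                      # stand-in for a real compile test
        echo "$var=$result" >> "$cache_file"
        echo "checking for $header... $result"
    fi
}

# First "run": executes in a subshell, so only the cache file persists.
first=$(check_header ac_cv_header_unistd_h unistd.h)

# Second "run": load the saved cache, then repeat the same check.
. "$cache_file"
second=$(check_header ac_cv_header_unistd_h unistd.h)

echo "$first"
echo "$second"
rm -f "$cache_file"
```

With a real configure script the same effect comes from `./configure -C` (short for `--config-cache`, i.e. `--cache-file=config.cache`), which is only safe as long as the script, flags and environment stay the same between runs.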

Comment 8 Michał Górny archtester

2012-09-01 12:16:21 UTC

(In reply to comment #7)
> Ok, so you want to have each package that installs headers in /usr/include
> to basically set av_have_x_y_h=yes.  Portage could do that, I guess, and
> maintain it, without any package intervention, that sounds cool on first
> sight, but...
> 
> I don't really like going as deep as testing if functions exist, or work, or
> accept parameters, etc, since they highly depend on flags, versions, etc.

That may be a good idea too but I haven't gone that far.

I just said *standard C headers* and *obsolete checks*.

Comment 9 Fabian Groffen gentoo-dev

2012-09-01 12:22:19 UTC

(In reply to comment #8)
> I just said *standard C headers* and *obsolete checks*.

And I just said *it won't work* and *cause bugs*.

Comment 10 Michał Górny archtester

2012-09-01 16:19:09 UTC

(In reply to comment #9)
> (In reply to comment #8)
> > I just said *standard C headers* and *obsolete checks*.
> 
> And I just said *it won't work* and *cause bugs*.

You haven't backed that opinion with anything relevant to what I said.

Comment 11 Fabian Groffen gentoo-dev

2012-09-01 16:27:40 UTC

You removed that bit from your reply, so you likely didn't read it.

Comment 12 Brian Harring

2024-01-27 06:57:52 UTC

This should be closed.  I never wrote a post-mortem of confcache, but it lived only from ~02 to 06.  Take a look at 7707d956e87a7edce06cbca10f419d5449b2b4b9 in pkgcore if you want to trace backwards.

The confcache implementation wired sandbox access patterns to the cache, and then checksummed those paths to determine whether that 'delta' of the cache was valid.  Invalidation was built in; I could build it better now, but the failure rate would still be unacceptable.  To make this work you basically need to instrument the tested expression (state before and state after) and wire that in as the cache validation.
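
The checksum-based validation described above can be sketched roughly as follows. This is only an illustration of the idea and not confcache's actual code: `fingerprint` and `cache_is_valid` are hypothetical helpers, and the list of "accessed paths" is supplied by hand here instead of coming from the sandbox.

```shell
#!/bin/sh
# Illustrative only: validate a cached result against the files it depended on.

# Print one line per path: its checksum if it exists, a "missing" marker
# otherwise, so that a file appearing later also changes the fingerprint.
fingerprint() {
    for path in "$@"; do
        if [ -f "$path" ]; then
            printf '%s %s\n' "$path" "$(cksum < "$path")"
        else
            printf '%s missing\n' "$path"
        fi
    done
}

# A cached entry is valid only while the current fingerprint of the paths it
# touched equals the fingerprint recorded when the entry was stored.
cache_is_valid() {
    recorded=$1
    shift
    [ "$(fingerprint "$@")" = "$recorded" ]
}

# Demo: record a fingerprint, then change the file the test depended on.
workdir=$(mktemp -d)
header="$workdir/foo.h"
printf '#define FOO 1\n' > "$header"
recorded=$(fingerprint "$header")

if cache_is_valid "$recorded" "$header"; then before=valid; else before=stale; fi
printf '#define FOO 2\n' > "$header"
if cache_is_valid "$recorded" "$header"; then after=valid; else after=stale; fi

echo "before=$before after=$after"
rm -rf "$workdir"
```

The sketch only covers files that were actually read. It says nothing about flags, environment variables or macro versions, which is the part the comment argues makes the failure rate unacceptable.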

Mgorny's notion of pre-seeding "known good system values" (compiler/glibc/like that) can work, but that's reliant on upstreams not doing frankly stupid ass shit.  It's been a long while, but while you'd expect dev-lang/php to have done things that would poison the cache for common values like what we're discussing, even python did it back then.  Presumably it's been fixed, but you get the idea.  I rejected the idea of falling back to pre-seeded content since the build time gain didn't justify it in my testing.  I strongly doubt that, on modern hardware (NVMe/RAM), the value justifies the risk.

Either way, my 2 cents: close it.  There are higher-value things to target.