
Bug 913938

Summary: python: current .pyc check scheme makes .pyc files non-reproducible
Product: Gentoo Linux
Reporter: thssld
Component: Current packages
Assignee: Python Gentoo Team <python>
Status: UNCONFIRMED ---
Severity: normal
CC: mgorny
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 913920    

Description thssld 2023-09-10 12:56:28 UTC
By default, Python embeds a build-time timestamp in each .pyc file. When loading, the interpreter checks that timestamp to decide whether the .pyc matches the .py file. Since we ship .pyc files in binpkgs, we currently cannot have reproducible Python-related packages.
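
To see what the timestamp field holds on a given system, the .pyc header can be read directly. The following is a minimal sketch, assuming the 16-byte header layout described in PEP 552 (magic, flags, then a 32-bit mtime and a 32-bit source size for timestamp-based files). The helper names `read_timestamp_header` and `compile_and_inspect` are made up for illustration and are not part of any Gentoo tooling.

```python
import os
import py_compile
import struct
import tempfile


def read_timestamp_header(pyc_path):
    """Return (flags, mtime, size) from a timestamp-based .pyc header."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    # Bytes 0-3 are the magic number; the next three fields are
    # little-endian unsigned 32-bit integers.
    flags, mtime, size = struct.unpack("<III", header[4:16])
    return flags, mtime, size


def compile_and_inspect(source_text):
    """Compile a throwaway module and compare header fields with os.stat()."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        pyc = os.path.join(tmp, "example.pyc")
        with open(src, "w") as f:
            f.write(source_text)
        # Force timestamp mode explicitly; the default can change when
        # SOURCE_DATE_EPOCH is set in the environment.
        py_compile.compile(
            src, cfile=pyc, doraise=True,
            invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
        )
        flags, mtime, size = read_timestamp_header(pyc)
        st = os.stat(src)
        return {
            "flags": flags,
            "pyc_mtime": mtime,
            "source_mtime": int(st.st_mtime) & 0xFFFFFFFF,
            "pyc_size": size,
            "source_size": st.st_size & 0xFFFFFFFF,
        }
```

Calling `compile_and_inspect("x = 1\n")` returns the header fields next to the values from `os.stat()`, which makes it easy to check what a particular build actually recorded.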

Python can instead store a hash of the source in the .pyc file. In this mode, the interpreter computes the hash of the .py file at load time and checks that it matches the field in the .pyc file, so this mode has some performance cost.
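
As an illustration of the hash-based mode, the sketch below compiles the same source twice with different mtimes and returns only the 16-byte header. It assumes the `invalidation_mode` parameter of `py_compile.compile` (available since Python 3.7); the helper name `pyc_header` and the file names are hypothetical. Only the header is compared here, not the marshalled code that follows it.

```python
import os
import py_compile
import tempfile
from py_compile import PycInvalidationMode


def pyc_header(source_text, mtime, mode):
    """Compile source_text with the given source mtime; return the 16-byte header."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "example.py")
        pyc = os.path.join(tmp, "example.pyc")
        with open(src, "w") as f:
            f.write(source_text)
        # Pretend the source was installed at a specific time.
        os.utime(src, (mtime, mtime))
        py_compile.compile(
            src, cfile=pyc, dfile="example.py", doraise=True,
            invalidation_mode=mode,
        )
        with open(pyc, "rb") as f:
            return f.read(16)


# Hash-based headers depend only on the source contents ...
h1 = pyc_header("x = 1\n", 1_000_000_000, PycInvalidationMode.CHECKED_HASH)
h2 = pyc_header("x = 1\n", 1_500_000_000, PycInvalidationMode.CHECKED_HASH)

# ... while timestamp-based headers change with the source mtime.
t1 = pyc_header("x = 1\n", 1_000_000_000, PycInvalidationMode.TIMESTAMP)
t2 = pyc_header("x = 1\n", 1_500_000_000, PycInvalidationMode.TIMESTAMP)
```

With `CHECKED_HASH` the interpreter re-hashes the source on import, which is the performance cost mentioned above; `UNCHECKED_HASH` skips that check and trusts the .pyc file.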

Debian does not seem to ship .pyc files, so their binpkgs can be reproducible.
Arch Linux seems to use the hash-based method.

See:
https://peps.python.org/pep-0552/
https://fedoraproject.org/wiki/Changes/ReproducibleBuildsClampMtimes

Comment 1 Michał Górny (archtester, Gentoo Infrastructure, gentoo-dev, Security) 2023-09-12 06:04:25 UTC
(In reply to thssld from comment #0)
> Python defaults to embed build-time timestamp into .pyc. During loading,
> python interpreter checks timestamp to decide if .pyc matches the .py file.
> Since we ship with .pyc files in binpkgs, current we cannot have
> reproducible python related packages.

I don't get this.  We ship both .py and .pyc, and we preserve timestamps, so the files always match.  What's the problem?

Comment 2 thssld 2023-09-12 15:02:36 UTC
(In reply to Michał Górny from comment #1)
> (In reply to thssld from comment #0)
> > Python defaults to embed build-time timestamp into .pyc. During loading,
> > python interpreter checks timestamp to decide if .pyc matches the .py file.
> > Since we ship with .pyc files in binpkgs, current we cannot have
> > reproducible python related packages.
> 
> I don't get this.  We ship both .py and .pyc, and we preserve timestamps, so
> th files always match.  What's the problem?

The timestamp embedded in the .pyc file is the build time rather than the mtime of the .py file, so it differs across builds.

Comment 3 Michał Górny (archtester, Gentoo Infrastructure, gentoo-dev, Security) 2023-09-14 10:17:46 UTC
That's not a problem.  Python still correctly recognizes the .pyc file as being up-to-date.