Version bump to psycopg-2.4.2 would seem to be in order due to this. Comment #45 indicates it is fixed in that version, quoting comment #8:

I believe the crash is happening due to a bug in python-psycopg2. I am about to attach a proposed patch to that package which I believe will fix it. I don't have a minimal reproducer yet.

Technical analysis follows:

Based on reviewing the coredump, the crash is happening in thread #1, during one of Python's periodic invocations of its garbage collector. The assertion failure is that an object has a lower reference count than expected (based on the reference-owning objects pointing to it). This object is a StringIO object, containing a series of lines containing dates and numbers (I'm deliberately not including it in a public bz comment, in case it contains private information).

Meanwhile thread #4 is within python-psycopg2's pq_execute function, executing a query. Specifically, it is within the function "psyco_curs_copy_from" in psycopg2-2.0.13/psycopg/cursor_type.c

I notice that within thread #4, pcurs->copyfile contains the textual data seen in the crash in thread #1. What's happened is that the StringIO instance containing the query has too low a reference count.

Further, psycopg2's cursorObject participates in the Python garbage collector. In psycopg2-2.0.13/psycopg/cursor_type.c, cursor_traverse has:

    Py_VISIT(self->copyfile);

psyco_curs_copy_from acquires PyObject *file via:

    if (!PyArg_ParseTupleAndKeywords(args, kwargs,
        "O&s|ss" CONV_CODE_PY_SSIZE_T "O", kwlist,
        _psyco_curs_has_read_check, &file, &table_name, &sep, &null,
        &bufsize, &columns))

i.e. from "O& (object) [converter, anything]" with converter "_psyco_curs_has_read_check".

Looking at _psyco_curs_has_read_check, it has this code and comment:

    /* It's OK to store a borrowed reference, because it is only held for
     * the duration of psyco_curs_copy_from.
     */
    *((PyObject**)var) = o;
    return 1;

This is thus storing a borrowed reference to the StringIO object into psycopg2's cursorObject without INCREFing it. This reference is visible to the garbage collector.

The comment in the code above is incorrect: within the call to pq_execute, there is an invocation of the Py_BEGIN_ALLOW_THREADS macro. This allows the GIL to transition to another thread, and it seems that this is what happened in this case (within the coredump, that thread is waiting at the subsequent Py_END_ALLOW_THREADS macro, waiting to reacquire the GIL).

When the GC runs in the other thread, it detects that the refcount is too low, and bails out with an assertion failure. (I believe that if assertions were disabled, it could try to free up the memory, so we'd be seeing a different kind of crash.)

Reproducible: Always
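To illustrate the bookkeeping mismatch described in the analysis, here is a minimal toy model in plain C. It does not use the CPython API or any psycopg2 code: Obj, Container, store_borrowed, store_owned, refs_left_after_traverse and the two simulate_* functions are all hypothetical names invented for this sketch. It assumes the collector's check amounts to starting from the object's refcount and subtracting one for each reference it finds while traversing containers, so a negative result corresponds to the "refcount too low" condition.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a refcounted object; NOT CPython's PyObject. */
typedef struct {
    int refcount;
} Obj;

/* Toy stand-in for any GC-visible container (argument tuple, cursor). */
typedef struct {
    Obj *slot;
} Container;

/* Store a pointer without counting it: a borrowed reference. */
static void store_borrowed(Container *c, Obj *o)
{
    assert(c != NULL && o != NULL);
    c->slot = o;
}

/* Count the reference, then store it: an owned reference. */
static void store_owned(Container *c, Obj *o)
{
    assert(c != NULL && o != NULL);
    o->refcount++;
    c->slot = o;
}

/* Assumed model of the collector's check: start from the refcount and
 * subtract one per container slot that points at the object.
 * A negative result means some container holds an uncounted reference. */
static int refs_left_after_traverse(const Obj *o, const Container *cs, size_t n)
{
    int left = o->refcount;
    for (size_t i = 0; i < n; i++) {
        if (cs[i].slot == o)
            left--;
    }
    return left;
}

/* Buggy pattern: the cursor stores the file object uncounted. */
int simulate_borrowed(void)
{
    Obj file = { 0 };
    Container cs[2] = { { NULL }, { NULL } }; /* cs[0]: args, cs[1]: cursor */

    store_owned(&cs[0], &file);    /* the argument tuple owns one reference */
    store_borrowed(&cs[1], &file); /* the cursor's copy is not counted */
    return refs_left_after_traverse(&file, cs, 2);
}

/* Corrected pattern: the cursor takes its own reference. */
int simulate_owned(void)
{
    Obj file = { 0 };
    Container cs[2] = { { NULL }, { NULL } };

    store_owned(&cs[0], &file);
    store_owned(&cs[1], &file);
    return refs_left_after_traverse(&file, cs, 2);
}
```

In this model simulate_borrowed() returns -1 (two visible references, one counted) and simulate_owned() returns 0. In the real extension, the analogous change would presumably be to take a reference before storing the object in the cursor and release it once the copy has finished. That is an assumption here, not a description of the attached patch.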
We now have 2.4.2 in the tree (about to be stabilized in bug 392749).