After a kernel upgrade and an uptime of 6 days, our Asterisk phones stopped making calls to the Asterisk server. We initially believed this to be an Asterisk problem, but on further investigation we noticed the following in the kernel log:

Oct 10 15:30:21 jared kernel: nf_conntrack: expectation table full

The error message first appeared at the same time the phone system went down. We then found that the expectation table had zero length, using the following commands:

# wc /proc/net/ip_conntrack_expect
0 0 0 /proc/net/ip_conntrack_expect

and

# conntrack -C expect
0

I couldn't work out why this was the case, so I rebooted the machine, and the expectation table is once again operating normally. I would like to be able to diagnose the problem in case the same thing happens again in 6 days' time.

Reproducible: Couldn't Reproduce

Steps to Reproduce:
I'm not sure the problem is reproducible without waiting 6 days to see if it recurs on our system.

The machine runs OpenVZ with several guests. Once the expectation table reports itself as full, all connections that require it are broken. If I knew how to re-initialise the expectation table without rebooting, the problem would be less critical. If it requires a reboot each time to fix it, the impact is more severe.

Please let me know what further information you require.
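For anyone wanting to capture the same state before rebooting, here is a minimal sketch that records the number of expectation entries next to the configured limit. The function name `expect_snapshot` is hypothetical, and the default paths are assumptions: the report above used /proc/net/ip_conntrack_expect, and the location of the table and of the limit depends on the kernel version and on which conntrack modules are loaded.

```shell
#!/bin/sh
# Sketch only: print the expectation entry count and the configured limit.
# Both paths are parameters; the defaults below are assumptions and may
# not exist on every kernel (older kernels expose ip_conntrack_expect).
expect_snapshot() {
    table="${1:-/proc/net/nf_conntrack_expect}"
    limit="${2:-/proc/sys/net/netfilter/nf_conntrack_expect_max}"

    if [ -r "$table" ]; then
        # One line per expectation; strip padding some wc versions add.
        entries=$(wc -l < "$table" | tr -d ' ')
    else
        entries="unknown"
    fi

    if [ -r "$limit" ]; then
        max=$(cat "$limit")
    else
        max="unknown"
    fi

    echo "expect_entries=$entries expect_max=$max"
}
```

Running `expect_snapshot` periodically (for example from cron, appending to a log with a timestamp) would show whether the visible entry count stays at 0 while the kernel still reports the table as full.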
So, did it happen again?
Not been able to reproduce the problem as yet - how long do you think that I should leave it before putting it down to goblins?
It's fine for now. Feel free to reopen if any new data or comments arise.
The problem has recurred. I'm not sure what you need me to do to report it - but while it is corrupt we can no longer register our phones with our SIP machine. As it's the end of the working day I can leave the machine up (with the corrupt nf_conntrack) to work out what is broken, but it will have to be restarted before the working day tomorrow - thus wiping potential troubleshooting opportunities. There doesn't appear to have been anything unique that happened when the table became corrupted. Evidently the only way to reload the table is to remove the module - but due to the dependencies I'm not sure this is possible without taking out the network connection.
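On the dependency question, a rough way to see what currently holds the module is to read its line in /proc/modules, which lists a reference count and the modules that use it. The sketch below assumes the usual /proc/modules column layout (name, size, reference count, users); the function name `module_users` is hypothetical. It only reports state and does not unload anything.

```shell
#!/bin/sh
# Sketch only: show the reference count and dependent modules for one module.
# Assumes /proc/modules columns: name size refcount users state address.
module_users() {
    mod="$1"
    file="${2:-/proc/modules}"
    # The users column is comma-terminated, or "-" when nothing depends on it.
    awk -v m="$mod" '$1 == m {
        sub(/,$/, "", $4)
        print "refcount=" $3 " used_by=" $4
    }' "$file"
}
```

For example, `module_users nf_conntrack` prints the modules that would have to be removed first; a non-zero reference count with no listed users usually means something other than a module (such as active rules) still holds it.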