If I run "file" against a sample with many empty lines (>= 100,000), it seems to hang. Depending on the system, it runs for minutes to hours at 100% CPU load.

The regular expression for detecting AWK scripts in file-5.07/magic/Magdir/commands seems to be the cause. Without this regex, "file" works as expected:

    --- file-5.07/magic/Magdir/commands 2011-05-02 14:36:41.000000000 +0200
    +++ file-5.07/magic/Magdir/commands.patched 2011-08-16 17:37:12.327729653 +0200
    @@ -48,8 +48,8 @@
     0 string/wt #!\ /bin/awk awk script text executable
     !:mime text/x-awk
     0 string/wt #!\ /usr/bin/awk awk script text executable
    -!:mime text/x-awk
    -0 regex =^\\s*BEGIN\\s*[{] awk script text
    +# !:mime text/x-awk
    +# 0 regex =^\\s*BEGIN\\s*[{] awk script text

     # AT&T Bell Labs' Plan 9 shell
     0 string/wt #!\ /bin/rc Plan 9 rc shell script text executable

Test with 10,000 empty lines:

1. Create a test file with 10,000 empty lines for a quick test:

    for i in `seq 1 10000`; do echo >>/tmp/file_with_10000_empty_lines.txt ; done

2. Run file with the original magic.mgc:

    time file -m /usr/share/misc/magic.mgc /tmp/file_with_10000_empty_lines.txt
    /tmp/file_with_10000_empty_lines.txt: ASCII text

    real 0m2.243s
    user 0m2.230s
    sys 0m0.000s

3. Run file with the patched magic.mgc:

    time file -m /usr/share/misc/magic.patched.mgc /tmp/file_with_10000_empty_lines.txt
    /tmp/file_with_10000_empty_lines.txt: ASCII text

    real 0m0.005s
    user 0m0.000s
    sys 0m0.000s
Presumably you're running this on a server. You can mitigate the issue by running file with LC_ALL=C (the C locale) rather than something like en_US.UTF8.
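The locale workaround can be tried against the sample from the report with something like the sketch below. It assumes a POSIX shell with "file" installed and uses the default magic database; the /tmp path is simply the sample name from the report, and whether the runtime actually drops will depend on the system and the file version.

```shell
#!/bin/sh
# Rebuild the sample from the report: 10,000 empty lines.
sample=/tmp/file_with_10000_empty_lines.txt
yes '' | head -n 10000 > "$sample"

# Set LC_ALL=C for this one invocation only; the rest of the
# session keeps its normal locale (e.g. en_US.UTF8).
if command -v file >/dev/null 2>&1; then
    LC_ALL=C file "$sample"
else
    echo "file(1) not found; skipping" >&2
fi
```

Prefixing the "file" command with "time" in bash, as in the report, should show whether the change helps on a given machine. Setting the variable per invocation, rather than exporting it, avoids changing the locale for other programs.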