I have sed 4.0.9 compiled with NLS support enabled. However, in a UTF-8 terminal I can't process Latin-1 text anymore.

Reproducible: Always

Steps to Reproduce:
1. echo -e "foo\337bar" | LC_CTYPE=de_DE.UTF-8 sed s/foo.*bar/bar/
2. echo -e "foo\337bar" | LC_CTYPE=C sed s/foo.*bar/bar/

Actual Results:
fooßbar
bar

Expected Results:
bar
bar
I'm pretty sure this is INVALID. If I'm remembering my Unicode right, \337b is one 'character' in UTF-8. What's the output of the following?

echo -e "foo\337bar" | LC_CTYPE=de_DE.UTF-8 sed s/foo.*ar/bar/
Sorry for the long delay. Hm, yes, you're right, \337b is one character. However:

$ echo -e "foo\337bar" | LC_CTYPE=de_DE.UTF-8 sed s/foo.*ar/bar/
fooßbar

and even:

$ echo -e "foo\337bar" | LC_CTYPE=de_DE.UTF-8 sed s/foo.*/bar/
barßbar
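For anyone reproducing this, a minimal sketch of how one might check the input bytes independently of sed, and work around the problem by converting the Latin-1 text to UTF-8 before matching. It assumes an iconv binary is installed (iconv is not mentioned anywhere in this report) and that it exits non-zero on undecodable input, as glibc's does. The variable names are only for illustration. printf is used instead of echo -e because its octal escapes behave the same across shells.

```shell
# Does "foo", byte 0337 (ß in ISO-8859-1), "bar" decode as UTF-8?
# iconv is asked to convert UTF-8 to UTF-8, so it only validates.
if printf 'foo\337bar\n' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    verdict="decodes as UTF-8"
else
    verdict="does not decode as UTF-8"
fi
echo "$verdict"

# Workaround sketch: convert Latin-1 to UTF-8 first, then match.
# LC_ALL=C makes sed treat the input as single bytes; LC_ALL is
# used rather than LC_CTYPE because it overrides LC_CTYPE when
# both are set in the environment.
result=$(printf 'foo\337bar\n' | iconv -f ISO-8859-1 -t UTF-8 | LC_ALL=C sed 's/foo.*bar/bar/')
echo "$result"
```

The second command sidesteps the locale question: after conversion the input is valid in a UTF-8 locale as well, so the same pipeline could be tried with LC_ALL=de_DE.UTF-8 to compare sed versions.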
Can you try this out with sed-4.1.2?
Please get back to us on whether 4.1.2 does the right thing.