Bug 584040 - app-misc/mc: using mcedit/mcview on a file chokes with unicode
Summary: app-misc/mc: using mcedit/mcview on a file chokes with unicode
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Sergei Trofimovich (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-24 21:48 UTC by Raymond Jennings
Modified: 2017-01-22 16:38 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (einfo.txt, 5.93 KB, text/plain)
2016-05-25 21:06 UTC, Raymond Jennings
locale output (foo, 200 bytes, text/plain)
2016-05-25 21:15 UTC, Raymond Jennings

Description Raymond Jennings 2016-05-24 21:48:01 UTC
When I'm viewing or editing a file in mc, it messes up when trying to display Unicode characters.

The changelog for eclean-kernel provokes this issue with Michael Gorny's name for example.

I get garbage at the end of the line that looks like the line wasn't completely painted/drawn, and often the actual cursor position doesn't match the one indicated at the top of the screen.

Suggested solution: Make sure that Unicode characters are counted properly. I think you should check that wcwidth and wcswidth are being used correctly, and that mc isn't naively assuming that one byte equals one character, particularly when it's processing UTF-8.
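
To illustrate the distinction the suggestion relies on, here is a minimal Python sketch that compares three different counts for the same string: encoded bytes, code points, and terminal columns. This is not mc's code. The helper `display_width` is a hypothetical name, and it only approximates what wcswidth() reports by using the East Asian Width property from the standard `unicodedata` module; unlike wcwidth(), it counts control characters as one column and does not handle every zero-width character.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns `text` occupies.

    Hypothetical helper: a rough stand-in for wcswidth(), not an
    exact reimplementation of it.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters take two columns.
            width += 2
        else:
            width += 1
    return width


# "G\u00f3rny" contains one two-byte character; the second sample is CJK text.
for sample in ("G\u00f3rny", "\u65e5\u672c\u8a9e"):
    print(
        repr(sample),
        "bytes:", len(sample.encode("utf-8")),
        "code points:", len(sample),
        "columns:", display_width(sample),
    )
```

A program that positions the cursor by byte count would be off by one column on the first sample and by three on the second, which would be consistent with the mismatch between the drawn cursor and the position shown at the top of the screen.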
Comment 1 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2016-05-25 04:33:36 UTC
Please attach the output of 

  emerge --info app-misc/mc

to this bug and post the output of the command

 locale

into the comments.
Comment 2 Sergei Trofimovich (RETIRED) gentoo-dev 2016-05-25 20:46:13 UTC
Please also provide:
- minimal "broken" text file
- a screenshot showing how it's broken
- current encoding in mcedit (seen by pressing Ctrl+e)
Comment 3 Raymond Jennings 2016-05-25 21:06:27 UTC
Created attachment 435390
emerge --info
Comment 4 Raymond Jennings 2016-05-25 21:15:36 UTC
Created attachment 435392
locale output
Comment 5 Rafał Mużyło 2016-05-25 23:38:02 UTC
...:sigh:...

That's actually a well known *upstream* bug (at least IIRC).

If you want to see something funny, try editing a file with some double-width chars on two consecutive lines and moving up/down between them.

Sometimes you'll be able to move *into* a double-width char; that is, if you try to backspace a char, instead of removing the whole char, you end up removing only part of its UTF-8 sequence.
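
For reference, a backspace that cannot split a character might look like the sketch below. It is written in Python for brevity and is not taken from mc. The function name `backspace` is hypothetical, and it assumes the buffer holds valid UTF-8 and that the cursor is a byte offset sitting just after a complete character.

```python
from typing import Tuple


def backspace(buf: bytes, cursor: int) -> Tuple[bytes, int]:
    """Delete the whole UTF-8 character that ends at byte offset `cursor`.

    Returns the new buffer and the new cursor offset.
    Assumes `buf` is valid UTF-8 and `cursor` is on a character boundary.
    """
    if cursor <= 0:
        return buf, 0
    start = cursor - 1
    # Continuation bytes have the bit pattern 10xxxxxx; keep stepping
    # back until the lead byte of the sequence is reached.
    while start > 0 and (buf[start] & 0xC0) == 0x80:
        start -= 1
    return buf[:start] + buf[cursor:], start


data = "a\u65e5".encode("utf-8")      # one ASCII byte plus a three-byte character
new_data, new_cursor = backspace(data, len(data))
print(new_data, new_cursor)
```

Removing only the last byte instead would leave a truncated sequence in the buffer that no longer decodes as UTF-8, which is the kind of breakage described above.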
Comment 6 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2016-05-26 10:07:37 UTC
(In reply to Rafał Mużyło from comment #5)
> ...:sigh:...
> 
> That's actually a well known *upstream* bug (at least IIRC).
> 

If it's a well known bug can you please add the upstream bug URL to this bug?
Comment 7 Rafał Mużyło 2016-05-30 14:56:51 UTC
(In reply to Lars Wendler (Polynomial-C) from comment #6)
> (In reply to Rafał Mużyło from comment #5)
> > ...:sigh:...
> > 
> > That's actually a well known *upstream* bug (at least IIRC).
> > 
> 
> If it's a well known bug can you please add the upstream bug URL to this bug?

...OK, it seems I stand corrected on the *'reported'* part (at least I'm unable to find it in their bugtrack), yet that's still a long-standing *upstream* problem.

I still recall the time when UTF-8 support was provided by a mostly-working external patch, and while mc has made significant progress since, issues like these are still not that hard to find if you start looking. If you switch from single-byte to variable byte length and add variable char width on top of it, such problems are pretty much expected and sprinkled all over the code.

If you want more, there's an unrelated one: in copy/move dialogs, if you Esc-Tab a path containing spaces, mc will backslash-escape the completion, but won't go any deeper with the completion unless you remove those backslashes.

I'm too used to the interface (and a few functions) to drop it, but major warts....
Comment 8 Yury V. Zaytsev 2016-10-07 21:03:28 UTC
(In reply to Rafał Mużyło from comment #7)
> ...OK, it seems I stand corrected on the *'reported'* part (at least I'm
> unable to find it in their bugtrack), yet that's still a long-standing
> *upstream* problem.

I cannot guarantee that this bug will be fixed if you report it upstream, but I can guarantee that it will *never* be addressed if you don't.

A proper report upstream with a minimal reproducer + screenshot as requested by @slyfox will be appreciated; Egmont has fixed a number of similar / related problems in the past, and I'm sure that he can look into it if asked kindly.

> If you want more, there's an unrelated one: in copy/move dialogs, if you
> Esc-Tab a path containing spaces, mc will backslash-escape the completion,
> but won't go any deeper with the completion unless you remove those
> backslashes.

Same here.
Comment 9 Raymond Jennings 2016-10-07 23:15:56 UTC
Sorry ^^ I thought it was the Gentoo-level maintainer of this package who had the responsibility of making upstream reports.

I've been waylaid irl with drama.
Comment 10 Sergei Trofimovich (RETIRED) gentoo-dev 2017-01-22 16:38:04 UTC
So far I haven't seen any of the following in this bug (requested in comment #2):

- minimal "broken" text file
- a screenshot showing how it's broken
- current encoding in mcedit (seen by pressing Ctrl+e)

Closing as INVALID.