Bug 365263 - sys-apps/less - auto convert UTF16/UTF32 inputs to active encoding
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All All
Importance: Normal enhancement
Assignee: Gentoo's Team for Core System packages
Reported: 2011-04-28 23:18 UTC by Scott Bertilson
Modified: 2014-11-21 09:28 UTC

Attachments
cleaned up patch (lesspipe.patch, 780 bytes, patch)
2013-10-21 15:35 UTC, Scott Bertilson

Description Scott Bertilson 2011-04-28 23:18:12 UTC
Fedora recently added this feature. I was able to hack it into the lesspipe.sh that is part of the ebuild, but I had to disable the early exit for *.conf, *.txt, and *.log files to do so. I have not had time to work out how to integrate the conversion with that early exit.
--- /usr/portage/sys-apps/less/files/lesspipe.sh        2011-01-19 21:31:13.000000000 -0600
+++ /home/ssb/bin/./lesspipe.sh 2011-04-28 18:07:18.029585226 -0500
@@ -91,7 +91,7 @@
        *.ps|*.pdf) ps2ascii "$1" || pstotext "$1" || pdftotext "$1" ;;
        *.doc)      antiword "$1" || catdoc "$1" ;;
        *.rtf)      unrtf --nopict --text "$1" ;;
-       *.conf|*.txt|*.log) ;; # force less to work on these directly #150256
+#      *.conf|*.txt|*.log) ;; # force less to work on these directly #150256
 
        ### URLs ###
        ftp://*|http://*|*.htm|*.html)
@@ -195,6 +195,17 @@
                        # Maybe we didn't match because the file is named weird ...
                        1) lesspipe_file "$1" ;;
                esac
+               if [ -x /usr/bin/file -a -x /usr/bin/iconv -a -x /usr/bin/cut ]; then
+                       case `file -b "$1"` in
+                       *UTF-16*) conv='UTF-16' ;;
+                       *UTF-32*) conv='UTF-32' ;;
+                       esac
+                       env=`echo $LANG | cut -d. -f2`
+                       if [ -n  "$conv" -a -n "$env" -a "$conv" != "$env" ]; then
+                               iconv -f $conv -t $env "$1"
+                               exit $?
+                       fi
+               fi
 
                # So no matches from above ... finally fall back to an external
                # coloring package.  No matching here so we don't have to worry

Reproducible: Always

Steps to Reproduce:
1. View any file with UTF-16 content in less; it is displayed without being decoded.
Comment 1 SpanKY gentoo-dev 2011-05-02 01:45:04 UTC
post patches as attachments.  bugzilla comments corrupt them.

hardcoded paths are never acceptable.  the invocation is also wrong, as it will fail on files whose names begin with a dash.
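
A minimal sketch of one way to address these two points, assuming a POSIX shell. The helper names `have_tools` and `safe_path` are hypothetical and are not part of lesspipe.sh or the attached patch. The first looks tools up on PATH instead of testing fixed locations; the second prefixes relative names with "./" so that a leading dash is not parsed as an option.

```shell
# Hypothetical helper: succeed only if every named tool is found on PATH,
# instead of testing hardcoded locations such as /usr/bin/file.
have_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || return 1
    done
}

# Hypothetical helper: print a form of the path that cannot be mistaken
# for an option. Absolute paths pass through; anything else gets "./".
safe_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "./$1" ;;
    esac
}

# Possible use inside the script (not executed here):
#   if have_tools file iconv; then
#       kind=$(file -b "$(safe_path "$1")")
#   fi
```

The "./" prefix only makes sense for local files; URL arguments such as ftp://* would need to be handled before this point.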

the logic also breaks as the comment indicates.

the LANG parsing is wrong.  pretty sure it should be LC_CTYPE.  further, splitting on "." doesn't work for all possible names.  it will also wrongly exit if the locale is invalid or there is a conversion error.
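
One possible way to sidestep the parsing problem, sketched under the assumption that the POSIX `locale` utility is installed: `locale charmap` prints the codeset of the active locale and resolves LC_ALL, LC_CTYPE, and LANG itself, so no splitting on "." is needed. The function name `active_charset` is hypothetical.

```shell
# Hypothetical helper: print the codeset of the active locale, or return
# non-zero if it cannot be determined (so the caller can fall through).
active_charset() {
    command -v locale >/dev/null 2>&1 || return 1
    charset=$(locale charmap 2>/dev/null) || return 1
    [ -n "$charset" ] && printf '%s\n' "$charset"
}

# Possible use inside the script (not executed here):
#   to=$(active_charset) || to=
```

How an invalid locale is reported differs between platforms, so a caller would still need to treat a failed iconv run as "no conversion" rather than exiting.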
Comment 2 Tony Vroon gentoo-dev 2013-10-18 17:12:39 UTC
This bug has been open, awaiting a response for over two years now. If you are still interested in having this feature added, please attach a patch that is modified in response to vapier's feedback.
(I am closing out old base-system bugs to get the list to a manageable size.)
Comment 3 Scott Bertilson 2013-10-21 15:35:08 UTC
Created attachment 361536
cleaned up patch

Cleaned up patch based on your comments.  Sorry I didn't ever get back to you.

I think LANG needs to be included since that's the last resort, but from what I've read it should try, in order of preference, LC_ALL, then LC_CTYPE, then LANG.
I'm not sure what to do about the parsing problem with using ".", but it seems reasonable to me for it to fall through if it can't find a second field after a ".".
I also changed it to fall through if iconv returns anything other than success since there are some conversions that it doesn't like such as ca_ES.UTF-8@valencia.  I wasn't sure what to do about the error message in view of falling through - maybe it should go to /dev/null.
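
The behavior described above could look roughly like the following sketch. It is not the attached patch; the function names `locale_codeset` and `convert_or_fall_through` are hypothetical. The file is fed to iconv on stdin, and the output is buffered in a variable so that a failed conversion prints nothing, which is only reasonable for small text files.

```shell
# Hypothetical helper: take the first non-empty value of LC_ALL, LC_CTYPE,
# LANG and print its codeset. Returns 1 if there is no ".codeset" part.
locale_codeset() {
    loc=${LC_ALL:-${LC_CTYPE:-${LANG:-}}}
    case $loc in
        *.*) ;;
        *)   return 1 ;;
    esac
    codeset=${loc#*.}        # drop "language_TERRITORY."
    codeset=${codeset%%@*}   # drop a trailing "@modifier", if any
    [ -n "$codeset" ] && printf '%s\n' "$codeset"
}

# Hypothetical helper: convert FILE from encoding FROM to the locale's
# codeset. Any failure returns non-zero with no output, so the caller
# can fall through to its other handlers instead of exiting.
# usage: convert_or_fall_through FROM FILE
convert_or_fall_through() {
    to=$(locale_codeset) || return 1
    [ "$1" != "$to" ] || return 1
    # Reading from stdin avoids passing the file name as an argument;
    # iconv's error messages are discarded.
    out=$(iconv -f "$1" -t "$to" < "$2" 2>/dev/null) || return 1
    printf '%s\n' "$out"
}
```

With this shape, a value such as ca_ES.UTF-8@valencia yields UTF-8 as the target, and a locale without a codeset (for example C) simply skips the conversion.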
Comment 4 SpanKY gentoo-dev 2013-12-22 23:39:15 UTC
can you post some sample files and the locales you're using to view them?