perldoc -f open example for utf encoding may be incorrect #5779
Comments
From dcd@tc.fluke.com
Created by dcd@tc.fluke.com
perldoc -f open shows an example
For example
will open the UTF-8 encoded file containing
but to read a Unicode file generated on Windows it appears that
Perl Info
From @jhi
I do not quite understand why, "for a Unicode file generated in Windows", the sample should be changed to "utf-16". Windows can generate UTF-8 just fine.
From david.dyck@fluke.com
On 27 Nov 2002 at 04:35 -0000, Jarkko Hietaniemi <perlbug@perl.org> wrote:
Thanks for your question, and as I'm still learning a bit about unicode, I created a text file consisting of the word "test" in windows NT using
A hex dump of the file would tend to indicate that the file was utf-16,
$ hd -x /usr0/dcd/test-unicode.txt
http://www.unicode.org/unicode/faq/utf_bom.html#25
A: No, a BOM can be used as a signature no matter how the Unicode text
Bytes Encoding Form
The above leads me to think that the default unicode text format on
Upon re-reading the "perldoc -f open" section that I questioned ...
For example
I would have to agree that the statement was not literally
perl -e 'open(FH, "<:utf16", "/usr0/dcd/test-unicode.txt") || die "open failed:$!"'
Perhaps I should submit a bug that reports that utf16 files are not
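The byte-order-mark check described above can be tried by hand. The following is a minimal sketch and not part of the original report: `bom_guess` is a hypothetical helper name, and it assumes `head -c`, `od`, and `tr` are available. It maps the leading bytes of a file to an encoding form using the standard BOM signatures from the Unicode FAQ linked above.

```shell
# bom_guess FILE: hypothetical helper that names the encoding form
# suggested by the byte order mark at the start of FILE.
bom_guess() {
    # First four bytes as lowercase hex with no separators, e.g. "fffe7400".
    head_hex=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$head_hex" in
        0000feff) echo "UTF-32BE" ;;   # 00 00 FE FF
        fffe0000) echo "UTF-32LE" ;;   # FF FE 00 00 (checked before UTF-16LE)
        efbbbf*)  echo "UTF-8" ;;      # EF BB BF
        feff*)    echo "UTF-16BE" ;;   # FE FF
        fffe*)    echo "UTF-16LE" ;;   # FF FE
        *)        echo "no BOM" ;;
    esac
}
```

A BOM is only a hint. A UTF-16LE file whose first character is U+0000 begins with the same four bytes as a UTF-32LE BOM, so this sketch would report it as UTF-32LE.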
From @jhi
Okay, now I understand a little bit better. Still, you really should get rid of the "Windows Unicode" meme :-) There's only one Unicode, which has various different encodings. Perl prefers UTF-8, Windows prefers (little-endian) UTF-16. In the three-argument open, ":utf8" is currently a special case. The general case is ":encoding(foobar)", so it would be ":encoding(utf16)", but I have to admit that this seems to have a bug currently: one gets strange "UTF-16:Partial character" warnings that I think shouldn't happen. I'll ask the guy working on the encoding bits. I guess we could also make ":utf16" (and ":utf32", I guess) another special case, since UTF-16 is prevalent enough.
From david.dyck@fluke.com
On 3 Dec 2002 at 02:17 -0000, Jarkko Hietaniemi <perlbug@perl.org> wrote:
I didn't get any warnings from the following code
perl -we 'open(FH, "<:encoding(utf16)", "/usr0/dcd/test-unicode.txt")
That would be nice, but next time I should just read the
From @jhi
Okay... I think you got me convinced that things work as they should :-)
@jhi - Status changed from 'new' to 'resolved'
Migrated from rt.perl.org#15533 (status was 'resolved')
Searchable as RT15533$