Discussion:
Console encoding on Windows, output to file with ">"
Ondrej Pokorny
2017-05-03 06:34:14 UTC
Hello,

I have a simple FPC program:

---
program GermanTest;

{$codepage utf8}
begin
  Writeln('ÄäÖöÜüß');
end.
---

If I run it in the console, I see the correct characters.

BUT when I redirect the output to a file with ">":

GermanTest.exe > GermanTest.txt

I get corrupted characters "Ž„™”šá" (see attachment).

I tried various possibilities but either the console output is correct
or the file output is correct. Never both of them.

E.g.
---
program GermanTest;

{$codepage utf8}

uses
  Windows, SysUtils, Classes;

begin
  SetConsoleOutputCP(CP_UTF8);

  Writeln('ÄäÖöÜüß');
end.
---

Works fine for file output but the console shows wrong characters.

What should I do so that both outputs are correct? Or is it an FPC bug?
Delphi works fine - both outputs are correct.

FPC version: current trunk 3.1.1 Windows, 32bit.
OS: Windows 10, 64bit

Ondrej
Michael Thompson
2017-05-03 06:56:20 UTC
Post by Ondrej Pokorny
GermanTest.exe > GermanTest.txt I get corrupted characters
"Ž„™”šá" (see attachment).

When I open the attachment on my PC, I see the correct characters:
[image: Inline images 1]

No idea what this means, just sharing info.

Good luck

Mike
Petr Kristan
2017-05-03 07:07:51 UTC
Post by Ondrej Pokorny
SetConsoleOutputCP(CP_UTF8);
Writeln('ÄäÖöÜüß');
Try:
Writeln(UTF8ToSys('ÄäÖöÜüß'));
--
Petr Kristan
EPOS PRO s.r.o., Smilova 333, 530 02 Pardubice
tel: +420 461101401 Czech Republic (Eastern Europe)
fax: +420 461101481
_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/f
Ondrej Pokorny
2017-05-03 09:43:59 UTC
Post by Petr Kristan
Writeln(UTF8ToSys('ÄäÖöÜüß'));
No, it doesn't help.

I debugged the issue and found that the exported file was in DOS-850
encoding (the same code page the console was using). Delphi changes the
output encoding to "DefaultSystemCodePage" if Output is not the
console. This code shows the same behavior as in Delphi:

---
program GermanTest;

{$codepage utf8}

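uses
  Windows, SysUtils;

begin
  { Output is not a character device (redirected to a file or pipe):
    switch it to the system code page, as Delphi does }
  if GetFileType(TTextRec(Output).Handle) <> FILE_TYPE_CHAR then
    SetTextCodePage(Output, DefaultSystemCodePage);
  Writeln('ÄäÖöÜüß');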
end.
---
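
To see which code pages are involved on a given machine, a small diagnostic along these lines may help. It is only a sketch: the program name ShowCodePages is made up for illustration, and it assumes GetTextCodePage from the FPC System unit is available in the compiler version used. The report goes to StdErr so that it stays visible when standard output is redirected with ">".

---
program ShowCodePages;

{$codepage utf8}

uses
  Windows, SysUtils;

begin
  { Code page the console window uses for output }
  Writeln(StdErr, 'Console output CP:       ', GetConsoleOutputCP);
  { ANSI code page that Delphi falls back to for non-console output }
  Writeln(StdErr, 'DefaultSystemCodePage:   ', DefaultSystemCodePage);
  { Code page currently attached to the Output text file }
  Writeln(StdErr, 'Code page of Output:     ', GetTextCodePage(Output));
  { True when Output is a character device, False for a file or pipe }
  Writeln(StdErr, 'Output is a char device: ',
    GetFileType(TTextRec(Output).Handle) = FILE_TYPE_CHAR);
end.
---

Running it once directly and once as "ShowCodePages.exe > out.txt" should show whether the code page of Output changes when the output is redirected.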

I reported it in the bug tracker: https://bugs.freepascal.org/view.php?id=31746

Post by Michael Thompson
When I open the attachment on my PC, I see the correct characters

Oops. My fault - I sent the wrong (= with correct characters :D) file.

Ondrej
Ondrej Pokorny
2017-05-03 16:18:53 UTC
Michael, because I cannot comment on a closed issue report, I am posting
my reply here.

Post by Michael Van Canneyt
I don't think what you write is correct. As far as I can see in the
code of Delphi Berlin: Delphi always uses DefaultSystemCodePage for
text files, regardless of what the file is used for.

What do you mean? I checked both Berlin and Tokyo and in "function
TextOpen(var t: TTextRec): Integer;" in unit System you see that if the
text file is used for console input/output the CodePage is
GetConsoleOutputCP/GetConsoleCP whereas if the text file is used for
other purposes it uses DefaultSystemCodePage. (Search for "if
GetFileType(t.Handle) = 2 then"; 2=FILE_TYPE_CHAR).

It is correct that the patch doesn't exactly reflect Delphi behavior -
it sets DefaultSystemCodePage only for standard output and not for all
text files. It misses the Delphi fallback to DefaultSystemCodePage.

Post by Michael Van Canneyt
FPC uses a different mechanism which allows you to write files in a
different code page.

Exactly. The issue report is about the fact that the mechanism is
different - which puzzled me a lot. I just didn't understand why I see
nonsense in Notepad when I open the output from an FPC application
whereas output from Delphi shows fine.
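
The "nonsense" is consistent with code page 850 bytes being read as an ANSI code page: in CP850 the characters ÄäÖöÜüß are the single bytes 8E 84 99 94 9A 81 E1, and in Windows-1252 (and Windows-1250) those bytes are Ž „ ™ ” š, an unassigned byte, and á. A minimal sketch for dumping the bytes follows. The program name DumpCp850 is made up for illustration, and the sketch assumes that SetCodePage can convert to code page 850 on the system in question.

---
program DumpCp850;

{$codepage utf8}

uses
  SysUtils;

var
  U: UTF8String;
  S: RawByteString;
  I: Integer;
begin
  U := 'ÄäÖöÜüß';
  { Assigning to a RawByteString keeps the UTF-8 bytes and code page }
  S := U;
  { Convert the contents from UTF-8 to code page 850 }
  SetCodePage(S, 850, True);
  { Print each resulting byte as two hex digits }
  for I := 1 to Length(S) do
    Write(IntToHex(Ord(S[I]), 2), ' ');
  Writeln;
end.
---

Comparing the dump with a hex view of the redirected file should show whether the file really contains console code page bytes.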

Some background: I convinced one of my customers that FPC is better than
Delphi (I have known it for a long time already) and migrated one of my
console programs for him from Delphi to FPC, and I didn't understand why
FPC "corrupts" the output to file whereas the output to console is OK. I
wasn't aware that the FPC output was in the console codepage even if you
redirect the output to a file (you don't really use the console codepage
for files, do you).

I accept that this behavior is by design and intended - I hadn't found any
information on this Delphi incompatibility before I started checking the
issue. So at least the issue report can be some kind of information
source even if you won't fix it - hopefully Google indexes it correctly :)

Post by Michael Van Canneyt
You can do your own detection and set the code page accordingly

This is exactly what I found out and what I posted as a workaround in
"Steps To Reproduce" :)

Ondrej
Michael Van Canneyt
2017-05-03 16:36:33 UTC
Post by Ondrej Pokorny
What do you mean? I checked both Berlin and Tokyo and in "function
TextOpen(var t: TTextRec): Integer;" in unit System you see that if the
text file is used for console input/output the CodePage is
GetConsoleOutputCP/GetConsoleCP whereas if the text file is used for
other purposes it uses DefaultSystemCodePage. (Search for "if
GetFileType(t.Handle) = 2 then"; 2=FILE_TYPE_CHAR).
Hmh.
I looked in a different function and missed this. You are right.

I don't think this is very good behaviour.

IMHO the output should be the same regardless of whether it is piped or not.

I can understand a check for stdinputhandle/stdoutputhandle
(especially on Windows with the weird handling of a console).

But whether or not the output is piped: the program should handle both cases equally.

I have reopened the bug report. Some Windows user should check this.

Michael.