Discussion:
Managed Types, Undefined Behaviour
Willibald Krenn
2018-06-28 16:45:33 UTC
_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Jonas Maebe
2018-06-28 18:06:23 UTC
Post by Willibald Krenn
type
  Vector = array of integer;

function DoSomething (len: longint): Vector;
begin
  SetLength (result,len); // whatever
end;

var
  A,B: Vector;
begin
  A := DoSomething(3);
  // changing A here will influence B below,
  // which seems ok to some as
  // "managed types are not initialized when
  // calling DoSomething and this is documented
  // - so anything goes and stop whining".
  B := DoSomething(3);
end.
I strongly believe that the behaviour as currently implemented is wrong
and here is why.
(1) When assigning "result" to A, refcount of result needs to go down
and, hence, the local var needs to be freed/invalidated. (Result goes
out of scope.)
For performance reasons, managed function results are implemented as
call-by-reference parameters. This is also the case in Delphi:
http://docwiki.embarcadero.com/RADStudio/Berlin/en/Program_Control#Handling_Function_Results

Therefore the function result's scope is the caller of the function, not
the function itself.
Post by Willibald Krenn
(3) The behaviour is incompatible with Delphi (tested 10.2.3 vs. Lazarus
1.8.0).
It is compatible with Delphi. See e.g.
Post by Willibald Krenn
Delphi shows the expected behaviour of treating A and B as distinct.
That's because your example gets optimized better by Delphi than by FPC:
Delphi directly passes A and B, respectively, as the function result "var"
parameter to the function calls in your example, while FPC passes the same
temporary variable to both and then assigns this temporary variable to A/B.

If you have more complex code where the Delphi compiler cannot be
certain that the variable to which you assign the function result isn't
used inside the function as well, it will behave in a similar way (as
discussed in the StackOverflow posts linked from the message above).
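
As a rough illustration of the point above, here is a hand-written sketch (not actual compiler output) of a managed function result treated as a by-reference parameter owned by the caller. The names DoSomethingLowered, Res and Temp are hypothetical; Temp merely stands in for a compiler temporary that is shared by both calls.

```pascal
program HiddenResultSketch;
{$mode objfpc}

type
  Vector = array of Integer;

{ Stand-in for "function DoSomething(len: LongInt): Vector":
  the result is modelled as a var parameter supplied by the caller. }
procedure DoSomethingLowered(len: LongInt; var Res: Vector);
begin
  SetLength(Res, len); // Res is not cleared first; existing elements survive
end;

var
  A, B, Temp: Vector;  // Temp plays the role of the shared temporary
begin
  DoSomethingLowered(3, Temp);
  A := Temp;           // A and Temp now reference the same array
  A[0] := 42;          // no copy-on-write: the shared array is modified
  DoSomethingLowered(4, Temp); // resizes, keeping the first three elements
  B := Temp;
  WriteLn(B[0]);       // prints 42
end.
```

If the caller handed in a fresh, empty variable for each call instead of the shared Temp, the second call would start from nil and B[0] would be 0.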


Jonas
Stefan Glienke
2018-06-29 15:57:58 UTC
Let me add some information to this issue - as I think this is one - before this drifted into the interface chaining thing:

When you execute this code in Delphi it will print 0 while on FPC it prints 42:

type
  Vector = array of integer;

function DoSomething (len: Integer): Vector;
begin
  SetLength(Result, len);
end;

var
  A, B: Vector;
begin
  A := DoSomething(3);
  A[0] := 42;
  B := DoSomething(4);
  Writeln(B[0]);
end.


This is because FPC reuses the same temp variable for both calls, so the temp still points to the array allocated by the first call, which SetLength then just enlarges by one element. Delphi in this case does not pass A and B directly either, but uses a separate temp variable for each call (it does that because A and B are global variables).

When you move this code to a routine like this:

procedure Main;
var
  A, B: Vector;
begin
  A := DoSomething(3);
  A[0] := 42;
  B := DoSomething(4);
  Writeln(B[0]);
end;

begin
  Main;
end.

both Delphi and FPC behave the same: both directly pass A and B to DoSomething, because they are local variables that cannot be affected by any other code running at the same time.

While this might look like something without serious implications, it actually might have some.

Change the code to the following:

type
  TFoo = class
    A, B: Vector;
  end;

procedure Main;
var
  foo: TFoo;
begin
  foo := TFoo.Create;
  foo.A := DoSomething(3);
  foo.A[0] := 42;
  foo.B := DoSomething(4);
  Writeln(foo.B[0]);
end;


Now we are back to using temp variables (both Delphi and FPC do) but FPC again reuses its temp variable for A and B while Delphi uses different ones. Now for some integer this might not be a big issue but imagine you have something else in these arrays (for example any managed type like an interface).
Not having properly cleared B because it still uses the temporary content from A might cause some issues.
Jonas Maebe
2018-06-29 16:27:26 UTC
Post by Stefan Glienke
Now we are back to using temp variables (both Delphi and FPC do) but FPC again reuses its temp variable for A and B while Delphi uses different ones. Now for some integer this might not be a big issue but imagine you have something else in these arrays (for example any managed type like an interface).
Not having properly cleared B because it still uses the temporary content from A might cause some issues.
My point was that Delphi sometimes also reuses temp variables. See the
StackOverflow posts linked from the previous message. It does not do it
in the same cases as FPC, but it does do it. So while you may be lucky
more often in Delphi, relying on this behaviour is unsafe even there afaik.


Jonas
Stefan Glienke
2018-06-29 17:03:58 UTC
Delphi does not reuse them, every call to a function generates a temp
variable. Sure, if you call it in a loop it of course uses the same one.
But if you have 2 calls after each other the compiler generates two
variables. Even if they are in separate code branches. I have often
enough optimized some code that caused huge prologue/epilogue just for
temp variables of different calls where only one could have happened
(like in a case statement). You can see exactly that in the answer by
David Heffernan you linked to. The loop keeps adding X while the other
two calls get an empty string passed.
Jonas Maebe
2018-06-29 18:10:05 UTC
Post by Stefan Glienke
Delphi does not reuse them, every call to a function generates a temp
variable. Sure, if you call it in a loop it of course uses the same one.
That does not make any sense to me from a language design point of view.
Either the language guarantees that managed function results are
initialised to empty, or it does not. The fact that these are the same
or different temps, or if there are no temps at all, should not matter
in the least. Otherwise you are defining the behaviour of the language
in terms of the quality of the compiler's alias analysis (since if it
can prove that it does not need a temp, the function result may not be
empty on entry).
Post by Stefan Glienke
But if you have 2 calls after each other the compiler generates two
variables. Even if they are in separate code branches. I have often
enough optimized some code that caused huge prologue/epilogue just for
temp variables of different calls where only one could have happened
(like in a case statement).
I'm sorry, but supporting the exploitation of properties of the Delphi
code generator is not in the scope of the FPC project.


Jonas
Stefan Glienke
2018-06-29 20:02:45 UTC
Post by Jonas Maebe
That does not make any sense to me from a language design point of
view. Either the language guarantees that managed function results are
initialised to empty, or it does not. The fact that these are the same
or different temps, or if there are no temps at all, should not matter
in the least. Otherwise you are defining the behaviour of the language
in terms of the quality of the compiler's alias analysis (since if it
can prove that it does not need a temp, the function result may not be
empty on entry).
The issue is that reusing a temp variable for a second call while it
still has some already allocated array in it causes the code to behave
differently even if you declare {$mode delphi}. In Delphi it is clear
that if a temp variable is being used it was initialized to be empty in
the prologue. The only case (I know of) where a call with a non
initialized temp variable happens is when the statement is called in a
loop and it does in no way happen that one statement affects the
behavior of a different statement in the same routine.

It is true that function results of managed types are not initialized
but passed as var parameter. However it should be ensured that they are
in a defined state. If the LHS is being passed you know its state. If a
temp variable is being passed it should be initialized (like Delphi
does). Reusing temp variables holding intermediate values introduces
hidden side effects.

If I am not mistaken (this is from my observation and might not be 100%
accurate) usually the rule in Delphi (possibly similar in FPC) when it
allows to directly pass the LHS as hidden result parameter is when it is
a local variable. Now with the dynamic array we have the case that since
the content of the temp variable was being copied it then when being
reused has a ref count of 1 causing SetLength to just realloc the memory
and possibly keep any existing content of the array copying it then on
to the LHS of a second statement as shown in the code example earlier.
Post by Jonas Maebe
I'm sorry, but supporting the exploitation of properties of the Delphi
code generator is not in the scope of the FPC project.
My point was that Delphi sometimes also reuses temp variables.
I just pointed out that this claim is not correct.

IMO there is a potential cause for a code defect that is not obvious -
you might go the way of the principle of least surprises and try to
address it or argue it away for whatever reasons.

Jonas Maebe
2018-06-29 20:33:44 UTC
Post by Stefan Glienke
If I am not mistaken (this is from my observation and might not be 100%
accurate) usually the rule in Delphi (possibly similar in FPC) when it
allows to directly pass the LHS as hidden result parameter is when it is
a local variable.
FPC does it when it knows for certain the target cannot be read or
modified by the called function. Even for local variables, this is not
always the case, e.g. if the local variable's address may have been
taken at some point: by the address parameter, or simply because it was
passed as var- or out-parameter (the called function could then take the
address of that parameter and store it elsewhere). I.e., it is based on
alias analysis.

With whole-program analysis, it could also be done with global
variables, or in cases the compiler could prove that the function that
received a (local) variable as var/out-parameter did not take its
address and store it in a location that survived the lifetime of the
called function.

That's what I meant when I said this proposal defines language behaviour
based on the limitations of a compiler's implementation of alias
analysis. Nobody can be 100% accurate about this because this can change
from one compiler version to another, unless you limit yourself to the
very first implementation your original compiler had. Any improvement
could break working code.
Post by Stefan Glienke
Now with the dynamic array we have the case that since
the content of the temp variable was being copied it then when being
reused has a ref count of 1 causing SetLength to just realloc the memory
and possibly keep any existing content of the array copying it then on
to the LHS of a second statement as shown in the code example earlier.
I fully agree it is counter-intuitive. But I disagree that only solving
the issue in specific cases (which aren't even documented anywhere) is a
solution. Either you go all the way, or you don't do it.

Having the programmer keep manual track of when a managed type might
need extra initialisation does not make sense. After all, when you
change your code, behaviour could change because suddenly a local
variable might have a value at a point where previously it didn't.
Having to follow all code paths to see whether somewhere you call a
function that does not initialise its function result so you can add an
extra nil-initialisation of that managed local variable is the opposite
of robust code.
Post by Stefan Glienke
IMO there is a potential cause for a code defect that is not obvious -
you might go the way of the principle of least surprises and try to
address it
The principle of least surprises would be to always use temps and to
always initialise them before passing them as function result. I think
many more people would expect this rather than "it is usually
initialised to nil". It would of course result in complaints that FPC
performs extra initialisations of managed variables compared to Delphi.
Post by Stefan Glienke
or argue it away for whatever reasons.
That is no way to discuss.


Jonas
Ondrej Pokorny
2018-07-07 08:37:57 UTC
Post by Thorsten Engler
type
  TFoo = class
    A, B: Vector;
  end;

procedure Main;
var
  foo: TFoo;
begin
  foo := TFoo.Create;
  foo.A := DoSomething(3);
  foo.A[0] := 42;
  foo.B := DoSomething(4);
  Writeln(foo.B[0]);
end;
Now we are back to using temp variables (both Delphi and FPC do) but FPC again reuses its temp variable for A and B while Delphi uses different ones. Now for some integer this might not be a big issue but imagine you have something else in these arrays (for example any managed type like an interface).
Not having properly cleared B because it still uses the temporary content from A might cause some issues.
Change the code to the following:

program Project1;
{$APPTYPE CONSOLE}

type
  Vector = array of integer;
  TFoo = class
    A, B: Vector;
  end;

function DoSomething (len: Integer): Vector;
begin
  SetLength(Result, len);
end;

var
  foo: TFoo;
  I: Integer;
begin
  foo := TFoo.Create;
  for I := 0 to 1 do
  begin
    foo.A := foo.A + DoSomething(3);
    foo.A[0] := 42;
  end;
  WriteLn(foo.A[3]); // << writes 42 even in Delphi !!!
  Readln;
end.

Delphi 10 outputs "42"! Yes, and this is not a bug!

Result is not guaranteed to be cleared, neither in Delphi nor in FPC.

Yes, you showed us one code example where Delphi clears the result and FPC does not, but that's all. Delphi's behavior can change at any time, as it already has in other cases of undocumented behavior. So your example can stop working in future Delphi versions, e.g. if some optimizations are added.

Ondrej

Willibald Krenn
2018-06-30 06:01:23 UTC
Sven Barth via fpc-devel
2018-06-30 06:47:49 UTC
Post by Willibald Krenn
TBH, I didn't know this issue existed in Delphi and I've done my share of
Delphi over time. I still maintain that for managed types the compiler is
responsible for some minimal initialization (like it's done for records
etc, no?).
The variables we're talking about here *are* initialized. They contain
valid values and none of the internal RTL routines will crash when used
with them. Everyone however expects result variables of those to be
initialized to Nil and that is simply *not* a guaranteed given.
Also records only initialize their managed fields. All others are left as
garbage.

Regards,
Sven
Florian Klämpfl
2018-06-30 07:25:56 UTC
Post by Willibald Krenn
TBH, I didn't know this issue existed in Delphi and I've done my share of Delphi over time. I still maintain that
for managed types the compiler is responsible for some minimal initialization (like it's done for records etc, no?).
Post by Sven Barth via fpc-devel
The variables we're talking about here *are* initialized.
Maybe the term initialized is wrong and confusing. They are not initialized in the sense of having defined values as
global variables have. Non-global managed types should not be considered as being initialized (never!, like any other
type), this is also why the compiler warns (!) about this. They can be considered as being "setup" by the compiler.
Post by Sven Barth via fpc-devel
They contain valid values and none of the internal RTL
routines will crash when used with them. Everyone however expects result variables of those to be initialized to Nil and
that is simply *not* a guaranteed given.
Also records only initialize their managed fields. All others are left as garbage.
Managed fields of records are "setup" ;)
Michael Van Canneyt
2018-06-30 08:54:17 UTC
Post by Willibald Krenn
TBH, I didn't know this issue existed in Delphi and I've done my
share of Delphi over time. I still maintain that
for managed types the compiler is responsible for some
minimal initialization (like it's done for records etc, no?).
Post by Sven Barth via fpc-devel
The variables we're talking about here *are* initialized.
Post by Florian Klämpfl
Maybe the term initialized is wrong and confusing. They are not initialized
in the sense of having defined values as
global variables have. Non-global managed types should not be considered as
being initialized (never!, like any other
type), this is also why the compiler warns (!) about this. They can be
considered as being "setup" by the compiler.
Post by Sven Barth via fpc-devel
They contain valid values and none of the internal RTL
routines will crash when used with them. Everyone however expects result
variables of those to be initialized to Nil and
that is simply *not* a guaranteed given.
Also records only initialize their managed fields. All others are left as
garbage.
Post by Florian Klämpfl
Managed fields of records are "setup" ;)
I will add a section about this in the documentation, seeing that people
often confuse the 2 concepts.

Michael.
Willibald Krenn
2018-06-30 10:02:03 UTC
Sent: Saturday, 30 June 2018 at 10:54
Post by Florian Klämpfl
Post by Sven Barth via fpc-devel
The variables we're talking about here *are* initialized.
Bit lost here. Maybe A and B are setup, but not result. And the apparent re-use of tmp vars also seems wrong. In the first call you get a "setup" var, in the second call an "initialized" one with previous values. But the latter is not commonly referred to as initialization, as initialization means usually setting to a sensible default value, which always is the same.
Post by Florian Klämpfl
Maybe the term initialized is wrong and confusing. They are not initialized
in the sense of having defined values as
global variables have. Non-global managed types should not be considered as
being initialized (never!, like any other
type), this is also why the compiler warns (!) about this. They can be
considered as being "setup" by the compiler.
So what does setup do - allocate memory and set refcount accordingly? If result was being setup properly that might also help.
Post by Florian Klämpfl
Post by Sven Barth via fpc-devel
They contain valid values and none of the internal RTL
routines will crash when used with them. Everyone however expects result
variables of those to be initialized to Nil and
that is simply *not* a guaranteed given.
I'd say if everyone expects that then there is a point in that the current behaviour is surprising and not intuitive. I mean, the compiler is for people, right? ;)
Post by Florian Klämpfl
Post by Sven Barth via fpc-devel
Also records only initialize their managed fields. All others are left as
garbage.
I'm only talking about managed types.
Post by Florian Klämpfl
Managed fields of records are "setup" ;)
I will add a section about this in the documentation, seeing that people
often confuse the 2 concepts.
In an ideal world, either the language would not let you write code that has random behavior or the compiler would enforce this.

Out of curiosity I did a couple of more tests and it seems FPC is pretty inconsistent in handling all this. See below.

//... some form in lazarus with a tmemo and a button on it

TCArray = array of integer;
TMRecord = record
  arr: TCArray;
end;

// global vars
var
  A, B: TCArray;
  C, D: TMRecord;

function X(l:integer): TCArray;
begin
  setlength(result,l);
end;

function Y(l: integer): TMRecord;
begin
  setlength(result.arr, l);
end;

procedure global(memo1: TMemo);
begin
  A:=X(3);
  A[0] := 5;
  A[1] := 4;
  B:=X(3);
  // the following will print 5 twice
  memo1.lines.add('GlobalA0: %d',[A[0]]);
  memo1.lines.add('GlobalB0: %d',[B[0]]);

  C := Y(3);
  C.arr[0] := 5;
  C.arr[1] := 4;
  D := Y(3);
  // the following will print 5 and 0
  memo1.lines.add('GlobalC0: %d',[C.arr[0]]);
  memo1.lines.add('GlobalD0: %d',[D.arr[0]]);
end;

{ TForm1 }

// will add the following lines to memo1:
//   GlobalA0: 5
//   GlobalB0: 5 (this is the strange case...)
//   GlobalC0: 5
//   GlobalD0: 0
//   LocalA0: 5
//   LocalB0: 0
//   LocalC0: 5
//   LocalD0: 0
procedure TForm1.ToggleBox1Change(Sender: TObject);
var
  A, B: TCArray;
  C, D: TMRecord;
begin
  global(memo1);
  A:=X(3);
  A[0] := 5;
  A[1] := 4;
  B:=X(3);
  // the following will print 5 and 0
  memo1.lines.add('LocalA0: %d',[A[0]]);
  memo1.lines.add('LocalB0: %d',[B[0]]);

  C := Y(3);
  C.arr[0] := 5;
  C.arr[1] := 4;
  D := Y(3);
  // the following will print 5 and 0
  memo1.lines.add('LocalC0: %d',[C.arr[0]]);
  memo1.lines.add('LocalD0: %d',[D.arr[0]]);
end;


... And at least with my current settings in Lazarus (standard ones) I don't see any warning about me venturing into the fields of undefined - or - random.


Best,
Willibald
Florian Klämpfl
2018-06-30 11:34:50 UTC
Post by Willibald Krenn
Sent: Saturday, 30 June 2018 at 10:54
Post by Florian Klämpfl
Post by Sven Barth via fpc-devel
The variables we're talking about here *are* initialized.
Bit lost here. Maybe A and B are setup, but not result. And the apparent re-use of tmp vars also seems wrong. In the first call you get a "setup" var, in the second call an "initialized" one with previous values. But the latter is not commonly referred to as initialization, as initialization means usually setting to a sensible default value, which always is the same.
Post by Florian Klämpfl
Maybe the term initialized is wrong and confusing. They are not initialized
in the sense of having defined values as
global variables have. Non-global managed types should not be considered as
being initialized (never!, like any other
type), this is also why the compiler warns (!) about this. They can be
considered as being "setup" by the compiler.
So what does setup do - allocate memory and set refcount accordingly?
No. Consistent in the sense that ref. counting works and causes no leaks.
Post by Willibald Krenn
If result was being setup properly that might also help.
It is setup, but not initialized.
Post by Willibald Krenn
Post by Florian Klämpfl
Post by Sven Barth via fpc-devel
They contain valid values and none of the internal RTL
routines will crash when used with them. Everyone however expects result
variables of those to be initialized to Nil and
that is simply *not* a guaranteed given.
I'd say if everyone expects that then there is a point in that the current behaviour is surprising and not intuitive. I mean, the compiler is for people, right? ;)
Post by Florian Klämpfl
Post by Sven Barth via fpc-devel
Also records only initialize their managed fields. All others are left as
garbage.
I'm only talking about managed types.
Post by Florian Klämpfl
Managed fields of records are "setup" ;)
I will add a section about this in the documentation, seeing that people
often confuse the 2 concepts.
In an ideal world, either the language would not let you write code that has random behavior or the compiler would enforce this.
Compile with trunk and -Sew and you get this behavior.
Post by Willibald Krenn
Out of curiosity I did a couple of more tests and it seems FPC is pretty inconsistent in handling all this. See below.
Pretty useless code fragment.
1) This is fpc-devel and not lazarus-same-random-code, it does not compile without lazarus and even not with it.
2) If I guess the missing parts right, the example simply points out a bug of the setlength handling which is handled
internally but this is fixed now.
Willibald Krenn
2018-07-07 17:47:30 UTC
Sent: Saturday, 30 June 2018 at 13:34
Sorry for the late reply.
Post by Willibald Krenn
In an ideal world, either the language would not let you write code that has random behavior or the compiler would enforce this.
Compile with trunk and -Sew and you get this behavior.
Thanks, interesting to know. A very useful feature - should be on by default.
That said, I'm using fpc to develop an application I need for resource planning, hence I cannot use non-release versions of fpc that might not even work with Lazarus.
Post by Willibald Krenn
Out of curiosity I did a couple of more tests and it seems FPC is pretty inconsistent in handling all this. See below.
Pretty useless code fragment.
1) This is fpc-devel and not lazarus-same-random-code, it does not compile without lazarus and even not with it.
2) If I guess the missing parts right, the example simply points out a bug of the setlength handling which is handled
internally but this is fixed now.
The only thing it demonstrated is that the funky behaviour only occurs when global vars are used. In all other cases (including when the array is part of a record) the behaviour is saner, which seems interesting to me (from a language design point of view). If it helps, I can easily supply you with a console-type version of this - it's a super-trivial example.

In any case, I won't push this any further, as there clearly is no support from anybody else for letting the compiler properly initialize managed types (was thinking about a mode switch) and since it's documented undefined behaviour, Delphi compatibility cannot really be achieved either. And with the new -Sew switch it seems that FPC will warn about or fail to compile code that will result in random, undefined behaviour anyways.

Best,
Willibald


Michael Van Canneyt
2018-06-30 12:50:52 UTC
Post by Willibald Krenn
Post by Michael Van Canneyt
Post by Florian Klämpfl
Managed fields of records are "setup" ;)
I will add a section about this in the documentation, seeing that people
often confuse the 2 concepts.
In an ideal world, either the language would not let you write code that has random behavior or the compiler would enforce this.
There is no random behaviour.
Post by Willibald Krenn
procedure global(memo1: TMemo);
begin
  A:=X(3);
  A[0] := 5;
  A[1] := 4;
  B:=X(3);
  // the following will print 5 twice
  memo1.lines.add('GlobalA0: %d',[A[0]]);
  memo1.lines.add('GlobalB0: %d',[B[0]]);
B[0] is not initialized. Printing the contents can print anything.

A and B are different arrays with length 3. The compiler ensures this.
But their content is not initialized. Printing the content may therefore
result in any possible value.

Your code in essence does the same as

Procedure something;

var
  B : Array[0..2] of integer;

begin
  Writeln(b[0]);
end;

And - Lo - it behaves the same...

Michael.
Martok
2018-06-29 10:04:03 UTC
Post by Willibald Krenn
I hope this issue gets addressed, as I deem the current behaviour completely
broken and also going totally against the spirit of Pascal, feeling much more
like some very obscure behaviour I'd expect from some C compiler.
Discovering the handling of this issue, however, makes me wonder
whether fpc aims to be a great Pascal compiler that does without bad surprises
and very very hard to debug "documented" behaviour or not.
There is less undefined behaviour than in C, but the one we have will bite you
in the most awful ways, sometimes after years of working just fine. And we don't
even have a nice formal standards document that one could grep for "undefined".

But yeah, as Jonas wrote, this _isn't_ one of these occasions. FPC uses (and
reuses!) tempvars a lot more than Delphi, which causes all sorts of funny
behaviours with managed types. Try returning a string or use the
JavaScript-style "Foo().Bar().Baz()" method chaining pattern and you'll see what
I mean.

Change your function to the following, and it will do what one would expect:
function DoSomething (len: longint): Vector;
begin
  Result:= nil;
  SetLength (result,len); // whatever
end;
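
A minimal self-contained sketch of this workaround, assuming the global-variable test case from earlier in the thread and {$mode objfpc} (the program name is hypothetical). Because Result is set to nil first, SetLength always allocates a fresh, zero-filled array, whether or not the caller passes in a reused temp:

```pascal
program ClearResultSketch;
{$mode objfpc}

type
  Vector = array of Integer;

function DoSomething(len: LongInt): Vector;
begin
  Result := nil;          // release whatever the caller's variable or temp held
  SetLength(Result, len); // fresh allocation; new elements are zero-filled
end;

var
  A, B: Vector;
begin
  A := DoSomething(3);
  A[0] := 42;
  B := DoSomething(4);
  WriteLn(B[0]); // prints 0: nothing carried over from the first call
  WriteLn(A[0]); // prints 42: A still owns its own array
end.
```

The cost is one extra assignment per call, which is the trade-off discussed elsewhere in the thread.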

For managed types, as far as I can tell:
- locals are initialized (even if there is a warning telling you they are not)
- tempvars are initialized *once*
- Result is never initialized (there is no warning telling you it is not).
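
A minimal sketch of the workaround above, assuming FPC in objfpc mode. The program name and the MakeVector helper are made up for illustration. It resets Result before resizing, so the function no longer depends on whatever the hidden result parameter (or a reused tempvar) referenced on entry:

```pascal
program resultinit;
{$mode objfpc}{$H+}

type
  Vector = array of Integer;

// Hypothetical helper: always starts from an empty array, so an array that
// the hidden result parameter still references is never resized in place.
function MakeVector(Len: LongInt): Vector;
begin
  Result := nil;           // release whatever Result referenced on entry
  SetLength(Result, Len);  // allocate a fresh array
end;

var
  A, B: Vector;
begin
  A := MakeVector(3);
  A[0] := 5;               // modifies only the array A refers to
  B := MakeVector(3);      // gets its own storage because of the reset
  WriteLn('A[0] = ', A[0]);
  WriteLn('B[0] = ', B[0]);
end.
```

With the explicit reset, A and B no longer share storage, so the write to A[0] is not visible through B, regardless of how the compiler reuses temporaries.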
--
Regards,
Martok

Ceterum censeo b32079 esse sanandam.


Michael Van Canneyt
2018-06-29 10:43:48 UTC
Post by Martok
But yeah, as Jonas wrote, this _isn't_ one of these occasions. FPC uses (and
reuses!) tempvars a lot more than Delphi, which causes all sorts of funny
behaviours with managed types. Try returning a string or use the
JavaScript-style "Foo().Bar().Baz()" method chaining pattern and you'll see what
I mean.
Out of curiosity, can you give a simple example of such a funny behaviour
in such a chaining pattern ?

Michael.

Martok
2018-06-29 12:25:19 UTC
Post by Michael Van Canneyt
Out of curiosity, can you give a simple example of such a funny behaviour
in such a chaining pattern ?
We had this topic about two years ago with regard to automatic file close on
interface release. Interestingly, something must have changed in the meantime,
because the trivial testcase now behaves *differently*, which is somewhat the point of
being weird-undefined ;-)

Take this example: https://pastebin.com/gsdVXWAi

The tempvar used to get reused, causing lifetime issues with the "chain object".
This isn't the case anymore, now three independent tempvars are used, all of
which live until the end of the function, potentially keeping the object alive
for a long time.
There is also one fpc_intf_assign with associated addref/release per as
operator, which isn't technically necessary.

One could probably avoid the interfaces here with ARC records, but either I'm
missing something or the scope lifetime of tempvars there is even worse.
--
Regards,
Martok



Michael Van Canneyt
2018-06-29 12:41:41 UTC
Post by Martok
Take this example: https://pastebin.com/gsdVXWAi
The tempvar used to get reused, causing lifetime issues with the "chain object".
This isn't the case anymore, now three independent tempvars are used, all of
which live until the end of the function, potentially keeping the object alive
for a long time.
There is also one fpc_intf_assign with associated addref/release per as
operator, which isn't technically necessary.
One could probably avoid the interfaces here with ARC records, but either I'm
missing something or the scope lifetime of tempvars there is even worse.
What is the expected output of this program ?

As far as I can see, you get 2 chain and 1 done call. Which is what I'd expect.
The overrides of the _* calls are useless, since they are not virtual in
TInterfacedObject and hence never called. So that's OK too.

There is no memory leak, output is the same with 2.6.4 and 3.0.4 and trunk,
so what does this demo actually demonstrate other than that the code just
works ?

Michael.

Martok
2018-06-29 13:55:02 UTC
Post by Michael Van Canneyt
As far as I can see, you get 2 chain and 1 done call. Which is what I'd expect.
The overrides of the _* calls are useless, since they are not virtual in
TInterfacedObject and hence never called. So that's OK too.
Interface functions are always virtual and implemented by the actually
instantiated class. The "override" keyword is neither allowed nor needed, this
code does what it looks like.

The expected output would be 3 Addrefs and 3 Releases.
A bit better would be doing the last release after "Done" and before "fin" (but
this is difficult because of the implied exception frame - there are cases where
Delphi does it anyway, depending on method length).
The "ideal" output would get away with even less (but I don't think this is
possible without translating to SSA form first and doing some strict counting).

The observed output is 6 Addrefs and 6 Releases. At some point in the past (this
may have to do with variable and register allocation), a Release could happen
too soon. It doesn't now, so that's good.

There is nothing technically wrong with the generated code, but it is not nearly
as efficient as it could be. See also Ryan's comments about slow
Interlocked*-calls a few weeks ago. Delphi's output for the same example is
better, giving the expected output.
Because of the tempvars, it is also not exactly what one might expect at first,
which is why I mentioned it in this context.
--
Regards,
Martok



Michael Van Canneyt
2018-06-29 14:05:25 UTC
Post by Martok
Post by Michael Van Canneyt
As far as I can see, you get 2 chain and 1 done call. Which is what I'd expect.
The overrides of the _* calls are useless, since they are not virtual in
TInterfacedObject and hence never called. So that's OK too.
Interface functions are always virtual and implemented by the actually
instantiated class. The "override" keyword is neither allowed nor needed, this
code does what it looks like.
The expected output would be 3 Addrefs and 3 Releases.
I don't get that.

Michael.

Michael Van Canneyt
2018-06-29 14:18:04 UTC
Post by Michael Van Canneyt
Post by Martok
The expected output would be 3 Addrefs and 3 Releases.
I don't get that.
Pressed send too quickly.

home:~> ./tirc
Chain: 00007FA5948CF040
Chain: 00007FA5948CF040
Done: 00007FA5948CF040
fin

is the complete output. So either your explanation is wrong, or the compiler
is completely faulty.

Compiling with memleak detection:

home:~> fpc -glh tirc.pp
home:~> ./tirc
Chain: 00007F6FA90280C0
Chain: 00007F6FA90280C0
Done: 00007F6FA90280C0
fin

Heap dump by heaptrc unit of /home/michael/tirc
1 memory blocks allocated : 32/32
1 memory blocks freed : 32/32
0 unfreed memory blocks : 0
True heap size : 32768
True free heap : 32768


Michael.

Michael Van Canneyt
2018-06-29 14:31:30 UTC
Tested on Delphi.

Seems FPC is completely faulty: Delphi calls the versions in TChainer, FPC does not.

Well, you can guess how often this feature is used then... :)

Michael.

Mattias Gaertner
2018-06-29 14:53:29 UTC
On Fri, 29 Jun 2018 16:18:04 +0200 (CEST)
Post by Michael Van Canneyt
home:~> ./tirc
Chain: 00007FA5948CF040
Chain: 00007FA5948CF040
Done: 00007FA5948CF040
"stdcall" is wrong for Linux. It must be
{$IFNDEF WINDOWS}cdecl{$ELSE}stdcall{$ENDIF};

Then you get under Linux:
Addref: 00007F0B935BF040 Refcount: 1 at 000000000041331A
Addref: 00007F0B935BF040 Refcount: 2 at 00000000004121F2
Release: 00007F0B935BF040 Refcount: 2 at 0000000000412196
Chain: 00007F0B935BF040
Addref: 00007F0B935BF040 Refcount: 2 at 000000000041331A
Addref: 00007F0B935BF040 Refcount: 3 at 00000000004121F2
Release: 00007F0B935BF040 Refcount: 3 at 0000000000412196
Chain: 00007F0B935BF040
Addref: 00007F0B935BF040 Refcount: 3 at 000000000041331A
Addref: 00007F0B935BF040 Refcount: 4 at 00000000004121F2
Release: 00007F0B935BF040 Refcount: 4 at 0000000000412196
Done: 00007F0B935BF040
fin
Release: 00007F0B935BF040 Refcount: 3 at 0000000000412196
Release: 00007F0B935BF040 Refcount: 2 at 0000000000412196
Release: 00007F0B935BF040 Refcount: 1 at 0000000000412196
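
The declaration fix can be sketched as follows, assuming FPC with COM-style interfaces. IChain, TChainer, GetChainer and the GUID are hypothetical stand-ins, not the pastebin code. The point is only that the logging _AddRef/_Release carry the same calling convention as the IUnknown declaration, so the compiler can bind them to the interface:

```pascal
program chainlog;
{$mode objfpc}{$H+}
{$interfaces com}

type
  // Hypothetical interface; the GUID is made up for this sketch.
  IChain = interface
    ['{6C0E9C5E-3B0A-4A7E-9C55-2F1B8D3A7E11}']
    function Chain: IChain;
    procedure Done;
  end;

  TChainer = class(TInterfacedObject, IChain)
  protected
    // Same calling convention as in IUnknown; per the thread, a mismatch
    // makes the compiler fall back to TInterfacedObject's methods.
    function _AddRef: LongInt; {$IFNDEF WINDOWS}cdecl{$ELSE}stdcall{$ENDIF};
    function _Release: LongInt; {$IFNDEF WINDOWS}cdecl{$ELSE}stdcall{$ENDIF};
  public
    function Chain: IChain;
    procedure Done;
  end;

function TChainer._AddRef: LongInt; {$IFNDEF WINDOWS}cdecl{$ELSE}stdcall{$ENDIF};
begin
  Result := inherited _AddRef;
  WriteLn('Addref,  Refcount: ', Result);
end;

function TChainer._Release: LongInt; {$IFNDEF WINDOWS}cdecl{$ELSE}stdcall{$ENDIF};
begin
  // The instance may already be destroyed after the inherited call,
  // so only the local Result is used afterwards.
  Result := inherited _Release;
  WriteLn('Release, Refcount: ', Result);
end;

function TChainer.Chain: IChain;
begin
  WriteLn('Chain');
  Result := Self;
end;

procedure TChainer.Done;
begin
  WriteLn('Done');
end;

function GetChainer: IChain;
begin
  Result := TChainer.Create;
end;

procedure Test;
begin
  GetChainer.Chain.Chain.Done;
  WriteLn('fin');
end;

begin
  Test;
end.
```

How many Addref/Release lines appear depends on the compiler version and on how it handles temporaries, which is exactly what this subthread is comparing.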

Mattias

Michael Van Canneyt
2018-06-29 15:07:07 UTC
Post by Mattias Gaertner
"stdcall" is wrong for Linux. It must be
{$IFNDEF WINDOWS}cdecl{$ELSE}stdcall{$ENDIF};
Addref: 00007F0B935BF040 Refcount: 1 at 000000000041331A
Addref: 00007F0B935BF040 Refcount: 2 at 00000000004121F2
Release: 00007F0B935BF040 Refcount: 2 at 0000000000412196
Chain: 00007F0B935BF040
Addref: 00007F0B935BF040 Refcount: 2 at 000000000041331A
Addref: 00007F0B935BF040 Refcount: 3 at 00000000004121F2
Release: 00007F0B935BF040 Refcount: 3 at 0000000000412196
Chain: 00007F0B935BF040
Addref: 00007F0B935BF040 Refcount: 3 at 000000000041331A
Addref: 00007F0B935BF040 Refcount: 4 at 00000000004121F2
Release: 00007F0B935BF040 Refcount: 4 at 0000000000412196
Done: 00007F0B935BF040
fin
Release: 00007F0B935BF040 Refcount: 3 at 0000000000412196
Release: 00007F0B935BF040 Refcount: 2 at 0000000000412196
Release: 00007F0B935BF040 Refcount: 1 at 0000000000412196
ahahahahahaha... Good point. Changed it and now I get this too.

Explains a lot. Strange the compiler does not warn about this.
("hides method of parent class" or somesuch)

With that out of the way, the original question still remains:

what does this demo actually demonstrate other than that the compiler functions ?

Michael.

Thorsten Engler
2018-06-29 15:17:05 UTC
-----Original Message-----
Michael Van Canneyt
Sent: Saturday, 30 June 2018 01:07
Subject: Re: [fpc-devel] Managed Types, Undefined Bhaviour
what does this demo actually demonstrate other than that the compiler functions ?
That there is a difference between "functions correctly" and "performs well"?


Michael Van Canneyt
2018-06-29 17:11:14 UTC
Post by Thorsten Engler
-----Original Message-----
Michael Van Canneyt
Sent: Saturday, 30 June 2018 01:07
Subject: Re: [fpc-devel] Managed Types, Undefined Bhaviour
what does this demo actually demonstrate other than that the compiler functions ?
That there is a difference between "functions correctly" and "performs well"?
Please explain. Exactly how does it demonstrate this ?

What is the expected output ?
And how does current output differ from expected output ?

Michael.

Mattias Gaertner
2018-06-29 17:24:09 UTC
On Fri, 29 Jun 2018 19:11:14 +0200 (CEST)
Post by Michael Van Canneyt
Please explain. Exactly how does it demonstrate this ?
Functions correctly: _AddRef is called for each interface reference and
_AddRef and _Release are balanced.

Performs well: needs only 3 _AddRef calls instead of 6.

Mattias

Thorsten Engler
2018-06-29 17:28:20 UTC
-----Original Message-----
Michael Van Canneyt
Sent: Saturday, 30 June 2018 03:11
Subject: Re: [fpc-devel] Managed Types, Undefined Bhaviour
Please explain. Exactly how does it demonstrate this ?
What is the expected output ?
And how does current output differ from expected output ?
The same code results in more calls to AddRef/Release under FPC than it does under Delphi.

The executed code in FPC is still "correct", in that the reference count reaches 0 (and the object is freed) late enough. So there is no issue with "correctness".

But the additional redundant calls to AddRef/Release will execute lock inc/dec or add/sub instructions, which are very expensive.


Martok
2018-06-29 14:31:17 UTC
Post by Michael Van Canneyt
Post by Martok
The expected output would be 3 Addrefs and 3 Releases.
I don't get that.
Somewhat current FPC trunk output, annotations added manually:
==================================================
Addref: 0022FAA8 Refcount: 1 at 00404961
(by fpc_class_as_intf in GetChainer)
Addref: 0022FAA8 Refcount: 2 at 00404223
(by fpc_intf_assign of GetChainer Result)
Release: 0022FAA8 Refcount: 2 at 004041F4
(by fpc_intf_decr_ref of GetChainer Result)
Chain: 0022FAA8
Addref: 0022FAA8 Refcount: 2 at 00404961
(by fpc_class_as_intf in Chain)
Addref: 0022FAA8 Refcount: 3 at 00404223
(by fpc_intf_assign of Chain Result)
Release: 0022FAA8 Refcount: 3 at 004041F4
(by fpc_intf_decr_ref of Chain Result)
Chain: 0022FAA8
Addref: 0022FAA8 Refcount: 3 at 00404961
Addref: 0022FAA8 Refcount: 4 at 00404223
Release: 0022FAA8 Refcount: 4 at 004041F4
Done: 0022FAA8
fin
Release: 0022FAA8 Refcount: 3 at 004041F4
(by fpc_intf_decr_ref at scope end of Test)
Release: 0022FAA8 Refcount: 2 at 004041F4
(ditto)
Release: 0022FAA8 Refcount: 1 at 004041F4
(ditto)
==================================================


Delphi output (without the stack trace part, because they don't have it):
==================================================
Addref: 0205DBE8 Refcount: 1
Chain: 0205DBE8
Addref: 0205DBE8 Refcount: 2
Chain: 0205DBE8
Addref: 0205DBE8 Refcount: 3
Done: 0205DBE8
fin
Release: 0205DBE8 Refcount: 3
Release: 0205DBE8 Refcount: 2
Release: 0205DBE8 Refcount: 1
==================================================


Delphi uses a shortcut for "as", and because of their different handling of
putting Result in a tempvar, they get away with fewer internal assignments.

That is 6 fewer LOCK instructions (and associated calls). Not a lot by itself, but since
we're counting single-digit cycles in other places...
--
Regards,
Martok



Michael Van Canneyt
2018-06-29 14:33:34 UTC
What OS is this ?

Some help from the compiler people to explain why I get totally different output ?

Michael.

Thorsten Engler
2018-06-29 14:37:10 UTC
-----Original Message-----
Martok
Sent: Friday, 29 June 2018 23:55
Interface functions are always virtual and implemented by the
actually instantiated class. The "override" keyword is neither
allowed nor needed,
Without having looked at the particular code this thread is about, that statement, at least as I interpret it, is wrong.

The specific functions that implement an interface get baked into the class at the moment when the interface is defined as part of the class. This results in important differences in behaviour, depending on whether methods (in the class) are defined as virtual or not, and whether a derived class redeclares an interface already declared on an ancestor or not.

I've only tried the following code (which demonstrates this) in Delphi, but would expect FPC to produce the same result (otherwise there is bound to be a lot of Delphi code which produces subtly different outcomes when compiled with FPC).

program IntfImplDetails;

{$APPTYPE CONSOLE}

uses
System.SysUtils;

type
IFoo = interface(IInterface)
['{E9A12596-8F61-4CF1-A09A-266D56BD837D}']
procedure Foo;
end;

IBar = interface(IFoo)
['{6782527D-431E-49F4-89D0-DCF871BE63A3}']
procedure Bar;
end;

TFoo = class(TInterfacedObject, IFoo)
protected
procedure Foo;
end;

TFooBar = class(TFoo, IBar)
protected
procedure Bar;
procedure Foo;
end;

TFooBarToo = class(TFooBar, IFoo)
protected
procedure Bar;
procedure Foo;
end;

TVirtFoo = class(TInterfacedObject, IFoo)
protected
procedure Foo; virtual;
end;

TVirtFooBar = class(TVirtFoo, IBar)
protected
procedure Bar;
procedure Foo; override;
end;

{ TFoo }

procedure TFoo.Foo;
begin
WriteLn('TFoo.Foo');
end;

procedure TFooBar.Bar;
begin
WriteLn('TFooBar.Bar');
end;

procedure TFooBar.Foo;
begin
WriteLn('TFooBar.Foo');
end;

procedure TFooBarToo.Bar;
begin
WriteLn('TFooBarToo.Bar');
end;

procedure TFooBarToo.Foo;
begin
WriteLn('TFooBarToo.Foo');
end;

procedure TVirtFoo.Foo;
begin
WriteLn('TVirtFoo.Foo');
end;

procedure TVirtFooBar.Bar;
begin
WriteLn('TVirtFooBar.Bar');
end;

procedure TVirtFooBar.Foo;
begin
WriteLn('TVirtFooBar.Foo');
end;

var
Intf : IInterface;

IntfFoo : IFoo;
IntfBar : IBar;

begin
try
{=== TFooBar ===}
WriteLn('=== TFooBar ===');
Intf := TFooBar.Create;

Supports(Intf, IFoo, IntfFoo);
IntfFoo.Foo; // TFoo.Foo

Supports(Intf, IBar, IntfBar);

IntfBar.Foo; // TFooBar.Foo
IntfBar.Bar; // TFooBar.Bar

IntfFoo := IntfBar;
IntfFoo.Foo; // TFooBar.Foo

{=== TFooBarToo ===}

WriteLn('=== TFooBarToo ===');
Intf := TFooBarToo.Create;

Supports(Intf, IFoo, IntfFoo);
IntfFoo.Foo; // TFooBarToo.Foo

Supports(Intf, IBar, IntfBar);

IntfBar.Foo; // TFooBar.Foo
IntfBar.Bar; // TFooBar.Bar

IntfFoo := IntfBar;
IntfFoo.Foo; // TFooBar.Foo

{=== TVirtFooBar ===}

WriteLn('=== TVirtFooBar ===');
Intf := TVirtFooBar.Create;

Supports(Intf, IFoo, IntfFoo);
IntfFoo.Foo; // TVirtFooBar.Foo

Supports(Intf, IBar, IntfBar);

IntfBar.Foo; // TVirtFooBar.Foo
IntfBar.Bar; // TVirtFooBar.Bar

IntfFoo := IntfBar;
IntfFoo.Foo; // TVirtFooBar.Foo
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
if DebugHook <> 0 then
ReadLn;
end.


Martok
2018-06-29 17:15:00 UTC
Post by Thorsten Engler
The specific functions that implement an interface get baked into the class at the moment when the interface is defined as part of the class. This results in important differences in behaviour, depending if methods (in the class) are defined as virtual or not, and if a derived class redeclares an interface already declared on an ancestor or not.
Okay, then why does the calling convention change matter so much?

Maybe a COM/CORBA thing? COM interface methods can't logically not be virtual,
and the kind of code from my example has always worked (for me!) in FPC.

I am confused. Which sorta ties in to the whole "surprises" thing from before we
hijacked this thread...
--
Regards,
Martok


Mattias Gaertner
2018-06-29 17:40:35 UTC
On Fri, 29 Jun 2018 19:15:00 +0200
Post by Martok
Post by Thorsten Engler
The specific functions that implement an interface get baked into the class at the moment when the interface is defined as part of the class. This results in important differences in behaviour, depending if methods (in the class) are defined as virtual or not, and if a derived class redeclares an interface already declared on an ancestor or not.
Okay, then why does the calling convention change matter so much?
The compiler searches interface methods in the class via the method
signature, which includes the calling convention.
Same as Delphi.

And same as Delphi, FPC does not give a hint if there is an overload
with a different calling convention. I wish there were one.
Post by Martok
[...] COM interface methods can't logically not be virtual,
I think you are confusing things here. They can be virtual or not
virtual in COM and CORBA.

Mattias

Thorsten Engler
2018-06-29 19:27:54 UTC
-----Original Message-----
Mattias Gaertner
Post by Martok
[...] COM interface methods can't logically not be virtual,
I think you are confusing things here. They can be virtual or not
virtual in COM and CORBA.
I think a lot of people simply don't understand how interfaces are implemented.

A decade ago, I wrote some articles about interfaces:

https://www.nexusdb.com/support/index.php?q=intf-fundamentals
https://www.nexusdb.com/support/index.php?q=intf-advanced
https://www.nexusdb.com/support/index.php?q=intf-aggregation

Looking through these now, I see that there are details missing about how Delphi (and I assume FPC) actually implements interfaces. (The following assumes that the reader has some understanding of interfaces that can be gained from the articles above.)

An interface being "a pointer to a pointer to an array of method pointers" raises the question, "where is the memory that the 2nd pointer in that chain is stored in?"

When you define a class like:

type
ISomeInterfaceA = interface(IInterface)
['{7C6DC303-0D93-4212-971F-6EACA1B97015}']
procedure SomeMethod;
end;

TSomeObjectA = class(TInterfacedObject, ISomeInterfaceA)
protected
{--- ISomeInterfaceA ---}
procedure SomeMethod;
end;

The in-memory layout of that class is something like:

VMT : Pointer { to array of method pointer};
InheritedFields : Array[x] of Byte; // fields inherited from TInterfacedObject
TSomeObjectA_ISomeInterfaceA_VMT : Pointer { to array of method pointer};

So if you have a TSomeObjectA instance and get an ISomeInterfaceA from that, what you get is a pointer to that TSomeObjectA_ISomeInterfaceA_VMT field of that particular instance of TSomeObjectA.

Which raises the next question: what's the value of that TSomeObjectA_ISomeInterfaceA_VMT and where does it come from?

The compiler, at the time it compiles TSomeObjectA, will create a static array of method pointers (for all the methods in the interface, including ancestors) and store a pointer to that in the RTTI of the class.

When an instance of TSomeObjectA is created, as part of the work done before the constructor is run, the RTL will go through the RTTI of the class, find out about all the "hidden interface VMT pointer fields" and initialize them to point at the array of method pointers the compiler generated. So all instances of TSomeObjectA will always have exactly the same value in the hidden TSomeObjectA_ISomeInterfaceA_VMT field.

Which raises the next question: what code exactly are the method pointers in that compiler-generated TSomeObjectA_ISomeInterfaceA_VMT array pointing to?

A valid ISomeInterfaceA VMT is expected to contain a pointer in the SomeMethod slot to something like:

procedure(Self: ISomeInterfaceA);

But the code for TSomeObjectA.SomeMethod is:

procedure(Self: TSomeObjectA);

So the compiler clearly can't just put a pointer to TSomeObjectA.SomeMethod into the SomeMethod slot of the TSomeObjectA_ISomeInterfaceA_VMT. Instead, the compiler needs to create code for a hidden trampoline that looks like this:

procedure TSomeObjectA_ISomeInterfaceA_SomeMethod(Self: ISomeInterfaceA);
begin
TSomeObjectA(PByte(Self)-Offset_of_TSomeObjectA_ISomeInterfaceA_VMT_in_instance_data).SomeMethod;
end;

So the code can reconstruct the TSomeObjectA pointer because the position of the TSomeObjectA_ISomeInterfaceA_VMT field (which is what Self: ISomeInterfaceA points to) is always at a fixed offset from the class VMT (which is what a TSomeObjectA variable points to).
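
The fixed offset can be observed directly. The following is a sketch assuming FPC with COM-style interfaces (PtrUInt and PPointer are types from FPC's System unit). It reuses the ISomeInterfaceA/TSomeObjectA declarations from above, while the program and variable names are made up for illustration. If the layout described here applies, both instances should report the same offset and the same interface VMT pointer; the actual numbers depend on compiler and target:

```pascal
program intfoffset;
{$mode objfpc}{$H+}
{$interfaces com}

type
  ISomeInterfaceA = interface(IInterface)
    ['{7C6DC303-0D93-4212-971F-6EACA1B97015}']
    procedure SomeMethod;
  end;

  TSomeObjectA = class(TInterfacedObject, ISomeInterfaceA)
  protected
    procedure SomeMethod;
  end;

procedure TSomeObjectA.SomeMethod;
begin
  WriteLn('TSomeObjectA.SomeMethod');
end;

var
  Obj1, Obj2: TSomeObjectA;
  Intf1, Intf2: ISomeInterfaceA;
begin
  Obj1 := TSomeObjectA.Create;
  Intf1 := Obj1;   // interface reference keeps Obj1 alive
  Obj2 := TSomeObjectA.Create;
  Intf2 := Obj2;   // interface reference keeps Obj2 alive

  // Distance in bytes between the object pointer and the interface pointer.
  WriteLn('offset in Obj1: ', PtrUInt(Pointer(Intf1)) - PtrUInt(Pointer(Obj1)));
  WriteLn('offset in Obj2: ', PtrUInt(Pointer(Intf2)) - PtrUInt(Pointer(Obj2)));

  // The hidden field each interface reference points at holds the VMT pointer.
  WriteLn('same interface VMT: ',
    PPointer(Pointer(Intf1))^ = PPointer(Pointer(Intf2))^);
end.
```

The trampoline subtracts exactly this offset from the interface pointer to get back to the object instance.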


And now we can get to the "COM interface methods can't logically not be virtual" statement.

A call against ISomeInterfaceA.SomeMethod is always "virtual". Yes. It always involves looking up the appropriate method pointer in the interface VMT and then calling that. So you can have different ISomeInterfaceA that will have different code pointers in the SomeMethod slot of their VMT.

But that virtual call only gets you to the compiler-generated trampoline, which then reconstructs the object pointer and makes the actual call against the object method. And THAT call can be static or virtual, depending on whether the method in the class is virtual or not.

Let's continue with the code example from above:

type
ISomeInterfaceB = interface(ISomeInterfaceA)
['{1BB48CC2-A2AF-4C7E-A798-288B0F30F04F}']
procedure SomeOtherMethod;
end;

TSomeObjectB = class(TSomeObjectA, ISomeInterfaceB)
protected
{--- ISomeInterfaceA ---}
procedure SomeMethod; //not virtual!

{--- ISomeInterfaceB ---}
procedure SomeOtherMethod;
end;

The resulting memory layout of TSomeObjectB will be:

VMT : Pointer { to array of method pointer};
InheritedFields : Array[x] of Byte; // fields inherited from TInterfacedObject
TSomeObjectA_ISomeInterfaceA_VMT : Pointer { to array of method pointer};
TSomeObjectB_ISomeInterfaceB_VMT : Pointer { to array of method pointer};

TSomeObjectA_ISomeInterfaceA_VMT will be initialized to exactly the same value when creating a TSomeObjectB instance as when creating a TSomeObjectA instance.

The SomeMethod slot in that VMT will point to exactly the same TSomeObjectA_ISomeInterfaceA_SomeMethod code I showed above. And because TSomeObjectA(...).SomeMethod; is a static call, it will really call *TSomeObjectA*.SomeMethod, not TSomeObjectB.SomeMethod.

TSomeObjectB_ISomeInterfaceB_VMT will contain a pointer to a totally different VMT than TSomeObjectA_ISomeInterfaceA_VMT. Because the offset between TSomeObjectB_ISomeInterfaceB_VMT and the class VMT is different than between TSomeObjectA_ISomeInterfaceA_VMT and the class VMT the compiler has to create new trampolines for all methods in ISomeInterfaceB (and ancestors).

So the SomeMethod slot in TSomeObjectB_ISomeInterfaceB_VMT will point to this code:

procedure TSomeObjectB_ISomeInterfaceB_SomeMethod(Self: ISomeInterfaceB);
begin
TSomeObjectB(PByte(Self)-Offset_of_TSomeObjectB_ISomeInterfaceB_VMT_in_instance_data).SomeMethod;
end;

Again, this is a static call, so TSomeObjectB(...).SomeMethod; will call TSomeObjectB.SomeMethod and nothing else.

IF SomeMethod was defined as virtual in TSomeObjectA and override in TSomeObjectB, then the TSomeObjectA(...).SomeMethod call in TSomeObjectA_ISomeInterfaceA_SomeMethod would be a virtual call, and if the ... is actually a TSomeObjectB, then it would end up calling TSomeObjectB.SomeMethod.


And to have a look at the other possibility, redeclaring the interface:

TSomeObjectBToo = class(TSomeObjectA, ISomeInterfaceA, ISomeInterfaceB)
protected
{--- ISomeInterfaceA ---}
procedure SomeMethod; //not virtual!

{--- ISomeInterfaceB ---}
procedure SomeOtherMethod;
end;

The resulting memory layout of TSomeObjectBToo will be:

VMT : Pointer { to array of method pointer};
InheritedFields : Array[x] of Byte; // fields inherited from TInterfacedObject
TSomeObjectA_ISomeInterfaceA_VMT : Pointer { to array of method pointer};
TSomeObjectBToo_ISomeInterfaceA_VMT : Pointer { to array of method pointer};
TSomeObjectBToo_ISomeInterfaceB_VMT : Pointer { to array of method pointer};

Notice that there is an additional TSomeObjectBToo_ISomeInterfaceA_VMT. This does not affect TSomeObjectA_ISomeInterfaceA_VMT, which will continue to be initialized to exactly the same value when creating a TSomeObjectBToo instance as when creating a TSomeObjectA instance.

The compiler will prepare VMTs for both TSomeObjectBToo_ISomeInterfaceA_VMT and TSomeObjectBToo_ISomeInterfaceB_VMT, each with their own trampolines.

The SomeMethod slot in TSomeObjectA_ISomeInterfaceA_VMT will continue to point to the TSomeObjectA_ISomeInterfaceA_SomeMethod I showed above.

var
SomeObjectBToo : TSomeObjectBToo;
SomeObjectA : TSomeObjectA;

IntfB : ISomeInterfaceB;
IntfA1, IntfA2, IntfA3 : ISomeInterfaceA;
begin
SomeObjectBToo := TSomeObjectBToo.Create;
SomeObjectA := SomeObjectBToo;

IntfB := SomeObjectBToo; // TSomeObjectBToo_ISomeInterfaceB_VMT

IntfA1 := SomeObjectBToo; // TSomeObjectBToo_ISomeInterfaceA_VMT
IntfA2 := SomeObjectA; // TSomeObjectA_ISomeInterfaceA_VMT
IntfA3 := IntfB; // TSomeObjectBToo_ISomeInterfaceB_VMT

You have now managed to get three different ISomeInterfaceA pointers, all for the same TSomeObjectBToo instance!
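The snippet above can be turned into a complete program roughly as follows. This is a sketch assuming Free Pascal in objfpc mode with COM-style interfaces; the interface declarations and the WriteLn bodies are filled in for illustration, since they are not shown in this message.

```pascal
program ThreeInterfacePointers;
{$mode objfpc}{$H+}
{$interfaces com}

type
  ISomeInterfaceA = interface
    procedure SomeMethod;
  end;

  ISomeInterfaceB = interface(ISomeInterfaceA)
    procedure SomeOtherMethod;
  end;

  TSomeObjectA = class(TInterfacedObject, ISomeInterfaceA)
  protected
    procedure SomeMethod; // not virtual!
  end;

  TSomeObjectBToo = class(TSomeObjectA, ISomeInterfaceA, ISomeInterfaceB)
  protected
    procedure SomeMethod; // not virtual, hides TSomeObjectA.SomeMethod
    procedure SomeOtherMethod;
  end;

procedure TSomeObjectA.SomeMethod;
begin
  WriteLn('TSomeObjectA.SomeMethod');
end;

procedure TSomeObjectBToo.SomeMethod;
begin
  WriteLn('TSomeObjectBToo.SomeMethod');
end;

procedure TSomeObjectBToo.SomeOtherMethod;
begin
  WriteLn('TSomeObjectBToo.SomeOtherMethod');
end;

var
  SomeObjectBToo: TSomeObjectBToo;
  SomeObjectA: TSomeObjectA;
  IntfB: ISomeInterfaceB;
  IntfA1, IntfA2, IntfA3: ISomeInterfaceA;
begin
  SomeObjectBToo := TSomeObjectBToo.Create;
  SomeObjectA := SomeObjectBToo;

  IntfB := SomeObjectBToo;
  IntfA1 := SomeObjectBToo;
  IntfA2 := SomeObjectA;
  IntfA3 := IntfB;

  // Compare the raw interface pointers; each one points at a
  // different interface VMT slot inside the same instance.
  WriteLn('A1 = A2: ', Pointer(IntfA1) = Pointer(IntfA2));
  WriteLn('A1 = A3: ', Pointer(IntfA1) = Pointer(IntfA3));
  WriteLn('A2 = A3: ', Pointer(IntfA2) = Pointer(IntfA3));

  // Which SomeMethod runs depends on which interface VMT is used.
  IntfA1.SomeMethod;
  IntfA2.SomeMethod;
  IntfA3.SomeMethod;
end.
```

Going by the layout described above, all three comparisons should come out False, and only the call through IntfA2 should reach TSomeObjectA.SomeMethod, because that pointer was obtained through the TSomeObjectA-typed variable.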


I hope this clears up some of the subtle issues with interfaces...

Cheers,
Thorsten

Martok
2018-06-30 13:29:31 UTC
Post by Thorsten Engler
Post by Mattias Gaertner
Post by Martok
[...] COM interface methods can't logically not be virtual,
I think you are confusing things here. They can be virtual or not
virtual in COM and CORBA.
I think a lot of people simply don't understand how interfaces are implemented.
Thank you for the explanation! Saved for future reference.

I was thinking too much in terms of C++ pure virtual classes and their VMT and
forgot about the self translation trampoline functions.
--
Regards,
Martok


Thorsten Engler
2018-06-29 17:44:09 UTC
-----Original Message-----
Martok
Sent: Saturday, 30 June 2018 03:15
Subject: Re: [fpc-devel] Managed Types, Undefined Bhaviour
Okay, then why does the calling convention change matter so much?
Maybe a COM/CORBA thing? COM interface methods can't logically not be
virtual, and the kind of code from my example has always worked (for
me!) in FPC.
I am confused. Which sorta ties in to the whole "surprises" thing
from before we hijacked this thread...
The compiler, when building the interface VMT for a specific interface as implemented by a specific class requires that the signature of the method in the class matches the signature of the method in the interface definition.

So if the _AddRef and _Release methods in IInterface/IUnknown are using cdecl and the methods in the class are stdcall, it will ignore them. (Technically the compiler shouldn't need to do that, because it has to create trampolines for all interface methods anyway to fix up the self pointer, at which point any difference in calling convention could be accounted for.)

What I am surprised about is that the code with mismatched calling conventions compiles at all, and that the compiler decides to use TInterfacedObject._AddRef (which has the correct calling convention) when building the interface VMT, instead of producing a compiler error saying that TChainer._AddRef has a mismatching calling convention.

I would not have expected that and it feels like a bug to me.

Mattias Gaertner
2018-06-29 17:47:05 UTC
On Fri, 29 Jun 2018 14:25:19 +0200
Post by Martok
Post by Michael Van Canneyt
Out of curiosity, can you give a simple example of such a funny behaviour
in such a chaining pattern ?
We had this topic about 2 years ago with regard to automatic file close on
interface release. Interestingly, something must have changed in the meantime,
because the trivial testcase is now *different*, which is somewhat the point of
being weird-undefined ;-)
Take this example: https://pastebin.com/gsdVXWAi
Thanks for the example.

I tried it with pas2js and found a bug and fixed it.

Mattias