Discussion:
Proposal for 2 new compiler functions
(too old to reply)
J. Gareth Moreton
2018-06-12 12:42:36 UTC
Permalink
Hi everyone,
Sorry to flood the mailing list again with more ideas and experiments.

I would like to propose introducing a new pair of in-built functions for
the compiler.

function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;

FLog2 returns the base-2 logarithm of X, rounded down, while CLog2 returns
the base-2 logarithm of X, rounded up.

To give examples where these functions could be useful, FLog2(X) + 1
indicates the minimum number of bits required to store X as an unsigned
integer, while CLog2(X) is equal to the maximum number of iterations
required for a binary search of an X-sized list.

Why should they be in-built though? With the binary search example, the
size of the list is sometimes known at compile time, hence is a constant -
therefore, its logarithm is also a constant.  By pre-calculating the
logarithm using the in-built function, it can be used to aid optimization
such as loop unrolling.  It also makes them easier to inline, where
FLog2(X) on x86_64-win64 translates to a single line of assembly language:
BSR EAX, ECX (unless X is zero, in which case ZF is set and EAX is
undefined).

If there had to be a slight nuance though, it's what happens if X = 0,
since log(0) = -oo, which cannot be stored as an integer and may cause
problems with the compiler.  I would propose it return 0 as a special
case, as this will fix most problems when it comes to loops and storage,
and also ensure that FLog2 and CLog2 are "entire functions".  To give an
optimised example of FLog(2) in x86-64 assembly language:

XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.

Some kind of deep optimization could be used if the input is known to be
non-zero and remove the instructions either side of BSR.

(Alternative names: FILog2 and CILog2, to indicate they work on integers
and to distinguish them from versions that work with floating-point
numbers)

Gareth
J. Gareth Moreton
2018-06-12 18:33:07 UTC
Permalink
That would be nice, but I don't think Free
Pascal supports such a concept, at least
not yet. It would certainly be much more
flexible and allow such functions to be
distributed as part of a mathematical
package, say.

The concept of a pure function is useful
to determine for other purposes as well
(insomuch that pure functions are far
easier to online).

Gareth

On Tue 12/06/18 19:26 , "David Pethes"
Hi,
given the description, maybe compile
time function execution is what
you're looking for as a general solution
instead of builtins?
https://en.wikipedia.org/wiki/Compile_time
_function_execution
David
On 12. 6. 2018 14:42, J. Gareth Moreton
Post by J. Gareth Moreton
Why should they be in-built though?
With the
binary search example, the
Post by J. Gareth Moreton
size of the list is sometimes known at
compile
time, hence is a constant
Post by J. Gareth Moreton
- therefore, its logarithm is also a
constant.  By pre-calculating the
Post by J. Gareth Moreton
logarithm using the in-built function,
it can be
used to aid
Post by J. Gareth Moreton
optimization such as loop unrolling.
_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fp
Wolf
2018-06-12 23:45:47 UTC
Permalink
Hi

I object to one component of Gareth's proposal - to make log2(0)=0. The
problem lies not with what Gareth wants to do with it, but what other
people will use it for once it becomes available.  log2(0) is undefined
(and undefinable, as it is not a computable number), the appropriate
choice for log2(0) is to make it Not-A-Number (NaN).

FLog2(0) = NaN = CLog2.

Such a choice would avoid the mess Gustavson got himself into when he
mapped <www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf> 0 and 1/0
onto the same number - a mapping that has many advantages for computing,
but eventually destroys computability
<https://arxiv.org/pdf/1701.00722>. To a lesser degree, this mess is
already present in the IEEE 754 standard for floating-point arithmetic,
and thus led to, to put it mildly, computing difficulties
<www.itu.dk/%7Esestoft/bachelor/IEEE754_article.pdf>, difficulties that
many programmers gloss over - or simply ignore.

I will have to say more about this when I am ready to file a bug report
on floating point exceptions, since Gareth's proposal has deep links to
how floating point numbers are defined - and why they were defined such.

Wolf
Post by J. Gareth Moreton
Hi everyone,
Sorry to flood the mailing list again with more ideas and experiments.
I would like to propose introducing a new pair of in-built functions
for the compiler.
function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;
FLog2 returns the base-2 logarithm of X, rounded down, while CLog2
returns the base-2 logarithm of X, rounded up.
To give examples where these functions could be useful, FLog2(X) + 1
indicates the minimum number of bits required to store X as an
unsigned integer, while CLog2(X) is equal to the maximum number of
iterations required for a binary search of an X-sized list.
Given the stated purpose, I could withdraw my objection if any reference
to logarithm was removed from the function and its name. Then Gareth
would be free to create his function any way he likes and assign to it
the properties he chooses. The only requirement left then would be to
state in the comments around the function what it is supposed to
achieve, as a deterrence to mis-use and guidance to future maintainers,
who may not think the same as Gareth does.
Post by J. Gareth Moreton
Why should they be in-built though? With the binary search example,
the size of the list is sometimes known at compile time, hence is a
constant - therefore, its logarithm is also a constant.  By
pre-calculating the logarithm using the in-built function, it can be
used to aid optimization such as loop unrolling.  It also makes them
easier to inline, where FLog2(X) on x86_64-win64 translates to a
single line of assembly language: BSR EAX, ECX (unless X is zero, in
which case ZF is set and EAX is undefined).
If there had to be a slight nuance though, it's what happens if X = 0,
since log(0) = -oo
This statement is false. log(0) is not infinity. To obtain a numerical
value for log(0) by e.g. Taylor series expansion, at one stage you have
to divide by zero since the differential
(d ln x )/ d x = 1/x.
 And since 1/0 is not an element of the set of computable numbers,
log(0) is not either. The only valid assignment can be log(0):=NaN, for
any base.
Post by J. Gareth Moreton
, which cannot be stored as an integer and may cause problems with the
compiler.  I would propose it return 0 as a special case, as this will
fix most problems when it comes to loops and storage, and also ensure
that FLog2 and CLog2 are "entire functions".  To give an optimised
XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.
Some kind of deep optimization could be used if the input is known to
be non-zero and remove the instructions either side of BSR.
(Alternative names: FILog2 and CILog2, to indicate they work on
integers and to distinguish them from versions that work with
floating-point numbers)
Gareth
_______________________________________________
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
J. Gareth Moreton
2018-06-12 23:07:25 UTC
Permalink
Well, I would argue that when computing log(x) of any base, as x tends to
0 from a positive number, log(x) becomes a progressively more negative
number to the point that, yes, it's undefined, but that's only going by the
definition of limits.

The issue here is that my proposed logarithm functions work not on
floating-point numbers, but integers, primarily for the sake of speed and
application.  There is no way to store a NaN or plus or minus infinity in
an integral type, which means the only other option is to raise an
exception.
For truly mathematical functions with a continuous domain, then yes, it
should return proper values.  I suppose what I'm proposing is not the true
base-2 logarithm, but two functions that do the following:

FLog2(x), x element of N (natural numbers), including zero

0, x = 0
floor(log2(x)); x ≠ 0

CLog2(x), x element of N, including zero

0, x = 0
ceil(log2(x)); x ≠ 0

(Not easy to describe when you can't use proper mathematical notation in
an e-mail!)
In this instance, it's less about mathematical correctness and more for
the sake of convenience, because only a single input (zero) would be
undefined, and for the sake of bit lengths and loop iterations, 0 is a more
convenient value than throwing an exception or returning something
undefined or potentially hazardous like $FFFFFFFF, which if passed blindly
into a for-loop, will cause 4.29 billion iterations..
Gareth

On Wed 13/06/18 00:45 , Wolf ***@gmail.com sent:

Hi

I object to one component of Gareth's proposal - to make log2(0)=0. The
problem lies not with what Gareth wants to do with it, but what other
people will use it for once it becomes available.  log2(0) is undefined
(and undefinable, as it is not a computable number), the appropriate choice
for log2(0) is to make it Not-A-Number (NaN).

FLog2(0) = NaN = CLog2.

Such a choice would avoid the mess Gustavson got himself into when he
mapped [1] 0 and 1/0 onto the same number - a mapping that has many
advantages for computing, but eventually destroys computability [2]. To a
lesser degree, this mess is already present in the IEEE 754 standard for
floating-point arithmetic, and thus led to, to put it mildly, computing
difficulties [3], difficulties that many programmers gloss over - or simply
ignore.
I will have to say more about this when I am ready to file a bug report
on floating point exceptions, since Gareth's proposal has deep links to how
floating point numbers are defined - and why they were defined such.

Wolf

On 13/06/2018 00:42, J. Gareth Moreton wrote:
Hi everyone,
Sorry to flood the mailing list again with more ideas and experiments.

I would like to propose introducing a new pair of in-built functions for
the compiler.

function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;

FLog2 returns the base-2 logarithm of X, rounded down, while CLog2 returns
the base-2 logarithm of X, rounded up.

To give examples where these functions could be useful, FLog2(X) + 1
indicates the minimum number of bits required to store X as an unsigned
integer, while CLog2(X) is equal to the maximum number of iterations
required for a binary search of an X-sized list.
Given the stated purpose, I could withdraw my objection if any reference
to logarithm was removed from the function and its name. Then Gareth would
be free to create his function any way he likes and assign to it the
properties he chooses. The only requirement left then would be to state in
the comments around the function what it is supposed to achieve, as a
deterrence to mis-use and guidance to future maintainers, who may not think
the same as Gareth does.
Why should they be in-built though? With the binary search example, the
size of the list is sometimes known at compile time, hence is a constant -
therefore, its logarithm is also a constant.  By pre-calculating the
logarithm using the in-built function, it can be used to aid optimization
such as loop unrolling.  It also makes them easier to inline, where
FLog2(X) on x86_64-win64 translates to a single line of assembly language:
BSR EAX, ECX (unless X is zero, in which case ZF is set and EAX is
undefined).

If there had to be a slight nuance though, it's what happens if X = 0,
since log(0) = -oo This statement is false. log(0) is not infinity. To
obtain a numerical value for log(0) by e.g. Taylor series expansion, at one
stage you have to divide by zero since the differential
(d ln x )/ d x = 1/x.
 And since 1/0 is not an element of the set of computable numbers,
log(0) is not either. The only valid assignment can be log(0):=NaN, for any
base.
, which cannot be stored as an integer and may cause problems with the
compiler.  I would propose it return 0 as a special case, as this will fix
most problems when it comes to loops and storage, and also ensure that
FLog2 and CLog2 are "entire functions".  To give an optimised example of
FLog(2) in x86-64 assembly language:

XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.

Some kind of deep optimization could be used if the input is known to be
non-zero and remove the instructions either side of BSR.

(Alternative names: FILog2 and CILog2, to indicate they work on integers
and to distinguish them from versions that work with floating-point
numbers)

Gareth


Links:
------
[1]
http://secureweb.fast.net.uk/www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf
[2] https://arxiv.org/pdf/1701.00722
[3]
http://secureweb.fast.net.uk/www.itu.dk/%7Esestoft/bachelor/IEEE754_article.pdf
[4] mailto:fpc-***@lists.freepascal.org
[5] http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
Wolf
2018-06-13 00:56:22 UTC
Permalink
Post by J. Gareth Moreton
Well, I would argue that when computing log(x) of any base, as x tends
to 0 from a positive number, log(x) becomes a progressively more
negative number to the point that, yes, it's undefined, but that's
only going by the definition of limits.
The issue here is that my proposed logarithm functions work not on
floating-point numbers, but integers, primarily for the sake of speed
and application.  There is no way to store a NaN or plus or minus
infinity in an integral type, which means the only other option is to
raise an exception.
That's why I am willing to withdraw my objections, as stated, the moment
you remove log from the function name. What you want to use is a
function of your own design, with value mappings different from what a
logarithm does. My objection is not about your optimization work, it is
about potential mis-use of your work, and mis-understandings by any
future maintainer of your work.
Wolf
Post by J. Gareth Moreton
For truly mathematical functions with a continuous domain, then yes,
it should return proper values.  I suppose what I'm proposing is not
FLog2(x), x element of N (natural numbers), including zero
0, x = 0
floor(log2(x)); x ≠ 0
CLog2(x), x element of N, including zero
0, x = 0
ceil(log2(x)); x ≠ 0
(Not easy to describe when you can't use proper mathematical notation
in an e-mail!)
In this instance, it's less about mathematical correctness and more
for the sake of convenience, because only a single input (zero) would
be undefined, and for the sake of bit lengths and loop iterations, 0
is a more convenient value than throwing an exception or returning
something undefined or potentially hazardous like $FFFFFFFF, which if
passed blindly into a for-loop, will cause 4.29 billion iterations..
I understand that, and if you want to use a Bit Scan Reverse
instruction, use it. But do not call it a logarithm, because that has
implications . . .  Take a look at the *[fpc-pascal] round(2.5)=2*
thread. Why is nobody there suggesting to look for Intel to sort out his
/her rounding issues? That thread displays the kind of blindness I am
concerned about. The answers are available, but hidden in massive
documentation, as you yourself noticed so recently.
Wolf
Post by J. Gareth Moreton
Gareth
Hi
I object to one component of Gareth's proposal - to make
log2(0)=0. The problem lies not with what Gareth wants to do with
it, but what other people will use it for once it becomes
available.  log2(0) is undefined (and undefinable, as it is not a
computable number), the appropriate choice for log2(0) is to make
it Not-A-Number (NaN).
FLog2(0) = NaN = CLog2.
Such a choice would avoid the mess Gustavson got himself into when
he mapped <www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf> 0
and 1/0 onto the same number - a mapping that has many advantages
for computing, but eventually destroys computability
<https://arxiv.org/pdf/1701.00722>. To a lesser degree, this mess
is already present in the IEEE 754 standard for floating-point
arithmetic, and thus led to, to put it mildly, computing
difficulties <www.itu.dk/%7Esestoft/bachelor/IEEE754_article.pdf>,
difficulties that many programmers gloss over - or simply ignore.
I will have to say more about this when I am ready to file a bug
report on floating point exceptions, since Gareth's proposal has
deep links to how floating point numbers are defined - and why
they were defined such.
Wolf
Post by J. Gareth Moreton
Hi everyone,
Sorry to flood the mailing list again with more ideas and
experiments.
I would like to propose introducing a new pair of in-built
functions for the compiler.
function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;
FLog2 returns the base-2 logarithm of X, rounded down, while
CLog2 returns the base-2 logarithm of X, rounded up.
To give examples where these functions could be useful, FLog2(X)
+ 1 indicates the minimum number of bits required to store X as
an unsigned integer, while CLog2(X) is equal to the maximum
number of iterations required for a binary search of an X-sized list.
Given the stated purpose, I could withdraw my objection if any
reference to logarithm was removed from the function and its name.
Then Gareth would be free to create his function any way he likes
and assign to it the properties he chooses. The only requirement
left then would be to state in the comments around the function
what it is supposed to achieve, as a deterrence to mis-use and
guidance to future maintainers, who may not think the same as
Gareth does.
Post by J. Gareth Moreton
Why should they be in-built though? With the binary search
example, the size of the list is sometimes known at compile time,
hence is a constant - therefore, its logarithm is also a
constant.  By pre-calculating the logarithm using the in-built
function, it can be used to aid optimization such as loop
unrolling.  It also makes them easier to inline, where FLog2(X)
BSR EAX, ECX (unless X is zero, in which case ZF is set and EAX
is undefined).
If there had to be a slight nuance though, it's what happens if X
= 0, since log(0) = -oo
This statement is false. log(0) is not infinity. To obtain a
numerical value for log(0) by e.g. Taylor series expansion, at one
stage you have to divide by zero since the differential
(d ln x )/ d x = 1/x.
 And since 1/0 is not an element of the set of computable numbers,
log(0) is not either. The only valid assignment can be
log(0):=NaN, for any base.
Post by J. Gareth Moreton
, which cannot be stored as an integer and may cause problems
with the compiler.  I would propose it return 0 as a special
case, as this will fix most problems when it comes to loops and
storage, and also ensure that FLog2 and CLog2 are "entire
functions".  To give an optimised example of FLog(2) in x86-64
XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.
Some kind of deep optimization could be used if the input is
known to be non-zero and remove the instructions either side of BSR.
(Alternative names: FILog2 and CILog2, to indicate they work on
integers and to distinguish them from versions that work with
floating-point numbers)
Gareth
_______________________________________________
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
_______________________________________________
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
<%3Ca%20href=>">http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
_______________________________________________
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
J. Gareth Moreton
2018-06-13 00:13:27 UTC
Permalink
I can see where you're coming from here.
If not a logarithm, what would you call
these functions that concisely and
compactly describes their behaviour and
purpose? While "bit scan reverse" is
effectively a truncated log2, with a flag
set if the input was zero, it's not a name
that jumps out at the user as "aah, that's
what I need" or "ah, I know what that's
for", and there's no equivalent to what
I'm calling CLog2 yet.

Would "ILog2" and "CILog2" differentiate
them enough from the mathematical
functions?

Gareth

On Wed 13/06/18 01:56 , Wolf
On 13/06/2018 11:07, J. Gareth Moreton
Well, I would argue that when computing
log(x) of any base, as x tends
to 0 from a positive number, log(x)
becomes a progressively more negative
number to the point that, yes, it's
undefined, but that's only going by the
definition of limits.
The issue here is that my proposed
logarithm functions work not on
floating-point numbers, but integers,
primarily for the sake of speed and
application.  There is no way to store a
NaN or plus or minus infinity in
an integral type, which means the only
other option is to raise an
exception. That's why I am willing to
withdraw my objections, as stated,
the moment you remove log from the
function name. What you want to use is a
function of your own design, with value
mappings different from what a
logarithm does. My objection is not
about your optimization work, it is
about potential mis-use of your work,
and mis-understandings by any future
maintainer of your work.
Wolf
For truly mathematical functions with a
continuous domain, then yes, it
should return proper values.  I suppose
what I'm proposing is not the true
base-2 logarithm, but two functions that
FLog2(x), x element of N (natural
numbers), including zero
0, x = 0
floor(log2(x)); x ≠ 0
CLog2(x), x element of N, including zero
0, x = 0
ceil(log2(x)); x ≠ 0
(Not easy to describe when you can't use
proper mathematical notation in
an e-mail!)
In this instance, it's less about
mathematical correctness and more for
the sake of convenience, because only a
single input (zero) would be
undefined, and for the sake of bit
lengths and loop iterations, 0 is a more
convenient value than throwing an
exception or returning something
undefined or potentially hazardous like
$FFFFFFFF, which if passed blindly
into a for-loop, will cause 4.29 billion
iterations.. I understand that,
and if you want to use a Bit Scan
Reverse instruction, use it. But do not
call it a logarithm, because that has
implications . . .  Take a look at
the [FPC-PASCAL] ROUND(2.5)=2 thread.
Why is nobody there suggesting to
look for Intel to sort out his /her
rounding issues? That thread displays
the kind of blindness I am concerned
about. The answers are available, but
hidden in massive documentation, as you
yourself noticed so recently.
Wolf
Gareth
On Wed 13/06/18 00:45 , Wolf
Hi
I object to one component of Gareth's
proposal - to make log2(0)=0. The
problem lies not with what Gareth wants
to do with it, but what other
people will use it for once it becomes
available.  log2(0) is undefined
(and undefinable, as it is not a
computable number), the appropriate choice
for log2(0) is to make it Not-A-Number
(NaN).
FLog2(0) = NaN = CLog2.
Such a choice would avoid the mess
Gustavson got himself into when he
mapped [1] 0 and 1/0 onto the same
number - a mapping that has many
advantages for computing, but eventually
destroys computability [2]. To a
lesser degree, this mess is already
present in the IEEE 754 standard for
floating-point arithmetic, and thus led
to, to put it mildly, computing
difficulties [3], difficulties that many
programmers gloss over - or simply
ignore.
I will have to say more about this when
I am ready to file a bug report
on floating point exceptions, since
Gareth's proposal has deep links to how
floating point numbers are defined - and
why they were defined such.
Wolf
On 13/06/2018 00:42, J. Gareth Moreton
Hi everyone,
Sorry to flood the mailing list again
with more ideas and experiments.
I would like to propose introducing a
new pair of in-built functions for
the compiler.
function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;
FLog2 returns the base-2 logarithm of X,
rounded down, while CLog2 returns
the base-2 logarithm of X, rounded up.
To give examples where these functions
could be useful, FLog2(X) + 1
indicates the minimum number of bits
required to store X as an unsigned
integer, while CLog2(X) is equal to the
maximum number of iterations
required for a binary search of an X-
sized list.
Given the stated purpose, I could
withdraw my objection if any reference
to logarithm was removed from the
function and its name. Then Gareth would
be free to create his function any way
he likes and assign to it the
properties he chooses. The only
requirement left then would be to state in
the comments around the function what it
is supposed to achieve, as a
deterrence to mis-use and guidance to
future maintainers, who may not think
the same as Gareth does.
Why should they be in-built though? With
the binary search example, the
size of the list is sometimes known at
compile time, hence is a constant -
therefore, its logarithm is also a
constant.  By pre-calculating the
logarithm using the in-built function,
it can be used to aid optimization
such as loop unrolling.  It also makes
them easier to inline, where
FLog2(X) on x86_64-win64 translates to a
BSR EAX, ECX (unless X is zero, in which
case ZF is set and EAX is
undefined).
If there had to be a slight nuance
though, it's what happens if X = 0,
since log(0) = -oo This statement is
false. log(0) is not infinity. To
obtain a numerical value for log(0) by
e.g. Taylor series expansion, at one
stage you have to divide by zero since
the differential
(d ln x )/ d x = 1/x.
 And since 1/0 is not an element of the
set of computable numbers,
log(0) is not either. The only valid
assignment can be log(0):=NaN, for any
base.
, which cannot be stored as an integer
and may cause problems with the
compiler.  I would propose it return 0
as a special case, as this will fix
most problems when it comes to loops and
storage, and also ensure that
FLog2 and CLog2 are "entire functions". 
To give an optimised example of
XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to
result if ZF is set.
Some kind of deep optimization could be
used if the input is known to be
non-zero and remove the instructions
either side of BSR.
(Alternative names: FILog2 and CILog2,
to indicate they work on integers
and to distinguish them from versions
that work with floating-point
numbers)
Gareth
------
[1]
http://secureweb.fast.net.uk/www.johngusta
fson.net/pdfs/BeatingFloatingPoin
t.pdf[2]
https://arxiv.org/pdf/1701.00722
[3]
http://secureweb.fast.net.uk/www.itu.dk/%7
Esestoft/bachelor/IEEE754_article
.pdf[4] http://secureweb.fast.net.uk/
http:=
[5] http://lists.freepascal.org/cgi-
bin/mailman/listinfo/fpc-devel
[6]
http://secureweb.fast.net.uk/parse.php?
redirect=http://lists.freepascal.org
/cgi-bin/mailman/listinfo/fpc-devel
_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman
Wolf
2018-06-13 02:11:05 UTC
Permalink
Why not call them UIntSize instead of FLog2 and BinSearchSize forCLog2?
That would give an indication of how to use them, and class them along
SizeOf, for finding them easily. After all, you are looking for the size
of a variable?

Wolf
Post by J. Gareth Moreton
I can see where you're coming from here.
If not a logarithm, what would you call
these functions that concisely and
compactly describes their behaviour and
purpose? While "bit scan reverse" is
effectively a truncated log2, with a flag
set if the input was zero, it's not a name
that jumps out at the user as "aah, that's
what I need" or "ah, I know what that's
for", and there's no equivalent to what
I'm calling CLog2 yet.
Would "ILog2" and "CILog2" differentiate
them enough from the mathematical
functions?
Gareth
_______________________________________________
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists.freepascal
J. Gareth Moreton
2018-06-13 08:32:33 UTC
Permalink
I started to think about such names while
sleeping on it. Something like BitCount
would work splendidly for FLog2, and a
note in the documentation can be made to
show the similarity to the binary
logarithm. BinSearchSize is a little
restrictive, but I'm sure something
similar can be found.

All these arguments over names!

Gareth

On Wed 13/06/18 03:11 , Wolf
Post by Wolf
Why not call them UIntSize instead of
FLog2 and BinSearchSize forCLog2?
Post by Wolf
That would give an indication of how to
use them, and class them along
Post by Wolf
SizeOf, for finding them easily. After
all, you are looking for the size
Post by Wolf
of a variable?
Wolf
On 13/06/2018 12:13, J. Gareth Moreton
Post by J. Gareth Moreton
I can see where you're coming from
here.
Post by Wolf
Post by J. Gareth Moreton
If not a logarithm, what would you
call
Post by Wolf
Post by J. Gareth Moreton
these functions that concisely and
compactly describes their behaviour
and
Post by Wolf
Post by J. Gareth Moreton
purpose? While "bit scan reverse" is
effectively a truncated log2, with a
flag
Post by Wolf
Post by J. Gareth Moreton
set if the input was zero, it's not a
name
Post by Wolf
Post by J. Gareth Moreton
that jumps out at the user as "aah,
that's
Post by Wolf
Post by J. Gareth Moreton
what I need" or "ah, I know what
that's
Post by J. Gareth Moreton
for", and there's no equivalent to
what
Post by Wolf
Post by J. Gareth Moreton
I'm calling CLog2 yet.
Would "ILog2" and "CILog2"
differentiate
Post by J. Gareth Moreton
them enough from the mathematical
functions?
Gareth
__________________________________________
_____
Post by Wolf
Post by J. Gareth Moreton
fpc-devel maillist - fpc-
http://lists.freepascal.org/cgi-
bin/mailman/listinfo/fpc-devel
__________________________________________
_____
Post by Wolf
fpc-devel maillist - fpc-
http://lists.freepascal.org/cgi-
bin/mailman/listinfo/fpc-devel
_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists
Florian Klämpfl
2018-06-13 19:28:44 UTC
Permalink
Post by J. Gareth Moreton
Hi everyone,
Sorry to flood the mailing list again with more ideas and experiments.
I would like to propose introducing a new pair of in-built functions for the compiler.
function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;
FLog2 returns the base-2 logarithm of X, rounded down, while CLog2 returns the base-2 logarithm of X, rounded up.
To give examples where these functions could be useful, FLog2(X) + 1 indicates the minimum number of bits required to
store X as an unsigned integer, while CLog2(X) is equal to the maximum number of iterations required for a binary search
of an X-sized list.
Why should they be in-built though? With the binary search example, the size of the list is sometimes known at compile
time, hence is a constant - therefore, its logarithm is also a constant.  By pre-calculating the logarithm using the
in-built function, it can be used to aid optimization such as loop unrolling.  It also makes them easier to inline,
where FLog2(X) on x86_64-win64 translates to a single line of assembly language: BSR EAX, ECX (unless X is zero, in
which case ZF is set and EAX is undefined).
If there had to be a slight nuance though, it's what happens if X = 0, since log(0) = -oo, which cannot be stored as an
integer and may cause problems with the compiler.  I would propose it return 0 as a special case, as this will fix most
problems when it comes to loops and storage, and also ensure that FLog2 and CLog2 are "entire functions".  To give an
XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is zero
CMOVZ EAX, EDX // Zero (EDX) written to result if ZF is set.
Some kind of deep optimization could be used if the input is known to be non-zero and remove the instructions either
side of BSR.
Here again: I would first try find a good pascal implementation of the two functions, then check, what FPC produces,
actually, FPC has already a BsrDWord function which is very similar to FLog2, though it returns 255 for 0. And then I
would propose to improve the optimizer to generate code close to hand coded assembler.

_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi
J. Gareth Moreton
2018-06-13 18:46:40 UTC
Permalink
On Wed 13/06/18 20:28 , Florian Klämpfl
Am 12.06.2018 um 14:42 schrieb J. Gareth
Post by J. Gareth Moreton
Hi everyone,
Sorry to flood the mailing list again
with more
ideas and experiments.
Post by J. Gareth Moreton
I would like to propose introducing a
new pair
of in-built functions for the compiler.
Post by J. Gareth Moreton
function FLog2(X: Cardinal): Cardinal;
function CLog2(X: Cardinal): Cardinal;
FLog2 returns the base-2 logarithm of
X, rounded
down, while CLog2 returns the base-2
logarithm of X, rounded up.
Post by J. Gareth Moreton
To give examples where these functions
could be
useful, FLog2(X) + 1 indicates the
minimum number of bits required to
Post by J. Gareth Moreton
store X as an unsigned integer, while
CLog2(X)
is equal to the maximum number of
iterations required for a binary search
Post by J. Gareth Moreton
of an X-sized list.
Why should they be in-built though?
With the
binary search example, the size of the
list is sometimes known at compile
Post by J. Gareth Moreton
time, hence is a constant - therefore,
its
logarithm is also a constant.  By pre-
calculating the logarithm using
the
Post by J. Gareth Moreton
in-built function, it can be used to
aid
optimization such as loop unrolling.  It
also makes them easier to
inline,
Post by J. Gareth Moreton
where FLog2(X) on x86_64-win64
translates to a
single line of assembly language: BSR
EAX, ECX (unless X is zero, in
Post by J. Gareth Moreton
which case ZF is set and EAX is
undefined).
Post by J. Gareth Moreton
If there had to be a slight nuance
though, it's
what happens if X = 0, since log(0) = -
oo, which cannot be stored as an
Post by J. Gareth Moreton
integer and may cause problems with
the
compiler.  I would propose it return 0
as a special case, as this will
fix most
Post by J. Gareth Moreton
problems when it comes to loops and
storage, and
also ensure that FLog2 and CLog2 are
"entire functions".  To
give an
Post by J. Gareth Moreton
optimised example of FLog(2) in x86-64
assembly
Post by J. Gareth Moreton
XOR EDX, EDX
BSR EAX, ECX // ZF is set if ECX is
zero
Post by J. Gareth Moreton
CMOVZ EAX, EDX // Zero (EDX) written
to result
if ZF is set.
Post by J. Gareth Moreton
Some kind of deep optimization could
be used if
the input is known to be non-zero and
remove the instructions either
Post by J. Gareth Moreton
side of BSR.
Here again: I would first try find a
good pascal implementation of the two
functions, then check, what FPC
produces,
actually, FPC has already a BsrDWord
function which is very similar to
FLog2, though it returns 255 for 0. And
then I
would propose to improve the optimizer
to generate code close to hand coded
assembler.
__________________________________________
_____
fpc-devel maillist - fpc-
http://lists.freepascal.org/cgi-
bin/mailman/listinfo/fpc-devel
Aah, I didn't know about that function -
thank you. I have looked up algorithms for
floor(log2(x)), and the best ones I've
found use a heap of bit shifts and other
logical operations, and a few conditional
checks, I think, which is all somewhat
slower than the 10-cycle BSR instruction.
Granted, I agree there should be a good
Pascal implementation so it is portable.

For CLog2(x) the only solution I've found
currently is FLog2(x-1)+1 while returning
0 if x is 0 or 1. Mind you, an extra
addition, subtraction and a conditional
check isn't bad.

Gareth

P.S. BsrDWord is effectively FLog2, with
the 255 return value standing in for -oo.
_______________________________________________
fpc-devel maillist - fpc-***@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/m

Loading...