[Push] division and tangent instructions

Sun Jun 27 00:18:03 EDT 2010

On the floats, wouldn't you always have to be checking for nans and infs and if you're not careful mightn't some operations on these values throw exceptions which you'd have to catch and handle in some language specific way? I don't think that every language I've worked in will handle them gracefully without extra work.

On ints I've always resented machine dictated limits like 32 bits -- if I want factorial 100 then I want factorial 100. I realize there are costs to this, but at least when I'm working in an implementation language that gives me BigInts automatically (like common lisp or scheme or clojure for now, although that's under discussion) I strongly favor allowing integers to get really big. One needs a limit to prevent numbers of sizes that fill all of memory, but that can be a heck of a lot more than 32 bits.

I not saying that it that a Push with 32-bit integers wouldn't be nice for some purposes (e.g. running a little faster), just that I like auto-promotion to BigInts in the integer type for many of my own applications.

I agree that it'd be just as well to add ratios and/or arbitrary precision floats as new, independent types. Integers are different, because integer is a core type that figures in a lot of standard instructions for other types.

 -Lee

On Jun 26, 2010, at 5:36 PM, Maarten Keijzer wrote:

> My take on this,
> 
> I would say that the 'float' datatype in Push should be defined as a single or 
> double precision floating point value with IEEE-754 semantics. This means 
> follow this standard (which is followed by practically all programming 
> languages) for division by zero, overflows, propagation of nans and infs, etc. 
> There's no need to invent something new, and any implementation of push will 
> therefore implicitly use this.
> 
> I suggest using double precision.
> 
> If ratios or arbitrary precision floats are necessary, they are a separate 
> type, thus use their own stack.
> 
> I would also say that the basic 'int' datetype in Push should be a 32-bit 
> integer, but I don't feel nearly as strong about it as having a IEEE-754 
> compliant semantics.
> 
> And note that Koza called 'his' division operator 'pdiv', p for protected. 
> There's no reason Push can't support a pdiv, but a real division should be 
> standard.
> 
> -Maarten-
> 
> 
> On Saturday, June 26, 2010 05:34:07 pm Lee Spector wrote:
>> 
>> I don't think I personally have strong opinions how this is handled (as long 
> as the rationale makes sense), and I agree that it'd be good for it to be 
> documented more carefully whether or not it is standardized across 
> implementations. That said, here's a little history.
>> 
>> The idea of instructions becoming no-ops when given arguments that produce 
> exceptions is something that I built into Push from the start, and I like it 
> for its clarity and uniformity across different kinds of possible exceptions. 
> Maarten Keijzer has advocated an alternative in which stacks are left in 
> intermediate states upon failure, and he has some interesting arguments for 
> this, but for now I personally prefer the more consistent no-op behavior. 
>> 
>> The idea that division by zero is an exception comes (in my experience) from 
> Koza, who defined "protected division" to return a specific value (0 or 1 -- I'd 
> have to look it up) to avoid runtime errors. I don't why he didn't use Inf or 
> NaN, but I think I was just following his lead. I followed him in considering 
> division by zero "exceptional" but then did something different in response: 
> rather than pushing a specific (incorrect) value I adopted the general Push no-
> op strategy (which wasn't an option for Koza when evolving Lisp-style s-
> expressions). 
>> 
>> I've thought of using Inf and/or NaN more recently but the details of 
> generating and handling these values can be messy and depend on the 
> implementation language, and I've generally proceeded by eliminating Inf/NaN 
> one way or another rather than treating it as a legitimate value OR as an 
> exception. For example, because I've often been working in languages that 
> allow for arbitrary sized integers (which can grow big enough to consume all 
> RAM), and also because I want to avoid underflow exceptions (which I can't 
> easily detect until the implementation language triggers an exception, which 
> I'd rather not have to deal with in different language contexts), I generally 
> define some minimum and maximum number magnitudes and clamp values to these. 
> (In several of my implementations there's a function called something like 
> "keep-number-reasonable" that handles this.) Conceivably, for consistency, I 
> should be making instructions become no-ops whenever the min/max magnitudes 
> are hit, but in
>>  fact I just clamp the values.
>> 
>> This could be done better. But it's a slippery slope. There's a lot of 
> possible complexity in the design of a language's numeric type system. In 
> several implementation languages it'd be easy to incorporate ratios and/or 
> complex numbers, and there would be a lot of things to consider not only about 
> exceptions but also about how the type system and the Push types interact 
> (e.g. if there's auto-promotion to BigInts on the integer stack, or if there's 
> instead a second stack for BigInts, etc.). When considering these design 
> decisions you have to consider not only completeness and expressive power, 
> etc., but also potential mutation/evolution dynamics, which is HARD.
>> 
>> I'm sure there are other interesting approaches to this, maybe involving the 
> "option types" or monads that you mention... 
>> 
>> On your questions about specific effects of these choices: This will only 
> affect symbolic regression of 1/x if you have a fitness case with x=0. If you 
> do, what's the target y value? If it's inf then yes, you need a number type 
> that can generate and manipulate inf. But if inf isn't in your data set then 
> you'll never need to handle it in the interpreter in a correct program... 
> unless you're doing transfinite symbolic regression, which would be VERY 
> INTERESTING! But it would require additional redesign of the numeric 
> instructions.
>> 
>> The same applies to the Tan example. If your input data contains inf then 
> you need number instructions that deal with it, but since it usually doesn't 
> you can usually clamp or handle the very big numbers in some other way.
>> 
>> -Lee
>> 
>> 
>> 
>> On Jun 26, 2010, at 10:22 AM, PerPlex Ed wrote:
>> 
>>> Division and modulo instruction are defined as no-op when the top of the 
> stack is zero.
>>> 
>>> While I understand some of the reason behind this choice, I don't feel at 
> ease with it completely.
>>> 
>>> For floating point computation I guess it's common for most platform to be 
> able to represent and perform conputation on the special values NaN, +Inf and 
> -Inf following more or less strictly the some IEEE standard. Java and C# 
> behaviours are described here:
>>> 
>>> http://www.itu.dk/people/sestoft/javaprecisely/java-floatingpoint.pdf
>>> http://www.itu.dk/~sestoft/csharpprecisely/csharp-floatingpoint.pdf
>>> 
>>> I don't know about Lisp.
>>> 
>>> Now what if I want to perform symbolic regression of 1/x?
>>> 
>>> Because of the current Push definition, any genetic programming system 
> based on Push won't be able to find an exact result. I don't have enough 
> experience to tell if this problem will make harder for Push based system to 
> find an aproximate solution.
>>> 
>>> What about integers? I see that modern languages use "option types" or the 
> Maybe monad to represent computation that can possibly result an invalid 
> result without causing any exceptional conditions or disruption in the program 
> evaluation. Isn't this something that looks appropriate for the purposes of 
> Push?
>>> 
>>> In any case, the tangent function goes to infinity periodically but there 
> is nothing in the language specification about these cases. Should they be 
> performed as no-op? Why can Float.Tan push Inf or NaN on the stack while the 
> division is not allowed to do that?
>>> In my latest Push run Float.Tan found 2.8097223026326946E+36 on the top of 
> the stack and pushed NaN.
>>> 
>>> Thanks.
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> Push mailing list
>>> Push at lists.hampshire.edu
>>> https://lists.hampshire.edu/mailman/listinfo/push
>> 
>> --
>> Lee Spector, Professor of Computer Science
>> School of Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>> 
>> Check out Genetic Programming and Evolvable Machines:
>> http://www.springer.com/10710 - http://gpemjournal.blogspot.com/
>> 
>> _______________________________________________
>> Push mailing list
>> Push at lists.hampshire.edu
>> https://lists.hampshire.edu/mailman/listinfo/push
>> 

--
Lee Spector, Professor of Computer Science
School of Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspector at hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438

Check out Genetic Programming and Evolvable Machines:
http://www.springer.com/10710 - http://gpemjournal.blogspot.com/