Virtual Types

DarScott · Post by **DarScott** » Wed Jun 19, 2013 6:15 pm

Since numbers being numerals is probably not a "viable path", then we here at my lab will discuss reopening development on an arithmetic and math package for LiveCode users that will include high-precision decimal numerals.

mwieder · Post by **mwieder** » Wed Jun 19, 2013 6:58 pm

Like this?
http://software.intel.com/en-us/article ... th-library

DarScott · Post by **DarScott** » Wed Jun 19, 2013 7:52 pm

@mwieder, that standard is something I was thinking of for LiveCode virtual numerals on machines without the hardware for decimal floating point. The decimal point will allow numbers to be virtually strings. (Breaking some scripts related to numberFormat, though.)

However, that doesn't make our arithmetic library slow enough, so we might go to fixed point for a GPL version (if offered), and also fixed or programmable precision scientific (floating point) and indefinite precision with programmable accuracy. (We weren't able to come up with some single form for results good for both cryptography and scientific uses, but are open for ideas.)

LCMark · Post by **LCMark** » Thu Jun 20, 2013 9:37 am

DarScott wrote: Since numbers being numerals is probably not a "viable path", then we here at my lab will discuss reopening development on an arithmetic and math package for LiveCode users that will include high-precision decimal numerals.

If by 'numbers being numerals' you meant moving to decimal arithmetic in the engine, then I'm all for that...

Indeed, after refactoring this becomes a case of 'just' implementing a variant of the MCNumberRef type that uses decimal arithmetic rather than binary fp. We can also look at making integers 'infinite-precision' too.

The same is true of Unicode support: after refactoring, this becomes a case of 'just' adding the ability to manipulate Unicode strings to the MCStringRef type.

Now, in both of these cases I say 'just' because there will be some aspects of the engine and user-visible script syntax that might need to be adjusted, but in either case this won't be a great deal.

DarScott wrote: that standard is something I was thinking of for LiveCode virtual numerals on machines without the hardware for decimal floating point. The decimal point will allow numbers to be virtually strings. (Breaking some scripts related to numberFormat, though.)

There are a number of libraries out there that implement decimal arithmetic according to the IEEE standard - IBM has one too. It's just a case of finding one that has a suitable license that we can use.
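To make the difference between binary and decimal arithmetic concrete, here is a minimal sketch in Python rather than LiveCode. It uses Python's standard `decimal` module purely as a stand-in for whichever decimal library the engine might adopt; it says nothing about how MCNumberRef would actually be implemented.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is not exactly 0.3.
binary_sum = 0.1 + 0.2
binary_equal = (binary_sum == 0.3)

# Decimal arithmetic works on the digits as written.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_equal = (decimal_sum == Decimal("0.3"))

print(binary_equal, decimal_equal)
```

The equality surprise disappears in the decimal case, which is the kind of confusion the thread is concerned with.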

LCMark · Post by **LCMark** » Thu Jun 20, 2013 9:39 am

Quote: One little thing that arrays are missing is the ability to pass a sub array as a reference in user commands and functions. Perhaps we can agree that that can be fixed. (I say little, but there might be some bad side effect problems.)

This is something that will be possible in the refactored branch. Indeed, the aim is to generalize it to the ability to pass any chunk as a reference parameter:

Code:

   myHandler tNotByRef, char 1 to 10 of tThisIsPassedByRef

   command myHandler tNormal, @xByRef
     put "foo" into xByRef
   end command

DarScott · Post by **DarScott** » Thu Jun 20, 2013 5:23 pm

I am for moving to decimal numbers. (I'm not volunteering. Yet.)

But by saying, "numbers as numerals", I mean that you can't tell that there are numbers. You currently can recognize the result of arithmetic by converting it to a string with different numberFormat values. Kids currently see strange things happen when working with equality (beyond the binary point arithmetic problems) and it creates confusion. The goal of what I'm pondering is to remove that. That is, every value of a number has a canonical string representation with no loss of information. It is as if it is that string. Only kids who work with very big or very small numbers will see scientific notation (which doesn't have to be E notation). And it works the other way. Among all the strings that can be converted to numbers for arithmetic, those canonical strings are a subset.

This creates problems. What does numberFormat do? With the above change 'put 1/3 into field 1' will display "0.33333333333333333333" instead of the shorter string now. Perhaps an explicit operation where we would now put into a display or do the &empty trick, would be needed and it would use numberFormat. These problems might be way too much for people.
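A rough model of the "number is its canonical string" idea, again in Python with the `decimal` module standing in for a future engine type. The 20-digit precision and the `canonical` helper are assumptions made for illustration, and the final `format` call only plays the role that an explicit numberFormat step might play.

```python
from decimal import Decimal, localcontext

def canonical(value: Decimal) -> str:
    """Hypothetical canonical form: the string that 'is' the number."""
    return str(value)

with localcontext() as ctx:
    ctx.prec = 20                      # assumed working precision
    third = Decimal(1) / Decimal(3)

text = canonical(third)                # what 'put 1/3 into field 1' would show
round_trip = Decimal(text)             # string -> number loses nothing

# An explicit display step, analogous to applying numberFormat on request.
display = format(third, ".4f")
```

Under these assumptions the stored value and its string are interchangeable, and shortening happens only when the script asks for it.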

If we can't make this change, then we need better documentation and better language for talking about the result of arithmetic.

By the way, I talked with one of the programmers for the math/arithmetic package and he's going to be busy in grad school, so it might not happen. But, maybe.

DarScott · Post by **DarScott** » Thu Jun 20, 2013 5:40 pm

I did an analysis of the problems with passing chunks and array elements by reference. Maybe it is on the improve list, maybe it is lost. It might be more wimpy than I remember.

From what I remember, there are potential problems with overlapping chunks, global variables and maybe variables accessible through context do.

I'm happy with behavior being undefined for scripts that operate on the same values from two views. However, there should not be an explosive amount of time or memory involved and there should be no crash. Also, the effect should be as local as possible.

However, if you guys can think of some reasonable behavior that is easy to define, then that is fine. Well, maybe only the non-overlapping cases should be easy to define. The overlapping ones might take another paragraph or two.

One simple way for an array element is to allow it to become orphaned if changes to the parent array remove it, so that the handler works with its variable being the only owner of the value. One simple way for chunks is to assume char n to m of a reference and ignore changes to the value of the reference. That could be made smarter, but I'm not sure how far that will go. If the reference is no longer owned by any other variable, that should be fine. If the reference is (say) emptied, then the char chunk can work similarly to, or the same as, now.

LCMark · Post by **LCMark** » Thu Jun 20, 2013 6:36 pm

DarScott wrote: From what I remember, there are potential problems with overlapping chunks, global variables and maybe variables accessible through context do.

Indeed - although I think I've come up with a simple solution to the issue...

Imagine you have a situation such as this:

Code:

local sVar, sVarArray, sVarString
on mouseUp
  local x, n
  put "foo" into x
  put 100 into n
  myHandler sVar, sVarArray[x], item n of sVarString
end mouseUp

command myHandler @xVar, @xElement, @xItem
  put 100 into xVar
  put 100 into xElement
  put 100 into xItem
end command

The case of what happens in the @xVar case is clear - you want the variable that was passed into the command to be updated. Indeed, one can view this as if the line were rewritten as:

Code:

  put 100 into sVar

The caller passes in the location of the sVar variable, rather than its value.

This is easily extended to the other cases...

So, what is the location of 'sVarArray[x]'? Well, it is 'sVarArray["foo"]'. So, in the @xElement case, the action would be as if the line were rewritten as:

Code:

  put 100 into sVarArray["foo"]

Again, what is the location of 'item n of sVarString'? Well, it is 'item 100 of sVarString'. So in the @xItem case, the action would be as if the line were rewritten as:

Code:

  put 100 into item 100 of sVarString

Essentially, when evaluating the parameter list the engine will evaluate each parameter just enough to fix a location, and it will be that location which is used in reference parameters. For example, to encode the location of an array element you need the base variable and a sequence of keys; to encode an item chunk you need the base variable and an index.
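The "location" idea can be modelled outside the engine. The sketch below is Python, not LiveCode, and the class names `ElementLocation` and `ItemLocation` are invented for illustration; it assumes variables live in a dictionary keyed by name and that items are comma-delimited.

```python
class ElementLocation:
    """Base variable name plus a sequence of keys (assumed representation)."""
    def __init__(self, variables, name, keys):
        self.variables, self.name, self.keys = variables, name, list(keys)

    def get(self):
        node = self.variables.get(self.name, "")
        for key in self.keys:
            if not isinstance(node, dict) or key not in node:
                return ""                      # a missing element reads as empty
            node = node[key]
        return node

    def put(self, value):
        node = self.variables.get(self.name)
        if not isinstance(node, dict):
            node = self.variables[self.name] = {}
        for key in self.keys[:-1]:             # walk (or create) the path
            if not isinstance(node.get(key), dict):
                node[key] = {}
            node = node[key]
        node[self.keys[-1]] = value


class ItemLocation:
    """Base variable name plus a 1-based item index."""
    def __init__(self, variables, name, index):
        self.variables, self.name, self.index = variables, name, index

    def get(self):
        items = str(self.variables.get(self.name, "")).split(",")
        return items[self.index - 1] if self.index <= len(items) else ""

    def put(self, value):
        items = str(self.variables.get(self.name, "")).split(",")
        items += [""] * (self.index - len(items))   # pad with empty items
        items[self.index - 1] = str(value)
        self.variables[self.name] = ",".join(items)


script_locals = {"sVarArray": {}, "sVarString": "a,b,c"}

def my_handler(x_element, x_item):
    x_element.put(100)     # acts like: put 100 into sVarArray["foo"]
    x_item.put(100)        # acts like: put 100 into item 2 of sVarString

x, n = "foo", 2            # evaluated in the caller, then frozen in the location
my_handler(ElementLocation(script_locals, "sVarArray", [x]),
           ItemLocation(script_locals, "sVarString", n))
```

The handler never sees the values of `x` or `n` as expressions, only the locations built from them, which is the point of the scheme described above.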

DarScott · Post by **DarScott** » Thu Jun 20, 2013 7:23 pm

I think there are some interesting consequences to your approach.

If the variable reference and a path of keys is used to refer to an array element, then those will have to be calculated each time. That will make a performance hit. However, if a side effect removes part of the array, setting the referenced parameter will replace those branches. If a reference to the element is used, then the time is short and constant and the same for all cases. However, side effect changes can make the handler the only reference to the element. Manipulation is consistent, but there is no final effect.

If the chunk is recalculated at each use, there can be a performance hit. However, the chunk could be converted to a character chunk, which can presumably be accessed in constant time, or near-constant time if tail sharing is used. The character chunk can remain constant in offset or be adjusted for some simple side effects.

Here is what I mean by side effects.

Code:

...
global x
global s
put 32 into y["a"]
put y into x["b"]
put "a,b,c,d" into s
process x["b"], item 3 of s, char 5 of s
...

command process @w, @z, @zz
   global x
   global s
   if w["a"] < 5 then return "bad w"  -- some range checking
   put empty into x -- what impact on w?
   add 1 to w["a"]  -- we think this is now at least 6
   put the keys of x 
   put lf & w["a"] after message
   put "z," before s --what impact on z, and zz?
   put "m" into char 1 of z
   put space & z after message
   put space & zz after message
   return sqrt( w["a"]-6 )  -- error?  range checking was done
end process

It might be straightforward to allow some sort of consistency with array elements, but that might not be possible with chunks.

One way to make passed chunks consistent internal to the handler is to virtually copy the chunks and then virtually put them back left-most arguments first, or undefined order. By "virtually" I mean the poor implementer is expected to make it a lot faster than that somehow.

DarScott · Post by **DarScott** » Thu Jun 20, 2013 7:36 pm

I think my second paragraph above is hard to read because I used reference two ways.

I am imagining that values will be reference counted entities. These might be strings or arrays (for this discussion).

A LiveCode parameter reference might be done a couple ways. It might be passed the mutable entity in the variable along with a path down the array (if any) to the entity passed (or chunk of that entity if a string). Or, it might be passed simply the mutable entity itself, no path.

The latter is faster and allows consistent manipulation of the passed LiveCode parameter reference.

The first is self-repairing if the parent is changed or removed.

I like the first.
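The trade-off between the two approaches can be shown with plain Python dictionaries standing in for LiveCode arrays. The helpers `get_at` and `put_at` are hypothetical names, and treating a missing element as 0 in arithmetic is an assumption of the sketch.

```python
# The latter approach: hand over the mutable entity itself, no path.
x = {"b": {"a": 32}}
direct = x["b"]            # the callee holds the child node directly
x.clear()                  # side effect: the parent is emptied
direct["a"] += 1           # manipulation stays consistent...
orphan_value = direct["a"]
visible_in_x = "b" in x    # ...but the change never reaches x

# The first approach: hand over the base variable plus a path of keys.
def get_at(base, keys, default=0):
    node = base
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

def put_at(base, keys, value):
    node = base
    for key in keys[:-1]:
        node = node.setdefault(key, {})   # rebuild branches that were removed
    node[keys[-1]] = value

y = {"b": {"a": 32}}
path = ["b", "a"]
y.clear()                                  # same side effect as before
put_at(y, path, get_at(y, path) + 1)       # the path is re-walked and repaired
```

The direct version is fast but ends up orphaned; the path version costs a walk on each use but keeps writing into the real variable.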

(I wonder if this needs its own topic.)

DarScott · Post by **DarScott** » Thu Jun 20, 2013 9:55 pm

I don't know why I'm making a big deal of this; I'm sure you guys will do something good.

DarScott · Post by **DarScott** » Fri Jun 21, 2013 5:23 pm

A thought came to me this morning.

Maybe the same mechanism used to pass chunks as reference can be used to implement lazy chunks. The copy is not done unless the source string or the copy is modified. Then the value becomes detached, the copy is made and it becomes its own string.
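One possible shape for such a lazy chunk, sketched in Python with invented class names (`Text`, `LazyChunk`). It only models the bookkeeping: Python slicing still copies on every read, whereas a real engine would presumably share storage until the detach.

```python
class Text:
    """Mutable string holder that tracks the lazy chunks borrowing from it."""
    def __init__(self, value):
        self.value = value
        self.borrowers = []

    def chunk(self, start, end):
        lazy = LazyChunk(self, start, end)
        self.borrowers.append(lazy)
        return lazy

    def put(self, value):
        # The source is about to change: each borrower takes its copy now.
        for lazy in list(self.borrowers):
            lazy.detach()
        self.value = value


class LazyChunk:
    def __init__(self, source, start, end):
        self.source, self.start, self.end = source, start, end
        self.own = None            # no private copy yet
        self.copies_made = 0

    def detach(self):
        if self.source is not None:
            self.own = self.source.value[self.start:self.end]
            self.copies_made += 1
            self.source.borrowers.remove(self)
            self.source = None

    def get(self):
        if self.source is not None:
            return self.source.value[self.start:self.end]  # read through
        return self.own

    def put(self, value):
        self.detach()              # writing to the copy also detaches it
        self.own = value


s = Text("hello world")
c = s.chunk(0, 5)                  # like: put char 1 to 5 of s into c
copies_before = c.copies_made
s.put("HELLO WORLD")               # modifying the source forces the copy
```

After the source changes, `c` still reads as "hello" because it detached just before the write.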

LCMark · Post by **LCMark** » Fri Jun 21, 2013 7:54 pm

DarScott wrote: I am imagining that values will be reference counted entities. These might be strings or arrays (for this discussion).

Yes - all values will be reference counted. Also copy-on-write which removes the need for using @ (reference parameters) for efficiency improvements.

DarScott wrote: Or, it might be passed simply the mutable entity itself, no path.

Implementation-wise this is really muddy - if you don't want copy-on-write semantics then it would work to just pass the child node in as the mutable entity. However, that then means that it can become orphaned if the parent variable is changed. This essentially means you don't really have a 'reference' parameter anymore - just one that is until you do something that makes it not able to be one. With this scenario, if you want to keep copy-on-write semantics then you have to pass the base variable along with it (since copy-on-write with arrays means when you change a child you need to ensure each of its ancestors is mutable) and thus you are doing extra work for a resulting semantic which very much 'depends on context'.
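The remark that changing a child under copy-on-write means making each ancestor mutable can be illustrated with a path-copying update. This is a Python sketch with a hypothetical helper name, assuming nested dictionaries stand in for arrays; it is not a description of the engine's internals.

```python
def put_copy_on_write(root, keys, value):
    """Return a new root with `value` stored at `keys`.

    Only the dictionaries along the path are copied; every other
    branch is shared with the original.
    """
    new_root = dict(root)                  # shallow copy of this level only
    if len(keys) == 1:
        new_root[keys[0]] = value
    else:
        child = root.get(keys[0], {})      # assumes intermediate nodes are dicts
        new_root[keys[0]] = put_copy_on_write(child, keys[1:], value)
    return new_root

original = {"b": {"a": 32}, "big": {"untouched": [0, 1, 2]}}
alias = original                           # a second variable sharing the value
updated = put_copy_on_write(original, ["b", "a"], 33)
```

This is why the base variable has to travel with the child: the copy starts at the root and works down, so a bare child node is not enough.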

DarScott wrote: The first is self-repairing if the parent is changed or removed.

Indeed - it also means you really do have a reference parameter that is a direct analog of just passing a variable.

DarScott wrote: Here is what I mean by side effects.

I must confess I quite liked your little puzzle... Here is the analysis of the result based on the semantics I proposed above (caveat - I have not double-checked it...):

Code:

command process @w, @z, @zz
   global x
   global s
   if w["a"] < 5 then return "bad w"  -- some range checking -- w["a"] == x["b"]["a"] == 32
   put empty into x -- what impact on w?
   add 1 to w["a"]  -- we think this is now at least 6 -- adds 1 to x["b"]["a"], so now == 1
   put the keys of x -- outputs "b"
   put lf & w["a"] after message -- outputs x["b"]["a"] == 1
   put "z," before s --what impact on z, and zz?
   put "m" into char 1 of z -- same as into char 1 of item 3 of s -- so s is now "z,a,m,c,d"
   put space & z after message -- outputs item 3 of s == "m"
   put space & zz after message -- outputs char 5 of s == "m"
   return sqrt( w["a"]-6 )  -- error?  range checking was done -- attempts to do sqrt(x["b"]["a"] - 6) == sqrt(-5)
end process
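The trace can be checked mechanically with a small Python model of the proposed semantics. Everything here is an assumption made for the sketch: globals live in a `store` dictionary, locations are plain tuples, items are comma-delimited, an empty value counts as 0 in arithmetic, and a list stands in for the message box.

```python
store = {}      # globals, keyed by name
message = []    # stands in for the message box

def get_el(name, keys):
    node = store.get(name, "")
    for k in keys:
        node = node.get(k, "") if isinstance(node, dict) else ""
    return node

def put_el(name, keys, value):
    if not isinstance(store.get(name), dict):
        store[name] = {}
    node = store[name]
    for k in keys[:-1]:
        if not isinstance(node.get(k), dict):
            node[k] = {}
        node = node[k]
    node[keys[-1]] = value

def get_item(name, i):
    return store[name].split(",")[i - 1]

def put_item_char(name, i, c, ch):   # put ch into char c of item i of <name>
    items = store[name].split(",")
    item = items[i - 1]
    items[i - 1] = item[:c - 1] + ch + item[c:]
    store[name] = ",".join(items)

def get_char(name, c):
    return store[name][c - 1]

def num(v):                          # empty counts as 0 in arithmetic
    return 0 if v == "" else v

def process(w, z, zz):               # w, z, zz are locations, not values
    w_a = (w[0], w[1] + ["a"])       # location of w["a"]
    if num(get_el(*w_a)) < 5:
        return "bad w"
    store["x"] = ""                               # put empty into x
    put_el(*w_a, num(get_el(*w_a)) + 1)           # add 1 to w["a"]
    message.append(",".join(store["x"].keys()))   # the keys of x
    message.append(get_el(*w_a))
    store["s"] = "z," + store["s"]                # put "z," before s
    put_item_char(z[0], z[1], 1, "m")             # put "m" into char 1 of z
    message.append(get_item(*z))
    message.append(get_char(*zz))
    return get_el(*w_a) - 6          # the argument sqrt would receive

store["x"] = {"b": {"a": 32}}
store["s"] = "a,b,c,d"
sqrt_argument = process(("x", ["b"]), ("s", 3), ("s", 5))
```

Under this model `s` ends up as "z,a,m,c,d", the message box receives "b", 1, "m" and "m", and the value handed to sqrt is -5.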

Now, while at first sight this might seem ghastly, you can actually get exactly the same effect now:

Code:

global a
put 5 into a
process a

on process @x
  global a
  if x < 5 then exit process
  subtract 10 from a
  return sqrt(x) -- oops, error - but we checked!
end process

So the real problem here is that you have to be careful with reference parameters.

DarScott wrote: If the chunk is recalculated at each use, there can be a performance hit

It is true that in the case of locations that access array elements or chunks of strings you would have to traverse the path, or recalculate the location of the items/lines/words etc. in the string, each time the reference parameter was used or changed. However, this is only true until some sort of optimization is employed. If the compiler can prove that the base of a reference parameter cannot be altered in a handler, then it can optimize away this cost. In the simplest case, if a handler neither calls anything with side-effects (that could affect the reference parameter) nor attempts to mutate variables that could be the base of a reference parameter, then there is no issue, e.g.

Code:

on doSomething @x
  repeat 100 times
    add 1 to x
  end repeat
end doSomething

I realize that is a slightly trite example, but ultimately the problem of making reference parameters efficient is the same that optimizing C compilers face in making indirection efficient - you do an aggressive alias analysis, and work from there.

Ultimately, a scripter can always optimize their use of reference parameters themselves too - by only using them in an in, out or in-out style.

By the way, just to clarify something - if something like 'item <complex expression> of tString' is passed by reference, the <complex expression> gets evaluated in the caller, so the location that is passed to the callee is a pair (value of <complex expression>, tString) - i.e. the expressions used to construct the chunk expression in the caller are *not* re-evaluated each time.
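A tiny Python model of that last point, with invented names throughout: the index expression runs once while the caller builds the location, even though the callee uses the reference several times. Locating the item inside the string is still repeated on each use in this sketch.

```python
evaluations = []                 # records each time the index expression runs

def complex_expression():
    evaluations.append(1)
    return 2                     # pretend this was expensive to compute

variables = {"tString": "a,b,c"}

def item_location(name, index):
    """Hypothetical location: (base variable name, already-evaluated index)."""
    return (name, index)

def callee(location):
    name, index = location
    seen = []
    for _ in range(3):           # the reference parameter is used three times
        seen.append(variables[name].split(",")[index - 1])
    return seen

# The caller evaluates the index once, while building the location.
result = callee(item_location("tString", complex_expression()))
```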

LCMark · Post by **LCMark** » Fri Jun 21, 2013 7:58 pm

DarScott wrote: I don't know why I'm making a big deal of this; I'm sure you guys will do something good.

I'm flattered by your confidence in us.

I've been thinking about this stuff in detail for a long time and have been down many incorrect paths (at least thought experiment wise). It's always good to discuss them in depth... Making a mistake in an end-user app is bad, but can (generally) be easily fixed... Making a mistake in language design tends to end up being an irrevocable disaster.

mwieder · Post by **mwieder** » Fri Jun 21, 2013 8:46 pm

LCMark wrote: Also copy-on-write which removes the need for using @ (reference parameters) for efficiency improvements.

I hope I'm reading that as "removes the need... if you're only using reference parameters for efficiency improvements" because it doesn't remove the need for reference parameters in a general sense, and a copy-on-write will be very different for a reference parameter case.
