Speeding up text operations on Win
Posted: Fri Aug 27, 2021 11:52 am
Hi,
it has been reported repeatedly that operations on large amounts of text are unbearably slow in recent versions of LC on Windows, while macOS does not suffer nearly as much.
Now there is a thread on the mailing list (where the enlightened ones enjoy their neolithic quote hell undisturbed by us unwashed masses) targeting this, and it actually yielded interesting results. So for the education of "the rest of us" I did a few quick tests and will provide some code samples here.
A very common piece of code:
Code:
repeat for each line L in myData
   put L & CR after myVar
end repeat
delete char -1 of myVar
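Usually we would have something here that filters or changes the data; for this example we omit that and just copy myData, line by line, into myVar.
myData here is 22,470,000 bytes in 502,000 lines (rounded).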
Running this on LC 6.7.10/Win takes 559 millisecs.
Running this on LC 9.6.3/Win takes 25,724 millisecs. Ouch.
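For anyone who wants to reproduce such numbers on their own machine, here is a minimal sketch of how the loop could be timed. The function name timePlainCopy and its parameter are made up for this example, and it assumes the test data is already loaded into a variable; it is not necessarily how the figures above were measured.

```livecode
-- Hypothetical helper (not from the mailing list thread):
-- times the plain line-by-line copy of pData, returns elapsed millisecs
function timePlainCopy pData
   local tVar, tStart
   put the milliseconds into tStart
   repeat for each line tLine in pData
      put tLine & CR after tVar
   end repeat
   delete char -1 of tVar
   return the milliseconds - tStart
end timePlainCopy
```

Called from a handler that holds the data, for instance as "put timePlainCopy(myData) && "millisecs"", it writes the elapsed time to the message box.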
Mark Waddingham replied and, as usual, found a scapegoat ("the windows heap manager not being very good at continually re-extending a buffer"), but he also showed a workaround:
Code:
put 1000000 into myBufferSize
repeat for each line L in myData
   put L & CR after myBuffer
   -- flush the small buffer into myVar once it exceeds the limit
   if (the number of codeunits in myBuffer > myBufferSize) then
      put myBuffer after myVar
      delete codeunit 1 to -1 of myBuffer
   end if
end repeat
-- append whatever is left in the buffer, then drop the trailing CR
put myBuffer after myVar
delete codeunit -1 of myVar
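This reduces the time on LC 9.6.3/Win to a tolerable 1,200 millisecs. Only about twice as slow as 6.7.10 ;-)
Interesting here: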
- "the number of codeunits in myBuffer" is much, much faster than the classic "len(myBuffer)"!
- The size of the buffer is important - in my case the "sweet spot" is ~ 1MB.
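Since the best buffer size probably depends on the data and the machine, it may be worth measuring a few sizes yourself. Below is a rough sketch of how that could look; the handler names timeBufferedCopy and compareBufferSizes and the list of candidate sizes are my own assumptions, only the buffering loop itself follows the workaround above.

```livecode
-- Hypothetical helper: runs the buffered copy with a given buffer size
-- and returns the elapsed millisecs
function timeBufferedCopy pData, pBufferSize
   local tVar, tBuffer, tStart
   put the milliseconds into tStart
   repeat for each line tLine in pData
      put tLine & CR after tBuffer
      if the number of codeunits in tBuffer > pBufferSize then
         put tBuffer after tVar
         delete codeunit 1 to -1 of tBuffer
      end if
   end repeat
   put tBuffer after tVar
   delete codeunit -1 of tVar
   return the milliseconds - tStart
end timeBufferedCopy

-- Hypothetical driver: tries a few arbitrary candidate sizes and
-- puts one "size <tab> millisecs" line per size into the message box
command compareBufferSizes pData
   local tReport
   repeat for each item tSize in "10000,100000,1000000,10000000"
      put tSize & tab & timeBufferedCopy(pData, tSize) & CR after tReport
   end repeat
   put tReport
end compareBufferSizes
```

Calling "compareBufferSizes myData" from a handler that holds the data should show which size works best on a given setup; the timings will vary from run to run, so repeat the test a few times before trusting them.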
Have fun!