L-System Studio is an environment for creating and exploring L-Systems. It was built with Elm by Hunor Geréd. I learnt about the project when he decided to submit it to Built with Elm back in April 2023. It's a fun application for exploring L-Systems but it can be quite resource intensive. My browser has even crashed on several occasions. Definitely not a good look for Elm. Could the performance be improved whilst sticking with Elm, or do we have to resort to mutable generation in JavaScript? That was the nagging question.
What's the problem?
The first thing that caught my attention was the huge number of SVG nodes being generated in the DOM. Those SVG nodes are created by looping over the array of characters returned by LSys.generateSequence. The mapping isn't one-to-one, but the number of SVG nodes added to the DOM can be very close to the number of characters in the array. And that array gets really large really quickly.
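To get a feel for the growth, here's a quick back-of-the-envelope count. This is a sketch I wrote for this post, not part of the app, and `expectedLength` is a name I made up. The rule F → F+F-F-FF+F+F-F turns each F into 8 F's and 6 terminal symbols, and the axiom F+F+F+F starts with 4 F's and 3 terminals.

```elm
-- Expected sequence length after n iterations, assuming the rule
-- F -> "F+F-F-FF+F+F-F" (8 F's, 6 terminals) and the axiom
-- "F+F+F+F" (4 F's, 3 terminals). Terminals have no rule, so:
--   fs' = 8 * fs
--   terminals' = terminals + 6 * fs
expectedLength : Int -> Int
expectedLength n =
    let
        go i fs terminals =
            if i <= 0 then
                fs + terminals

            else
                go (i - 1) (8 * fs) (terminals + 6 * fs)
    in
    go n 4 3

-- expectedLength 4 == 30427 and expectedLength 5 == 243419,
-- matching the REPL session.
```

So the sequence grows by a factor of roughly 8 per iteration, which is why things fall over so quickly.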
> import LSys
> import Array
> LSys.generateSequence 4 "F+F+F+F" [('F', String.toList "F+F-F-FF+F+F-F")] |> Array.length
30427 : Int
> LSys.generateSequence 5 "F+F+F+F" [('F', String.toList "F+F-F-FF+F+F-F")] |> Array.length
243419 : Int
> LSys.generateSequence 6 "F+F+F+F" [('F', String.toList "F+F-F-FF+F+F-F")] |> Array.length
1947355 : Int
> LSys.generateSequence 7 "F+F+F+F" [('F', String.toList "F+F-F-FF+F+F-F")] |> Array.length
<--- Last few GCs --->
[17075:0x2724f250] 11351 ms: Scavenge 2022.8 (2062.0) -> 2022.6 (2073.0) MB, 3.52 / 0.00 ms (average mu = 0.173, current mu = 0.111) allocation failure;
[17075:0x2724f250] 11359 ms: Scavenge 2029.6 (2073.0) -> 2030.4 (2073.7) MB, 5.33 / 0.00 ms (average mu = 0.173, current mu = 0.111) allocation failure;
[17075:0x2724f250] 11668 ms: Scavenge 2030.4 (2073.7) -> 2029.6 (2096.7) MB, 309.40 / 0.00 ms (average mu = 0.173, current mu = 0.111) allocation failure;
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----
1: 0xab10e4 node::OOMErrorHandler(char const*, v8::OOMDetails const&) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
2: 0xe6de60 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
3: 0xe6e244 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
4: 0x109d907 [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
5: 0x10b6479 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
6: 0x108f0e7 v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
7: 0x108fd24 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
8: 0x106f04e v8::internal::Factory::NewFillerObject(int, v8::internal::AllocationAlignment, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
9: 0x14d8d70 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/nix/store/dcdc33kdjdhjnzg6rkmd0cx4kpwl8cac-nodejs-20.17.0/bin/node]
10: 0x719fe2699ef6
N.B. I used my fork of the repository which I shared as an example in GitHub Actions, Devbox, and Elm.
I can't even check the length of the array after 7 iterations because the algorithm used for generation uses too much memory.
How does LSys.generateSequence work?
generateSequence : Float -> String -> List ( Char, List Char ) -> Array Char
generateSequence iterations axiom rules =
    let
        expandSymbol symbol =
            case Extra.find (\( s, _ ) -> s == symbol) rules of
                Just ( _, replacement ) ->
                    Array.fromList replacement

                Nothing ->
                    Array.fromList [ symbol ]
    in
    List.foldl
        (\_ seq ->
            Array.map expandSymbol seq
                |> Array.toList
                |> List.map Array.toList
                |> List.concat
                |> Array.fromList
        )
        (String.toList axiom |> Array.fromList)
        (List.range 1 (round iterations))
- iterations - The number of iterations
- axiom - The initial string
- rules - The rewrite rules
Recall List.foldl : (a -> b -> b) -> b -> List a -> b.
For each character in the array we apply expandSymbol to it. expandSymbol attempts to find a rewrite string (represented as a list of characters) for a given character. A new array of lists is built which is eventually transformed back into an array of characters. A lot of intermediate lists and arrays are built up along the way. This is done round iterations times over arrays of increasing length.
expandSymbol searches an association list multiple times for the same symbol, and each time it creates a new array from the replacement list of characters it finds.
It seems then that a poor choice of data structures and algorithms has led to performance and memory issues.
Thoughts after the investigation
- Why is iterations even a Float?
- rules would be easier to enter via elm repl if it had the type List ( Char, String ).
- rules should be processed into a data structure that has good lookup performance.
- The replacements returned by a rule lookup should be created exactly once.
- An array or a list is not the right data structure for storing the characters generated by a given L-System, since when stored in memory it takes up too much space. We need to be able to represent the whole value that could be generated but only generate what's needed at any given point in time. Could lazy evaluation work here?
- Maybe I could repurpose Logic.Stream from my The Reasoned Schemer (2nd Edition) project to use as the data structure for the in-memory representation of the entire L-System.
The new API
This is the new API I came up with:
module LSystem exposing (generate)
generate : Int -> List ( Char, String ) -> String -> Sequence Char
module Rules exposing
    ( Rules
      -- Create
    , fromList
      -- Query
    , lookup
    )
type Rules
fromList : List ( Char, String ) -> Rules
lookup : Char -> Rules -> Sequence Char
module Sequence exposing
    ( Sequence
      -- Create
    , singleton, fromString
      -- Combine
    , concat, concatMap
      -- Query
    , length
      -- Convert
    , uncons, toList
    )
type Sequence a
singleton : a -> Sequence a
fromString : String -> Sequence Char
concat : Sequence a -> Sequence a -> Sequence a
concatMap : (a -> Sequence b) -> Sequence a -> Sequence b
length : Sequence a -> Int
uncons : Sequence a -> Maybe ( a, Sequence a )
toList : Sequence a -> List a
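To make the API concrete, here's a tiny worked example, assuming the modules above are in scope (the `example` binding is just for illustration):

```elm
import LSystem
import Sequence

-- Expand the toy rule F -> "F+F" once, starting from the axiom "F":
-- every 'F' is rewritten and every other character is left alone.
example : List Char
example =
    LSystem.generate 1 [ ( 'F', "F+F" ) ] "F"
        |> Sequence.toList

-- example == [ 'F', '+', 'F' ]
```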
LSystem
module LSystem exposing (generate)

import Rules exposing (Rules)
import Sequence exposing (Sequence)

generate : Int -> List ( Char, String ) -> String -> Sequence Char
generate n rules axiom =
    generateHelper n (Rules.fromList rules) (Sequence.fromString axiom)

generateHelper : Int -> Rules -> Sequence Char -> Sequence Char
generateHelper n rules current =
    if n <= 0 then
        current

    else
        let
            next =
                Sequence.concatMap (flip Rules.lookup rules) current
        in
        generateHelper (n - 1) rules next

flip : (a -> b -> c) -> b -> a -> c
flip f b a =
    f a b
The number of iterations, n, is represented as an Int. The rules input has changed to List ( Char, String ), which should make it easier to enter in the elm repl or in tests, but that's not the way rules are used internally. They are processed, via Rules.fromList rules, into a data structure that has better lookup performance than an association list.
The looping is done by generateHelper, which is implemented tail-recursively to ensure we don't blow the stack. That may not have been necessary since n doesn't need to be very large; for example, 50 iterations is probably overdoing it. Either way, the implementation can handle millions of iterations now.
The workhorse of the algorithm is Sequence.concatMap. It is used to transform the sequence from one iteration to the next.
Rules
module Rules exposing (Rules, fromList, lookup)

import Dict exposing (Dict)
import Sequence exposing (Sequence)

type Rules
    = Rules (Dict Char (Sequence Char))

fromList : List ( Char, String ) -> Rules
fromList =
    Rules << Dict.fromList << List.map (Tuple.mapSecond Sequence.fromString)

lookup : Char -> Rules -> Sequence Char
lookup ch (Rules table) =
    Dict.get ch table
        |> Maybe.withDefault (Sequence.singleton ch)
Rules is an opaque type that is implemented as a dictionary. The keys are characters and the values are preprocessed, via List.map (Tuple.mapSecond Sequence.fromString), into sequences that get reused each time a key is looked up. lookup finds the expansion for a given character.
N.B. The only issue I have with lookup is that for missing keys it creates a new singleton sequence each time. Any ideas to avoid doing that would be appreciated.
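One possible direction, sketched here rather than proposed as the fix: pre-insert a shared singleton entry for every symbol that appears in a replacement string but has no rule of its own, so repeated misses for those symbols reuse one sequence. `fromListSeeded` is a name I made up; it would live inside the Rules module since it uses the Rules constructor. Symbols that only ever appear in the axiom would still fall through to the default.

```elm
fromListSeeded : List ( Char, String ) -> Rules
fromListSeeded rules =
    let
        table =
            Dict.fromList (List.map (Tuple.mapSecond Sequence.fromString) rules)

        -- every character mentioned in any replacement string
        mentioned =
            List.concatMap (Tuple.second >> String.toList) rules

        -- give rule-less characters a shared identity entry
        seed ch acc =
            if Dict.member ch acc then
                acc

            else
                Dict.insert ch (Sequence.singleton ch) acc
    in
    Rules (List.foldl seed table mentioned)
```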
Sequence
module Sequence exposing
    ( Sequence
    , singleton, fromString
    , concat, concatMap
    , length
    , uncons, toList
    )

type Sequence a
    = Empty
    | Cons a (Sequence a)
    | Thunk (() -> Sequence a)
A sequence is a combination of a list and a lazy list, an idea I stole from The Reasoned Schemer (2nd Edition).
If you remove the Thunk constructor then you have a list.
type List a
    = Empty
    | Cons a (List a)
If you combine the Cons and Thunk constructors then you have a lazy list.
type LazyList a
    = Empty
    | Cons a (() -> LazyList a)
Create
singleton : a -> Sequence a
singleton x =
    Cons x Empty

fromString : String -> Sequence Char
fromString s =
    Thunk (\_ -> String.foldr Cons Empty s)
Combine
concat : Sequence a -> Sequence a -> Sequence a
concat a b =
    case a of
        Empty ->
            b

        Cons head tail ->
            Cons head (concat tail b)

        Thunk t ->
            Thunk (\_ -> concat (t ()) b)

concatMap : (a -> Sequence b) -> Sequence a -> Sequence b
concatMap f s =
    case s of
        Empty ->
            Empty

        Cons head tail ->
            concat (f head) (concatMap f tail)

        Thunk t ->
            Thunk (\_ -> concatMap f (t ()))
The major difference between this implementation and Logic.Stream is that Sequence.concat, which corresponds to Logic.Stream.append, doesn't interleave between sequences.
Query
length : Sequence a -> Int
length =
    lengthHelper 0

lengthHelper : Int -> Sequence a -> Int
lengthHelper n s =
    case s of
        Empty ->
            n

        Cons _ tail ->
            lengthHelper (n + 1) tail

        Thunk t ->
            lengthHelper n (t ())
Sequences can get very long after relatively few iterations, longer than an Int can even represent. So this length is really only useful for debugging purposes, and it has to be implemented tail-recursively to ensure it gets optimized into a loop. Otherwise, it won't even be helpful for debugging.
Convert
uncons : Sequence a -> Maybe ( a, Sequence a )
uncons s =
    case s of
        Empty ->
            Nothing

        Cons head tail ->
            Just ( head, tail )

        Thunk t ->
            uncons (t ())

toList : Sequence a -> List a
toList s =
    case s of
        Empty ->
            []

        Cons head tail ->
            head :: toList tail

        Thunk t ->
            toList (t ())
uncons allows you to process the sequence one character at a time. It's not used for generation but my plan is to use it for rendering.
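For instance, a take built on top of uncons would force only as much of the sequence as a renderer asks for. This is a sketch of a helper that isn't part of the module, written as if it lived inside Sequence:

```elm
-- Force and collect only the first n elements of a sequence.
-- Everything past the n-th element stays an unevaluated Thunk.
take : Int -> Sequence a -> List a
take n s =
    if n <= 0 then
        []

    else
        case uncons s of
            Nothing ->
                []

            Just ( head, tail ) ->
                head :: take (n - 1) tail
```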
toList is only useful for debugging purposes on small sequences. I didn't even bother to make it tail-recursive.
Results
> import LSystem
> import Sequence
> Sequence.length <| LSystem.generate 6 [('F', "F+F-F-FF+F+F-F")] "F+F+F+F"
1947355 : Int
> Sequence.length <| LSystem.generate 7 [('F', "F+F-F-FF+F+F-F")] "F+F+F+F"
15578843 : Int
> Sequence.length <| LSystem.generate 8 [('F', "F+F-F-FF+F+F-F")] "F+F+F+F"
124630747 : Int
> Sequence.length <| LSystem.generate 9 [('F', "F+F-F-FF+F+F-F")] "F+F+F+F"
997045979 : Int
> Sequence.length <| LSystem.generate 10 [('F', "F+F-F-FF+F+F-F")] "F+F+F+F"
7976367835 : Int
> Sequence.uncons <| LSystem.generate 100 [('F', "F+F-F-FF+F+F-F")] "F+F+F+F"
Just ('F',Cons '+' (Cons 'F' (Cons '-' (Cons 'F' (Cons '-' (Cons 'F' (Cons 'F' (Cons '+' (Cons 'F' (Cons '+' (Cons 'F' (Cons '-' (Cons 'F' (Cons '+' (Thunk <function>)))))))))))))))
Iteration 10 of the L-System we started off with has over 7 billion characters.
We can now generate and store the result of 100 iterations of the L-System and process it one character at a time.
How are we going to render the L-System if we can't store it in a list or put billions of SVG nodes in the browser? These are questions best left for another day.
Interesting Related Resources
- Why Functional Programming Matters by John Hughes at Functional Conf 2016
  - Word-at-a-time programming vs whole value programming
  - Lazy evaluation implied whole value programming was possible
  - John Hughes was interested in separating the decision of how much to compute from the code that does the computing
- Lazy Evaluation
- Recursive Programming Techniques
  - Section 3.10 - Sequences, Coroutines, and Streams
Subscribe to my newsletter
If you're interested in improving your skills with Elm then I invite you to subscribe to my newsletter, Elm with Dwayne. To learn more about it, please read this announcement.