From fc93da8f3c51db03aa5b7bd0e0fb2f41e3334c44 Mon Sep 17 00:00:00 2001
From: Romain Goyet
Date: Thu, 22 Aug 2019 14:49:18 +0200
Subject: [PATCH] [docs] Document Epsilon alongside its source code
---
docs/architecture.svg | 86 +++++++++++++++++
docs/index.md | 98 ++++++++++++++++++++
docs/ion/index.md | 40 ++++++++
docs/poincare/baseline.svg | 46 ++++++++++
docs/poincare/beautify.svg | 103 +++++++++++++++++++++
docs/poincare/expression_tree.svg | 64 +++++++++++++
docs/poincare/index.md | 138 ++++++++++++++++++++++++++++
docs/poincare/layout.svg | 135 +++++++++++++++++++++++++++
docs/poincare/order.svg | 68 ++++++++++++++
docs/poincare/reduce.svg | 147 ++++++++++++++++++++++++++++++
docs/poincare/rtti.svg | 117 ++++++++++++++++++++++++
docs/poincare/simplify.svg | 123 +++++++++++++++++++++++++
12 files changed, 1165 insertions(+)
create mode 100644 docs/architecture.svg
create mode 100644 docs/index.md
create mode 100644 docs/ion/index.md
create mode 100644 docs/poincare/baseline.svg
create mode 100644 docs/poincare/beautify.svg
create mode 100644 docs/poincare/expression_tree.svg
create mode 100644 docs/poincare/index.md
create mode 100644 docs/poincare/layout.svg
create mode 100644 docs/poincare/order.svg
create mode 100644 docs/poincare/reduce.svg
create mode 100644 docs/poincare/rtti.svg
create mode 100644 docs/poincare/simplify.svg
diff --git a/docs/architecture.svg b/docs/architecture.svg
new file mode 100644
index 000000000..8a848854e
--- /dev/null
+++ b/docs/architecture.svg
@@ -0,0 +1,86 @@
+
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 000000000..3f1ef556a
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,98 @@
+---
+title: Software Engineering
+layout: breadcrumb
+breadcrumb: Software
+---
+# Epsilon
+
+Epsilon is a high-performance graphing calculator operating system. It includes eight apps that cover the high school mathematics curriculum.
+
+## Build your own version of Epsilon
+
+First of all, you should learn [how to build and run](build/) your very own version of Epsilon. Note that you don't need an actual NumWorks calculator to do this. Indeed, Epsilon can be compiled as a standalone application that will run on your computer.
+
+## Discover Epsilon's architecture
+
+Epsilon's code is comprehensive, as it goes from a keyboard driver up to a math engine. Epsilon is made out of five main bricks: [Ion](<%= p "ion" %>), Kandinsky, [Poincaré](<%= p "poincare" %>), Escher, and Apps.
+
+
+
+### Ion — Hardware abstraction layer
+
+Ion is the underlying library that [abstracts all the hardware operations](ion/). It performs tasks such as setting the backlight intensity, configuring the LED or setting pixel colors on the screen. It also answers questions such as "which keys are pressed?" and "what is the battery voltage?".
+
+### Kandinsky — Graphics engine
+
+Kandinsky is in charge of all the drawing. It performs operations such as "draw that text at this location" or "fill that rectangle in blue".
+
+### Escher — GUI toolkit
+
+Escher is our GUI toolkit. It provides functionalities such as "draw a button" or "place three tabs named Foo, Bar and Baz". It asks Ion for events and uses Kandinsky to draw the actual user interface.
+
+### Poincare — Mathematics engine
+
+Poincare is in charge of parsing, laying out and evaluating mathematical expressions. You feed it some text such as `sin(root(2/3,3))` and it will draw the expression as in a textbook and tell you that this expression is approximately equal to 0.01524 (when angles are in degrees).
+
+### Apps — Applications
+
+Last but not least, each app makes heavy use of both Escher and Poincare to display a nice user interface and to perform mathematical computation.
+
+## Read our coding guidelines
+
+We're listing here all the topics you should be familiar with before being able to efficiently contribute to the project. Those are not hard requirements, but we believe it would be more efficient if you got familiar with the following concepts.
+
+### Using C++
+
+The choice of a programming language is a controversial topic. Not all of them can be used to write an operating system, but quite a few can. We settled on C++ for several reasons:
+
+- It is a [system](https://en.wikipedia.org/wiki/System_programming_language) programming language, which is something we need since we have to write some low-level code.
+- It has excellent tooling: several extremely high-quality compilers are available.
+- It is used for several high-profile projects: LLVM, WebKit, MySQL, Photoshop, etc. This ensures a strong ecosystem of tools, code and documentation.
+- It easily allows Object-Oriented Programming, which is a convenient abstraction.
+
+
+Of course, knowing a tool means knowing its limits. C++ is not without flaws:
+- It *is* a complicated language. The C++11 specification is 1300 pages long.
+- It allows for a lot of abstractions, which is a double-edged sword. It can allow for some very tricky code, and it's very easy to have complex operations being run without noticing.
+
+If you want to contribute to Epsilon, you'll need to learn some C++.
+
+### Working with limited memory
+
+Our device has 256 KB of RAM. That's very little memory by today's standards. That being said, by writing code carefully, a great deal can be achieved in that space. After all, that's 64 times more memory than the computer of the Apollo mission!
+
+#### Stack memory
+
+The stack memory is possibly the most used area of memory. It contains all local variables, and keeps track of the context of code execution. It can overflow when function calls are deeply nested and the reserved space is too small. We reserved 32 KB for the stack.
+
+#### Heap memory
+
+Unfortunately, local variables can't answer all use cases, and sometimes one needs to allocate memory that lives longer than a function call. This is traditionally done by using a pair of *malloc* / *free* functions.
+
+This raises a lot of potential problems that can trigger unpredictable dynamic behaviors:
+
+
+
+- Memory leaks: if one forgets to free memory that is no longer used, the system will eventually run out of available memory.
+- Memory fragmentation: memory allocation has to be contiguous, so the allocation algorithm has to use a smart heuristic to ensure that it will not fragment its allocated space too much.
+
+
+Some automatic memory management solutions do exist (garbage collection, smart pointers), but they all come with a cost. We decided to manually manage dynamic memory, but to use it as sparingly as possible.
+
+### Writing code that runs on the bare metal
+
+Unlike code that runs inside of an operating system (pretty much everything these days), an embedded firmware doesn't make use of virtual memory.
+
+In practice, this means that the firmware will need to know in advance how the memory space is laid out. In other words, it will need to answer these questions:
+
+- Where will the stack be located in memory?
+- What about the heap and global variables?
+- Where will we store read-only variables?
+- Where will the code live in memory?
+
+The firmware will also need to take special care of the system initialization. There is no such thing as a "main" function on a firmware. Instead, on Cortex-M4 devices, after reset the CPU loads the initial stack pointer from address 0x00000000 and then jumps to the address stored at address 0x00000004 (these happen to be the first bytes of flash memory). So if the second word of your firmware is 0x12345679, code execution will start at address 0x12345678 (the lowest bit only flags Thumb code).
+
+Enforcing such a careful memory layout would be an impossible job without the proper tool. Fortunately, embedded linkers can be scripted and allow this kind of tailor-made configuration. You'll find Epsilon's linker script in "ion/src/device/boot/flash.ld" - it is heavily commented and should be self-explanatory.
+
+That being said, there are additional things the OS usually takes care of which we need to do ourselves: for example, initializing global variables to zero. This is done in the "ion/src/device/boot/rt0.cpp" file, which is worth reading too.
diff --git a/docs/ion/index.md b/docs/ion/index.md
new file mode 100644
index 000000000..7d1051df5
--- /dev/null
+++ b/docs/ion/index.md
@@ -0,0 +1,40 @@
+---
+title: Ion — Firmware architecture — Software Engineering
+layout: breadcrumb
+breadcrumb: Ion
+---
+# Ion
+
+## Overview
+
+Ion is a library of functions that abstract interacting with the hardware. For example, Ion exposes calls such as `serialNumber()` and `LED::setColor()`. Code in Epsilon always uses Ion functions to perform operations that deal directly with the hardware — be it reading a key, setting a pixel on the screen or setting the color of the LED.
+
+By providing multiple implementations of the Ion functions, we can get Epsilon to run on multiple platforms. For example, we have an implementation of `Display::pushRect()` that knows how to drive the LCD panel of an actual calculator, and another implementation of the same `Display::pushRect()` function that knows how to display content in a web browser. This way, the rest of the Epsilon code can run unmodified either on a calculator or in a web browser.
+
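A minimal sketch of this idea, assuming a simplified, hypothetical signature for `pushRect()` (the real Ion declarations may differ): the function is declared once, and each platform compiles its own definition. Here the only definition is an illustrative "host" one that writes into an in-memory framebuffer; the `Host` namespace, the `Rect` struct and the 320x240 size are assumptions made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for an Ion display interface.
namespace Display {

struct Rect { int x, y, width, height; };

// Declared once; every platform provides its own definition.
void pushRect(Rect r, const uint16_t * pixels);

}

// Illustrative host-side state: an in-memory framebuffer (size is an assumption).
namespace Host {
constexpr int k_width = 320;
constexpr int k_height = 240;
std::vector<uint16_t> framebuffer(k_width * k_height, 0);
}

// One possible definition. A device build would instead compile a file whose
// pushRect() sends the pixels to the LCD controller.
// No bounds checking: the rectangle is assumed to fit inside the framebuffer.
void Display::pushRect(Rect r, const uint16_t * pixels) {
  for (int j = 0; j < r.height; j++) {
    for (int i = 0; i < r.width; i++) {
      Host::framebuffer[(r.y + j) * Host::k_width + (r.x + i)] = pixels[j * r.width + i];
    }
  }
}
```

Callers only ever see the declaration, so swapping the definition at build time changes the platform without touching the calling code.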
+## Device
+
+This is the reference platform corresponding to the actual device. To really understand what the code is doing, you'll need to refer to our Electrical Engineering pages. Among other things, Ion is responsible for handling the boot process and the memory layout of the code on the device.
+
+### Boot
+
+On boot, the Cortex core interprets the beginning of Flash memory as an array of addresses, each having a specific meaning. For example, the first value gives the address of the stack and the second one the address the processor jumps to right after reset. This list of addresses is called the ISR table (also known as the vector table).
+
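To make the first two entries concrete, here is a small host-side sketch that decodes them from the first eight bytes of a firmware image. The names `ResetVectors`, `readWord` and `decodeResetVectors` are hypothetical helpers written for this illustration and are not part of Ion; the only assumption about the hardware is that words are stored little-endian, as on Cortex-M.

```cpp
#include <cstddef>
#include <cstdint>

// The first two entries of the table, as described above.
struct ResetVectors {
  uint32_t initialStackPointer; // word at offset 0
  uint32_t resetHandler;        // word at offset 4 (lowest bit set for Thumb code)
};

// Read a little-endian 32-bit word from a raw firmware image.
inline uint32_t readWord(const uint8_t * image, size_t offset) {
  return static_cast<uint32_t>(image[offset])
       | static_cast<uint32_t>(image[offset + 1]) << 8
       | static_cast<uint32_t>(image[offset + 2]) << 16
       | static_cast<uint32_t>(image[offset + 3]) << 24;
}

// Decode what the core loads at reset from the beginning of Flash.
inline ResetVectors decodeResetVectors(const uint8_t * image) {
  return { readWord(image, 0), readWord(image, 4) };
}
```

Running this over the start of a built image is a quick sanity check that the stack pointer lands in RAM and the reset handler lands in Flash.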
+### Memory layout
+
+As we saw in the previous paragraph, the MCU has a specific memory layout (for example, Flash starts at address 0x08000000) and expects certain values at certain addresses. To ensure the firmware is laid out in memory exactly how the processor expects it, we use a custom linker script.
+
+## Simulator
+
+The simulator platform implements Ion calls using the FLTK library. The result is a native GUI program that can run under a variety of operating systems such as Windows, macOS or most Linux distributions.
+
+It's very practical to code using the simulator, but one has to pay attention to the differences from an actual device: it'll be significantly faster, and will have a lot more memory. So code written using the simulator should always be thoroughly tested on an actual device.
+
+## Emscripten
+
+The emscripten platform implements Ion calls for a Web browser. This lets us build a version of Epsilon that can run directly in a browser, as shown in our online simulator. The C++ code is transpiled to JavaScript using Emscripten, and then packaged in a webpage.
+
+Building on Emscripten takes quite a lot of time so you will most likely not want to use it for development purposes. But obviously it's a very neat feature for end users who can give the calculator a spin straight from their browser.
+
+## Blackbox
+
+The blackbox platform may seem odd at first sight because it implements most Ion calls as functions that do nothing. In practice, a blackbox build results in an executable that runs on a regular PC but does virtually no I/O. It is in fact very useful for measuring and instrumenting the code. We use it for fuzzing and running the test suite.
diff --git a/docs/poincare/baseline.svg b/docs/poincare/baseline.svg
new file mode 100644
index 000000000..78af55cf7
--- /dev/null
+++ b/docs/poincare/baseline.svg
@@ -0,0 +1,46 @@
+
diff --git a/docs/poincare/beautify.svg b/docs/poincare/beautify.svg
new file mode 100644
index 000000000..e0e5e89f9
--- /dev/null
+++ b/docs/poincare/beautify.svg
@@ -0,0 +1,103 @@
+
diff --git a/docs/poincare/expression_tree.svg b/docs/poincare/expression_tree.svg
new file mode 100644
index 000000000..0fec012c9
--- /dev/null
+++ b/docs/poincare/expression_tree.svg
@@ -0,0 +1,64 @@
+
diff --git a/docs/poincare/index.md b/docs/poincare/index.md
new file mode 100644
index 000000000..63ed63ca3
--- /dev/null
+++ b/docs/poincare/index.md
@@ -0,0 +1,138 @@
+---
+title: Poincare — Firmware architecture — Software Engineering
+layout: breadcrumb
+breadcrumb: Poincare
+katex: true
+---
+# Poincaré
+
+## Structure
+
+Poincare takes text input such as `1+2*3` and turns it into a tree structure that can be simplified, approximated and pretty-printed.
+
+Each node of a tree represents either an operator or a value. All nodes have a type (`Type::Addition`, `Type::Multiplication`...) and some also store a value (e.g. `Type::Rational`).
+
+{:class="img-right"}
+According to their types, expressions are childless (`Type::Rational`) or store pointers to their children (we call those children operands). To ease tree traversal, each node also keeps a pointer to its parent: that information is somewhat redundant but makes dealing with the expression tree much easier. `Multiplication` and `Addition` are the only types that can hold an arbitrary number of operands. Other expressions have a fixed number of operands: for instance, an `AbsoluteValue` will only ever have one child.
+
+## RTTI: Run-time type information
+
+The type of a C++ object is used by the compiler to generate a vtable. A vtable is a lookup table that tells which function to call for a given object class, hence creating polymorphism. Once the vtable has been built, the compiler completely discards the type information of a given object.
+
+The problem with vtables is that they allow polymorphism based on a single class only: you can have different code called on a Node depending on whether it's an addition or a multiplication. But vtables can't handle dynamic behavior based on two parameters. For example, if you want to call a function depending on the type of two parameters, vtables can't do that.
+
+That case happens quite often in Poincare: for example, if an expression contains the addition of another addition, we can merge both nodes into a single one ($$1+(\pi+x)$$ becomes $$1+\pi+x$$, see figure below). And we want to implement this behavior only if both nodes are additions.
+
+The C++ standard has support for keeping type information at runtime, a behavior known as RTTI. However that feature is quite comprehensive and a bit overkill for what we needed, so we decided to implement an equivalent solution manually: each expression subclass implements a `type()` function to give its type.
+
+{:class="img-responsive"}
+
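The merge described above can be sketched as follows. This is an illustration under simplifying assumptions and not Poincare's actual class hierarchy: the `Node`, `Rational` and `Addition` structs and the `mergeNestedAdditions()` method are hypothetical, and the sketch uses `std::unique_ptr` and `std::vector` for brevity rather than the memory management used on the device.

```cpp
#include <memory>
#include <utility>
#include <vector>

enum class Type { Rational, Addition, Multiplication };

struct Node {
  virtual ~Node() = default;
  // Hand-rolled replacement for RTTI: each subclass reports its own type.
  virtual Type type() const = 0;
  std::vector<std::unique_ptr<Node>> children;
};

struct Rational : Node {
  explicit Rational(int v) : value(v) {}
  Type type() const override { return Type::Rational; }
  int value;
};

struct Addition : Node {
  Type type() const override { return Type::Addition; }

  // Merge any direct child that is itself an Addition: 1+(2+3) becomes 1+2+3.
  void mergeNestedAdditions() {
    std::vector<std::unique_ptr<Node>> merged;
    for (auto & child : children) {
      if (child->type() == Type::Addition) {
        // Both this node and the child are additions: adopt the grandchildren.
        for (auto & grandchild : child->children) {
          merged.push_back(std::move(grandchild));
        }
      } else {
        merged.push_back(std::move(child));
      }
    }
    children = std::move(merged);
  }
};
```

The decision depends on two types at once (the node's own and its child's), which is exactly the case a vtable lookup alone cannot express.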
+## Expression parsing
+
+Lexing and parsing are done by a homemade lexer and parser available [here](https://github.com/numworks/epsilon/tree/master/poincare/src/parsing).
+
+## Simplification
+
+{:class="img-right"}
+Expression simplification is done in-place and directly modifies the expression. Simplifying is a two-step process: first the expression is reduced, then it is beautified. So far, we excluded matrices from the simplification process to avoid increasing complexity due to the non-commutativity of matrix multiplication.
+
+### Ordering of operands
+
+To simplify an expression one needs to find relevant patterns. Searching for a given pattern can be extremely slow if done the wrong way. To make pattern searching much more efficient, we need to sort operands of commutative operations.
+
+To sort those operands, we defined an order on expressions with the following features:
+
+* The order is total on types and values: `Rational(-2/3)` < `Rational(0)` < `...` < `Multiplication` < `Power` < `Addition` < `...`
+* The order relationship is depth-first recursive: if two expressions are equal in type and values, we compare their operands starting with the last.
+* To compare two expressions, we first sort their commutative children to ensure the uniqueness of expression representations. This guarantees that the order is total on expressions.
+
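The first two rules can be sketched as a recursive comparison function. Everything here is an assumption made for illustration: the `Expr` struct, the enumerator order standing in for the order on types, the integer `value` field, and the tie-break for nodes with a different number of operands are not taken from Poincare's source.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node: the enumerator order stands in for the order on types.
enum class Type { Rational, Symbol, Multiplication, Power, Addition };

struct Expr {
  Type type;
  int value;                  // only meaningful for leaves in this sketch
  std::vector<Expr> operands; // assumed already sorted for commutative nodes
};

// Returns a negative, zero or positive value, like strcmp.
int compare(const Expr & a, const Expr & b) {
  if (a.type != b.type) {
    return static_cast<int>(a.type) < static_cast<int>(b.type) ? -1 : 1;
  }
  if (a.operands.empty() && b.operands.empty()) {
    return (a.value > b.value) - (a.value < b.value);
  }
  // Same type: compare operands pairwise, starting with the last one.
  size_t i = a.operands.size();
  size_t j = b.operands.size();
  while (i > 0 && j > 0) {
    int c = compare(a.operands[--i], b.operands[--j]);
    if (c != 0) {
      return c;
    }
  }
  // Assumption: if all compared operands are equal, the node with operands
  // left over is considered larger.
  return (i > 0) - (j > 0);
}
```

With two products `3*π` and `2*π`, the last operands are equal, so the result is decided by comparing 3 with 2.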
+
+{:class="img-responsive"}
+In the example, both root nodes are r so we compare their last operands. Both are equal to $$\pi$$ so we compare the next operands. As 3 > 2, we can determine the order relation between the expressions.
+
+Moreover, the simplification order has a few additional rules:
+
+* Within an `Addition` or a `Multiplication`, any `Rational` is always the first operand
+* Comparing an `Addition` a with an `Expression` e is equivalent to comparing a with an `Addition` whose single operand is e. Same goes for the `Multiplication`.
+* To compare a `Power` p with an `Expression` e, we compare $$p$$ with $$e^1$$.
+
+Thanks to these rules, the order groups similar terms together and thus avoids quadratic complexity when factorizing. For example, it groups expressions with the same base together (e.g. $$\pi$$ and $$\pi^3$$) and terms with the same non-rational factors together (e.g. $$\pi$$ and $$2*\pi$$).
+
+Last but not least, as this order is total, it makes checking if two expressions are identical very easy.
+
+### Reduction
+
+The reduction phase is the most important part of simplification. It happens recursively and bottom-up: we first reduce the operands of an expression before reducing the expression itself. That way, when reducing itself, an expression can assert that its operands are reduced (and thus have some very useful knowledge such as "there is no `Division` or `Subtraction` among my operands"). Every type of `Expression` has its own reduction rules.
+
+To decrease the set of possible expression types in reduced expressions, we turn `Subtraction` into `Addition`, `Division` and `Root` into `Power` and so on:
+
+* $$a-b \rightarrow a+(-1)*b$$
+* $$-a \rightarrow (-1)*a$$
+* $$\frac{a}{b} \rightarrow a*b^{-1}$$
+* $$\sqrt{x} \rightarrow x^{\frac{1}{2}}$$
+* $$\sqrt[y]{x} \rightarrow x^{\frac{1}{y}}$$
+* $$\ln(x) \rightarrow \log_{e}(x)$$
+
+{:class="img-responsive"}
+
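As an illustration of the bottom-up traversal combined with the first rewrite rule ($$a-b \rightarrow a+(-1)*b$$), here is a small sketch. The `Expr` struct and the `rational()`, `symbol()`, `reduce()` and `toString()` helpers are hypothetical and only model a tiny subset of what a real reduction does; the sketch also copies subtrees instead of working in-place.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Type { Rational, Symbol, Addition, Subtraction, Multiplication };

struct Expr {
  Type type;
  int value;        // used by Rational
  std::string name; // used by Symbol
  std::vector<Expr> operands;
};

Expr rational(int v) { return {Type::Rational, v, "", {}}; }
Expr symbol(std::string n) { return {Type::Symbol, 0, std::move(n), {}}; }

// Bottom-up: reduce the operands first, then rewrite the node itself.
Expr reduce(Expr e) {
  for (Expr & operand : e.operands) {
    operand = reduce(operand);
  }
  if (e.type == Type::Subtraction) {
    // a-b -> a+(-1)*b
    Expr negated{Type::Multiplication, 0, "", {rational(-1), e.operands[1]}};
    return {Type::Addition, 0, "", {e.operands[0], negated}};
  }
  return e;
}

// Fully parenthesized text form, only used to inspect results.
std::string toString(const Expr & e) {
  if (e.type == Type::Rational) { return std::to_string(e.value); }
  if (e.type == Type::Symbol) { return e.name; }
  const char * op = e.type == Type::Addition ? "+"
                  : e.type == Type::Subtraction ? "-" : "*";
  std::string result = "(";
  for (size_t i = 0; i < e.operands.size(); i++) {
    if (i > 0) { result += op; }
    result += toString(e.operands[i]);
  }
  return result + ")";
}
```

Because operands are reduced first, the outer node never sees a `Subtraction` among its children by the time its own rule runs.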
+Here is a short tour of the reduction rules for the main `Expression` subclasses:
+
+#### `Additions` are reduced by applying common mathematical rules
+
+* Associativity: $$(a+b)+c \rightarrow a+b+c$$
+* Commutativity: $$a+b \rightarrow b+a$$ which allows us to sort operands and group like terms together
+* Factorization: $$a+5*a \rightarrow 6*a$$
+* $$a+0 \rightarrow a$$
+* Reducing additions to a common denominator
+
+#### `Multiplications` apply the following rules
+
+* Associativity: $$(a*b)*c \rightarrow a*b*c$$
+* Commutativity: $$a*b \rightarrow b*a$$ (which is true because we do not reduce matrices yet)
+* Factorization: $$a*a^5 \rightarrow a^6$$
+* $$a*0 \rightarrow 0$$
+* $$\frac{sine}{cosine} \rightarrow tangent$$
+* Distribution: $$a*(b+c) \rightarrow a*b+a*c$$
+
+#### `Powers` apply the following rules
+
+* We get rid of square roots in the denominator and of sums of 2 square roots in the denominator
+* $$x^0 \rightarrow 1\;if\;x \neq 0$$
+* $$x^1 \rightarrow x$$
+* $$0^x \rightarrow 0 \;if\; x > 0$$
+* $$1^x \rightarrow 1$$
+* $$(a^b)^c \rightarrow a^{b*c} \;if\; a > 0 \;or\; c \in \mathbb{Z}$$
+* $$(a*b*c*...)^n \rightarrow a^n*b^n*c^n*... \;if\; n \in \mathbb{Z}$$
+* $$(a*b*c*...)^r \rightarrow \mid a\mid^r*(sign(a)*b*c*...)^r \;if\; a \in \mathbb{Q}$$
+* $$a^{b+c} \rightarrow (a^b)*a^c \;if\; a, b \in \mathbb{Z}$$
+* $$r^s\;with\; r, s \in \mathbb{Q}$$ can be simplified using the prime factorization of $$r$$ (e.g. $$8^{\frac{1}{2}} \rightarrow 2*2^{\frac{1}{2}}$$)
+* $$i^{\frac{p}{q}} \rightarrow e^{\frac{i*\pi*p}{2*q}} \;with\; p, q \in \mathbb{Z}$$
+* $$e^{\frac{i*\pi*p}{q}} \rightarrow cos(\frac{\pi*p}{q})+i*sin(\frac{\pi*p}{q}) \;with\; p, q \in \mathbb{Z}$$
+* $$x^{log(y,x)} \rightarrow y \;if\;y > 0$$
+* $$10^{log(x)} \rightarrow x \;if\; x > 0$$
+
+To avoid infinite loops, reduction is contextualized on the parent expression. This means an expression can only be reduced once it has been attached to its parent expression.
+
+### Beautify
+
+{:class="img-left"}
+This phase turns expressions into a more readable form. Divisions, subtractions and natural logarithms reappear at this step. Parentheses are also added to be able to print the tree in infix notation without any ambiguity. This phase is also recursive, but top-down: we first beautify the node expression and then beautify its operands.
+
+## Approximation
+
+Expressions can be approximated using the `approximate()` method, which returns another (dynamically allocated) expression that can be either:
+
+* A complex
+* A matrix of complexes
+
+To approximate an expression, we first approximate its operands (which are guaranteed to be either a complex or a matrix of complexes) and then approximate the expression depending on its type (an `Addition` adds its operand approximations, for example).
+
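The recursion described above can be sketched for the complex case only. This is a simplified model and not the actual `approximate()` implementation: the `Expr` struct is hypothetical, matrices are left out, and the function returns a `std::complex<double>` by value rather than a dynamically allocated expression.

```cpp
#include <complex>
#include <vector>

enum class Type { Complex, Addition, Multiplication };

struct Expr {
  Type type;
  std::complex<double> value; // used by Type::Complex leaves
  std::vector<Expr> operands;
};

// Approximate the operands first, then combine them according to the node type.
std::complex<double> approximate(const Expr & e) {
  if (e.type == Type::Complex) {
    return e.value;
  }
  // Start from the neutral element of the operation.
  std::complex<double> result = (e.type == Type::Addition) ? 0.0 : 1.0;
  for (const Expr & operand : e.operands) {
    std::complex<double> v = approximate(operand);
    if (e.type == Type::Addition) {
      result += v;
    } else {
      result *= v;
    }
  }
  return result;
}
```

For instance, approximating the tree for `1+2*i` first evaluates the product `2*i`, then adds `1` to it.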
+## Pretty print
+
+Poincare is responsible for laying out expressions in 2D as in a textbook. The `ExpressionLayout` class represents the layout on screen of an `Expression`, and can be derived from an `Expression` by calling the function `createLayout()`. `ExpressionLayout` is also a tree structure, although the layout tree does not exactly follow the expression tree.
+
+{:class="img-responsive"}
+
+{:class="img-right"}
+The `baseline()` of `ExpressionLayout` is useful to align several layouts relative to each other.
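One way to picture how a baseline is used: when several boxes are placed side by side, their baselines are put on the same horizontal line. The sketch below is a hypothetical model, not the `ExpressionLayout` API; the `Box` struct, `layoutHorizontally()` and `verticalOffset()` are names invented for the example, and the baseline is assumed to be measured in pixels from the top of each box.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical box: baseline is measured from the top of the box, in pixels.
struct Box {
  int height;
  int baseline;
};

// Size of a row of boxes whose baselines are all aligned.
Box layoutHorizontally(const std::vector<Box> & children) {
  int above = 0; // tallest part above the shared baseline
  int below = 0; // tallest part below the shared baseline
  for (const Box & c : children) {
    above = std::max(above, c.baseline);
    below = std::max(below, c.height - c.baseline);
  }
  return {above + below, above};
}

// Vertical position of a child inside the row, measured from the row's top.
int verticalOffset(const Box & row, const Box & child) {
  return row.baseline - child.baseline;
}
```

A short box next to a tall fraction is therefore pushed down until both baselines coincide.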
diff --git a/docs/poincare/layout.svg b/docs/poincare/layout.svg
new file mode 100644
index 000000000..95f097c82
--- /dev/null
+++ b/docs/poincare/layout.svg
@@ -0,0 +1,135 @@
+
diff --git a/docs/poincare/order.svg b/docs/poincare/order.svg
new file mode 100644
index 000000000..33e163d1f
--- /dev/null
+++ b/docs/poincare/order.svg
@@ -0,0 +1,68 @@
+
diff --git a/docs/poincare/reduce.svg b/docs/poincare/reduce.svg
new file mode 100644
index 000000000..4caf71c71
--- /dev/null
+++ b/docs/poincare/reduce.svg
@@ -0,0 +1,147 @@
+
diff --git a/docs/poincare/rtti.svg b/docs/poincare/rtti.svg
new file mode 100644
index 000000000..9dbbfbd3b
--- /dev/null
+++ b/docs/poincare/rtti.svg
@@ -0,0 +1,117 @@
+
diff --git a/docs/poincare/simplify.svg b/docs/poincare/simplify.svg
new file mode 100644
index 000000000..0b3218430
--- /dev/null
+++ b/docs/poincare/simplify.svg
@@ -0,0 +1,123 @@
+