+// SPDX-FileCopyrightText: 2023 Philippe Proulx <eeppeliteloop@gmail.com>
+// SPDX-License-Identifier: CC-BY-SA-4.0
+
// Show ToC at a specific location for a GitHub rendering
ifdef::env-github[]
:toc: macro
This package offers both a portable {py3} module and a command-line
tool.
-WARNING: This version of Normand is 0.1, meaning both the Normand
+WARNING: This version of Normand is 0.23, meaning both the Normand
language and the module/CLI interface aren't stable.
ifdef::env-github[]
aa bb f7 a7 32 da
----
-UTF-8, UTF-16, and UTF-32 literal strings::
+Strings::
+
Input:
+
----
"hello world!" 00
u16le"stress\nverdict 🤣"
+s:latin3{hex(ICITTE)}
----
+
Output:
----
68 65 6c 6c 6f 20 77 6f 72 6c 64 21 00 73 00 74 ┆ hello world!•s•t
00 72 00 65 00 73 00 73 00 0a 00 76 00 65 00 72 ┆ •r•e•s•s•••v•e•r
-00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd ┆ •d•i•c•t• •>•#•
+00 64 00 69 00 63 00 74 00 20 00 3e d8 23 dd 30 ┆ •d•i•c•t• •>•#•0
+78 32 66 ┆ x2f
----
Labels: special variables holding the offset where they're defined::
The value of a variable assignment is the evaluation of a valid {py3}
expression which may include label and variable names.
-Value encoding with a specific length (8{nbsp}bits to 64{nbsp}bits) and byte order::
+Fixed-length number with a given length (8{nbsp}bits to 64{nbsp}bits) and byte order::
+
Input:
+
----
{strength = 4}
-{be} 67 <lbl> 44 $178 {(end - lbl) * 8 + strength : 16} $99 <end>
-{le} {-1993 : 32}
+!be 67 <lbl> 44 $178 [(end - lbl) * 8 + strength : 16] $99 <end>
+!le [-1993 : 32]
+[-3.141593 : 64be]
+----
++
+Output:
++
+----
+67 44 b2 00 2c 63 37 f8 ff ff c0 09 21 fb 82 c2
+bd 7f
+----
++
+The encoded number is the evaluation of a valid {py3} expression which
+may include label and variable names.
+
+https://en.wikipedia.org/wiki/LEB128[LEB128] integer::
++
+Input:
++
+----
+aa bb cc [-1993 : sleb128] <meow> dd ee ff
+[meow * 199 : uleb128]
----
+
Output:
+
----
-67 44 b2 00 2c 63 37 f8 ff ff
+aa bb cc b7 70 dd ee ff e3 07
----
+
-The encoded value is the evaluation of a valid {py3} expression which
+The encoded integer is the evaluation of a valid {py3} expression which
may include label and variable names.
+Conditional::
++
+Input:
++
+----
+aa bb cc
+
+(
+ "foo"
+
+ !if {ICITTE > 10}
+ "bar"
+ !else
+ "fight"
+ !end
+) * 4
+----
++
+Output:
++
+----
+aa bb cc 66 6f 6f 66 69 67 68 74 66 6f 6f 66 69 ┆ •••foofightfoofi
+67 68 74 66 6f 6f 62 61 72 66 6f 6f 62 61 72 ┆ ghtfoobarfoobar
+----
+
Repetition::
+
Input:
+
----
aa bb * 5 cc <zoom> "yeah\0" * {zoom * 3}
+
+!repeat 3
+ ff ee "juice"
+!end
----
+
Output:
61 68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 ┆ ah•yeah•yeah•yea
68 00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 ┆ h•yeah•yeah•yeah
00 79 65 61 68 00 79 65 61 68 00 79 65 61 68 00 ┆ •yeah•yeah•yeah•
+ff ee 6a 75 69 63 65 ff ee 6a 75 69 63 65 ff ee ┆ ••juice••juice••
+6a 75 69 63 65 ┆ juice
+----
+
+Alignment::
++
+Input:
++
+----
+!be
+
+ [199:32]
+@64 [43:64]
+@16 [-123:16]
+@32~255 [5584:32]
+----
++
+Output:
++
+----
+00 00 00 c7 00 00 00 00 00 00 00 00 00 00 00 2b
+ff 85 ff ff 00 00 15 d0
+----
+
+Filling::
++
+Input:
++
+----
+!le
+[0xdeadbeef:32]
+[-1993:16]
+[9:16]
++0x40
+[ICITTE:8]
+"meow mix"
++200~FFh
+[ICITTE:8]
+----
++
+Output:
++
+----
+ef be ad de 37 f8 09 00 00 00 00 00 00 00 00 00 ┆ ••••7•••••••••••
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
+40 6d 65 6f 77 20 6d 69 78 ff ff ff ff ff ff ff ┆ @meow mix•••••••
+ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
+ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
+ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
+ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
+ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
+ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
+ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ┆ ••••••••••••••••
+ff ff ff ff ff ff ff ff c8 ┆ •••••••••
+----
+
+Transformation::
++
+Input:
++
----
+"end of file @ " [end:8]
+
+!transform gzip
+ "this part will be gzipped"
+!end
+<end>
+----
++
+Output:
++
+----
+65 6e 64 20 6f 66 20 66 69 6c 65 20 40 20 3c 1f ┆ end of file @ <•
+8b 08 00 7b 7b 26 65 02 ff 2b c9 c8 2c 56 28 48 ┆ •••{{&e••+••,V(H
+2c 2a 51 28 cf cc c9 51 48 4a 55 48 af ca 2c 28 ┆ ,*Q(•••QHJUH••,(
+48 4d 01 00 d4 cc 5b 8a 19 00 00 00 ┆ HM••••[•••••
+----
Multilevel grouping::
+
6f 6d cc aa bb 7a 6f 6f 6d cc de de de de ┆ om•••zoom•••••
----
+Macros::
++
+Input:
++
+----
+!macro hello(world)
+ "hello"
+ !if world " world" !end
+!end
+
+!repeat 17
+ ff ff ff ff
+ m:hello({ICITTE > 15 and ICITTE < 60})
+!end
+----
++
+Output:
++
+----
+ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
+6c 6f ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c ┆ lo••••hello worl
+64 ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ d••••hello world
+ff ff ff ff 68 65 6c 6c 6f 20 77 6f 72 6c 64 ff ┆ ••••hello world•
+ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c ┆ •••hello••••hell
+6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 ┆ o••••hello••••he
+6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ff ff ┆ llo••••hello••••
+68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ff ff ┆ hello••••hello••
+ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ ••hello••••hello
+ff ff ff ff 68 65 6c 6c 6f ff ff ff ff 68 65 6c ┆ ••••hello••••hel
+6c 6f ff ff ff ff 68 65 6c 6c 6f ┆ lo••••hello
+----
+
Precise error reporting::
+
----
----
+
----
-/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`mix`, `zoom`}.
+/tmp/meow.normand:24:19 - Illegal (unknown or unreachable) variable/label name `meow` in expression `(meow - 45) // 8`; the legal names are {`ICITTE`, `mix`, `zoom`}.
----
+
----
-/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE` at byte offset 45.
+/tmp/meow.normand:32:19 - While expanding the macro `meow`:
+/tmp/meow.normand:35:5 - While expanding the macro `zzz`:
+/tmp/meow.normand:18:9 - Value 315 is outside the 8-bit range when evaluating expression `end - ICITTE`.
----
You can use Normand to track data source files in your favorite VCS
[NOTE]
====
Normand has a single module file, `normand.py`, which you can copy as is
-to your project to use it (both the <<python-3-api,`normand.parse()`>>
+to your project to use it (both the <<python3-api,`normand.parse()`>>
function and the <<command-line-tool,command-line tool>>).
`normand.py` has _no external dependencies_, but if you're using
-Python{nbsp}3.4, you'll need a local copy of the standard `typing`
-module.
+Python{nbsp}3.4 or Python{nbsp}3.5, you'll need a local copy of the
+standard `typing` module.
====
+== Design goals
+
+The design goals of Normand are:
+
+Portability::
+ We're making sure `normand.py` works with Python{nbsp}≥{nbsp}3.4 and
+ doesn't have any external dependencies so that you may just copy the
+ module as is to your own project.
+
+Ease of use::
+ The most basic Normand input is a sequence of hexadecimal constants
+ (for example, `4e6f726d616e64`) which produce exactly what you'd
+ expect.
++
+Most Normand features map to programming language concepts you already
+know and understand: constant integers, literal strings, variables,
+conditionals, repetitions/loops, and the rest.
+
+Concise and readable input::
+ We could have chosen XML or YAML as the input format, but having a
+ DSL here makes a Normand input compact and easy to read, two
+ important traits when using Normand to write tests, for example.
++
+Compare the following Normand input and some hypothetical XML
+equivalent, for example:
++
+.Actual Normand input.
+----
+ff dd 01 ab $192 $-128 %1101:0011
+
+[end:8]
+
+{iter = 1}
+
+!if {not something}
+ # five times because xyz
+ !repeat 5
+ "hello world " [iter:8]
+ {iter = iter + 1}
+ !end
+!end
+
+<end>
+----
++
+.Hypothetical Normand XML input.
+[source,xml]
+----
+<?xml version="1.0" encoding="utf-8" ?>
+<group>
+ <byte base="x" val="ff" />
+ <byte base="x" val="dd" />
+ <byte base="x" val="1" />
+ <byte base="x" val="ab" />
+ <byte base="d" val="192" />
+ <byte base="d" val="-128" />
+ <byte base="b" val="11010011" />
+ <fixed-len-num expr="end" len="8" />
+ <var-assign name="iter" expr="1" />
+ <cond expr="not something">
+ <!-- five times because xyz -->
+ <repeat expr="5">
+ <str>hello world </str>
+ <fixed-len-num expr="iter" len="8" />
+ <var-assign name="iter" expr="iter + 1" />
+ </repeat>
+ </cond>
+ <label name="end" />
+</group>
+----
+
== Learn Normand
A Normand text input is a sequence of items which represent a sequence
[%header%autowidth]
|===
-|State variable |Description |Initial value: <<python-3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
+|State variable |Description |Initial value: <<python3-api,{py3} API>> |Initial value: <<command-line-tool,CLI>>
|[[cur-offset]] Current offset
|
-The current offset has an effect on the value of
-<<label,labels>> and of the special `ICITTE` name in <<value,value>> and
-<<variable-assignment,variable assignment>> expression evaluation.
+The current offset has an effect on the value of <<label,labels>> and of
+the special `ICITTE` name in <<fixed-length-number,fixed-length
+number>>, <<leb128-integer,LEB128 integer>>, <<string,string>>,
+<<filling,filling>>, <<variable-assignment,variable assignment>>,
+<<conditional-block,conditional block>>, <<repetition-block,repetition
+block>>, <<macro-expansion,macro expansion>>, and
+<<post-item-repetition,post-item repetition>> expression evaluation.
Each generated byte increments the current offset.
A <<current-offset-setting,current offset setting>> may change the
-current offset.
+current offset without generating data.
+
+A <<current-offset-alignment,current offset alignment>> generates
+padding bytes to make the current offset satisfy a given alignment.
|`init_offset` parameter of the `parse()` function.
|`--offset` option.
|[[cur-bo]] Current byte order
|
-The current byte order has an effect on the encoding of <<value,values>>.
+The current byte order can have an effect on the encoding of
+<<fixed-length-number,fixed-length numbers>>.
A <<current-byte-order-setting,current byte order setting>> may change
the current byte order.
|One or more `--label` options.
|<<variable-assignment,Variables>>
-|Mapping of variable names to integral values.
+|Mapping of variable names to integral, floating point number, or string values.
|`init_variables` parameter of the `parse()` function.
-|One or more `--var` options.
+|One or more `--var` or `--var-str` options.
|===
The available items are:
-* A <<byte-constant,constant integer>> representing a single byte.
+* A <<byte-constant,constant integer>> representing one or more
+ constant bytes.
-* A <<literal-string,literal string>> representing a sequence of bytes
- encoding UTF-8, UTF-16, or UTF-32 data.
+* A <<literal-string,literal string>> representing a constant sequence
+ of bytes encoding UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 data.
* A <<current-byte-order-setting,current byte order setting>> (big or
little endian).
-* A <<value,{py3} expression to be evaluated>> as an unsigned or signed
- integer to be encoded on one or more bytes using the current byte
- order.
+* A <<fixed-length-number,fixed-length number>> (integer or
+ floating point), possibly using the <<cur-bo,current byte order>>, and
+ of which the value is the result of a {py3} expression.
+
+* An <<leb128-integer,LEB128 integer>> of which the value is the result
+ of a {py3} expression.
+
+* A <<string,string>> representing a sequence of bytes encoding UTF-8,
+ UTF-16, UTF-32, or Latin-1 to Latin-10 data, and of which the value is
+ the result of a {py3} expression.
* A <<current-offset-setting,current offset setting>>.
+* A <<current-offset-alignment,current offset alignment>>.
+
+* A <<filling,filling>>.
+
* A <<label,label>>, that is, a named constant holding the current
offset.
+
* A <<group,group>>, that is, a scoped sequence of items.
-Moreover, you can <<repetition,repeat>> any item above, except an offset
-or a label, a given fixed or variable number of times. This is called a
-repetition.
+* A <<conditional-block,conditional block>>.
-A Normand comment may exist:
+* A <<repetition-block,repetition block>>.
-* Between items, possibly within a group.
-* Between the nibbles of a constant hexadecimal byte.
-* Between the bits of a constant binary byte.
-* Between the last item and the ``pass:[*]`` character of a repetition,
- and between that ``pass:[*]`` character and the following number
- or expression.
+* A <<transformation-block,transformation block>>.
+
+* A <<macro-definition-block,macro definition block>>.
+
+* A <<macro-expansion,macro expansion>>.
+
+Moreover, you can repeat many items above a constant or variable number
+of times with the ``pass:[*]`` operator _after_ the item to repeat. This
+is called a <<post-item-repetition,post-item repetition>>.
+
+A Normand comment may exist pretty much anywhere between tokens.
A comment is anything between two ``pass:[#]`` characters on the same
-line, or from ``pass:[#]`` until the end of the line. Whitespaces and
-the following symbol characters are also considered comments where a
-comment may exist:
+line, or from ``pass:[#]`` until the end of the line. Whitespaces are
+also considered comments. The following symbols are also considered
+comments around and between items, as well as between hexadecimal
+nibbles and binary bits of <<byte-constant,byte constants>>:
----
-! @ / \ ? & : ; . , + [ ] _ = | -
+& , - . / : ; = ? \ _ |
----
The latter serve to improve readability so that you may write, for
example, a MAC address or a UUID as is.
+[[const-int]] Many items require a _constant integer_. Where an item
+accepts a negative integer, the constant integer may start with `-`. A
+positive constant integer is any of:
+
+Decimal::
+ One or more digits (`0` to `9`).
+
+Hexadecimal::
+ One of:
++
+* The `0x` or `0X` prefix followed with one or more hexadecimal digits
+ (`0` to `9`, `a` to `f`, or `A` to `F`).
+* One or more hexadecimal digits followed with the `h` or `H` suffix.
+
+Octal::
+ One of:
++
+* The `0o` or `0O` prefix followed with one or more octal digits
+ (`0` to `7`).
+* One or more octal digits followed with the `o`, `O`, `q`, or `Q`
+ suffix.
+
+Binary::
+ One of:
++
+* The `0b` or `0B` prefix followed with one or more bits (`0` or `1`).
+* One or more bits followed with the `b` or `B` suffix.
+
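The constant integer forms above can be sketched as a small {py3}
parser. This is only an illustration, not Normand's actual
implementation: the `parse_const_int()` name and the order in which the
forms are tried are assumptions.

```python
import re

# (pattern, base) pairs; the order of the attempts is an assumption.
_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),  # `0x` prefix
    (re.compile(r"0[oO]([0-7]+)"), 8),         # `0o` prefix
    (re.compile(r"0[bB]([01]+)"), 2),          # `0b` prefix
    (re.compile(r"([0-9a-fA-F]+)[hH]"), 16),   # `h` suffix
    (re.compile(r"([0-7]+)[oOqQ]"), 8),        # `o` or `q` suffix
    (re.compile(r"([01]+)[bB]"), 2),           # `b` suffix
    (re.compile(r"([0-9]+)"), 10),             # plain decimal
]


def parse_const_int(text):
    """Return the `int` value of a constant integer (hypothetical helper)."""
    negative = text.startswith("-")
    body = text[1:] if negative else text

    for pattern, base in _FORMS:
        match = pattern.fullmatch(body)

        if match:
            value = int(match.group(1), base)
            return -value if negative else value

    raise ValueError("not a constant integer: {!r}".format(text))


print(parse_const_int("0x40"), parse_const_int("FFh"), parse_const_int("777q"))
```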
+In general, anything between `pass:[{]` and `}` is a {py3} expression.
+
You can test the examples of this section with the `normand`
<<command-line-tool,command-line tool>> as such:
=== Byte constant
-A _byte constant_ represents a single byte.
+A _byte constant_ represents one or more constant bytes.
A byte constant is:
Hexadecimal form::
- Two consecutive hexits.
+ Two consecutive hexadecimal digits representing a single byte.
Decimal form::
- A decimal number after the `$` prefix.
+ One or more digits after the `$` prefix representing a single byte.
+
+Binary form:: {empty}
++
+--
+. __**N**__ `%` prefixes (at least one).
++
+The number of `%` characters is the number of subsequent expected bytes.
-Binary form::
- Eight bits after the `%` prefix.
+. __**N**__{nbsp}×{nbsp}8 bits (`0` or `1`).
+--
====
Input:
----
-ab cd [3d 8F] CC
+ab cd (3d 8F) CC
----
Output:
----
%01110011 %01100001 %01101100 %01110101 %01110100
+%%%1101:0010 11111111 #A#11 #B#00 #C#011 #D#1
----
Output:
----
-73 61 6c 75 74 ┆ salut
+73 61 6c 75 74 d2 ff c7 ┆ salut•••
----
====
=== Literal string
-A _literal string_ represents the UTF-8-, UTF-16-, or UTF-32-encoded
-bytes of a string.
+A _literal string_ represents the encoded bytes of a literal string
+using the UTF-8, UTF-16, UTF-32, or Latin-1 to Latin-10 encoding.
The string to encode isn't implicitly null-terminated: use `\0` at the
end of the string to add a null character.
A literal string is:
-. **Optional**: one of the following encodings instead of UTF-8:
+. **Optional**: one of the following encodings instead of the default
+ UTF-8:
+
--
[horizontal]
-`u16be`:: UTF-16BE.
-`u16le`:: UTF-16LE.
-`u32be`:: UTF-32BE.
-`u32le`:: UTF-32LE.
+`s:u8`::
+`u8`::
+ UTF-8.
+
+`s:u16be`::
+`u16be`::
+ UTF-16BE.
+
+`s:u16le`::
+`u16le`::
+ UTF-16LE.
+
+`s:u32be`::
+`u32be`::
+ UTF-32BE.
+
+`s:u32le`::
+`u32le`::
+ UTF-32LE.
+
+`s:latin1`::
+ ISO/IEC 8859-1.
+
+`s:latin2`::
+ ISO/IEC 8859-2.
+
+`s:latin3`::
+ ISO/IEC 8859-3.
+
+`s:latin4`::
+ ISO/IEC 8859-4.
+
+`s:latin5`::
+ ISO/IEC 8859-9.
+
+`s:latin6`::
+ ISO/IEC 8859-10.
+
+`s:latin7`::
+ ISO/IEC 8859-13.
+
+`s:latin8`::
+ ISO/IEC 8859-14.
+
+`s:latin9`::
+ ISO/IEC 8859-15.
+
+`s:latin10`::
+ ISO/IEC 8859-16.
--
. The ``pass:["]`` prefix.
Input:
----
-u32be "\"illusion is the first\nof all pleasures\" 🦉"
+s:u32be "\"illusion is the first\nof all pleasures\" 🦉"
----
Output:
----
====
+====
+Input:
+
+----
+s:latin1 "Paul Piché"
+----
+
+Output:
+
+----
+50 61 75 6c 20 50 69 63 68 e9 ┆ Paul Pich•
+----
+====
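To cross-check such outputs without Normand, the encoding names above
can be mapped to standard {py3} codec names. This is a sketch: the
`CODECS` mapping and the `encode_literal()` helper are hypothetical
names, not part of the Normand API.

```python
# Normand encoding name -> standard Python codec name (illustrative mapping).
CODECS = {
    "u8": "utf_8",
    "u16be": "utf_16_be",
    "u16le": "utf_16_le",
    "u32be": "utf_32_be",
    "u32le": "utf_32_le",
    "latin1": "iso8859_1",
    "latin2": "iso8859_2",
    "latin3": "iso8859_3",
    "latin4": "iso8859_4",
    "latin5": "iso8859_9",
    "latin6": "iso8859_10",
    "latin7": "iso8859_13",
    "latin8": "iso8859_14",
    "latin9": "iso8859_15",
    "latin10": "iso8859_16",
}


def encode_literal(text, encoding="u8"):
    """Encode `text` like a literal string; no implicit null terminator."""
    name = encoding[2:] if encoding.startswith("s:") else encoding
    return text.encode(CODECS[name])


print(encode_literal("Paul Pich\u00e9", "s:latin1").hex())
```

Note how Latin-5 to Latin-10 do not map to ISO/IEC 8859-5 to 8859-10,
which is why an explicit table helps.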
+
=== Current byte order setting
This special item sets the <<cur-bo,_current byte order_>>.
The two accepted forms are:
[horizontal]
-``pass:[{be}]``:: Set the current byte order to big endian.
-``pass:[{le}]``:: Set the current byte order to little endian.
+`!be`:: Set the current byte order to big endian.
+`!le`:: Set the current byte order to little endian.
-=== Value
+=== Fixed-length number
-A _value_ represents a fixed number of bytes encoding an unsigned or
-signed integer which is the result of evaluating a {py3} expression
-using the <<cur-bo,current byte order>>.
+A _fixed-length number_ represents a fixed number of bytes encoding
+either:
-For a value at some source location{nbsp}__**L**__, its {py3} expression
-may contain the name of any accessible <<label,label>>, including the
-name of a label defined after{nbsp}__**L**__, as well as the name of any
-<<variable-assignment,variable>> known at{nbsp}__**L**__.
+* An unsigned or signed integer (two's complement).
++
+The available lengths are 8, 16, 24, 32, 40, 48, 56, and 64.
-An accessible label is either:
+* A floating point number
+ (https://standards.ieee.org/standard/754-2008.html[IEEE{nbsp}754-2008]).
++
+The available lengths are 32 (_binary32_) and 64 (_binary64_).
-* Outside of the current <<group,group>>.
-* Within the same immediate group (not within a nested group).
+The value is the result of evaluating a {py3} expression.
-In the {py3} expression of a value, the value of the special name
-`ICITTE` is the <<cur-offset,current offset>> (before encoding the
-value).
+The byte order to use to encode the value is either directly specified
+or is the <<cur-bo,current byte order>>.
-A value is:
+A fixed-length number is:
-. The ``pass:[{]`` prefix.
+. The `[` prefix.
. A valid {py3} expression.
++
+For a fixed-length number at some source location{nbsp}__**L**__, this
+expression may contain the name of any accessible <<label,label>> (not
+within a nested group), including the name of a label defined
+after{nbsp}__**L**__ (except within a
+<<transformation-block,transformation block>>), as well as the name of
+any <<variable-assignment,variable>> known at{nbsp}__**L**__.
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before encoding the number).
. The `:` character.
-. An encoding length in bits amongst `8`, `16`, `24`, `32`, `40`,
- `48`, `56`, and `64`.
+. An encoding length in bits amongst:
++
+--
+The expression evaluates to an `int` or `bool` value::
+ `8`, `16`, `24`, `32`, `40`, `48`, `56`, and `64`.
++
+NOTE: Normand automatically converts a `bool` value to `int`.
+
+The expression evaluates to a `float` value::
+ `32` and `64`.
+--
-. The `}` suffix.
+. **Optional**: a suffix of the previous encoding length, without
+ any whitespace, amongst:
++
+--
+[horizontal]
+`be`:: Encode in big endian.
+`le`:: Encode in little endian.
+--
++
+Without this suffix, the encoding byte order is the <<cur-bo,current
+byte order>> which must be defined if the encoding length is greater
+than eight.
+
+. The `]` suffix.
====
Input:
----
-{le} {345:16}
-{be} {-0xabcd:32}
+[345:16le]
+[-0xabcd:32be]
----
Output:
Input:
----
-{be}
+!be
# String length in bits
-{8 * (str_end - str_beg) : 16}
+[8 * (str_end - str_beg) : 16]
# String
<str_beg>
Input:
----
-{20 - ICITTE : 8} * 10
+[20 - ICITTE : 8] * 10
----
Output:
----
====
-=== Current offset setting
+====
+Input:
-This special item sets the <<cur-offset,_current offset_>>.
+----
+[2 * 0.0529 : 32le]
+----
-A current offset setting is:
+Output:
-. The `<` prefix.
+----
+ac ad d8 3d
+----
+====
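The encodings of the examples above can be reproduced with the {py3}
standard library alone. The following is a minimal sketch, not
Normand's implementation: `encode_fixed()` is a hypothetical helper
which assumes that a negative integer is encoded as signed (two's
complement) and any other integer as unsigned.

```python
import struct


def encode_fixed(value, bits, byte_order):
    """Encode `value` on `bits` bits; `byte_order` is "big" or "little"."""
    if isinstance(value, float):
        # IEEE 754 binary32 or binary64
        fmt = {32: "f", 64: "d"}[bits]
        return struct.pack((">" if byte_order == "big" else "<") + fmt, value)

    if bits not in (8, 16, 24, 32, 40, 48, 56, 64):
        raise ValueError("unsupported length: {}".format(bits))

    # `bool` becomes `int`; raises `OverflowError` if the value doesn't fit.
    value = int(value)
    return value.to_bytes(bits // 8, byte_order, signed=value < 0)


print(encode_fixed(345, 16, "little").hex())        # [345:16le]
print(encode_fixed(-0xabcd, 32, "big").hex())       # [-0xabcd:32be]
print(encode_fixed(2 * 0.0529, 32, "little").hex()) # [2 * 0.0529 : 32le]
```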
-. A positive integer (hexadecimal starting with `0x` or `0X` accepted)
- which is the new current offset.
+=== LEB128 integer
-. The `>` suffix.
+An _LEB128 integer_ represents a variable number of bytes encoding an
+unsigned or signed integer which is the result of evaluating a {py3}
+expression following the https://en.wikipedia.org/wiki/LEB128[LEB128]
+format.
+
+An LEB128 integer is:
+
+. The `[` prefix.
+
+. A valid {py3} expression of which the evaluation result type
+ is `int` or `bool` (automatically converted to `int`).
++
+For an LEB128 integer at some source location{nbsp}__**L**__, this
+expression may contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before encoding the integer).
+
+. The `:` character.
+
+. One of:
++
+--
+[horizontal]
+`uleb128`:: Use the unsigned LEB128 format.
+`sleb128`:: Use the signed LEB128 format.
+--
+
+. The `]` suffix.
====
Input:
----
- {ICITTE : 8} * 8
-<0x61> {ICITTE : 8} * 8
+[624485 : uleb128]
----
Output:
----
-00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
+e5 8e 26
----
====
Input:
----
-aa bb cc dd <meow> ee ff
-<12> 11 22 33 <mix> 44 55
-{meow : 8} {mix : 8}
+aa bb cc dd
+<meow>
+ee ff
+[-981238311 + (meow * -23) : sleb128]
+"hello"
----
Output:
----
-aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
+aa bb cc dd ee ff fd fa 8d ac 7c 68 65 6c 6c 6f ┆ ••••••••••|hello
----
====
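To see where those bytes come from, here is a minimal {py3} sketch of
the usual LEB128 encoding loops. The `uleb128()` and `sleb128()`
function names are hypothetical and this isn't taken from `normand.py`.

```python
def uleb128(value):
    """Encode a non-negative `int` as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 requires a non-negative integer")

    out = bytearray()

    while True:
        byte = value & 0x7f
        value >>= 7

        if value == 0:
            out.append(byte)
            return bytes(out)

        # More groups follow: set the continuation bit.
        out.append(byte | 0x80)


def sleb128(value):
    """Encode an `int` as signed LEB128."""
    out = bytearray()

    while True:
        byte = value & 0x7f
        value >>= 7  # arithmetic shift: keeps the sign

        # Done when the rest is pure sign extension of bit 6 of this group.
        if (value == 0 and not byte & 0x40) or (value == -1 and byte & 0x40):
            out.append(byte)
            return bytes(out)

        out.append(byte | 0x80)


print(uleb128(624485).hex())
print(sleb128(-981238311 + 4 * -23).hex())  # `meow` is 4 in the example above
```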
-=== Label
+=== String
-A _label_ associates a name to the <<cur-offset,current offset>>.
+A _string_ represents a variable number of bytes encoding a string which
+is the result of evaluating a {py3} expression using the UTF-8, UTF-16,
+UTF-32, or Latin-1 to Latin-10 encoding.
-All the labels of a whole Normand input must have unique names.
+A string has two possible forms:
-A label may not share the name of a <<variable-assignment,variable>>
-name.
+Encoding prefix form:: {empty}
++
+. An encoding amongst:
++
+--
+[horizontal]
+`s:u8`::
+`u8`::
+ UTF-8.
-A label name may not be `ICITTE` (see <<value>> and
-<<variable-assignment>> to learn more).
+`s:u16be`::
+`u16be`::
+ UTF-16BE.
-A label is:
+`s:u16le`::
+`u16le`::
+ UTF-16LE.
-. The `<` prefix.
+`s:u32be`::
+`u32be`::
+ UTF-32BE.
-. A valid {py3} name which is not `ICITTE`.
+`s:u32le`::
+`u32le`::
+ UTF-32LE.
-. The `>` suffix.
+`s:latin1`::
+ ISO/IEC 8859-1.
-=== Variable assignment
+`s:latin2`::
+ ISO/IEC 8859-2.
-A _variable assignment_ associates a name to the integral result of an
-evaluated {py3} expression.
+`s:latin3`::
+ ISO/IEC 8859-3.
+
+`s:latin4`::
+ ISO/IEC 8859-4.
-For a variable assignment at some source location{nbsp}__**L**__, its
-{py3} expression may contain the name of any accessible <<label,label>>,
-including the name of a label defined after{nbsp}__**L**__, as well as
-the name of any variable known at{nbsp}__**L**__.
+`s:latin5`::
+ ISO/IEC 8859-9.
-An accessible label is either:
+`s:latin6`::
+ ISO/IEC 8859-10.
-* Outside of the current <<group,group>>.
-* Within the same immediate group (not within a nested group).
+`s:latin7`::
+ ISO/IEC 8859-13.
-A variable name may not be `ICITTE`.
+`s:latin8`::
+ ISO/IEC 8859-14.
-In the {py3} expression of a variable assignment, the special name
-`ICITTE` is the <<cur-offset,current offset>>.
+`s:latin9`::
+ ISO/IEC 8859-15.
-A variable is:
+`s:latin10`::
+ ISO/IEC 8859-16.
+--
. The ``pass:[{]`` prefix.
-. A valid {py3} name which is not `ICITTE`.
+. A valid {py3} expression of which the evaluation result type
+ is `bool`, `int`, `float`, or `str` (the first three automatically
+ converted to `str`).
++
+For a string at some source location{nbsp}__**L**__, this expression may
+contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before encoding the string).
-. The `=` character.
+. The `}` suffix.
-. A valid {py3} expression.
+Encoding suffix form:: {empty}
++
+. The `[` prefix.
-. The `}` suffix.
+. A valid {py3} expression of which the evaluation result type
+ is `bool`, `int`, `float`, or `str` (the first three automatically
+ converted to `str`).
++
+For a string at some source location{nbsp}__**L**__, this expression may
+contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before encoding the string).
-====
-Input:
+. The `:` character.
-----
-{mix = 101} {le}
-{meow = 42} 11 22 {meow:8} 33 {meow = ICITTE + 17}
-"yooo" {meow + mix : 16}
-----
+. A string encoding amongst:
++
+--
+[horizontal]
+`s:u8`::
+ UTF-8.
-Output:
+`s:u16be`::
+ UTF-16BE.
-----
-11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
-----
-====
+`s:u16le`::
+ UTF-16LE.
-=== Group
+`s:u32be`::
+ UTF-32BE.
-A _group_ is a scoped sequence of items.
+`s:u32le`::
+ UTF-32LE.
-The <<label,labels>> within a group aren't visible outside of it.
+`s:latin1`::
+ ISO/IEC 8859-1.
-The main purpose of a group is to <<repetition,repeat>> more than a
-single item.
+`s:latin2`::
+ ISO/IEC 8859-2.
-A group is:
+`s:latin3`::
+ ISO/IEC 8859-3.
+
+`s:latin4`::
+ ISO/IEC 8859-4.
+
+`s:latin5`::
+ ISO/IEC 8859-9.
-. The `(` prefix.
+`s:latin6`::
+ ISO/IEC 8859-10.
-. Zero or more items.
+`s:latin7`::
+ ISO/IEC 8859-13.
+
+`s:latin8`::
+ ISO/IEC 8859-14.
+
+`s:latin9`::
+ ISO/IEC 8859-15.
+
+`s:latin10`::
+ ISO/IEC 8859-16.
+--
-. The `)` suffix.
+. The `]` suffix.
====
Input:
----
-((aa bb cc) dd () ee) "leclerc"
+{iter = 1}
+
+!repeat 10
+ u8{iter} " "
+ {iter = iter + 1}
+!end
----
Output:
----
-aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
+31 20 32 20 33 20 34 20 35 20 36 20 37 20 38 20 ┆ 1 2 3 4 5 6 7 8
+39 20 31 30 20 ┆ 9 10
----
====
Input:
----
-((aa bb cc) * 3 dd ee) * 5
+{meow = 'salut jérémie'}
+[meow.upper() : s:latin1]
----
Output:
----
-aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
-cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
-ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
-bb cc aa bb cc dd ee
+53 41 4c 55 54 20 4a c9 52 c9 4d 49 45 ┆ SALUT J•R•MIE
----
====
-====
+=== Current offset setting
+
+This special item sets the <<cur-offset,_current offset_>>.
+
+A current offset setting is:
+
+. The `<` prefix.
+
+. A <<const-int,positive constant integer>> which is the new current
+ offset.
+
+. The `>` suffix.
+
+====
+Input:
+
+----
+ [ICITTE : 8] * 8
+<0x61> [ICITTE : 8] * 8
+----
+
+Output:
+
+----
+00 01 02 03 04 05 06 07 61 62 63 64 65 66 67 68 ┆ ••••••••abcdefgh
+----
+====
+
+====
+Input:
+
+----
+aa bb cc dd <meow> ee ff
+<12> 11 22 33 <mix> 44 55
+[meow : 8] [mix : 8]
+----
+
+Output:
+
+----
+aa bb cc dd ee ff 11 22 33 44 55 04 0f ┆ •••••••"3DU••
+----
+====
+
+=== Current offset alignment
+
+A _current offset alignment_ represents zero or more padding bytes to
+make the <<cur-offset,current offset>> meet a given
+https://en.wikipedia.org/wiki/Data_structure_alignment[alignment] value.
+
+More specifically, for an alignment value of{nbsp}__**N**__{nbsp}bits,
+a current offset alignment represents the required padding bytes until
+the current offset is a multiple of __**N**__{nbsp}/{nbsp}8.
+
+A current offset alignment is:
+
+. The `@` prefix.
+
+. A <<const-int,positive constant integer>> which is the alignment value
+ in _bits_.
++
+This value must be greater than zero and a multiple of{nbsp}8.
+
+. **Optional**:
++
+--
+. The ``pass:[~]`` prefix.
+. A <<const-int,positive constant integer>> which is the value of the
+ byte to use as padding to align the <<cur-offset,current offset>>.
+--
++
+Without this section, the padding byte value is zero.
+
+====
+Input:
+
+----
+11 22 (@32 aa bb cc) * 3
+----
+
+Output:
+
+----
+11 22 00 00 aa bb cc 00 aa bb cc 00 aa bb cc
+----
+====
+
+====
+Input:
+
+----
+!le
+77 88
+@32~0xcc [-893.5:32]
+@128~0x55 "meow"
+----
+
+Output:
+
+----
+77 88 cc cc 00 60 5f c4 55 55 55 55 55 55 55 55 ┆ w••••`_•UUUUUUUU
+6d 65 6f 77 ┆ meow
+----
+====
+
+====
+Input:
+
+----
+aa bb cc <29> @64~255 "zoom"
+----
+
+Output:
+
+----
+aa bb cc ff ff ff 7a 6f 6f 6d ┆ ••••••zoom
+----
+====
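The number of padding bytes in the examples above follows from simple
modular arithmetic. As a sketch in {py3} (`alignment_padding()` is a
hypothetical helper, not a Normand function):

```python
def alignment_padding(offset, align_bits, pad_byte=0):
    """Return the padding bytes which align `offset` to `align_bits` bits."""
    if align_bits <= 0 or align_bits % 8 != 0:
        raise ValueError("alignment must be a positive multiple of 8")

    align_bytes = align_bits // 8

    # Distance to the next multiple of `align_bytes` (zero if already aligned).
    pad_len = -offset % align_bytes
    return bytes([pad_byte]) * pad_len


print(alignment_padding(2, 32).hex())        # after `11 22`, for `@32`
print(alignment_padding(29, 64, 255).hex())  # after `<29>`, for `@64~255`
```

An already aligned current offset yields no padding at all, which is
why a current offset alignment represents _zero_ or more bytes.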
+
+=== Filling
+
+A _filling_ represents zero or more padding bytes to make the
+<<cur-offset,current offset>> reach a given value.
+
+A filling is:
+
+. The ``pass:[+]`` prefix.
+
+. One of:
+
+** A <<const-int,positive constant integer>> which is the current offset
+ target.
+
+** The ``pass:[{]`` prefix, a valid {py3} expression of which the
+ evaluation result type is `int` or `bool` (automatically converted to
+ `int`), and the `}` suffix.
++
+For a filling at some source location{nbsp}__**L**__, this expression
+may contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before generating the padding
+bytes).
+
+** A valid {py3} name.
++
+For the name `__NAME__`, this is equivalent to the
+`pass:[{]__NAME__}` form above.
+
++
+This value must be greater than or equal to the current offset where
+it's used.
+
+. **Optional**:
++
+--
+. The ``pass:[~]`` prefix.
+. A <<const-int,positive constant integer>> which is the value of the
+ byte to use as padding to reach the current offset target.
+--
++
+Without this section, the padding byte value is zero.
+
+====
+Input:
+
+----
+aa bb cc dd
++0x40
+"hello world"
+----
+
+Output:
+
+----
+aa bb cc dd 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
+00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ┆ ••••••••••••••••
+68 65 6c 6c 6f 20 77 6f 72 6c 64 ┆ hello world
+----
+====
+
+====
+Input:
+
+----
+!macro part(iter, fill)
+ <0> "particular security " [ord('0') + iter : 8] +fill~0x80
+!end
+
+{iter = 1}
+
+!repeat 5
+ m:part(iter, {32 + 4 * iter})
+ {iter = iter + 1}
+!end
+----
+
+Output:
+
+----
+70 61 72 74 69 63 75 6c 61 72 20 73 65 63 75 72 ┆ particular secur
+69 74 79 20 31 80 80 80 80 80 80 80 80 80 80 80 ┆ ity 1•••••••••••
+80 80 80 80 70 61 72 74 69 63 75 6c 61 72 20 73 ┆ ••••particular s
+65 63 75 72 69 74 79 20 32 80 80 80 80 80 80 80 ┆ ecurity 2•••••••
+80 80 80 80 80 80 80 80 80 80 80 80 70 61 72 74 ┆ ••••••••••••part
+69 63 75 6c 61 72 20 73 65 63 75 72 69 74 79 20 ┆ icular security
+33 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ 3•••••••••••••••
+80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
+61 72 20 73 65 63 75 72 69 74 79 20 34 80 80 80 ┆ ar security 4•••
+80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
+80 80 80 80 80 80 80 80 70 61 72 74 69 63 75 6c ┆ ••••••••particul
+61 72 20 73 65 63 75 72 69 74 79 20 35 80 80 80 ┆ ar security 5•••
+80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••••••
+80 80 80 80 80 80 80 80 80 80 80 80 ┆ ••••••••••••
+----
+====
+
+=== Label
+
+A _label_ associates a name to the <<cur-offset,current offset>>.
+
+All the labels of a whole Normand input must have unique names.
+
+A label must not share the name of a <<variable-assignment,variable>>.
+
+A label is:
+
+. The `<` prefix.
+
+. A valid {py3} name which is not `ICITTE`.
+
+. The `>` suffix.
+
+=== Variable assignment
+
+A _variable assignment_ associates a name to the result of an evaluated
+{py3} expression.
+
+A variable assignment is:
+
+. The ``pass:[{]`` prefix.
+
+. A valid {py3} name which is not `ICITTE`.
+
+. The `=` character.
+
+. A valid {py3} expression of which the evaluation result type is `int`,
+ `float`, `bool` (automatically converted to `int`), or `str`.
++
+For a variable assignment at some source location{nbsp}__**L**__, this
+expression may contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>>.
+
+. The `}` suffix.
+
+====
+Input:
+
+----
+{mix = 101} !le
+{meow = 42} 11 22 [meow:8] 33 {meow = ICITTE + 17}
+"yooo" [meow + mix : 16]
+----
+
+Output:
+
+----
+11 22 2a 33 79 6f 6f 6f 7a 00 ┆ •"*3yoooz•
+----
+====
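
To make the evaluation order of the example above explicit, here is a step-by-step model in plain Python. It is a sketch of the described semantics only (the `variables` dictionary and the use of `len(out)` as `ICITTE` are assumptions of this illustration, with the data starting at offset 0).

```python
out = bytearray()
variables = {}

variables["mix"] = 101   # {mix = 101}
byte_order = "little"    # !le
variables["meow"] = 42   # {meow = 42}

out += bytes([0x11, 0x22])                         # 11 22
out += variables["meow"].to_bytes(1, byte_order)   # [meow:8]
out += bytes([0x33])                               # 33

# {meow = ICITTE + 17}: four bytes exist at this point, so ICITTE is 4.
variables["meow"] = len(out) + 17

out += b"yooo"                                     # "yooo"

# [meow + mix : 16]: 21 + 101 is 122 (0x7a), little endian.
out += (variables["meow"] + variables["mix"]).to_bytes(2, byte_order)
```

The second assignment to `meow` replaces the first one, which is why the final 16-bit number is 0x7a and not 0x8f.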
+
+=== Group
+
+A _group_ is a scoped sequence of items.
+
+The <<label,labels>> within a group aren't visible outside of it.
+
+The main purpose of a group is to <<post-item-repetition,repeat>> more
+than a single item and to isolate labels.
+
+A group is:
+
+. The `(`, `!group`, or `!g` opening.
+
+. Zero or more items except, recursively, a macro definition block.
+
+. Depending on the group opening:
++
+--
+`(`::
+ The `)` closing.
+
+`!group`::
+`!g`::
+ The `!end` closing.
+--
+
+====
+Input:
+
+----
+((aa bb cc) dd () ee) "leclerc"
+----
+
+Output:
+
+----
+aa bb cc dd ee 6c 65 63 6c 65 72 63 ┆ •••••leclerc
+----
+====
+
+====
+Input:
+
+----
+!group
+ (aa bb cc) * 3 dd ee
+!end * 5
+----
+
+Output:
+
+----
+aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa bb
+cc aa bb cc dd ee aa bb cc aa bb cc aa bb cc dd
+ee aa bb cc aa bb cc aa bb cc dd ee aa bb cc aa
+bb cc aa bb cc dd ee
+----
+====
+
+====
+Input:
+
+----
+!be
+(
+ <str_beg> u16le"sébastien diaz" <str_end>
+ [ICITTE - str_beg : 8]
+ [(end - str_beg) * 5 : 24]
+) * 3
+<end>
+----
+
+Output:
+
+----
+73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
+6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
+73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
+6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
+73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
+6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
+----
+====
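
The last example above mixes a label local to the group (`str_beg`) with a label defined after it (`end`). The following Python sketch reproduces the same bytes under one simplifying assumption: the offset of `end` is computed up front from the known item sizes, which isn't necessarily how Normand resolves such a label.

```python
text = "sébastien diaz".encode("utf-16-le")

# Size of one repetition: string, 8-bit number, and 24-bit number.
item_size = len(text) + 1 + 3

# Offset of the `<end>` label (assumed known in advance for this sketch).
end = 3 * item_size

out = bytearray()

for _ in range(3):
    # `<str_beg>` only exists within the group: each repetition gets its own.
    str_beg = len(out)
    out += text

    # [ICITTE - str_beg : 8]
    out += (len(out) - str_beg).to_bytes(1, "big")

    # [(end - str_beg) * 5 : 24]
    out += ((end - str_beg) * 5).to_bytes(3, "big")
```

Because `str_beg` differs for each repetition, the 24-bit number decreases (0x1e0, 0x140, 0xa0) while the 8-bit string size stays 0x1c.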
+
+=== Conditional block
+
+A _conditional block_ represents either the bytes of zero or more items
+if some expression is true, or the bytes of zero or more other items if
+it's false.
+
+A conditional block is:
+
+. The `!if` opening.
+
+. One of:
+
+** The ``pass:[{]`` prefix, a valid {py3} expression of which the
+ evaluation result type is `int` or `bool` (automatically converted to
+ `int`), and the `}` suffix.
++
+For a conditional block at some source location{nbsp}__**L**__, this
+expression may contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before handling the contained
+items).
+
+** A valid {py3} name.
++
+For the name `__NAME__`, this is equivalent to the
+`pass:[{]__NAME__}` form above.
+
+. Zero or more items to be handled when the condition is true
+ except, recursively, a macro definition block.
+
+. **Optional**:
+
+.. The `!else` opening.
+.. Zero or more items to be handled when the condition is false
+ except, recursively, a macro definition block.
+
+. The `!end` closing.
+
+====
+Input:
+
+----
+{at = 1}
+{rep_count = 9}
+
+!repeat rep_count
+ "meow "
+
+ !if {ICITTE > 25}
+ "mix"
+ !else
+ "zoom"
+ !end
+
+ !if {at < rep_count} 20 !end
+
+ {at = at + 1}
+!end
+----
+
+Output:
+
+----
+6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 6f 77 20 7a ┆ meow zoom meow z
+6f 6f 6d 20 6d 65 6f 77 20 7a 6f 6f 6d 20 6d 65 ┆ oom meow zoom me
+6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 78 20 ┆ ow mix meow mix
+6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 6d 69 ┆ meow mix meow mi
+78 20 6d 65 6f 77 20 6d 69 78 20 6d 65 6f 77 20 ┆ x meow mix meow
+6d 69 78 ┆ mix
+----
+====
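
The example above relies on `ICITTE` being evaluated when Normand reaches the `!if` opening, that is, after the `"meow "` string of the same iteration. A plain Python model of that behaviour, where `len(out)` plays the role of `ICITTE` (an assumption valid only because the data starts at offset 0), could look like this:

```python
out = bytearray()
at = 1
rep_count = 9

for _ in range(rep_count):      # !repeat rep_count
    out += b"meow "

    if len(out) > 25:           # !if {ICITTE > 25}
        out += b"mix"
    else:                       # !else
        out += b"zoom"

    if at < rep_count:          # !if {at < rep_count} 20 !end
        out += b"\x20"

    at += 1                     # {at = at + 1}
```

In the third iteration the offset is exactly 25 when the condition is evaluated, so `"zoom"` is still selected; `"mix"` only appears from the fourth iteration.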
+
+====
+Input:
+
+----
+<str_beg>
+u16le"meow mix!"
+<str_end>
+
+!if {str_end - str_beg > 10}
+ " BIG"
+!end
+----
+
+Output:
+
+----
+6d 00 65 00 6f 00 77 00 20 00 6d 00 69 00 78 00 ┆ m•e•o•w• •m•i•x•
+21 00 20 42 49 47 ┆ !• BIG
+----
+====
+
+=== Repetition block
+
+A _repetition block_ represents the bytes of one or more items repeated
+a given number of times.
+
+A repetition block is:
+
+. The `!repeat` or `!r` opening.
+
+. One of:
+
+** A <<const-int,positive constant integer>> which is the number of
+ times to repeat the items of the block.
+
+** The ``pass:[{]`` prefix, a valid {py3} expression of which the
+ evaluation result type is `int` or `bool` (automatically converted to
+ `int`), and the `}` suffix.
++
+For a repetition block at some source location{nbsp}__**L**__, this
+expression may contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before handling the items to
+repeat).
+
+** A valid {py3} name.
++
+For the name `__NAME__`, this is equivalent to the
+`pass:[{]__NAME__}` form above.
+
+. Zero or more items except, recursively, a macro definition block.
+
+. The `!end` closing.
+
+You may also use a <<post-item-repetition,post-item repetition>> after
+some items. The form ``!repeat{nbsp}__X__{nbsp}__ITEMS__{nbsp}!end``
+is equivalent to ``(__ITEMS__){nbsp}pass:[*]{nbsp}__X__``.
+
+====
+Input:
+
+----
+!repeat 0o400
+ [end - ICITTE - 1 : 8]
+!end
+
+<end>
+----
+
+Output:
+
+----
+ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0 ┆ ••••••••••••••••
+ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0 ┆ ••••••••••••••••
+df de dd dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 ┆ ••••••••••••••••
+cf ce cd cc cb ca c9 c8 c7 c6 c5 c4 c3 c2 c1 c0 ┆ ••••••••••••••••
+bf be bd bc bb ba b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 ┆ ••••••••••••••••
+af ae ad ac ab aa a9 a8 a7 a6 a5 a4 a3 a2 a1 a0 ┆ ••••••••••••••••
+9f 9e 9d 9c 9b 9a 99 98 97 96 95 94 93 92 91 90 ┆ ••••••••••••••••
+8f 8e 8d 8c 8b 8a 89 88 87 86 85 84 83 82 81 80 ┆ ••••••••••••••••
+7f 7e 7d 7c 7b 7a 79 78 77 76 75 74 73 72 71 70 ┆ •~}|{zyxwvutsrqp
+6f 6e 6d 6c 6b 6a 69 68 67 66 65 64 63 62 61 60 ┆ onmlkjihgfedcba`
+5f 5e 5d 5c 5b 5a 59 58 57 56 55 54 53 52 51 50 ┆ _^]\[ZYXWVUTSRQP
+4f 4e 4d 4c 4b 4a 49 48 47 46 45 44 43 42 41 40 ┆ ONMLKJIHGFEDCBA@
+3f 3e 3d 3c 3b 3a 39 38 37 36 35 34 33 32 31 30 ┆ ?>=<;:9876543210
+2f 2e 2d 2c 2b 2a 29 28 27 26 25 24 23 22 21 20 ┆ /.-,+*)('&%$#"!
+1f 1e 1d 1c 1b 1a 19 18 17 16 15 14 13 12 11 10 ┆ ••••••••••••••••
+0f 0e 0d 0c 0b 0a 09 08 07 06 05 04 03 02 01 00 ┆ ••••••••••••••••
+----
+====
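
As a cross-check of the example above, the same 256 bytes can be produced with a short Python loop. This sketch assumes the data starts at offset 0 and that the offset of the `end` label is known in advance (each repetition produces exactly one byte).

```python
count = 0o400   # 256 repetitions
end = count     # offset of `<end>`: one byte per repetition

out = bytearray()

for _ in range(count):
    icitte = len(out)               # current offset before the item
    out.append(end - icitte - 1)    # [end - ICITTE - 1 : 8]
```

The first byte is 0xff (`256 - 0 - 1`) and the last one is 0x00 (`256 - 255 - 1`).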
+
+====
Input:
----
-{be}
-(
- <str_beg> u16le"sébastien diaz" <str_end>
- {ICITTE - str_beg : 8}
- {(end - str_beg) * 5 : 24}
-) * 3
+{times = 1}
+
+aa bb cc dd
+
+!repeat 3
+ <here>
+
+ !repeat {here + 1}
+ ee ff
+ !end
+
+ 11 22 !repeat times 33 !end
+
+ {times = times + 1}
+!end
+
+"coucou!"
+----
+
+Output:
+
+----
+aa bb cc dd ee ff ee ff ee ff ee ff ee ff 11 22 ┆ •••••••••••••••"
+33 ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ 3•••••••••••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
+ff ee ff ee ff 11 22 33 33 ee ff ee ff ee ff ee ┆ ••••••"33•••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff ee ff ee ┆ ••••••••••••••••
+ff ee ff ee ff ee ff ee ff ee ff ee ff 11 22 33 ┆ ••••••••••••••"3
+33 33 63 6f 75 63 6f 75 21 ┆ 33coucou!
+----
+====
+
+=== Transformation block
+
+A _transformation block_ represents the bytes of one or more items
+transformed into other bytes by a function.
+
+As of this version, Normand only offers a predetermined set of
+transformation functions.
+
+A transformation block is:
+
+. The `!transform` or `!t` opening.
+
+. A transformation function name amongst:
++
+--
+[horizontal]
+`base64`::
+`b64`::
+ Standard https://datatracker.ietf.org/doc/html/rfc4648.html#section-4[Base64].
+
+`base64u`::
+`b64u`::
+ URL-safe Base64, using `-` instead of `pass:[+]` and `_` instead of
+ `/`.
+
+`base32`::
+`b32`::
+ Standard https://datatracker.ietf.org/doc/html/rfc4648.html#section-6[Base32].
+
+`base16`::
+`b16`::
+ Standard https://datatracker.ietf.org/doc/html/rfc4648.html#section-8[Base16].
+
+`ascii85`::
+`a85`::
+ https://en.wikipedia.org/wiki/Ascii85[Ascii85] without padding.
+
+`ascii85p`::
+`a85p`::
+ Ascii85 with padding.
+
+`base85`::
+`b85`::
+ https://en.wikipedia.org/wiki/Ascii85[Base85] (like Git-style binary
+ diffs) without padding.
+
+`base85p`::
+`b85p`::
+ Base85 with padding.
+
+`quopri`::
+`qp`::
+ MIME
+ https://datatracker.ietf.org/doc/html/rfc2045#section-6.7[quoted-printable]
+ without quoted whitespaces.
+
+`quoprit`::
+`qpt`::
+ MIME quoted-printable with quoted whitespaces.
+
+`gzip`::
+`gz`::
+ https://en.wikipedia.org/wiki/Gzip[gzip].
+
+`bzip2`::
+`bz2`::
+ https://en.wikipedia.org/wiki/Bzip2[bzip2].
+--
+
+. Zero or more items except, recursively, a macro definition block.
++
+Any {py3} expression within any of those items may not refer to a future
+<<label,label>>.
++
+The value of the special name `ICITTE` in any {py3} expression within
+any of those items is the <<cur-offset,current offset>> _before_ Normand
+applies the transformation function. Therefore, labels defined within
+those items also have the current offset value _before_ Normand applies
+the transformation function.
+
+. The `!end` closing.
+
+The <<cur-offset,current offset>> after having handled the last item of
+a transformation block is the value of the current offset before
+handling the first item plus the size of the generated (transformed)
+bytes. In other words, <<current-offset-setting,current offset
+settings>> within the items of the block have no impact outside said
+block.
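
To get a feel for what each transformation function produces, the table below maps the names above to Python standard library functions. This mapping is an assumption made for illustration (the `TRANSFORMS` dictionary and the `transform()` helper are hypothetical names, and the text above doesn't state which functions Normand calls), although the outputs appear to agree with the examples of this section.

```python
import base64
import bz2
import gzip
import quopri

# Assumed mapping from transformation function names to stdlib functions.
TRANSFORMS = {
    "base64": base64.b64encode,
    "base64u": base64.urlsafe_b64encode,
    "base32": base64.b32encode,
    "base16": base64.b16encode,
    "ascii85": base64.a85encode,
    "ascii85p": lambda data: base64.a85encode(data, pad=True),
    "base85": base64.b85encode,
    "base85p": lambda data: base64.b85encode(data, pad=True),
    "quopri": lambda data: quopri.encodestring(data, quotetabs=False),
    "quoprit": lambda data: quopri.encodestring(data, quotetabs=True),
    "gzip": gzip.compress,
    "bzip2": bz2.compress,
}


def transform(name: str, data: bytes) -> bytes:
    """Apply the transformation named `name` to `data`."""
    return TRANSFORMS[name](data)


# The current offset after the block grows by the transformed size.
offset_before = 16
encoded = transform("base64", b"hi")
offset_after = offset_before + len(encoded)
```

Note how `offset_after` depends on the size of the transformed bytes, not on the size of the original ones.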
+
+====
+Input:
+
+----
+aa bb cc dd
+
+"size of compressed section: " [end - start : 8]
+
+<start>
+
+!transform bzip2
+ "this will be compressed!"
+ 89*100 00*5000
+!end
+
<end>
+
+"yes!"
----
Output:
----
-73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
-6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 e0 ┆ n• •d•i•a•z•••••
-73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
-6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 01 40 ┆ n• •d•i•a•z••••@
-73 00 e9 00 62 00 61 00 73 00 74 00 69 00 65 00 ┆ s•••b•a•s•t•i•e•
-6e 00 20 00 64 00 69 00 61 00 7a 00 1c 00 00 a0 ┆ n• •d•i•a•z•••••
+aa bb cc dd 73 69 7a 65 20 6f 66 20 63 6f 6d 70 ┆ ••••size of comp
+72 65 73 73 65 64 20 73 65 63 74 69 6f 6e 3a 20 ┆ ressed section:
+52 42 5a 68 39 31 41 59 26 53 59 68 e1 8c fc 00 ┆ RBZh91AY&SYh••••
+00 33 d1 e0 c0 00 60 00 5e 66 dc 80 00 20 00 80 ┆ •3••••`•^f••• ••
+00 08 20 00 31 40 d3 43 23 26 20 ca 87 a9 a1 e8 ┆ •• •1@•C#& •••••
+18 29 44 80 9c 80 49 bf cc b3 e8 45 ed e2 76 ad ┆ •)D•••I••••E••v•
+0f 12 8b 8a d6 cd 40 04 7e 2e e4 8a 70 a1 20 d1 ┆ ••••••@•~.••p• •
+c3 19 f8 79 65 73 21 ┆ •••yes!
+----
+====
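
The example above stores the size of the compressed section before the section itself. A Python sketch of the same layout follows; it assumes the standard `bz2` module as the compressor and patches the size byte afterwards, which is a simplification of this illustration rather than a description of how Normand resolves the `end` label.

```python
import bz2

out = bytearray.fromhex("aa bb cc dd")
out += b"size of compressed section: "

# Placeholder for [end - start : 8], patched once `end` is known.
size_pos = len(out)
out += b"\x00"

start = len(out)    # <start>

# Items of the transformation block: string, 89*100, and 00*5000.
payload = b"this will be compressed!" + b"\x89" * 100 + b"\x00" * 5000
out += bz2.compress(payload)

end = len(out)      # <end>: offset before the block plus the transformed size
out[size_pos] = end - start

out += b"yes!"
```

Although the block contains more than 5000 bytes before compression, `end - start` fits in a single byte because only the transformed size counts.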
+
+====
+Input:
+
+----
+88*16
+
+!t a85
+ "I am determined to be cheerful and happy in whatever situation "
+ "I may find myself. For I have learned that the greater part of "
+ "our misery or unhappiness is determined not by our circumstance "
+ "but by our disposition."
+!end
+
+@128~99h
+
+!t qp <beg> [ICITTE - beg : 8] * 50 !end
+----
+
+Output:
+
+----
+88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 88 ┆ ••••••••••••••••
+38 4b 5f 47 59 2b 43 6f 26 2a 41 54 44 58 25 44 ┆ 8K_GY+Co&*ATDX%D
+49 6d 3f 24 46 44 69 3a 32 41 4b 59 4a 72 41 53 ┆ Im?$FDi:2AKYJrAS
+23 6d 6f 46 5f 69 31 2f 44 49 61 6c 27 40 3b 70 ┆ #moF_i1/DIal'@;p
+31 32 2b 44 47 5e 39 47 41 28 45 2c 41 54 68 58 ┆ 12+DG^9GA(E,AThX
+2a 2b 45 4d 37 3d 46 5e 5d 42 2b 44 66 2d 5b 68 ┆ *+EM7=F^]B+Df-[h
+2b 44 6b 50 34 2b 44 2c 3e 2a 41 30 3e 60 37 46 ┆ +DkP4+D,>*A0>`7F
+28 4b 30 22 2f 67 2a 57 25 45 5a 64 70 72 42 4f ┆ (K0"/g*W%EZdprBO
+51 27 71 2b 44 62 55 74 45 63 2c 48 21 2b 45 56 ┆ Q'q+DbUtEc,H!+EV
+3a 2a 46 3c 47 5b 3d 41 4b 59 57 2b 41 52 54 5b ┆ :*F<G[=AKYW+ART[
+6c 45 5a 66 3d 30 45 63 60 46 42 41 66 75 23 37 ┆ lEZf=0Ec`FBAfu#7
+45 5a 66 34 35 46 28 4b 42 3b 2b 45 29 39 43 46 ┆ EZf45F(KB;+E)9CF
+60 28 6c 24 45 2c 5d 4e 2f 41 54 4d 6f 38 42 6c ┆ `(l$E,]N/ATMo8Bl
+62 44 2d 41 54 56 4c 28 44 2f 21 6d 21 41 30 3e ┆ bD-ATVL(D/!m!A0>
+63 2e 46 3c 47 25 3c 2b 45 29 43 43 2b 43 66 2c ┆ c.F<G%<+E)CC+Cf,
+2b 40 73 29 58 30 46 43 42 26 73 41 4b 59 48 29 ┆ +@s)X0FCB&sAKYH)
+46 3c 47 25 3c 2b 45 29 43 43 2b 43 6f 32 2d 45 ┆ F<G%<+E)CC+Co2-E
+2c 54 66 33 46 44 35 5a 32 2f 63 99 99 99 99 99 ┆ ,Tf3FD5Z2/c•••••
+3d 30 30 3d 30 31 3d 30 32 3d 30 33 3d 30 34 3d ┆ =00=01=02=03=04=
+30 35 3d 30 36 3d 30 37 3d 30 38 3d 30 39 0a 3d ┆ 05=06=07=08=09•=
+30 42 3d 30 43 0d 3d 30 45 3d 30 46 3d 31 30 3d ┆ 0B=0C•=0E=0F=10=
+31 31 3d 31 32 3d 31 33 3d 31 34 3d 31 35 3d 31 ┆ 11=12=13=14=15=1
+36 3d 31 37 3d 31 38 3d 31 39 3d 31 41 3d 31 42 ┆ 6=17=18=19=1A=1B
+3d 31 43 3d 31 44 3d 31 45 3d 31 46 20 21 22 23 ┆ =1C=1D=1E=1F !"#
+24 25 26 27 28 29 2a 2b 2c 2d 3d 0a 2e 2f 30 31 ┆ $%&'()*+,-=•./01
+----
+====
+
+=== Macro definition block
+
+A _macro definition block_ associates a name and parameter names to
+a group of items.
+
+A macro definition block doesn't lead to generated bytes itself: a
+<<macro-expansion,macro expansion>> does so.
+
+A macro definition may only exist at the root level, that is, not within
+a <<group,group>>, a <<repetition-block,repetition block>>, a
+<<conditional-block,conditional block>>, or another
+<<macro-definition-block,macro definition block>>.
+
+All macro definitions must have unique names.
+
+A macro definition is:
+
+. The `!macro` or `!m` opening.
+
+. A valid {py3} name (the macro name).
+
+. The `(` parameter name list prefix.
+
+. A comma-separated list of zero or more unique parameter names,
+ each one being a valid {py3} name.
+
+. The `)` parameter name list suffix.
+
+. Zero or more items except, recursively, a macro definition block.
+
+. The `!end` closing.
+
+====
+----
+!macro bake()
+ !le [ICITTE * 8 : 16]
+ u16le"predict explode"
+!end
+----
+====
+
+====
+----
+!macro nail(rep, with_extra, val)
+ {iter = 1}
+
+ !repeat rep
+ [val + iter : uleb128]
+ [0xdeadbeef : 32]
+ {iter = iter + 1}
+ !end
+
+ !if with_extra
+ "meow mix\0"
+ !end
+!end
+----
+====
+
+=== Macro expansion
+
+A _macro expansion_ expands the items of a defined
+<<macro-definition-block,macro>>.
+
+The macro to expand must be defined _before_ the expansion.
+
+The <<state,state>> before handling the first item of the chosen macro
+is:
+
+<<cur-offset,Current offset>>::
+ Unchanged.
+
+<<cur-bo,Current byte order>>::
+ Unchanged.
+
+Variables::
+ The only available variables initially are the macro parameters.
+
+Labels::
+ None.
+
+The state after having handled the last item of the chosen macro is:
+
+Current offset::
+ The one before handling the first item of the macro plus the size
+ of the generated data of the macro expansion.
++
+IMPORTANT: This means <<current-offset-setting,current offset setting>>
+items within the expanded macro don't impact the final current offset.
+
+Current byte order::
+ The one before handling the first item of the macro.
+
+Variables::
+ The ones before handling the first item of the macro.
+
+Labels::
+ The ones before handling the first item of the macro.
+
+A macro expansion is:
+
+. The `m:` prefix.
+
+. A valid {py3} name (the name of the macro to expand).
+
+. The `(` parameter value list prefix.
+
+. A comma-separated list of zero or more parameter values.
++
+The number of parameter values must match the number of parameter
+names of the definition of the chosen macro.
++
+A parameter value is one of:
++
+--
+* A <<const-int,constant integer>>, possibly negative.
+
+* A constant floating point number.
+
+* The ``pass:[{]`` prefix, a valid {py3} expression of which the
+ evaluation result type is `int` or `bool` (automatically converted to
+ `int`), and the `}` suffix.
++
+For a macro expansion at some source location{nbsp}__**L**__, this
+expression may contain:
+
+** The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group.
+** The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__.
+
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before handling the items of the
+chosen macro).
+
+* A valid {py3} name.
++
+For the name `__NAME__`, this is equivalent to the
+`pass:[{]__NAME__pass:[}]` form above.
+--
+
+. The `)` parameter value list suffix.
+
+====
+Input:
+
+----
+!macro bake()
+ !le [ICITTE * 8 : 16]
+ u16le"predict explode"
+!end
+
+"hello [" m:bake() "] world"
+
+m:bake() * 5
+----
+
+Output:
+
+----
+68 65 6c 6c 6f 20 5b 38 00 70 00 72 00 65 00 64 ┆ hello [8•p•r•e•d
+00 69 00 63 00 74 00 20 00 65 00 78 00 70 00 6c ┆ •i•c•t• •e•x•p•l
+00 6f 00 64 00 65 00 5d 20 77 6f 72 6c 64 70 01 ┆ •o•d•e•] worldp•
+70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
+65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 02 ┆ e•x•p•l•o•d•e•p•
+70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
+65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 03 ┆ e•x•p•l•o•d•e•p•
+70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
+65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 04 ┆ e•x•p•l•o•d•e•p•
+70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
+65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 70 05 ┆ e•x•p•l•o•d•e•p•
+70 00 72 00 65 00 64 00 69 00 63 00 74 00 20 00 ┆ p•r•e•d•i•c•t• •
+65 00 78 00 70 00 6c 00 6f 00 64 00 65 00 ┆ e•x•p•l•o•d•e•
+----
+====
+
+====
+Input:
+
+----
+!macro A(val, is_be)
+ !le
+
+ !if is_be
+ !be
+ !end
+
+ [val : 16]
+!end
+
+!macro B(rep, is_be)
+ {iter = 1}
+
+ !repeat rep
+ m:A({iter * 3}, is_be)
+ {iter = iter + 1}
+ !end
+!end
+
+m:B(5, 1)
+m:B(3, 0)
+----
+
+Output:
+
+----
+00 03 00 06 00 09 00 0c 00 0f 03 00 06 00 09 00
----
====
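
The state rules of a macro expansion resemble those of a function call: parameters are the only initial variables, and whatever the macro changes (variables, byte order) doesn't leak to the caller. The example above can be modelled with two Python functions. The names `expand_a()` and `expand_b()` are hypothetical and this is only a sketch of the described behaviour.

```python
def expand_a(val: int, is_be: int) -> bytes:
    """Model of `m:A(val, is_be)`."""
    byte_order = "little"       # !le

    if is_be:                   # !if is_be
        byte_order = "big"      # !be

    # [val : 16]; `byte_order` is local, so the caller's one is unchanged.
    return val.to_bytes(2, byte_order)


def expand_b(rep: int, is_be: int) -> bytes:
    """Model of `m:B(rep, is_be)`."""
    # The only variables available initially are the macro parameters.
    variables = {"rep": rep, "is_be": is_be}
    variables["iter"] = 1       # {iter = 1}
    out = bytearray()

    for _ in range(variables["rep"]):                               # !repeat rep
        out += expand_a(variables["iter"] * 3, variables["is_be"])  # m:A({iter * 3}, is_be)
        variables["iter"] += 1                                      # {iter = iter + 1}

    return bytes(out)


out = expand_b(5, 1) + expand_b(3, 0)   # m:B(5, 1) then m:B(3, 0)
```

Each call starts with a fresh `iter`, which is why the second expansion restarts at 3 instead of continuing at 18.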
-=== Repetition
+====
+Input:
+
+----
+!macro flt32be(val) !be [val : 32] !end
+
+"CHEETOS"
+m:flt32be(-42.17)
+m:flt32be(56.23e-4)
+----
+
+Output:
-A _repetition_ represents the bytes of an item repeated a given number
-of times.
+----
+43 48 45 45 54 4f 53 c2 28 ae 14 3b b8 41 25 ┆ CHEETOS•(••;•A%
+----
+====
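
The floating point parameter values of the example above end up as 32-bit big-endian numbers. Assuming the usual IEEE 754 single-precision encoding, Python's `struct` module gives the same bytes:

```python
import struct

out = b"CHEETOS"
out += struct.pack(">f", -42.17)     # m:flt32be(-42.17)
out += struct.pack(">f", 56.23e-4)   # m:flt32be(56.23e-4)
```

Here `>` selects the big-endian byte order and `f` a 32-bit float.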
+
+=== Post-item repetition
-A repetition is:
+A _post-item repetition_ represents the bytes of an item repeated a
+given number of times.
-. Any item.
+A post-item repetition is:
+
+. One of those items:
+
+** A <<byte-constant,byte constant>>.
+** A <<literal-string,literal string>>.
+** A <<fixed-length-number,fixed-length number>>.
+** An <<leb128-integer,LEB128 integer>>.
+** A <<string,string>>.
+** A <<macro-expansion,macro expansion>>.
+** A <<transformation-block,transformation block>>.
+** A <<group,group>>.
. The ``pass:[*]`` character.
** A positive integer (hexadecimal starting with `0x` or `0X` accepted)
which is the number of times to repeat the previous item.
-** The ``pass:[{]`` prefix, a valid {py3} expression, and the
- ``pass:[}]`` suffix.
+** The ``pass:[{]`` prefix, a valid {py3} expression of which the
+ evaluation result type is `int` or `bool` (automatically converted to
+ `int`), and the `}` suffix.
++
+For a post-item repetition at some source location{nbsp}__**L**__, this
+expression may contain:
++
+--
+* The name of any <<label,label>> defined before{nbsp}__**L**__
+ which isn't within a nested group and
+ which isn't part of the repeated item.
+* The name of any <<variable-assignment,variable>> known
+ at{nbsp}__**L**__ which isn't part of the repeated item.
+--
++
+The value of the special name `ICITTE` (`int` type) in this expression
+is the <<cur-offset,current offset>> (before handling the items to
+repeat).
+
+** A valid {py3} name.
++
+For the name `__NAME__`, this is equivalent to the
+`pass:[{]__NAME__pass:[}]` form above.
-When using an expression, it can't refer, directly or indirectly, to a
-subsequent label name and to the reserved `ICITTE` name.
+You may also use a <<repetition-block,repetition block>>. The form
+``__ITEM__{nbsp}pass:[*]{nbsp}__X__`` is equivalent to
+``!repeat{nbsp}__X__{nbsp}__ITEM__{nbsp}!end``.
====
Input:
----
-{end - ICITTE - 1 : 8} * 0x100 <end>
+[end - ICITTE - 1 : 8] * 0x100 <end>
----
Output:
----
====
-====
-This example shows how to use a repetition as a conditional section
-depending on some predefined variable.
-
-Input:
-
-----
-aa bb cc dd
-(ee ff "meow mix" 00) * {cond}
-{be} {-1993:16}
-----
-
-Output (`cond` is 0):
-
-----
-aa bb cc dd f8 37
-----
-
-Output (`cond` is 1):
-
-----
-aa bb cc dd ee ff 6d 65 6f 77 20 6d 69 78 00 f8 ┆ ••••••meow mix••
-37 ┆ 7
-----
-====
-
== Command-line tool
If you <<install-normand,installed>> the `normand` package, then you
== {py3} API
-The whole `normand` package/module API is:
+The whole `normand` package/module public API is:
[source,python]
----
+# Byte order.
class ByteOrder(enum.Enum):
# Big endian.
BE = ...
LE = ...
-VarsT = typing.Dict[str, int]
-
-
-class TextLoc:
+# Text location.
+class TextLocation:
# Line number.
@property
def line_no(self) -> int:
...
-class ParseError(RuntimeError):
+# Parsing error message.
+class ParseErrorMessage:
+ # Message text.
+ @property
+ def text(self):
+ ...
+
# Source text location.
@property
- def text_loc(self) -> TextLoc:
+ def text_location(self):
+ ...
+
+
+# Parsing error.
+class ParseError(RuntimeError):
+ # Parsing error messages.
+ #
+ # The first message is the most _specific_ one.
+ @property
+ def messages(self):
...
+# Variables dictionary type (for type hints).
+VariablesT = typing.Dict[str, typing.Union[int, float]]
+
+
+# Labels dictionary type (for type hints).
+LabelsT = typing.Dict[str, int]
+
+
+# Parsing result.
class ParseResult:
# Generated data.
@property
# Updated variable values.
@property
- def variables(self) -> VarsT:
+ def variables(self) -> VariablesT:
...
# Updated main group label values.
@property
- def labels(self) -> VarsT:
+ def labels(self) -> LabelsT:
...
# Final offset.
# Final byte order.
@property
- def byte_order(self) -> typing.Optional[int]:
+ def byte_order(self) -> typing.Optional[ByteOrder]:
...
+
+# Parses the `normand` input using the initial state defined by
+# `init_variables`, `init_labels`, `init_offset`, and `init_byte_order`,
+# and returns the corresponding parsing result.
def parse(normand: str,
- init_variables: typing.Optional[VarsT] = None,
- init_labels: typing.Optional[VarsT] = None,
+ init_variables: typing.Optional[VariablesT] = None,
+ init_labels: typing.Optional[LabelsT] = None,
init_offset: int = 0,
init_byte_order: typing.Optional[ByteOrder] = None) -> ParseResult:
...
The `parse()` function raises a `ParseError` instance should it fail to
parse the `normand` string for any reason.
+
+== Development
+
+Normand is a https://python-poetry.org/[Poetry] project.
+
+To develop it, install it through Poetry and enter the virtual
+environment:
+
+----
+$ poetry install
+$ poetry shell
+$ normand <<< '"lol" * 10 0a'
+----
+
+`normand.py` is processed by:
+
+* https://microsoft.github.io/pyright/[Pyright]
+* https://github.com/psf/black[Black]
+* https://pycqa.github.io/isort/[isort]
+
+Licensing and copyright follow the
+https://reuse.software/tutorial/[REUSE] specification and are checked
+with the https://github.com/fsfe/reuse-tool[reuse tool].
+
+=== Testing
+
+Use https://docs.pytest.org/[pytest] to test Normand once the package is
+part of your virtual environment, for example:
+
+----
+$ poetry install
+$ poetry run pip3 install pytest
+$ poetry run pytest
+----
+
+The `pytest` project is currently not a development dependency in
+`pyproject.toml` due to backward compatibility issues with
+Python{nbsp}3.4.
+
+In the `tests` directory, each `*.nt` file is a test. The file name
+prefix indicates what it's meant to test:
+
+`pass-`::
+ Everything above the `---` line is the valid Normand input
+ to test.
++
+Everything below the `---` line is the expected data
+(whitespace-separated hexadecimal bytes).
+
+`fail-`::
+ Everything above the `---` line is the invalid Normand input
+ to test.
++
+Everything below the `---` line is the expected error message having
+this form:
++
+----
+LINE:COL - MESSAGE
+----
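
The test file layout described above is simple enough to split by hand. The following Python sketch shows one way to do it. `split_nt_test()` and `expected_bytes()` are hypothetical helpers, not part of the actual test suite, and the sketch assumes the first line which is exactly `---` is the separator.

```python
import typing


def split_nt_test(text: str) -> typing.Tuple[str, str]:
    """Split the contents of a `*.nt` file into its two parts.

    Returns the Normand input and the expected part (hexadecimal bytes
    for a `pass-` test, error message for a `fail-` test).
    """
    lines = text.splitlines()

    # Raises `ValueError` if there's no separator line.
    sep_index = lines.index("---")

    normand_input = "\n".join(lines[:sep_index])
    expected = "\n".join(lines[sep_index + 1:]).strip()
    return normand_input, expected


def expected_bytes(expected: str) -> bytes:
    """Convert whitespace-separated hexadecimal bytes to `bytes`."""
    return bytes.fromhex(expected)


# Contents of a hypothetical `pass-` test file.
sample = '"lol" * 2 0a\n---\n6c 6f 6c 6c 6f 6c\n0a\n'
normand_input, expected = split_nt_test(sample)
data = expected_bytes(expected)
```

For a `fail-` test, the second returned string would hold the `LINE:COL - MESSAGE` text instead of hexadecimal bytes.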
+
+=== Contributing
+
+Normand uses https://review.lttng.org/admin/repos/normand,general[Gerrit]
+for code review.
+
+To report a bug, https://github.com/efficios/normand/issues/new[create a
+GitHub issue].