Style guide

Our best practices for good and readable code

Preamble: What do style guides have to take into account?

Style guides, ours included, often mix personal preferences with generally accepted conventions. For this reason alone, it is not advisable to follow a style guide blindly. What we offer here are guidelines that you can follow, but we explicitly recommend that you check our suggestions before adopting them. In general (as in many other programming languages):

Readability counts

So if our suggestions make your code harder for you to read, don't use them; follow your own rules instead.

§1 Code structure

§1.1 Data flow logic

Good code is characterized above all by a clear and meaningful structure. This allows its contents to be grasped quickly, even without reading every line. The central point is that code belonging together is also grouped together. Nothing is more confusing than operations on different data sets being interleaved at random. It is better to perform these operations separately for each data set.

§1.2 Program Flow Logic

It is also important to avoid jumps in logic. If the control flow (i.e., not the data, but the program flow) is heavily intertwined, grasping the relationships becomes tedious and unnecessarily difficult. Here, too, group what belongs together in the program flow. If there are essentially two cases that your code has to process, split your code into these two cases once instead of branching and merging over and over again. This saves you countless if ... else ... endif statements and makes your code much more maintainable and testable. In fact, it is a good idea to avoid else or elseif as much as possible, since this also reduces the risk of superfluous nesting.
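The idea of deciding once which case applies, and then staying in that case, can be sketched as follows. The sketch is written in Python rather than NumeRe because the principle does not depend on the language; all function names are hypothetical and chosen only for illustration.

```python
def process_scalar(value):
    """Handle the first case: a single number."""
    return value * 2.0


def process_list(values):
    """Handle the second case: a list of numbers."""
    return [process_scalar(v) for v in values]


def process(data):
    """Decide once which case applies, then stay in that branch."""
    if isinstance(data, list):
        return process_list(data)
    # No else needed: the list case has already returned.
    return process_scalar(data)
```

Each case lives in its own function and can be read and tested in isolation; the early return replaces an else branch.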

§1.3 Complexity, modularization and procedure length

At the same time, you should make sure that your code does not consist of huge control-flow blocks (e.g. a huge for ... endfor loop), but of easy-to-digest, easy-to-follow blocks that each do only one task if possible. In general, write your code so that each procedure performs exactly one task. If you can subdivide this task further, distribute it across further, subordinate procedures. It is much easier to ensure the correctness of short procedures than to trudge through long, nested control flow blocks. As a guide, if a procedure contains more than three nested control flow blocks, it should be reworked. To quantify this further, a procedure should have fewer than 100 lines of code and a cyclomatic complexity below 15. Use NumeRe's static code analyzer to calculate these values.
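As a rough illustration of "one task per procedure", the following Python sketch (Python is used only because the idea is language-independent; every name is hypothetical) splits a small job into single-purpose steps and combines them in a short top-level function.

```python
def parse_values(lines):
    """Convert text lines into floats, skipping blank lines."""
    return [float(line) for line in lines if line.strip()]


def remove_outliers(values, limit):
    """Keep only values whose magnitude does not exceed limit."""
    return [v for v in values if abs(v) <= limit]


def mean(values):
    """Arithmetic mean; returns None for an empty list."""
    if not values:
        return None
    return sum(values) / len(values)


def mean_of_clean_values(lines, limit=100.0):
    """Top-level task, expressed as a chain of single-purpose steps."""
    return mean(remove_outliers(parse_values(lines), limit))
```

None of the functions contains nested control flow, and each one can be checked on its own.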

§1.4 Testability and tests

Yes, we also thought for a long time that tests for procedures were completely unnecessary. You just try a few cases and then it will work. Unfortunately, no. We were proven wrong often enough that we now write test procedures for all packages that we provide (at least where this is possible). The advantage is that you can run them quickly whenever you change the code and make sure it still works. Furthermore, you can deal with the edge cases thoroughly once and will not forget to test them when you update. However, in order to write tests well, it is important to develop procedures with testability in mind from the beginning: tasks that are as clear and simple as possible, with no or only a few side effects.
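A minimal sketch of what such a test procedure might look like, again in Python for illustration only. The function clamp and its test are hypothetical; the point is that a procedure without side effects can be tested by simply comparing return values, including the edge cases.

```python
def clamp(value, lower, upper):
    """Side-effect-free: the result depends only on the arguments."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return max(lower, min(value, upper))


def test_clamp():
    """Hypothetical test procedure covering regular and edge cases."""
    assert clamp(5, 0, 10) == 5      # regular case
    assert clamp(-1, 0, 10) == 0     # below the range
    assert clamp(11, 0, 10) == 10    # above the range
    assert clamp(0, 0, 0) == 0       # degenerate range
    try:
        clamp(1, 2, 1)               # invalid range must be rejected
    except ValueError:
        return True
    return False
```

Once written, the test can be rerun after every change to clamp.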

§2 Naming of symbols and valid characters

§2.1 Naming of symbols

Ah yes, the topic of names for variables and procedures. Almost every style guide mentions it, and almost always "descriptive naming" is listed as the be-all and end-all. Nevertheless, there is still code in which exactly this is not practiced. Why is that? Probably because it is genuinely hard to find a good name for a symbol that describes its purpose and content well. We proceed as follows: everything that contains data (i.e. variables) gets a noun with a "static" connotation (e.g. calculationResult, fileContent, simulatedData). Interestingly, this already rules out "temp" in a variable name, since this word does not have the "static" connotation.

Procedures, on the other hand, get an "active" name by starting them with a verb (e.g. $calculateResult, $loadFile, $simulateData). Procedures that perform a conversion can alternatively be started with "to" or "from" (e.g. $toKeyVal, $fromXml). "get" and "set" are also legitimate beginnings of procedures that return values or modify such values (e.g. $getValue, $setFileName).

§2.2 Type prefixes

Let's not kid ourselves. Opinions are strongly divided on this topic: some detest type prefixes in front of variable names, others can't do without them. For those who are not familiar with them: type prefixes are single letters at the beginning of variable names that describe their type, for example n for a natural number, f for a floating point number, and s for a string, as in nId, fValue, and sFileName. NumeRe will also recommend them, since they have an immediate advantage: you can already tell from the name what type a variable has, and even whether you should expect it to contain inf or nan. However, there is also the danger of trusting the prefix too much and not validating the content of variables. We have always had good experiences with these prefixes in the past, but this need not be true for everyone.

§2.3 Valid character sets

This is a short section: for all symbols, only the standard ASCII character set should (and can) be used. Naming variables or procedures with an "ä" or a "µ" is not possible and therefore must be avoided.

§2.4 Procedure arguments

It has become good practice to prefix procedure arguments with an additional underscore (placed even before the type prefix), e.g. _sFileName, _nId. The advantage of this is that you can always see which variables are local and which are arguments. This is especially important when the arguments are passed by reference, because you don't want to inadvertently overwrite the user's data. NumeRe will recommend the use of leading underscores, but at the same time hide them in the procedure tooltip to keep the length of the signature manageable.

§2.5 CamelCase vs. snake_case

There are two main systems for combining "multi-word" names for symbols: CamelCase, where each word after the first starts with an uppercase letter and continues in lowercase (e.g. fromXmlFile), and snake_case, where everything is written in lowercase and words are connected by an underscore (e.g. get_data_from_file). The latter variant has an advantage that is also a disadvantage: since there is more space between words, such names are easier to read, but they also take up more space. We therefore recommend using CamelCase and reserving snake_case for special cases. For example, NumeRe uses snake_case prefixes for functions like "is_table()" or "to_string()" to make them stand out more, while all other functions are written as one word in lowercase only. Moreover, istable() or tostring() would be much less readable than e.g. getfilelist().

§3 Formatting and line length

§3.1 Indentations

We are big fans of using indentation to make the structure of code clearer. It is irrelevant whether you prefer TABs or spaces here. The NumeRe editor can be configured for either and will also accept a mixture of both. Much more important is the question of how to indent optimally. In fact, this is quite easy to answer: code that is in a block (i.e. between commands of the form CMD ... endCMD) should be indented by one level (i.e. 1 TAB or 4 spaces) for each block. NumeRe can automatically indent the code as you type if you select the appropriate option in the toolbar or in the Tools menu.

§3.2 Spaces and blank lines

The use of spaces between operators and symbols can greatly improve readability. We follow the rule of thumb that binary operators (e.g. +, * or ==) are separated from the surrounding symbols by one space on each side. This keeps the line length within bounds while visually separating symbols and operators. You can even use spaces to make operator precedence visible: a + b * c and a + b*c are both valid ways of writing the expression, with the latter emphasizing that the multiplication binds more tightly, whereas a^2 is preferable to a ^ 2. By the way, we do not put extra spaces between parentheses and the expressions they contain. There is bracket highlighting for that kind of thing. Instead, we like to use spaces after commas, e.g.: getfilelist("<savepath>/*.ndat", 1)

Just like spaces, empty lines can also help to better understand the structure of a code. Separate logically related blocks from the surrounding code with a blank line. Also separate multiple procedures by at least two blank lines. In fact, much of this can be done by NumeRe itself. In the Tools menu there is an option to have the formatting adjusted to the "NumeRe standard format". This automatically adds and adjusts spaces, indentations, and blank lines for optimal readability in our view.

§3.3 Line breaks and line length

You should always avoid long lines. Although the editor can wrap overlong lines for display, working with such lines turns out to be tedious, especially if the editor is split in the middle so that only half the space or even less is available. We recommend a line length of 100 characters (others prefer 80, but let's face it: most of us have an ultra-wide monitor on the desk anyway). In the settings, you can enable a vertical guide line after 100 characters.

But what if the line has to be longer? There are two possibilities: ignore the rule here, or break the line manually. The latter is possible with the character sequence \\ at the end of a line. If NumeRe finds this sequence, it will append the following line(s) to the current line during execution. A good place for these breaks is before binary operators, so that the wrapped lines start with them, which makes it even clearer that the lines belong together. Also, wrapped lines should be indented by one level from the second line on, e.g.:

## This is a long expression broken down into
## multiple lines
surf 1 + sin(_2pi*t)*Y(3, 2, y, x) \\
    + 0.5*cos(_2pi*t)*Y(5, 2, y, x) \\
    + 0.7*sin(_2pi*t+2)*Y(8, 5, y, x) \\
    -set coords=spherical_pt animate

(If this looks strange, please rotate your smartphone.)


§4.1 KISS - Keep It Short & Simple

Let's keep it short: when you write procedures, keep them as short as possible. Make sure that each procedure does only one task. If this task consists of subtasks, delegate them to subordinate procedures. The simpler the task of each procedure, the less error-prone the implementation. In addition, a solution for an easy-to-understand task is much easier to work out than one that tries to consider all tasks at once.

When writing your code, don't use undocumented hacks. Even though they may be faster, they tend to be unsafe and may be removed in future versions of NumeRe. And we don't want code to be heavily version dependent and not even remotely upward compatible. Unfortunately, the latter is an unavoidable challenge with programming languages that are in flux.

§4.2 POLS - Principle Of Least Surprise

What is meant by this "principle of least surprise"? The idea is actually quite simple to convey: when using code, you want to be surprised as little as possible by its behavior. That means, for example, that if a procedure is called $loadData(), you expect that procedure to load data. If it also immediately performs corrections on the data, this is surprising and should be avoided. You can solve this either by renaming the procedure to $loadAndCorrectData() or by creating a new procedure $correctData() and moving the corrections there. In addition, the behavior must also be evident from the documentation of the procedure.
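The second option, separating loading from correcting, might look like the following Python sketch. Python is used for illustration only; load_data and correct_data are hypothetical counterparts of $loadData() and $correctData(), and the "file" is simply a list of text lines so that the example stays self-contained.

```python
def load_data(lines):
    """Only loads: turns text lines into numbers, nothing else."""
    return [float(line) for line in lines]


def correct_data(values, offset):
    """Only corrects: subtracts a known offset from every value."""
    return [v - offset for v in values]


# The caller decides explicitly whether a correction is applied.
raw = load_data(["1.5", "2.5"])
corrected = correct_data(raw, 0.5)
```

Because the correction is a separate, visible call, nobody reading the calling code is surprised by modified data.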

POLS also means that objects that have a common name outside the code should carry that name in the code as well. Make sure, too, that the name reveals what is actually being handled. For example, xml{} is a name that also exists outside the code, but it is not clear whether it refers to a specific XML node or an entire file, so xmlFile{} would be preferable. Also watch out for possible ambiguities. While you can represent the *.int file type as int{}, it will not be clear to the reader whether this is an integer, an integral, or that file type. It is better to use intFile{} here as well.

§5 Namespaces

§5.1 Purpose of namespaces

Short answer: namespaces organize code. They can be used to organize modules that are separate from each other. By using the private procedure flag, it is even possible to make procedures accessible only from within a module. Furthermore, namespaces allow the same procedure name to be used more than once and still remain unique, e.g. there can be one $read() procedure for each of several file types as long as the file types are represented by separate namespaces, i.e. $TYP1~read() and $TYP2~read(). Because the namespace is explicitly prepended, the procedures are clearly distinguishable from each other wherever they are used. If you use only one of these two file types, you can even omit the namespace prefix by using the command namespace TYP1.

In principle, namespaces can simply be thought of as folders, which they are (with a few exceptions) for procedure files. Within a folder the file names are also unique. If you move a procedure from one (namespace) folder to another, its namespace will also be adjusted accordingly.

As already shown in the examples, namespaces in NumeRe are specified using the ~ operator.

§5.2 Organization of functionalities

How can namespaces be used most effectively? We like to use them to organize functionalities. This means that a namespace provides functionality similar to a module. It is also possible for these functionalities to be refined by subordinate namespaces while the parent namespace is more abstract in nature, e.g. files~ini~, files~logging~ and files~tools~. By using child namespaces, you also group similar functionalities together.

Why should you use namespaces? Primarily to keep your code organized. This also creates clarity, as there are fewer procedures in each namespace. Namespaces also allow shorter, more concise procedure names, because the purpose of a procedure is clear from the namespace, e.g. $files~ini~read() instead of $readIniFile(). Another advantage of using namespaces is that code can be externalized more easily, i.e. it is more reusable. Last but not least, you can map your architecture (if you have one) directly onto the namespaces.




§5.3 Using the namespace command

In a previous section we mentioned the namespace command as a shortcut. Commands of this kind often raise the concern that uniqueness is lost, because it is no longer recognizable which procedure is being called. In fact, however, this is not the case. The namespace command can only keep one namespace active at a time, i.e. when namespace is used again, the active namespace is replaced rather than a new one being added. Therefore, it makes sense to use namespace only for the main namespace and to address all other namespaces directly. Furthermore, namespace can only be used within procedures and then applies only to the following code within that procedure. Leaking namespaces into other procedures is therefore not possible.

§6 Comments and documentation

§6.1 Writing comments

Let's face it: nobody really loves to comment their code. At the moment you write the code, commenting seems like a waste of time to most people, because the code is totally self-explanatory. It doesn't make any sense to comment those 20 lines of code. Only after a few months have passed, or when you are using a new version whose features would let you solve certain problems more elegantly and efficiently, do you ask yourself, "What the hell did I do here?"

The point about comments is that you do not need them at the moment you write the code; you need them when you read it. And since code is read much more often than it is written, it is essential to make it as easy as possible for the reader to understand the logic and thoughts behind it. It is also much easier to find bugs in code if you know the ideas behind it. (To anyone who thinks this does not apply to them: there is no such thing as non-trivial, bug-free code.) That said, we do not require that comments be written at the same time as the code. The first draft of code generally has a very short half-life, so commenting is not useful until the solution is reasonably stable and tidy.

§6.2 Documentation blocks

In addition to the usual comments, there are also documentation comments or blocks. Their purpose is to actually document the code. A PDF documentation can be generated from them, which can also be used as a manual. In addition, documentation blocks placed before procedures fill the tooltips with meaningful information. It is therefore always a good idea to use these blocks and to fill them in. You do not even have to create them yourself: NumeRe can generate a template for you with the appropriate menu option. The only things NumeRe can't do are filling in the template and determining the return value of the procedure, so you will have to document the return value yourself with the keyword \return. But even this effort is worthwhile, because then the tooltip also shows the return type.

§6.3 Commented-out code

We all know it: we are in the process of developing a solution and want to quickly try another way, so we comment out the current solution and add a new one. There is nothing wrong with this in itself and it is a great way to deal with just such challenges, but one occasionally tends to leave the commented out (dead) code in the file. This, of course, has no impact on runtime, but can significantly interfere with readability and comprehension.

Therefore you should really remove code that you don't need anymore. And if you do want to go back to it, there are version control systems like Git or SVN that allow you to jump back to previous versions. In fact, NumeRe also has simplified versioning embedded, which creates a new version every time you save a file. You can enable version control in the preferences if it is not already. Otherwise, you will have noticed the ominous (rev123) information in the tooltips anyway: these show the current version number (123 in this case). You can also use the corresponding function in the context menu to display the previous versions and even create a diff, i.e. a comparison file of two versions.

Version control has another immediate advantage: if you are experimenting and suddenly notice that your results are getting worse again, you can easily revert to a previous version that worked better.

§7 Local vs. global variables

If you have the choice, you should always prefer local variables to global variables. Local variables have the advantage that their sphere of influence is very limited and they cannot be modified unexpectedly from somewhere else. This does not mean that global variables are bad by definition. On the contrary, there are valid reasons to use them. For example, there are cases where global variables make things more readable. Cases where debugging would be tedious without them are also acceptable uses of global variables. In this latter case only, you should switch back to local variables once the bug is fixed.

Local variables are only possible in procedures. In all other cases (i.e. in scripts or on the console) variables are automatically global. To create local variables in procedures you use the commands var (numeric variable), str (string), cst (cluster) and tab (table). Variables that you define in a procedure without these commands are global as well.

§8 Constants

In almost every code there are constants, i.e. values that do not change during the evaluation. These can be, for example, file paths or names where the configuration is to be stored. It is good practice not to write these constants as literals (i.e. values, not variables) directly into the code, but to use separate symbols for them. NumeRe offers you the possibility to create constants for the current file with the command declare (similar to the #define preprocessor macro of C or C++). You can then use the generated symbols in code similar to variables (except that they cannot be overwritten).

There are two immediate advantages. First, your code becomes much more readable, since no cryptic numbers or strings appear; instead, there are symbols with readable names. Second, the code is easier to modify: it is much simpler to change the value of a constant in one place than to search through the whole code. Search and replace doesn't always do the job here either, because the same value may well be used for different constants. For instance, 32 can make sense as a character value or as a number of bits:

## sChar is a placeholder for the character to be tested
isWhiteSpace = ascii(sChar) == 32;
nLength32Bit = 2^32;
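With named constants, the two meanings of 32 can be told apart and changed independently. The following Python sketch illustrates this; it is an assumption-laden analogy, since Python has no counterpart to NumeRe's declare and only marks constants by the upper-case naming convention. ASCII_SPACE, WORD_BITS and both functions are hypothetical names.

```python
ASCII_SPACE = 32   # character code of the space character
WORD_BITS = 32     # number of bits in the word size considered here


def is_white_space(char):
    """True if the single character char is a space."""
    return ord(char) == ASCII_SPACE


def word_range():
    """Number of distinct values representable with WORD_BITS bits."""
    return 2 ** WORD_BITS
```

Switching to 64 bits now means changing WORD_BITS in one place, without any risk of touching the character code.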

§9 Reusable vs. specific

The central task in programming is to solve one or more problems. You break the tasks down until they can, in principle, no longer be subdivided. Only then do you begin implementing the individual tasks. At exactly this point you can follow two different approaches: you can develop the implementation specifically for the current problem, or you can detach yourself from the concrete problem and write the code as abstractly as possible from the start. Both are valid approaches and both have their advantages and disadvantages.

Writing code specifically has the immediate advantage that testing becomes quite easy and you don't have to test all extreme edge cases. In addition, you are faster, since you do not have to think long about the best possible interface. The disadvantage is that the solution only works for this one case. As soon as the requirements change even slightly, it is likely that you will have to touch the implementation again. In addition, you probably can't use the implementation for another problem, since it will not match that problem completely either.

With an abstract implementation, on the other hand, you do not have the latter problem. Because the solution is abstract, reuse is straightforward. It is also less likely that you have to touch an abstract solution several times when the requirements change. In addition, it is easier to convert abstract solutions into libraries or packages. The disadvantages are basically the advantages of the specific implementation: it takes longer, you have to think about the interface, and testing is more laborious. In addition, readability is lower with abstract implementations. So, as soon as readability suffers significantly, it might make sense to use a specific solution.
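The trade-off can be made concrete with a small Python sketch (Python for illustration only; both functions are hypothetical). The specific version is trivial to read and test but solves exactly one problem; the abstract version is reusable at the cost of a wider interface.

```python
def sum_second_column(rows):
    """Specific: hard-wired to the second column and to summation."""
    return sum(row[1] for row in rows)


def reduce_column(rows, column, reducer=sum):
    """Abstract: the column index and the reduction are parameters."""
    return reducer(row[column] for row in rows)
```

With rows = [(1, 10), (2, 20)], both sum_second_column(rows) and reduce_column(rows, 1) give 30, while reduce_column(rows, 0, max) answers a different question without new code. The abstract version, however, needs tests for every combination of column and reducer that callers may rely on.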