How is a Code Formatter Implemented in Turtle Graphics (by @amrdeveloper)

by Amr Hesham, November 3rd, 2022

Hello everyone. In the last article, I introduced the Turtle Graphics Android app project, with implementation details and resources about the scripting language, the editor, documentation generation, etc. After I published the app, it got more than 2000 downloads along with good ratings and feedback, so I decided to add support for code formatting. In this article, I will explain in detail how a simple code formatter works and how I implemented it in the Turtle Graphics app.


For programmers, code formatters are an essential tool in our day-to-day jobs. They make code easier to read, but have you ever asked yourself how they work?


Before talking about code formatters, let's first talk about how compilers turn your code from text into a data structure that they can process, for example to do type checking.


Let's start our story with a file that contains a simple hello world example:


fun main() {
   print("Hello, World!")
}


The first step is to read this text file and convert it into a list of tokens. A token is a class that represents a keyword, number, bracket, string, etc., together with its position in the source code, for example:

data class Token (
   val kind : TokenKind,
   val literal : String,
   val line : Int,
)


We can also save the file name and the start and end columns, so that when we want to report an error we can provide useful information about the position, for example:


Error in File Main Line 10: Missing semicolon :D
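
To make the idea of carrying position information concrete, here is a minimal sketch. The names SourcePosition and formatError are hypothetical and only illustrate the idea; they are not taken from the Turtle Graphics code base.

```kotlin
// Hypothetical helper type: the position of a token in the source code.
data class SourcePosition(
    val fileName: String,
    val line: Int,
    val columnStart: Int,
    val columnEnd: Int,
)

// Builds a message in the style of the example above, plus the column range.
fun formatError(position: SourcePosition, message: String): String =
    "Error in File ${position.fileName} Line ${position.line}, " +
        "Columns ${position.columnStart}-${position.columnEnd}: $message"
```

A token could hold one SourcePosition instead of a bare line number, so every later step can report precise errors without going back to the text file.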


The component that performs this step is called a scanner, lexer, or tokenizer, and at the end we have a list of tokens, for example:

{ FUN_KEYWORD, "fun", 1 }
{ IDENTIFIER, "main", 1 }
{ LEFT_PAREN, "(", 1 }
{ RIGHT_PAREN, ")", 1 }
{ LEFT_BRACE, "{", 1 }
{ IDENTIFIER, "print", 2 }
{ LEFT_PAREN, "(", 2 }
{ STRING, "Hello, World!", 2 }
{ RIGHT_PAREN, ")", 2 }
{ RIGHT_BRACE, "}", 3 }


The result is a list of tokens

val tokens : List<Token> = tokenizer(input)


Note that in this step we can already check for some errors, such as an unterminated string or char literal, unsupported symbols, etc.
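
A tokenizer for the hello world example could be sketched roughly as follows. This is a simplified illustration, not the actual Turtle Graphics scanner: the TokenKind values are limited to what the example needs, and it assumes that strings have no escape sequences and do not span lines.

```kotlin
enum class TokenKind {
    FUN_KEYWORD, IDENTIFIER, STRING,
    LEFT_PAREN, RIGHT_PAREN, LEFT_BRACE, RIGHT_BRACE,
}

data class Token(val kind: TokenKind, val literal: String, val line: Int)

// Single-character symbols supported by this sketch.
private val symbols = mapOf(
    '(' to TokenKind.LEFT_PAREN, ')' to TokenKind.RIGHT_PAREN,
    '{' to TokenKind.LEFT_BRACE, '}' to TokenKind.RIGHT_BRACE,
)

fun tokenizer(input: String): List<Token> {
    val tokens = mutableListOf<Token>()
    var index = 0
    var line = 1
    while (index < input.length) {
        val c = input[index]
        when {
            c == '\n' -> { line++; index++ }
            c.isWhitespace() -> index++
            c in symbols -> {
                tokens.add(Token(symbols.getValue(c), c.toString(), line))
                index++
            }
            c == '"' -> {
                // Assumption: no escape sequences, string ends on the same line.
                val end = input.indexOf('"', index + 1)
                if (end == -1) error("Unterminated string at line $line")
                tokens.add(Token(TokenKind.STRING, input.substring(index + 1, end), line))
                index = end + 1
            }
            c.isLetter() || c == '_' -> {
                val start = index
                while (index < input.length &&
                    (input[index].isLetterOrDigit() || input[index] == '_')) index++
                val word = input.substring(start, index)
                val kind = if (word == "fun") TokenKind.FUN_KEYWORD else TokenKind.IDENTIFIER
                tokens.add(Token(kind, word, line))
            }
            else -> error("Unsupported symbol '$c' at line $line")
        }
    }
    return tokens
}
```

Both error cases mentioned above (an unterminated string and an unsupported symbol) are reported here by throwing an exception; a real scanner would more likely collect diagnostics and continue.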


After this step, you can forget about your text file and deal only with this list of tokens. Now we should convert the tokens into nodes according to our language grammar. For example, when we see FUN_KEYWORD, we know we will build a function declaration node, and we expect a name, parentheses, parameters, etc.


In this step, we need a data structure that represents the program in a way we can traverse and validate later. It is called an Abstract Syntax Tree (AST). Each node in the AST represents a statement, such as if, while, a function declaration, or a variable declaration, or an expression, such as an assignment or a unary expression. Each node stores the information required in the next steps, for example:


Function Declaration

data class Function (
   var name : String,
   var arguments : List<Argument>,
   var body : List<Statement>
)


Variable Declaration

data class Var (
   var name : String,
   var value : Expression
)


This step is called parsing and we will end up with an AST object that we can use later to traverse all nodes.


var astNode = parse(tokens)
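
As a rough illustration of what a parse function does, here is a tiny recursive-descent sketch that only understands statements of the form `var name = number`. The token kinds, the NumberLiteral node, and the Parser class are assumptions made for this example, and the result is simply a list of Var nodes rather than a full AST.

```kotlin
enum class TokenKind { VAR_KEYWORD, IDENTIFIER, EQUAL, NUMBER }

data class Token(val kind: TokenKind, val literal: String, val line: Int)

sealed interface Expression
data class NumberLiteral(val value: Int) : Expression

data class Var(val name: String, val value: Expression)

class Parser(private val tokens: List<Token>) {
    private var current = 0

    // Consumes the next token, failing if it is not of the expected kind.
    private fun expect(kind: TokenKind): Token {
        val token = tokens.getOrNull(current)
            ?: error("Unexpected end of input, expected $kind")
        if (token.kind != kind) {
            error("Line ${token.line}: expected $kind but found ${token.kind}")
        }
        current++
        return token
    }

    // Grammar rule: var IDENTIFIER = NUMBER
    private fun parseVar(): Var {
        expect(TokenKind.VAR_KEYWORD)
        val name = expect(TokenKind.IDENTIFIER).literal
        expect(TokenKind.EQUAL)
        val value = NumberLiteral(expect(TokenKind.NUMBER).literal.toInt())
        return Var(name, value)
    }

    fun parse(): List<Var> {
        val nodes = mutableListOf<Var>()
        while (current < tokens.size) nodes.add(parseVar())
        return nodes
    }
}

fun parse(tokens: List<Token>): List<Var> = Parser(tokens).parse()
```

A real parser has one such function per grammar rule (function declaration, if, while, expressions, and so on), each one consuming the tokens it expects and returning the matching node.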


If the language is statically typed, such as Java, C, or Go, we go to the type checker step. The goal of this step is to check that the user uses types correctly. For example, if the user declares a variable with type int, it should store only integers, and an if condition must be a boolean (or an integer in a language like C).
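
The variable rule can be sketched like this. The Type enum, the literal nodes, and TypedVar (a variable declaration that also carries its declared type) are hypothetical names for this example; a real type checker also has to handle identifiers, operators, function calls, and scopes.

```kotlin
enum class Type { INT, BOOLEAN, STRING }

sealed interface Expression
data class IntLiteral(val value: Int) : Expression
data class BoolLiteral(val value: Boolean) : Expression
data class StringLiteral(val value: String) : Expression

// Hypothetical node: a variable declaration with an explicit declared type.
data class TypedVar(val name: String, val declaredType: Type, val value: Expression)

// Computes the type of an expression; only literals exist in this sketch.
fun typeOf(expression: Expression): Type = when (expression) {
    is IntLiteral -> Type.INT
    is BoolLiteral -> Type.BOOLEAN
    is StringLiteral -> Type.STRING
}

// Returns an error message, or null when the declaration is well typed.
fun checkVar(node: TypedVar): String? {
    val actual = typeOf(node.value)
    return if (actual == node.declaredType) null
    else "Variable ${node.name} is declared as ${node.declaredType} but assigned $actual"
}
```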


After this step, we have the same AST, but now we know that it is valid, and we can compile it to any target or evaluate it. We can also do formatting, static analysis, optimization, code style checks, etc.


For example, suppose that we want all developers to declare variables without using _ in the name. To check that, we traverse our AST to find all Var nodes and check each of them:

fun checkVarDeclaration(node : Var) {
   if (node.name.contains("_")) {
      reportError("Oops, your variable name ${node.name} contains _")
   }
}
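
The traversal that finds the Var nodes could look roughly like the sketch below. It assumes a simplified AST in which only function bodies contain nested statements; FunctionDeclaration stands in for the Function node shown earlier, and the errors are collected in a list instead of being reported directly.

```kotlin
sealed interface Statement

// Simplified nodes: the variable's value is omitted for brevity.
data class Var(val name: String) : Statement
data class FunctionDeclaration(val name: String, val body: List<Statement>) : Statement

// Walks the statements recursively and collects one message per offending Var.
fun checkVarNames(
    statements: List<Statement>,
    errors: MutableList<String> = mutableListOf(),
): List<String> {
    for (statement in statements) {
        when (statement) {
            is Var -> if (statement.name.contains("_")) {
                errors.add("Oops, your variable name ${statement.name} contains _")
            }
            is FunctionDeclaration -> checkVarNames(statement.body, errors)
        }
    }
    return errors
}
```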


But now we need to format the code, so how do we do that? In the same way: we traverse our AST and, for each node, write it back to text, but formatted, for example:

// indentation and formatValue are defined elsewhere in the formatter
fun formatVarDeclaration(node : Var) : String {
   val builder = StringBuilder()
   builder.append(indentation)
   builder.append("var ")
   builder.append(node.name)
   builder.append(" = ")
   builder.append(formatValue(node.value))
   builder.append("\n")
   return builder.toString()
}


In this simple method, we write the node back to a string with the correct indentation and add a new line after it, so that no two variables are declared on the same line. The value is also formatted using another function. You can use the Visitor design pattern to make it easy to handle all node types.
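
Here is one way the Visitor pattern could be applied to formatting. It is a sketch under assumptions: the node set is reduced to numbers, variables, and parameterless functions, FunctionDeclaration stands in for the Function node shown earlier, and the three-space indentation simply mirrors the hello world example.

```kotlin
// One visit method per node type; R is the result type of the traversal.
interface NodeVisitor<R> {
    fun visitNumber(node: NumberLiteral): R
    fun visitVar(node: Var): R
    fun visitFunction(node: FunctionDeclaration): R
}

interface Node {
    fun <R> accept(visitor: NodeVisitor<R>): R
}

data class NumberLiteral(val value: Int) : Node {
    override fun <R> accept(visitor: NodeVisitor<R>): R = visitor.visitNumber(this)
}

data class Var(val name: String, val value: Node) : Node {
    override fun <R> accept(visitor: NodeVisitor<R>): R = visitor.visitVar(this)
}

data class FunctionDeclaration(val name: String, val body: List<Node>) : Node {
    override fun <R> accept(visitor: NodeVisitor<R>): R = visitor.visitFunction(this)
}

// Writes every node back to text, tracking the current nesting depth.
class Formatter(private val indentUnit: String = "   ") : NodeVisitor<String> {
    private var depth = 0
    private val indentation: String get() = indentUnit.repeat(depth)

    override fun visitNumber(node: NumberLiteral): String = node.value.toString()

    override fun visitVar(node: Var): String =
        "${indentation}var ${node.name} = ${node.value.accept(this)}\n"

    override fun visitFunction(node: FunctionDeclaration): String {
        val builder = StringBuilder()
        builder.append("${indentation}fun ${node.name}() {\n")
        depth++ // statements in the body are indented one level deeper
        node.body.forEach { builder.append(it.accept(this)) }
        depth--
        builder.append("${indentation}}\n")
        return builder.toString()
    }
}
```

Calling `tree.accept(Formatter())` on the root node returns the formatted text. The benefit of the pattern is that another pass, such as the style check above, can be written as a second visitor without touching the node classes.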


At the end of this step, we end up with a string that represents the same input file but formatted and then we write it back to the file.


This is the basic implementation of a code formatter. A real production code formatter must handle more cases. For example, what if the code is not valid? Should we format only valid code? Should we read the whole program every time we want to format or compile the code?


Now back to Turtle Graphics. In this project, I had already done all the required steps and had a ready AST, so I just wrote it back to code as you saw above ^_^. In my case, I read the code from the UI, format it, and write it back to the UI.


If you are interested and want to read more, I suggest:

  • Read at least one compiler book, such as Crafting Interpreters
  • Read about the Language Server Protocol (LSP)
  • Watch the TypeScript compiler explained by its author, Anders Hejlsberg
  • Think about what else you can do once you have your program as an AST


I hope you enjoyed my article and you can find me on

Enjoy Programming 😋.