Building a Language Interpreter in JavaScript: Pt. 3

October 26, 2022

6 min read

Building a Language Interpreter in JavaScript - Part 3 - Featured Image

Welcome to part 3 of building an interpreter in pure JavaScript. This is the final part of a journey to build an interpreter in pure JavaScript.

In the previous part, we built a parser for our tokens for the programming language we are making - printly. We defined grammar for our language and a way to structure the statements which we can use.

In this part, we will use the statement structure we designed and interpret it to make our language perform some actual work.

Yep, this part is all about building an interpreter.

Cleanup

Before we start, let's do some cleanup on our index.js code. Let's move the parser part into a separate file.

This is how our index.js should look like:

1// Code which we want to parse
2const code = `i = 5;`;
3
4// Import the lexer
5const analyseCode = require('./lexer-analyser');
6
7// Run the lexer
8const tokens = analyseCode(code);
9
10// Import the parser
11const parseTokens = require('./parser-analyser');
12
13// Run the parser
14const statements = parseTokens(tokens);
15
16// Finally we output the statements we parsed.
17console.dir(statements, { depth: null });

And we will move the parsing code into a separate file parser-analyser.js:

1// We include grammar here.
2// Grammar exports the very first rule: LineStatement
3// That means that parseGrammar is actually the same as the LineStatement constant.
4const parseGrammar = require('./grammar');
5const TokenReader = require('./token-reader');
6
7const parseTokens = tokens => {
8    // Create a reader for our tokens.
9    const reader = new TokenReader(tokens);
10
11    const statements = [];
12
13    while (reader.hasNext()) {
14        // We parse grammar until we have the next token.
15        const statement = parseGrammar(reader);
16
17        if (statement) {
18            // Our statement was parsed successfully,
19            // so we add it to the list of statements.
20            statements.push(statement);
21            continue;
22        }
23
24        // Something went wrong here, we couldn't parse our statement here
25        // so our language needs to throw a syntax error.
26        let token = reader.hasNext() ? reader.get() : reader.getLastToken();
27        throw new Error(`Syntax error on ${token.line}:${token.character} for "${token.value}". Expected an assignment, function call or an if statement.`);
28    }
29
30    // Return the parsed statements
31    return statements;
32};
33
34module.exports = parseTokens;

This will allow us to work easily on the interpreter.

You can follow this part in the code by checking out part3-chapter1 in the repository.

Building an interpreter

Interpreted languages like Python, PHP, Ruby, and JavaScript also have a few more steps which include optimizations and even compilation into an intermediary language or bytecode which is easy for the interpreter to run, we are keeping things simple so we will just run the actions directly and return the result.

We will interpret each of the statements and return the result from the final statement.

To be able to do that, we will make a function:

1// Take an array of statements and interpret
2// each of the statements and use the passed vm (we will explain this below)
3const interpretStatements = (statements, vm) => {
4    let result = null; // Set initial result for null
5
6    for(const statement of statements) {
7        // Interpret the statement and set the result.
8        result = interpretStatement(statement, vm);
9    }
10
11    // Return last set result.
12    return result;
13};
14
15// We define a dummy function for now
16const interpretStatement = (statement, vm) => null;

In the function, you see that we used an additional variable vm. This is a variable that will hold the state of our program after statements work on it. In other words, it will simulate the purpose of a "virtual machine" or vm.

This variable will hold all necessary data for our program to work correctly. So, things like functions, our program can call and all variables our program set or changed.

Here is how we will structure the vm variable:

1const vm = {
2    variables: {},
3    functions: {}
4};

In more complicated languages where we define functions and a lot of nesting and switching, we would also need to define a stack and push the previous state of variables in it and other data and pop it when the function returns. Our language doesn't have a way to define functions so we do not need it.

Now that we have a vm defined, let's start with statement interpretation. We will use statement type to determine how we will interpret the statement itself.

We'll start with the assignment statement.

1// Interpret the statement itself with the passed vm
2const interpretStatement = (statement, vm) => {
3    switch(statement.type) { // We first check the statement type and run the corresponding function.
4        case 'assignment': 
5            return interpretAssignment(statement, vm);
6        case 'function':
7            return interpretFunctionCall(statement, vm);
8        case 'if':
9            return interpretIf(statement, vm);
10    }
11
12    // We didn't get the correct type here of the statement so
13    // that means something is wrong and we need to throw an error.
14    throw new Error(`Invalid statement type: ${statement.type}`);
15};
16
17// We define other functions as dummy functions for now.
18const interpretAssignment = (statement, vm) => 'To be implemented';
19const interpretFunctionCall = (statement, vm) => 'To be implemented';
20const interpretIf = (statement, vm) => 'To be implemented';

Interpreting Assignment statement

The assignment statement has the following format for code a = 5;:

1{
2    type: 'assignment',
3    name: 'a',
4    expression: {
5        type: 'unary',
6        withNot: false,
7        value: { type: 'number', value: '5' }
8    }
9}

Therefore, we only need to set the variable requested to the vm.

1// Assignment just sets a variable as a result of the expression.
2const interpretAssignment = (statement, vm) => {
3    vm.variables[statement.name] = interpretExpression(statement.expression, vm);
4
5    // We also return the set variable as a statement result.
6    // This is not strictly necessary, but we want
7    // every statement to return something.
8    return vm.variables[statement.name];
9};
10
11

As a part of this, we also need to interpret an expression. We will implement that too. Expressions can be infinitely deep which is bad for recursion but in our case, we should not reach the recursion limit unless we have a super complicated expression with a lot of nesting, which nobody would normally write.

1const interpretExpresion = (expression, vm) => {
2    // Similar to statements, we interpret expressions
3    // based on the type.
4    switch (expression.type) { 
5        case 'string':
6            // For strings we just return them because
7            // they are already ready.
8            return expression.value;
9        case 'number':
10            // For numbers we convert them to actual numbers
11            // so that they can be used further in expressions.
12            return parseFloat(expression.value);
13        case 'variable': // For variables and the rest we will delegate them to their functions
14            return interpretVariableRetrieval(expression, vm);
15        case 'function':
16            return interpretFunctionCall(expression, vm);
17        case 'unary':
18            return interpretUnaryOperation(expression, vm);
19        case 'operation':
20            return interpretBinaryOperation(expression, vm);
21    }
22
23    // We got the wrong type here so we throw an error
24    // because something went wrong.
25    throw new Error(`Invalid type: ${expression.type}`);
26};
27
28// Dummy functions for now
29const interpretVariableRetrieval = (expression, vm) => 'Not implemented';
30const interpretFunctionCall = (expression, vm) => 'Not implemented';
31const interpretUnaryOperation = (expression, vm) => 'Not implemented';
32const interpretBinaryOperation = (expression, vm) => 'Not implemented';

We directly implemented interpreting the simplest expressions for strings and numbers, but other expressions need some more work. Accordingly, we need to implement a function for each of them:

Variable

1{ type: 'variable', name: 'variable_name' }

Function:

1const interpretVariableRetrieval = (expression, vm) => {
2    if (!(expression.name in vm.variables)) {
3        // Variable is not defined so this is a runtime error that we need to throw.
4        throw new Error(`Runtime Error. Variable '${expression.name}' does not exist.`);
5    }
6
7    // We return the variable's value from VM.
8    return vm.variables[expression.name];
9};

Function:

1{
2  type: 'function',
3  name: 'function_name',
4  parameters: [ ...expressions ]
5}

Function:

1const interpretFunctionCall = (expression, vm) => {
2    if (!(expression.name in vm.functions)) {
3        // Function is not defined so this is a runtime error so we throw it.
4        throw new Error(`Runtime Error. Function '${expression.name}' is not defined.`);
5    }
6
7    // We need to interpret an expression for each of the function parameters
8    // so we perform it using a map call.
9    const parameters = expression.parameters.map(
10        parameter => interpretExpression(parameter, vm)
11    );
12
13    // And we call a function from VM an pass parameters by a spread operator.
14    return vm.functions[expression.name](...parameters);
15};

Operation

1{
2    type: 'operation',
3    operation: '+',
4    left: {
5        ...expression
6    },
7    right: {
8       ...expression
9    }
10}

Function:

1const interpretBinaryOperation = (expression, vm) => {
2    // We interpret the expression for the left side of the operation
3    const leftValue = interpretExpression(expression.left, vm);
4    // And now for the right side
5    const rightValue = interpretExpression(expression.right, vm);
6
7    // And based on the operation we perform the same operation on the
8    // left and right side to get the result we need.
9    switch(expression.operation) { 
10        case '+':
11            return leftValue + rightValue;
12        case '-':
13            return leftValue - rightValue;
14        case '*':
15            return leftValue * rightValue;
16        case '/':
17            return leftValue / rightValue;
18        case '&&':
19            return leftValue && rightValue;
20        case '||':
21            return leftValue || rightValue;
22        case '>':
23            return leftValue > rightValue;
24        case '<':
25            return leftValue < rightValue;
26        case '<=':
27            return leftValue <= rightValue;
28        case '>=':
29            return leftValue >= rightValue;
30        case '==':
31            return leftValue == rightValue;
32        case '!=':
33            return leftValue != rightValue;
34    }
35
36    // We didn't get a operation we expect so we throw an exception here.
37    throw new Error(`Invalid operation requested: ${expression.operation}`);
38};

Unary

1{
2    type: 'unary',
3    withNot: false,
4    value: { ...expression }
5}

Function:

1const interpretUnaryOperation = (expression, vm) => {
2    // We interpret the value we need
3    const value = interpretExpression(expression.value, vm);
4
5    // If the value is with ! we also apply that before passing it
6    // back.
7    return expression.withNot ? !value : value;
8};

Implementing If statement

If the statement is pretty easy, once we have other things in place, because it is just a conditional interpretation of its own statements.

Here's how the if statement looks like:

1{
2    type: 'if',
3    check: {
4        ...expression
5    },
6    statements: [ ...statements ]
7}

And the implementation is:

1const interpretIf = (statement, vm) => {
2    // Interpret the check expression we are checking for
3    const checkValue = interpretExpression(statement.check, vm);
4
5    if (checkValue) {
6        // Value is true so we interpret the if's own statements
7        // and return the value.
8        return interpretStatements(statement.statements, vm);
9    }
10
11    // If check failed so we just return null.
12    return null;
13};

Implementing function statement

We have this part already. When did we implement it? Back when we implemented the expression we also implemented the interpretFunctionCall.

Our statement format is the same:

1{ 
2    type: 'function', 
3    name: 'function_name', 
4    parameters: [ ...expressions ] 
5}

So, we just reuse the interpretFunctionCall, and we are done.

Putting it all together

Finally, we just need to do some slight modifications to our index.js to complete everything:

1// Code which we want to parse
2const code = `
3world = 'World';
4print('Hello ' + world);
5`;
6
7// Import the lexer
8const analyseCode = require('./lexer-analyser');
9
10// Run the lexer
11const tokens = analyseCode(code);
12
13// Import the parser
14const parseTokens = require('./parser-analyser');
15
16// Run the parser
17const statements = parseTokens(tokens);
18
19// Import the interpreter
20const interpret = require('./interpreter');
21
22// We create a virtual state machine object
23const vm = {
24    variables: {},
25    functions: {
26        print(message) { // We add a print function so that we can call a function from our code.
27            console.log('MESSAGE:', message); 
28        },
29        pow(x, y) { // We also add a function which returns something for an expression.
30            return Math.pow(x, y);
31        }
32    }
33};
34
35// Interpret the statements and return last result.
36const result = interpret(statements, vm);
37
38// And finally we output the result
39console.log('Code we ran:');
40console.log(code);
41console.log('Result:')
42console.dir(result, {depth: null});
43console.log('Final VM State:', vm);

You can get to this part of the code by checking out the branch part3-chapter2.

And as an output we get:

MESSAGE: Hello World
Code we ran:

world = 'World';
print('Hello ' + world);

Result:
undefined
Final VM State: {
  variables: { world: 'World' },
  functions: { print: [Function: print], pow: [Function: pow] }
}

Our result is undefined because print function returns nothing. If we try this code:

pow(2, 5);

We get the following output:

Code we ran:

pow(2, 5);

Result:
32
Final VM State: {
  variables: {},
  functions: { print: [Function: print], pow: [Function: pow] }
}

And our interpreter is finished! We kept things separated and modular so that you can hopefully use this to build your own language or even extend printly.

And one last thing; we mentioned back in part 1 that we will also reimplement this in an easier way. We will leave that for the bonus part next time.

Hope you had a great learning experience and hope that you understand how languages are parsed way better by this point.

Accelerate Your Career with 2am.tech

Join our team and collaborate with top tech professionals on cutting-edge projects, shaping the future of software development with your creativity and expertise.

Open Positions

Don't miss out on
our latest insights
– Subscribe Now!

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.

Building a Language Interpreter in JavaScript - Part 3

Cleanup

Building an interpreter

Interpreting Assignment statement

Variable

Operation

Unary

Implementing If statement

Implementing function statement

Putting it all together

Further Reading

Accelerate Your Career with 2am.tech

Don't miss out on
our latest insights
– Subscribe Now!

Reinvent
Your Business!

Building a Language Interpreter in JavaScript - Part 3

Cleanup

Building an interpreter

Interpreting Assignment statement

Variable

Operation

Unary

Implementing If statement

Implementing function statement

Putting it all together

Further Reading

Accelerate Your Career with 2am.tech

Don't miss out on our latest insights – Subscribe Now!

ReinventYour Business!

Don't miss out on
our latest insights
– Subscribe Now!

Reinvent
Your Business!