Kenneth Tubman

Originally published at kentubman5.xyz

Writing a Static Documentation Generator using Tree-sitter [Part 1]

Introduction

Lately I’ve been trying to generate documentation from multiple repositories written in different languages, and I’ve found that the existing tools are mostly inadequate for what I need. I can usually find a documentation generator for one language, or maybe two if they are similar (JavaScript/TypeScript), but none that works well across several languages. So instead of using conversion tools to make my code fit the existing documentation generators, I thought I’d take a stab at writing my own. I want to use Tree-sitter to parse the different languages, since there are a lot of parsers available and the output format is convenient and easy to use. Neovim now uses it internally for fast and accurate syntax highlighting, so I know it works well.

Testing the waters

I started out by just making sure I could parse a particular string of code into an AST (Abstract Syntax Tree). I used TypeScript to test it since it’s what I use, but you could do the same with any language that Tree-sitter supports.

const Parser = require('tree-sitter');
const TypeScript = require('tree-sitter-typescript');

const tsParser = new Parser();
tsParser.setLanguage(TypeScript.typescript);

const tree = tsParser.parse(`
/**
 * @returns string Hello world
 */
const hello = () => {
  return "Hello world";
}
`);

console.log(tree.rootNode.toString());

It worked well and produced a nice AST that looks like this:

(program
  (comment) 
  (lexical_declaration 
    (variable_declarator 
      name: (identifier)
      value: (arrow_function
        parameters: (formal_parameters)
        body: (statement_block (return_statement (string)))))))

For my purposes I want the comment nodes in the AST. Then I need to check whether those comments follow the documentation standard for the specific language. For TypeScript, the standard I want to use is JSDoc. Luckily there is a JSDoc grammar written for Tree-sitter, so I can just install tree-sitter-jsdoc and use it.

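const JSDoc = require('tree-sitter-jsdoc');

// tree-sitter-jsdoc exports a grammar, so it needs its own Parser instance
const jsdocParser = new Parser();
jsdocParser.setLanguage(JSDoc);

// The comment is the first child of the root node; grab its source text
const commentText = tree.rootNode.child(0).text;
console.log(commentText);

const jsDocTree = jsdocParser.parse(commentText);
console.log(jsDocTree.rootNode.toString());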

This turns the comment string:

/**
 * @returns string Hello world
 */

into this:

(document (tag (tag_name) (description)))
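
To be useful for documentation, the tag nodes in a tree like this have to be read back out as data. Here is a minimal sketch of how that might look. It assumes only the node types printed above (document, tag, tag_name, description) and the type, text and namedChildren properties that node-tree-sitter nodes expose. The helper name extractTags is made up for illustration.

```javascript
// Hypothetical helper: flatten a parsed JSDoc tree into plain objects.
// Assumes the node types printed above: document > tag > (tag_name, description).
function extractTags(documentNode) {
  const tags = [];
  for (const child of documentNode.namedChildren) {
    if (child.type !== 'tag') continue; // skip anything that is not a tag
    const entry = { name: null, description: null };
    for (const part of child.namedChildren) {
      if (part.type === 'tag_name') entry.name = part.text;
      if (part.type === 'description') entry.description = part.text;
    }
    tags.push(entry);
  }
  return tags;
}
```

With the tree from the snippet above, this would be called as extractTags(jsDocTree.rootNode).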

So next I need to figure out how to traverse the AST to find these comment nodes, and then parse the code that follows each one so I can document it.
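
As a rough idea of what that traversal could look like, here is a sketch that walks the tree depth-first and pairs each doc comment with the named node right after it. It is only one possible approach, not a finished design. The helper name collectDocComments is hypothetical, and the check for a leading /** is my assumption about how to recognize JSDoc-style comments. It only reads type, text, namedChildren and nextNamedSibling, which node-tree-sitter nodes provide.

```javascript
// Hypothetical helper: find doc comments and the node each one documents.
// Walks the tree depth-first using an explicit stack (no recursion).
function collectDocComments(root) {
  const results = [];
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    // Assumption: a doc comment is a comment node whose text starts with "/**".
    if (node.type === 'comment' && node.text.startsWith('/**')) {
      const target = node.nextNamedSibling; // the node the comment sits above
      results.push({
        comment: node.text,
        targetType: target ? target.type : null,
      });
    }
    // Push children in reverse so they are visited in source order.
    const children = node.namedChildren;
    for (let i = children.length - 1; i >= 0; i--) {
      stack.push(children[i]);
    }
  }
  return results;
}
```

For the TypeScript tree parsed earlier, the call would be collectDocComments(tree.rootNode).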

Another thing to figure out is how to preserve structure. When there is a class in TypeScript (or the equivalent in another language), we need to keep track of it so that its members end up in the same documentation file.
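
One way this might work is to walk up from each documented node through its parent links until a class is found, then group nodes by that class name. The sketch below is only an illustration under assumptions: the helper names and the '(module)' key are made up, and 'class_declaration' is the node type I expect from the TypeScript grammar, so other grammars would need their own entries. It relies on the parent property and childForFieldName method of node-tree-sitter nodes.

```javascript
// Hypothetical helper: name of the nearest enclosing container, or null.
// `containerTypes` lists the node types treated as a "class".
function enclosingContainerName(node, containerTypes = ['class_declaration']) {
  for (let current = node.parent; current; current = current.parent) {
    if (containerTypes.includes(current.type)) {
      const nameNode = current.childForFieldName('name');
      return nameNode ? nameNode.text : null;
    }
  }
  return null; // not inside any container
}

// Group nodes by container name; nodes without a named container
// are collected under the placeholder key '(module)'.
function groupByContainer(nodes, containerTypes) {
  const groups = new Map();
  for (const node of nodes) {
    const key = enclosingContainerName(node, containerTypes) ?? '(module)';
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(node);
  }
  return groups;
}
```

Each key of the returned Map could then correspond to one documentation file.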
