diff --git a/notes/3-markdown-parser/note.md b/notes/3-markdown-parser/note.md
new file mode 100644
index 0000000..67d1700
--- /dev/null
+++ b/notes/3-markdown-parser/note.md
@@ -0,0 +1,37 @@
+# Markdown parser
+
+_2023-09-16_
+
+While traveling today I thought about ways to structure my markdown parser.
+
+What I want is a function that takes a string containing markdown-formatted text and outputs a string of HTML elements.
+
+My plan is to solve this using a markdown line feed and symbol classes representing the different possible elements.
+These symbols are, for example, headings, list items, code snippets, paragraphs and links.
+Each symbol will know how to render itself and can tell whether it's the right symbol for the current line of the line feed.
+
+Let's say we have a markdown file that looks something like this:
+
+```
+# Release notes
+
+Release 1.0.0 contains the following changes:
+- We're now able to properly render markdown as html.
+
+```
+
+The line feed consists of the lines above and can be read one line at a time.
+We're ignoring the complexity of recursively parsing nested symbols for now.
+We know that parsing is complete when the line feed is empty.
+
+We take the first line, `# Release notes`.
+We iterate through our list of symbols, asking each whether it's interested in this line.
+When we arrive at the heading symbol, it identifies the "#" and says yes, so we stop the iteration.
+
+We pass the first line to the symbol, alongside the line feed.
+The symbol can now decide if it's happy with the one line it claimed or if it wants to continue reading from the line feed.
+Symbols can peek at the line feed; a line is not removed unless it is claimed by the symbol.
+In this case the heading is a single-line symbol, so the symbol strips the "#" heading syntax from the line it claimed and saves the rest to render as a heading later.
+The static symbol instance now returns an instance of a heading symbol, which we can save to a list for later rendering.
+
+Calling render on a symbol should result in an HTML string composed of itself plus any child symbols, recursively.
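
The design in the note could be sketched roughly as follows. This is only a minimal illustration under assumptions, not the note's actual implementation: the names `LineFeed`, `Heading`, `BulletList`, `ListItem`, `Paragraph`, `parse` and `to_html` are all hypothetical, blank lines are simply skipped, and only headings, flat bullet lists and single-line paragraphs are handled.

```python
import html
import re
from collections import deque


class LineFeed:
    """Hypothetical line feed: lines are read from the front, one at a time."""

    def __init__(self, text):
        self._lines = deque(text.splitlines())

    def is_empty(self):
        return not self._lines

    def peek(self):
        # Look at the next line without removing it from the feed.
        return self._lines[0] if self._lines else None

    def claim(self):
        # Remove the next line from the feed and return it.
        return self._lines.popleft()


def escape(text):
    # Escape &, < and > but leave quotes alone.
    return html.escape(text, quote=False)


class Heading:
    PATTERN = re.compile(r"(#{1,6}) +(.*)")

    def __init__(self, level, text):
        self.level, self.text = level, text

    @classmethod
    def matches(cls, line):
        return cls.PATTERN.match(line) is not None

    @classmethod
    def parse(cls, line, feed):
        # Single-line symbol: strip the "#" syntax and keep the rest.
        hashes, text = cls.PATTERN.match(line).groups()
        return cls(len(hashes), text.strip())

    def render(self):
        return f"<h{self.level}>{escape(self.text)}</h{self.level}>"


class ListItem:
    def __init__(self, text):
        self.text = text

    def render(self):
        return f"<li>{escape(self.text)}</li>"


class BulletList:
    def __init__(self, items):
        self.items = items

    @classmethod
    def matches(cls, line):
        return line.startswith("- ")

    @classmethod
    def parse(cls, line, feed):
        # Multi-line symbol: peek ahead and claim lines only while they match.
        items = [ListItem(line[2:].strip())]
        while (nxt := feed.peek()) is not None and cls.matches(nxt):
            items.append(ListItem(feed.claim()[2:].strip()))
        return cls(items)

    def render(self):
        # Render this symbol plus its child symbols.
        return "<ul>" + "".join(item.render() for item in self.items) + "</ul>"


class Paragraph:
    def __init__(self, text):
        self.text = text

    @classmethod
    def matches(cls, line):
        return True  # fallback: accepts any line no other symbol wanted

    @classmethod
    def parse(cls, line, feed):
        return cls(line.strip())

    def render(self):
        return f"<p>{escape(self.text)}</p>"


# Order matters: the first symbol that says yes gets the line.
SYMBOLS = [Heading, BulletList, Paragraph]


def parse(text):
    feed = LineFeed(text)
    parsed = []
    while not feed.is_empty():  # parsing is complete when the feed is empty
        line = feed.claim()
        if not line.strip():
            continue  # assumption: blank lines only separate symbols
        for symbol in SYMBOLS:
            if symbol.matches(line):
                parsed.append(symbol.parse(line, feed))
                break
    return parsed


def to_html(text):
    return "\n".join(symbol.render() for symbol in parse(text))
```

With this sketch, calling `to_html` on the release notes example from the note would produce an `<h1>`, a `<p>` and a `<ul>` holding a single `<li>`. Nesting, code snippets and links are left out, in line with the note's decision to ignore recursive parsing for now.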