Parsing for Fun and Profit (slides & code)

Here are the slides and code for my talk "Parsing for Fun and Profit", given at the North West Ruby User Group (Manchester, UK) in February 2013. Slides first, then a guided tour of the code at the bottom of this post.


Talk summary

The talk is a very brief overview of what parsing is, why you'd want to do it and what you can do with a parser. The goal is not to go into detail about parsing or any of the tools used, or to show any "parsing best practices", just to show that building a simple language application is more accessible than you might think.

The specific examples are done by writing Parsing Expression Grammars in Treetop. I show how to build a grammar one rule at a time by incrementally building up a suite of examples in RSpec, using the Arithmetic sample grammar from the Treetop gem. As an example of a more complete language application with a more complex grammar, I show how to build a syntax highlighter for a simple subset of Ruby, which turns source code into marked-up HTML.
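
To give a flavour of what a Parsing Expression Grammar describes, here is a rough sketch of an arithmetic parser written by hand in plain Ruby. It is not Treetop and it is not the code from the talk: the class name `TinyArithmeticParser`, the grammar in the comment and the array-based tree shape are all made up for illustration. It only uses `StringScanner` from the standard library, and it mimics the PEG ideas of ordered choice, backtracking and "the whole input must match".

```ruby
require 'strscan'

# Hypothetical hand-written parser (NOT Treetop-generated, NOT the talk's code).
# It follows this PEG-style grammar:
#   additive       <- multiplicative (('+' / '-') multiplicative)*
#   multiplicative <- primary (('*' / '/') primary)*
#   primary        <- '(' additive ')' / number
#   number         <- [0-9]+
class TinyArithmeticParser
  # Returns a nested array such as [:+, 1, [:*, 2, 3]], or nil on failure.
  def parse(source)
    @scanner = StringScanner.new(source)
    tree = additive
    # Like a PEG parser, fail unless the whole input was consumed.
    return nil if tree.nil? || !@scanner.eos?
    tree
  end

  private

  def additive
    binary_chain(/[+-]/) { multiplicative }
  end

  def multiplicative
    binary_chain(/[*\/]/) { primary }
  end

  # Parses "operand (operator operand)*" into a left-associative tree.
  def binary_chain(operator)
    left = yield
    return nil if left.nil?
    loop do
      skip_space
      start = @scanner.pos
      op = @scanner.scan(operator)
      break unless op
      right = yield
      if right.nil?
        @scanner.pos = start # backtrack: give the operator back
        break
      end
      left = [op.to_sym, left, right]
    end
    left
  end

  # Ordered choice: try the parenthesised form first, then a plain number.
  def primary
    skip_space
    parenthesised || number
  end

  def parenthesised
    start = @scanner.pos
    if @scanner.scan(/\(/)
      inner = additive
      skip_space
      return inner if inner && @scanner.scan(/\)/)
    end
    @scanner.pos = start # backtrack so the next alternative starts fresh
    nil
  end

  def number
    digits = @scanner.scan(/\d+/)
    digits && digits.to_i
  end

  def skip_space
    @scanner.skip(/\s*/)
  end
end

parser = TinyArithmeticParser.new
p parser.parse('1 + 2 * 3')   # prints [:+, 1, [:*, 2, 3]]
p parser.parse('(1 + 2) * 3') # prints [:*, [:+, 1, 2], 3]
p parser.parse('1 +')         # prints nil
```

Each method corresponds to one grammar rule, so you can grow it the same way as in the talk: write an example that fails, add or extend one rule, and repeat.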

Code highlights

The source code is available online. You may like to browse all the example source, but here are some highlights:

  • arithmetic_parser_spec.rb is the example we worked through semi-live, and shows how you can build a grammar one rule at a time
  • simple_ruby.treetop is a Treetop grammar for a very small subset of Ruby - it's by no means production quality, but it's expressive enough for our demo purposes
  • simple_ruby_parser_spec.rb shows how you can build a complex grammar by inspecting a simplified version of the parse tree that Treetop generates
  • simple_ruby_parser.rb contains the code that generates these simplified syntax trees
  • spec_helper.rb shows how to get helpful error messages from Treetop - it relies on the tree simplifier in the SimpleRubyParser node classes
  • bin/rb2html is our little syntax highlighter application - it only takes about 20 lines of code!

Most of the code is commented to explain why it's done the way it is.
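
If you just want a feel for how small a source-to-HTML highlighter can be, here is a minimal sketch. Note that it takes a different route from bin/rb2html: instead of walking a Treetop parse tree, it uses a simple `StringScanner` tokenizer, which is cruder but needs no gems. The method names (`highlight`, `span`, `escape`), the keyword list and the CSS class names are assumptions made up for this example.

```ruby
require 'strscan'

# Hypothetical keyword list and CSS class names, for illustration only.
KEYWORDS = %w[def end class module if else elsif unless while return do].freeze
HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escapes the characters that are special in HTML.
def escape(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Wraps a token in a span carrying a CSS class.
def span(css_class, text)
  %(<span class="#{css_class}">#{escape(text)}</span>)
end

# Turns a string of (very simple) Ruby source into marked-up HTML.
def highlight(source)
  scanner = StringScanner.new(source)
  parts = []
  until scanner.eos?
    if (text = scanner.scan(/#[^\n]*/))
      parts << span('comment', text)
    elsif (text = scanner.scan(/"(?:\\.|[^"\\])*"/))
      parts << span('string', text)
    elsif (text = scanner.scan(/\d+/))
      parts << span('number', text)
    elsif (text = scanner.scan(/[A-Za-z_]\w*/))
      # Identifiers are only marked up when they are keywords.
      parts << (KEYWORDS.include?(text) ? span('keyword', text) : escape(text))
    else
      # Anything else (whitespace, punctuation) passes through, escaped.
      parts << escape(scanner.getch)
    end
  end
  parts.join
end

puts highlight('x = 42 # hi')
# prints x = <span class="number">42</span> <span class="comment"># hi</span>
```

A tokenizer like this has no idea about structure (it can't tell a method name from a local variable, for instance), which is exactly the kind of thing a real grammar and parse tree give you.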


