Could we add syntax diagrams?

+2 votes
asked 5 days ago in Wanted features by Todd Musheno (290 points)

I would like to be able to generate syntax diagrams (also known as railroad diagrams):

https://en.wikipedia.org/wiki/Syntax_diagram

I would suggest starting with vanilla EBNF, as it's the most widely used, and for most cases this should be "good enough".

A good site that can do this well is:

https://bottlecaps.de/rr/ui

It might be worth discussing whether you should support a whole syntax tree, or just one rule at a time.

I would LIKE to see support for a full syntax, but I think a single rule may be enough.

1 Answer

0 votes
answered 5 days ago by plantuml (279,640 points)

Sure, having support for https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form would be nice.

For example, we could have:

@startebnf
letter = "A" | "B" | "C" | "D" | "E" | "F" | "G"
       | "H" | "I" | "J" | "K" | "L" | "M" | "N"
       | "O" | "P" | "Q" | "R" | "S" | "T" | "U"
       | "V" | "W" | "X" | "Y" | "Z" | "a" | "b"
       | "c" | "d" | "e" | "f" | "g" | "h" | "i"
       | "j" | "k" | "l" | "m" | "n" | "o" | "p"
       | "q" | "r" | "s" | "t" | "u" | "v" | "w"
       | "x" | "y" | "z" ;
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
symbol = "[" | "]" | "{" | "}" | "(" | ")" | "<" | ">"
       | "'" | '"' | "=" | "|" | "." | "," | ";" ;
character = letter | digit | symbol | "_" ;
 
identifier = letter , { letter | digit | "_" } ;
terminal = "'" , character , { character } , "'"
         | '"' , character , { character } , '"' ;
 
lhs = identifier ;
rhs = identifier
     | terminal
     | "[" , rhs , "]"
     | "{" , rhs , "}"
     | "(" , rhs , ")"
     | rhs , "|" , rhs
     | rhs , "," , rhs ;

rule = lhs , "=" , rhs , ";" ;
grammar = { rule } ;
@endebnf

The drawing part is easy.

The most complex part is parsing EBNF.
Unfortunately, we could not find a free, easy-to-integrate Java EBNF parser, and we don't really have time to write one ourselves.

However, if anyone with Java skills is willing to write such a parser, we would be glad to implement the drawing part!
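
To give a rough idea of how much work the parsing side involves, here is a minimal recursive-descent sketch in Java for the EBNF subset used in the example above (identifiers, quoted terminals, "|", ",", and the three bracket forms). It is an illustration only: the class name EbnfSketch, its methods, and the textual output format (alt/seq/opt/rep/grp/ref/lit) are assumptions made up for this sketch, not PlantUML code, and it ignores comments, special sequences, and exceptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from PlantUML.
final class EbnfSketch {
    private final String src;
    private int pos = 0;

    private EbnfSketch(String src) { this.src = src; }

    // grammar = { rule } ;  Returns one description string per rule.
    public static List<String> parseGrammar(String text) {
        EbnfSketch p = new EbnfSketch(text);
        List<String> rules = new ArrayList<>();
        p.skipSpaces();
        while (p.pos < p.src.length()) {
            rules.add(p.rule());
            p.skipSpaces();
        }
        return rules;
    }

    // rule = lhs , "=" , rhs , ";" ;
    private String rule() {
        String name = identifier();
        expect('=');
        String body = alternation();
        expect(';');
        return name + " := " + body;
    }

    // Alternation binds loosest: sequence { "|" sequence }
    private String alternation() {
        List<String> parts = new ArrayList<>();
        parts.add(sequence());
        while (peek() == '|') { pos++; parts.add(sequence()); }
        return parts.size() == 1 ? parts.get(0) : "alt(" + String.join(", ", parts) + ")";
    }

    // Concatenation binds tighter: factor { "," factor }
    private String sequence() {
        List<String> parts = new ArrayList<>();
        parts.add(factor());
        while (peek() == ',') { pos++; parts.add(factor()); }
        return parts.size() == 1 ? parts.get(0) : "seq(" + String.join(", ", parts) + ")";
    }

    // A factor is a bracketed group, a quoted terminal, or a rule reference.
    private String factor() {
        char c = peek();
        if (c == '[') return group(']', "opt");
        if (c == '{') return group('}', "rep");
        if (c == '(') return group(')', "grp");
        if (c == '"' || c == '\'') return terminal(c);
        return "ref(" + identifier() + ")";
    }

    private String group(char close, String tag) {
        pos++; // consume the opening bracket found by peek()
        String inner = alternation();
        expect(close);
        return tag + "(" + inner + ")";
    }

    // Reads up to the matching quote; no escape sequences are handled.
    private String terminal(char quote) {
        int end = src.indexOf(quote, pos + 1);
        if (end < 0) throw new IllegalArgumentException("unterminated terminal at " + pos);
        String text = src.substring(pos + 1, end);
        pos = end + 1;
        return "lit(" + text + ")";
    }

    private String identifier() {
        skipSpaces();
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
            pos++;
        }
        if (start == pos) throw new IllegalArgumentException("identifier expected at " + pos);
        return src.substring(start, pos);
    }

    // Skips whitespace and returns the next character without consuming it.
    private char peek() {
        skipSpaces();
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        pos++;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

Calling EbnfSketch.parseGrammar on the single rule `character = letter | digit | symbol | "_" ;` returns one entry, `character := alt(ref(letter), ref(digit), ref(symbol), lit(_))`. A real implementation would presumably build a tree of node objects for the drawing code rather than strings; strings just keep the sketch short.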

commented 3 days ago by Todd Musheno (290 points)
Good news: I am a Java developer and can work on it over the weekends, if you could detail precisely what you need!

What are you using as a parser at the moment for other file types?

Any help would be... helpful ;-)
commented 3 days ago by Todd Musheno (290 points)

Also any reason not to do @startsyntax?

I would think EBNF would be a good starting place, as just about everyone in that space supports at least that...

There are a couple of exceptions from the COBOL days, but I would just force those people to convert their BNF/more restrictive diagrams to EBNF. The good news is that the exceptions are all less rich, just with a different syntax for their syntax documents (I hope that sentence makes sense).

There are some syntax types that may need more, but for now I would suggest that's out of scope, and they are all extensions of EBNF.

commented 2 days ago by plantuml (279,640 points)

This is really an alpha version, but we have given it a try.

@startebnf
character = letter | digit | symbol | "_" ;
@endebnf

If you are curious, you can have a look at the code.

Right now, only "alternation" is working.

Any thoughts?

commented 2 days ago by Todd Musheno (290 points)

Looks great!

You might want to provide some way to visually distinguish between terminal strings and identifiers on the rhs (right-hand side), and I am not sure if you want to be able to do full grammars or just one rule at a time.

Also, I am guessing people will eventually want full EBNF support at least, so you may want to figure out how to distinguish between types visually (terminals, identifiers, optionals, comments, etc.).

The stuff beyond EBNF will all be additional types in exactly this sense, but you are looking at a handful of types, so letters or something may suffice.

Also, I would simply list each rule one at a time. It's common for syntaxes to be recursive, so trying to unroll horizontally... that way lies madness. (Also, I'm not sure if you want to be able to switch between horizontal and vertical display, but I do not see why one could not do that.)

https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form#Table_of_symbols
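
To illustrate the point about recursion: in the example grammar posted earlier, `rhs` refers to itself, so inlining referenced rules into one horizontal diagram would never terminate. A small hypothetical Java check can tell which rules reach themselves, directly or through other rules. The class name RecursionCheck and the map-of-references input are assumptions made for this illustration, not anything from the PlantUML code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class RecursionCheck {
    // refs maps each rule name to the rule names its right-hand side mentions.
    public static boolean isRecursive(Map<String, List<String>> refs, String rule) {
        return reaches(refs, rule, rule, new HashSet<>());
    }

    // Depth-first search: can we get from 'from' back to 'target'?
    private static boolean reaches(Map<String, List<String>> refs, String from,
                                   String target, Set<String> seen) {
        for (String next : refs.getOrDefault(from, List.of())) {
            if (next.equals(target)) return true;
            // seen.add returns false for rules already explored, so cycles
            // that do not involve the target cannot loop forever.
            if (seen.add(next) && reaches(refs, next, target, seen)) return true;
        }
        return false;
    }
}
```

With the references taken from the example grammar, `rhs` is reported as recursive while `rule` and `identifier` are not, which is one argument for drawing each rule separately and showing references as named boxes.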

commented 2 days ago by Todd Musheno (290 points)
One minor point...

Your diagram should at least indicate it's for the "character" rule given that input...

If you look at the example on Wikipedia you will see a "header" for each rule, so if you only have one rule you will still need to know the "identifier" for the rule.
commented 2 days ago by Todd Musheno (290 points)

You also may want to add something like an incoming arrow ▶/⮞/etc. and an outgoing circle ⚫/⚪/etc. to indicate direction. I don't want to give too much detail on the exact symbol/visual you use, though... I've seen this done a million ways, and I think consistency with other diagrams would be more important than whatever I think.

minor point
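
As a rough text-only mock-up of the two suggestions above (a header naming the rule, plus start and end markers), here is a small Java sketch that renders an alternation-only rule. Everything in it is hypothetical: the class RailroadText, the ASCII layout, and the ">" and "o" markers, which merely stand in for whatever arrow and circle symbols fit the other diagrams. It needs Java 11 or later for String.repeat.

```java
import java.util.List;

// Hypothetical text renderer, for illustration only.
final class RailroadText {
    static final String START = ">"; // stand-in for an incoming arrow
    static final String END = "o";   // stand-in for an outgoing circle

    // Renders a header line with the rule name, then one branch per alternative.
    // For "character" with letter, digit, symbol, "_" the result is:
    //   character
    //   >--+-- letter -+--o
    //      +-- digit --+
    //      +-- symbol -+
    //      +-- "_" ----+
    public static String render(String ruleName, List<String> alternatives) {
        int width = 0;
        for (String alt : alternatives) {
            width = Math.max(width, alt.length());
        }
        StringBuilder out = new StringBuilder(ruleName).append('\n');
        for (int i = 0; i < alternatives.size(); i++) {
            String alt = alternatives.get(i);
            // Pad shorter alternatives with dashes so the right-hand joins line up.
            String cell = alt + " " + "-".repeat(width - alt.length());
            if (i == 0) {
                out.append(START).append("--+-- ").append(cell).append("-+--").append(END);
            } else {
                out.append("   +-- ").append(cell).append("-+");
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

Terminals keep their quotes here, which is one cheap way to tell them apart from identifiers; a graphical version would more likely use different box shapes or colors.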

commented 56 minutes ago by Todd Musheno (290 points)
Looked over the code this weekend...

There is some stuff I don't 100% understand, but I think it's all UI related.

Outside that, it all seems reasonable to me.
...