Regex diagram issues...

asked May 28, 2023 in Bug by Todd Musheno (2,680 points)
edited May 28, 2023 by Todd Musheno

There are a few improvements I think are needed for regex diagrams:

  • Special escapes should somehow indicate the characters they stand for. For example, instead of displaying \d, show some indication that any digit is allowed... I am not sure about this one.
  • Octal and Unicode escapes should show the unescaped character (as well/instead?).
  • A range looks a lot like a literal... maybe show the min and max separately with an arrow? Open to ideas on this one... but [a-z] should not look like "a-z"; it should look like an "a" with an indication that it's the min, associated with the "z"... I hope that makes sense.
  • Categories are displayed with the \p{} syntax intact, with some change to the box to indicate it's a category... I would display only the category name, "Letter" for example.
  • Inline comments should be supported: "a(?#The first Latin letter)" should comment the "a"... I have seen other syntaxes, so this is open to debate if others would prefer a different one.
  • \Q and \E mark the start and end of literal text and should not show up in the diagram for letteralCharacterSequence; other than that it looks great... keep in mind "\Qa|b\E" should be the literal text "a|b", NOT an alternation.
  • Octal escapes should not require a 0 prefix... "\337" is an octal escape, and should consume digits until it finds a non-digit.
  • Be sure to check that "\\" escapes to "\".
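
To make the escape-related items above concrete, here is a minimal Python sketch of a pre-pass a diagram renderer might run. Everything in it is an assumption for illustration: split_quoted, display_text and the label tables are hypothetical names, not part of the tool, and octal escapes are capped at three digits and not disambiguated from backreferences such as \1.

```python
import re

# Hypothetical label tables; a real tool would need many more entries.
SHORTHAND = {r"\d": "digit", r"\w": "word character", r"\s": "whitespace"}
CATEGORY = {"L": "Letter", "Lu": "Uppercase_Letter", "N": "Number", "P": "Punctuation"}

_CATEGORY = re.compile(r"\\p\{(\w+)\}")
_UNICODE = re.compile(r"\\u([0-9A-Fa-f]{4})")
_OCTAL = re.compile(r"\\0?([0-7]{1,3})")  # the leading 0 is optional


def split_quoted(pattern):
    """Split a pattern into ("regex", text) and ("literal", text) segments.

    Text between \\Q and \\E becomes a literal segment; the markers
    themselves are dropped so they never reach the diagram.
    """
    segments, buf, i, n = [], [], 0, len(pattern)
    while i < n:
        if pattern.startswith("\\Q", i):
            if buf:
                segments.append(("regex", "".join(buf)))
                buf = []
            end = pattern.find("\\E", i + 2)
            stop = n if end == -1 else end  # unterminated \Q runs to the end
            if stop > i + 2:
                segments.append(("literal", pattern[i + 2:stop]))
            i = n if end == -1 else end + 2
        elif pattern[i] == "\\" and i + 1 < n:
            # Keep escape pairs together so "\\Q" (escaped backslash, then Q)
            # is not mistaken for a \Q marker.
            buf.append(pattern[i:i + 2])
            i += 2
        else:
            buf.append(pattern[i])
            i += 1
    if buf:
        segments.append(("regex", "".join(buf)))
    return segments


def display_text(escape):
    """Return the text a diagram box could show for a single escape."""
    if escape == "\\\\":
        return "\\"
    if escape in SHORTHAND:
        return SHORTHAND[escape]
    m = _CATEGORY.fullmatch(escape)
    if m:
        # Fall back to the raw category code when it is not in the table.
        return CATEGORY.get(m.group(1), m.group(1))
    m = _UNICODE.fullmatch(escape)
    if m:
        return chr(int(m.group(1), 16))
    m = _OCTAL.fullmatch(escape)
    if m:
        return chr(int(m.group(1), 8))
    return escape  # unknown escapes are shown as written
```

With this sketch, split_quoted(r"\Qa|b\E") yields one literal segment "a|b" rather than an alternation, and display_text(r"\337") and display_text(r"\0337") both give the same character.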
commented May 28, 2023 by Todd Musheno (2,680 points)
Keep in mind there is no real logical distinction between [a-z] and [z-a]... if that helps.
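
A tiny Python sketch of that idea, treating the two endpoints as an unordered pair so the diagram always gets a min and a max. range_node is a hypothetical name, and the dict layout is only an assumption about what a diagram node might carry. Note that some engines (Python's re, for one) reject [z-a] as a bad range, so the tool may never actually be handed one.

```python
def range_node(first, last):
    """Build a range node whose endpoints are ordered by code point."""
    lo, hi = sorted((first, last), key=ord)
    return {"kind": "range", "min": lo, "max": hi}
```

Here range_node("z", "a") and range_node("a", "z") produce the same node, so both spellings would draw identically.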
commented May 28, 2023 by Todd Musheno (2,680 points)

I would like to see different box types (see literal text) for different items instead of just including the raw regex.

I think we need at least borders for:

  • raw text
  • character classes/escapes (and variants)
    • I am leaning to having a special border or indicator for categories.
  • comments
commented May 28, 2023 by Todd Musheno (2,680 points)
edited May 28, 2023 by Todd Musheno
I love the repetition notation! Good job!

Keep in mind... x{1,} should display the same as required repetition, and x{0,} should display the same as optional repetition.

Also x{,5} implies a 0, so it's the same as x{0,5} and should display the same.

Please consider displaying x{5,} like x{5,∞} (and yes, that is an infinity symbol); I could go either way on that.

x{,} technically implies x{0,∞}, but I could see not supporting that too... it's kind of nonsense.
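
One way to get all of these equivalences for free is to normalize every quantifier to a (min, max) pair before drawing. A minimal Python sketch, assuming "required repetition" means x+ and "optional repetition" means x*; normalize_quantifier and label are hypothetical names, not part of the tool:

```python
import math
import re

_BRACES = re.compile(r"\{(\d*)(?:(,)(\d*))?\}")


def normalize_quantifier(q):
    """Return (min, max) for a quantifier; max is math.inf when unbounded."""
    shorthand = {"?": (0, 1), "*": (0, math.inf), "+": (1, math.inf)}
    if q in shorthand:
        return shorthand[q]
    m = _BRACES.fullmatch(q)
    if not m:
        raise ValueError(f"not a quantifier: {q!r}")
    lo_text, comma, hi_text = m.groups()
    if comma is None:
        if not lo_text:
            raise ValueError("empty braces are not a quantifier")
        count = int(lo_text)
        return (count, count)  # x{3} means exactly three
    lo = int(lo_text) if lo_text else 0         # missing min implies 0
    hi = int(hi_text) if hi_text else math.inf  # missing max is unbounded
    return (lo, hi)


def label(q):
    """Render the normalized bounds, using the infinity symbol for no max."""
    lo, hi = normalize_quantifier(q)
    hi_text = "∞" if hi == math.inf else str(hi)
    return f"{{{lo},{hi_text}}}"
```

With this, "{1,}" normalizes to the same pair as "+", "{0,}" to the same pair as "*", and "{,5}" to the same pair as "{0,5}", so a renderer keyed on the pair would draw each of them identically. Whether "{,}" is accepted or rejected is then a one-line policy choice.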
