support punctuation in YAML scalars

0 votes
asked Feb 5, 2021 in Wanted features by Martin (8,360 points)
edited Feb 7, 2021 by Martin

The webpage https://yaml.org is written in YAML, but a few obstacles prevent PlantUML from parsing it.

The following characters currently don't work in mapping keys:

( ) + # :

Note that "#" and ":" are valid string characters unless they appear as "<space>#" or ":<space>", which introduce a comment and a mapping value respectively.  (cf https://forum.plantuml.net/13018/consider-supporting-yaml-same-line-comments)
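To illustrate the "<space>#" rule, here is a minimal Python sketch. It is not PlantUML's code; the function name strip_comment is made up for this post, and it assumes plain (unquoted) text, so a " #" inside a quoted scalar would be cut off incorrectly.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing YAML comment from one line of plain (unquoted) text.

    Simplification (assumption): quoted scalars are not handled.
    """
    if line.lstrip().startswith("#"):
        return ""  # the whole line is a comment
    for i, ch in enumerate(line):
        # "#" only opens a comment when it is preceded by white space
        if ch == "#" and i > 0 and line[i - 1] in " \t":
            return line[:i].rstrip()
    return line.rstrip()


print(strip_comment("  - YAML::XS           # Binding to libyaml"))
print(strip_comment("  C#/.NET:"))
```

With this rule the "#" in "C#/.NET:" stays part of the key, while "# Binding to libyaml" is dropped.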

Much more punctuation should also be allowed; YAML isn't fussy.

As a result the following snippets don't work:

@startyaml
YAML Resources:
  YAML 1.2 (3rd Edition): http://yaml.org/spec/1.2/spec.html
@endyaml

@startyaml
  C/C++:
  - libfyaml           # "C" YAML 1.2 processor                                            | YTS
@endyaml

@startyaml
  C#/.NET:
  - YamlDotNet         # YAML 1.1/(1.2) library with serialization support                  | YTS
  - yaml-net           # YAML 1.1 library
@endyaml

@startyaml
YAML Resources:
  YAML IRC Channel:       "#yaml on irc.freenode.net"
@endyaml

@startyaml
Projects:
  Perl Modules:
  - YAML::XS           # Binding to libyaml
@endyaml

1 Answer

0 votes
answered Feb 7, 2021 by plantuml (294,660 points)

We have fixed some of your examples. However, the handling of ":" in key names raises a question.

Is a space mandatory after ":" in a key/value definition?

So is:

@startyaml
key:value
@endyaml

a valid example, or do we have to use:

@startyaml
key: value
@endyaml

commented Feb 7, 2021 by Martin (8,360 points)
edited Feb 7, 2021 by Martin

Nice work, the yaml.org site processes much better now.

Caveats:

  • The directive "%YAML 1.2"
  • The start document marker "---"
  • The end document marker "..."
  • Embedded ":", as you questioned.

YAML uses three dashes (“---”) to separate directives from document content. This also serves to signal the start of a document if no directives are present. Three dots (“...”) indicate the end of a document without starting a new one, for use in communication channels.

I'm not sure if you want to bother with those bells & whistles; they're easy enough for people to strip off and they are more stream control than yaml data.  On the other hand, a lot of the yaml examples have them and they would be very easy to ignore...
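As a rough idea of how little it would take to ignore them, here is a Python sketch. The function name strip_stream_markers is hypothetical, and it makes simplifying assumptions: the input is a single document, markers sit alone on their line, and no block scalar contains a line starting with "%".

```python
def strip_stream_markers(text: str) -> str:
    """Drop YAML directives and document start/end markers.

    Assumptions: one document per input, markers alone on their line,
    and no block scalar content that begins with "%".
    """
    kept = []
    for line in text.splitlines():
        stripped = line.rstrip()
        if stripped.startswith("%"):
            continue  # directive such as %YAML 1.2
        if stripped in ("---", "..."):
            continue  # document start / end marker
        kept.append(line)
    return "\n".join(kept)


print(strip_stream_markers("%YAML 1.2\n---\nkey: value\n..."))
```

A multi-document stream would be merged into one by this approach, so a real implementation might prefer to stop at the first "..." or second "---".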

Regarding embedded ":"s in the key and/or value:

The best quote I could find in the spec was admittedly in the flow mapping section (7) but I am sure it is explaining the way the non-flow version works in order to contrast it with options for the flow version.  Here is the quote:

Normally, YAML insists the “:” mapping value indicator be separated from the value by white space. A benefit of this restriction is that the “:” character can be used inside plain scalars, as long as it is not followed by white space. This allows for unquoted URLs and timestamps. It is also a potential source for confusion as “a:1” is a plain scalar and not a key: value pair.

To ensure JSON compatibility, if a key inside a flow mapping is JSON-like, YAML allows the following value to be specified adjacent to the “:”. This causes no ambiguity, as all JSON-like keys are surrounded by indicators. However, as this greatly reduces readability, YAML processors should separate the value from the “:” on output, even in this case.

i.e. 

a:1: b:2 would be { "a:1": "b:2" }
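The same rule as a small Python sketch, in case it helps. The function name split_block_mapping is invented for this example; it assumes a single block-style line with plain scalars and ignores quoted keys, flow style, and comments.

```python
def split_block_mapping(line: str):
    """Split one block-mapping line into (key, value), or return None.

    A ":" counts as the mapping indicator only when it is followed by
    white space or ends the line. Assumption: plain scalars only.
    """
    text = line.strip()
    for i, ch in enumerate(text):
        if ch == ":" and (i + 1 == len(text) or text[i + 1] in " \t"):
            return text[:i], text[i + 1:].strip()
    return None  # no indicator found: the whole line is a plain scalar


print(split_block_mapping("a:1: b:2"))
print(split_block_mapping("key:value"))
print(split_block_mapping("YAML 1.2 (3rd Edition): http://yaml.org/spec/1.2/spec.html"))
```

Under this reading "key:value" is a single plain scalar rather than a pair, and the URL survives intact because none of its colons is followed by white space.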

If you were supporting flow style then the following is allowed, in order to support JSON compatibility (JSON is a subset of YAML):

{"a":b} # a colon without a space is valid in flow style as long as the key is JSON compatible

Another quote, from section 8.2.2 (Block Mappings):

In this case, the value may be specified on the same line as the implicit key. Note however that in block mappings the value must never be adjacent to the “:”, as this greatly reduces readability and is not required for JSON compatibility (unlike the case in flow mappings).

...