Nested code fences in Markdown

susam.net

168 points by todsacerdoti 9 hours ago


rednafi - 29 minutes ago

Ah, YAML and Markdown, two beautiful accidents of tech. It still boggles my mind that we collectively couldn’t come up with a post hoc spec and fix all the warts with a strict parser for either of them. Sure, it would break quite a bit of existing stuff, but the pain would probably be worth it.

starkparker - 6 hours ago

This is also how you handle adding code blocks in GitHub comment suggestions, fwiw.

    ````suggestion
    This example should instead be:

    ```basic
    10 PRINT "LOL"
    20 GOTO 10
    ```
    ````

chuckadams - 7 hours ago

Markdown's parser seems to be a fascinating anomaly: a specification that consists entirely of exceptions and corner cases.

codazoda - 7 hours ago

I might be able to use this, especially in LLMs where I ask them to give me things in code fences all the time. If I ask for markdown in a code fence, it all falls apart. If, however, I asked for markdown in a ~~~ code fence, or even ~~~~~, all would be right with the world, since they typically use ```.

Igor_Wiwi - 6 hours ago

I will use it as a rendering benchmark for mdview.io https://mdview.io/#mdv=N4IgbiBcCMA0IBMCGAXJUTADrhzWOAtgnjgMI...

mohsen1 - 7 hours ago

I love hacker news! You learn something useful here and there.

I always used HTML elements like <pre /> and <code /> to get around this in the past.
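
A minimal sketch of that workaround, assuming the renderer passes raw HTML through untouched (not every Markdown implementation does). The helper name `as_pre_code` is made up for illustration:

```python
import html


def as_pre_code(source: str) -> str:
    """Wrap arbitrary text in <pre><code> so no Markdown fence is needed."""
    # Escape &, < and > so the text is displayed literally instead of
    # being interpreted as HTML. Backticks need no escaping here.
    return "<pre><code>" + html.escape(source, quote=False) + "</code></pre>"


if __name__ == "__main__":
    print(as_pre_code("```\n<b>not bold</b>\n```"))
```

The backtick fences inside the text are just ordinary characters at that point, which is why the nesting problem goes away.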

zelphirkalt - 8 hours ago

All this complication seems to stem from the simple fact that the fences don't have recognizably distinct start and end markers. It's all "`" or "~", instead of one symbol at the start and a different symbol at the end. On top of that come the different numbers of backticks or tildes. Why add such ambiguity, which only makes it harder to parse things correctly? This immediately raises the question: "What if I start a block with 4 backticks and end it with 5?"
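
For what it's worth, CommonMark does answer that question: a closing fence must use the same character as the opening fence and be at least as long, so four backticks closed by five works, while four closed by three does not (the block then runs on until a long enough fence or the end of the document). A minimal sketch of that check, where `closes_fence` is a hypothetical helper and not any parser's actual API:

```python
import re


def closes_fence(opening: str, line: str) -> bool:
    """Return True if `line` closes a block opened by the fence run `opening`."""
    # A closing fence: up to three spaces of indentation, a run of a single
    # fence character, then only spaces or tabs (no info string allowed).
    match = re.fullmatch(r" {0,3}(`+|~+)[ \t]*", line)
    if match is None:
        return False
    run = match.group(1)
    # Same character as the opener, and at least as many of them.
    return run[0] == opening[0] and len(run) >= len(opening)


if __name__ == "__main__":
    print(closes_fence("````", "`````"))  # opened with 4, closed with 5
    print(closes_fence("````", "```"))    # opened with 4, closed with 3
```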

All these complications would have been avoidable with a better-thought-out design or a better choice of symbols. For example, one could have used brackets:

    [[[lang
    code here
    ]]]

And if one wanted to nest it, it should automatically work:

    [[[html
    html code
    [[[css
    css code
    ]]]
    [[[js
    js code
    ]]]
    html code
    ]]]

In case one wants to output literally "[[[", one could escape it using a backslash, as usual in many languages.

That would be much simpler to parse. It is kind of like parsing S-expressions. There is no need for 4 backticks, 5, or any higher number. I don't want to sit there counting backticks in the document to know which part of a nested code block some code belongs to. It's a silly design.
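
A rough sketch of how such a parser could look, assuming the bracket notation proposed above (which is hypothetical, not an existing Markdown feature), that openers and closers sit on their own lines, and that a leading backslash escapes them. The names `Block` and `parse` are invented for the example:

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    lang: str
    # Each child is either a plain text line (str) or a nested Block.
    children: list = field(default_factory=list)


def parse(text: str) -> Block:
    """Parse the hypothetical [[[lang ... ]]] notation into a tree."""
    root = Block(lang="")
    stack = [root]  # innermost open block is stack[-1]
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[[["):
            block = Block(lang=stripped[3:].strip())
            stack[-1].children.append(block)
            stack.append(block)
        elif stripped == "]]]":
            if len(stack) == 1:
                raise ValueError("unmatched ]]]")
            stack.pop()
        elif stripped.startswith(("\\[[[", "\\]]]")):
            # Escaped marker: keep the line, drop the first backslash.
            stack[-1].children.append(line.replace("\\", "", 1))
        else:
            stack[-1].children.append(line)
    if len(stack) != 1:
        raise ValueError("unclosed [[[ block")
    return root


if __name__ == "__main__":
    doc = "[[[html\nhtml code\n[[[css\ncss code\n]]]\nhtml code\n]]]"
    print(parse(doc))
```

A single stack is enough, much like reading S-expressions. The cost is that any code line consisting of "]]]" has to be escaped by the author.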

pratikdeoghare - 6 hours ago

I faced this problem when designing my own notation [1].

Solved it by surrounding code with more ticks than the maximum number of consecutive ticks inside its text. This allows arbitrary nesting.
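
The same rule works for Markdown fences. A minimal sketch, with `fence_for` and `wrap` as made-up helper names and the three-character minimum taken from the usual Markdown fence rule:

```python
import re


def fence_for(code: str, char: str = "`", minimum: int = 3) -> str:
    """Return a fence one character longer than the longest run inside `code`."""
    runs = re.findall(re.escape(char) + "+", code)
    longest = max((len(run) for run in runs), default=0)
    return char * max(minimum, longest + 1)


def wrap(code: str, info: str = "") -> str:
    """Surround `code` with a fence that cannot collide with its contents."""
    fence = fence_for(code)
    return f"{fence}{info}\n{code}\n{fence}"


if __name__ == "__main__":
    inner = wrap('10 PRINT "LOL"\n20 GOTO 10', "basic")
    # Wrapping the already-fenced text again picks a longer outer fence.
    print(wrap(inner, "markdown"))
```

Because the fence is computed from the content, wrapping an already wrapped block simply yields one more tick each time.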

Postgres solves it by using `$something$ whatever $something$` [2].

[1] https://github.com/PratikDeoghare/brashtag
[2] https://www.postgresql.org/docs/current/sql-syntax-lexical.h...

zahlman - 3 hours ago

I hoped this would have some discussion of the design rather than simply saying how to do it, because I already knew (because it's come up on Stack Overflow / Stack Exchange meta a few times).

zokier - 7 hours ago

I realize that it would be somewhat antithetical to Markdown, but I increasingly feel that length-prefixing everything makes a lot of stuff easier at pretty low cost. Anything depending on delimiters or start/end tags inevitably ends up with difficult quoting rules or some other awkward scheme (like the one seen here).
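
To make the idea concrete, here is a small sketch of length-prefixed framing. The "decimal length, colon, payload" layout and the names `encode` and `decode` are assumptions chosen for the example, not a format anyone in the thread proposed:

```python
def encode(payload: bytes) -> bytes:
    """Frame `payload` as b"<decimal length>:<payload>"."""
    return str(len(payload)).encode("ascii") + b":" + payload


def decode(stream: bytes) -> tuple[bytes, bytes]:
    """Read one framed payload; return (payload, remaining bytes)."""
    header, sep, rest = stream.partition(b":")
    if not sep or not header.isdigit():
        raise ValueError("missing length prefix")
    size = int(header)
    if len(rest) < size:
        raise ValueError("truncated payload")
    # The payload is taken by count, so its contents never need escaping.
    return rest[:size], rest[size:]


if __name__ == "__main__":
    framed = encode(b"```\ncode\n```") + encode(b"next")
    first, remaining = decode(framed)
    second, _ = decode(remaining)
    print(first, second)
```

The reader never scans the payload for a terminator, so fences, brackets, or colons inside it are harmless. The price is that the text is no longer comfortable to write by hand.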

epage - 8 hours ago

> In fact, a code fence need not consist of exactly three backticks or tildes. Any number of backticks or tildes is allowed, as long as that number is at least three

Unfortunately, some Markdown implementations don't handle this well. We were looking at using code-fence-like syntax in Rust, and we were worried about whether people would know how to embed it in Markdown code fences, but bad implementations were the ultimate deal breaker. We switched to `---` instead, making basic cases look like YAML stream separators, which are used for frontmatter.

data_ders - 6 hours ago

TIL about triple curlies! mind blown

trvz - 6 hours ago

Markdown assumes the user won’t do anything silly, and I’m fine with that. Rather the people enabling such behaviour are annoying.

rasur - 6 hours ago

    #+BEGIN_SRC lolcode
    blah
    #+END_SRC

org-mode to the rescue ;p
