xbdev - software development
Sunday July 21, 2024

RegEx (Regular Expressions)...

Powerful, expressive and flexible.. ..


RegEx for Markdown (Simplified)

Simple Markdown (sweet and simple) to HTML using RegEx patterns. This combines multiple regular expression patterns, each focused on a different area of the markdown language (e.g., code blocks, bold, alert boxes).

Just to emphasise, this is a simple markdown project demo that uses basic regular expressions to match markdown patterns in a document. A more robust solution would be to implement your own token parser (essentially a complete compiler/syntax analyser). This way you'd go token by token, toggling global flags and taking the appropriate action.

A nice aspect of the simple markdown solution is that each regular expression can be tested and enabled/disabled separately. In a few special cases, for example code blocks, the region of text is removed so other expressions don't interfere with it. Different markdown commands have different priorities: bold text, for example, can still be set inside an alert box region (but not inside an HTML or code block).
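One way to keep each expression separately testable is to hold the patterns in a list with an on/off flag. The sketch below is only an illustration of that idea: the `rules` array, the `enabled` flag and the `applyRules` helper are hypothetical names, and the two patterns are simplified stand-ins rather than the ones used on this page.

```javascript
// Hypothetical rule table: each entry can be switched on or off independently.
const rules = [
  { name: 'bold',    enabled: true, pattern: /\*\*(.+?)\*\*/g, replacement: '<b>$1</b>' },
  { name: 'deleted', enabled: true, pattern: /~~(.+?)~~/g,     replacement: '<del>$1</del>' },
];

// Apply every enabled rule in list order (earlier rules have priority).
function applyRules(text, ruleList) {
  return ruleList
    .filter((rule) => rule.enabled)
    .reduce((out, rule) => out.replace(rule.pattern, rule.replacement), text);
}

console.log(applyRules('**a** ~~b~~', rules)); // <b>a</b> <del>b</del>
```

Disabling a rule is then a matter of flipping its `enabled` flag, which makes it easy to test one pattern at a time.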

Alert Boxes

The alert box is a juicy little trick to emphasise key points in color (such as warnings or information). The syntax for alert boxes is simply a triple colon followed by the alert box type. For our example, we'll have 4 alert box types (success, info, warning and danger).

This is what the alert boxes might look like (simple 'div' tag with color style information):

Yes - it works!!

This is a message - useful

Watch out - a warning!!

Oh No! Danger Will Robinson Danger Danger!

This is the regular expression to identify the alert box region. The pattern uses groups to identify both the content and the alert box type.

RegEx Alert Boxes

The regular expression has multiple groups to extract both the content and the alert box type (success/info). The success/info name is used to set the style information.

The groups are mapped to a DIV:

• $2 - second group item (e.g., success/info/..)
• $4 - fourth group item (e.g., body of the alert box)
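As a rough sketch of the group mapping above, the pattern below is arranged so that $2 is the alert type and $4 is the body. It is not the page's original expression: the closing triple colon on its own line, the exact whitespace handling and the `alertBoxesToHtml` name are all assumptions made for illustration.

```javascript
// Assumed syntax: "::: type" on one line, body lines, then a closing ":::".
// Group 1 = opening colons, 2 = alert type, 3 = rest of the opening line, 4 = body.
const alertPattern =
  /^(:::)[ \t]*(success|info|warning|danger)([ \t]*\n)([\s\S]*?)\n:::[ \t]*$/gm;

// Map the type to the CSS class name and the body to the DIV content.
function alertBoxesToHtml(markdown) {
  return markdown.replace(alertPattern, '<div class="$2">$4</div>');
}

console.log(alertBoxesToHtml('::: success\nYes - it works!!\n:::'));
// <div class="success">Yes - it works!!</div>
```

Restricting group 2 to the four known names means an unknown type is simply left in the text untouched.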

Example of the CSS color scheme. The CSS class name is taken from the alert box name (e.g., success/info). If the name caused a conflict, we could prepend something (like 'alert-') so it has a unique class name.

You could further enhance the styles, add rounded corners or different shapes/patterns for the boxes (background graphics).

ToDo List

The checkbox syntax is a quick way to visualize done and not done activities. Of course, you could change the HTML design to be a graphic (e.g., fancy tick icon or cross). You could even enhance it further by giving the nested checkbox elements different styles.

    Buy some salad
    Brush teeth
    Drink some water
        Have nap
        Feed cat

Note, this is what the HTML for checkboxes looks like:

The default HTML checkbox style looks like:

RegEx for Checkboxes (with Nesting)
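A minimal sketch of how nested checkboxes might be matched, assuming the common `- [x]` / `- [ ]` task syntax and four spaces (or one tab) of indentation per nesting level. The syntax, the 20px indent step and the `checkboxesToHtml` name are assumptions, not taken from the original listing.

```javascript
// Group 1 = leading indentation, 2 = the mark inside the brackets, 3 = the label.
const checkboxPattern = /^([ \t]*)[-*] \[( |x|X)\] (.*)$/gm;

function checkboxesToHtml(markdown) {
  return markdown.replace(checkboxPattern, (match, indent, mark, label) => {
    // Assumed: four spaces (or one tab) per nesting level.
    const level = Math.floor(indent.replace(/\t/g, '    ').length / 4);
    const checked = mark === ' ' ? '' : ' checked';
    return `<div style="margin-left:${level * 20}px">` +
           `<input type="checkbox" disabled${checked}> ${label}</div>`;
  });
}

console.log(checkboxesToHtml('- [x] Buy some salad\n    - [ ] Have nap'));
```

Because the nesting level is available in the callback, it could just as easily select a different CSS class per level instead of a margin.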

Blockquote Tags

Using the arrow to emphasise block quotations.

Since we're using regular expressions, each line must have an arrow to be included in the block quote. The markdown specification says the block should continue until a double newline (blank line) is hit; however, we'll treat each line with an arrow as its own block (this helps with the next bit on nesting).


This lets us use multiple arrows for nested blocks: blocks within blocks.

The tricky part is the syntax can be both arrows with and without a space between them (i.e., '>>>' and '> > >').

> Blockquotes can also be nested...
>> ...by using additional greater-than signs right next to each other...
> > > ...or with spaces between arrows.

RegEx for Blockquote (with Nesting)
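The per-line approach described above could be sketched like this: count the arrows at the start of the line (with or without spaces between them) and wrap the rest of the line in that many blockquote tags. The pattern and the `blockquotesToHtml` name are illustrative assumptions, not the page's original expression.

```javascript
// Group 1 = the run of arrows, e.g. ">>> " or "> > > "; group 2 = the quoted text.
const blockquotePattern = /^((?:>[ \t]*)+)(.*)$/gm;

function blockquotesToHtml(markdown) {
  return markdown.replace(blockquotePattern, (match, arrows, text) => {
    // Nesting depth is the number of '>' characters in the prefix.
    const depth = arrows.split('>').length - 1;
    return '<blockquote>'.repeat(depth) + text + '</blockquote>'.repeat(depth);
  });
}

console.log(blockquotesToHtml('>> nested twice'));
// <blockquote><blockquote>nested twice</blockquote></blockquote>
```

Each line becomes its own block, so consecutive quoted lines are not merged; that is the trade-off for keeping the rule to a single expression.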


Standard ways to make text stand out, from bold and italic to the fancier deleted (line-through) option.

This is bold text

This is italic text

Deleted text

RegEx for Bold

RegEx for Italic

RegEx for Deleted
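The three inline styles can be sketched with one small pattern each, assuming the usual `**bold**`, `*italic*` and `~~deleted~~` markers. The patterns, the output tags and the `inlineStylesToHtml` name are assumptions for illustration; bold is applied before italic so the double asterisks are not consumed by the single-asterisk rule.

```javascript
const boldPattern    = /\*\*(.+?)\*\*/g; // **text**
const italicPattern  = /\*(.+?)\*/g;     // *text*
const deletedPattern = /~~(.+?)~~/g;     // ~~text~~

function inlineStylesToHtml(markdown) {
  return markdown
    .replace(boldPattern, '<b>$1</b>')      // must run before italic
    .replace(italicPattern, '<i>$1</i>')
    .replace(deletedPattern, '<del>$1</del>');
}

console.log(inlineStylesToHtml('**This is bold text** and *this is italic text*'));
// <b>This is bold text</b> and <i>this is italic text</i>
```

This sketch does not try to handle combined markers such as a triple asterisk; that kind of case is where a token parser starts to pay off.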


Markdown for an image tag (optional alt text).
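A possible pattern for the image tag, assuming the standard `![alt](url)` form with the alt text allowed to be empty. The pattern and the `imagesToHtml` name are assumptions; titles and URLs containing spaces or brackets are not handled.

```javascript
// Group 1 = alt text (may be empty), group 2 = image URL (no spaces or ')').
const imagePattern = /!\[([^\]]*)\]\(([^)\s]+)\)/g;

function imagesToHtml(markdown) {
  return markdown.replace(imagePattern, '<img src="$2" alt="$1">');
}

console.log(imagesToHtml('![logo](logo.png)')); // <img src="logo.png" alt="logo">
console.log(imagesToHtml('![](photo.jpg)'));    // <img src="photo.jpg" alt="">
```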

Code Blocks

Code blocks are defined by two sets of triple backticks (```), one for the start and one for the end. When you're in a block of code, you don't want anything to interfere. For example, you might want a code block containing markdown or HTML. To resolve this with RegEx, we prioritise code blocks: at the start, we identify the code blocks and remove them from the text (storing them in an array). After we've processed and modified the markdown in the document, we re-insert the code blocks (safe and unmodified) with the appropriate wrappings.

In HTML, you can use the pre tag to disable formatting (sort of a raw display). This works great in most cases; however, it breaks when you're displaying HTML, as the HTML tags are still parsed. There are lots of different ways around this, but a simple hacky way is to use a textarea block instead of a pre.

1. Match and Extract (Store) all Code Blocks
RegEx Match All Code Blocks

2. Replace all Code Blocks
RegEx Replace All Code Blocks with [MCODE]

3. Perform any markdown processing and modifications on the text

4. At the end, restore all [MCODE] placeholders with the stored code
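The four steps above can be sketched as a single function. This is a minimal illustration, not the original listing: the `convertWithCodeBlocks` and `escapeHtml` names, the `processMarkdown` callback and the exact pattern are assumptions, and the pattern is written with a `{3}` quantifier purely so the example contains no literal triple backticks.

```javascript
// Three backticks, an optional language tag, a newline, the body, three backticks.
const codeBlockPattern = /`{3}[^\n]*\n([\s\S]*?)`{3}/g;

// Escape the stored code so it displays as text inside the textarea.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function convertWithCodeBlocks(markdown, processMarkdown) {
  // 1. Match and store the body of every code block.
  const stored = [...markdown.matchAll(codeBlockPattern)].map((m) => m[1]);
  // 2. Replace every code block with the [MCODE] placeholder.
  let text = markdown.replace(codeBlockPattern, '[MCODE]');
  // 3. Run the other markdown rules on the remaining text.
  text = processMarkdown(text);
  // 4. Restore the placeholders in order; a callback is used so '$' in the
  //    stored code is never treated as a replacement pattern.
  let index = 0;
  return text.replace(/\[MCODE\]/g, () =>
    `<textarea readonly>${escapeHtml(stored[index++])}</textarea>`);
}
```

A call might look like `convertWithCodeBlocks(text, (t) => t.replace(/\*\*(.+?)\*\*/g, '<b>$1</b>'))`, where the callback stands in for the full set of markdown rules. Note that the sketch assumes the document never contains the literal text [MCODE] outside a code block.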

Hacking Textarea Resize
A small problem with the textarea tag is that it doesn't automatically resize to its content, so if you want it to have the same height as the text you have to do a little extra work. A small hack here is to piggyback on the img tag: it tries to load an image and, when that fails, it calls the error callback, which we use to find the neighboring textarea and resize it to the height of its content.

HTML/JS Example:
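The snippet below is one possible reconstruction of the hack described above, not the original listing. It builds the HTML for a code block as a string: a textarea followed by an img whose empty src is assumed to fail to load, so the inline onerror handler runs, resizes the previous sibling (the textarea) and removes the img. The `codeBlockToTextarea` name and the inline styles are assumptions.

```javascript
// Escape code so the browser shows it as text inside the textarea.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function codeBlockToTextarea(code) {
  // Runs in the browser when the image fails to load; 'this' is the img element.
  const resize =
    "const t=this.previousElementSibling;" +
    "t.style.height='auto';" +
    "t.style.height=t.scrollHeight+'px';" +
    "this.remove();";
  return '<textarea readonly spellcheck="false" style="width:100%;">' +
         escapeHtml(code) +
         '</textarea>' +
         '<img src="" alt="" onerror="' + resize + '">';
}

console.log(codeBlockToTextarea('<b>hi</b>'));
```

The img must come directly after the textarea for `previousElementSibling` to find it; if the layout needs anything in between, the handler would have to locate the textarea another way.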

Link to test it out on Notebook [LINK]


Copyright (c) 2002-2024 xbdev.net - All rights reserved.
Designated articles, tutorials and software are the property of their respective owners.