Using VS Code and Node to Write HTML with Style (Baked-in)!

Introduction

In this article, I'm going to explain how you can use the power of Visual Studio Code and Node.js to build a custom editing tool to overcome ornery blogging platforms and just write Markdown.

The specifics of this article are tailored to the problem that we were having, but the strategy involved here can be applied to a great number of tasks where you'd like to use Markdown to create some content for a legacy system, or to create other, even more interesting custom editor workflows. For example, if you wanted to author some HTML for inclusion in an email, you would be heavily constrained in the markup that you can use.

The Problem

Infragistics has been providing forums and blogging for a long while now, and, as often happens when you need to keep a large back-library of content accessible, is running some rather old blogging software that works but isn't necessarily super pleasant to work with. The main vectors for getting content into the system include:

  • Working with a buggy web-based WYSIWYG editor which doesn't support entering code snippets very well, and incorrectly round trips existing content, causing the content to change in unpredictable ways if you edit it again.
  • Using a button on said buggy web-based WYSIWYG editor to inject literal HTML code, which coerces the injected HTML into sometimes buggy forms, presumably to prevent unsafe patterns, but, generally, results in things not looking at all like the author intended.
  • Working dark magics, and fighting murky single sign on systems, to hook Windows Live Writer (now Open Live Writer) up to the web service interface and injecting content from there, only to find the blog engine still mangling the submitted HTML in undesirable ways.

Beyond the difficulty of actually getting the content into the system, there's the fact that Infragistics blog content is heavy on the use of code snippets. These should, ideally, look nice, and have syntax highlighting.

If you use a snippet storage service like GitHub Gist, you may think this part is easy, but our issue is that our blog software specifically suppresses and removes iframe elements from the post content. I believe this is a security measure, due to the engine being inadequately prepared to distinguish between post content and comment content as far as the editing is concerned, and you wouldn't want your comment writers injecting iframes!

Unfortunately, most decent external snippet software would like to load the snippet into an iframe, or need custom JavaScript to run (another no-no as far as our blog engine is concerned).

For a while, we were running some JavaScript at the site level which would apply some syntax highlighting to pre tags that were formatted in a very precise way, but this JavaScript kept going missing every time our Website team did some major revisions to the site, causing all the articles relying on it to suddenly look atrocious. When this went missing for the most recent time, I started cooking up this solution.

Finally, working in a WYSIWYG editor in a proprietary blogging portal doesn't afford you many ways to review and iterate on content before it goes live. Ideally, someone should be able to ask for review of a blog post from peers or copy editors before publishing. The closest approximation we had to this was to post an unlisted draft post, and request feedback. However, there was no way to annotate and comment on the content inline, as you can in, say, a Markdown file stored in git with changes made using a pull request workflow.

Now, thankfully, I'm assured we are updating our blogging software soon, to a much more recent engine, so this is a short-term problem I'm addressing (hopefully), but this short-term solution was not that difficult to create, may continue to be relevant after the update, and offers a great strategy for creating custom editing workflows to get around problems similar to ours. Many of our engineers would blog less, or not at all, due to the barriers to authoring and iterating on the content; lowering those barriers encourages more content.

The Plan

Ok, let's summarize the problems to overcome:

  • Doing anything but injecting literal HTML into the current system is pretty difficult, and even this needs to be done with care, as there are many HTML patterns that the system will ignore, destroy, or misrepresent.
  • Working with the blogging engine's WYSIWYG editor or even Open Live Writer is a big pain, especially when working with code snippets, or iterating on the content.
  • We need code snippets that look nice, and that don't explode when some external JavaScript goes missing.
  • We need some way to have a review workflow for the content.

And here's a plan to resolve the issues:

  • VS Code is free, and has a great side by side Markdown preview when editing a markdown file.
  • Node.js and gulp let us initiate a background task to continuously convert our Markdown files to HTML, whenever they are saved.
  • Because we can run arbitrary JavaScript as part of the gulp pipelines, we should be able to perform additional work to manipulate the HTML produced so that the blogging software accepts it gracefully.
  • If the editing experience is just to iterate on Markdown files in VS Code, we should be able to store those Markdown files in git, and use a standard pull-request review workflow to create a review process for our posts.

The Solution

OK, first we need to make sure that we have Visual Studio Code and Node.js installed, as they'll be the main workhorses in this workflow. Once they are installed, we need to create a directory that will house the Markdown files (e.g. c:\Blogging), and open VS Code pointing at that folder.

First create an empty package.json file in the directory just containing:

{ }

Now we'll need to run a bunch of commands on the console in the context of the folder, so open up the integrated terminal via:

View => Integrated Terminal 

Next we need to globally install markdown-it, which is the Node.js package we'll use to convert the Markdown to HTML:

npm install -g markdown-it

Next we need to install gulp globally and locally, and some gulp extensions, which will help manage the workflow of converting the markdown files to HTML:

npm install -g gulp
npm install gulp gulp-markdown-it --save

This much should be sufficient to allow us to write a gulpfile.js that will continuously convert the Markdown files in our directory into HTML as they are saved, and to kick off the process with a Ctrl + Shift + B via some VS Code magic.

First, we'll create a test Markdown file called test.md in the folder, and give it some content:

## Introduction
This is an introduction.

This is another paragraph.

```cs
//this is a gated code block
public class TestClass
{
    public void TestMethod()
    {

    }
}
```

You can open the Markdown Preview with CTRL-K V and view the preview alongside the file you are editing.

Now, we can create the gulp configuration that will set up the conversion workflow. Create a file in the directory called gulpfile.js and fill it with this content:

var gulp = require('gulp');
var markdown = require('gulp-markdown-it');
var fs = require('fs');

gulp.task('markdown', function() {
    return gulp.src(['**/*.md', '!node_modules/**'])
        .pipe(markdown({
            options: {
                html: true
            }
        }))
        .pipe(gulp.dest(function(f) {
            return f.base;
        }));
});

gulp.task('default', ['markdown'], function() {
    gulp.watch(['**/*.md', '!node_modules/**'], ['markdown']);
});

With that file saved, we should be able to run gulp and see the results, so from the integrated terminal run:

gulp

This results in a file called test.html being created in our directory with this content:

<h2>Introduction</h2>
<p>This is an introduction.</p>
<p>This is another paragraph.</p>
<pre><code class="language-cs">//this is a gated code block
public class TestClass
{
    public void TestMethod()
    {
        
    }
}
</code></pre>

The way we have it configured, gulp will continue to watch for any changes to Markdown files in this directory (or subdirectories), and if any of them change, it will feed these through markdown-it to produce new HTML content in an html file with the same name as the Markdown file. If we make a change to the Markdown file:

## Introduction
This is an introduction.

This is another paragraph.

```cs
//this is a gated code block
public class TestClass2
{
    public void TestMethod2()
    {

    }
}
```

Here TestClass has been changed to read TestClass2 and TestMethod has been changed to read TestMethod2. After hitting save, and waiting a moment, test.html now contains:

<h2>Introduction</h2>
<p>This is an introduction.</p>
<p>This is another paragraph.</p>
<pre><code class="language-cs">//this is a gated code block
public class TestClass2
{
    public void TestMethod2()
    {
        
    }
}
</code></pre>
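If you're curious what the glob patterns and the destination callback in the gulpfile amount to, here's a rough sketch in plain Node.js. It isn't how gulp does it internally, and findMarkdownFiles and htmlPathFor are hypothetical helper names made up purely for illustration. Note that this sketch skips node_modules folders at any depth, which is slightly broader than the '!node_modules/**' pattern.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: recursively collect the .md files under rootDir,
// skipping node_modules, roughly what ['**/*.md', '!node_modules/**'] selects.
function findMarkdownFiles(rootDir, currentDir) {
    currentDir = currentDir || rootDir;
    var results = [];
    fs.readdirSync(currentDir, { withFileTypes: true }).forEach(function (entry) {
        var fullPath = path.join(currentDir, entry.name);
        if (entry.isDirectory()) {
            if (entry.name !== 'node_modules') {
                results = results.concat(findMarkdownFiles(rootDir, fullPath));
            }
        } else if (path.extname(entry.name) === '.md') {
            // Report paths relative to the folder we started from.
            results.push(path.relative(rootDir, fullPath));
        }
    });
    return results;
}

// Hypothetical helper: the HTML file lands next to its Markdown source,
// with the same base name and an .html extension.
function htmlPathFor(markdownPath) {
    var parsed = path.parse(markdownPath);
    return path.join(parsed.dir, parsed.name + '.html');
}

// Example use (not run here): list each Markdown file and its output path.
// findMarkdownFiles('c:\\Blogging').forEach(function (file) {
//     console.log(file + ' -> ' + htmlPathFor(file));
// });
```

So test.md maps to test.html in the same folder, and a Markdown file in a subfolder gets its HTML written into that same subfolder.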

This is neat, but since we are using VS Code, we can even avoid needing to run the gulp command to get things started. All we need to do is create a tasks.json file in the .vscode sub folder in the project directory and provide this content:

{
    "version": "2.0.0",
    "tasks": [
        {
            "type": "gulp",
            "task": "default",
            "problemMatcher": [],
            "group": {
                "kind": "build",
                "isDefault": true
            }
        }
    ]
}

This makes it so that when you press CTRL + SHIFT + B, it will start running the gulp command in the background. In this way, we are combining the markdown editor in VS Code, including its really spiffy preview pane, with Node.js, markdown-it, and gulp in order to create an awesome HTML editor.

Taming the Ornery Blogging Engine

Now, if you aren't dealing with some ornery blogging engines, as we are, the above may be all that you will need. But even if so, you may find the rest of this interesting and useful. Here are the issues with our blogging engine that we need to address:

  • We'd prefer to render code snippets down to static styled HTML depending on no external JS or CSS style sheets.
  • Our blogging engine messes with tabs embedded in pre tags (why, I have no idea).
  • Our blogging engine fails to round trip line breaks within pre tags, often dropping them, so it's safer to convert line breaks to explicit break tags or things will get messed up if we republish our HTML.
  • Our blogging engine likes to destroy whitespace at the start of lines even inside pre tags, so we'd like to convert those to non-breaking spaces (&nbsp;).

So, first, we'll deal with highlighting the code snippets and sanitizing the problematic HTML within the pre tags. We'll use a node package called highlightjs to format the fenced code blocks in the Markdown into some syntax highlighted markup. So let's install that first:

npm install highlightjs --save

Once that is installed, we can modify the gulpfile.js to look like this:

var gulp = require('gulp');
var markdown = require('gulp-markdown-it');
//new
var hljs = require('highlightjs/highlight.pack.js');
var fs = require('fs');

//new
hljs.configure({
    tabReplace: '&nbsp;&nbsp;&nbsp;&nbsp;'
});

gulp.task('markdown', function() {
    return gulp.src(['**/*.md', '!node_modules/**'])
        .pipe(markdown({
            options: {
                html: true,
                //new
                highlight: function (str, lang) {
                    if (lang && hljs.getLanguage(lang)) {
                        try {
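                            // Highlight the fenced block; .value holds the generated markup.
                            var output = hljs.highlight(lang, str).value;
                            // Replace every line break with an explicit <br /> tag.
                            output = output.replace(/(?:\r\n|\r|\n)/g, '<br />');
                            // Convert whitespace at the very start of the output to &nbsp; entities.
                            output = output.replace(/^\s+/gm, function(m){ return m.replace(/\s/g, '&nbsp;');});
                            // Convert the indentation that follows each <br /> to &nbsp; entities.
                            output = output.replace(/(\<br\s+\/\>)(\s+)/gm, function(m){ return m.replace(/\>(\s+)/g, function (n) { return n.replace(/\s/g, '&nbsp;'); } ); });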
                            return '<pre class="hljs"><code>' +
                                output +
                                '</code></pre>';
                        } catch (e) {}
                    }

                    return '';
                }
            }
        }))
        .pipe(gulp.dest(function(f) {
            return f.base;
        }));
});

gulp.task('default', ['markdown'], function() {
    gulp.watch(['**/*.md', '!node_modules/**'], ['markdown']);
});

In the above we:

  • Add a require statement so that we can use the highlightjs library.
  • Configure the highlightjs library to use 4 non breaking spaces instead of the tab character.
  • Use the highlight hook when running markdown-it to specify how the syntax highlighting should be performed for fenced code blocks. In this instance, we invoke highlightjs to do the highlighting.
  • In addition to transforming the fenced code blocks, we perform a series of regular expression replaces on the output content to remove patterns of characters that prove problematic for the blogging engine, and replace them with equivalent safer sequences of characters.
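To make those replacements easier to follow, here is a minimal standalone sketch of the same clean-up. The sanitizeForBlog name is hypothetical, and the sketch splits the text into lines rather than chaining regular expressions as the gulpfile does, but for ordinary input it should give the same result.

```javascript
// Hypothetical helper (not part of the gulpfile): converts line breaks to
// <br /> tags and leading whitespace on each line to &nbsp; entities.
function sanitizeForBlog(markup) {
    return markup
        .split(/\r\n|\r|\n/)
        .map(function (line) {
            // Swap each leading whitespace character for a non-breaking space.
            return line.replace(/^\s+/, function (indent) {
                return indent.replace(/\s/g, '&nbsp;');
            });
        })
        .join('<br />');
}

var sample = 'class A\n{\n    int x;\n}';
console.log(sanitizeForBlog(sample));
// prints: class A<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;int x;<br />}
```

Only the whitespace at the start of each line is touched; spaces between tokens in the middle of a line are left alone, since the blogging engine only mangles the leading ones.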

You should restart the gulp task at this point:

F1 => Terminate Running Task 

And then restart the task with CTRL + SHIFT + B. At this point your test.html should read like this:

<h2>Introduction</h2>
<p>This is an introduction.</p>
<p>This is another paragraph.</p>
<pre><code class="language-cs"><pre class="hljs"><code><span class="hljs-comment">//this is a gated code block</span><br /><span class="hljs-keyword">public</span> <span class="hljs-keyword">class</span> <span class="hljs-title">TestClass2</span><br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-function"><span class="hljs-keyword">public</span> <span class="hljs-keyword">void</span> <span class="hljs-title">TestMethod2</span>(<span class="hljs-params"></span>)<br />&nbsp;&nbsp;&nbsp;&nbsp;</span>{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br />}<br /></code></pre></code></pre>

It's pretty ugly with all the line breaks removed from the pre tag, but now should play more nicely with the blogging engine.

We have two final issues that we'd like to address before we can call this done. First, we'd like all links to automatically open in a new tab, per requests of our website/marketing teams. Secondly, the current resulting HTML expects a CSS stylesheet to be loaded in order for the code snippets to actually be colored. If we were to look at the output in a browser, the code would show up without any syntax coloring.

As I alluded to before, we can't inject any per post CSS into our blog articles, so we'd rather all the syntax coloring were baked directly into the HTML inline.

Thankfully, for both of these issues, there are some simple-to-use node packages to fix things up. First, we can install a package called markdown-it-link-target, which is a plugin for markdown-it that will cause links to render in such a way that they will open in a new tab:

npm install markdown-it-link-target --save
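In HTML terms, "open in a new tab" comes down to a target="_blank" attribute on each anchor tag. As a rough illustration of that end result only, here is a small sketch that adds the attribute with a regular expression. This is not how the plugin is implemented (it works inside markdown-it's rendering), and openLinksInNewTab is a hypothetical name.

```javascript
// Hypothetical post-processing helper: adds target="_blank" to every anchor
// tag that does not already declare a target attribute.
function openLinksInNewTab(html) {
    return html.replace(/<a\s+(?![^>]*\btarget=)([^>]*)>/gi, '<a $1 target="_blank">');
}

console.log(openLinksInNewTab('<p><a href="https://example.com">a link</a></p>'));
// prints: <p><a href="https://example.com" target="_blank">a link</a></p>
```

Letting the plugin do this during rendering is the sturdier option, since a regular expression like this one is easily confused by unusual markup.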

Next we'll install a plugin for gulp that will invoke a node package called juice:

npm install gulp-juice --save

juice is a neat library that will take a CSS style sheet, and some HTML, and then will bake the CSS styles into the HTML inline so that the external CSS is no longer needed. To hook these pieces up we again need to update our gulpfile and then restart the task:

var gulp = require('gulp');
var markdown = require('gulp-markdown-it');
var hljs = require('highlightjs/highlight.pack.js');
//new
var juice = require('gulp-juice');
var fs = require('fs');

//new
var codeCss = fs.readFileSync("./node_modules/highlightjs/styles/atom-one-dark.css", "utf-8");

hljs.configure({
    tabReplace: '&nbsp;&nbsp;&nbsp;&nbsp;'
});
gulp.task('markdown', function() {
    return gulp.src(['**/*.md', '!node_modules/**'])
        .pipe(markdown({
            //new
            plugins: ["markdown-it-link-target"],
            options: {
                html: true,
                highlight: function (str, lang) {
                    if (lang && hljs.getLanguage(lang)) {
                        try {
                            var output = hljs.highlight(lang, str).value;
                            output = output.replace(/(?:\r\n|\r|\n)/g, '<br />');
                            output = output.replace(/^\s+/gm, function(m){ return m.replace(/\s/g, '&nbsp;');});
                            output = output.replace(/(\<br\s+\/\>)(\s+)/gm, function(m){ return m.replace(/\>(\s+)/g, function (n) { return n.replace(/\s/g, '&nbsp;'); } ); });
                            return '<pre class="hljs"><code>' +
                                output +
                                '</code></pre>';
                        } catch (e) {}
                    }

                    return '';
                }
            }
        }))
        //new
        .pipe(juice({ extraCss: codeCss }))
        .pipe(gulp.dest(function(f) {
            return f.base;
        }));
});

gulp.task('default', ['markdown'], function() {
    gulp.watch(['**/*.md', '!node_modules/**'], ['markdown']);
});

With these changes, test.html should now look like this:

<h2>Introduction</h2>
<p>This is an introduction.</p>
<p>This is another paragraph.</p>
<pre><code class="language-cs"><pre class="hljs" style="background: #282c34; color: #abb2bf; display: block; overflow-x: auto; padding: 0.5em;"><code><span class="hljs-comment" style="color: #5c6370; font-style: italic;">//this is a gated code block</span><br><span class="hljs-keyword" style="color: #c678dd;">public</span> <span class="hljs-keyword" style="color: #c678dd;">class</span> <span class="hljs-title" style="color: #61aeee;">TestClass2</span><br>{<br>&nbsp;&nbsp;&nbsp;&nbsp;<span class="hljs-function"><span class="hljs-keyword" style="color: #c678dd;">public</span> <span class="hljs-keyword" style="color: #c678dd;">void</span> <span class="hljs-title" style="color: #61aeee;">TestMethod2</span>(<span class="hljs-params"></span>)<br>&nbsp;&nbsp;&nbsp;&nbsp;</span>{<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br>&nbsp;&nbsp;&nbsp;&nbsp;}<br>}<br></code></pre></code></pre>

In the preceding, this line:

var codeCss = fs.readFileSync("./node_modules/highlightjs/styles/atom-one-dark.css", "utf-8");

loads one of the CSS files that ships with highlightjs and bakes it in with the HTML.
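If "baking in" the CSS sounds mysterious, here is a deliberately simplified sketch of the idea. It is not how juice is implemented (juice parses real CSS and handles arbitrary selectors); this version only understands a plain object mapping a single class name to a style string, and inlineClassStyles is a hypothetical name. The colors are the ones visible in the output above.

```javascript
// Simplified illustration of CSS inlining: for every class="..." attribute
// with a known class name, append the matching style="..." attribute.
function inlineClassStyles(html, stylesByClass) {
    return html.replace(/class="([^"]+)"/g, function (match, className) {
        if (Object.prototype.hasOwnProperty.call(stylesByClass, className)) {
            return match + ' style="' + stylesByClass[className] + '"';
        }
        return match; // unknown classes are left untouched
    });
}

var styles = {
    'hljs-keyword': 'color: #c678dd;',
    'hljs-comment': 'color: #5c6370; font-style: italic;'
};
var html = '<span class="hljs-keyword">public</span> <span class="hljs-params"></span>';
console.log(inlineClassStyles(html, styles));
// prints: <span class="hljs-keyword" style="color: #c678dd;">public</span> <span class="hljs-params"></span>
```

The test.html produced by the gulp task carries inline styles like these on every highlighted span, so it no longer depends on any stylesheet being present.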

Now, this can simply be pasted as raw HTML into pretty much anything that supports HTML, including ornery blogging engines!

If we were to load it in the browser at this point, the snippet would render with its syntax colors baked in, with no external stylesheet required.

Hope you found this interesting and/or useful!

-Graham