Using SyntaxHighlighter to Format Code in WordPress

Prompted by a question on the Stack Overflow beta site, I did some quick research into the best ways to syntax-highlight code posted on blogs. The methods suggested (by me or by others) were:

  1. Hack together your own display logic to format it as you see fit
  2. Use the SyntaxHighlighter JavaScript library
  3. Use Windows Live Writer with the Insert Code plugin (I discuss that here)
  4. For WordPress, use the WP-Syntax plugin

Coincidentally, just a couple of days earlier I had heard Scott Hanselman describe how he formats code, in Hanselminutes #125: he puts it inside <pre></pre> tags, adds specific name and class attributes, and lets a JavaScript library do the formatting work. So I went to his blog, opened up a post with some code, and found my way to the SyntaxHighlighter JavaScript library. It is a nifty library that handles formatting well for a number of popular programming and scripting languages, and it looked easy to set up, so I decided to use it for formatting code on my site.

Installing SyntaxHighlighter

The basic steps that you have to follow are:

  1. Download the files
  2. Upload the core JavaScript files, the JavaScript files for any languages you want to format, and the swf and css files to somewhere on your server
  3. Add references in your code to the different files

A very good and more detailed guide on how you can do this with your template can be found in this blog post by Fahd Sharif.
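
As a rough illustration of step 3, the snippet below builds the markup a theme template would need. It assumes the 1.5-era layout of the library: a core file named shCore.js, one shBrush*.js file per language, a SyntaxHighlighter.css stylesheet, a clipboard.swf file, and a dp.SyntaxHighlighter.HighlightAll("code") call to start highlighting. The base path /js/syntaxhighlighter and the helper name buildIncludes are made up for the example, so check the file names against the version you actually downloaded.

```javascript
// Hypothetical helper: returns the tags to paste into a theme template.
// File names follow the 1.5-era SyntaxHighlighter layout and are assumptions.
function buildIncludes(basePath, brushes) {
  const lines = [
    `<link type="text/css" rel="stylesheet" href="${basePath}/SyntaxHighlighter.css" />`,
    `<script type="text/javascript" src="${basePath}/shCore.js"></script>`,
  ];
  // Only reference the languages you actually post.
  for (const brush of brushes) {
    lines.push(`<script type="text/javascript" src="${basePath}/shBrush${brush}.js"></script>`);
  }
  // Start-up call; "code" matches the name attribute on the <pre> tags.
  lines.push(
    '<script type="text/javascript">',
    `dp.SyntaxHighlighter.ClipboardSwf = "${basePath}/clipboard.swf";`,
    'dp.SyntaxHighlighter.HighlightAll("code");',
    '</script>'
  );
  return lines.join("\n");
}

console.log(buildIncludes("/js/syntaxhighlighter", ["CSharp", "Xml"]));
```

Passing only the brushes you need keeps the page from loading scripts for languages you never post.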

Displaying Code

Once you have the script, swf and css references integrated with your theme, you can post code using the following convention:

<pre name="code" class="langName">
Type your code here
</pre>

If you are doing this in WordPress, you will need to use the HTML editor to insert this. Language name reference is here.
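
One thing to watch for with this convention: because the code sits inside ordinary HTML, characters such as < and & have to be written as entities or the browser will treat them as markup. Below is a minimal sketch of a helper that does the escaping and wrapping. The function name toHighlightBlock is made up, and "csharp" is used as the class on the assumption that it is one of the names the library accepts for C#.

```javascript
// Hypothetical helper: escape code and wrap it in the <pre> convention above.
function toHighlightBlock(code, langName) {
  const escaped = code
    .replace(/&/g, "&amp;") // must run first so the entities below are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return `<pre name="code" class="${langName}">\n${escaped}\n</pre>`;
}

console.log(toHighlightBlock('if (a < b && b > 0) { return "ok"; }', "csharp"));
```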

Here is what it looks like in action:

public static class StringExtension {
  // Extension method to return first letter of a string
  public static string GetFirstLetter(this string str) {
    string val = (str.Length > 0) ? str.Substring(0,1) : "";
    return val;
  }
}

Fixing TinyMCE

After I first got this implemented in my theme, I had some problems getting it to work on actual blog posts. I would go to HTML mode in the editor, put in the <pre name="code" class="lang">…</pre> syntax, go back to the Visual editor and finish my post, preview it, and no formatting would be applied. After checking the HTML source being output, I noticed that the name="code" attribute of the pre tag was not being output. After some more investigation, I discovered that this attribute was being stripped by the TinyMCE editor when I switched from HTML to Visual editing modes.

It turns out that TinyMCE has its own built-in HTML validation that it applies when text is loaded into the Visual editor. Part of this is stripping out attributes that are, for whatever reason, not “approved”. The name attribute on the pre tag seems to be one of those.
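
To make the behaviour concrete, here is a toy model of attribute whitelisting. It is not TinyMCE's code, and the allowed lists are invented for the demonstration. It only shows how a name attribute that is missing from the list disappears while class survives.

```javascript
// Toy model only: keeps attributes that appear in `allowed`, drops the rest.
function filterAttributes(tagHtml, allowed) {
  return tagHtml.replace(/\s+([\w-]+)="[^"]*"/g, (match, attrName) =>
    allowed.includes(attrName) ? match : ""
  );
}

const opening = '<pre name="code" class="csharp">';

// name is not in the list, so it is dropped: <pre class="csharp">
console.log(filterAttributes(opening, ["id", "class", "title"]));

// name is in the list, so the tag comes back unchanged
console.log(filterAttributes(opening, ["id", "class", "title", "name"]));
```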

One potential workaround would be to only use the HTML editor. Though I could do this, I like the Visual editor better when writing.

The alternative is to change the list of approved attributes for the pre tag so that name will no longer be stripped. After researching this a bit (references: 1, 2, 3), I did the following:

  1. Open up /wp-includes/js/tinymce/tiny_mce_config.php
  2. Go to approximately line 298
  3. Make the following edit (what you are doing here is giving tinyMCE an explicit list of attributes that are acceptable for the pre tag):
// Original: $content .= $ext_plugins . 'tinyMCE.init({' . $mce_options . '});';
$content .= $ext_plugins . 'tinyMCE.init({extended_valid_elements : "pre[id|class|title|style|dir|lang|name|onclick|onkeypress]",' . $mce_options . '});';
  4. Upload the new file
  5. Clear your browser cache
  6. Delete the /wp-content/uploads/js folder from the server
  7. Do a hard-refresh of your editor page in WordPress

You should now be able to toggle back and forth between HTML and Visual editor modes in WordPress without losing the pre:name attribute necessary for SyntaxHighlighter to work. You will have to repeat this whenever you upgrade WordPress.
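
The string added in step 3 follows TinyMCE's rule format: a tag name followed by a pipe-separated attribute list in square brackets. Here is a small sketch of building it programmatically, which makes it easier to see what is being allowed. buildElementRule is a made-up helper, and the commented-out tinyMCE.init call would only run inside a page where TinyMCE is loaded.

```javascript
// Hypothetical helper: builds one extended_valid_elements rule, e.g. "pre[id|name]".
function buildElementRule(tag, attributes) {
  return `${tag}[${attributes.join("|")}]`;
}

const preRule = buildElementRule("pre", [
  "id", "class", "title", "style", "dir", "lang", "name", "onclick", "onkeypress",
]);
console.log(preRule);

// In the browser the rule would be passed along with the other options:
// tinyMCE.init({ extended_valid_elements: preRule /* , ...other options */ });
```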

Why Not Just Use the Plugin?

WordPress junkies at this point are muttering to themselves: why go to all this trouble? There is already a plugin that implements SyntaxHighlighter functionality and spares you all of this hard work. Here are the reasons why I chose to do this the hard way:

  1. The plugin ditches the <pre> syntax for a custom syntax that looks like this: [sourcecode language="lang"]CODE GOES HERE[/sourcecode]. While this eliminates the pre:name stripping issue that I mentioned before, it introduces something that in my opinion is much worse: TinyMCE now gets rid of any spatial formatting that you are using. I like to indent my code when necessary. If I am typing in the pre tag, TinyMCE respects all of my spacing and does not strip any of it out (that is the whole purpose of pre). Thus, I can keep all of my indenting and spacing without any difficulties. If you are using a custom bracketed tag like the plugin does, TinyMCE strips a good deal of indenting and spacing, leaving code that just looks ugly.
  2. The aforementioned code formatting issues carry over to the RSS feed as well. This way just works better.
  3. The plugin includes all language JavaScript files. I don’t need to print Ruby code right now – why should I have to include the JS for it with my page?
  4. Doing the implementation yourself gives greater flexibility in terms of using the different configuration options that are available with SyntaxHighlighter.
  5. It is more fun this way

Update: Engfer lambasts me for editing WordPress core files that will be overwritten whenever WordPress is upgraded. Instead, he created a WordPress plugin that lets you override and define the allowable attributes for any elements parsed by TinyMCE. It is available for download here and is definitely a better way to go than manually editing files. Thanks!


9 Responses to Using SyntaxHighlighter to Format Code in WordPress

  1. Wow, great article. I linked to it in my blog post.

  2. Tim Leung says:

    Great article – thanks!

  3. Çağdaş says:

    Thanks. Very good article

  4. Andrew says:

    Very useful plugin. It would be nice if it worked in the comments as well as the posts though.

  5. Pingback: Syntax Highlighting with Wordpress « Keeping My Hat On

  6. Pingback: Optimal Setup for SyntaxHighlighter with TinyMCE » Ellis Web

  7. Nail says:

    Is there a way to remove the numbers?

  8. Poma says:

    In response to “Why Not Just Use the Plugin”, here’s my experience using the SyntaxHighlighter Evolved plugin for WordPress.

    1. It’s not an issue now. Just put your [code][/code] tags in the editor, then switch to source and back to visual. Now tinyMCE will see that you are inside a tag and will respect all of your tabs and spaces.
    2. Not an issue, see 1. In your RSS feed there will be a tag instead of [code], and all formatting is preserved.
    3. Not an issue. All language-specific JS files are loaded on demand.
    4. You can always edit the existing plugin, which is much easier.
    5. Yep! You're right. But only if you have or want to spend enough time on that kind of stuff.

    I believe that most points here are not plugin-specific and will apply to other highlighting plugins.

  9. Pingback: What is best blogging host for programmers/code formatting? « Programmers Goodies
