Light it up! Pygments for Emacs Lisp.
24 Dec 2014
The Challenge
More than 2 years ago, the formidable @bbatsov of Emacs Redux had this to say:
After so many years pygments (a popular syntax highlighting library used by GitHub & others) still lacks proper support for Emacs Lisp #fail
— Bozhidar Batsov (@bbatsov) September 15, 2012
Well, let's turn that #fail-frown upside down!
The Python
A quick search brought me to this page: Write your own lexer -- Pygments. It turns out that Pygments development takes place on Bitbucket, so I had to create an account there. Shortly after, I cloned the repository:
hg clone https://[email protected]/birkenfeld/pygments-main
Then I quickly copy-pasted some starting code:
__all__ = ['SchemeLexer', 'CommonLispLexer',
           'HyLexer', 'RacketLexer',
           'NewLispLexer', 'EmacsLispLexer']

class EmacsLispLexer(RegexLexer):
    """
    An ELisp lexer, parsing a stream and outputting the tokens
    needed to highlight elisp code.
    """
    name = 'ELisp'
    aliases = ['emacs', 'elisp']
    filenames = ['*.el']
    mimetypes = ['text/x-elisp']
    flags = re.MULTILINE
    # the rest of the code was copied from CommonLispLexer for now
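For readers who haven't met a regex-based lexer before, here is a rough, stdlib-only sketch of the general idea: a table of (pattern, token) rules tried in order at the current position. It is a simplified stand-in, not Pygments' actual code; the RULES table, the token names and the tokenize helper are all made up for illustration.

```python
import re

# Hypothetical rule table: (compiled pattern, token name).
# The names are plain strings, not real Pygments token types.
RULES = [
    (re.compile(r";.*"), "comment"),
    (re.compile(r'"(?:\\.|[^"\\])*"'), "string"),
    (re.compile(r"[()]"), "paren"),
    (re.compile(r"\s+"), "whitespace"),
    (re.compile(r"[^\s()\";]+"), "symbol"),
]


def tokenize(text):
    """Yield (token_name, value) pairs, trying each rule at the current position."""
    pos = 0
    while pos < len(text):
        for pattern, name in RULES:
            match = pattern.match(text, pos)
            if match:
                yield name, match.group()
                pos = match.end()
                break
        else:
            # No rule matched: emit a single character as an error and move on.
            yield "error", text[pos]
            pos += 1


for token in tokenize('(setq x "hi") ; done'):
    print(token)
```

The order of the rules matters: the comment and string rules come first so that the catch-all symbol rule never gets a chance to swallow their contents.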
Apparently, infrastructure-wise, I only need to know two commands. The first one needs to be run just once, so that Pygments is aware of the new lexer:
$ cd ~/git/pygments-main && make mapfiles
The second command is to (re-)generate /tmp/example.html:
$ cp ~/git/emacs/lisp/vc/ediff.el \
     ~/git/pygments-main/tests/examplefiles/
$ ./pygmentize -O full -f html -o /tmp/example.html \
     tests/examplefiles/ediff.el
I would repeat the last command after each update to the code, and then refresh the page in Firefox to see the result.
The Elisp
To finalize the lexer, the following tasks ensued:
- get a list of built-in macros
- get a list of special forms
- get a list of built-in functions
In the process, I've added two more lists:
- a list of built-in functions that are highlighted with font-lock-keyword-face: 'defvaralias', 'provide', 'require', 'with-no-warnings', 'define-widget', 'with-electric-help', 'throw', 'defalias', 'featurep'
- a list of built-in functions and macros that are highlighted with font-lock-warning-face: 'cl-assert', 'cl-check-type', 'error', 'signal', 'user-error', 'warn'
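As a rough illustration of how such lists can drive the highlighting, here is a small Python sketch. The symbols and face names are the ones listed above; the classify helper and the idea of returning a face name are hypothetical, chosen for illustration, and not the code that went into Pygments.

```python
# Symbols that Emacs highlights with font-lock-keyword-face (from the list above).
KEYWORD_FACE = frozenset([
    'defvaralias', 'provide', 'require', 'with-no-warnings', 'define-widget',
    'with-electric-help', 'throw', 'defalias', 'featurep',
])

# Symbols that Emacs highlights with font-lock-warning-face (from the list above).
WARNING_FACE = frozenset([
    'cl-assert', 'cl-check-type', 'error', 'signal', 'user-error', 'warn',
])


def classify(symbol):
    """Return the face name for SYMBOL, or None if it is in neither list."""
    if symbol in WARNING_FACE:
        return 'font-lock-warning-face'
    if symbol in KEYWORD_FACE:
        return 'font-lock-keyword-face'
    return None


for name in ['require', 'user-error', 'car']:
    print(name, classify(name))
```

A set lookup per symbol keeps the regex rules themselves small: the lexer only has to recognize "a symbol", and the lists decide how it is colored.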
To generate the other three lists, I started off writing things in *scratch*, but after a while my compulsion to C-x C-s kicked in and I saved the work to research.el. At least, thanks to @bbatsov, I'm not C-x C-s-ing that much since I've added this:
(defun save-and-switch-buffer ()
  (interactive)
  (when (and (buffer-file-name)
             (not (bound-and-true-p archive-subfile-mode)))
    (save-buffer))
  (ido-switch-buffer))
(global-set-key "η" 'save-and-switch-buffer)
But it's time for the student to one-up the master, so here's a tip to improve even further:
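(defun oleh-ido-setup-hook ()
  (define-key ido-buffer-completion-map "η" 'ido-next-match))
;; register the function so that ido runs it when it sets up its keymaps
(add-hook 'ido-setup-hook 'oleh-ido-setup-hook)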
This way I can cycle the buffers with the same shortcut that invokes save-and-switch-buffer. The defaults are C-s and C-r, in case you didn't know.
The C
Getting the list of built-in C functions and special forms obviously involved browsing the C source code. In case you don't (yet) have the Emacs sources, they're here:
$ git clone git://git.savannah.gnu.org/emacs.git
I switched to the ./src directory and called M-x find-name-dired with *.c to build a list of all the sources. Then I ran the following code from research.el:
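(require 'cl-lib)                       ; for `cl-incf'

(defvar foo-c-functions nil)
(defvar foo-c-special-forms nil)

(defun c-research ()
  ;; operates on the files marked in the current Dired buffer
  (let ((files (dired-get-marked-files))
        (i 0))
    (dolist (file files)
      ;; show progress: the number of files visited so far
      (message "%d" (cl-incf i))
      (with-current-buffer (find-file-noselect file)
        (goto-char (point-min))
        (while (re-search-forward "^DEFUN (" nil t)
          (backward-char 1)
          ;; beg is the opening paren, end is just past its matching paren
          (let ((beg (point))
                (end (save-excursion
                       (forward-list)
                       (point)))
                str)
            (forward-char 2)
            (search-forward "\"" nil t)
            ;; the Lisp name is the text between the quotes
            (setq str (read (buffer-substring-no-properties
                             (+ beg 2) (1- (point)))))
            ;; special forms are declared with UNEVALLED
            (if (re-search-forward "UNEVALLED" end t)
                (push str foo-c-special-forms)
              (push str foo-c-functions))))))))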
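The same split can be sketched outside of Emacs. The Python below is a hedged approximation, not a port: instead of searching the whole DEFUN argument list for UNEVALLED, as the Elisp does, it assumes the first five arguments are the Lisp name, the C function name, the subr name, the minimum and the maximum argument count, and checks only the last of these. The split_defuns helper and the SAMPLE text are hypothetical; the sample is modelled on the DEFUN layout, not copied from the Emacs sources.

```python
import re

# Matches the start of a DEFUN and captures its first five arguments:
# Lisp name, C function name, subr name, min args, max args (assumed layout).
DEFUN_RE = re.compile(
    r'^DEFUN \("([^"]+)",\s*(\w+),\s*(\w+),\s*(\w+),\s*(\w+)',
    re.MULTILINE,
)


def split_defuns(c_source):
    """Return (functions, special_forms) as lists of Lisp names."""
    functions, special_forms = [], []
    for match in DEFUN_RE.finditer(c_source):
        lisp_name, max_args = match.group(1), match.group(5)
        if max_args == "UNEVALLED":
            special_forms.append(lisp_name)
        else:
            functions.append(lisp_name)
    return functions, special_forms


# Illustrative input only, modelled on how DEFUNs are laid out.
SAMPLE = '''
DEFUN ("if", Fif, Sif, 2, UNEVALLED, 0,
       doc: /* ... */)
  (Lisp_Object args)
{
}

DEFUN ("car", Fcar, Scar, 1, 1, 0,
       doc: /* ... */)
  (Lisp_Object list)
{
}
'''

print(split_defuns(SAMPLE))  # (['car'], ['if'])
```

Checking only the fifth argument avoids false positives from a docstring that happens to mention UNEVALLED, at the cost of depending on the argument order.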
It was beautiful, by the way, to be able to generate this sort of documentation from such well-formatted and well-documented C sources. Free Software FTW.
If you're interested, there are 1294 built-in functions. Here's a list of 23 special forms that I found:
and catch cond condition-case defconst
defvar function if interactive let let*
or prog1 prog2 progn quote
save-current-buffer save-excursion
save-restriction setq setq-default
unwind-protect while
You can read up on the special forms in the SICP. There's no node for them, so just use isearch.
The Result
You can see it here: ediff.html, as well as on the rest of the site, since I've switched it on everywhere.
The Impact
Unfortunately, this won't have an impact on the GitHub source code highlighter, since GitHub dropped Pygments recently.
But people who use the static blog generator Jekyll or the LaTeX package minted (that's a package that org-mode's PDF export can use) will be able to get better Elisp highlighting. In fact, this blog is already using the new highlighter.
See the rest of the projects that use Pygments here.
The Bitbucket
So now, to share the new lexer with the world I just have to learn how to:
- stage and commit in Mercurial
- push Mercurial to Bitbucket
- open a pull request on Bitbucket
I don't want to become a hipster, these things just happen.