I’m working on a very primitive extension to the excellent Python Markdown library that deals with typographic issues. There are several libraries that handle this and most of the time they work, sometimes I’d much rather just type an em-dash in-line — I mean without fussing with the
--- syntax. I feel this is more portable in terms of source files and I’m completely comfortable hitting Option+Shift+- as needed.
Conversions I’m initially concerned about are listed in the table below.
||An em-dash is used to separate out departures in thought — like this.|
||An en-dash is used to indicate ranges: 10–20, 9th–13th, etc.|
||Times sign is used to indicate multiplication or height/width. 1080px × 1920px for example|
||“Smart” quotes are the curly ones.|
In addition to the items above, adding extras like a hair space around em-dashes was a consideration.
I implemented a rough Preprocessor for Markdown that does simple regular-expression based replacements of these into HTML entities. Obviously the entire thing falls over when you’re handling non-UTF-8 files but that wasn’t a restriction for me.
Here’s the biggest change which is the em-dash hair space item. An em-dash—without hair space — with hair space. Not bad eh?