I’m scraping a website and some of the text I’m getting back contains curly quotes, like so:
“Slap it in scarequotes to get more clicks,” he said, making airquotes with his fingers.
"Slap it in scarequotes..."
Aside: Oooh… I see Discourse somehow magically transforms straight quotes into curly quotes when they are paired.
Let’s see how it copes with a common trouble spot: Summer of '69. Result: Ah, I see they play it safe and only curl the mark when it’s an apostrophe inside a word or part of a pair. Pity.
Anyway… I’m trying to decide whether to keep the curly quote characters, or replace them with straight quotes.
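If I do go the replace route, a minimal sketch in Python could look like the one below. The function name `straighten_quotes` and the exact set of characters in the table are my own assumptions, not something from the linked pages; extend the table to whatever actually shows up in the scraped text.

```python
# Translation table from common curly/low quotation marks to straight ASCII quotes.
# The set of characters covered here is an assumption; add more as needed.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # double low-9 quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark, also used as the apostrophe
    "\u201a": "'",  # single low-9 quotation mark
})


def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ones."""
    return text.translate(CURLY_TO_STRAIGHT)


scraped = "\u201cSlap it in scarequotes to get more clicks,\u201d he said."
print(straighten_quotes(scraped))
```

Note that this is lossy: once the marks are straightened, there is no way to tell which ones were opening and which were closing.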
The question led me down this new rabbit hole:
ASCII and Unicode quotation marks
Smart Quotes
That last link has a decent solution for quotations, but not for apostrophes. The HTML standard includes the <q> tag, which marks up an inline quotation and lets the browser render the quotation marks that suit the language of the content.
For example, several languages use double chevrons called Guillemets: «Ceci est une citation en français».
Some other languages, including Afrikaans, have quotation marks that start with two lower commas and end with inverted commas: „Dié is 'n aanhaling in Afrikaans.”
Or like so: „Dette er et tilbud.“
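To make the three styles above concrete, here is a rough sketch of picking the marks from a lookup table, which is roughly the kind of choice a renderer has to make. The style names (`guillemets`, `low9_high9`, `low9_high6`) and the `wrap_quote` helper are labels I made up for illustration; they are not standard identifiers.

```python
# Opening/closing mark pairs for the quotation styles shown above.
# The dictionary keys are hypothetical labels, not official names.
QUOTE_STYLES = {
    "english": ("\u201c", "\u201d"),     # “…”
    "guillemets": ("\u00ab", "\u00bb"),  # «…»
    "low9_high9": ("\u201e", "\u201d"),  # „…” as in the Afrikaans example
    "low9_high6": ("\u201e", "\u201c"),  # „…“ as in the last example
}


def wrap_quote(text: str, style: str = "english") -> str:
    """Wrap text in the marks for the given style; unknown styles fall back to English."""
    open_mark, close_mark = QUOTE_STYLES.get(style, QUOTE_STYLES["english"])
    return f"{open_mark}{text}{close_mark}"


print(wrap_quote("Ceci est une citation en français", "guillemets"))
print(wrap_quote("Dette er et tilbud.", "low9_high6"))
```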
So yeah… I’m all for the drive to switch away from using straight quotes to the correct typography, but English keyboards remain the hurdle here.
Word processors do the smart quote thing and try to intelligently replace straight quotes, but they fall down on contractions that start with an apostrophe, like '90s or 'Sup. When you’re writing for the web, you’re at the mercy of whatever your CMS is capable of.
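Here is a toy version of that smart quote logic in Python, just to show where the leading apostrophe goes wrong. It is a sketch under my own assumptions, not how any particular word processor or CMS actually does it; `naive_smart_quotes` and `apostrophe_aware_quotes` are hypothetical names.

```python
import re


def naive_smart_quotes(text: str) -> str:
    """Curl straight quotes using position only: opening after a space, closing otherwise."""
    # A quote at the start of the text, or after whitespace or an opening bracket, opens.
    text = re.sub(r'(^|[\s(\[{])"', "\\1\u201c", text)
    text = re.sub(r"(^|[\s(\[{])'", "\\1\u2018", text)
    # Whatever is left is treated as a closing mark or an apostrophe.
    return text.replace('"', "\u201d").replace("'", "\u2019")


def apostrophe_aware_quotes(text: str) -> str:
    """Same as above, but first treat a leading ' followed by a digit as an apostrophe.

    This is a heuristic: it rescues '69 and '90s, but not word contractions like 'Sup.
    """
    text = re.sub(r"(^|\s)'(?=\d)", "\\1\u2019", text)
    return naive_smart_quotes(text)


print(naive_smart_quotes("Summer of '69"))
print(apostrophe_aware_quotes("Summer of '69"))
print(apostrophe_aware_quotes("'Sup"))
```

The naive pass turns the mark in Summer of '69 into an opening single quote, which is exactly the word processor failure. The digit heuristic fixes that case, but 'Sup still gets an opening quote; catching those probably needs a word list, which is where it stops being a quick fix.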