Funny little marks

Sometimes in stories on this site, there are funny little marks in the text. Tonight I was cruising my own old stories to see if that would give me any inspiration, and in some of the text there was a little square after the quotation marks. I don't know where the squares came from, and they were not there when I first published the stories.

Just curious.

Gwen

Comments

Characters

Did the funny little marks replace characters in your story? Do you remember what the characters were?

I think it might be the product of using the wrong "Text Encoding" (under "View" in Firefox). Or maybe you need an add-on that displays additional characters.

-- Daphne Xu

Crash?

I think there was a crash five or six years ago that caused some issues when the data was recovered. It's either that, or one of the many updates this site has had over the years changed the way the code renders. I know I've had a number of ghost characters show up in stories over the years myself.

They are almost always in the

They are almost always in stories from about 10 years ago or earlier. Perhaps it's a minor error from when the site was updated in the past?

They are annoying at times, but they are mostly harmless...nothing has changed in the stories that I can see, they are simply there.

I'm told STFU more times in a day than most people get told in a lifetime

Unicode

In the beginning, there was ASCII.

Actually, before that, there was Baudot (teletype), and alongside ASCII there was IBM's EBCDIC (descended from punched-card codes).

Anyhow, all of the letters of the English alphabet, plus numbers, special characters, control codes, and the like, were encoded in ASCII's 7-bit code, for 128 total codes.

Enter the computer age. ASCII (American Standard Code for Information Interchange) was the obvious thing to use for them newfangled micro computers. And with the extra bit in an 8-bit byte, you have another 128 codes to use for graphics and foreign (German, Spanish, French, etc.) letters.

Enter Unicode, whose UTF-8 encoding is designed to read normal ASCII text unchanged, but allows extra bytes to be added for gazillions of extra characters. So we now have a standard encoding scheme that covers nearly every writing system known to man -- even dead ones like Linear B. Not to mention emojis.
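
For anyone who wants to see the "extra bytes" idea in action, here is a small Python sketch. It assumes the UTF-8 encoding of Unicode (the usual one on the web), and the sample characters and the helper name utf8_length are my own picks for illustration, not anything from this site.

```python
# Sample characters, written as escapes so this file stays plain ASCII.
samples = {
    "LATIN CAPITAL LETTER A": "\u0041",
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "EURO SIGN": "\u20ac",
    "LINEAR B SYLLABLE B008 A": "\U00010000",
}


def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs to store one character."""
    return len(ch.encode("utf-8"))


for name, ch in samples.items():
    print(f"U+{ord(ch):04X} {name}: {utf8_length(ch)} byte(s) in UTF-8")

# Plain ASCII text is byte-for-byte identical when encoded as UTF-8.
ascii_text = "Funny little marks"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print("ASCII text unchanged in UTF-8:", same_bytes)
```

The plain letter takes one byte, just as it would in ASCII, and the other characters take two, three, and four bytes.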

But if the encoding is messed up, you get these little squares with hexadecimal numbers inside.

https://en.m.wikipedia.org/wiki/Unicode

Searching for Inspiration

While I was pondering "The funny little marks", I read through some of my old stories, like "Extreme", "Baby, Baby", and "Alien Investigators". Much to my surprise and inspiration I found that those stories had been well liked by people who I care about. It seems clear that Science Fiction is my niche. Now I am feeling so much happier and cared for. :) Perhaps there is a bit of imagination left in me?

Gwen

Microsoft?

I download stories, sometimes as HTML, sometimes as text, and I've noticed that some of them often have hex 009D characters after the close quotes. The Unicode table shows this as "OSC" or "Operating System Command."

I'm guessing that it has to do with whatever word processor the author used; it may use that character for its own nefarious purposes, but when the story gets uploaded, it doesn't remove them. I can't really confirm that, since I don't use a word processor; I write my stories with a plain text editor and put the HTML tags in by hand (or with a Perl script.)
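
Another possible explanation, offered as a guess and not a diagnosis of this site: the curly closing quote (U+201D) is stored in UTF-8 as three bytes, and the last one is hex 9D. If those bytes get read one at a time as a single-byte encoding, the last byte turns into the 009D control character. The Python sketch below uses Latin-1 as the stand-in single-byte encoding, and the repair helper is a hypothetical name of my own.

```python
closing_quote = "\u201d"  # RIGHT DOUBLE QUOTATION MARK, the curly close quote

# UTF-8 stores this character as three bytes: E2 80 9D.
utf8_bytes = closing_quote.encode("utf-8")
print(utf8_bytes.hex(" "))

# Reading those bytes as Latin-1 turns each byte into its own character.
misread = utf8_bytes.decode("latin-1")
code_points = [f"U+{ord(c):04X}" for c in misread]
print(code_points)


def repair(text: str) -> str:
    """Undo the mix-up, assuming the text was UTF-8 misread as Latin-1."""
    return text.encode("latin-1").decode("utf-8")


print(repair(misread) == closing_quote)
```

The opening curly quote (U+201C) ends in hex 9C instead, which Windows-1252 maps to a printable letter, so that may be why the stray 009D turns up only after close quotes.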

A different issue, which I haven't encountered here, is that most browsers can't render all Unicode characters, and when they run into one of those, they usually display a little square with the hex code for the character. (I don't know what they do with characters that require more than 4 hex digits -- that is, stuff outside the Basic Multilingual Plane.)
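
As a rough illustration of the hex-digit question, this Python sketch prints the code point of a few characters and checks whether each fits in four hex digits. It says nothing about what any particular browser draws; hex_label and in_basic_plane are hypothetical helper names of my own.

```python
def hex_label(ch: str) -> str:
    """Return the character's code point in hex, padded to at least 4 digits."""
    return f"{ord(ch):04X}"


def in_basic_plane(ch: str) -> bool:
    """True if the code point fits in 4 hex digits (U+0000 to U+FFFF)."""
    return ord(ch) <= 0xFFFF


examples = ["\u201d", "\U00010000", "\U0010FFFF"]
for ch in examples:
    print(hex_label(ch), "basic plane" if in_basic_plane(ch) else "beyond basic plane")
```

The highest code point Unicode allows is 10FFFF, so a label never needs more than six hex digits.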