
September 13, 2010

Thesaurus Spam 2: The Comment Years

Filed under: Blogkeeping, Spam, Language

“Thesaurus spam” tries to avoid automated unsolicited-commercial-message detection by automatically replacing words in the spam text with “synonyms”. I put scare-quotes around “synonyms” because thesaurus spam often fails to pick anything even close to a true synonym. So “we will fight them on the beaches” could, for instance, become “ourselves will affray them on the littoral”.
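To make the mechanism concrete, here's a minimal Python sketch of that kind of blind word-swapping. The tiny thesaurus and the function name are made up for illustration, using only the example above; I have no idea what real spamming tools actually use.

```python
import re

# Hypothetical toy thesaurus, built only from the "beaches" example above.
TOY_THESAURUS = {
    "we": "ourselves",
    "fight": "affray",
    "beaches": "littoral",
}


def thesaurusise(text, thesaurus=TOY_THESAURUS):
    """Swap every word found in the thesaurus for its listed "synonym"."""

    def swap(match):
        word = match.group(0)
        # Case-insensitive lookup; words not in the thesaurus pass through.
        return thesaurus.get(word.lower(), word)

    # Treat each run of ASCII letters as one word.
    return re.sub(r"[A-Za-z]+", swap, text)


print(thesaurusise("we will fight them on the beaches"))
```

Because the swap is a plain table lookup with no regard for context or grammar, the output is only ever as sensible as the table.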

I hardly receive any thesaurus-spam via e-mail any more (largely because of upstream filtering; it’s probably still quite popular), but I do still see it. Most recently, in comments on this blog.

What happens is, a spammer comes along and creates a commenting account with a “Website” link to whatever site they want to spamvertise. Today, this was a commenter called “batterysea”, linking to www.uk-power-battery.co.uk. (All evidence of this commenter has now been erased, of course.)

Then the commenter goes into robospam mode. Instead of posting the usual robospam comments that say something like “Louis Vuitton Prada best replica fakes Rolex Viagra” et cetera et cetera, with links to a Web site from pretty much every word, they create an innocuous, linkless, plain-text comment. At a glance, the new spam-comment kind of looks as if it belongs on the page. That’s because it does kind of belong there, on account of being a copy of an earlier comment on the same page, but with the Thesaurus-O-Matic run over it to make the copying less obvious (and difficult, if not impossible, to auto-detect).

I’ve plucked a few of these ticks off the blog before, but this one managed to splatter a few more comments around before I stopped him, so I paid more attention. I presume these spammers try to strike a balance between getting a commercially useful amount of spam transmitted and not obviously producing tons of new comments that even a dozy admin is likely to notice. In the “batterysea” case, there were nine comments, posted at one-minute intervals on my nine most recent posts.

On this post, for instance, there’s a legitimate comment from Anne that says

Clearly I am culturally deprived - I don’t read magazines, I don’t watch TV, and I surf the web with adblock. So where would I see these ads?

Maybe a better question is, do these ads actually sell products? I mean, if I’m trying to decide on which fan to buy for my PC, is seeing an ad in a magazine actually going to affect my decision, whether the ad has giant robots or sober statistics?

And then, at the end of the page, along came the spammer to say

Clearly I am culturally beggared - I don’t apprehend magazines, I don’t watch TV, and I cream the web with adblock. So area would I see these ads?

Maybe a more good catechism is, do these ads absolutely advertise products? I mean, if I’m aggravating to adjudge on which fan to shop for for my PC, is seeing an ad in a annual absolutely activity to affect my decision, whether the ad has behemothic robots or abstaining statistics?

On this post, the spammer lifted just the second paragraph of my own comment, which started out

It’s possible that such a scheme would actually be legit, but it’s probable that it would not, because people sending money would have the implicit assumption that they were going to get something in return, even if it was as unlikely to be valuable as a lottery ticket.

That part became

It’s accessible that such a arrangement would absolutely be legit, but it’s apparent that it would not, because bodies sending money would accept the absolute acceptance that they were activity to get article in return, alike if it was as absurd to be admired as a action ticket.

…in the spam-comment.

When the robospammer can’t find any words to thesaurusise, it ends up just duplicating an existing comment. For instance, Fallingwater’s comment on this post:

The Asus EeePC 1005HA is, I think, the device that loses its rubber feet fastest than anything else that has been produced.

My solution: melt glue. Four puddles where the feet used to be have made my EeePC stick to surfaces again. Less than when it had the rubber feet, but a hell of a lot better than naked plastic.

…was duplicated word-for-word by the spammer.
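The spun copies quoted above still share most of their words with the originals, so one conceivable (and very crude) check is to measure word overlap between a new comment and the comments already on the page. This is only a sketch: the function names and the 0.6 threshold are my own assumptions, and it would also flag honest commenters who quote each other, so it doesn't make this stuff easy to auto-detect.

```python
import re


def words(text):
    """Lower-case the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def word_overlap(a, b):
    """Jaccard similarity of the two word sets: shared words / all words."""
    wa, wb = words(a), words(b)
    union = wa | wb
    if not union:
        return 0.0
    return len(wa & wb) / len(union)


def looks_like_copy(new_comment, existing_comments, threshold=0.6):
    """True if the new comment overlaps heavily with any existing one.

    The threshold is an arbitrary assumption, not a tuned value.
    """
    return any(word_overlap(new_comment, old) >= threshold
               for old in existing_comments)


original = ("Clearly I am culturally deprived - I don't read magazines, "
            "I don't watch TV, and I surf the web with adblock. "
            "So where would I see these ads?")
spun = ("Clearly I am culturally beggared - I don't apprehend magazines, "
        "I don't watch TV, and I cream the web with adblock. "
        "So area would I see these ads?")

print(looks_like_copy(spun, [original]))
```

A word-for-word duplicate scores 1.0 on this measure, and the more words the Thesaurus-O-Matic manages to swap, the lower the score falls.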

This is a really feeble kind of spamming. All commenter Web-site links on this blog, and pretty much every other blog, are nofollowed, as are links in the comments themselves. So you don’t get search-engine prominence from this technique, and you don’t even get any traffic to speak of, unless human readers click on your commenter-name. I presume this happens even less often than people clicking on the links in the “Dolce Gabbana Dior bags Gucci handbags Chanel Hermes…” sorts of comments.

I think the only way to make comments that really look as if a human posted them would be by creating a spambot with something resembling real, “strong”, AI, like the burgeoning network-creatures in Maelstrom, the second of Peter Watts’ excellent “Rifters” series (all three books of which are downloadable for free!).

In the meantime, we get aphasic thesaurus-robots, all that can be said for which is that they’re more successful than the robots that make hundreds, and hundreds, and hundreds, of accounts called things like “aFZflRhBzRsYq <asdfwerj5@gmail.com>”, but never manage to post a single actual comment.

September 6, 2010

Real books glow

Filed under: Books

A reader writes:

Seeing as you’re both someone who knows his gadgets, and someone who enjoys a good read once in a while, I was wondering if you’ve ever considered those new-fangled e-book contraptions.

I’ve been considering getting one, as shelf space is always expensive, so I want to reserve that for books I’d want to re-read often, or just proudly display. Besides, the ability to carry a lot of books at less space/weight than the average paperback is quite interesting.

I live in the Netherlands, so I’d need something internationally available (which will probably go for you in Australia as well). However, since I read mostly English, something bound to a primarily English store like Amazon (Kindle) or B&N (Nook) isn’t too much of a problem.

From what I’ve heard, the Kindle, even though it goes against my open-source instincts, is actually one of the best models for actual reading (as opposed to showing off your latest gadget).

I’d be interested to hear your thoughts.

Bernard

PS: If you recommend the iPad, I’m going to be very disappointed, as I always thought you were immune to the Jobs-cult propaganda.

I don’t actually have a really good answer for this one - though I will of course manage to sound off interminably anyway - but I bet some commenters will have ideas.

I’m pretty sure that one day, paper books will be rather quaint. But I’m not crazy about any of the current e-readers. Definitely not the iPad; if you need/want the various other things an iPad can do (including just delight you with its interface) then the e-reader function is just a bonus. But it’s only got a normal LCD display with 1024 by 768 resolution, so if reading books is a primary interest for you, the iPad is nothing special.

A standard-geometry 1024 by 768 LCD with subpixel rendering is actually perfectly adequate for reading - maybe even a whole page at a time, depending on the text size. It’s just not worth spending a lot of money on. You can of course do the same thing with any number of random laptops, including various ancient tablets and other oddball devices that let you fold the screen around. You could even use a netbook.

The downside of doing your reading on a relatively normal computer is that you can’t use the online e-book stores that deliver DRM-encrusted books that can, generally, only be read on specially-blessed hardware like the dedicated readers.

Current dedicated readers can all display at least a few kinds of non-DRMed content, and you can generally bludgeon one non-DRMed format into another so you can view it on Some Damn Reader that can display PDFs but not plain text, but it’s all still quite a fractured and hideous format war; don’t hold your breath for one reader that works with everybody’s online store and can read everything else too, including DjVu and CBR.

If you’re perfectly happy with the Amazon/B&N/whoever-else online stores (which, yes, may only be accessible in North America - what e-book stores are there, besides the Amazon one, that work outside the US and Canada?), plus whatever other formats your chosen reader deigns to support, then that’s fine, of course. (Provided you don’t end up with the bold new version of customer-service hell that prevents you from buying books.) While the online stores are still charging new-paper-book prices for e-books that you don’t really even get to own, though, they don’t interest me at all.

A buck a book, I and much of the rest of the world would be happy to pay. But Amazon clearly don’t find this very exciting while they’re still selling Kindles as fast as they can make them.

That said, the reason why I’ve chosen to ramble on about e-books despite not owning any kind of dedicated reader is that I have been doing a lot of reading on a screen instead of a page, lately. To the point that I have managed to do what I presume many others have - opened a paper book while lying in bed, turned off the light, and been surprised by the discovery that I can no longer read.

I do my reading on the World’s Greatest Conversation-Starting Laptop, the ridiculously cute OLPC XO-1. Which is not actually a very useful general-purpose computing device (as the original owner of the one I’ve got discovered…), but which makes a pretty decent e-book reader, provided you don’t want to read any DRMed books.

There are a lot of free-to-download books out there. Very few current authors let you download their stuff for free, but if you like ancient sci-fi, or any of the usually-considerably-more-ancient stuff at Project Gutenberg, or the Internet Archive’s rather-more-peculiar-on-average text archive, then you’ll have a full reading list for rather a while.

If you can get an OLPC laptop for the same price I did (I did pay for the postage!), then it’s a good option for free-book reading. The standard Reader (PDFs, etc) and Write (actually a word processor, but fine for reading plain text) interfaces are a bit of a pain, but my main complaint about Reader is that there’s no way to quickly set the zoom so that the text fills the width of the screen without wasting screen on blank margins, which isn’t a big deal unless you’re reading numerous short pieces and have to keep resetting it. The OLPC’s screen (now being separately commercialised) is one of the best things about it - it’s a TV-type hexagonal-subpixel-layout colour screen normally (effective resolution as little as 588 by 441, or as much as 984 by 738, depending on how you measure it), but if you turn the backlight down to zero it changes into a 200-dot-per-inch mono display that you can read in sunlight.

This isn’t quite as awesome as it sounds, because the mono-mode colour scheme is the good old LCD almost-black-on-dark-green, not nice white e-paper. But it’s still handy. I think e-paper readers have the reverse problem; they’re great in good light, but nobody’s yet found a way to make them light up properly.

Enough of this digression; about as many of you are likely to read books on an XO-1 as are likely to read them on an eMate. The important question is: What have I missed?

Who’s got an e-reader they really love?

Is there software you can run on a normal laptop or netbook that lets you buy and read Kindle/Nook/Sony-Reader books?

Is there some shameless DRM-cracking $100 option from Hong Kong? (There already are a hatful of dodgy little reader doodads at the usual crapvendors, but I’m pretty sure they can’t read any kind of DRMed file, and their screens look pretty terrible.)

Anybody buy books on paper and then download illicit PDFs?

Has someone started a kind-of-legal $10-a-month all-you-can-download e-book emporium yet?

September 4, 2010

Psychoacoustics again, again, and again

Filed under: Science, Music

Today’s addition to my ongoing Psychoacoustics Archive comes courtesy of Ben Goldacre.

When listening to the exact same recording, apparently being played by similar-looking but differently-attired female violinists, evaluators consistently thought the music was better when the performers were more “professionally” attired.

This turns out to be an entirely uncontroversial finding. Until I read this Bad Science post, I didn’t know that orchestra auditions are now usually blinded (the auditioner plays behind an opaque screen). This is because unblinded auditions have repeatedly been demonstrated to create unfair discrimination, even when frank racism is not involved. Even listeners who apparently honestly don’t consciously believe that, for instance, women are worse musicians than men, will often rate female performers lower. And that’s before you even start to consider attire and physical attractiveness. (Witness the recent global astonishment when an unattractive woman, apparently against all that science and art has ever told us, turned out to have a decent singing voice.)

The evaluators in this latest study were just music students and professional orchestral musicians, though, not audiophiles. I’m sure audiophiles would have done much better.

This blog is now located at howtospotapsychopath.com!