Thursday, March 14, 2013

Twitter Bots are Getting Stranger

I like to see what people are saying about Unglue.it, so I follow the "unglue" tweetstream. For many months, the false positives were really quite entertaining. In the words of normal human twitterers, the word "unglue" unearths all sorts of things people are stuck to and want to be unstuck. Lips, asses, couches, various electronic devices, and of course, Twitter itself. And combinations of two or more of these sticky things. Fun times!

The first bot butting in on my "unglue" stream belonged to some sort of travel agency social marketing bot named Leadify. Various destinations were being touted by this bot under different users:
@SkyRunTelluride Can’t unglue the kids from the tv on vacations? Go camping in Telluride and leave technology behind.
@VisitGatlinburg Can’t unglue the kids from the tv on vacations? Go camping in Gatlinburg and leave technology behind.
@CrestedButteMt Can’t unglue the kids from the tv on vacations? Go camping in Crested Butte and leave technology behind.
@TimberlineVaca Can’t unglue the kids from the tv on vacations? Go camping in Breckenridge and leave technology behind.
@lodgingdeals Can’t unglue the kids from the tv on vacations? Go camping in Snowmass and leave technology behind.
@vailmountain Can’t unglue the kids from the tv on vacations? Go camping in Vail and leave technology behind.
You get the idea. At least it's clear what "service" this bot is providing.

But recently, a new bot has started getting in on the "unglue" action. What bugs me is I can't figure out why it does what it does:

@IsabellaMariah3 Unglue latent clubhouse derby: .cpP
@MervinSacco1 Unglue official pass250 607-109 audition check over guides: .BCO http://bit.ly/ZMz6fj
@MacDonaldBoswor Dancery unglue rounders online: .Fhi
@ClarenceWither1 Charley 95010 online until unglue straight a rich conjunction unripe forethink: .daw http://bit.ly/WckiGL
@GoldmanLarry1 Unglue fund online casinos: .pyT 525471
@IsaacShade1 Unglue swop 185-113 prelim niagara mopes: .AQb http://bit.ly/10OwHmW
@AllenHoffmann1 Unglue pos software - baksheesh in preference to forthright pos software, guides with acquit pos software: .hFg http://bit.ly/Xv9k0W
@BootmanRussel1 Betting parlor online unglue green stuff extant professional athlete bribe: .YnL 050623
@PassBobby Unglue liquid assets repudiation cash upon be unfaithful online poolroom: .mNG 263810
@GladysSavannah Current unglue contribute nonobservance stationing show biz: .Obd
@EricksonCarter Tavern unglue do tool motion hiatus: .wrm 894279
@JohnsonAllison2 Amusement park unglue participate: .Sby 584823
@BarnesIsabelle Flat reputable volume dvd unglue: .QfH 362486
@JustinCharlie1 261 sporting house unglue tropez aggrandize: .cln
@FrederickWilli4 The hard-and-fast fender in relation with unglue online auction: .RqZ http://bit.ly/Xr2Tw3
@AlanLindley2 How head an neutral advisor grant-in-aid myself pick the uppermost glamour issue unglue racket: .mcY http://bit.ly/W8lrz9
@PaigeCarol1 Thereupon the album's unglue, the belt stirred drummer chouse health resort but salaried nicholas dingley, go ... 121898
@PaigeSandra1 Theater bootlegging unglue surface structure: .nxz 968560
@MiloVelasquez1 332 gambling hall unglue online volutation: .otR
@PeregrinBoles Unglue downloadlot-054exam chamber music guides: .qxf http://bit.ly/10KfJGp
@LeapmanRebecca Entertainment industry unglue contract bridge toad: .iyG 836459
@MakaylaCooper15 Cafe chantant coupon unglue gamut: .eqG 184945
What could possibly be the purpose of these tweets?

My first thought was that this is some SEO scam. About 1/3 of the tweets have a bit.ly link to http://promotion-web.tk/ or related websites; these pages contain more nonsense text under the title "First-class portal". (".tk" is a top-level domain with a free domain registry based in Amsterdam.) But that doesn't make much sense. Why would nonsense tweets point at nonsense websites? And why would most of the tweets come without links?

And if it's an SEO scam, why add things like ".nxz 968560"? Who's going to click on a tweet like that? Even search engines aren't that dumb.

My next thought was that these accounts are those "followers" that social marketing bozos buy for their twitter feeds. But no, many of these accounts aren't following more than two or three other accounts, though they may have 250 or so of their fellow robots following them, along with a surprising number of apparently human social media consultants.

It's puzzling, and I don't take kindly to unsolved puzzles. This army of zombie twitter accounts must be assembling for some sort of mischief.

So here's my best guess. I think these twitter bots are hiding information in plain view. Suppose you were a terrorist organization or a criminal network, and you needed to publish communications to large numbers of people worldwide. What better way to do this than to publish encrypted information on twitter? Or even better, put the encrypted information on a network of websites, and use a distributed network of twitter accounts to distribute the decryption keys? Or maybe this is where Wikileaks is storing its secret files.

The data publishing rate appears to be about 100 tweets per minute, or about 230 bytes/sec. That's 20 MBytes per day. Maybe the three letter codes are the intended recipients, and the 6 digit numbers are constantly changing keys (like RSA's SecurID) for files posted on 2-factor secured websites.
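
Here's a quick sanity check of that arithmetic, as a minimal sketch. It assumes every tweet carries the full 140 characters at one byte per character, which is my assumption for the estimate and not something measured from the stream.

```python
# Back-of-the-envelope check of the publishing rate estimated above.
TWEETS_PER_MINUTE = 100  # rate observed in the "unglue" stream
BYTES_PER_TWEET = 140    # assumption: full-length tweets, one byte per character


def bytes_per_second(tweets_per_minute=TWEETS_PER_MINUTE,
                     bytes_per_tweet=BYTES_PER_TWEET):
    """Average payload rate in bytes per second."""
    return tweets_per_minute * bytes_per_tweet / 60


def megabytes_per_day(tweets_per_minute=TWEETS_PER_MINUTE,
                      bytes_per_tweet=BYTES_PER_TWEET):
    """Daily payload in megabytes (1 MB = 1,000,000 bytes)."""
    minutes_per_day = 24 * 60
    return tweets_per_minute * bytes_per_tweet * minutes_per_day / 1_000_000


if __name__ == "__main__":
    print(f"{bytes_per_second():.0f} bytes/sec")
    print(f"{megabytes_per_day():.2f} MBytes/day")
```

Shorter tweets would lower both numbers proportionally, so these are upper bounds for the observed tweet rate.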

Or maybe it's just garbage ungluing latent clubhouse derby.

Notes:
1. Just to be clear, it's not just the word "unglue" that zombie bot is attacking.
2. I can't wait to get head-desked by a simple explanation in the comments.
3. So you don't have to try one of those bitly links yourself, here's a sample of text from one of the garbage websites:
Oneabe is a free online bidding site offering best auctions and known as beat penny auction site , here we conduct Free Online Auction and oneabe is one of the best Online Auction Sites. We also offer free international auction. Presently we are bidding on thunder-Quadband Dual SIM Wifi Touchscreen World and on superb LCD Home theatre media projector and so on. We do our auctions category wise. As here you would see a plethora of options and catalogs within which you can choose whatever is of your choice and need and participate in bidding as well as can buy them. We offer categories like Antiques and art, automobile & bikes, survival kits, businesses for sale, clothing and accessories, coins and collectibles and much more. We are known as a penny auction site worldwide.Under the category of antique and art we offer 20th century antiques ranging from 1920’s till modern , architectural antiques like garden antiques and others, under the wing of Art we offer contemporary art, drawings, paintings, general, photographic images, prints as well as sculptures. We also sell books and manuscripts those are rare and precious. We offer a plethora of ceramic goods and also clocks, decorative items to decorate your home and your office. Our folk art is very unique and our foreign arts are all master pieces. We also do bidding on furniture, map or atlas as well as on metal ware such as brass, copper, bronze, gold, silver, and silver plated goods also we sell music instruments. We also offer here to our customers a very good quality of textiles and linens that includes fabric, embroidery, linens and quilts and much more. Under the gaming option we offer...
4. (update) More theories being discussed on Hacker News https://news.ycombinator.com/item?id=5373161 

Saturday, March 9, 2013

African Drummers Invented an Internet

Maybe in 50 years we'll reminisce how there used to be one internet that covered the globe. But even before the telephone was invented, there were internets of a sort that covered regions in Africa. Probably there were others all around the globe; maybe even now the dolphins have their own version of an internet, disconnected from ours.

An internet, for the purposes of this article, is "a digital communications network that connects intelligent nodes distributed throughout a region".

Even civilizations without written languages needed to communicate with each other. If you lived in a rain forest where villages were separated by miles of bush, the best way to communicate with a neighboring village was to use drums. The "talking drums" of Africa used digital codes that could be understood from distances of as much as 5-10 kilometers. The codes weren't at all like Morse code, but were based on the tones of spoken languages.

A "talking drum" has two tones, so the signal is essentially binary. To overcome the lack of consonants, drum languages would add habitual phrases to words to disambiguate one word from another, resulting in an error-correcting code. Ruth Finnegan's chapter on "Drum Language and Literature" in Oral Literature in Africa gives some wonderful examples.
In the Kele language the words meaning, for example, ‘manioc’, ‘plantain’, ‘above’, and ‘forest’ all have identical tonal and rhythmic patterns. By the addition of other words, however, a stereotyped drum phrase is made up through which complete tonal and rhythmic differentiation is achieved and the meaning transmitted without ambiguity. Thus ‘manioc’ is always represented on the drums with the tonal pattern of ‘the manioc which remains in the fallow ground’, ‘plantain’ with ‘plantain to be propped up’, and so on. Among the Kele there are a great number of these ‘proverb-like phrases’ to refer to nouns. ‘Money’, for instance, is conventionally drummed as ‘the pieces of metal which arrange palavers’, ‘rain’ as ‘the bad spirit son of spitting cobra and sunshine’, ‘moon’ or ‘month’ as ‘the moon looks down at the earth’, ‘a white man’ as ‘red as copper, spirit from the forest’ or ‘he enslaves the people, he enslaves the people who remain in the land’, while ‘war’ always appears as ‘war watches for opportunities’. Verbs are similarly represented in long stereotyped phrases. 
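
To make the disambiguation idea concrete, here's a toy sketch. The high/low tone strings are invented placeholders (I don't have real Kele tone data), and the function names are my own; the point is only that a short pattern can match several words while the longer stock phrase matches exactly one.

```python
# Toy model of drum-language disambiguation.
# All tone patterns are hypothetical (H = high tone, L = low tone),
# invented for illustration; they are NOT real Kele data.
WORD_TONES = {
    "manioc": "HHL",
    "plantain": "HHL",
    "above": "HHL",
    "forest": "HHL",
}

# Stand-ins for the stock phrases quoted above, again with made-up tones.
PHRASE_TONES = {
    "manioc": "HHLLHLHHLL",  # 'the manioc which remains in the fallow ground'
    "plantain": "HHLHLLH",   # 'plantain to be propped up'
}


def decode(signal, codebook):
    """Return every word whose tone pattern matches the drummed signal."""
    return sorted(word for word, tones in codebook.items() if tones == signal)


def is_unambiguous(codebook):
    """True if no two words in the codebook share a tone pattern."""
    patterns = list(codebook.values())
    return len(patterns) == len(set(patterns))


if __name__ == "__main__":
    # The bare word is ambiguous: four candidates share one pattern.
    print(decode("HHL", WORD_TONES))
    # The stock phrase picks out a single word.
    print(decode(PHRASE_TONES["manioc"], PHRASE_TONES))
```

The cost of the redundancy is message length: the listener gets certainty in exchange for many more drum beats per word.
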
So that's how information was transmitted digitally, but it's not an internet yet. For that, you need a network. It turns out that drummed messages of note would be retransmitted to the next village. In modern terminology, packet switching. I imagine the drummers used a protocol similar to that used by Ethernet to ensure a clear channel for retransmission. Thus announcements, warnings, poetry (maybe even advertisements!) were packet-switched between nodes based on topic and relevancy.
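
The village-to-village relay can be sketched as simple store-and-forward flooding: each village re-drums a new message once, and every neighbor within earshot does the same. The village names and the earshot map below are hypothetical, and this sketch does not model the listen-before-drumming channel etiquette I'm imagining; it only shows how a message spreads hop by hop.

```python
from collections import deque

# Hypothetical villages and which neighbors are within drum range of each.
IN_EARSHOT = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}


def relay(origin, in_earshot):
    """Flood a message outward from origin.

    Each village retransmits the message once, the first time it hears it.
    Returns a dict mapping each reached village to its hop count.
    """
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        village = queue.popleft()
        for neighbor in in_earshot[village]:
            if neighbor not in hops:  # already heard it: don't re-drum
                hops[neighbor] = hops[village] + 1
                queue.append(neighbor)
    return hops


if __name__ == "__main__":
    for village, count in relay("A", IN_EARSHOT).items():
        print(f"{village}: {count} hop(s)")
```

Filtering by "topic and relevancy" would amount to each village deciding whether a message is worth re-drumming before adding its neighbors to the queue.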

A lot of expressive power of the drum language was used for names. From Oral Literature in Africa:
Personal drum names are usually long and elaborate. In the Benue-Cross River area of Nigeria, for instance, they are compounded of references to a man’s father’s lineage, events in his personal life, and his own personal name . Similarly among the Tumba of the Congo, all-important men in the village (and sometimes others as well) have drum names: these are usually made up of a motto emphasizing some individual characteristic, then the ordinary spoken name; thus a Belgian government official can be alluded to on the drums as ‘A stinging caterpillar is not good disturbed’. Carrington describes the Kele drum names in some detail. Each man has a drum name given him by his father, made up of three parts: first the individual’s own name; then a portion of his father’s name; and finally the name of his mother’s village. Thus the full name of one man runs ‘The spitting cobra whose virulence never abates, son of the bad spirit with the spear, Yangonde’. Other drum names (i.e. the individual’s portion) include such comments as ‘The proud man will never listen to advice’, ‘Owner of the town with the sheathed knife’, ‘The moon looks down at the earth / son of the younger member of the family’, and, from the nearby Mba people, ‘You remain in the village, you are ignorant of affairs’. (citations omitted)
So the drum languages seem to put importance on uniquely identifying individuals, something that our Internet is just starting to figure out. (See ORCID.) Reputation of individuals was important; I wonder if creators of particularly compelling drum poems were identified by custom, as we're starting to learn how to do with Attribution licenses.

It goes without saying that the literary forms transmitted by drumming were not copyrighted and that there was no notion of paying a creator for "copies" of a drummed message. But certainly the practitioners of this early digital literature were valued by their societies.
drumming tends to be a specialized and often hereditary activity, and expert drummers with a mastery of the accepted vocabulary of drum language and literature were often attached to a king’s court. 
Masters of unwritten literatures found many ways of making a living. The "court poet" is a familiar role to us; modern writers often find wealthy patrons. In addition, Finnegan relays another way that creators of oral literature earned their livings:

The singer arrives at a village and finds out the names of the important and wealthy individuals in the area. Then he takes up his stand in public and calls out the name of the individual he has decided to apostrophize. He proceeds to his praise songs, punctuated by frequent and increasingly direct demands for gifts. If they are forthcoming in sufficient quantity he announces the amount and sings his thanks in further praise. If not, his innuendo becomes gradually sharper, his delivery harsher and more staccato. This is practically always effective—all the more so as the experienced singer knows the utility of choosing a time when all the local people are likely to be within hearing, in the evening, the early morning before they have left for the farm, or on the occasion of a market which leaves no escape for the unfortunate object singled out for these ‘praises’. The result of this public scorn is normally the victim’s surrender. He attempts to silence the singer with gifts of money or, if he has no ready cash, with clothes or a saleable object like a new hoe.
So even "astroturfing" is not an exclusively modern phenomenon.

I learned all this reading Ruth Finnegan's Oral Literature in Africa, which was the first book made free to the world by Unglue.it, working with Open Book Publishers. Download and enjoy. Open Book Publishers have just launched an ungluing campaign for a second book, called Feeding the City, a translation of a seminal work from the original Italian, about the dabawallahs of Mumbai, a subject Internet entrepreneurs could learn a lot from. Support the campaign to make it free to the world!

Thursday, February 14, 2013

Can Libraries Lend eBooks Without DRM?

Last week I sat through a talk titled "What is a Book?". The speaker was making the point that digital books don't have to be any particular length, so books could be really short or even really really long. The talk then went down a rabbit hole of ebook technology. That evening, in the hotel bar, the assembled wisdom came to a different conclusion. "The book is a social construct" we declared, and we promised to write blog posts to that effect. Software can try to make the book into something new and more wonderful, but the social construct is more powerful than the technology. Sure, you can copy a digital book endlessly, but people will still think of it as something you buy in a bookstore. You can use "Digital Rights Management" (DRM) software to stop people from making copies, but people will find ways to share it with friends, because that's the social construct built around books.

Libraries have built another powerful social construct around lending books. You would think that libraries would be able to think of a way to do this for ebooks without resorting to DRM software. But noooo! Here's the problem: ebooks aren't print books, and the most popular model for library lending of ebooks is what I call "Pretend It's Print". Only one person can use the ebook at a time. So when one person is "borrowing" the book, everyone else has to wait their turn. The reason that libraries accept this model is that it would be prohibitively expensive to license a popular book for the use of all of their patrons at once. The downside is that the fictional scarcity of the ebook has to be enforced somehow. Most publishers don't trust that libraries have the technical expertise to securely encrypt files, manage keys, and track licenses to enforce the print-ness of an ebook. So libraries end up lending ebooks only via "platform vendors" such as Overdrive, 3M, EBSCO, ebrary, EBL and others.

There must be a better way.

When the Harry Potter books came out in digital form, they did come out in a better way. If you buy a Harry Potter eBook from Pottermore, it comes as an unencrypted ePub. But your name and purchase info get embedded in the file ("fingerprinting" and "watermarking") as a way to discourage you from posting it on file sharing sites. As long as you use the ebook the way you would use a book, no software gets in your way. This is commonly known as "social DRM". Unfortunately, to my mind, the "social DRM" label has besmirched a good idea with the stink of a bad one. Cory Doctorow has called social DRM "delusional". O'Reilly's Joe Wikert compares social DRM to being "a little bit pregnant", which might seem "a little bit" hypocritical, as O'Reilly's venture with Pearson, Safari Books Online, uses social DRM for PDF downloads. (updated 2/18/13 based on comments) A more practical view was expressed by Feedbooks' Hadrien Gardeur, who suggests that we think of social DRM as "personalization".

Librarians are nothing if not practical, but the strong DRM that's been imposed on them by the incumbent ebook platforms is in conflict with many of the core beliefs of librarianship. (The platform vendors are, in turn, required to use strong DRM by the publishers who offer their books for library licensing.) DRM degrades accessibility, fair use, and privacy. Is there a way to use the strength of the library lending social construct to enable an ebook lending system that works without DRM?

The difficult part of this is not so much preventing illegal distribution, but getting users to accept limited lending periods for digital objects. After all, libraries need late fines to enforce limited loan periods even for printed books. With personalization of ebook files and cooperation with reading environments, this could be easily achieved. A "loan certificate" could be inserted into the ebook file, and the reading environment could remind and assist the user to "return" the book. Reading environments could also offer the user a chance to purchase a permanent license for the ebook.
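
Here's a rough sketch of what a "loan certificate" and a cooperating reading environment might do. Everything in it is hypothetical: the field names, the 21-day loan period, and the three-day reminder window are my own placeholders, not part of any ebook standard or existing library system. Note that nothing here locks the file; the reader software just nudges the borrower.

```python
import json
from datetime import date, timedelta


def make_loan_certificate(patron_id, library, title, checkout, loan_days=21):
    """Build a hypothetical loan certificate as a JSON string.

    A real system would presumably embed something like this in the
    personalized ebook file; here it is just a string.
    """
    due = checkout + timedelta(days=loan_days)
    return json.dumps({
        "patron": patron_id,
        "library": library,
        "title": title,
        "checked_out": checkout.isoformat(),
        "due": due.isoformat(),
    })


def loan_status(certificate_json, today, remind_within_days=3):
    """What a cooperating reading environment might tell the user."""
    certificate = json.loads(certificate_json)
    days_left = (date.fromisoformat(certificate["due"]) - today).days
    if days_left < 0:
        return "overdue"   # prompt the user to "return" or buy the book
    if days_left <= remind_within_days:
        return "due soon"  # gentle reminder
    return "on loan"


if __name__ == "__main__":
    cert = make_loan_certificate(
        "patron-123", "Example Public Library",
        "Oral Literature in Africa", date(2013, 2, 14))
    print(loan_status(cert, date(2013, 2, 20)))
    print(loan_status(cert, date(2013, 3, 5)))
    print(loan_status(cert, date(2013, 3, 10)))
```

The "overdue" state is also the natural place for the reading environment to offer a permanent license for purchase.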

Of course, a knowledgeable user could easily circumvent loan expiration, or choose a reading environment that ignores loan certificates, but that's beside the point. As long as most of the users respect the social construct of the library, they'll respect their obligations to their library, and the world will be better for it. Many of the expenses and inefficiencies of the current system would disappear. I think it's worth a try.

Overdrive has tried. In June of 2011, they announced a DRM-free lending program, with DRM-free publishers such as O'Reilly and Carina Press. The Overdrive program deals with the "return" issue by not dealing with it at all. It's not possible to return a DRM-free title early, so no one else can "borrow" a DRM-free title for the standard loan period. The effect is to replace the one-reader-at-a-time restriction with a lending-rate restriction. Overdrive seems to have de-emphasized this program. In December of 2011 they notified customers that "DRM has been applied to select DRM-free eBooks" and the announcement of the program has disappeared from their blog.  There's also no indication in the borrowing UI that a title is DRM-free; epubs are just labeled "Adobe EPUB ebook", presumably to prevent users from discovering how easy these files are to use or misuse.

A lot of mainstream publishers would never let libraries lend "unprotected" ebooks. There are exceptions. Springer's ebook collection is available to libraries without DRM. This makes sense because none of the included books are likely to be heavily used; the audience for an academic book is usually quite small. On the other hand, there are DRM-free publishers such as Baen and Tor that currently do not allow library lending at all; they seem to be good candidates for a no-DRM library solution.

I've started talking to people about how these ideas might be implemented. If you have ideas of your own, please let me know.

Notes:
1. An IDPF working paper has promoted "lightweight content protection" as another way to address the needs of libraries.  The idea in this proposal is to apply just enough DRM to trigger the anti-circumvention provisions of the DMCA and similar provisions in other countries.
2. Bookshare has used fingerprinting and watermarking as key components of their "Seven Point Digital Rights Management Plan", with good success.
3. Liza Daly has a list of DRM-free publishers. Now that she's at Safari, maybe something will happen there. You should read her comments. (update 2/18)
4. Unglue.it is another way to remove the need for DRM in libraries.

Wednesday, February 13, 2013

One eBook to Prove Them All

I've not written much about it here, but over the past year I've been participating in the American Library Association's "Digital Content Working Group". DCWG is broken up into smaller groups focusing on specific areas. I've been working on "Business Models". At ALA's Midwinter meeting, DCWG sponsored a jam-packed symposium.

The DCWG's meetings at ALA's mid-winter and annual conferences are open for anyone to attend, and they've been covered by the library press. Our recent meeting in Seattle was covered by Library Journal's Matt Enis, and he highlighted an idea that came out of our subgroup, the "One eBook" program:
The American Library Association’s Digital Content and Libraries Working Group (DCWG) has begun exploring an idea that could help publishers better understand the powerful impact that libraries can have for their authors and their bottom line.
I've finally had a chance to write this up for American Libraries' E-Content Blog:

There’s a lot of data suggesting that exposure to books in libraries increases sales for those books. There’s also a lot of data that suggests that many publishers believe the opposite—namely, that the availability of books in libraries depresses sales, and that if libraries improve the ebook lending process, making it easier for library users to substitute loans for sales, then ebook sales will be hurt even more. 
That word “suggests” is the problem. We don’t have controlled experiments that have really measured the broad effect of the library lending of ebooks on ebook sales. ALA’s Digital Content and Libraries Working Group has been examining the situation, and we had an idea. What if libraries all around the country promoted a single ebook for a month? What if that ebook’s publisher offered a special deal so that for that one month, libraries could lend that ebook to as many patrons in their communities as possible without decimating their acquisition budgets? Once the month was over, that specially promoted library ebook deal would end. What do you think would happen?
There are a lot of details to work out, of course, but we've had a lot of positive reactions. It's the practical and technical details I'm thinking about right now. For example, how can we make such a program available to as many libraries as possible, regardless of whether they are currently offering ebooks? How can we make the ebooks work on all sorts of platforms? How do we make the one-ebook ebooks expire after a month?

As if I didn't have enough to do...


Friday, February 8, 2013

Anachronisms and Dysfunctions of eBook Front and Back Matter


The process of digitizing a printed book involves much more than the conversion of ink on paper to bits in a file. Functional aspects of the book must be mapped to digital equivalents. Thus we have tables of contents and indices turning into hyperlinks and spine files, page numbers that beget location anchors and progress indicators.

The terms of art for this stuff are front matter and back matter. I'll cover the many dysfunctions of ebook copyright pages in another article, but let's step back for a moment. What is this stuff for, anyway?

A good example is the bastard title (or half title) page. This is a page, usually printed with only the book's title, that precedes the title page in the book. When dinosaurs roamed the earth, the function of the bastard title was to identify and physically protect the paper text block until it was bound. Sort of like the tissue paper they still put in fancy wedding invitations. I daresay that ebooks do not require any such protection. The bastard title is utterly without use in an ebook. Begone!

Next, consider the title page. It typically displays the book's title, author, and the publisher.

In a print book, the title page is a declaration of bookiness. You don't have title pages in magazines or newspapers. The title page says "get ready, here comes a book, so go find a comfy chair."

But a digital book needs something different. It needs a start page. Think about the start screen of a DVD. (You DO remember those, don't you?) Now think a bit more generally. Modern ebooks share their underlying technology with websites, so why not convert the title page of a book into a home page for the book, with the sort of utilities you expect on a home page?

If all we do is replicate the functions of a print book, then we haven't done our 21st century thinking very thoroughly. What kinds of things might an author or publisher want on their book's home page? The ability to share via social networks? Definitely! Probably a channel for conversation. A way to connect to other books from the author and/or publisher? Yes please! Maybe even a usage tracker.

From my perspective, thinking about what our Creative Commons licensed editions should look like, there are a number of front-matter and back-matter tweaks needed. We add lists of supporters, for example. One of the author-publishers participating in Unglue.it, Melinda Thompson (support her book here), had these great suggestions:
The first page of an unglued book should contain only two things: an unglued logo and a small “what’s this?” link. Initially, “unglued” won’t mean anything to anybody, but over time they will learn what it means as some people click the “what’s this?” link and learn more. Once a person clicks on the “what’s this?” link they’d get a very short menu with things listed like: What is an Unglued Book, Rights, How to Share this Book, Supporters, etc. And behind that short menu could be all the details you want.

I would love to see a share button (like you have on the Unglue.it website) at the end of each and every unglued book – inside the book on the very last page. If the whole point of unglue.it is to give books to the world, then people should be easily able to do that from a technological perspective. People should be able to download an unglued book for free and then, technologically, the book should really be free and easy to share effortlessly via email, Facebook, Twitter, Goodreads, and other social media websites. And, from your branding perspective, people should be able to easily tell your story on your behalf. People should literally, easily be able to give an unglued book to the world.

But it's not just unglued books that need work. Let's look at the book Book: A Futurist's Manifesto, by Hugh McGuire and Brian O'Leary, which is a very interesting collection, by the way.

Here's what it looks like in iBooks. (It's the version released in 2011, though a later version is labeled the first edition.)

Thankfully, these futurists have axed the 18th century bastard title, but the title page itself looks lost. There's not even the customary publisher name.

Here's the PDF embed from scribd:
The title page is dressed up a bit, but hey, it's a PDF.

Now take a look at the booky part of the book's homepage, first on O'Reilly's website:

And then on Pressbooks, where it really IS a website.

You can see that this browser version has started down the road to rethinking the front matter.

Look at these homepage captures and think about how many of these functions would work just fine inside an ebook, on an ebook reader intermittently detached from the web. Take out the "buy" buttons, and you have a decent start page for the book. Or leave in some buy buttons if you want to sell print copies or you want to upsell to a deluxe version.

So how do we proceed? These things work better if readers don't have to learn different UIs for every ebook they read, but at the same time, there's no need to leave users in the previous century. Maybe book designers could share their start page designs for everyone's benefit. Wouldn't that be nice? Have you seen an innovative start page on an ebook? What else would you like to see on a book's start page?

Update 4/10/13: Suw Charman-Anderson has a great follow-up post.