Monday, July 20, 2009

The Evolution of Names and Personal Identifiers

One of my Great Great Great Great Great Great Great Great (8G) Grandfathers was born around 1590 in northern Sweden and was named either Pål or Påfvel. A written record of him that survives refers to him as "Pål de Äldre", or Pål the Elder, to distinguish him from his son, called either Pål Pålsson or Påfvel Påfvelsson, depending on which record you believe. Pål de Äldre's father was also called either Pål or Påfvel; in a written record he was called Pål Finne, or Pål the Finn. In 1600, it just wasn't important for a farmer in northern Sweden to have a consistent spelling of his name, or even to have a consistent name. It's likely that his name was written down only a dozen or so times in his lifetime: in church baptismal records, maybe a marriage record. If he had been a city dweller, it would have been different, but for the most part the writing down of names was part of an effort by the church and thus the government to extend its dominion into the countryside. It was common for people to be identified only by their first name and the name of their village. Patronymics were the cultural norm in the cities, and as the countryside developed and communicated with other parts of the kingdom, the patronymic names and registration of those names became more formal and regularized.

In the late 1800s, it became popular in Sweden to adopt family names instead of patronymics. This probably had to do with an increased awareness of naming conventions in the rest of the world. My great grandfather, born Abel Olsson, and his brother, Salomon Olsson, decided to take the family name "Hellman", which translates as "bright man". Their brother Olof took the name "Holmberg", while brother Magnus went with "Hellgren". Salomon later emigrated to America, and on entry changed the name to "Hallman" because a religious member of the family who knew English disliked having the word "Hell" in their name.

Personal names are now used globally. My name has to compete with that of other Eric Hellmans around the world when somebody wants to search for information about us on Google; I am NOT the Eric Hellman who used to manage the band Blink-182. So the global village reduces us to using the same attribute-based disambiguation schemes that were used in 1600: I am now Eric Hellman of Montclair, New Jersey, or Eric Hellman the Linking Technologist.

When I was a scientist, writing articles for technical journals, I always used the form "E.S. Hellman" for my name, because I had searched Science Citation Index and knew that no one else had published under that name for at least 10 years. The process was exactly the same as you might use now to pick a Twitter hashtag: temporal disambiguation is the best I can hope for.

My mother-in-law has used many names. She was born in China, and her birth name indicated her birth-order in the family and a "generation name". When she entered school, she received a school name. As an adult, she chose a name for herself. When she emigrated to the US, she chose to adopt an English name and a transliteration of her family name. When she married, she began to use a married name. When the full married name didn't fit on important legal forms, she chose a legal name.

With this family background, you'd think I'd know better. When I designed a database-driven e-journal with front-to-back automation in the early days of the internet (1995), I had the bright idea of giving identifiers to all the entities involved in publishing a paper. I imagined a world in which the authors of every paper and the institutions they worked for would have global identifiers, and that every article would be rendered using the up-to-the-minute information from the database. In short, my design for the e-journal was hoping for a linked data future.

I briefly considered the possibility that an author would change their name, and I naively decided that papers were authored by people and not by author names. If an author changed their name in real life, the name on the paper should change as well. By the time this circumstance actually occurred, I had learned enough about how citations were used by abstracting services and the like to realize what a bad idea a retroactive change in an author list would be. I changed the way data binding occurred to prevent the retroactive change from happening. But because I used author identifiers in the database underlying the e-journal, the generated author pages displayed all the papers written by the author who had changed her name.
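The binding described above can be illustrated with a minimal Python sketch. This is not the e-journal's actual code: the class and method names (`Journal`, `BylineEntry`, `rename_author`, and so on) and the sample author are hypothetical, chosen only to show the idea. Each byline entry keeps the stable author identifier together with a snapshot of the name as it stood at publication, so a later name change does not rewrite old papers, while the author page, keyed on the identifier, still finds everything.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    author_id: int      # stable identifier for the person (hypothetical scheme)
    current_name: str   # may change over the person's lifetime


@dataclass(frozen=True)
class BylineEntry:
    author_id: int          # link back to the person
    name_as_published: str  # snapshot of the name, frozen at publication time


@dataclass
class Paper:
    title: str
    byline: list = field(default_factory=list)


class Journal:
    """Toy in-memory stand-in for the e-journal database."""

    def __init__(self):
        self.authors = {}
        self.papers = []

    def add_author(self, author_id, name):
        self.authors[author_id] = Author(author_id, name)

    def publish(self, title, author_ids):
        # Copy each author's current name into the byline instead of
        # looking it up at render time.
        byline = [BylineEntry(a, self.authors[a].current_name) for a in author_ids]
        paper = Paper(title, byline)
        self.papers.append(paper)
        return paper

    def rename_author(self, author_id, new_name):
        # Changes the person's current name only; published bylines are untouched.
        self.authors[author_id].current_name = new_name

    def render_byline(self, paper):
        return ", ".join(entry.name_as_published for entry in paper.byline)

    def author_page(self, author_id):
        # Keyed on the identifier, so papers under every past name are listed.
        return [p.title for p in self.papers
                if any(entry.author_id == author_id for entry in p.byline)]


journal = Journal()
journal.add_author(1, "A. Example")               # hypothetical author
first = journal.publish("First Paper", [1])
journal.rename_author(1, "A. Renamed")
second = journal.publish("Second Paper", [1])

print(journal.render_byline(first))    # name as originally published
print(journal.render_byline(second))   # name after the change
print(journal.author_page(1))          # both papers, whatever the name
```

The design choice is the snapshot: rendering the byline from `current_name` at display time would reproduce the retroactive change that citations and abstracting services cannot tolerate.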

At this point in the post you might expect to read a clarion call for the establishment of global person identifiers, to enable the global cloud of URI-based linked data to know all the articles authored by a given person. If so, I have to disappoint you. What I have for you is an observation, and a question. The observation is that naming of individuals is a universal practice across all human societies, and name shifting is almost as universal, and certainly as human. The regularization of names, on the other hand, the conversion of names into identifiers, if you will, has always been a governmental activity. It was the Kingdom of Sweden and its Church that wrote down Pål Pålsson's name; my middle name seems to exist mostly for entry onto forms, and to lend an initial to my signature. It was the organizing force of my e-journal database that regularized the author names it displayed.

My question is this: if we give people global identifiers, what will they do with them? Will they view them as progress for civilization, like roads and communication systems, or will they view them as encroachments on privacy and liberty and on their rights to change their name and identity? Will people embrace their identifiers and view them as property, or will they attempt to subvert them and hide them away as Americans must do with their social security numbers? Is the urge to regularize our identifiers a natural extension of our human proclivity to give ourselves names, or is it something that can only be accomplished as an expression of human government?


  1. I doubt governments will set up global identifiers: the signs of the times are that they'll let the likes of Google or Verisign do it. Oh, and in fact people do have some sort of global identifier: an OpenID account. It's used for access control and authentication, but it does provide for a digital identity.
    And in the online world, they'll hopefully learn how not to allow complete aggregation of all their identities, so as, you know, to keep the online life they had at 15 years old from getting in the way of the online life they will have at 30 or 40.
    Or maybe sometimes they'll have to: one's university insists the affiliation has to be part of the identity when you sign a paper; it's not there when you sign up on Twitter or publish a children's book.
    Then, if we keep the Verisign/OpenID analogy: it allows one to select which information each web site will get from our OpenID account. But it's messy: cumbersome to use, and the data that's in my OpenID is poor and isn't really curated, except by me.
    I have a question of my own: don't you think the more global identifiers are used in real life situations and around the web, the cleaner those global identifiers might become over time?

  2. If global identifiers actually came into being, would there be a single database for same, and how would it be managed? Or would each government build and manage its own according to a predetermined international standard? Can't help thinking it's not going to happen. Of course we already figure in many databases, official and unofficial, all using slightly different datasets according to context. I might like to be globally known in some contexts but not others, and what about data protection?
    You've raised some really interesting points here, Eric, which touch many aspects of life in the 21st century.

  3. Great post! To think about global identifiers and how they might get repurposed, I think this talk, "Spaces of Calculation", might be thought-provoking. It's about the history of street names and how they've been reused for alternate purposes.

    PS-Is "no pasting into comments" intentional?

  4. Jodi- Posting into comments works for me.

    Nicomo- Increasing use of things like OpenID could make things cleaner, but I can imagine the opposite as well.

    Breizhlady- There will certainly be a single database for global identifiers, and probably many of them!