Stripmining The User: DataPortability, The “Pragmatic” Web, And A Bad Philosophy

Regarding the ReadWriteWeb article “The Future Is All About Context: The Pragmatic Web” … well, I really think you should read it for yourself, ’cos between the lines it is rather shocking.

I don’t much disagree with the first couple of paragraphs; I’m deeply cynical about the “semantic” web and still hew to my belief that “if you create documents that a computer can read, only a computer will want to read them” but although Alisa Leonard-Hansen [ed: henceforth ALH] echoes the AI zealots of my youth when writing:

…the intelligent personal agents that are able to process this structured data still have a long way to go before becoming fully actualized.

…I think ultimately we disagree because I believe (a) the personal agents are already here, (b) they are called ‘smartphones’ and (c) they will evolve and improve in ways that we cannot adequately imagine, but they certainly will give people a platform capability which, as a Unix sysadmin, I would have drooled for in 1995.

ALH continues by boosting for “the pragmatic web”[1] and ties that term to some thinking with which I mostly agree, viz: your digital identity equates to your digital footprint:

We need to better understand our identity as it begins to define our experience of the Web and the networked-enabled world we inhabit. Our online identity will increasingly be defined by three “pillars”: who I say I am, what I do and say, and who I connect to (and who connects to me).

To clarify, our online identities are comprised primarily of three specific kinds of data:

  • Explicit or prescriptive data (i.e. the data that I input about myself: name, age, occupation, etc.);
  • Activity or behavioral data (i.e. what I do and say online);
  • Relationship data (i.e. my social graph and what my connections say about me).

Indeed, Adriana has been saying this, better, for years; but then ALH’s article goes horribly wrong; the rest of the article flows from a bunch of unstated premises which I think would be written:

  • The user’s identity is not under his control
  • This cannot, perhaps even should not be changed or fixed
  • This is a good thing because it affords business opportunity
  • ALH’s “pragmatic web” is a good thing which must be brought about

Rather than just strawman this, I’ll try to justify how I reverse-engineered these premises:

“The user’s identity is not under his control”

Well, yes, this is a given on Facebook (at least) – you hand over your data, poke your friends with vampires, post pictures of yourself vomiting, and then have to fight to control who sees them – if you actually care.

“This cannot, perhaps even should not be changed or fixed”

I justify the “cannot” because there is no alternative presented:

But the centralization of identity data on one or two major networks […] won’t realize the vision of the pragmatic Web. So, how will the pragmatic Web come to be? How do we realize the power of a dynamic Web that is based on our [ed: distributed and uncontrolled digital footprint] identities?

…and instead we see merry pictures of how having one’s identity hanging-out-there-in-public can be exploited for profit (“monetised”):

The resulting vision is that of a highly personalized, dynamic, relevant and remixable Web experience, yielding greater access to information through discovery, communication and collaboration. For enterprise, this could mean the rise of innovative new business models, based on data-driven value exchange.

I further justify the “should not” because it leads into another pseudopremise:

“This is a good thing because it affords business opportunity”

For me this is justified by the money quote:

Consider this: as media companies scramble to identify new and innovative ways to advertise to the sea of nameless, pixeled users who graze through their content each day, a rich supply of highly valuable identity data lies just beneath the surface, left unmeasured and unmonetized.

There it is, folks: you are all natural inforesources begging to be crushed and rendered into yummy data that feed the advertising industry. You are a chicken and the advertisers want McNuggets. Yes they really think like that; they just don’t put it that way because it sounds bad, but what you browse and what you like are more important than “you”, in this world. You exist as a demographic.

And finally:

“The ‘pragmatic web’ is a good thing which must be brought about”

…well, if you sold these concepts to advertisers and vendors (“So, how will the pragmatic Web come to be? How do we realize the power”) you would believe and write everything from that perspective.

So, as in any endeavour, with this pragmatic web we have:

  • the motivation (profit)
  • the opportunity (data just lying around waiting to be harvested)

…which leaves only:

  • the ability (tools and a suitable environment)

…to be created. This is what almost everyone is trying to do, nowadays; it’s where the money is. ALH starts to suggest that Elias Bizannes of DataPortability is also channeling Adriana, with:

One final note on identity data as it relates to enterprise. As Bizannes points out, the value of this kind of identity data rests on the key factors of time and timeliness. Essentially, identity data is valuable only if it is recent. Facebook wouldn’t be able to sell your (permissions-enabled) data to advertisers if it used your explicit data from a year ago rather than from today.

…which is astonishingly similar to The Mine Project’s longstanding philosophy that relationships are maintained by sharing information – and that because currency is valuable, your ability to control people’s access to your current identity (data, thoughts and feelings) gives you the whip hand in a digital relationship.

However, Bizannes apparently holds a Bizarro approach to this line of thought:

So, Bizannes argues that real-time “access” to someone’s identity matters most, and it’s no longer about data “capture.” Thus, as new business models arise out of monetizing permissions-enabled identity data, the value of the business models will depend on these entities having real-time access to the data.

I really do wonder whether ALH is quoting Bizannes correctly.

Anyway, this is another example of what I call FacebookEnvy[2] – you can smell the line of thought:

  • Facebook has all this identity information!
  • We should make it open, so that it’s better!
  • But we need to keep a back door, so we can monetise it!
  • So what we’ll do is, we’ll be intermediaries!
  • Like Facebook is!

I shan’t name them here but this is a growth area of the web at the moment – world-class hot-air merchants developing systems to empower the little guy / the common man, arguing that the way to benefit humanity is for them to adopt this wonderful new [THING] to interpose between [YOURSELF] and [THE OUTSIDE WORLD].

Pay no attention to the revenue model behind the curtain, and God (or Law, or Protocol) please forbid that users have, control and use platforms for themselves, or that anyone trust what people say about themselves.

But that’s a rant for a future blog post.

– alec

Postscript 1: The Pragmatic Web

I wonder if ALH really means the same “Pragmatic Web” concept that appears to live at – where they publish a manifesto with the following helpful definition:

The vision of the Pragmatic Web is thus to augment human collaboration effectively by appropriate technologies, such as systems for ontology negotiations, for ontology-based business interactions, and for pragmatic ontology-building efforts in communities of practice. In this view, the Pragmatic Web complements the Semantic Web by improving the quality and legitimacy of collaborative, goal-oriented discourses in communities.

Which is all very “semantic” and seems to overlap with ALH’s article, somewhat.

To highlight the value of this “Pragmatic Web”, the authors also write:

To search for potential window manufacturers (WMs), current search engines suffice, although a general ontology may offer improvement. But once negotiations with different window manufacturers begin, a branch-specific ontology is required that includes, for example, the specification of construction materials.

The WM should only use highly insulated window frames and should construct the windows using specific techniques to avoid thermal bridges.

If the WM is not German, the legal regulations might be unknown and so the manufacturer must understand the underlying ontology and commit to it.

It can also occur that the partners must add new concepts to the existing ontology. For example, they might have to agree on a specific type of low-energy house, namely one using three litres of energy per square meter of area with controlled ventilation and using geological heat sources.

Such a concept is not an objective description of a given reality, but is developed within the conversation between the parties, who in their conceptualization of this kind of house take into account many tacit, non-formalizable context factors. The effect of the resultant joint definition may be that contract negotiation is smoothened, or even that the costs are reduced since some requirements may turn out to be superfluous.

[ed: paragraph breaks added for clarity]

Excusing energy being measured in “litres”, I can almost see what they are getting at, because even the godlike powers of Google fail to provide the ability to deal with queries such as: "windows windowtype:upvc legalsystem:german glazedepth:3 heatloss:-3kw +ventilation:true +attractiveness:pretty +tint:clear +style:art-deco"

…and I agree totally that there is insufficient means to describe windows for a mechanical search, and I understand how it can keep some semantic/markup/librarian-types up all night, worrying about it.
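For the curious, splitting a query like the one above into machine-usable parts is the easy bit. Here is a minimal Python sketch, assuming a made-up convention in which `key:value` tokens are filters and a leading `+` marks a filter as mandatory; the function name `parse_query` and the output shape are my own inventions for illustration, not any real search engine’s syntax. The hard bit, which no parser solves, is getting everyone to agree on what `attractiveness:pretty` means.

```python
def parse_query(query):
    """Split a search string into free-text terms and key:value filters.

    Hypothetical convention: 'key:value' is a filter, and a leading '+'
    marks that filter as mandatory rather than merely preferred.
    """
    terms, filters = [], {}
    for token in query.split():
        required = token.startswith("+")
        body = token.lstrip("+")
        key, sep, value = body.partition(":")
        if sep and key and value:
            filters[key] = {"value": value, "required": required}
        else:
            # No usable key:value pair, so treat it as plain search text.
            terms.append(body)
    return terms, filters


query = ("windows windowtype:upvc legalsystem:german glazedepth:3 "
         "heatloss:-3kw +ventilation:true +attractiveness:pretty "
         "+tint:clear +style:art-deco")
terms, filters = parse_query(query)
```

After this runs, `terms` holds the free text and `filters` maps each attribute name to its value and whether it was marked mandatory.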

But me? As in most of my semantic-web scenarios, in reality I use an ultrasoft-AI approach:

I do a search and pick up the phone to discuss what I want.

It works. I’d call that the “pragmatic” approach.

Postscript 2: FacebookEnvy

A line of thought: “Facebook does X. We should do X, but open, so it’s better.” This leads to any amount of pseudoinnovation. Replace with TwitterEnvy where appropriate.

7 Replies to “Stripmining The User: DataPortability, The “Pragmatic” Web, And A Bad Philosophy”

  1. Hi – Bizarro here.

    1) your refutation of “…the intelligent personal agents that are able to process this structured data still have a long way to go before becoming fully actualized.”
    > Really? I told Alisa to include this when she asked for feedback on her original post. My reasoning being personal agents do exist but they are still dumb. And the reason they are dumb, is because there is not *enough* machine processable information to allow them to act without human guidance. We are getting there, but it’s no mass market opportunity yet.

    2) Channeling Adriana? I’m also a member of the VRM Project. So no coincidence we have similar ideas.

    But to get more evidence of the access point, you can read this blog post I wrote

    There is another post I made on the mailing list well over a year ago, and I essentially made the point that your identity data only has value if it is recent…hence why access is more valuable than capturing the data (as you always get the latest data). So for example, your relationship status and current employer changes dramatically every year, five years, ten years – if Facebook had that data about you five years ago, it’s less valuable than if you updated it two days ago.

    3) Your broader point about a distaste for monetising people’s identity, I think you need to recognise that’s just how the world works. And rather than preach a utopia where companies cannot do so, we need to instead shape a world where we control our data and the benefit surrounding it. Rather than prohibiting an existing practice, we need to re-engineer it and do so along the lines of incentives for business as that is how you create change.

  2. Channeling Adriana? Well, I am familiar with “her” work at VRM (and like Elias as part of the DPP we work in similar circles)….but let’s be honest, many of her ideas are based on the work by Doc Searls and early Cluetrain concepts…and frankly some of her ideas are deeply flawed.

    As for the content of my article, it was greatly truncated and overly simplified due to the nature of the varied core audience of RWW and does not represent the whole of my thoughts on the matter.

  3. Hello Alisa,

    Channeling Adriana? Well, I am familiar with “her” work at VRM (and like Elias as part of the DPP we work in similar circles)

    I love the “double quotes” for emphasis, they truly drip with sarcasm.

    but let’s be honest, many of her ideas are based on the work by Doc Searls and early Cluetrain concepts and frankly some of her ideas are deeply flawed.

    Given the above and extrapolating from the observation that you are trying to bridge two criticisms here (“foo AND bar”) in order to lend credibility to the more flawed of them, I would suggest:

    • You’re not being “let’s be honest”.
    • Doc Searls has been an inspiration to Adriana, and I have seen that the reverse is true, too. Considerably, and on frequent occasions, many of which I have witnessed. In fact I would hesitate to say who has learned more from whom, because I am not sure it’s decidable.
    • But what I am sure of is that Doc gets the bigger press, and more people get exposed to her ideas that way.
    • If you’re going to glibly assert that someone’s ideas are flawed, it’s impolite not at least to provide some idea of what you are attacking.

    As for the content of my article, it was greatly truncated and overly simplified due to the nature of the varied core audience of RWW and does not represent the whole of my thoughts on the matter.

    Then I am sure you’ll not mind expanding at length, perhaps with the complete text, in your blog, where it can be understood better.

  4. Hi, Alec

    hopefully you’ll get this. I’ve been loosely following the Mine Project over time and am a big believer in the user-ownership/hosting approach. In fact, so much so that I’d like to see what it would take to move this project forward more rapidly.

    I have a few questions, if you don’t mind:

    1) I saw a mention of Tahoe-LAFS integration and I think this would be a great thing, especially if it includes the VLC video streaming ability hinted at in the Tahoe info.

    So much is trending towards video and video sharing that I think some solution for streaming will be necessary. Also, the redundancy of storage in Tahoe is appealing.

    What is happening with this in terms of assessing feasibility or moving towards integration?

    2) Are the ijetty/paws/kws web servers on Android a viable solution for running pymine on Android phones? Your talk on phones threatening the internet was great, btw. I think it’s clear that the trend toward mobile is accelerating .. yet another argument for a Tahoe-LAFS capable NAS drive at home and a phone in hand .. ?

    3) How do strangers find each other in the “Mine Field” .. hehe..probably not a good term .. but is there a discovery and introduction mechanism? I’m thinking of something like the vizster query-able network map extending beyond my own current connections .. but with specific ID info blinded at whatever level a user chooses. So, I may put my “self” “out there” as a physiotherapist but leave off gender, age, etc .. or vice versa .. and be discoverable based on what I share.

    I think it would be helpful if I could find users of interests and attributes I’m looking for along with a path thru “closest known” users .. as well as to be able to ask the known user to pass my vcard along and request an intro.

    but a mechanism may already exist…?

    4) Is there any reason a Mine cannot run over TCP AND over a mesh?

    5) What kind of resources would it take to get the pymine code in shape for broad use in 3-6 months?

    6) Is there a more direct way to contact you?

    Thanks .. it’s great work and SHOULD be the forward trend.

