Thursday 22 March 2007

Friday 9 March 2007

The SPAM solution - Demand Good Manners

About a year ago I recognized the SPAM problem for what it really is, how it really works, and what it really does. There is nothing new about SPAM: it pre-dates the printing press and it pre-dates common literacy. SPAM as we know it is just a special form of an old problem.

What problem?

Bad Manners

It’s rude to talk to someone you’ve not been introduced to. In Pride and Prejudice, Mr Collins tries to introduce himself to Mr Darcy by way of knowing his relative, but is scorned and ignored.

Why don’t well-mannered, cultured people speak with those to whom they have not been introduced? Simply because they would be mistaken for bad-mannered hangers-on, butters-in or favour-grabbers.

Why do we have a problem with spam? Because we authorize our mail servers to accept mail from any ne’er-do-well without the slightest introduction.

Don’t talk to strangers

In the human world (body snatchers and Manchurian candidates aside) a person’s body and social knowledge are a good substitute for the person’s own identity, and thus it is possible to recognize respectability (as we deem it) and trust (as we value it) in whatever circumstance we find ourselves.

In the electronic world of the internet the email From: address is a poor substitute for identity, and is often spoofed. SYN-spoofing was solved with SYN-cookies, but what is the email equivalent? SYN-cookies don’t avoid depleting resources when the claimed sending node really is the sending node. A higher-level email equivalent would merely prove that there was such a valid email identity for the moment.

SPF is an attempt to filter out messages sent from a bogus location. If I know my mother is in Brighton, and I receive a letter asking me to remit 100 pounds to her sister, but the sender’s postmark is Liverpool, I could guess that the letter was a forgery. SPF is a way for domain administrators to allow mail spoofed from their domain to be recognized. As a consequence most SPAM spoofers will not spoof From: addresses using domains that have SPF records.

However, my problem is how to stop hangers-on and cold-callers offering me the latest in pills or potions, or job offers in Belle Vue, California (instead of Belle Vue, Wakefield), from troubling me with their wares. I want them to obtain an introduction; I want to assess how far I trust them, and not be troubled by the quantity of callers jostling for attention.

The solution is to resurrect the habit of requiring introductions as we again live in a world inhabited by those who are anxious to push their message upon us as if it were the only thing we longed to hear.

Trusted Identity

If your friends get themselves an SSL certificate for email, or use PGP and establish a web-of-trust, it will be easy to recognize email from those friends as such, howsoever sent. If the said friends can competently manage this electronic identity and its associated cryptographic keys, you will be able to trust their identity, and their mail need never go astray, wrongly filtered as spam.

Those without such SSL or PGP signatures will be strangers, albeit recognizable strangers if they use SPF records.

Have an Identity, be trusted

What’s the solution? Have an identity, use it, be trusted.

Sadly, some jurisdictions recognize such signed emails as legally binding; in such jurisdictions the use of signing keys is not recommended to the incompetent.

Get your signing keys certified by http://www.thawte.com/

Why haven’t I done this? Because it’s too complicated, I’m too busy, and I don’t trust my computer with my keys (!!), so I’ll have to get a smart-card from FSFE-UK.

Wednesday 7 March 2007

Collapsing on clusters

I’ve seen use of preceding-sibling to collapse on unique attributes, but how about collapsing clusters on unique attributes? For example:
a,a,a,b,a,b,b,c should collapse to a,b,a,b,c

Here’s some XSLT I came up with to detect which position a node is in its cluster, and what cluster number it is. NODE represents the nodes being collapsed, and VALUE/@value a child node/attribute that is being collapsed upon.

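Here we count the nodes, up to and including the current one, that are not immediately preceded by a node having the same value. Each such node starts a cluster, so the count is the number of the cluster that the current node belongs to.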


<xsl:variable name="cluster-no" select="count(preceding-sibling::NODE[
  not(VALUE/@value = preceding-sibling::NODE[1]/VALUE/@value) ]
 |current()[not(VALUE/@value = preceding-sibling::NODE[1]/VALUE/@value)]
)"/>


Here we count the nodes that precede the current node and have the same value, with no intervening node of a different value. This is the position in the current cluster, counting from zero (the first node of a cluster gives 0).


<xsl:variable name="position-in-cluster" select="count(preceding-sibling::NODE[
  VALUE/@value = current()/VALUE/@value
  and not(VALUE/@value != following-sibling::NODE[
    count(following-sibling::NODE[generate-id() = generate-id(current())]) &gt; 0 ]/VALUE/@value) ])"/>
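
As a rough sketch of how the two variables might be put to work, here they are dropped into a complete XSLT 1.0 stylesheet that performs the collapse. Only NODE and VALUE/@value come from the description above; the ROOT wrapper element, the text output method and the comma separator are my assumptions for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- ROOT is a hypothetical wrapper element holding the NODE siblings -->
  <xsl:template match="/ROOT">
    <xsl:apply-templates select="NODE"/>
  </xsl:template>

  <xsl:template match="NODE">
    <!-- Number of the cluster this node belongs to (1-based) -->
    <xsl:variable name="cluster-no" select="count(preceding-sibling::NODE[
      not(VALUE/@value = preceding-sibling::NODE[1]/VALUE/@value) ]
     |current()[not(VALUE/@value = preceding-sibling::NODE[1]/VALUE/@value)]
    )"/>

    <!-- Number of earlier nodes in the same cluster (0 for the first) -->
    <xsl:variable name="position-in-cluster" select="count(preceding-sibling::NODE[
      VALUE/@value = current()/VALUE/@value
      and not(VALUE/@value != following-sibling::NODE[
        count(following-sibling::NODE[generate-id() = generate-id(current())]) &gt; 0 ]/VALUE/@value) ])"/>

    <!-- Emit only the first node of each cluster -->
    <xsl:if test="$position-in-cluster = 0">
      <!-- Separator before every cluster except the first -->
      <xsl:if test="$cluster-no &gt; 1">,</xsl:if>
      <xsl:value-of select="VALUE/@value"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Given a ROOT element containing eight NODE children whose VALUE/@value attributes are a, a, a, b, a, b, b, c in that order, this should output a,b,a,b,c. If all you need is the collapse itself, testing not(VALUE/@value = preceding-sibling::NODE[1]/VALUE/@value) alone would be enough; the two variables earn their keep when you also need to know where in the sequence each node sits.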