UK file-sharers to be “cut-off” — Detection
So, how do you detect a file-sharer?
One way is to use what’s called “Deep Packet Inspection”. Some ISPs are quite fond of DPI, because it lets them loosely identify what kind of traffic some parts of somebody’s Internet connection appear to be. The key word there is “appear”. DPI isn’t conclusive, and has never (seriously) been claimed to be. It looks for patterns, and performs actions—such as “throttling” (slowing down)—based on those patterns. This in itself is pretty flawed, but lots of ISPs across the world, including in the UK, do it.
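To make the "patterns, not proof" point concrete, here is a minimal sketch of the kind of signature matching a DPI box might do. The one hard fact in it is that an unencrypted BitTorrent peer handshake begins with the byte 19 followed by the string "BitTorrent protocol"; the classify function and the three sample payloads are made up for illustration and are not taken from any real DPI product.

```python
# Toy "DPI" classifier: it labels a payload purely by its leading bytes.
# Hypothetical sketch; real equipment uses far larger signature sets.

# An unencrypted BitTorrent handshake starts with a length byte (19)
# followed by the protocol name.
BT_HANDSHAKE_PREFIX = b"\x13BitTorrent protocol"


def classify(payload: bytes) -> str:
    """Return a best guess at the traffic type, based only on a prefix match."""
    if payload.startswith(BT_HANDSHAKE_PREFIX):
        return "bittorrent"
    if payload.startswith((b"GET ", b"POST ", b"HEAD ")):
        return "http"
    return "unknown"


# A plain handshake: prefix + 8 reserved bytes + 20-byte info hash + 20-byte peer id.
plain = BT_HANDSHAKE_PREFIX + bytes(8) + bytes(20) + bytes(20)

# The same bytes XORed with a constant, standing in for an encrypted stream.
obfuscated = bytes(b ^ 0x5A for b in plain)

# Unrelated data that merely happens to begin with the same bytes.
lookalike = BT_HANDSHAKE_PREFIX + b" is how this chat message begins"

for name, payload in [("plain", plain), ("obfuscated", obfuscated), ("lookalike", lookalike)]:
    print(name, "->", classify(payload))
```

The plain handshake is spotted, the obfuscated one slips through as "unknown", and the lookalike is flagged even though no file-sharing is going on. That is what "appear" means here.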
ISPs aren’t so keen on being forced to use DPI equipment, though. Part of this is that they know it’s only good at spotting patterns, and part of it is that many ISPs don’t use it at all and would have to invest in a technology which they don’t want on their network. Fair enough, really.
Another problem with DPI is that it can’t distinguish between legitimate and illicit traffic. It has no idea that, say, Spotify’s peer-to-peer traffic is legitimate, unless Spotify’s developers keep DPI vendors well-informed about precisely how to distinguish Spotify’s peer-to-peer traffic from other kinds. This is a pain, but not the end of the world. The real problems arise when you have applications, such as BitTorrent, which are used for both perfectly legal and illicit purposes, sometimes simultaneously. LegalTorrents is a site specialising in redistributable content, and even a sizeable chunk (although still the minority) of torrents on the Pirate Bay have been perfectly legal. Plenty of operating systems (primarily Linux- and BSD-based) are distributed via BitTorrent, not to mention big applications like OpenOffice.org, and the updaters built into several games use it too. Because the application itself is no more “bad” than having the ability to view a web page is “bad”, you can’t reliably detect and block based on what the application is.
Web filters work on the basis of two sets of criteria: blacklists, and scanning content. The content-scanners are generally fairly simple affairs, and don’t need to be anything massively complex because they’re just looking for occurrences of words and phrases in a page of text (much as an e-mail spam filter does). This doesn’t work when the content you’re trying to filter is a small segment of a (possibly compressed, possibly encrypted) movie file. Nobody has the sort of computing power, and probably not even the algorithms, required to do that.
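As a rough illustration of why phrase-scanning doesn’t carry over, here is a toy content scanner run against a page of text and then against a compressed copy of the same page. The phrase list and the page_is_blocked function are invented for the example; zlib just stands in for "compressed, possibly encrypted" data.

```python
import zlib

# Hypothetical phrase list; a real filter would have thousands of entries.
BLOCKED_PHRASES = ("free movie download", "cracked software")


def page_is_blocked(body: bytes) -> bool:
    """Naive content scanner: look for blocked phrases in the decoded text."""
    text = body.decode("utf-8", errors="ignore").lower()
    return any(phrase in text for phrase in BLOCKED_PHRASES)


# A page of plain text that the scanner is designed to catch.
page = (
    b"<html><body>"
    + b"<p>Free movie download, click here!</p>" * 20
    + b"</body></html>"
)

# The same content once compressed, and a small slice of it, which is
# closer to what a single piece of a shared file looks like on the wire.
compressed = zlib.compress(page)
piece = compressed[8:20]

print("plain page blocked:   ", page_is_blocked(page))
print("compressed blocked:   ", page_is_blocked(compressed))
print("12-byte piece blocked:", page_is_blocked(piece))
```

The scanner only catches the first case. And that is with text, where there are at least words to look for; a fragment of a movie file has nothing comparable.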
A blacklist doesn’t work either, because of scale. You can’t blacklist the source of a torrent, because by nature the sources are peers. Although it’s easy to blacklist a network block when it comes to web hosts, it’s much harder in a peer-to-peer setting because everybody’s sharing with everybody else.
Fortunately, the proposals haven’t suggested using DPI. Instead, they tell us that rights-holders will themselves monitor public trackers. Leaving aside, for a moment, the fact that this will only drive the adoption of completely distributed trackerless BitTorrent, there’s also the small wrinkle that the information about peers can’t be relied upon. In the wake of the various legal cases brought by the RIAA in the US regarding file-sharing, researchers set out to find out how easy it would be to spoof the information that investigators would see (and rely upon). The answer? Very easy. The researchers, without a huge amount of difficulty, made it look like dumb network devices (such as printers) were hot-beds of illicit peer-to-peer activity. How long before these spoofing techniques make it into every BitTorrent client available?
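To see why peer lists make poor evidence, consider a toy tracker that records whatever address a peer reports about itself. The BitTorrent announce request does define an optional ip parameter; whether any particular tracker honours it varies, and the ToyTracker class, the placeholder info hash and the documentation-range addresses below are all assumptions made for this sketch.

```python
from urllib.parse import parse_qs, urlencode


class ToyTracker:
    """Hypothetical tracker that trusts a client-supplied 'ip' parameter."""

    def __init__(self):
        # Maps info_hash -> set of (ip, port) pairs seen in announces.
        self.swarms = {}

    def announce(self, query: str, source_ip: str) -> list:
        """Record the announcing peer and return the swarm's peer list."""
        params = parse_qs(query)
        info_hash = params["info_hash"][0]
        port = int(params["port"][0])
        # If the client names an address, believe it; otherwise fall back
        # to the address the request actually came from.
        ip = params.get("ip", [source_ip])[0]
        self.swarms.setdefault(info_hash, set()).add((ip, port))
        return sorted(self.swarms[info_hash])


tracker = ToyTracker()

# An honest announce: the tracker records the real source address.
honest = urlencode({"info_hash": "abc123", "port": 6881})
tracker.announce(honest, source_ip="198.51.100.7")

# A forged announce from the same machine, naming somebody's printer.
forged = urlencode({"info_hash": "abc123", "port": 6881, "ip": "192.0.2.50"})
peers = tracker.announce(forged, source_ip="198.51.100.7")

print(peers)
```

Anybody monitoring this tracker now sees 192.0.2.50 listed as a peer, even though that address never sent a single packet. An investigator who stops at the peer list has no way to tell the difference.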
The bottom line here is that you just can’t detect file-sharing going on with any real degree of certainty. What you can do is try to track down those who actually uploaded the torrent files in the first place, but again, this only works as long as trackerless torrents aren’t the norm, and is contingent upon the BitTorrent tracker being operated in a country which will play nicely with a request for information. If BitTorrent eventually proves too easy to detect reliably (be it the “seeds”, the uploaders, or the “leechers”, the downloaders, remembering that most will be both at some stage or another), people will use something else (there are plenty of potential options, just not yet a huge impetus to switch).