BitTorrent is a file transfer protocol aimed at providing rapid distributed transfer of popular files.
Previous widely used file transfer protocols, such as ftp and http, transfer files in a linear fashion from a server to a client, starting with the first byte of a file and proceeding to the last. Robust file servers can send the same file to thousands of clients simultaneously, but servers which host very large, very popular files (such as CD images and movies) are soon swamped by requests, and either the file server or its connection to the network grinds to a halt. When a server is overwhelmed in this way, the clients each have only the first part of the file, so even if a group of clients get together, they cannot piece the file together from the parts they have.
Mirror systems, in which popular files are mirrored at many geographically separate locations, are effective, but propagation time to the mirrors is a serious issue, as is the need for considerable formal infrastructure and advance insight into which files are likely to be popular.
BitTorrent is a so-called peer-to-peer protocol (also written as peer2peer and p2p), in which each client is also a server and is called a "peer." Content files to be distributed via BitTorrent are split into pieces, and the details of these pieces are assembled into a torrent file (typically using the .torrent extension), which lists the size and checksum of each piece and the URL of a "tracker," a server which stores information about which peers are interested in which files. The computer with the first copy of the file then becomes the first peer. As peers start downloading the file they are said to join the "swarm" for that torrent.
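As a rough illustration of what a torrent file records, the sketch below splits some content into fixed-size pieces and notes the size and checksum of each. It assumes SHA-1 as the checksum (the hash used by the original BitTorrent specification). The function names and the dictionary layout are invented for illustration and are not the real .torrent encoding.

```python
import hashlib
from typing import Dict, List


def describe_pieces(data: bytes, piece_length: int) -> List[Dict]:
    """Split data into pieces and record each piece's size and SHA-1 checksum.

    Hypothetical helper: a real torrent file stores this information in a
    different (bencoded) layout.
    """
    pieces = []
    for offset in range(0, len(data), piece_length):
        piece = data[offset:offset + piece_length]
        pieces.append({
            "index": len(pieces),
            "size": len(piece),  # the final piece may be shorter
            "sha1": hashlib.sha1(piece).hexdigest(),
        })
    return pieces


def verify_piece(piece: bytes, expected_sha1: str) -> bool:
    """Check a downloaded piece against the checksum listed for it."""
    return hashlib.sha1(piece).hexdigest() == expected_sha1
```

A peer that receives a piece from another peer can call verify_piece with the checksum from the torrent file and discard the piece if the check fails, so it never has to trust the peer that sent it.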
To download a content file a peer:
- Downloads the torrent file
- Contacts the tracker and downloads the list of other peers interested in the file (i.e. those who are downloading or have already downloaded it)
- Contacts the peers to check which pieces of the file they have
- Downloads pieces from a number of peers, choosing the rarest pieces first
- Checks the checksums of downloaded pieces against the checksums in the torrent file
- Assembles the file(s) from the pieces
- Uploads pieces already downloaded to peers which request them
This last step, the uploading of pieces already obtained, is the crucial step, because it allows the system to keep working even when the peer with the original copy of the file becomes unavailable. Even the tracker becoming unavailable does not prevent peers downloading pieces, but it does temporarily prevent new peers joining the swarm.
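The "rarest pieces first" choice in the list above can be sketched as follows. This is a minimal illustration under simple assumptions: each peer is represented by a name and the set of piece numbers it holds, and ties are broken by piece number. Real clients differ in detail (for example, they usually randomize among equally rare pieces), and the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def rarest_first(
    needed: Set[int], peer_pieces: Dict[str, Set[int]]
) -> List[Tuple[int, List[str]]]:
    """Order the pieces we still need by how few peers hold them.

    Returns (piece number, peers holding it) pairs, rarest piece first.
    Pieces that no known peer holds are left out, since they cannot be
    requested from anyone yet.
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces & needed)

    obtainable = [p for p in needed if availability[p] > 0]
    # Sort by number of holders, then by piece number for a stable order.
    obtainable.sort(key=lambda p: (availability[p], p))

    return [
        (p, sorted(name for name, have in peer_pieces.items() if p in have))
        for p in obtainable
    ]
```

Fetching the rarest pieces first spreads scarce pieces through the swarm quickly, which is what lets the swarm survive the loss of the peer holding the original copy.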
The system works extremely well for very large, very popular files. For example, as I write there are 462 peers on the torrent for the Windows release of the latest OpenOffice version, and I can download from them at a speed limited only by my local network bandwidth (>1200 KB/s).
Problems with the Solution
BitTorrent is complex, much more complex than previous systems such as http and ftp. This makes user interface design challenging.
BitTorrent has considerable overhead: the torrent file needs to be downloaded and there is a good deal of communication overhead. These overheads apply even when a peer is downloading from a single other peer, which makes BitTorrent unattractive for small files or those likely to be downloaded only a few times.
By becoming the tool of choice for those trading movies and games of dubious legality, BitTorrent has managed to get banned and filtered in a number of places.
For infrequently downloaded files, BitTorrent works very poorly unless the initial peer is professionally hosted and maintained.
RSS and replacing mirrors
Mirror services (such as the JISC mirror service http://www.mirror.ac.uk) are a traditional method of easing software distribution to isolated sections of the network. Some BitTorrent peers now support polling RSS feeds for new .torrent files and automatically downloading them. This merging of peer-to-peer and hierarchical distribution appears to be most suitable both for bridging firewalls and other network anomalies and for peers who anticipate a release and wish to download it immediately. Unlike with previous software distribution methods, peers waiting for a release and joining the swarm immediately on release do not hinder the download.
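The RSS polling described above might look something like the sketch below, which picks out .torrent links a peer has not seen before from the text of a feed. It assumes an RSS 2.0 style feed in which each item carries the torrent URL either in an enclosure element or in its link. The function name is hypothetical, and fetching the feed and downloading the torrent files are left out.

```python
import xml.etree.ElementTree as ET
from typing import List, Set


def new_torrent_urls(feed_xml: str, seen: Set[str]) -> List[str]:
    """Return .torrent URLs in the feed that are not yet in `seen`.

    Newly found URLs are added to `seen`, so polling the same feed again
    returns only items published since the last poll.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None:
            url = enclosure.get("url")
        else:
            url = item.findtext("link")
        if url and url.endswith(".torrent") and url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh
```

A peer would call this each time it polls the feed, keeping the same seen set between polls, and start a download for every URL returned.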