Multicast and Network Neutrality

by Tim on May 9, 2006 · 30 comments

Robert X. Cringely has an interesting article about the future of digital content distribution and peer-to-peer networks. I think his big thesis–that the existing one-to-many, end-to-end model for distributing video content won’t scale–is right. But I think he’s missing a few things when he points to peer-to-peer technologies as the savior.

Here’s the technical problem: Right now, if ABC wants to deliver 20 million copies of Desperate Housewives over the Internet, it would have to transmit the same stream of bits 20 million times to its ISP. The ISP, in turn, might have to transmit 5 million copies to each of 4 peers. Those peers, in turn, might have to transmit a million copies to each of 5 of their own peers. And so on down the line, until each end user receives a single copy of the content. That’s wasteful: the same bits end up crossing the upstream links millions of times.

In a perfect world, ABC should only have to transmit one copy to its ISP, and the ISP, in turn, should only have to transmit one copy to each interested peer, and so on. Each Internet node would receive one copy and transmit several, until everyone who wants a copy is able to get one. Geeks call this multicast. It’s theoretically part of the TCP/IP protocol suite, but for a variety of technical reasons I don’t fully understand, it hasn’t proved feasible to implement multicast across the Internet as a whole.
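
To get a feel for the difference, here’s a back-of-the-envelope sketch in Python. The tree shape and fan-out numbers are invented purely for illustration; real network topologies are nothing like this tidy.

```python
# Back-of-the-envelope: how many copies must each tier send under unicast vs. a
# multicast-style distribution tree? The fan-out numbers are invented for illustration.
import math

VIEWERS = 20_000_000
FANOUTS = [4, 5, 10, 100, 1000]           # hypothetical branching at each tier
assert math.prod(FANOUTS) == VIEWERS      # the tree's leaves equal the audience

def unicast_origin_copies(viewers: int) -> int:
    """End-to-end unicast: the origin transmits one full copy per viewer."""
    return viewers

def multicast_tiers(fanouts):
    """Tree delivery: each node only sends one copy per downstream branch."""
    nodes, tiers = 1, []
    for fanout in fanouts:
        tiers.append((nodes, fanout))     # (nodes at this tier, copies each sends)
        nodes *= fanout
    return tiers

print(f"unicast: origin sends {unicast_origin_copies(VIEWERS):,} copies")
for depth, (nodes, per_node) in enumerate(multicast_tiers(FANOUTS)):
    print(f"multicast tier {depth}: {nodes:,} node(s) send {per_node} copies each")
```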

However, there are plenty of quasi-multicast technologies out there. One of the most important is Akamai’s EdgePlatform. It’s a network of 18,000 servers around the world that serve as local caches for distributing content. So when a company like Apple wants to distribute 20 million copies of a file, it doesn’t have to transmit it 20 million times. Instead, it transmits the content to Akamai’s servers (and presumably Akamai’s servers distribute it among themselves in a peer-to-peer fashion) and then users download the files from the Akamai server that’s topologically closest to them on the network.
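
As a very rough sketch of what “topologically closest” can mean in practice, a client might probe a few candidate edge servers and use whichever answers fastest. The hostnames below are made up, and real CDNs steer clients with DNS and internal network maps rather than ad hoc client-side probes like this.

```python
# Minimal sketch of "pick the closest edge cache": probe a few candidates and keep
# whichever connects fastest. Hostnames are hypothetical; real CDNs steer clients
# via DNS and internal network maps rather than client-side probing.
import socket
import time

CANDIDATE_EDGES = ["cache-east.example.net", "cache-west.example.net", "cache-eu.example.net"]

def connect_time(host: str, port: int = 80, timeout: float = 2.0):
    """Return the TCP connect time to host in seconds, or None if unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None

def pick_edge(hosts):
    timings = {host: connect_time(host) for host in hosts}
    reachable = {host: t for host, t in timings.items() if t is not None}
    return min(reachable, key=reachable.get) if reachable else None

if __name__ == "__main__":
    print("fastest responding edge:", pick_edge(CANDIDATE_EDGES))
```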


This sort of arrangement gives you many of the advantages of true multicast without having to solve the thorny technical problems raised by genuine multicast at the IP level. And it’s also not really a “peer-to-peer” solution: Akamai’s servers are commercial boxes owned by Akamai, which charges content companies for the use of the network.

This sort of technology is likely to be a major component of any wide-scale video broadcasting over the Internet. Indeed, the logical people to do this sort of local caching are broadband ISPs themselves. Comcast could set up a bunch of caching servers that receive content from the Internet and re-transmit it to its own customers. Not only could it likely charge content companies for the service; it would also save money on bandwidth, because traffic over its backbone links would drop.
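
Here’s a crude version of that back-of-the-envelope incentive, with invented numbers for file size, demand, transit cost, and cache hit rate:

```python
# Crude incentive math for an ISP-run cache. Every number here is an assumption.
EPISODE_GB      = 1.0       # size of one episode
REQUESTS        = 500_000   # subscribers on this ISP requesting it
CACHE_HIT_RATE  = 0.95      # share served from the local cache once it is warm
TRANSIT_COST_GB = 0.05      # hypothetical backbone/transit cost per GB, in dollars

without_cache_gb = REQUESTS * EPISODE_GB
with_cache_gb = REQUESTS * EPISODE_GB * (1 - CACHE_HIT_RATE) + EPISODE_GB  # misses + one cache fill

print(f"backbone transfer without cache: {without_cache_gb:>12,.0f} GB "
      f"(${without_cache_gb * TRANSIT_COST_GB:,.0f})")
print(f"backbone transfer with cache:    {with_cache_gb:>12,.0f} GB "
      f"(${with_cache_gb * TRANSIT_COST_GB:,.0f})")
```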

For the most part, peer-to-peer applications perform an equivalent function. But they’re unlikely to be as efficient or as reliable. Peer-to-peer applications generate a lot of extra traffic themselves, because content is downloaded to an end node and then immediately uploaded back out over that node’s access link. Although this is often bandwidth that wouldn’t have been used anyway, it’s clearly not the most efficient way of doing things. Moreover, although peer-to-peer networks do their best to find the peers that are closest to them, their knowledge of network topology can’t possibly be as good as the ISP’s. Hence, peer-to-peer networks will have much less ability to optimize the process to minimize the number of redundant packets transmitted.
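
Here’s a toy simulation of that last point, again with invented numbers: downloaders that pick sources at random mostly pull content across the ISP boundary, while a hypothetical topology-aware picker stays local whenever a local copy exists.

```python
# Toy model of the redundancy point: random peer choice pulls most content across the
# ISP boundary; a (hypothetical) topology-aware picker stays local when it can.
# All numbers are invented.
import random
from collections import defaultdict

random.seed(1)
N_PEERS, N_ISPS, SEED_FRACTION = 10_000, 20, 0.3

peers = [{"isp": random.randrange(N_ISPS), "has_file": random.random() < SEED_FRACTION}
         for _ in range(N_PEERS)]
seeders = [p for p in peers if p["has_file"]]
downloaders = [p for p in peers if not p["has_file"]]

seeders_by_isp = defaultdict(list)
for s in seeders:
    seeders_by_isp[s["isp"]].append(s)

def cross_isp_transfers(choose_source):
    """Count downloads whose chosen source sits on a different ISP's network."""
    return sum(choose_source(d)["isp"] != d["isp"] for d in downloaders)

def random_pick(downloader):
    return random.choice(seeders)

def topology_aware_pick(downloader):
    local = seeders_by_isp.get(downloader["isp"])
    return local[0] if local else random.choice(seeders)

print("cross-ISP transfers, random choice:        ", cross_isp_transfers(random_pick))
print("cross-ISP transfers, topology-aware choice:", cross_isp_transfers(topology_aware_pick))
```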

None of which is to say peer-to-peer applications are bad. They clearly have some advantages, most notably the fact that they don’t require any dedicated hardware. However, as the scale of digital distribution grows, it seems likely that they will be partially supplanted by more efficient and robust caching schemes. And if the ISPs are smart, they’ll do the caching themselves.

Which brings me to the network neutrality point: how would a network neutrality rule regard this sort of scheme? After all, Akamai already does precisely what NN advocates fret about: it lets big companies pay for their content to get to consumers faster. That seems to me indisputably a good thing in this case, but it’s not clear how the FCC would regard it when applying a NN rule. Would a network neutrality rule forbid Comcast from signing deals with content companies to cache their content locally for faster delivery? If so, is that a good thing or a bad thing?

Update: You should all read the comments by George below, who offers a compelling argument that peer-to-peer swarms will generally outperform centrally-managed caching schemes.

  • http://www.blindmindseye.com MikeT

    If you’re principled and rational, it’d be good because it would free up a lot of network resources for other users. But… I think that assumes too much of many NN supporters.

  • Greg

    Undeniably, it’s a good thing for local ISPs to cache frequently requested material for their users. It just makes sense, for both parties. And if an ISP wants to charge for preferential _caching_ on top of neutral service, that’s fine, too. The problem is if ISPs are allowed to _require_ content providers who are not their direct customers to pay a fee simply to cross the ISP’s network. That’s what the whole network neutrality debate is about–it’s not about whether users should have the option of paying more for better service, but about whether content providers should have to make separate agreements with everyone who might pass their packets, instead of the ISPs arranging the peering agreements and then passing the costs on to their _direct_ customers.

  • George

    Tim, you’ve got some erroneous perceptions of p2p, multicast, and Akamai here that I’d like to address.

    #1. multicast theoretically only works if all ‘consumers’ of the stream are consuming at roughly the same time (and even then, the technical hurdles are significant). This is because multicast doesn’t cache the bits for later use. It isn’t a good solution for users who want to view content on-demand.

    #2. Akamai’s edge-caching platform does indeed address the shortcomings of multicast, by caching data at each node. The inner workings of distribution among those nodes are not publicly known in detail, but your assumptions are probably not far off.

    #3. Here’s the big technical error: you claim that a p2p model is less efficient than Akamai. You are wrong for several reasons:

    a) Akamai has to physically place nodes, and the distribution of those nodes must match demand. This is problematic for several reasons. Suppose the content provider assumes that 10 million users in the US will download a copy of their content, and that these users will be more or less uniformly distributed throughout the country. But then, to their surprise, 8 million people in Arkansas try to download the show, and only 1 million in the rest of the country. The Arkansas residents will be starved for capacity (resulting in millions of failed downloads), while the rest of the country will be overserved with capacity. With a p2p distribution system, each consumer becomes a provider, and capacity automatically moves to where it is needed. This is commonly referred to as ‘swarming’, and is what happens with products like BitTorrent and RedSwoosh. Because there is a 1-to-1 relationship between consuming and providing content, capacity is only limited by total network capacity — you can’t beat that for performance! (A toy sketch of this appears below, after point c.)

    b) You claim that ISPs (and Akamai) have better knowledge of physical topologies, and can thus better place caching nodes in the topology. Yet physical topology is irrelevant. This is counterintuitive, but the only metric that matters in networking is current network performance. Physical proximity or channel capacity doesn’t matter, because performance (bps) is variable. Even if nodeA is physically closer to me and the connection to it, pipeA, has higher capacity, it does me no good to prefer it over further away nodeB and lower capacity pipeB if pipeA is currently congested and pipeB is not.

    c) Cost and efficiencies. This derives a bit from a) and b), but the reason Akamai will always be more expensive to run than BitTorrent (even if all bandwidth were 100% free) is that you have to have an army of Akamai installations and support staff to service those 18,000 nodes, and the staff has to be distributed around the globe. Even if your support army works for free, you still have to pay for electricity, physical facilities, and bandwidth. Spread your content with BitTorrent instead, and you can get rid of all that staff and all those physical installations.
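
    To put some toy numbers on point (a): pre-provisioned regional capacity strands demand when the forecast is wrong, while a swarm’s capacity grows with the downloaders themselves. Everything below is invented for illustration.

```python
# Toy numbers for point (a): fixed regional cache capacity vs. a swarm in which
# every finished download adds one more uploader. Everything here is invented.

def fixed_cache_served(demand, capacity):
    """Each region serves at most whatever capacity was provisioned there."""
    return sum(min(want, capacity[region]) for region, want in demand.items())

def swarm_served(total_demand, initial_seeds, rounds=30):
    """Each round, every node that already has the file uploads it to one new node."""
    have = initial_seeds
    for _ in range(rounds):
        have = min(total_demand, have * 2)
    return have

# The planner expected demand spread across the country; reality skewed to one region.
capacity = {"arkansas": 1_000_000, "rest_of_us": 9_000_000}
demand   = {"arkansas": 8_000_000, "rest_of_us": 1_000_000}

print("fixed caches served:", f"{fixed_cache_served(demand, capacity):,}",
      "of", f"{sum(demand.values()):,}")
print("swarm served:       ", f"{swarm_served(sum(demand.values()), initial_seeds=1_000):,}")
```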

    In short, I’d sum up centralized, hierarchical distribution strategies as analogous to centrally planned, hierarchical economies. The Akamai design is the design of a network communist, centrally planned by committee, politburo, or dictator — and no matter how well it is planned, it can never respond to changing market forces as well as a free market can. (BTW, that isn’t a criticism of Akamai as a capitalist business or even as an engineering design — Akamai continues to pull in wonderful revenues, and at the time it was designed, it was revolutionary. It’s just that Adam Smith hadn’t come along yet…)

  • http://www.techliberation.com/ Tim

    George: Thanks! Those are good points. I certainly don’t think that Akamai-style caching will completely supplant BitTorrent-style peer-to-peer distributions for the reasons you mention. In particular, peer-to-peer applications are clearly better at handling sudden, unexpected usage patterns. And it’s also obviously less expensive for the distributor since your users do all the distribution for you.

    However, I still think that peer-to-peer applications are relatively wasteful of bandwidth. It’s true that a peer-to-peer application doesn’t need to know the details of network topology to get its own packets from point A to point B. However, the fastest route for a particular P2P client isn’t necessarily the same as the most cost-efficient way of getting content.

    An important factor is the relative costs of backbone bandwidth and storage. If storage is cheap and backbone bandwidth is expensive, the ISP may find they can save money by caching more content locally.

    ISP-run caches have the advantage that they’re available 24/7. With a P2P scheme, the last person on my local network to download a particular file may not be on the network any more, requiring that it be re-downloaded from the backbone. ISP-run caches might also have a greater ability to pre-stock the cache with content during periods of low bandwidth demand. ABC might, for example, load ISP caches up with the new episode of Desperate Housewives early in the morning for release the following evening.

    Obviously this isn’t a binary issue. Both P2P and commercial caching will be important for the foreseeable future, and I may very well be overestimating the advantages of the latter. But I think it’s pretty likely that caching will be the superior strategy for at least a few applications, and if so, it’s important that the legal system doesn’t interfere with experimentation.

  • George

    Tim, re:costs of caching.

    I’m not against ISP caching. Certainly, for many types of traffic, it helps. But I maintain that caching data at the edge of the network a priori, as a scheme for distributing content with the goal of reducing costs, is a much less cost-effective solution than using adaptive p2p networks. If you aren’t convinced of this by a simple cost analysis of what you have to pay to get Akamai (or one of its competitors) to distribute your content vs. what you have to pay to use BitTorrent, RedSwoosh, or one of their competitors, I’m not sure that anything I say will convince you. But I thought it might be useful to describe the technical reasons why the cost structure for centralized, hierarchical network distribution channels will always be much greater than that of democratic, on-demand, cooperative swarming distribution.

    One of these technicalities is cache size and cache replacement policies. As long as the only thing we are talking about is episodes of Desperate Housewives, it is easy for ISPs or Akamai to provision enough storage resources at the nodes on the edge of the network to cache this content. But we know that users will want more than just Desperate Housewives. In fact, we know that traffic in today’s p2p filesharing networks follows the long-tail distribution, and that the total amount of bytes consumed by the long tail dwarfs that of the blockbuster spike. So you can set up a cache of finite size at the edge, and you can cache the most popular content, but you can’t possibly ever hope to have enough storage to contain the majority of traffic that will be generated by the long tail.
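
    Here’s a toy LRU-cache simulation of that point. The catalog size, cache size, and popularity skew are invented, and requests are counted rather than weighted by bytes, but the shape of the result is the point:

```python
# Toy LRU cache under a long-tail (Zipf-like) request pattern. Catalog size, cache
# size, and the popularity exponent are invented for illustration.
import random
from collections import OrderedDict

random.seed(7)
CATALOG, CACHE_SLOTS, REQUESTS, ZIPF_S = 100_000, 500, 200_000, 0.8

# Title k is requested with probability proportional to 1 / k^s.
weights = [1 / (k ** ZIPF_S) for k in range(1, CATALOG + 1)]

cache, hits = OrderedDict(), 0
for item in random.choices(range(CATALOG), weights=weights, k=REQUESTS):
    if item in cache:
        hits += 1
        cache.move_to_end(item)        # mark as recently used
    else:
        cache[item] = True
        if len(cache) > CACHE_SLOTS:
            cache.popitem(last=False)  # evict the least recently used title

print(f"hit rate caching only {CACHE_SLOTS} of {CATALOG:,} titles: {hits / REQUESTS:.1%}")
```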

    So if you are an ISP, what do you do for all that long-tail content? You hope that your users are using something like BitTorrent, which will prefer to download from peers that are near because those peers will, by their very nature, be able to provide the best service. And, lucky for you, the peers that are closest will be on your own network, so you won’t be incurring costs going out to the backbone (just as if you had a cache on your own network).

    Yes, if no one on your local network is sharing a copy of a particular file, your users will have to go out to the backbone (at least once). But that’s the absolute best you can do. Period. Even if you have a cache, if none of your users have accessed that particular file recently, it will get expunged from the cache soon anyway (because no matter how big, your cache is finite, and because the long tail causes a lot of expunging).

    How about this: pretend your ISP cache is a distributed cache. Now, think of the nodes of that distributed cache residing not on your own servers in your data center, but out on your customers’ PCs. It’s still a local cache. Only it’s bigger, smarter, cheaper, and more efficient than your centralized ISP cache could ever be. And that is why, sooner or later, the ‘push data to edge node caches’ model is doomed.
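
    A minimal sketch of that idea: hash each content ID onto a ring of customer nodes so that any peer can compute who ought to hold a given file. The node names are hypothetical, and a real system (a DHT) would add replication, churn handling, and incentives to leave machines running.

```python
# Sketch of treating customers' PCs as one big distributed cache: hash each content ID
# onto a ring of nodes so any peer can compute who should hold a given file. Node names
# are hypothetical; a real deployment (a DHT) would add replication and churn handling.
import bisect
import hashlib

def ring_hash(value: str) -> int:
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)

class ConsistentRing:
    def __init__(self, nodes):
        self._ring = sorted((ring_hash(n), n) for n in nodes)
        self._keys = [k for k, _ in self._ring]

    def node_for(self, content_id: str) -> str:
        """Walk clockwise from the content's hash to the first node on the ring."""
        idx = bisect.bisect(self._keys, ring_hash(content_id)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentRing(f"customer-pc-{i}" for i in range(50))
for title in ["desperate_housewives_s02e21", "obscure_documentary_1987"]:
    print(title, "->", ring.node_for(title))
```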

    I hope I’m not derailing the discussion, and I’m not sure if any of this affects your core point about net neutrality, but I really feel strongly about p2p as an incredibly robust architecture (the fact that it is most commonly associated with pirating media is a shame).

  • http://www.techliberation.com/ Tim

    George,

    You’ve obviously thought about this question more than I have, as you make a very good argument. I’ll have to give this some more thought, but right now I find your argument pretty convincing. Thanks for commenting.

  • http://tommykeswick.com/blog/ Tommy

    And thanks Tim for noting the comments with an update to the post. I would not have caught the excellent discussion due to using a feed reader.

  • George

    And thanks Tim for being such a reasonable, open-minded person, and for putting truth ahead of your own desires to be right. It is refreshing and inspiring, and makes participating at TLF a lot of fun (even when it’s over a rather minor technical issue that is largely irrelevant to the main discussion :) ).

  • watcher

    “The problem is if ISPs are allowed to _require_ content providers who are not their direct customers to pay a fee simply to cross the ISP’s network.”

    You put the stress on require, I’d put it on _allowed to_. Just because they have the right to charge to cross their network doesn’t mean they will. I know… I know… I’ve read all the doom and gloom reports too. But a far more likely model is the one where sites are given the option to pay for priority service. If they pay, they are given priority bandwidth and their data travels with increased reliability. If they don’t pay, their data travels with the same speed and reliability it travels with now. Sure, they look slower because they don’t pay, but it’s only a relative slowness.

    The real issue is whether the ISPs will be able to keep up with the demand for priority bandwidth, and what they plan to do if they can’t.

  • Net Chick

    The internet is a huge playground — can’t Google and Verizon just play nice? Why is there even a need for this kind of regulation? Just because the problem MAY arise doesn’t mean it will. Why put everybody through all this commotion? The internet playground is big enough for all the kids. Keep competition alive.

  • Stevens33

    You make a great argument for tiered access. I have rarely heard it put so well. To make the greater bandwidth available will cost money, money that should be covered by content providers who want the higher speeds.

  • Ajay

    Tim, you were right. George doesn’t know what he’s talking about. There has never been, and probably never will be, a show that 8 million people in Arkansas, and only in Arkansas, will want (especially since there are only 2.7 million people in Arkansas). p2p is extremely inefficient because it uses the smallest and most expensive pipes in the system, the upstream links from our homes. The fact that p2p is even being used is a cheap economic hack, possible only because ISPs don’t charge separately for upload bandwidth. p2p works best for very popular content, but George admits that caching is a better solution for that. And p2p will never work for the long tail because there is, by definition, very little demand for long-tail content. Long-tail content doesn’t need a new solution, as it is already well served now: rent a colocated server and start pumping it out. George is just a p2p zealot who hasn’t thought out his arguments in much detail.
