Introduction to Complexity: Scale-Free and Long-Tailed Degree Distributions Part 2


One example of the implications is the
fact that Google works so well. And that can be explained by the fact that the web
has a long-tailed degree distribution instead of a bell-curve degree
distribution. Google is able to rank the pages that
it returns for your query by giving preference to webpages that have many
inlinks. That is, many other webpages point to them. Google assumes that if a
webpage has many pages that point to it, then it’s more likely to be a good
response to your query. It’s more likely to be, say, authoritative. The reason this
works is because at every level of the network, you have some pages that have
many links to them, and many other pages that have few links to them. So, no matter
what your query, Google will be able to find pages that have relatively many
links to them, and thus can put them high up in the list. Whereas if the web were a
random network, then all of the webpages would have about the same number of
links, the average of the bell curve. And Google would have no way to discriminate
among them to decide how to rank the answers. Now, the web was not designed by
anyone. It built itself. It’s self-organized. And it just happened
to self-organize in this way, so that it has this kind of long-tailed degree-
distribution, and Google took advantage of that. This same notion explains some
of the success of other networks in nature. The fact that they have long-tailed
distributions is taken advantage of and becomes useful for organisms and social
systems as well. You can read about that in some of the
suggested readings on the course materials website. The obvious question is how does this
structure of scale-free networks arise? If the web wasn’t designed by any person
or committee of people to be that way, how did it become that way?
Well, there are multiple hypotheses. One prominent hypothesis is called
preferential attachment. It’s the idea that the rich get richer. That is, if
you’re coming in, and you have a new webpage, and you want to decide who to
link to, you’re more likely to link to a page that already has a lot of links to
it, just because that page is more likely to be found by you. Similarly, the next time somebody else
comes along and wants to link to a page, that page will be even more likely to be
linked to, because now you’ve linked to it, and so the number of links to it has
increased. And so on. So this is the idea: the rich get richer; the linked-to
get more linked to; the cited get more cited. That happens in science, at least.
And by cited, I mean c-i-t-e-d. That is, many people cite the papers that have
already been cited by many people. We’re going to be looking at a NetLogo model of
preferential attachment. And here’s how that model works: We start with two
linked nodes. At each time step, we create a new node. The new node is going
to randomly choose one of the existing nodes to link to. But it has bias: the
more links the existing node has, the more likely it is to be chosen. For
example, each one of these existing nodes has one link coming out of it. And so,
each has an equal chance here of being linked to by this node. So let’s say that
this node randomly chooses this other node to link to. Ok, at the next time
step, we have a new node, and it has a choice to make, but this node in the
middle here has two links coming out of it, whereas each of these other nodes has
one link, so each node with one link has only a 25% chance of being picked,
whereas the node with two links has a 50% chance of being picked. And here, this
node picked it. And this continues, so let’s see what that model looks like.
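The growth rule just described can be sketched in a few lines of Python. This is a rough sketch under my own naming (the function `grow_network` and its details are not part of the NetLogo model's code): start with two linked nodes, then each new node links to one existing node, chosen with probability proportional to its current degree.

```python
import random

def grow_network(num_steps, rng=random):
    """Preferential attachment: each new node links to one existing node,
    chosen with probability proportional to its current degree (the
    'rich get richer' bias)."""
    degree = [1, 1]          # two starting nodes, linked to each other
    edges = [(0, 1)]
    for _ in range(num_steps):
        # rng.choices picks index i with probability degree[i] / sum(degree)
        target = rng.choices(range(len(degree)), weights=degree)[0]
        new_node = len(degree)
        degree.append(1)     # the new node starts with its single new link
        degree[target] += 1  # the chosen node gains one link
        edges.append((new_node, target))
    return degree, edges
```

After one step the degrees are, say, [1, 2, 1], so on the next step the degree-2 node has a 2/4 = 50% chance of being chosen and each degree-1 node a 25% chance, matching the example just walked through.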
Here’s the interface for PreferentialAttachment.nlogo, which is
one of the models in the NetLogo models library, so you can get it there, or else
you can download it from our course materials page. So I’m going to ‘setup’,
and if I ‘go’ once, our third node comes in and links to one of the existing
nodes. ‘Go’ once again, again, again, again, again. Well, you’re going to see
that this is going to start to look kind of messy. In fact, if I keep going, I get
a big mess. But the model has a nice feature: you can turn ‘layout’ on, and
now if I do ‘setup’ and ‘go’, it actually creates a very nice layout for
the network, although it’s a little bit slower to run. So you can see this is
going, just making that biased choice among the existing nodes each time a new
node is created. You can see that there’s a big hub
forming here in the middle, and little hubs down here, and inside the little
hubs, there’s even littler hubs, and so on. And if we just let this run, it’s
going to turn into a very nice scale-free network. And you can see the degree
distribution here, which has the characteristic power law-looking shape,
and when it’s plotted on a log-log plot here, it’s making a straight line. And as
more and more nodes are added, these will look more and more like actual power
laws. So this model is more for looking at than for doing experiments
with, but I recommend downloading it from the course materials page and playing
with it. The interesting thing here is that such a very simple model of network
formation can produce networks that look very much like real world networks. So
there might be something to this hypothesis of how these kinds of networks
form.
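To see the long tail emerge numerically rather than visually, here is a self-contained sketch (again my own code, not the NetLogo model) that grows a network this way and tallies its degrees. In a bell-curve (random) network the degrees would cluster around the average; here most nodes keep degree 1 while a few hubs grow far larger.

```python
import random
from collections import Counter

def degree_sequence(num_nodes, rng):
    """Grow a preferential-attachment network and return each node's degree."""
    degree = [1, 1]                      # two linked starting nodes
    for _ in range(num_nodes - 2):
        # new node links to an existing node, chosen in proportion to degree
        target = rng.choices(range(len(degree)), weights=degree)[0]
        degree.append(1)
        degree[target] += 1
    return degree

degrees = degree_sequence(2000, random.Random(1))
counts = Counter(degrees)
# Most nodes end up with degree 1, while the biggest hub's degree is far
# above the average -- the long tail a random network would not have.
print("nodes with degree 1:", counts[1])
print("largest hub degree: ", max(degrees))
```

Plotting `counts` on a log-log scale, as the model's degree-distribution plot does, gives the roughly straight line characteristic of a power law.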



