\documentclass[dvips,12pt]{book}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{dropping}
\usepackage[english,swedish]{babel}
\usepackage[sort&compress]{natbib}
\usepackage{chapterbib}
\bibpunct{(}{)}{;}{a}{,}{,}
\bibliographystyle{phd}
\usepackage{url}
\urlstyle{rm}
\usepackage{color}
\usepackage{graphicx}
\usepackage[tex4ht,%
breaklinks=true,% %%% breaks lines, but links are very small
colorlinks=true,% % links are colored
citecolor=black,% % color of cite links
pagecolor=black,% % color of page links
linkcolor=black,% % color of hyperref links
menucolor=black,%
urlcolor=black,%
pdfborder={0 0 0}%
]{hyperref}%
\usepackage[vflt]{floatflt}
\usepackage{amssymb,latexsym}
\usepackage{colortbl}
\usepackage{pstricks,pst-node,pst-text,pst-3d,pst-grad}
\definecolor{gradstart0}{rgb}{0.247,0,0}
\definecolor{gradstart}{rgb}{0.749,0,0}
\definecolor{gradmid}{rgb}{1.0,0.5,0}
\definecolor{gradend}{rgb}{1,1,0.51}
\definecolor{light-grey}{rgb}{0.5,0.5,0.5}
\definecolor{light-blue}{rgb}{0.8,0.85,1}
\definecolor{light-orange}{rgb}{1,1.0,0.5}
\setcounter{secnumdepth}{3}
\setcounter{tocdepth}{3}
\definecolor{darkorange}{rgb}{0.5,0.0,0.0}
\definecolor{orange}{rgb}{0.831,0.165,0.0}
\definecolor{lightorange}{rgb}{1.0,0.5,0.0}
\newcounter{papern}
\setcounter{papern}{0}
\renewcommand*{\thepapern}{\Roman{papern}}
\newenvironment{paper}{\refstepcounter{papern}\addcontentsline{toc}{subsection}{\textit{Summary of Paper \Roman{papern}}}\subsection*{Summary of Paper \Roman{papern}}}{\par}%
\begin{document}
\selectlanguage{english}
\chapter*{Introduction}
What wrote ``To be, or not to be: that is the question?'' The brain that put words in Hamlet's mouth consisted of roughly 100 billion neurons, the same number as in your or my brain. But when did we write our last sonnets? This introduction to complex networks is about how the whole sometimes is greater (and sometimes is less) than the sum of its parts. The essence of network science is that the whole is the sum of its parts plus the interactions between them.
This introduction is derived from the first chapters of my Ph.D.\ thesis, which I defended in September 2006. Read the full story at \url{http://www.diva-portal.org/umu/abstract.xsql?dbid=840}.
\section*{Why networks?}
Complex networks are neither magnetic nor charged, and they do not emit light. They are not made of
any exotic matter, they are not radioactive, and you will never hear: ``Complex networks offer the world a secure energy supply!'' So why am I, as a physicist, interested in modeling complex networks?
Networks make it possible to characterize the complex systems of our world,
in the same way as a map describes the surrounding landscape.
A network is a map of interactions, for example of communication in a social system.
We are \textit{Homo loquens}, the speaking man \citep{fry}, and communication is fundamental to our society.
If we want to understand the complexity in different systems in the world,
we must first understand their basic interaction patterns.
These networks are often neither regular lattices,
nor are all units connected randomly---the interaction patterns are complex.
\emph{Complex}, because a \emph{complex system} is made up of a large number of components, or \emph{agents},
interacting in such a way that their collective behavior is not a simple combination of their individual behavior \citep{newman:sfn}. Craig Reynolds has expressed it strikingly as ``A flock is not a big bird'' \citep{reynolds}.
Moreover, the interactions in a complex system often go beyond the system itself, making it adaptive to an ever-changing environment.
On the other hand, a \emph{simple system} is a system with a limited set of interacting units whose behavior can be described by laws, for example a pendulum or a bouncing ball \citep{intro-review}. Further, a \emph{complicated system} is a system with a large set of components, each with an exact role, that together act under absolute laws, for example billiards or cars.
When I started by asking \emph{what} instead of \emph{who} wrote Hamlet's famous question, I did not intend to confuse you about who actually wrote Shakespeare's plays. Instead I wanted to emphasize that he was not isolated. Shakespeare's network went beyond the 100 billion neurons connected by an even larger number of physical connections inside his brain. It connected, virtually, to cultural, historical, and social entities like myths, books, and individuals---which in their turn had their own connections. Hamlet would probably have used a different vocabulary if Shakespeare's \emph{network} had been different.
Cultural, sociological, economic, political, biological, and ecological systems all evolve under the influence of numerous components.
The challenge is: how simple can we make the rules of the individuals, constrained by interactions on a network, to reproduce the observed collective behavior in the real world? In particular, how should this network of interactions be constructed?
I now turn again to the analogy with maps, using a passage from \emph{Sylvie and Bruno Concluded} \citep{carroll}.
\begin{quotation}\label{carroll-quote}
\noindent
{\color{light-grey}``What a useful thing a pocket-map is!'' I remarked.
``That's another thing we've learned from your Nation,'' said Mein Herr, ``map-making. But
we've carried it much further than you. What do you consider the largest map that would be
really useful?'' ``About six inches to the mile.'' ``Only six inches!'' exclaimed Mein Herr. ``We very soon got to six yards to the mile. Then
we tried a hundred yards to the mile. And then came the grandest idea of all! We actually
made a map of the country, on the scale of a mile to the mile!'' ``Have you used it much?'' I enquired.
``It has never been spread out, yet,'' said Mein Herr: ``the farmers objected: they said it
would cover the whole country, and shut out the sunlight! So we now use the country
itself, as its own map, and I assure you it does nearly as well.''}
\end{quotation}
\section*{Specific aims}
Obviously, to be useful, a network model must be a simplified version of the pattern of interactions in the complex system it represents, in the same way as a good map is a simplified version of the landscape it represents.
The challenge is to create such a map of aspects other than the land we stand upon.
It must be a dynamic map, because neither solar flares nor communicating people are static systems. This calls for a map that is a full model with interacting players over an ever-changing network---a dynamic network model. By characterizing the world in this way we will not only be able to answer specific questions, but also focus on asking general questions. For example, on the way to this dynamic network model, it has been necessary to use, and sometimes develop, structural and statistical measures that are informative not only for a particular network, but for a spectrum of networks from biology to the man-made Internet.
To uncover the interplay between function and structure and the strong connection to the formation process in networks, I have found it necessary to study networks from a perspective of information transfer. In other words, I will treat networks as information machines. By information I here mean any knowledge that has been gathered or received by communication. The connection between information, communication, and networks is central here.
I am in particular interested in how the network structure affects the communication conditions over the network, and vice versa, how the ongoing communication affects the network structure. For example, who communicates with whom and the social structure of a society are strongly integrated. The social network reflects the access to information that different parts of the system experience, and social mobility may be seen as a quest for better information access. Read more about this in my \href{http://www.diva-portal.org/umu/abstract.xsql?dbid=840}{thesis}. Here follows an introduction to network science based on the first chapters of the thesis.
% \section*{Outline}
%
% There are already more than a handful good reviews on network science, and I would like to recommend the reviews by \citet{latora-rew,newman2003,newman-pow}, the books by \citet{newman-pap,bornholdt}, and the Ph.D.\ thesis by \citet{holme-thesis}.
%
% Instead of a full review, this thesis is first of all a journey that connects the articles that I have written together with my colleagues. The purpose is to present the ideas in a simple way, emphasize their interconnectedness and to create a synergetic picture. Several models and concepts are also visualized on \url{http://cmol.nbi.dk/models}.
%
% The field of network science can be roughly categorized into three areas: empirical studies, network generating models, and dynamics on networks. This is also how the thesis is organized.
% It starts with a short overview to present the field of network science (\autoref{chapter-networks}).
% The overview, which also contains some important concepts and definitions that make the reading of the remaining thesis easier, is followed by the most common network models (\autoref{chapter-models}). The central part of this thesis are \hyperref[chapter-navhorizon]{Chapters \ref*{chapter-navhorizon}} and \ref{chapter-comhorizon}, where the information/communication approach to network science is most evident. The last chapter in the first part of the thesis (\autoref{chapter-summary}) summarizes briefly the main findings of the thesis.
%
% The second part of the thesis is the eleven papers the thesis is based
% on, listed on \hyperref[chapter-papers]{page
% \pageref*{chapter-papers}} and hereafter referred to as Paper I,
% Paper II, and so on.
\chapter*{Complex networks}\label{chapter-networks}
Physics is a powerful science. Nevertheless, the deterministic Newtonian science lacks words to describe complex behavior. Physics has been the tool that breaks things down into laws. This reductionist viewpoint has proved to be very successful, but it is now time to reassemble things into a whole again \citep{wilson}. There is much to be learned on this journey, and a new science has to be developed.
Here I take the approach that complex systems are best understood through simple models. Simple, because it should be possible to understand the model in itself. Moreover, a model that contains everything that is known about the system can, by some fiddling with the parameters, in principle predict anything. And a model that can predict anything predicts nothing \citep{popper}.
\section*{Background\label{section-background}}
Networks are everywhere, have always been, and will always be---social networks, gene networks, cellular networks, ecological networks, computer networks, phone-call networks, airline networks, power-line networks---why this buzz now? The short answer is the Internet.
A longer answer is that mathematicians have had, since Leonhard Euler solved the so-called \emph{Königsberg bridge problem}\footnote{Seven bridges connected the four different land masses in Königsberg, and the question was, ``Does there exist any single path that crosses all seven bridges exactly once each?''} in 1736, the knowledge to abstract away details from a problem and reduce it to a \emph{graph}\footnote{The difference between the mathematician's \emph{graph} and the \emph{networks} discussed here is subtle, but where a graph is mainly a mathematical object, a network often refers to a decentralized, naturally evolving system \citep{newman-pap}.}---a set of nodes and links. The strength of this mathematical object is of course that the nodes are not limited to representing land masses, nor the links to representing bridges. The list of networks above could have been made much longer, since so many systems can be simplified, with their complexity maintained, into a network.
It matters to abstract a system into its underlying network when the interacting units are \emph{unique}, like humans, proteins, and web pages, but not when they are \emph{interchangeable}, like atoms, electrons, and spins\footnote{Spin is the intrinsic angular momentum in elementary particles.}. It is therefore not surprising that sociologists, and not physicists, were the first to apply the abstract mathematical objects of graphs to concrete problems. This happened around the 1950s, and one challenging question was ``How small is the world?'' \citep{milgram,kochen1978,kochen}. At about the same time the two Hungarian mathematicians \mbox{Paul} \mbox{Erd{\H o}s} and \mbox{Alfréd} \mbox{Rényi} started to develop a statistical theory of dynamic networks \citep{erdos}, originally proposed by \citet{rapoport}.
Why did the field of network research wait another 50 years to take off? The main problem the sociologists faced was the difficulty of creating large, reliable networks. In particular, \citet{kochen1978} asked many fundamental questions\label{kochenquestions}, still important today, but could not answer them adequately because of the poor experimental data. For example, to estimate how long the shortest chain of friends is between two people chosen at random, they first had to estimate how many friends an individual has. The two problems they faced were: what is a friend, and why are people so poor at estimating their number of friends? Try to estimate your number of friends and you will realize that it is not straightforward.
Kochen and de Sola Pool instead turned to the mathematical models by \citet{rapoport} and \citet{erdos}, and estimated the average separation to be three. \citet{milgram} estimated, by tracing letters sent through chains of acquaintances, the average distance to be six. Obviously there was a problem here. The experiments gave an average separation twice as large as the models predicted. Were the models wrong, or the experiments unsound?
One exception to the small and unreliable experimental data sets was presented in the pioneering work by \citet{price1} on the network of citations between scientific articles. He found that the distribution of citations, corresponding to links in network language, was not peaked at an average value as in the mathematical models of random networks \citep{erdos}. Instead most of the articles had only a few citations and a few articles had many citations: the distribution of citations per article followed a power law. A decade later he presented the model called \emph{cumulative advantage}, based on an idea by \citet{simon}, that could explain this observation \citep{price2}.\footnote{This model is very similar to the preferential attachment model discussed in the next chapter.} However, the work by Price considered one particular network, and the concepts did not cross over to other sciences but remained isolated.
In the late 1990s, at the same time as the Internet started to become mature and usable\footnote{Notably, Ines Uusmann, the former Swedish minister of transport and communications, said in 1996 that ``The Internet is a fad that may just pass by.''}, all the necessary ingredients were suddenly at hand \citep{barabasi:smcn}: the emergence of large databases on the topology of various real networks, increased computing power, the noticeable breakdown of boundaries between disciplines, and the new holistic viewpoint of trying to understand the system as a whole. Again it became apparent that it is impossible to overestimate the importance of cross-fertilization between theory and experiments in science.
The breakthrough originated in the physics community because
the tools of statistical mechanics are particularly well suited to analyzing the patterns of interactions in networks. In statistical mechanics, physicists investigate how a system behaves on the macroscopic level given the properties of its constituents, the \emph{agents}, on the microscopic level. The new idea was that behind every complex system there is an underlying network with a nonrandom topology \citep{barabasi}. It became clear that this pattern of interactions, which forms the network, plays a fundamental role in understanding the system \citep{newman:sfn}.
\subsection*{Real-world networks}
The first network that was large enough for reliable statistical measurements, and at the same time accessible, turned out to be the World Wide Web (WWW).\footnote{The network of the WWW consists of web pages linked by the hyperlinks we click on to go from page to page.} With the purpose of testing a real-world network against the random network model \citep{erdos}, \citet{barabasi-web} analyzed a subdomain of the WWW consisting of more than \mbox{300,000} web pages. As in the study by \citet{price1}, the distribution of the number of links per node did not follow the distribution predicted by the random network model. Again it was power-law distributed. However, this study did not remain isolated, because a number of experiments followed shortly. \citet{broder} repeated the experiment on a much larger section of the WWW and confirmed the results. \citet{faloutsos99powerlaw} and \citet{satorras} showed that the distribution of physical connections in the Internet followed roughly the same distribution.
Were the discovered power-law tails of the distributions of links ``much ado about nothing''? Yes and no. Yes, because the discovery has led to an unreasonable hunt for power laws in empirical studies and to the construction of ad hoc models that produce power laws. No, because the few nodes with a very high number of links, the hubs of the network, have a dramatic effect on the properties of the network.\footnote{\citet{vespignani} showed that networks with a power-law distribution of links per node may not have an epidemic threshold, so a disease will always spread.} Moreover, the broad distribution was discovered in other networks than technological ones as well. \citet{ws:smallworld} studied movie actor collaborations, \citet{fell} the metabolic network of the bacterium \textit{E.\ coli}, \citet{Jeong2000} the metabolic networks of more than 40 organisms, \citet{amaral} airline networks, \citet{liljeros} sexual contacts, and we have for example studied city street networks \citep{pap:city}. The list can be made much longer, but the general result was that the networks neither looked like the random network model predicted, nor were they regular, and what they really look like is what the remainder of this text is about.
\section*{Network topology}
I will begin this section with two examples of real-world networks. This will not only visualize the pattern of links connecting nodes, the \emph{network topology} or \emph{network structure}, in two different systems. It will also illustrate the difference between a cumbersome but detailed way and an efficient but inexact way to extract the underlying network of a system.
\subsection*{Two network examples}
The first network in \autoref{terrornet} is by \citet{krebs} and describes the September 11 terrorist network. With the aim to ``uncover network patterns that would reveal Al Qaeda's preferred methods of stealth organization,'' Krebs gathered information from major newspapers such as the \emph{New York Times} and the \emph{Wall Street Journal}. Obviously this method is time consuming and inevitably incomplete.
The second network in \autoref{googlenet} consists of a subset of the 100 most influential people in 2006 selected by \emph{Time Magazine}. In this case I have let \emph{Google} create the network by asking how many hits two names get together compared with what one of the names gets alone. If the ratio is above a threshold, 5\% in this case, I create a directed link between the two names pointing to the name I compare with. This is a very fast method, but of course gives only a proxy of the real network.
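The recipe above can be sketched in a few lines of Python. The function \texttt{hits} is a hypothetical stand-in for a search-engine query counter (any real interface would differ), and the threshold and link direction follow the description above:

```python
def cooccurrence_network(names, hits, threshold=0.05):
    """Proxy network from co-occurrence counts.

    hits(query) -> number of hits; a hypothetical stand-in for a
    real search-engine query.  A directed link (b, a) is created
    when the joint hits for a and b exceed threshold times the
    hits for a alone, i.e. the link points to the name compared with.
    """
    links = []
    for a in names:
        for b in names:
            if a == b:
                continue
            joint = hits('"%s" "%s"' % (a, b))
            if joint / hits('"%s"' % a) > threshold:
                links.append((b, a))
    return links
```

With 54 names this costs on the order of $54^2$ queries, which is what makes the method fast compared with reading newspapers.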
\begin{figure}[tbp]
\leavevmode
\centering
\includegraphics[width=1.2\textwidth]{./figs/terroreps.eps}
%\input{./figs/terrornet.pstex_t}
\caption{\label{terrornet}Terrorist network of the September 11 hijackers and their associates \citep{krebs}. In total 62 terrorists, with the terrorists on each plane identified by separate colors.}
\end{figure}
\begin{figure}[tbp]
\leavevmode
\centering
\includegraphics[width=1.2\textwidth]{./figs/topeps.eps}
%\input{./figs/top54.pstex_t}
\caption{\label{googlenet}A subset (54) of the 100 most influential people in the world selected by \emph{Time Magazine}, connected by Google in June 2006. A strong connection is represented by a dark link.}
\end{figure}
I have already stated that real-world networks are different from the random network model by \citet{erdos}. This difference refers to the way the links are organized between the nodes---the network topology. It is for example easy to see that the links in the two networks in \autoref{terrornet} and \autoref{googlenet} are not randomly distributed, but how much, and in what way, are they not? Before answering these questions, we need a benchmark that enables comparison between networks through proper null models. However, I have already used too many concepts that deserve to be defined rigorously before I use them, so it is time to take a step back.
\subsection*{Nodes and links}
Technically speaking, a mathematician uses the term \emph{graph} to represent a structure with a set of objects called \emph{vertices} connected by links called \emph{edges}. If the edges have directions (object A links to B, but B does not automatically link to A), the graph is directed. If the edges are also associated with weights, which could for example represent the distances between objects, the mathematician uses the term \emph{network}. However, although \emph{network} and not \emph{graph} will be used from now on, this text is not about directed, weighted graphs. Instead, undirected and unweighted graphs are almost exclusively the focus of this text. This contradiction originates from the loose language outside mathematics and the tradition of using the term network for something that represents a real-world evolving system upon which distributed dynamic processes might exist \citep{newman-pap}. I will also use the terms \emph{nodes} and \emph{links} instead of vertices and edges.
The simplest concepts beyond the nodes and links, still belonging to the local properties of a network,
are the \emph{degree} and the \emph{clustering}. They will be explained in the following sections together with the \emph{shortest path}. Degree, clustering, and shortest path were three major concepts in focus in the early studies of complex networks, not least because of their relevance to the fundamental questions raised by \citet{kochen1978}.
\subsection*{Degree distribution}
The \emph{degree} $k_v$ is the number of links incident to node $v$ and is consequently equal to the number of nearest neighbors of $v$. In the physics literature the degree of a node is also known as the node's connectivity.
There is a spread in the node degree, described by the \emph{degree distribution},
which is characterized by the \emph{distribution function} $P(k)$.
$P(k)$ gives the probability that a randomly chosen node in the network has exactly $k$ links, or equivalently $k$ nearest neighbors. The degree distribution plays a central role in \autoref{chapter-models}. Power-law degree distributions have already been mentioned, and refer to distributions of the form $P(k)\propto k^{-\gamma}$. Of course, a real network does not exactly follow a given parametrization of a distribution, but if it roughly follows $P(k)\propto k^{-\gamma}$ it will be called a broad degree distribution in this text. The important feature of a network with a broad degree distribution is that most nodes have very few links, but a few nodes have very many links. The nodes with very many links are called \emph{hubs}.
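As a minimal illustration of the definition, $P(k)$ can be computed from an undirected network given as a plain edge list; a Python sketch:

```python
from collections import Counter

def degree_distribution(edges):
    """P(k) for an undirected network given as a list of links."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    # P(k): fraction of nodes with exactly k links
    return {k: count / n for k, count in Counter(degree.values()).items()}
```

For a star of one hub and three leaves this gives $P(3)=1/4$ and $P(1)=3/4$, the caricature of a hub-dominated distribution.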
\subsection*{Degree correlations}
\begin{floatingfigure}{0.9\columnwidth}
\leavevmode
\centering
\includegraphics[width=0.7\columnwidth]{./figs/randomrewireeps.eps}
%\input{./figs/randomrewire.pstex_t}
\caption{\label{randomrewire}The elementary step in the random rewiring introduced by \citet{maslov:sstpn}.
The links $e_{uv}$ and $e_{xy}$ are chosen at random and rewired to become $e_{uy}$ and $e_{vx}$ provided that none of these links already existed.}
\end{floatingfigure}
\citet{maslov:sstpn} developed a method to test for correlations in the degree of the nodes.\label{degree-degree-corr}
The probability $P(k_0,k_1)$ that two nodes of degree $k_0$ and $k_1$ are connected via a link is calculated and compared with the probability $P_r(k_0,k_1)$ in a randomized version of the same network with preserved degree sequence.
The idea is a reinvention of the conditional uniform test by \citet{katz}, presented in a network context by \citet{snijders}.
In the \emph{randomized counterpart} of the original network, the \emph{null network}, all nodes have the same degree as in the original network, but the connections are randomized.
The random counterpart is created by first randomly selecting a pair of links, like $e_{uv}$ and $e_{xy}$ in \autoref{randomrewire}.
The two links are then rewired in such a way that $u$ becomes connected to $y$ and $v$ to $x$, or vice versa, provided that none of these links already exists.
The step is iterated until all information of the original network is lost.
If the above algorithm is repeated, a set of randomized counterparts makes average expectation values and standard-deviation measurements possible for any property of the network. The randomized networks will hereafter be called MS\--ran\-dom\-ized networks (from Maslov and Sneppen) to avoid confusion with the random networks, already mentioned but more thoroughly presented in \autoref{chapter-models}.
The ratio of a property between the original and MS-ran\-dom\-ized versions uncovers deviations from random properties.
The null-network approach is extremely powerful for separating nonrandom features from features that merely follow from a particular degree distribution, and it will be used over and over here.
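The elementary rewiring step translates into a short algorithm. Below is a minimal Python sketch, assuming the network is given as a list of undirected links; the number of successful swaps is left as a parameter rather than a formal criterion for when all information about the original network is lost:

```python
import random

def ms_randomize(edges, n_swaps, rng=random):
    """Degree-preserving randomization (Maslov-Sneppen rewiring).

    edges: list of undirected links as 2-tuples of node labels.
    Two links e_uv and e_xy are repeatedly chosen at random and
    rewired to e_uy and e_xv, provided the four nodes are distinct
    and neither new link already exists.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = 0
    while done < n_swaps:
        i, j = rng.sample(range(len(edges)), 2)
        (u, v), (x, y) = edges[i], edges[j]
        if len({u, v, x, y}) < 4:
            continue  # would create a self-link or a multi-link
        if frozenset((u, y)) in present or frozenset((x, v)) in present:
            continue  # one of the new links already exists
        present -= {frozenset((u, v)), frozenset((x, y))}
        present |= {frozenset((u, y)), frozenset((x, v))}
        edges[i], edges[j] = (u, y), (x, v)
        done += 1
    return edges
```

Since every accepted step replaces one link at each of four nodes with another link, the degree of every node is conserved by construction.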
% If the network to be investigated is \emph{connected} (any node can be reached from any other node via walks along the links), the above algorithm must be adjusted to give unbiased results.
% A constraint is added to the local rewiring in Figure \ref{randomrewire} that aborts the rewiring step if the new network is disconnected.
% Two new links, $e_{u'v'}$ and $e_{x'y'}$, are chosen and the elementary step is repeated.\\
\begin{figure}[tph]
\leavevmode
\centering
\includegraphics[width=1.2\textwidth]{./figs/degmountain.eps}
\caption{\label{fig:degscape}A degree landscape with mountains and valleys, with the altitude of a node proportional to its degree.}
\end{figure}
\noindent In \citet{pap:degmountain} we generalized the degree-organizational view \citep{Pastor-Satorras}
of real-world networks with broad degree distributions \citep{barabasi:scaling}
into a landscape analogy with mountains (high-degree nodes) and valleys (low-degree nodes) (see \autoref{fig:degscape}).
For example, with the presented viewpoint, correlated degrees between adjacent nodes
correspond to smooth landscapes (social networks) \citep{newman2002},
hierarchical networks to one-mountain landscapes (the Internet) \citep{trusina-hierarchy},
and networks with anti-correlated degrees without hierarchical features to
rough landscapes with several mountains \citep{newman2003b} (see \autoref{fig:terrordegscape}).
\begin{figure}[tph]
\leavevmode
\centering
\begin{tabular}{ccc}
\includegraphics[width=0.4\textwidth]{./figs/terror_landscape_bw.eps} & \includegraphics[width=0.4\textwidth]{./figs/terrorrandom_landscape_bw.eps} & \includegraphics[width=0.4\textwidth]{./figs/terrorrandomer_landscape_bw.eps}\\
\end{tabular}
\caption{\label{fig:terrordegscape}The terrorist network in \autoref{terrornet}, here as degree landscapes. The left landscape corresponds to the original network. In the middle it is randomized with fixed degree distribution, and to the right it is completely reshuffled without any constraints except for the number of nodes and links.
The important difference between the left network and the middle network is that the latter essentially contains only one mountain. All high-degree nodes have disappeared in the completely reshuffled network to the right. This is clear in the topological maps (bottom), where the mountain is washed out for the right network.
}
\end{figure}
To quantify the topology, we measured the widths of the mountains and the separation between different mountains.
We also generated ridged landscapes to model networks organized
under constraints imposed by the space the networks are embedded in,
associated with spatial localization or, in molecular networks, with functional localization.
\subsection*{\label{subsection-shortestpath}Shortest path}
The length of all links is here defined to be 1.
The \emph{distance} $l_{uv}=l_{vu}$ between two nodes $u$ and $v$ in the network is consequently defined to be the number of links traced on the \emph{shortest path} between the nodes.
A \emph{connected component} is a set of nodes that all can be reached from each other,
disconnected from other components of the network.
If the network only contains one connected component, the network is \emph{connected}.
When this is true, it is straightforward to define the \emph{average path length} $l$, as the average over all shortest paths
\begin{equation}
\label{avel}
l = \frac{2}{N(N-1)}\sum_{u < v}l_{uv}.
\end{equation}
The node that on average has the shortest path length to all other nodes in the network can naturally be defined as the \emph{center node}.
If the network is disconnected the \emph{average reciprocal distance} $l^{-1}$, also called the \emph{efficiency}, can be used instead of the average path length \citep{latora-eff}
\begin{equation}
\label{averl}
l^{-1} = \frac{2}{N(N-1)}\sum_{u < v}\frac{1}{l_{uv}}.
\end{equation}
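Both the average path length and the efficiency follow from a breadth-first search started at every node; a minimal Python sketch, assuming the network is stored as an adjacency dictionary:

```python
from collections import deque

def distances_from(adj, s):
    """Breadth-first search: shortest-path lengths from s, all links of length 1."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def average_path_length(adj):
    """l for a connected network; each unordered pair is visited twice."""
    n = len(adj)
    total = sum(d for s in adj for d in distances_from(adj, s).values())
    return total / (n * (n - 1))

def efficiency(adj):
    """Average reciprocal distance; unreachable pairs contribute zero."""
    n = len(adj)
    total = sum(1 / d for s in adj
                for v, d in distances_from(adj, s).items() if v != s)
    return total / (n * (n - 1))
```

For the three-node chain a--b--c this gives $l = 4/3$ and $l^{-1} = 5/6$, and for a disconnected network only the efficiency remains finite.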
A measure related to the shortest path is the \emph{betweenness}.
The betweenness of a node is the number of times the node lies on the shortest path between other pairs of nodes.
The quantity was introduced to investigate the participation rates of nodes
in networks with the shortest paths as proxies for communication channels \citep{freeman}.
An alternative center node in this context is the one with the highest betweenness.
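One standard way to count, for every node, the shortest paths passing through it is the accumulation scheme of Brandes; here is a Python sketch for unweighted networks stored as adjacency dictionaries, in which each unordered pair is counted twice, once per direction:

```python
from collections import deque

def betweenness(adj):
    """Shortest-path betweenness via breadth-first search and accumulation."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {v: -1 for v in adj}
        dist[s] = 0
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []                    # nodes in non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate pair dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

On the chain a--b--c the middle node collects a betweenness of 2 (the pair a,c in both directions) while the end nodes collect 0.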
\subsection*{Clustering}
The \emph{clustering coefficient} $C$ is the probability that two nearest neighbors of a node are also nearest neighbors of one another.
In the network, such a relation forms a \emph{triangle} of links.
Accordingly, a high clustering coefficient reflects an increased number of triangles in the network,
and hints at a dense topology on the local network level.
A common definition of the clustering coefficient is given by \citet{barrat}:
\begin{equation}\label{cluster_newman}
C=3\,\frac{N_{\triangle}}{N_{\sqcup}},
\end{equation}
where $N_{\triangle}$ is the number of \emph{triangles} in the network and $N_{\sqcup}$ is the number of \emph{connected triples} of nodes in the network.
Triangles are trios of nodes with every node connected to both of the others.
Connected triples are trios of nodes with at least one connected to both of the others.
The factor $3$ compensates for the fact that each triangle corresponds to three connected triples, and it constrains $C$ to lie in the range between $0$ and $1$.
In a social network, where nodes are persons and links symbolize friendship, $C$ corresponds to the probability that two of one's friends also are friends with one another. Whereas the shortest path is a global property of the network, clustering is a local property.
A drawback of the definition of the clustering coefficient above is that its value depends on global properties of the network, in particular the degree distribution.
To overcome that problem we use the null-network approach: we count the number of triangles in the network and compare with the number of triangles in its random counterparts. This avoids the direct effect of the degree distribution on the clustering, and we use this method in all presented papers.
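Counting triangles and connected triples, as in the definition above, can be sketched in a few lines (plain Python; the adjacency representation and example graph are my own):

```python
from itertools import combinations

def clustering(adj):
    """Global clustering coefficient C = 3 * N_triangle / N_triple.
    adj maps node -> set of neighbors (undirected, no self-links)."""
    triangles = 0   # incremented once per center node, so this already
    triples = 0     # equals 3 times the number of distinct triangles
    for v in adj:
        # every pair of v's neighbors is a connected triple centered at v
        for a, b in combinations(sorted(adj[v]), 2):
            triples += 1
            if b in adj[a]:
                triangles += 1
    return triangles / triples if triples else 0.0

# a triangle 1-2-3 with one pendant node 4 attached to node 3
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
C = clustering(adj)
```

For the example, one triangle sits on five connected triples, so $C = 3 \cdot 1/5 = 0.6$.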
\section*{Beyond simple measures}
There are plenty of other measures invented to reveal the structure of real-world networks, far too many to mention here. However, just by looking at the networks in \autoref{terrornet} and \autoref{googlenet} it is clear that they both have a higher-order structure beyond the broad distributions of links and the many triangles. It looks like the terrorist network could easily be divided into three groups, and the network of influential people could be divided into two groups. Looking more closely at the names, it is clear that political leaders form one group and people from showbiz form the other, and that Oprah Winfrey falls somewhere in between.
\subsection*{Community structure}
No one has contributed more than \citet{newman-comrew} to the detection of groups in networks, or \emph{community structures} as they are often called. For example, \citet{girvan_newman} presented a method based on the betweenness. They showed that it is possible to partition the network by successively removing the link with the highest betweenness until the network falls apart into two disconnected components. This brute-force method can then be repeated on the components formed.
Later, \citet{newman-com} introduced a measure of the community structure under the name \emph{modularity}. It resembles the process we perform when we look at the networks in \autoref{terrornet} and \autoref{googlenet} to identify the groups. A meaningful division of a network is not only one in which there are few links between the groups, but one in which there are fewer than expected. They formulated this mathematically and showed that it gives rise to meaningful partitions of networks \citep{newman-fast}. \citet{newman-compnas} presented a very efficient matrix-based method for maximizing the modularity over possible divisions of a network.
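As a hedged sketch of the idea of ``fewer links than expected'', the modularity can be written $Q=\sum_c (e_{cc}-a_c^2)$, where $e_{cc}$ is the fraction of links inside group $c$ and $a_c$ the fraction of link ends attached to group $c$ (the form follows \citet{newman-com}; the data structures and the example split are mine):

```python
def modularity(adj, community):
    """Modularity Q = sum_c (e_cc - a_c^2).
    adj maps node -> set of neighbors; community maps node -> group label."""
    m2 = sum(len(nbrs) for nbrs in adj.values())   # twice the number of links
    q = 0.0
    for c in set(community.values()):
        # ordered within-group link ends (each internal link counted twice)
        inside = sum(1 for u in adj for v in adj[u]
                     if community[u] == c and community[v] == c)
        ends = sum(len(adj[u]) for u in adj if community[u] == c)
        q += inside / m2 - (ends / m2) ** 2
    return q

# two triangles joined by a single link, split into the two obvious groups
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
       4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
community = {1: 'A', 2: 'A', 3: 'A', 4: 'B', 5: 'B', 6: 'B'}
Q = modularity(adj, community)
```

The obvious split scores $Q = 5/14 \approx 0.36$, while putting every node in one group gives $Q = 0$.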
Another way to look at the problem is to study the formation process and ask what dynamic features the community structure brings. This is indirectly the topic of the two final chapters of my thesis. The formation process has also been studied by, for example, \citet{gronlund-sec}, and the effect of the community structure on the spread of fads by \citet{gronlund-soc}. Along these lines, \citet{boguna} showed that the epidemic threshold can be nonzero even in networks with a power-law degree distribution if the network is modular or has high clustering.
Before going into more detail here, and to be able to connect the
network topology with its intrinsic function and the underlying
evolutionary process, the next chapter will be about network models.
\chapter*{Network models}\label{chapter-models}
Constructing a network model is a way to understand the formation process of the network, and thereby also shed light on its function. A network model is nothing but a set of rules that, in a static or dynamic way, in the end specifies what connects to what. A network model can also create generic classes of networks, which can be used as playgrounds for dynamic models on networks, for example to thoroughly examine vaccination strategies on different network topologies.
All network models are originally motivated by empirical studies. Observed topological features, such as short path lengths, high clustering, and nonrandom degree distributions, have been followed by models that reproduce the desired attributes in an attempt to uncover their dynamic origin \citep{ws:smallworld,barabasi:scaling,klemm}. We learned from the quotation of \emph{Sylvie and Bruno Concluded} by Lewis Carroll that a good model is a simple model, and details are therefore by necessity left out. This is the essence of the different classes of network models presented in this chapter.
\section*{Simple networks}
A lattice is the simplest possible interaction pattern (\autoref{ER-graph}(a)).
The extent to which it has been used in statistical physics, to model dynamic systems, cannot be overestimated.
In fact, many systems can be accurately modeled by having their parts interacting on a lattice layout. For example, magnets are accurately modeled by interactions on a lattice in the Ising model \citep{ising}, simply because the lattice is a good representation of the real interaction pattern. A lattice is also a good choice if the scope is to understand
the intrinsic features of a model. The results can in some cases be derived analytically, since the effect of the interaction pattern is the simplest possible. However, a lattice is not a good network model when the effect of the interaction pattern itself plays an important role, since it is a bad representation of typical interactions in the real world \citep{ws:smallworld}.
\section*{Random networks}
\begin{figure}[t]
\includegraphics[width=1.2\columnwidth]{./figs/modelnetworkseps.eps}
%\input{./figs/modelnetworks.pstex_t}
\caption{\label{ER-graph}Four different network models, all designed to mimic real-world networks.
(a) A 2-dimensional lattice.
(b) $n=24$ nodes and $m=60$ links corresponding to $p=0.14$ in the random network model by \mbox{Erd{\H o}s} and \mbox{Rényi}. The average degree is consequently $\langle k \rangle= 4$.
(c) The small-world model by Watts and Strogatz; 24 nodes, each connected to the first and second nearest neighbor in the one-dimensional ring. 4 links are rewired, which corresponds to a probability of rewiring $p=0.08$.
(d) A scale-free network with degree distribution $P(k) \propto k^{-2.2}$.
}
\end{figure}
Random networks, or ER-networks after \citet{erdos}, briefly discussed in \autoref{chapter-networks} and illustrated in \autoref{ER-graph}(b), should not be confused with the MS-randomized networks in \autoref{randomrewire}. ER-networks play a central role in the study of complex networks because the underlying mechanisms of organization in nature are sometimes random \citep{barabasi:smcn}. The advantage of the ER-model presented below is that, thanks to its simplicity, many properties can be analyzed analytically.
\begin{list}{}{}
\item (i) The total number, $n$, of nodes is fixed;
\item (ii) The probability that two arbitrary nodes are connected is $p$.
\end{list}
The average number of links is consequently
\begin{eqnarray}
\langle m \rangle = pn(n-1)/2.
\end{eqnarray}
The degree distribution is binomial
\begin{eqnarray}
P(k) = {n-1 \choose k}p^k(1-p)^{n-1-k},
\end{eqnarray}
since for a given node the probability of a specific set of $k$ links is $p^k$, the probability that the remaining $n-1-k$ possible links are absent is $(1-p)^{n-1-k}$, and there are ${n-1 \choose k}$ ways to choose the $k$ end nodes. For large $n$, $P(k)$ takes the form
\begin{eqnarray}\label{pois}
P(k) = e^{-\langle k\rangle}\langle k\rangle^k/k!,
\end{eqnarray}
which is the narrow Poisson distribution, peaked at the average degree
\begin{eqnarray}
\langle k\rangle = p(n-1) \approx p n.
\end{eqnarray}
Random networks have been extensively studied and one conclusion is that for $\langle k\rangle \gtrsim \ln(n)$ almost every network is connected and the average shortest path is \citep{Mendes:eon,barabasi:smcn}
\begin{eqnarray}\label{l_rand}
l_{rand} \sim \ln n / \ln \langle k \rangle.
\end{eqnarray}
The clustering coefficient is
\begin{eqnarray}\label{C_rand}
C_{rand} = p \approx \frac{\langle k\rangle}{n},
\end{eqnarray}
since the probability that two of a node's neighbors are connected is $p$.
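The two rules of the ER-model translate directly into code; a sketch with illustrative parameter choices of my own:

```python
import random

def erdos_renyi(n, p, seed=1):
    """ER-network: n nodes are fixed (rule i) and every one of the
    n(n-1)/2 node pairs is linked with probability p (rule ii)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

# <k> = p(n-1), so n = 1000 and p = 0.004 should give <k> close to 4
adj = erdos_renyi(1000, 0.004)
mean_k = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
```

With these parameters the measured mean degree comes out close to the predicted $\langle k \rangle = p(n-1) \approx 4$.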
To study how the features of a network deviate from those of an ER-network,
\emph{random ER-counterparts} of the network can be generated in a similar way as the MS-random\-ized networks: a link is chosen randomly together with two unique nodes that become the new endpoints of the rewired link.
This procedure washes away all nonrandom properties, including those associated with the degree distribution.
However, we seldom use this strategy since it has the same disadvantage as the definition of the clustering coefficient in \hyperref[cluster_newman]{Equation (\ref{cluster_newman})}, in that it disregards the extensive effects the degree distribution itself has on many properties.
The ER-networks share the short average path length, \hyperref[l_rand]{Equation (\ref{l_rand})}, with networks in nature,
but for large networks the clustering coefficient, \hyperref[C_rand]{Equation (\ref{C_rand})}, is negligible.
This is an unrealistic property, since networks in nature are characterized by a high clustering coefficient.
% \begin{table*}[!hbtp]
% \caption{\label{clustertable}Average path length and clustering coefficient for some networks in nature.
% $n$ is the number of nodes, $\left< k \right>$ the average degree, $l$ the average path length, $l_{rand}$ the average path length in the ER model for the same $n$ and $\left< k \right>$, $C$ is the clustering coefficient Equation(\ref{cluster_newman}) and $C_{rand}$ the clustering coefficient in the ER model for the same $n$ and $\left< k \right>$}
% \begin{tabular}{lccccccr}
% {\footnotesize Network} & {\footnotesize $n$} & {\footnotesize $\left< k \right>$} & {\footnotesize $l$} & {\footnotesize $l_{rand}$} & {\footnotesize $C$} & {\footnotesize $C_{rand} $} & {\footnotesize Reference} \\
% \hline
% \rowcolor{light-orange}
% {\footnotesize WWW (site level)} & {\footnotesize 153157 } & {\footnotesize 35.2 } & {\footnotesize 3.1 } & {\footnotesize 3.35 } & {\footnotesize 0.11 } & {\footnotesize $2.3\;10^{-4}$ } & {\scriptsize \citet{adamic99small}} \\
% {\footnotesize Internet (1999) } & {\footnotesize 5287 } & {\footnotesize 3.8 } & {\footnotesize 3.7 } & {\footnotesize 6.2 } & {\footnotesize 0.21 } & {\footnotesize $1\;10^{-3}$ } & {\scriptsize \citet{satorras}} \\
% \rowcolor{light-orange}
% {\footnotesize \emph{C. elegans} (neural) } & {\footnotesize 282 } & {\footnotesize 14 } & {\footnotesize 2.65 } & {\footnotesize 2.25 } & {\footnotesize 0.28 } & {\footnotesize $5\;10^{-2}$ } & {\scriptsize \citet{ws:smallworld}} \\
% {\footnotesize Power grid (US) } & {\footnotesize 4941 } & {\footnotesize 2.67 } & {\footnotesize 18.7 } & {\footnotesize 12.4 } & {\footnotesize 0.08 } & {\footnotesize $5\;10^{-3}$ } & {\scriptsize \citet{ws:smallworld}} \\
% \hline
% \end{tabular}
% \end{table*}
\section*{\label{section:small-world}Small-world networks}
The concept of a small world dates from the famous experiment by \citet{milgram}.
The social psychologist Stanley Milgram sent letters to a few hundred randomly selected individuals in Nebraska and Kansas, USA, with the instruction to the receiver to reach one of two target persons in the Boston area \citep{milgram,kleinberg-smallworld}.
The recipients were told to send the letter to a person they knew on a first-name basis, an acquaintance, who they thought was more likely to know the target than they were themselves.
Subsequent recipients followed the same procedure.
The information given about the target was their name, address, and occupation.
The letters were traced by asking each participant to send Milgram a postcard.
The result was that the average length of the resulting acquaintance chains was about six.
This is popularly known as the principle of six degrees of separation and was the first evidence that the world is indeed small \citep{watts-six}.\footnote{Although this was the first real evidence that the world is small, the concept was not new. Frigyes Karinthy published \emph{Chains} in 1929, in which he argued that everyone on earth is at most five acquaintances away from anyone else \citep{newman-pap}.}
Random networks have the small world property, due to the typical node--node distance \mbox{$l \sim \ln n / \ln \langle k \rangle$}.
However, the clustering coefficient $C=\langle k\rangle /{n}$ is much smaller than for most real-world networks \citep{newman:sfn}.
One of the first models exhibiting both a high clustering coefficient and the small-world effect was the small-world model of \citet{ws:smallworld}, also known as the WS-model. At the same time, \citet{ball} presented a very similar model in the context of the spread of epidemics.
In the WS-model, the network is created from a one-dimensional lattice with periodic boundary conditions (see \autoref{ER-graph}(c)).
Every node is connected not only to its nearest neighbors, but also to at least its next-nearest neighbors in the one-dimensional lattice.
In this way, a node has four or more nearest neighbors instead of only two, which would result in a vanishing clustering coefficient.
Then, one by one, every link is rewired with probability $p$ in such a way that long-range connections appear. In other words, many links connect locally and a few links are used for long-range connections, as in \autoref{ER-graph}(c).
The model is therefore a way to go from lattice models to random networks by adjusting the parameter $p$.
For large $p$ the WS-model is the same as the random network. But even for small probabilities of rewiring, where the local properties are almost unchanged, the average shortest path length is of the order of that for random networks.
The short-cut links might represent acquaintances one has in other states or countries, explaining the short acquaintance chains in the experiment by Milgram.
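The construction of the WS-model can be sketched as follows (plain Python; the parameter values echo \autoref{ER-graph}(c), and rejecting self-links and double links is my own handling of a detail left open by the model):

```python
import random

def watts_strogatz(n=24, p=0.08, seed=2):
    """WS-model sketch: a ring where every node links to its first and
    second neighbors, then each link is rewired with probability p."""
    rng = random.Random(seed)
    links = [tuple(sorted((i, (i + s) % n))) for i in range(n) for s in (1, 2)]
    for idx, (a, b) in enumerate(links):
        if rng.random() < p:
            # keep end a, move end b to a random node; reject self-links
            # and links that already exist
            new = tuple(sorted((a, rng.randrange(n))))
            if new[0] != new[1] and new not in links:
                links[idx] = new
    return links

links = watts_strogatz()
```

With $p=0$ the ring lattice is recovered; with $p=1$ the result resembles a random network, just as in the original model.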
The WS-model was one of the first models that could simultaneously explain high clustering and small-world effects. However, the discovery of broad degree distributions in real-world networks ignited a burst of new models to explain this feature \citep{faloutsos99powerlaw,barabasi:scaling,caldarelli,goh}.
\section*{Scale-free networks}
%\begin{floatingfigure}[v]{0.45\columnwidth}
%\leavevmode
%\centering
%\includegraphics[width=0.4\columnwidth]{./figs/falinternet.ps}
%\caption{\label{falinternet}The degree distribution of the Internet approximately follows a power law, $P(k) \sim k^{-2.2}$. Modified Figure from \citet{faloutsos99powerlaw}, degree distribution covering 98\% of the internet in December 1998.\vspace{1.5cm}}
%\end{floatingfigure}
% The absence of data on large real networks in nature has restricted the possibility to validate or reject network models.
% Random networks have therefore for such a long time been the only objects of interest.
% However, a few years ago there was a dramatic breakthrough, driven by the computerization and the acquisition of fresh data.
Today information about different networks in nature is becoming increasingly available \citep{barabasi:mf}.
The small-world effect and high clustering in real-world networks, long assumed to exist, have now been validated. Much more unexpected was the recent discovery that the degree distributions of at least several large (growing) networks in nature follow a power law \citep{faloutsos99powerlaw,barabasi:scaling} (\autoref{ER-graph}(d)). As mentioned before, the power-law distribution has been reported for the Internet \citep{faloutsos99powerlaw}, the WWW \citep{albert}, some molecular networks \citep{Jeong2000}, as well as for the size distribution of industrial companies \citep{zajdenweber} and, much earlier, for citation networks \citep{price1}. However, the \emph{Zipf law} has been known for a long time in a nonnetwork context: for example, word frequencies, city sizes, and income distributions follow a power law when ranked \citep{zipf}.
The power-law distribution of degrees $k$ is given by
\begin{eqnarray}
P(k) \sim k^{-\gamma},
\end{eqnarray}
where $\gamma$ typically lies between 2 and 3 in real-world networks \citep{Mendes:eon}.
\citet{barabasi:scaling} called such networks \emph{scale free}, since they are free of a characteristic scale---they are scale invariant. In other words, the average degree $\langle k \rangle$ is an insignificant quantity in a power-law distribution. In contrast, a node in a network with an exponential or Poisson degree distribution typically has degree $k \sim \langle k \rangle$.
The main feature of scale-free networks is that each node has a statistically significant probability of having a very high degree compared with the average degree $\langle k \rangle$ of the network.
These highly connected nodes dominate the topology of the network by forming hubs, a feature absent in random and small-world networks (see \autoref{ER-graph}).
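For intuition, a heavy-tailed degree sequence of this kind can be sampled with a standard inverse-transform trick (a sketch; the exponent $\gamma = 2.2$, the lower cutoff, and the truncation to integers are my own choices):

```python
import random

def powerlaw_degrees(n, gamma=2.2, kmin=1, seed=3):
    """Sample n degrees with P(k) ~ k^-gamma, using the continuous
    inverse transform k = kmin * (1 - u)^(-1/(gamma - 1))."""
    rng = random.Random(seed)
    return [int(kmin * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)))
            for _ in range(n)]

degrees = powerlaw_degrees(10000)
```

Unlike a Poisson distribution peaked at $\langle k \rangle$, most sampled degrees are small while a few exceed the mean by orders of magnitude.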
\subsection*{The Barabási-Albert Model}
\citet{barabasi:scaling} argued that the scale-free nature of these networks originated from two generic features of many real-world networks:
First, most real networks are open systems that \emph{grow} by the addition of nodes.
Second, the nodes are not linked randomly. There is a \emph{preferential attachment}, which favors attachment to nodes with a high degree.
This manifests itself in observations like ``the rich get richer'' and that being popular is attractive.
As was mentioned in \autoref{chapter-networks}, the idea was originally modeled in a nonnetwork version by \citet{simon} and in a network version by \citet{price2} to explain the power-law degree distributions of citation networks \citep{price1}.
However, I will here present the Barabási-Albert (BA)-model \citep{barabasi:smcn}, which contains three elements:
\begin{list}{}{}
\item \emph{Initial condition:} The network consists of a small number $n_0$ of nodes and no links.
\item \emph{Growth:} At every time step, a new node with $m < n_0$ links is added, linking to $m$ different nodes already present in the system.
\item \emph{Preferential attachment:} The probability $P$ that the new node will be connected to node $i$ depends linearly on the degree $k_i$: $P(k_i) = k_i/\sum_j k_j$.
\end{list}
The power-law exponent $\gamma$ is 3 in this model, independent of $m$ \citep{barabasi:scaling}.
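A sketch of the three elements in plain Python (the initial clique of $m+1$ nodes is my own assumption, since preferential attachment is undefined when every degree is zero):

```python
import random

def barabasi_albert(n, m, seed=4):
    """BA-model sketch: growth plus linear preferential attachment."""
    rng = random.Random(seed)
    n0 = m + 1
    ends = []   # each link contributes its two endpoints, so a node with
                # degree k appears k times; uniform sampling from this list
                # is therefore automatically degree proportional
    for i in range(n0):
        for j in range(i + 1, n0):
            ends += [i, j]          # initial clique (my assumption)
    for new in range(n0, n):
        targets = set()
        while len(targets) < m:     # m distinct, degree-proportional targets
            targets.add(rng.choice(ends))
        for t in targets:
            ends += [new, t]
    return ends

ends = barabasi_albert(1000, 2)
```

Storing each link as a pair of endpoints keeps the attachment rule linear in $k_i$ without any explicit bookkeeping of degrees.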
% The impact of the BA-model on the complex-networks community has been dominating to the extent that it has been
% almost damaging and must therefore be criticized.
The continued attempts to find better models are driven by the
discovery of new topological properties of real-world networks.
For example, the BA-model introduces correlations among the nodes \citep{gronlund-corr} and other unnatural features, such as a vanishing clustering coefficient as $n \rightarrow \infty$ \citep{barabasi:smcn}. The fact that the power-law degree distribution only arises for linear attachment \citep{redner} questions the model's generality. Another unappealing feature of the BA-model is that the networks are in some sense dead, as the model is only a slight generalization of Simon's nonnetwork model in a network context. This means that if the model is a good description of the evolution of networks in nature, then the question ``Who is connected to whom?'' loses its value, since only the degree of the nodes counts.
Also, many networks in nature with a broad degree distribution show deviations from a pure power-law, typically an exponential cutoff at high degrees
\begin{eqnarray}
P(k) \sim k^{-\gamma}\phi(k/\xi),
\end{eqnarray}
where $\phi(k/\xi)$ is the cutoff at some scale $\xi$. In the context of the growing BA-model, this phenomenon can be explained if an aging of nodes is introduced \citep{barabasi:smcn}.
The effect is that old nodes, added early in the network-evolution process, lose some of their attractive power.
This limits the ability of nodes to evolve a very high degree.
Despite my criticism, I do not question the profound impact the preferential-attachment model has had on the network community. In particular, since the BA-model was introduced, attention has shifted to growing networks to capture the features of real-world networks.
The BA-model belongs, together with the ER-model and the WS-model, to the first generation of simple models.
However, local and global optimization models inspired by evolutionary processes have been developed in the field of biology \citep{enquist}. Local optimization models of nongrowing networks have, on the other hand, scarcely been investigated. The merging model presented below belongs to this class, and in my \href{http://www.diva-portal.org/umu/abstract.xsql?dbid=840}{thesis} I present a social-communication approach to the optimization problem.
\subsection*{Merging}
Information transfer is the fundamental
task of many networks in nature.
In these networks, some of the nodes receive signals
from outside and then transfer these signals to other nodes,
which then take appropriate action.
Information networks include for example neural networks with their synaptic rewiring
\citep{cohen}, molecular networks evolved to modulate the
protein production in living cells
\citep{Jeong2000,maslov:sstpn}, and social networks
exemplified by the Internet \citep{faloutsos99powerlaw}.
Networks where material or energy is transmitted between nodes,
as for example business networks
or ecosystems, often also contain an inherent ingredient of
information transfer.
Therefore, a natural source of inspiration to invent new models is to consider real-world networks in the perspective of information transfer.
\begin{figure}[hbt]
\centering
\includegraphics[width=1.2\columnwidth]{./figs/mergeexample.eps}
\caption{\label{fig:networkmerging}The core of the merging model: to the right, two computing units in a network
merge into a larger unit. The reason could be economic, based on efficiency or performance. This is a likely
process in, for example, the network of autonomous systems of the Internet (modeled to the left by a scale-free network with $P(k)\propto k^{-2.2}$), where the companies that own the autonomous systems merge, mainly for financial reasons.}
\end{figure}
Every action or process in and between the subunits of a system, here simplified and represented by nodes,
is more or less complicated and time consuming.
Is there a way to optimize the system, locally or globally, so that
the overall functionality increases? Two appealing modifications of the
network would be better access between nodes (shorter shortest paths) and
more efficient and possibly larger computing units.
This inspired \citet{bjk-merge} to introduce the idea of \emph{merging-and-creation} on networks, exemplified in \autoref{fig:networkmerging} by the network of \emph{autonomous systems}\footnote{An autonomous system is a collection of Internet-protocol networks and routers with a common routing policy, typically under the control of an Internet service provider.} of the Internet.\footnote{A simulation is available at \url{http://cmol.nbi.dk/javaapp.php}.}
One version of the merging-and-regeneration model is:
\begin{itemize}
\item{Choose a node $i$ and one of its neighbors $j$ randomly.}
\item{Let $i$ and $j$ merge by absorbing the link between them. Choose one of the
nodes as the surviving node ($i$). By copying to $i$ all links from $j$ that $i$ did not already have (all double links are removed),
$i$ now has the same neighbors as $i$ and $j$ had together before the merging event.}
\item{Add a new node with $k_{\mathrm{new}}$ links to $k_{\mathrm{new}}$ nodes in the network. The distribution of $k_{\mathrm{new}}$
is not essential for the model.}
\end{itemize}
The regeneration step exists to keep the network alive with a fixed number of nodes. The model presented above gives rise to a power-law degree distribution $P(k) \propto k^{-\gamma}$ with an exponent $\gamma$ between 1.5 and 3, depending on the variation of the model.
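One merging-and-regeneration event as listed above might look like this (a sketch; the adjacency representation and $k_{\mathrm{new}}=2$ are my own choices):

```python
import random

def merge_and_regenerate(adj, k_new=2, rng=random.Random(5)):
    """One event: a random node i absorbs a random neighbor j (double
    links removed), then a new node with k_new links is added."""
    i = rng.choice([u for u in adj if adj[u]])   # node with a neighbor
    j = rng.choice(sorted(adj[i]))
    for v in adj.pop(j):            # i inherits all of j's links
        adj[v].discard(j)
        if v != i:                  # guard keeps out self-links
            adj[i].add(v)
            adj[v].add(i)
    new = max(adj) + 1              # regeneration keeps n fixed
    adj[new] = set()
    for v in rng.sample(sorted(adj.keys() - {new}), k_new):
        adj[new].add(v)
        adj[v].add(new)
    return adj

# a ring of six nodes as a starting point
adj = {u: {(u - 1) % 6, (u + 1) % 6} for u in range(6)}
merge_and_regenerate(adj)
```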
%\newpage%s
\begin{floatingfigure}{0.9\columnwidth}
\leavevmode
\centering
\includegraphics[width=0.7\columnwidth]{./figs/pmmergingeps.eps}
%\hspace{-0.3cm}\input{./figs/pmmerging.pstex_t}
\caption{\label{pm-merge}
One realization of the
merging algorithm with different sign in (a)
and equal sign in (b).
}
\end{floatingfigure}
In \citet{pap:merge1} we investigated different variations of the model and primarily their effect on the degree distribution. We also considered a nonnetwork implementation of the model, to be able to treat the model analytically. \citet{dorogovtsev} later considered a simplified network implementation analytically. Instead of merging nodes, we let \emph{elements} $q_i$ without any links merge in a large system ($i=1,2,\ldots,n$).
The scalar may be either strictly positive \citep{field} (\autoref{pm-merge}(b)) or both positive and
negative \citep{takayasu,krapivsky} (\autoref{pm-merge}(a)). To get a real-world picture of the simplified model,
consider the elements as particles and the scalar as the
mass of a particle in a model of dust aggregation in space \citep{field}. Other examples are companies and their financial
assets, and the vorticity of vortices studied in the context of superconductors and turbulence.
At the core of the process, two randomly chosen elements, $i$ and $j$, merge and the
new element acquires the sum of the scalars $q_i+q_j$.
In parallel, there is a creation process
of elements with small size $|q|$, both positive and negative.
I emphasize that the size of the created elements is not essential for the dynamics:
the size of a new element can be
drawn from a narrow distribution or assigned a deterministic value.
Obviously there are many variants of this process, but they can be classified
into three main classes according to the scaling of the size
distribution $P(q)\propto q^{-\gamma}$ of the elements (see \autoref{mergetable}).
\begin{table}[!hptb]
\caption{\label{mergetable}Three different classes of the merging process. In the first model all elements satisfy $q \ge 0$, whereas in the second model the elements $q$ can be both positive and negative. In the third model a condition for the merging-and-creation event is that $q \ge 0$, and $r$ is a random number from a narrow distribution with $\langle r \rangle=1$.}
\begin{tabular}{r>{\columncolor{light-grey}}ll>{\columncolor{light-grey}}l}
$\gamma$ &
\multicolumn{1}{c}{3/2} &
\multicolumn{1}{c}{2} &
\multicolumn{1}{c}{5/2}\\
\hline
$\mathrm{merging}$ &
$q_i \rightarrow q_i + q_j$
&
$q_i \rightarrow q_i + q_j$
&
$q_i \rightarrow q_i + q_j - 1$
\\
$\mathrm{creation}$ &
$q_j \rightarrow 1$
&
$q_j \rightarrow \pm 1$
&
$q_j \rightarrow r \in [0,2]$
\\
\end{tabular}
\end{table}
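The first column of \autoref{mergetable} (all elements positive, $\gamma = 3/2$) takes only a few lines; a sketch with my own choices of system size and number of steps:

```python
import random

def merge_create(n=2000, steps=100000, seed=6):
    """Nonnetwork merging-and-creation, first column of the table:
    q_i -> q_i + q_j, then the absorbed element is recreated as q_j = 1."""
    rng = random.Random(seed)
    q = [1] * n
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)   # two distinct random elements
        q[i] += q[j]                     # merging
        q[j] = 1                         # creation of a unit element
    return q

q = merge_create()
```

Each step conserves the number of elements while the total mass grows by exactly one; the size distribution of the returned elements approaches $P(q)\propto q^{-3/2}$.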
In \citet{pap:merge2} we studied one of the models in detail, the $\gamma=2$ version, and returned to a network implementation.\footnote{A simulation is available at \url{http://cmol.nbi.dk/javaapp.php}.} The process we studied is the completely \mbox{$\pm$-symmetric} case. A link has a positive end connected to a donor node ($+$ node) and a negative end connected to an acceptor node ($-$ node). What makes this model interesting is its close connection to the magnetic network of flux tubes on the sun and the associated size distributions of solar flares and sunspots \citep{russel,hughes} (see \autoref{fig:sunreconnect}). The model is:
\begin{itemize}
\item{%
Merge two randomly chosen nodes $i$ and $j$. There are two
possibilities:\\
(a) If they have the same sign, all the links from $i$ and $j$
are assigned to the merged node.
Thereby the merged node has the same neighbors
as $i$ and $j$ had together prior to the merging (\autoref{fig:sunreconnect}(a)).\\
(b) If $i$ and $j$ have different signs,
the resulting node is assigned the sign of
the sum $q_i+q_j$. Thereby $\max\{ |q_i|, |q_j| \} -
|q_i+q_j|$ links are annihilated in such a way that
only the two merging nodes change their number of links.
This is done by reconnecting
donor nodes of incoming links to
acceptor nodes of outgoing links (\autoref{fig:sunreconnect}(b)).}
\item{%
One new node of random sign is created, with one
link connected to a randomly chosen node.}
\end{itemize}
\begin{figure}[tpb]
\centering
\includegraphics[width=1.2\columnwidth]{./figs/sunreconnecteps.eps}
%\input{./figs/sunreconnect.pstex_t}
\caption{\label{fig:sunreconnect}Merging (a) and annihilation (b) in a model of the magnetic network of flux tubes on the sun. Positive nodes
(donors) have outgoing links and negative nodes (acceptors) have incoming links. This $\pm$-symmetric dynamics gives rise to
scale-free size distributions of both donors and acceptors ($P(q)\propto q^{-2}$) and energy bursts scaling as $P(s)\propto s^{-3}$, with $s$ being the size of a burst.}
\end{figure}
This is a minimalistic model of the more complicated 2D-model presented by \citet{hughes}. Nevertheless, it gives the same
qualitative results. The degree distribution is $P(q)\propto q^{-2}$ both for donors and acceptors and the size distribution of
the solar flares is $P(s)\propto s^{-3}$. With the degree of a node interpreted as the size of a magnetic flux line, the results agree with other modeling \citep{hughes2,russel}. However, the size distribution of solar flares measured by \citet{dennis,lin} follows a power law $P(s)\propto s^{-1.7\pm0.3}$ that disagrees with our result. According to \citet{russel}, the explanation is that the annihilation of solar flares triggers an avalanche of events that makes the size distribution of the solar flares broader.
The advantage of our model is that it can be treated and understood analytically. Its simple form also opens the way to a coarse understanding of the fundamental mechanisms of some dynamic systems, not necessarily limited to processes in the magnetic corona of the sun. The total dominance of preferential attachment in the network community has overshadowed, and possibly hindered, the detailed investigation of other dynamic models. I believe that merging is one of the underestimated mechanisms that must be investigated further in the search for a complete understanding of the connection between the evolution of networks and their structure and function.
\section*{Beyond simple models}
As already mentioned, there are numerous other models that produce power-law degree distributions. However, instead of giving examples I would like to mention a model that does not yet exist.
The \emph{tolerance} is the inherent robustness of a network \citep{barabasi:attack}.
An \emph{error} is defined as the successive removal of randomly chosen nodes and an \emph{attack} as the removal of a few nodes that play a vital role in maintaining the network's connectivity.
Vital nodes are those with the highest degree. Compared with random networks, the tolerance against random failures, that is errors, is high in scale-free networks. The reason is that a high percentage of the nodes in a scale-free network have just a few connections. However, the price for a high error tolerance is vulnerability to attacks: when only a few of the dominating nodes are removed, the network falls apart.
The question is therefore: what does the topology of a scale-free network optimized against attacks look like \citep{latora-rew}? A better understanding is of crucial importance to ensure the safety, capacity, and performance of man-made structures like the Internet and the WWW, and to exploit all of their possibilities \citep{Mendes:eon}.
\bibliography{phdhtml}
\end{document}