publwordle This word cloud, based on the content of this page, demonstrates my interest in information and networks.
Alcides Viamontes Esquivel and Martin Rosvall
[arXiv:1202.0425]

In network science, researchers often use mutual information to understand the difference between network partitions produced by community detection methods. Here we extend the use of mutual information to covers, that is, the cases where a node can belong to more than one module. In our proposed solution, the underlying stochastic process used to compare partitions is extended to deal with covers, and the random variables of the new process are simply fed into the usual definition of mutual information. With partitions, our extended process behaves exactly as the conventional approach for partitions, and thus, the mutual information values obtained are the same. We also describe how to perform sampling and do error estimation for our extended process, as both are necessary steps for a practical application of this measure. The stochastic process that we define here is not only applicable to networks, but can also be used to compare more general set-to-set binary relations.
Renaud Lambiotte and Martin Rosvall
[arXiv:1112.5252]

Random teleportation is a necessary evil for ranking and clustering directed networks based on random walks. Teleportation enables ergodic solutions, but the solutions must necessarily depend on the exact implementation and parametrization of the teleportation. For example, in the commonly used PageRank algorithm, the teleportation rate must trade off a heavily biased solution with a uniform solution. Here we show that teleportation to links rather than nodes enables a much smoother trade-off and effectively more robust results. We also show that, by not recording the teleportation steps of the random walker, we can further reduce the effect of teleportation with dramatic effects on clustering.
Atieh Mirshahvalad and Martin Rosvall
[arXiv:1110.0305]

Researchers use community-detection algorithms to reveal large-scale organization in biological and social networks, but community detection is useful only if the communities are significant and not a result of noisy data. To assess the statistical significance of the network communities, or the robustness of the detected structure, one approach is to perturb the network structure by removing links and measure how much the communities change. However, perturbing sparse networks is challenging because they are inherently sensitive; they shatter easily if links are removed. Here we propose a simple method to perturb sparse networks and assess the significance of their communities. We generate resampled networks by adding extra links based on local information, then we aggregate the information from multiple resampled networks to find a coarse-grained description of significant clusters. In addition to testing our method on benchmark networks, we use our method on the sparse network of the European Court of Justice (ECJ) case law, to detect significant and insignificant areas of law. We use our significance analysis to draw a map of the ECJ case law network that reveals the relations between the areas of law.
Morten L. Bech, Carl T. Bergstrom, Rodney J. Garratt, and Martin Rosvall
[NY Fed 507]

We use an information-theoretic approach to describe changes in lending relationships between federal funds market participants around the time of the Lehman Brothers failure. Unlike previous work that conducts maximum-likelihood estimation on undirected networks, our analysis distinguishes between borrowers and lenders and looks for broader lending relationships (multibank lending cycles) that extend beyond the immediate counterparties. We find that significant changes in lending patterns emerge following implementation of the Interest on Reserves policy by the Federal Reserve on October 9, 2008.
Alcides Viamontes Esquivel and Martin Rosvall
Phys. Rev. X 1, 021025 (2011).

To better understand the organization of overlapping modules in large networks with respect to flow, we introduce the map equation for overlapping modules. In this information-theoretic framework, we use the correspondence between compression and regularity detection. The generalized map equation measures how well we can compress a description of flow in the network when we partition it into modules with possible overlaps. When we minimize the generalized map equation over overlapping network partitions, we detect modules that capture flow and determine which nodes at the boundaries between modules should be classified in multiple modules and to what degree. With a novel greedy-search algorithm, we find that some networks, for example, the neural network of the nematode Caenorhabditis elegans, are best described by modules dominated by hard boundaries, but that others, for example, the sparse European-roads network, have an organization of highly overlapping modules.
Atieh Mirshahvalad and Martin Rosvall
Phys. Rev. E 84, 036102 (2011).

In social systems, people communicate with each other and form groups based on their interests. The pattern of interactions, the network, and the ideas that flow on the network naturally evolve together. Researchers use simple models to capture the feedback between changing network patterns and ideas on the network, but little is understood about the role of past events in the feedback process. Here we introduce a simple agent-based model to study the coupling between peoples' ideas and social networks, and better understand the role of history in dynamic social networks. We measure how information about ideas can be recovered from information about network structure and, the other way around, how information about network structure can be recovered from information about ideas. We find that it is in general easier to recover ideas from the network structure than vice versa.
Martin Rosvall and Carl T. Bergstrom
PLoS ONE 6(4): e18209 (2011).
[Source code]

To comprehend the hierarchical organization of large integrated systems, we introduce the hierarchical map equation that reveals multilevel structures in networks. In this information-theoretic approach, we exploit the duality between compression and pattern detection; by compressing a description of a random walker as a proxy for real flow on a network, we find regularities in the network that induce this system-wide flow. Finding the shortest multilevel description of the random walker therefore gives us the best hierarchical clustering of the network — the optimal number of levels and modular partition at each level — with respect to the dynamics on the network. With a novel search algorithm, we extract and illustrate the rich multilevel organization of several large social and biological networks. For example, from the global air traffic network we uncover countries and continents, and from the pattern of scientific communication we reveal more than 100 scientific fields organized in four major disciplines: life sciences, physical sciences, ecology and earth sciences, and social sciences. In general, we find shallow hierarchical structures in globally interconnected systems, such as neural networks, and rich multilevel organizations in systems with highly separated regions, such as road networks
Carl T. Bergstrom and Martin Rosvall
Biology and Philosophy 26, 159-176 (2011).
[arXiv:0810.4168] [Response to commentaries on The Transmission Sense of Information]

Biologists rely heavily on the language of information, coding, and transmission that is commonplace in the field of information theory developed by Claude Shannon, but there is open debate about whether such language is anything more than facile metaphor. Philosophers of biology have argued that when biologists talk about information in genes and in evolution, they are not talking about the sort of information that Shannon's theory addresses. First, philosophers have suggested that Shannon theory is only useful for developing a shallow notion of correlation, the so-called "causal sense" of information. Second, they typically argue that in genetics and evolutionary biology, information language is used in a "semantic sense," whereas semantics are deliberately omitted from Shannon theory. Neither critique is well-founded. Here we propose an alternative to the causal and semantic senses of information: a transmission sense of information, in which an object X conveys information if the function of X is to reduce, by virtue of its sequence properties, uncertainty on the part of an agent who observes X. The transmission sense not only captures much of what biologists intend when they talk about information in genes, but also brings Shannon's theory back to the fore. By taking the viewpoint of a communications engineer and focusing on the decision problem of how information is to be packaged for transport, this approach resolves several problems that have plagued the information concept in biology, and highlights a number of important features of the way that information is encoded, stored, and transmitted as genetic sequence.
Ludvig Lizana, Martin Rosvall, and Kim Sneppen
Phys. Rev. Lett. 104, 040603 (2010).
[arXiv:0910.4045] [Java simulation]

The distribution of information is essential for living system's ability to coordinate and adapt. Random walkers are often used to model this distribution process and, in doing so, one effectively assumes that information maintains its relevance over time. But the value of information in social and biological systems often decay and must continuously be updated. To capture the spatial dynamics of ageing information, we introduce time walkers. A time walker moves like a random walker, but interacts with traces left by other walkers, some representing older information, some newer. The traces forms a navigable information landscape. We quantify the dynamical properties of time walkers moving on a two-dimensional lattice and the quality of the information landscape generated by their movements. We visualise the self-similar landscape as a river network, and show that searching in this landscape is superior to random searching and scales as the length of loop-erased random walks.
Martin Rosvall, Daniel Axelsson, and Carl T. Bergstrom
Eur. Phys. J. Special Topics 178, 13 (2009).
[arXiv:0906.1405] [Map generator]

Many real-world networks are so large that we must simplify their structure before we can extract useful information about the systems they represent. As the tools for doing these simplifications proliferate within the network literature, researchers would benefit from some guidelines about which of the so-called community detection algorithms are most appropriate for the structures they are studying and the questions they are asking. Here we show that different methods highlight different aspects of a network's structure and that the the sort of information that we seek to extract about the system must guide us in our decision. For example, many community detection algorithms, including the popular modularity maximization approach, infer module assignments from an underlying model of the network formation process. However, we are not always as interested in how a system's network structure was formed, as we are in how a network's extant structure influences the system's behavior. To see how structure influences current behavior, we will recognize that links in a network induce movement across the network and result in system-wide interdependence. In doing so, we explicitly acknowledge that most networks carry flow. To highlight and simplify the network structure with respect to this flow, we use the map equation. We present an intuitive derivation of this flow-based and information-theoretic method and provide an interactive on-line application that anyone can use to explore the mechanics of the map equation. The differences between the map equation and the modularity maximization approach are not merely conceptual. Because the map equation attends to patterns of flow on the network and the modularity maximization approach does not, the two methods can yield dramatically different results for some network structures. To illustrate this and build our understanding of each method, we partition several sample networks. We also describe an algorithm and provide source code to efficiently decompose large weighted and directed networks based on the map equation.
Martin Rosvall and Carl T. Bergstrom
PLoS ONE 5(1): e8694 (2010).
[arXiv:0812.1242] [Source code] [Alluvial generator]

Change is the very nature of interaction patterns in biology, technology, economics, and science itself: The interactions within and between organisms change; the air, ground, and sea traffic change; the global financial flow changes; the scientific research front changes. With increasingly available data, networks and clustering tools have become important methods used to comprehend instances of these large-scale structures. But blind to the difference between noise and trends in the data, these tools alone must fail when used to study change. Only if we can assign significance to the partition of single networks can we distinguish structural changes from fluctuations and assess how much confidence we should have in the changes. Here we show that bootstrap resampling accompanied by significance clustering provides a solution to this problem. We use the significance clustering to realize de Solla Price's vision of mapping the change in science.
Martin Rosvall and Kim Sneppen
Phys. Rev. E 79, 026111 (2009)
[arXiv:0809.4803] [Java simulation]

To investigate the role of information flow in group formation, we introduce a model of communication and social navigation. We let agents gather information in an idealized network society, and demonstrate that heterogeneous groups can evolve without presuming that individuals have different interests. In our scenario, individuals' access to global information is constrained by local communication with the nearest neighbors on a dynamic network. The result is reinforced interests among like-minded agents in modular networks; the flow of information works as a glue that keeps individuals together. The model explains group formation in terms of limited information access and highlights global broadcasting of information as a way to counterbalance this fragmentation. To illustrate how the information constraints imposed by the communication structure affect future development of real-world systems, we extrapolate dynamics from the topology of four social networks.
Martin Rosvall and Carl T. Bergstrom
PNAS 105, 1118 (2008)
[arXiv:0707.0609] [Source code] [Map generator] [Interactive map]

To comprehend the multipartite organization of large-scale biological and social systems, we introduce a new information-theoretic approach to reveal community structure in weighted and directed networks. The method decomposes a network into modules by optimally compressing a description of information flows on the network. The result is a map that both simplifies and highlights the regularities in the structure and their relationships to each other. We illustrate the method by making a map of scientific communication as captured in the citation patterns of more than 6000 journals. We discover a multicentric organization with fields that vary dramatically in size and degree of integration into the network of science. Along the backbone of the network — which includes physics, chemistry, molecular biology, and medicine — information flows bidirectionally, but the map reveals a directional pattern of citation from the applied fields to the basic sciences.
Martin Rosvall and Carl T. Bergstrom
PNAS 104, 7327 (2007)
[physics/0612035] [Source code]

To understand the structure of a large-scale biological, social, or technological network, it can be helpful to decompose the network into smaller subunits or modules. In this article, we develop an information-theoretic foundation for the concept of modularity in networks. We identify the modules of which the network is composed by finding an optimal compression of its topology, capitalizing on regularities in its structure. We explain the advantages of this approach and illustrate them by partitioning a number of real-world and model networks.
Martin Rosvall and Kim Sneppen
[arXiv:0708.0368] [Java simulation]

Social groups with widely different music tastes, political convictions, and religious beliefs emerge and disappear on scales from extreme subcultures to mainstream mass-cultures. Both the underlying social structure and the formation of opinions are dynamic, and changes in one affect the other. Several positive feedback mechanisms have been proposed to drive the diversity in social and economic systems, but little effort has been devoted to pinpointing the interplay between a dynamically changing social network and the spread and gathering of information on the network. Here we analyze this phenomenon in terms of a social network model that explicitly simulates the feedback between information assembly and the emergence of social structures: changing beliefs are coupled to changing relationships because agents self-organize a dynamic network to facilitate their hunter-gatherer behavior in information space. Our analysis demonstrates that tribal organizations and modular social networks can emerge as a result of contact-seeking agents that reinforce their beliefs among like-minded. We also find that prestigious persons can streamline the social network into hierarchical structures around themselves.
Martin Rosvall, Ian B. Dodd, Sandeep Krishna, and Kim Sneppen
Phys. Rev. E 74, 066105 (2006)
[q-bio.PE/0609031] [Java simulation]

Bacteria and their bacteriophages are the most abundant, widespread and diverse groups of biological entities on the planet. In an attempt to understand how the interactions between bacteria, virulent phages and temperate phages might affect the diversity of these groups, we developed a novel stochastic network model for examining the co-evolution of these ecologies. In our approach, nodes represent whole species or strains of bacteria or phages, rather than individuals, with "speciation" and extinction modelled by duplication and removal of nodes. Phage-bacteria links represent host-parasite relationships and temperate-virulent phage links denote prophage-encoded resistance. The effect of horizontal transfer of genetic information between strains was also included in the dynamic rules. The observed networks evolved in a highly dynamic fashion but the ecosystems were prone to collapse (one or more entire groups going extinct). Diversity could be stably maintained in the model only if the probability of speciation was independent of the diversity. Such an effect could be achieved in real ecosystems if the speciation rate is primarily set by the availability of ecological niches.
Martin Rosvall and Kim Sneppen
International Journal of Bifurcation and Chaos 17, 2509 (2007)
[cond-mat/0604036]

In this paper we quantify our limited information horizon, by measuring the information necessary to locate specific nodes in a network. To investigate different ways to overcome this horizon, and the interplay between communication and topology in social networks, we let agents communicate in a model society. In this way, they build a perception of the network that they can use to create strategic links to improve their standing in the network. We observe a narrow distribution of links when communication is low and a network with a broad distribution of links when communication is high.
Martin Rosvall and Kim Sneppen
Europhys. Lett. 74, 1109 (2006)
[physics/0603218] [Java simulation]

We model self-assembly of information in networks to investigate necessary conditions for building a global perception of a system by local communication. Our approach is to let agents chat in a model system to self-organize distant communication pathways. We demonstrate that simple local rules allow agents to build a perception of the system that is robust to dynamical changes and mistakes. We find that messages are most effectively forwarded in the presence of hubs, while transmission in hub-free networks is more robust against misinformation and failures.
Martin Rosvall and Kim Sneppen
Phys. Rev. E 74, 016108 (2006)
[physics/0512105] [Java simulation]

This paper introduces a model of self-organization between communication and topology in social networks, with feedback between different communication habits and the topology. To study this feedback, we let agents communicate to build a perception of a network and use this information to create strategic links. We observe a narrow distribution of links when the communication is low and a system with a broad distribution of links when the communication is high. We also analyze the outcome of chatting, cheating, and lying as strategies to get better access to information in the network. Chatting, although only adopted by a few agents, results in a global gain in the system. Contrary, a global loss is inevitable in a system with too many liars.
Jacob Bock Axelsen, Sebastian Bernhardsson, Martin Rosvall, Kim Sneppen, and Ala Trusina
Phys. Rev. E 74, 036119 (2006)
[physics/0512075] [Java simulation]

We generalize the degree-organizational view of real-world networks with broad degree-distributions in a landscape analogue with mountains (high-degree nodes) and valleys (low-degree nodes). For example, correlated degrees between adjacent nodes correspond to smooth landscapes (social networks), hierarchical networks to one-mountain landscapes (the Internet), and degree-disassortative networks without hierarchical features to rough landscapes with several mountains. We also generate ridge landscapes to model networks organized under constraints imposed by the space the networks are embedded in, associated to spatial or, in molecular networks, to functional localization. To quantify the topology, we here measure the widths of the mountains and the separation between different mountains.
Martin Rosvall, Andreas Grönlund, Petter Minnhagen, and Kim Sneppen
Phys. Rev. E 72, 046117 (2005)
[cond-mat/0505400]

We investigate the searchability of complex systems in terms of their interconnectedness. Associating searchability with the number and size of branch points along the paths between the nodes, we find that scale-free networks are relatively difficult to search, and thus that the abundance of scale-free networks in nature and society may reflect an attempt to protect local areas in a highly interconnected network from nonrelated communication. In fact, starting from a random node, real-world networks with higher order organization like modular or hierarchical structure are even more difficult to navigate than random scale-free networks. The searchability at the node level opens the possibility for a generalized hierarchy measure that captures both the hierarchy in the usual terms of trees, as in military structures, and the intrinsic hierarchical nature of topological hierarchies for scale-free networks, as in the Internet.
Ala Trusina, Martin Rosvall, and Kim Sneppen
Phys. Rev. Lett. 94, 238701 (2005)
[cond-mat/0412064]

We investigate and quantify the interplay between topology and the ability to send specific signals in complex networks. We find that in a majority of investigated real-world networks, the ability to communicate is favored by the network topology at short distances, but disfavored at longer distances. We further discuss how the ability to locate specific nodes can be improved if information associated with the overall traffic in the network is available.
Martin Rosvall, Petter Minnhagen, and Kim Sneppen
Phys. Rev. E 71, 066111 (2005)
[cond-mat/0412051] [Java simulation]

We study navigation with limited information in networks and demonstrate that many real-world networks have a structure that can be described as favoring communication at short distance at the cost of constraining communication at long distance. This feature, which is robust and more evident with limited than with complete information, reflects both topological and possibly functional design characteristics. For example, the characteristics of the networks studied derived from a city and from the Internet are manifested through modular network designs. We also observe that directed navigation in typical networks requires remarkably little information on the level of individual nodes. By studying navigation, or specific signaling, we take a complementary approach to the common studies of information transfer devoted to the broadcasting of information in studies of virus spreading and the like.
Kim Sneppen, Ala Trusina, and Martin Rosvall
Europhys. Lett. 69, 853 (2005)
[cond-mat/0407055]

Signaling pathways and networks determine the ability to communicate in systems ranging from living cells to human society. We investigate how the network structure constrains communication in social, man-made and biological networks. We find that human networks of governance and collaboration are predictable on tête-è-tête level, reflecting well-defined pathways, but globally inefficient. In contrast, the Internet tends to have better overall communication abilities and more alternative pathways, and is therefore more robust. Between these extremes, the molecular network of Saccharomyces cerevisea is more similar to the simpler social systems, whereas the pattern of interactions in the more complex Drosophilia melanogaster resembles the robust Internet.
Martin Rosvall, Ala Trusina, Petter Minnhagen, and Kim Sneppen
Phys. Rev. Lett. 94, 028701 (2005)
[cond-mat/0407054]

Traffic is constrained by the information involved in locating the receiver and the physical distance between sender and receiver. We here focus on the former, and investigate traffic in the perspective of information handling. We re-plot the road map of cities in terms of the information needed to locate specific addresses and create information city networks with roads mapped to nodes and intersections to links between nodes. These networks have the broad degree distribution found in many other complex networks. Mapping to an information city network makes it possible to quantify the information associated with locating specific addresses.
Petter Minnhagen, Martin Rosvall, Kim Sneppen, and Ala Trusina
Physica A 340, 725 (2004)
[cond-mat/0406752] [Java simulation]

We discuss merging-and-creation as a self-organizing process for scale-free topologies in networks. Three power-law classes characterized by the power-law exponents 3/2, 2 and 5/2 are identified and the process is generalized to networks. In the network context, the merging can be viewed as a consequence of optimization related to more efficient signaling.
Kim Sneppen, Martin Rosvall, Ala Trusina, and Petter Minnhagen
Europhys. Lett. 67, 349 (2004)
[nlin.AO/0403005] [Java simulation]

We suggest a minimalistic model for directed networks and suggest an application to the injection and merging of magnetic field lines. We obtain a network of connected donor and acceptor vertices with degree distribution 1/s^2, and with dynamic reconnection events of size \Delta s occurring with a frequency that scales as 1/\Delta s^3. This suggests that the model is in the same universality class as the model for self-organization in the solar atmosphere suggested by Hughes et al.
Martin Rosvall, and Kim Sneppen
Phys. Rev. Lett. 91, 178701 (2003)
[cond-mat/0308399] [Java simulation]

We propose an information-based model for network dynamics in which imperfect information leads to networks where the different vertices have a widely different number of edges to other vertices, and where the topology has hierarchical features. The possibility to observe scale-free networks is linked to a minimally connected system where hubs remain dynamic.
Contact
Martin Rosvall
Assistant Professor Vita
+46 (0)90-786 5044


Department of Physics
Umeå University
SE-901 87 Umeå
Sweden
Misc.
Are you looking for a PhD or postdoc position? Does the content of this page excite you? Please drop me an and let me know.
Source code for clustering papers.
Short introduction to complex networks available in html format.
igroup Complete PhD thesis available in pdf format.
Links
mapequation.org
IceLab
CMOL
Eigenfactor.org
Carl Bergstrom
Petter Holme
My Picasa Albums