Chapter 3: Community Detection
About This Chapter
Zoom out from any large social network — Facebook’s friendship graph, a citation network linking academic papers, a market where stocks move together — and a pattern emerges that is unmistakable even before you run a single algorithm. The nodes do not mix uniformly. They cluster. Some groups of people are densely interconnected; sparse bridges connect those groups to the rest of the world. Within a group, almost everyone knows almost everyone else; between groups, connections are rare and precious. This structural feature is called community structure, and discovering it is one of the central problems in network science.
The intuition is immediate. In a university social network, communities correspond to dormitories, departments, sports teams, and study groups — social categories that shape who meets whom. In a co-authorship network, communities correspond to research fields. In a stock correlation network, communities correspond to industry sectors: technology stocks move together, energy stocks move together, but technology stocks and energy stocks do not move together nearly as tightly. Community detection is the algorithmic task of recovering these latent groups from the observable network structure alone, without labels, without surveys, without any knowledge of what the communities mean.
What is a community, formally?
The short answer is: no one agrees. Unlike a connected component (which has a precise mathematical definition) or a spanning tree (which also has one), a “community” has no single canonical definition. Different researchers and different algorithms operationalize the concept differently, and those choices lead to different partitions of the same graph. This is not a weakness of the field; it is an honest reflection of the fact that the social world is rich enough to support multiple useful notions of cohesion.
Here are four definitions that appear in the literature, ranging from strictest to loosest:
Clique. A clique is a set of nodes in which every pair is connected by an edge. A clique is the strongest possible notion of community: total mutual acquaintance. Real communities almost never satisfy this criterion — even a tight-knit research group will have a few pairs who have never collaborated directly.
\(k\)-core. A \(k\)-core is a maximal subgraph in which every node has at least \(k\) neighbors within the subgraph. As \(k\) increases, the \(k\)-cores nest inside one another, producing a hierarchy of progressively denser subgraphs. This definition is tractable and widely used in social network analysis, but it does not partition the graph — nodes can belong to multiple cores.
Partition-based community. The most widely used operational definition in practice is statistical: a community is a subset of nodes that has more internal edges than would be expected in a random graph with the same degree sequence. This is the idea that underlies modularity, the central quantity in this chapter.
Planted partition / stochastic block model. In the generative perspective, communities are latent labels. Edges are drawn at random, but the probability of an edge between two nodes depends on their labels — nodes in the same community are more likely to be connected than nodes in different communities. A community detection algorithm is, from this vantage point, a procedure for inferring the latent labels from the observed edges. This perspective is developed in detail in Section 5.
Why detection is hard
The hardness of community detection is not merely computational — though it is that too — it is conceptual. Because there is no canonical definition, there is no ground truth against which to measure success, except in toy examples where the communities were planted by design. In real networks, evaluating a partition requires either external labels (which are often unavailable) or an internal quality measure (which embeds a particular definition of community). The choice of quality measure and the choice of algorithm are inseparable.
Three further difficulties recur throughout this chapter:
Resolution limit. Modularity maximization, the most popular objective function, has a well-documented blind spot: it cannot reliably detect communities that are small relative to the total network size. This is the Fortunato–Barthélemy resolution limit, discussed in Section 7.
Degeneracy. Many different partitions can have nearly the same modularity score. The optimization landscape is flat near the maximum, which means different runs of the same algorithm, with different random initializations, can return genuinely different partitions — all approximately optimal.
Overlapping communities. All algorithms discussed in this chapter produce a hard partition — each node belongs to exactly one community. Real communities often overlap: a person can belong to a family, a work group, and a hobby club simultaneously. Hard-partition algorithms cannot represent this. Overlapping community detection is an active research area, but the methods are more complex and will not be covered here.
Table of Contents
- Modularity — the Central Quality Measure
- Girvan–Newman: Edge Betweenness
- The Louvain Method
- Infomap and the Map Equation
- Stochastic Block Models
- Evaluating Partitions
- When Algorithms Disagree — the Resolution Limit
- Mini Case Study: Les Misérables
Modularity — the Central Quality Measure
Background: what should a good partition maximize?
Suppose someone hands you a partition of a graph — a labeling of every node with a community index — and asks you: “Is this a good partition?” What number would you compute?
The natural answer is: count the fraction of edges that fall within communities. But this naive answer is too easy to game. The trivial partition that assigns all nodes to a single community has every edge within its single group. We need a baseline — an expectation of how many within-community edges we would see in a random graph that preserves the degree sequence but scrambles the edge endpoints. The fraction of within-community edges in the real graph, minus that baseline, is a meaningful signal of non-trivial community structure.
This is the idea behind modularity, introduced by Newman and Girvan (2004).
The Newman–Girvan modularity
Let \(G = (V, E)\) be an undirected graph with \(n\) nodes and \(m\) edges. Let \(A_{ij}\) denote the \((i,j)\) entry of the adjacency matrix: \(A_{ij} = 1\) if there is an edge between nodes \(i\) and \(j\), and \(0\) otherwise. Let \(k_i = \sum_j A_{ij}\) be the degree of node \(i\). A partition assigns each node \(i\) to a community label \(c_i\).
The null model is a random graph with the same degree sequence as \(G\). In this random graph, the expected number of edges between nodes \(i\) and \(j\) is \(\frac{k_i k_j}{2m}\), because node \(i\) contributes \(k_i\) edge stubs out of a total of \(2m\), and the probability that one of those stubs connects to node \(j\) is \(\frac{k_j}{2m}\).
Definition (Modularity). The modularity of a partition \(\{c_i\}\) is
\[Q = \frac{1}{2m} \sum_{i,j} \left(A_{ij} - \frac{k_i k_j}{2m}\right) \delta(c_i, c_j)\]
where \(\delta(c_i, c_j) = 1\) if nodes \(i\) and \(j\) are in the same community and \(0\) otherwise.
Each term in the sum measures the excess connectivity between nodes \(i\) and \(j\) over what the null model predicts, but only when \(i\) and \(j\) are in the same community. The \(\frac{1}{2m}\) prefactor normalizes the sum so that \(Q\) lies in the interval \([-\tfrac{1}{2}, 1)\). A value of \(Q = 0\) means the partition is no better than random. Values above \(0.3\) are typically considered evidence of meaningful community structure; values above \(0.7\) are rare outside of near-perfectly separated networks.
The modularity can also be written in a form that makes the per-community contribution explicit. Let \(e_c = \frac{1}{2m}\sum_{i \in c, j \in c} A_{ij}\) be the fraction of edges inside community \(c\), and \(a_c = \frac{1}{2m}\sum_{i \in c} k_i\) be the fraction of edge ends attached to nodes in community \(c\). Then
\[Q = \sum_c \left(e_c - a_c^2\right).\]
This form shows clearly that each community contributes positively when it has more internal edges (\(e_c\)) than the null model would predict for a randomly wired node set of the same total degree (\(a_c^2\)).
Computing modularity by hand on a tiny graph
Before running any algorithm, it is instructive to compute modularity directly on a small example. The karate club graph — Zachary (1977)’s study of a university karate club that split into two factions — is the canonical benchmark for community detection. The graph has 34 nodes and 78 edges. The true split (known from field observation) separates the club into two groups: the instructor’s faction and the administrator’s faction.
A modularity of roughly \(0.37\) for the ground-truth split of the karate club graph is a strong signal — it indicates that the two known social factions correspond to a genuine structural divide in the network, well beyond what a random wiring would produce.
Visualizing the null-model intuition
The figure below plots the karate club graph with nodes colored by ground-truth faction. A good community-detection result should produce a coloring where each color forms a visually cohesive cluster.
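A sketch of that visualization (the layout seed and the two colors are illustrative choices, not prescribed by the text):

```python
import matplotlib.pyplot as plt
import networkx as nx

G = nx.karate_club_graph()
# Color each node by its ground-truth faction
colors = ["tab:blue" if G.nodes[v]["club"] == "Mr. Hi" else "tab:orange" for v in G]

pos = nx.spring_layout(G, seed=42)  # fixed seed for a reproducible layout
nx.draw_networkx(G, pos, node_color=colors, node_size=200, font_size=8)
plt.axis("off")
plt.savefig("karate_factions.png", dpi=150)
```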
Modularity is the dominant quality measure in applied network analysis — it appears in software tools from Gephi to Neo4j to NetworkX — but practitioners use it critically, not blindly. At a financial firm doing stock-correlation-network clustering, a quant analyst would compute modularity for several candidate industry groupings and use the comparison to support or challenge the standard GICS sector classification. Modularity above \(0.30\) is considered evidence that the chosen partition is capturing real structure; below \(0.10\) suggests the communities may be arbitrary. That said, the resolution limit (Section 7) means modularity alone should never be the final word.
Girvan–Newman: Edge Betweenness
The algorithm
The Girvan–Newman algorithm (2002) is the oldest and most conceptually transparent community-detection method. Its guiding insight is simple: edges that lie between communities carry more traffic than edges within communities. If you imagine traffic flowing along shortest paths between every pair of nodes, the bridges connecting communities must carry all the cross-community traffic. Removing them will split the graph into its constituent groups.
The algorithm operationalizes this idea using edge betweenness centrality. Recall from Chapter 2 that node betweenness counts the fraction of shortest paths passing through a node. Edge betweenness counts the fraction of shortest paths passing through an edge:
\[c_B(e) = \sum_{s \neq t} \frac{\sigma(s, t \,|\, e)}{\sigma(s, t)}\]
where \(\sigma(s, t)\) is the total number of shortest paths between nodes \(s\) and \(t\), and \(\sigma(s, t \,|\, e)\) is the number that pass through edge \(e\).
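As a quick illustration, edge betweenness on the karate club graph can be computed directly; NetworkX normalizes the scores to \([0, 1]\) by default:

```python
import networkx as nx

G = nx.karate_club_graph()
# Edge betweenness for every edge, keyed by edge tuple
ebc = nx.edge_betweenness_centrality(G)
top_edge, top_score = max(ebc.items(), key=lambda kv: kv[1])
print(f"Highest-betweenness edge: {top_edge} (score {top_score:.3f})")
```

The highest-scoring edge is the one Girvan–Newman removes first.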
The Girvan–Newman procedure:
1. Compute edge betweenness for all edges.
2. Remove the edge with the highest betweenness.
3. Recompute edge betweenness for all remaining edges.
4. Repeat steps 2–3 until no edges remain.
At each step, the graph may split into new components. Recording the sequence of splits produces a dendrogram — a hierarchical tree of partitions, from the original single component down to \(n\) isolated nodes. The best partition along this hierarchy is the one that maximizes modularity.
The computational cost of a single betweenness calculation is \(O(nm)\) for unweighted graphs using Brandes’ algorithm. Because the full Girvan–Newman procedure runs this calculation after each edge removal — and there are \(m\) edges — the total cost is \(O(nm^2)\), which becomes prohibitive for networks with more than a few thousand edges. Girvan–Newman is therefore used primarily for small to medium networks where interpretability and the dendrogram are valued.
Live demonstration on the karate club
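A minimal sketch of the first split, using NetworkX's girvan_newman generator:

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman, modularity

G = nx.karate_club_graph()

# girvan_newman yields successive partitions as edges are removed;
# the first item is the initial two-community split
c1, c2 = (set(c) for c in next(girvan_newman(G)))
print(len(c1), len(c2), f"Q = {modularity(G, [c1, c2]):.3f}")
```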
The Girvan–Newman first split on the karate club graph should produce two communities. Before running the algorithm yourself, predict: will the split match the ground-truth Mr. Hi / Officer partition perfectly, or will a few nodes be misassigned? Which nodes are most likely to be ambiguous?
Consider: the two nodes with the highest degree in the karate club graph are node 0 (the instructor, Mr. Hi) and node 33 (the administrator, Officer). Edges crossing between their local neighborhoods are the ones with highest betweenness.
Comparison with ground truth
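One way to quantify the agreement is to count misassigned nodes against the `club` attribute (a sketch; the detected and true labels are aligned by majority overlap):

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

G = nx.karate_club_graph()
c1, c2 = (set(c) for c in next(girvan_newman(G)))

truth = {v for v in G if G.nodes[v]["club"] == "Mr. Hi"}
# Align labels: the detected community overlapping Mr. Hi's faction most
detected = c1 if len(c1 & truth) >= len(c2 & truth) else c2
misassigned = truth ^ detected  # symmetric difference = disagreements
print(f"{len(misassigned)} of {G.number_of_nodes()} nodes differ from ground truth")
```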
Girvan–Newman is rarely deployed at production scale because its \(O(nm^2)\) complexity makes it infeasible for networks with more than a few thousand edges. In practice, it is used in three situations: (1) small domain-specific networks where interpretability of the dendrogram is important, such as protein interaction subgraphs; (2) as a baseline for benchmarking faster algorithms; and (3) in teaching, precisely because its logic is transparent and its connection to betweenness centrality (a concept already familiar from network analysis) is direct. For networks with millions of edges — Facebook’s social graph, trading networks, citation databases — the Louvain method discussed next is the standard choice.
The Louvain Method
Greedy modularity maximization in two phases
The Louvain algorithm (Blondel et al., 2008) is the most widely deployed community detection algorithm in practice. It maximizes modularity greedily, and it scales to networks with hundreds of millions of nodes. The key to its speed is a two-phase alternation that avoids the \(O(nm^2)\) cost of Girvan–Newman.
Phase 1 — Local moves. Each node starts in its own singleton community. For each node \(i\), the algorithm evaluates the change in modularity \(\Delta Q\) that would result from moving \(i\) into each of the communities occupied by its neighbors. If any move increases \(Q\), the node is moved to the community with the largest gain. This local-move sweep is repeated over all nodes until no single move improves \(Q\).
The modularity gain from moving node \(i\) into community \(C\) can be computed in \(O(k_i)\) time per move (where \(k_i\) is the degree of node \(i\)), which makes the sweep over all nodes fast. The key formula for the gain is:
\[\Delta Q = \left[\frac{\Sigma_{in} + 2k_{i,in}}{2m} - \left(\frac{\Sigma_{tot} + k_i}{2m}\right)^2\right] - \left[\frac{\Sigma_{in}}{2m} - \left(\frac{\Sigma_{tot}}{2m}\right)^2 - \left(\frac{k_i}{2m}\right)^2\right]\]
where \(\Sigma_{in}\) is the sum of the weights of the edges inside \(C\), counting each edge twice (equivalently, the total internal degree of \(C\)); \(\Sigma_{tot}\) is the sum of the degrees of all nodes in \(C\); \(k_{i,in}\) is the sum of the weights of edges from \(i\) to nodes in \(C\); and \(k_i\) is the degree of \(i\).
Phase 2 — Aggregation. Once Phase 1 converges, the communities found become super-nodes. Two super-nodes are connected by a weighted edge whose weight equals the number of edges between the corresponding communities in the original graph. The algorithm then returns to Phase 1 on this reduced graph.
The two phases alternate until modularity can no longer be improved. The hierarchy of graphs produced at each level of aggregation forms a natural hierarchy of communities.
Complexity. Empirically, Louvain runs in approximately \(O(n \log n)\) time on sparse graphs, which is dramatically faster than Girvan–Newman and explains its dominance in large-scale applications: detecting communities in Twitter’s follower graph, identifying industry clusters in stock-return correlation matrices, segmenting citation networks.
Live demonstration on the Les Misérables network
The Les Misérables character co-appearance network has 77 nodes (characters) and 254 edges (co-appearance in a chapter). It is a standard benchmark with rich known communities corresponding to the novel’s narrative arcs and character groupings.
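A sketch of the demonstration (the seed is chosen here for reproducibility; the co-appearance edge weights are used by default by both Louvain and the modularity function):

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

G = nx.les_miserables_graph()  # 77 characters, 254 weighted edges
comms = louvain_communities(G, seed=42)

print(f"{len(comms)} communities, Q = {modularity(G, comms):.3f}")
for c in sorted(comms, key=len, reverse=True)[:3]:
    print(len(c), sorted(c)[:5])  # largest communities, sample members
```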
NetworkX’s louvain_communities accepts a resolution parameter (default \(1.0\)). Values greater than \(1\) favor smaller, more numerous communities; values less than \(1\) favor larger, fewer communities. Try resolution=0.5 and resolution=2.0 on the Les Misérables graph and observe how the number and size of communities change. Report the modularity \(Q\) for each setting. This exercise foreshadows the resolution-limit discussion in Section 7.
Louvain is a randomized algorithm: the order in which nodes are visited during local moves is shuffled at each pass. Two runs on the same graph with different random seeds can return different partitions, both with high modularity. This is not a bug — it reflects the degeneracy of the modularity landscape. For reproducible results, always pass an explicit seed parameter. In production, running Louvain multiple times and taking a consensus partition is a common practice.
Infomap and the Map Equation
Flow-based community detection
Modularity measures structural density; Infomap (Rosvall and Bergstrom, 2008) measures something fundamentally different — it measures how efficiently you can describe the trajectory of a random walker on the network. The idea is that if communities are real, a random walker will spend a long time inside each community before crossing to another. A good community assignment is one that allows you to describe the walker’s path using few bits: short code words for frequently visited within-community movements, and longer code words for rare cross-community transitions.
The map equation
Let the partition contain \(r\) communities (we write \(r\) rather than \(m\), which already denotes the number of edges). Assign each community \(\alpha\) a two-level code: a community-level code word, used every time the walker enters community \(\alpha\), and node-level code words within each community. The expected description length of the random walk is:
\[\mathcal{L}(\mathbf{M}) = q_{\curvearrowleft} H(\mathcal{Q}) + \sum_{\alpha=1}^{r} p_{\circlearrowleft}^{\alpha} H(\mathcal{P}^{\alpha})\]
where:
- \(\mathbf{M}\) is the partition (the “map”),
- \(q_{\curvearrowleft}\) is the rate at which the walk switches between modules,
- \(H(\mathcal{Q})\) is the entropy of the module-exit code book (the code for crossing community boundaries),
- \(p_{\circlearrowleft}^{\alpha}\) is the rate of within-module movement plus the exit rate for module \(\alpha\),
- \(H(\mathcal{P}^{\alpha})\) is the entropy of the within-module code book for module \(\alpha\).
Minimizing \(\mathcal{L}(\mathbf{M})\) over all partitions \(\mathbf{M}\) finds the community structure that best compresses the random-walk description. Communities are, in this view, the compression artifacts of network structure.
Why Infomap is complementary to modularity
Modularity and the map equation optimize different objectives and therefore produce different — often complementary — partitions. Modularity is purely structural: it counts edges. The map equation is dynamical: it is about information flow. The two approaches tend to agree on assortative networks (more edges within communities than between them) but diverge on networks where the flow structure does not mirror the edge density, such as directed networks with asymmetric flows, or networks with heterogeneous node weights.
Practical note
The infomap Python package (which wraps the C++ Infomap engine) is not available in Pyodide, the browser-based Python runtime used by this book. For production use — where you want the fastest and most accurate Infomap implementation — install the package in a local environment:
# In a local terminal or Colab:
# pip install infomap

import infomap

im = infomap.Infomap("--two-level --seed 42")
for u, v in G.edges():  # Infomap expects integer node ids
    im.add_link(u, v)
im.run()

# Extract the partition: leaf nodes of the tree carry module assignments
partition = {}
for node in im.tree:
    if node.is_leaf:
        partition[node.node_id] = node.module_id

The infomap package is pip-installable in all standard Python environments and handles both undirected and directed graphs, weighted and unweighted. For directed networks — such as Twitter follower graphs or citation graphs — Infomap consistently outperforms Louvain because the map equation naturally incorporates edge directionality through the random-walk dynamics.
Google’s internal community-detection tooling reportedly uses variants of the map equation for segmenting the web graph, precisely because the web is a directed graph and Infomap handles directionality naturally. In academic citation analysis, Infomap has been used to map the structure of scientific fields — the result is a hierarchy of research communities that matches expert judgment about disciplinary boundaries more closely than modularity-based methods. The key intuition — “find the groups that a random walker spends time in” — aligns naturally with how information propagates through a directed network.
Stochastic Block Models
Communities as latent structure
All the algorithms discussed so far are descriptive: given a graph, find a partition with some property. The stochastic block model (SBM) takes the opposite, generative perspective: how would a graph be drawn if communities were the underlying organizing principle?
Definition (Stochastic Block Model). Given \(n\) nodes partitioned into \(k\) blocks (communities) with labels \(z_i \in \{1, \ldots, k\}\), and a \(k \times k\) matrix \(B\) of connection probabilities, the SBM draws each edge independently:
\[P(A_{ij} = 1) = B_{z_i, z_j}\]
The block matrix \(B\) encodes the community structure. In an assortative SBM, \(B_{cc} > B_{cd}\) for \(c \neq d\): nodes are more likely to connect within their community than across communities. This is the positive-community case, and the one that Louvain and Girvan–Newman are designed to detect. In a disassortative SBM, \(B_{cc} < B_{cd}\): nodes prefer to connect to nodes in other communities — a pattern that arises in bipartite-like structures (buyers and sellers in a market, predators and prey in a food web). Disassortative communities are invisible to modularity maximization — indeed, they have negative modularity.
Generating and recovering communities
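A sketch under assumed parameters (three planted blocks of 30 nodes, \(p_{in} = 0.25\) within blocks, \(p_{out} = 0.02\) between them; these values and the seeds are illustrative):

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Assortative planted partition: 3 blocks of 30 nodes,
# p_in = 0.25 within blocks, p_out = 0.02 between them
sizes = [30, 30, 30]
B = [[0.25, 0.02, 0.02],
     [0.02, 0.25, 0.02],
     [0.02, 0.02, 0.25]]
G = nx.stochastic_block_model(sizes, B, seed=42)

# The generator records each node's planted block in the 'block' attribute,
# so recovery can be checked directly
comms = louvain_communities(G, seed=42)
print(f"Louvain recovered {len(comms)} communities (planted: 3)")
```

With this much contrast between \(p_{in}\) and \(p_{out}\), Louvain recovers the planted blocks almost perfectly; Section 6 introduces NMI as the standard way to score such a recovery.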
Visualizing the difference
Any algorithm that maximizes modularity — including Girvan–Newman, Louvain, and the greedy methods in NetworkX — will fail to detect disassortative communities, because the true partition has negative modularity. If your domain knowledge suggests nodes prefer to connect to different types (a buyer-seller network, a food web, a bipartite projection), do not use modularity-based methods. Instead, use spectral methods based on the Bethe Hessian or non-backtracking operator, or fit the stochastic block model directly by statistical inference; both approaches can detect assortative as well as disassortative structure, and the graph-tool library's SBM inference routines implement the latter.
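A small numerical check of the negative-modularity claim (the block sizes, probabilities, and seed are illustrative choices):

```python
import networkx as nx
from networkx.algorithms.community import modularity

# Disassortative SBM: edges run mostly BETWEEN the two blocks
sizes = [20, 20]
B = [[0.05, 0.50],
     [0.50, 0.05]]
G = nx.stochastic_block_model(sizes, B, seed=7)

blocks = [set(range(0, 20)), set(range(20, 40))]  # the planted partition
Q = modularity(G, blocks)
print(f"Modularity of the planted partition: {Q:.3f}")
```

The planted partition scores well below zero, so any modularity maximizer will prefer some other, structurally meaningless partition.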
Evaluating Partitions
The two evaluation regimes
There are two distinct regimes for evaluating a community partition, and conflating them is a source of confusion in the literature.
Unsupervised evaluation uses an internal quality measure — a number computed from the graph and the partition alone, without reference to any external ground truth. Modularity \(Q\) is the canonical example. The advantage is that it applies to any network; the disadvantage is that it embeds the same assumptions as the algorithms that maximize it (dense communities, assortative structure, resolution limit).
Supervised evaluation uses external ground truth — labels that come from outside the graph, such as the known faction membership of karate club members, the known genre classification of music artists, or the known sector membership of stocks. The standard metric is Normalized Mutual Information (NMI), which measures the statistical dependence between the algorithm’s partition and the ground-truth partition.
Normalized Mutual Information
Let \(X\) be the random variable representing ground-truth community labels and \(Y\) be the random variable representing algorithm-assigned community labels. The NMI is:
\[\text{NMI}(X, Y) = \frac{2 \, I(X; Y)}{H(X) + H(Y)}\]
where \(I(X; Y) = H(X) - H(X | Y)\) is the mutual information and \(H(\cdot)\) is the Shannon entropy. The normalization ensures that \(\text{NMI} \in [0, 1]\): a value of \(0\) means the two partitions are statistically independent (the algorithm learned nothing about the ground truth), and a value of \(1\) means the two partitions are identical up to a permutation of community labels.
NMI is implemented in scikit-learn as sklearn.metrics.normalized_mutual_info_score. Computing it requires converting both partitions to flat arrays of integer labels.
Evaluating Girvan–Newman and Louvain on the karate club
Before running the comparison, predict: will Girvan–Newman or Louvain achieve higher NMI against the karate-club ground truth? Consider that Girvan–Newman is deterministic and based on a principled structural criterion, while Louvain is a greedy heuristic. Does higher modularity necessarily imply higher NMI?
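A sketch of the comparison, using scikit-learn's NMI (the Louvain seed is an illustrative choice):

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman, louvain_communities
from sklearn.metrics import normalized_mutual_info_score

G = nx.karate_club_graph()
truth = [int(G.nodes[v]["club"] == "Mr. Hi") for v in sorted(G)]

def labels(comms):
    # Flatten a list of node sets into one integer label per node
    node2c = {v: i for i, c in enumerate(comms) for v in c}
    return [node2c[v] for v in sorted(G)]

nmi_gn = normalized_mutual_info_score(truth, labels(next(girvan_newman(G))))
nmi_lv = normalized_mutual_info_score(truth, labels(louvain_communities(G, seed=42)))
print(f"Girvan-Newman NMI: {nmi_gn:.3f}   Louvain NMI: {nmi_lv:.3f}")
```

Note that Louvain typically finds four communities on this graph, so its NMI against the two-faction ground truth is penalized even when the four communities nest cleanly inside the factions.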
Interpreting NMI scores
An NMI of \(1.0\) means perfect agreement with ground truth. An NMI of \(0.0\) means the partition is no more informative than random. In community detection benchmarks:
- NMI \(> 0.9\): near-perfect recovery of planted communities.
- NMI \(0.7\)–\(0.9\): good recovery with a few misassigned nodes.
- NMI \(0.4\)–\(0.7\): partial recovery; the algorithm captures the main structure but misses details.
- NMI \(< 0.4\): weak or no correspondence with ground truth.
The modularity score and the NMI score measure different things and do not always agree. A partition with high modularity is internally self-consistent according to the modularity criterion; a partition with high NMI agrees with external labels. Both are useful, and the gap between them is itself informative — when a high-modularity partition has low NMI, it suggests either that the algorithm is over-fitting to graph structure that does not correspond to the social or semantic communities of interest, or that the ground-truth labels are noisy.
In financial networks — specifically, stock correlation graphs — modularity-maximizing community detection is used to build diversified portfolios. If stocks within the same community move together (they have similar factor exposures), then a portfolio that selects one stock from each community is approximately factor-diversified. The standard test is whether the detected communities align with GICS sector classifications (the ground truth): NMI between the algorithmic partition and GICS sectors is typically \(0.5\)–\(0.7\), indicating that graph-based communities capture industry clustering but also discover cross-sector patterns (tech and consumer growth stocks, defensive stocks across sectors) that GICS does not. This is a case where NMI below \(1.0\) is actually informative — it tells you the graph-based communities are not just recovering the known taxonomy.
When Algorithms Disagree — the Resolution Limit
Different methods, different answers
The same graph, processed by different community detection algorithms, will often produce different partitions. This is not a sign that something has gone wrong — it is a consequence of the fact that each algorithm operationalizes “community” differently. Girvan–Newman finds groups separated by high-betweenness bridges; Louvain finds groups with high modularity given the degree sequence; Infomap finds groups that trap random walkers. These are related but distinct properties, and in borderline cases they point to different partitions.
The cell below runs three methods on the same graph and displays the partitions side by side:
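A minimal sketch of such a side-by-side run on the karate club graph (the seed is an illustrative choice):

```python
import networkx as nx
from networkx.algorithms.community import (
    greedy_modularity_communities, louvain_communities, girvan_newman, modularity)

G = nx.karate_club_graph()

methods = {
    "Greedy modularity": list(greedy_modularity_communities(G)),
    "Louvain": louvain_communities(G, seed=42),
    "Girvan-Newman": [set(c) for c in next(girvan_newman(G))],
}
for name, comms in methods.items():
    print(f"{name:>18}: {len(comms)} communities, Q = {modularity(G, comms):.3f}")
```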
The resolution limit
Even within the class of modularity-maximizing algorithms, there is a fundamental problem discovered by Fortunato and Barthélemy (2007): modularity cannot detect communities smaller than a scale set by the total number of edges in the network. Specifically, two communities connected by even a single edge will be merged into one by any modularity-maximizing algorithm whenever the product of their total degrees is less than \(2m\); for communities of comparable size, this happens whenever each has total degree below \(\sqrt{2m}\), where \(m\) is the total number of edges in the entire graph.
This means that in large networks, small but real communities will be systematically merged with their neighbors by modularity-based algorithms. A network with \(m = 10^6\) edges has a resolution scale of \(\sqrt{2 \times 10^6} \approx 1414\) — any community whose internal degree sum falls below \(\sim 1414\) (on the order of \(700\) internal edges) is in danger of being absorbed.
The resolution limit has a practical implication: the Louvain algorithm (and any modularity optimizer) is not reliable for detecting small communities in large networks. For such cases, either use multi-resolution modularity (passing a higher resolution parameter to louvain_communities) or use a method not based on modularity, such as Infomap.
The resolution limit is not a pathological edge case — it occurs routinely in large-scale networks. A social network with millions of users and hundreds of millions of edges will have a resolution scale that absorbs any community with fewer than roughly ten thousand internal edges. This means that modularity-based community detection on Facebook-scale graphs will find a small number of very large macro-communities, missing the rich micro-community structure that is often the real object of interest. Practitioners working at that scale typically use either multi-resolution modularity, hierarchical methods, or Infomap.
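The classic demonstration uses a ring of cliques: with enough cliques, merging adjacent cliques in pairs yields higher modularity than keeping each clique separate, even though the individual cliques are the obvious communities. A sketch with 24 cliques of 5 nodes (this assumes nx.ring_of_cliques labels each clique with consecutive node ids, which the test below verifies):

```python
import networkx as nx
from networkx.algorithms.community import modularity

# 24 cliques of 5 nodes each, joined into a ring by single edges
G = nx.ring_of_cliques(24, 5)   # 24*10 clique edges + 24 ring edges = 264

cliques = [set(range(5 * i, 5 * i + 5)) for i in range(24)]
pairs = [cliques[2 * i] | cliques[2 * i + 1] for i in range(12)]

Q_cliques = modularity(G, cliques)
Q_pairs = modularity(G, pairs)
print(f"Q(one community per clique) = {Q_cliques:.4f}")
print(f"Q(adjacent cliques merged)  = {Q_pairs:.4f}")
```

Because the merged partition scores higher, every modularity maximizer will report 12 communities where 24 were planted.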
Mini Case Study: Les Misérables
Three algorithms, one network, one verdict
The Les Misérables co-appearance network is an ideal benchmark for comparing algorithms: it is small enough to visualize, large enough to be interesting, and the communities correspond to recognizable narrative units of the novel (Valjean’s world, Thénardier’s underworld, the students of the ABC society, the Bishop’s circle). There is no definitive ground truth — the character groupings are interpretive — which makes it a realistic analogue of applied community detection, where ground truth is typically unavailable.
We run Greedy Modularity Optimization, Louvain, and Girvan–Newman, and compare their partitions on three dimensions: number of communities detected, modularity score, and visual coherence.
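A sketch of the comparison (two pragmatic choices here: the graph is treated as unweighted so all three methods optimize the same quantity, and the Girvan–Newman scan is truncated at 15 splits to keep the run short):

```python
import networkx as nx
from itertools import islice
from networkx.algorithms.community import (
    greedy_modularity_communities, louvain_communities, girvan_newman, modularity)

G = nx.les_miserables_graph()
results = {
    "Greedy modularity": list(greedy_modularity_communities(G)),
    "Louvain": louvain_communities(G, weight=None, seed=42),
}
# For Girvan-Newman, keep the best-Q partition among the first 15 splits
results["Girvan-Newman"] = max(
    ([set(c) for c in p] for p in islice(girvan_newman(G), 15)),
    key=lambda p: modularity(G, p, weight=None))

for name, comms in results.items():
    print(f"{name:>18}: {len(comms):2d} communities, "
          f"Q = {modularity(G, comms, weight=None):.3f}")
```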
Discussion: which algorithm to trust?
After comparing the three partitions, three observations are worth noting.
Agreement on macro-structure. All three algorithms agree on the highest-level split: Valjean’s large central cluster and the peripheral clusters around supporting characters. This agreement across different algorithmic paradigms is the strongest evidence that this macro-structure is a robust feature of the network, not an artifact of any one method.
Disagreement at the margins. The methods disagree on boundary nodes — characters who appear with characters from multiple groups. Cosette, for instance, sits at the intersection of Valjean’s world and the Thénardier household; different algorithms assign her to different communities depending on exactly how they weight her edge pattern. This disagreement is structurally meaningful: boundary nodes are genuinely ambiguous, and a hard partition necessarily makes an arbitrary choice for them.
Modularity scores are close. The three methods produce modularity scores in a narrow range around \(0.54\)–\(0.56\). This is a manifestation of the degeneracy problem: the modularity landscape has a broad, flat region near its maximum, and many qualitatively different partitions achieve nearly the same score. A higher modularity score does not imply a qualitatively better partition — it may simply mean the algorithm found a slightly different point on a flat plateau.
The right answer depends on the question. If you are studying narrative arcs, you might want fewer, larger communities that correspond to story chapters; Greedy Modularity with its larger communities may be more interpretable. If you are studying character interaction patterns at a finer granularity, Louvain with more communities may reveal structure that the coarser partition obscures. If you want to understand which characters act as bridges between groups, Girvan–Newman’s edge-betweenness framing is the most principled. Community detection is not a question with one right answer; it is a tool whose output must be interpreted in the context of the question being asked.
Choose a real-world network available in NetworkX — nx.karate_club_graph(), nx.les_miserables_graph(), nx.davis_southern_women_graph(), or nx.florentine_families_graph() — and run all three algorithms on it. For each method, report: number of communities, modularity \(Q\), and the NMI between that method and each of the other two. Write two or three sentences interpreting what the comparison reveals about the community structure of your chosen network.
Key Takeaways
The central lessons of this chapter are best understood as a set of tensions rather than a checklist.
Modularity is useful but not universal. The Newman–Girvan modularity \(Q\) is the most widely used quality measure for community partitions, and it is directly interpretable: a value above \(0.3\) indicates meaningful community structure. But modularity embeds a specific definition of community (assortative, with the random-graph null model), and any algorithm that maximizes it is blind to disassortative communities, and unreliable for small communities in large networks.
Speed scales with approximation. Girvan–Newman is conceptually the clearest but scales as \(O(nm^2)\), limiting it to small networks. Louvain runs in approximately \(O(n \log n)\) on sparse graphs and handles networks with hundreds of millions of nodes. Infomap is competitive with Louvain in speed and handles directed graphs naturally. The choice of algorithm involves a trade-off between interpretability, scalability, and the assumptions one is willing to make about what “community” means.
Evaluation requires both internal and external criteria. Modularity measures internal consistency; NMI measures correspondence with external labels. High modularity does not guarantee high NMI, and vice versa. When external labels are available — even noisy or partial ones — NMI is the more informative metric. When labels are unavailable, comparing the outputs of multiple algorithms and looking for agreement is the best available substitute for ground truth.
The resolution limit is not optional knowledge. Any practitioner using modularity-based community detection on large networks must understand the Fortunato–Barthélemy result: communities smaller than \(\sqrt{2m}\) edges are invisible to these algorithms. In large networks, this is a severe limitation. Multi-resolution modularity or flow-based methods are the recommended workarounds.
Community detection is not data mining — it is hypothesis generation. The output of any community detection algorithm is a structured hypothesis about latent groups in a network. Like all hypotheses, it must be evaluated against domain knowledge, alternative interpretations, and the question being asked. An algorithm that finds ten communities is not reporting a fact; it is suggesting a framing that may or may not be useful for the problem at hand.
Looking Ahead
Chapter 4 asks a different question: not “how do we find communities in an observed network,” but “how do observed networks come to have the structures they do?” The Erdős–Rényi model, the Watts–Strogatz small-world model, and the Barabási–Albert preferential attachment model are generative processes that produce networks with qualitatively different structural signatures — and the stochastic block model introduced in this chapter is a special case of this generative perspective, one focused specifically on community structure. Understanding how networks form provides the theoretical grounding for the structural patterns we have been measuring throughout Chapters 1–3.
← Chapter 2: Identifying Important Nodes · Chapter 4: Formation of Networks →