Academic Research · April 2026

A Mathematical Model of
Multiplicative Knowledge Growth

A probabilistic framework for knowledge acquisition, separating simple information intake from insight-generating knowledge that drives exponential growth.

Published · Zenodo · April 2026 · Differential Equations · Cognitive Psychology · Mathematical Modelling · Optimal Control · Stochastic Differential Equations · Information Theory

Summary — for recruiters

Mathematical Modelling & Academic Research · Theoretical Research · Zenodo · 2026

An original mathematical framework for quantifying knowledge acquisition, combining Cognitive Psychology with probabilistic differential equations and optimal control theory. Central finding: knowledge doesn't accumulate linearly — under the right conditions, it compounds. Front-loaded study schedules achieve up to 96% higher cumulative knowledge than constant-rate study.

Impact: Published on Zenodo. Models multiplicative knowledge growth, identifies 3 distinct regimes, and generates testable predictions for educational design.

Mathematical Modelling · Differential Equations · Cognitive Psychology · Probability & Statistics · Academic Writing · Zenodo

01 Abstract

What this research aims to do

A mathematical framework formalising the multiplicative growth patterns observed across expert learning domains — distinguishing simple information intake from insight-generating integration, where existing knowledge catalyses understanding of new information. Closed-form solutions cover all three growth regimes; a stochastic extension models individual-level variance; optimal control proves front-loaded study schedules yield 55–96% higher cumulative knowledge than constant pacing; and an information-theoretic result ties the leverage coefficient $\alpha$ to the mutual information between new and existing knowledge.

3

Growth regimes · closed-form solutions

+96%

Cumulative knowledge gain · front-loading

25×

α more influential than r · ∂λ/∂α vs ∂λ/∂r

Zenodo

Published & indexed · DOI: 10.5281

02 The Core Idea

Why knowledge is not simply additive

When a mathematician learns a new theorem, they don't simply add a fact: the theorem reveals connections, new approaches, and patterns across domains. Each connection is additional knowledge beyond the original input. Every learning event has probability $p$ of triggering a «eureka moment» that compounds on existing knowledge, and the more you know, the more likely the next insight: a self-reinforcing cycle.

Simple Integration · prob. (1−p)

You learn exactly one thing — gradual, predictable. How beginners mostly learn: memorising isolated facts.

Insight-Generating · prob. p

+1+I

Triggers $I$ additional insights — connections with what you already know. The «aha!» moment experts experience continuously.

«We don't learn facts — we learn connections. And every connection makes the next one more likely.»

03 Mathematical Framework

Variables · Equations · Derivations

In plain terms: knowledge grows when new material sticks (r), some of it sparks extra insight (p), and some of it fades from memory (δ). The equations below make that precise.

Symbol	Name	Description
K(t)	Knowledge base	Total concepts, facts, skills or connections at time t
r(t)	Learning rate	New learning opportunities encountered per unit of time
p(t)	Insight probability	Probability that a learning event generates extra derivative ideas
I	Insight size	Random variable: number of extra insights per insight event
δ	Forgetting rate	Proportional decay: fraction of current knowledge lost per unit of time
α	Knowledge leverage	How much each unit of existing knowledge raises future insight probability
λ	Growth coefficient	$r\alpha \cdot \mathbb{E}[I] - \delta$ — determines whether growth accelerates or plateaus

Expected Knowledge Increment

$$\mathbb{E}[\Delta K] = (1-p)\cdot 1 + p\cdot(1 + \mathbb{E}[I]) = 1 + p\cdot\mathbb{E}[I]$$

Every learning event adds one unit of knowledge, plus an expected bonus of $p \cdot \mathbb{E}[I]$ from insight events. The variance of that bonus is why learning feels unpredictable.
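The expected increment can be checked by simulation. The sketch below is illustrative only: it assumes $I$ is geometric on {1, 2, ...} (the distribution named in the extensions section), and the function names and the values $p = 0.20$, $\mathbb{E}[I] = 2.0$ are chosen here for the example.

```python
import random

def expected_increment(p: float, mean_insight: float) -> float:
    """E[ΔK] = 1 + p·E[I], the formula above."""
    return 1.0 + p * mean_insight

def sample_increment(p: float, mean_insight: float, rng: random.Random) -> int:
    """One learning event: always +1, plus I extra insights with probability p.

    ASSUMPTION: I is geometric on {1, 2, ...} with mean `mean_insight`,
    i.e. success probability q = 1 / mean_insight.
    """
    gained = 1
    if rng.random() < p:
        q = 1.0 / mean_insight
        extra = 1
        while rng.random() >= q:    # count trials until the first "success"
            extra += 1
        gained += extra
    return gained

rng = random.Random(0)              # fixed seed so the run is repeatable
p, mean_insight = 0.20, 2.0
draws = [sample_increment(p, mean_insight, rng) for _ in range(200_000)]
empirical = sum(draws) / len(draws)
print(f"formula: {expected_increment(p, mean_insight):.3f}  simulated: {empirical:.3f}")
```

The simulated mean should sit close to the formula value, while individual draws vary widely, which is the unpredictability the text refers to.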

Fundamental Growth Equation (with Forgetting)

$$\frac{dK(t)}{dt} = r(t)\bigl[1 + p(t)\cdot\mathbb{E}[I]\bigr] - \delta K(t)$$

Change in knowledge = learning inflow minus forgetting outflow, the latter proportional to current knowledge.

Knowledge-Dependent Insight Probability

$$p(t) = p_0 + \alpha K(t)$$

The more you know, the greater the probability of insight — why master chess players see patterns invisible to beginners.

Knowledge-Dependent Growth Dynamics

$$\frac{dK(t)}{dt} = r(1 + p_0\cdot\mathbb{E}[I]) + \underbrace{(r\alpha\cdot\mathbb{E}[I] - \delta)}_{\lambda}\, K(t)$$

$\lambda = r\alpha\cdot\mathbb{E}[I] - \delta$ is the critical quantity: it decides whether knowledge grows exponentially or hits a ceiling.
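The step from the fundamental growth equation to this form is a direct substitution of $p(t) = p_0 + \alpha K(t)$, written out here for constant $r$ and $\delta$:

$$\begin{aligned}
\frac{dK}{dt} &= r\bigl[1 + (p_0 + \alpha K)\,\mathbb{E}[I]\bigr] - \delta K \\
&= r\bigl(1 + p_0\,\mathbb{E}[I]\bigr) + \bigl(r\alpha\,\mathbb{E}[I] - \delta\bigr)\,K
\end{aligned}$$

The first term is a constant inflow; the second is linear in $K$, with coefficient $\lambda$.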

04 Three Growth Regimes

What happens depending on λ = rα·𝔼[I] − δ

The sign of $\lambda$ determines your entire learning trajectory.

Case 1 · λ > 0

Accelerating Growth

rα·𝔼[I] > δ

Exponential growth — each new piece makes you better at learning more. A virtuous cycle.

$$K(t) = \tfrac{r(1+p_0\mathbb{E}[I])}{\lambda}\!\left(e^{\lambda t}-1\right) + K(0)e^{\lambda t}$$

Case 2 · λ < 0

Plateau Effect

rα·𝔼[I] < δ

Forgetting dominates — learning hits a ceiling. Why some people don't progress no matter how much time they invest.

$$K^* = \frac{r(1+p_0\,\mathbb{E}[I])}{\delta - r\alpha\,\mathbb{E}[I]}$$

Case 3 · λ = 0

Linear Growth

rα·𝔼[I] = δ

Leverage and forgetting balance exactly — steady, but no acceleration. The midpoint between the other two regimes.

$$K(t) = K(0) + r(1+p_0\,\mathbb{E}[I])\,t$$
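The three closed-form solutions can be collected into one function and checked against a numerical integration. This is a minimal sketch, not the paper's code: the function names are hypothetical, the parameter values are the ones quoted elsewhere on this page ($r = 0.5$, $p_0 = 0.20$, $\mathbb{E}[I] = 2.0$, $\delta = 0.05$, $K_0 = 5$, $T = 40$), and $\alpha$ is varied only to switch regime.

```python
import math

def growth_params(r, alpha, mean_insight, delta, p0):
    """Return (beta, lam): constant inflow r(1 + p0·E[I]) and λ = rα·E[I] − δ."""
    beta = r * (1.0 + p0 * mean_insight)
    lam = r * alpha * mean_insight - delta
    return beta, lam

def knowledge(t, k0, beta, lam, tol=1e-12):
    """Closed-form K(t) for dK/dt = beta + lam·K, covering all three regimes."""
    if abs(lam) < tol:                  # Case 3: linear growth
        return k0 + beta * t
    growth = math.exp(lam * t)          # Cases 1 and 2 share one formula
    return beta / lam * (growth - 1.0) + k0 * growth

def euler(t_end, k0, beta, lam, dt=1e-3):
    """Forward-Euler integration, used only as an independent check."""
    k = k0
    for _ in range(round(t_end / dt)):
        k += (beta + lam * k) * dt
    return k

for alpha in (0.02, 0.05, 0.10):        # plateau, linear, accelerating
    beta, lam = growth_params(r=0.5, alpha=alpha, mean_insight=2.0, delta=0.05, p0=0.20)
    print(f"alpha={alpha:.2f}  lambda={lam:+.2f}  K(40)={knowledge(40, 5, beta, lam):8.2f}")
```

For $\lambda < 0$ the same exponential formula applies, and $K(t)$ converges to the ceiling $K^*$ as $t$ grows.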

Interactive figure: the λ coefficient (slider from λ < 0 to λ > 0, shown at λ = +0.05, accelerating growth). Exact closed-form solution from Theorem 3.1 with r = 0.5, p₀ = 0.20, 𝔼[I] = 2.0, K₀ = 5, T = 40.

Small changes in learning conditions compound dramatically over time — a slight rise in eureka probability or knowledge leverage can produce wildly different long-term outcomes.

05 Sensitivity Analysis

Which lever actually moves λ?

At baseline ($r=0.5$, $\alpha=0.02$, $\mathbb{E}[I]=2.0$, $\delta=0.05$, $p_0=0.20$), $\lambda = -0.03$ — plateau regime. Which parameter pushes the system fastest towards accelerating?

Local sensitivity at baseline, ∂λ/∂(·)

Parameter	Meaning	∂λ/∂(·)
α	Knowledge-leverage coefficient	+1.00
δ	Forgetting rate	−1.00
r	Learning rate / encounter frequency	+0.04
𝔼[I]	Expected insight magnitude	+0.01
p₀	Baseline insight probability	0 (indirect)

Source: Table 2, Dimakopoulos (2026) · baseline: r=0.5, α=0.02, 𝔼[I]=2.0, δ=0.05, p₀=0.20

In plain terms: $\partial\lambda/\partial\alpha = 1.0$ vs $\partial\lambda/\partial r = 0.04$, so per unit change $\alpha$ is a 25× stronger lever than the learning rate $r$. Adding more material without adding connection-making leaves you far from optimal.
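The sensitivities follow from $\lambda = r\alpha\,\mathbb{E}[I] - \delta$ and can be reproduced numerically. The sketch below uses central finite differences at the stated baseline; the helper names are hypothetical.

```python
def growth_coefficient(r, alpha, mean_insight, delta, p0):
    """λ = rα·E[I] − δ. p0 is accepted but unused: it does not enter λ."""
    return r * alpha * mean_insight - delta

BASELINE = dict(r=0.5, alpha=0.02, mean_insight=2.0, delta=0.05, p0=0.20)

def sensitivity(name, h=1e-6):
    """Central finite difference ∂λ/∂(name), evaluated at the baseline."""
    up = dict(BASELINE, **{name: BASELINE[name] + h})
    down = dict(BASELINE, **{name: BASELINE[name] - h})
    return (growth_coefficient(**up) - growth_coefficient(**down)) / (2 * h)

grads = {name: sensitivity(name) for name in BASELINE}
for name, value in grads.items():
    print(f"d(lambda)/d({name}) = {value:+.2f}")
print(f"alpha vs r: {grads['alpha'] / grads['r']:.0f}x")
```

Analytically, $\partial\lambda/\partial\alpha = r\,\mathbb{E}[I]$ and $\partial\lambda/\partial r = \alpha\,\mathbb{E}[I]$, so the ratio is $r/\alpha = 0.5/0.02 = 25$ at this baseline and changes with the baseline.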

Why α and δ are equivalent levers: $\partial\lambda/\partial\alpha = 1.0$ and $\partial\lambda/\partial\delta = -1.0$, equal in magnitude and opposite in direction. $\delta$ falls with spaced repetition (Ebbinghaus, 1885); $\alpha$ rises with elaborative interrogation, interleaved practice, and analogical reasoning.

Phase transition asymmetry: raising $r$ or $\mathbb{E}[I]$ lowers the threshold $\alpha_c = \delta/(r\,\mathbb{E}[I])$, but for a given learner only $\alpha$ itself moves the system relative to that boundary. More material helps; it doesn't replace connection-making.
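A short calculation makes the threshold concrete. The baseline values are the ones stated above; the "doubled $r$" and "halved $\delta$" scenarios are hypothetical, added here only to show how the threshold moves.

```python
def critical_alpha(r, mean_insight, delta):
    """α_c = δ / (r·E[I]); λ > 0 exactly when α > α_c."""
    return delta / (r * mean_insight)

BASELINE_ALPHA = 0.02
baseline = critical_alpha(r=0.5, mean_insight=2.0, delta=0.05)
more_material = critical_alpha(r=1.0, mean_insight=2.0, delta=0.05)     # hypothetical: r doubled
less_forgetting = critical_alpha(r=0.5, mean_insight=2.0, delta=0.025)  # hypothetical: δ halved

for label, threshold in [("baseline", baseline),
                         ("r doubled", more_material),
                         ("delta halved", less_forgetting)]:
    regime = "accelerating" if BASELINE_ALPHA > threshold else "plateau"
    print(f"{label:<13} alpha_c = {threshold:.3f}  ->  {regime} at alpha = {BASELINE_ALPHA}")
```

In this example, doubling the amount of material halves the threshold, yet a learner with the baseline $\alpha$ still sits below it.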

06 Mathematical Extensions

Beyond the baseline model

Variable insight sizes: $I$ follows a geometric distribution, so most insights are small and occasionally one is major. This captures the long tail of «eureka» moments.

Nonlinear forgetting: $\delta K$ → $\delta K^\beta$. $\beta > 1$: interference dominates. $\beta < 1$: consolidation, where entrenched knowledge is harder to forget.

Bounded insight probability: a logistic cap, $p(t) = \frac{p_{\max}}{1 + e^{-\alpha(K(t)-\theta)}}$, prevents the unrealistic $p(t) > 1$ as knowledge grows without bound.
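The difference between the linear and the logistic form is easiest to see side by side. In this sketch $p_0$ and $\alpha$ are the baseline values from the sensitivity analysis, while $p_{\max} = 0.9$ and $\theta = 50$ are hypothetical values picked for illustration.

```python
import math

def p_linear(k, p0=0.20, alpha=0.02):
    """Linear insight probability p0 + αK (unbounded)."""
    return p0 + alpha * k

def p_logistic(k, p_max=0.9, alpha=0.02, theta=50.0):
    """Logistic cap p_max / (1 + exp(−α(K − θ))); intended for K >= 0."""
    return p_max / (1.0 + math.exp(-alpha * (k - theta)))

for k in (0, 40, 100, 500):
    print(f"K={k:>3}  linear={p_linear(k):5.2f}  logistic={p_logistic(k):5.3f}")
```

With these values the linear form passes 1 once $K$ exceeds $(1 - p_0)/\alpha = 40$, whereas the logistic form stays below $p_{\max}$ for every $K$.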

07 Stochastic Extension

Individual variance & the Matthew Effect

The deterministic model captures average behaviour. In reality, individual trajectories vary a lot: some «take off», while others with identical parameters stagnate. This is modelled with a Cox–Ingersoll–Ross (CIR) stochastic differential equation (SDE), an equation describing how a quantity evolves under randomness; the CIR form is well known from interest-rate modelling (Cox, Ingersoll & Ross, 1985).

The equations below describe how knowledge and its spread across learners evolve over time; you don't need to read the math to follow the conclusion.

SDE Formulation (Itô, 1951)

$$dK(t) = \bigl(\beta + \lambda K(t)\bigr)\,dt + \sigma\sqrt{K(t)}\,dW(t)$$

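$W(t)$: Brownian motion. $\sigma > 0$: diffusion coefficient. The drift constant $\beta$ plays the role of the inflow $r(1 + p_0\,\mathbb{E}[I])$ in the deterministic model (it is not the forgetting exponent of the previous section). The square-root diffusion keeps $K(t) \geq 0$, and strictly positive under the Feller condition $2\beta \geq \sigma^2$; the noise scales with insight opportunities rather than staying constant.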

Variance Dynamics (Proposition 4.2)

$$\frac{dV}{dt} = 2\lambda V(t) + \sigma^2\,\mathbb{E}[K(t)], \qquad V_\infty = \frac{\sigma^2 K^*}{2|\lambda|} \quad (\lambda < 0)$$

In the accelerating regime ($\lambda > 0$), variance grows exponentially at rate $2\lambda$ — individual differences widen continuously. For $\lambda = +0.08$, $T = 30$: the IQR (Interquartile Range — the spread containing the middle 50% of sample values) becomes $\approx 11$× wider.
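One way to recover the 11× figure, assuming that the IQR scales with the standard deviation $\sqrt{V(t)}$ and that the exponential term dominates (a reading of the number, not a derivation taken from the paper):

$$\sqrt{V(t)} \;\propto\; e^{\lambda t} \quad\Longrightarrow\quad \text{spread factor over } [0, T] \;\approx\; e^{\lambda T} = e^{0.08 \times 30} = e^{2.4} \approx 11.0$$

Variance grows at rate $2\lambda$, so any spread measure on the scale of $K$ itself grows at rate $\lambda$.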

Matthew Effect: Learners with higher initial $K_0$ keep gaining a larger absolute advantage — «the rich get richer». For homogeneous $K_0$, variance comes solely from the diffusion term $\sigma\sqrt{K}\,dW$.

Plateau regime · λ < 0

$V \to V_\infty$

Variance converges to a finite value $V_\infty = \sigma^2 K^* / (2|\lambda|)$. Trajectories «cluster» around $K^*$.
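The convergence can be checked by integrating the mean and variance equations directly. A minimal sketch, assuming a homogeneous start ($V(0) = 0$) and a hypothetical $\sigma = 0.5$; $\beta = 0.7$ and $\lambda = -0.03$ correspond to the baseline parameters.

```python
def moments(t_end, k0, beta, lam, sigma, dt=1e-2):
    """Euler-integrate dm/dt = beta + lam·m and dV/dt = 2·lam·V + sigma²·m."""
    m, v = k0, 0.0      # everyone starts at k0, so the initial variance is 0
    for _ in range(round(t_end / dt)):
        m, v = m + (beta + lam * m) * dt, v + (2 * lam * v + sigma**2 * m) * dt
    return m, v

beta, lam, sigma = 0.7, -0.03, 0.5      # sigma is a hypothetical value
k_star = beta / abs(lam)                # plateau level K*
v_inf = sigma**2 * k_star / (2 * abs(lam))
m, v = moments(t_end=400.0, k0=5.0, beta=beta, lam=lam, sigma=sigma)
print(f"mean -> {m:.3f} (K* = {k_star:.3f}),  variance -> {v:.3f} (V_inf = {v_inf:.3f})")
```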

Accelerating regime · λ > 0

$V \sim e^{2\lambda t}$

Exponential variance growth. With $n=30$ Monte Carlo paths, identical starting conditions diverge into radically different outcomes.
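A Monte Carlo run of this kind can be sketched with an Euler–Maruyama discretisation. The scheme, the flooring at zero, the value $\sigma = 0.5$ and the path count are all assumptions made for this example (the path count is larger than the $n = 30$ quoted above so that the sample mean is stable).

```python
import math
import random

def simulate_cir(k0, beta, lam, sigma, t_end, dt, rng):
    """One Euler–Maruyama path of dK = (beta + lam·K)dt + sigma·sqrt(K)dW.

    K is floored at 0 inside the square root and after each step, a common
    discretisation fix; the continuous process needs no such floor.
    """
    k = k0
    for _ in range(round(t_end / dt)):
        dw = rng.gauss(0.0, math.sqrt(dt))      # Brownian increment over dt
        k += (beta + lam * k) * dt + sigma * math.sqrt(max(k, 0.0)) * dw
        k = max(k, 0.0)
    return k

def exact_mean(t, k0, beta, lam):
    """E[K(t)]: the noise has zero mean, so this is the deterministic solution."""
    return beta / lam * (math.exp(lam * t) - 1.0) + k0 * math.exp(lam * t)

rng = random.Random(42)                                  # fixed seed for repeatability
params = dict(k0=5.0, beta=0.7, lam=0.05, sigma=0.5)     # sigma is a hypothetical value
finals = [simulate_cir(t_end=40.0, dt=0.1, rng=rng, **params) for _ in range(500)]
sample_mean = sum(finals) / len(finals)
print(f"sample mean {sample_mean:.1f} vs exact mean {exact_mean(40.0, 5.0, 0.7, 0.05):.1f}")
print(f"range of K(40) across paths: {min(finals):.1f} to {max(finals):.1f}")
```

Every path starts from the same $K_0$ and uses the same parameters; the range printed at the end shows how far the outcomes spread.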

08 Optimal Control of Learning Schedules

When to study — the mathematical answer

Given fixed total effort $R = \int_0^T r(t)\,dt$, which learning schedule maximises cumulative knowledge $\int_0^T K(t)\,dt$? The answer follows from Optimal Control theory (Pontryagin et al., 1962).

Optimal Schedule — Theorem 5.1 (Bang-Bang)

$$r^*(t) = \begin{cases} r_{\max} & t \in [0,\,\tau] \\ 0 & t \in (\tau,\, T] \end{cases}, \qquad \tau = \frac{R}{r_{\max}}$$

The optimal schedule is bang-bang: maximum intensity for time $\tau$, then zero. Holds under accelerating dynamics ($\lambda > 0$) for the cumulative-knowledge objective.

Schedule	Pattern	Rank	vs Constant	Mechanism
Front-loaded	r_max → 0	1st	+55–96%	Cumulative leverage — early K boosts all future p(t)
Spaced	r̄(1 + 0.8 sin)	2nd	+4%	Reduced interference
Constant	r̄	3rd	ref.	Baseline
Back-loaded	0 → r_max	4th	−33%	Forgetting in idle period

Intuition: Early knowledge increases $p(t)$ for all subsequent periods — the same effort invested early yields compound returns. Formalises deliberate practice (Ericsson, 2008): intensive foundation-building before diversification.

Caveats: the result holds for the cumulative objective. If only terminal $K(T)$ matters and forgetting $\delta$ is high, back-loading may dominate. The model also ignores fatigue and sleep, so the front-loading result is an upper bound on real gains, not a prescriptive schedule.
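Both the ranking and the caveat can be explored with a small simulation of three schedules under equal total effort. The parameters below are illustrative assumptions ($\alpha = 0.08$ is chosen so that $\lambda > 0$ at constant pace) and are not intended to reproduce the paper's 55–96% figures; $p(t)$ is the uncapped linear form.

```python
def run_schedule(rate, t_end=40.0, dt=0.01, k0=5.0,
                 alpha=0.08, mean_insight=2.0, delta=0.05, p0=0.20):
    """Euler-integrate dK/dt = r(t)[1 + (p0 + αK)·E[I]] − δK.

    Returns (cumulative knowledge ∫K dt, terminal knowledge K(T)).
    Note: p = p0 + αK is not capped, so it may exceed 1 in this sketch.
    """
    k, cumulative = k0, 0.0
    for step in range(round(t_end / dt)):
        t = step * dt
        cumulative += k * dt
        k += (rate(t) * (1.0 + (p0 + alpha * k) * mean_insight) - delta * k) * dt
    return cumulative, k

T, R_MAX = 40.0, 1.0
TAU = T / 2                 # total effort R = R_MAX·TAU = 20 for every schedule
schedules = {
    "front-loaded": lambda t: R_MAX if t < TAU else 0.0,
    "constant":     lambda t: R_MAX * TAU / T,
    "back-loaded":  lambda t: R_MAX if t >= T - TAU else 0.0,
}
results = {name: run_schedule(rate) for name, rate in schedules.items()}
for name, (cumulative, terminal) in results.items():
    print(f"{name:<13} cumulative = {cumulative:8.1f}   K(T) = {terminal:6.1f}")
```

With these particular values the cumulative objective ranks front-loaded first and back-loaded last, while the terminal value $K(T)$ ranks them in the opposite order, which is the situation the caveat describes.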

09 Information-Theoretic Bridge

What determines α — the connection to Shannon entropy

The coefficient $\alpha$ is a 25× stronger lever than the learning rate $r$ (see Sensitivity Analysis). But what determines it physically?

Information-Theoretic Characterisation of α (Proposition 6.1)

$$\alpha \;\propto\; \frac{I(X_{\text{new}};\, X_{\text{existing}})}{H(X_{\text{new}})}$$

Ratio of mutual information between new/existing knowledge to the entropy of new information — measures what fraction of new information is «explained» by what you already know: the basis for insight.

$\alpha$ is a joint property of the learner and the domain. Structured domains (mathematics, chess) yield high $\alpha$ regardless of prior knowledge; fragmented domains yield low $\alpha$ even in experts.
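The ratio in Proposition 6.1 can be computed for any discrete joint distribution. The two 2×2 tables below are toy distributions invented for illustration, not estimates for any real domain, and the proportionality constant linking the ratio to $\alpha$ is left unspecified.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability cells."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def leverage_ratio(joint):
    """I(X_new; X_existing) / H(X_new) for a table joint[new][existing]."""
    p_new = [sum(row) for row in joint]             # marginal of the new item
    p_old = [sum(col) for col in zip(*joint)]       # marginal of prior knowledge
    h_joint = entropy([p for row in joint for p in row])
    mutual_info = entropy(p_new) + entropy(p_old) - h_joint
    return mutual_info / entropy(p_new)

structured = [[0.45, 0.05], [0.05, 0.45]]   # new material largely predictable from old
fragmented = [[0.25, 0.25], [0.25, 0.25]]   # new material independent of old

print(f"structured domain ratio: {leverage_ratio(structured):.3f}")
print(f"fragmented domain ratio: {leverage_ratio(fragmented):.3f}")
```

The ratio lies between 0 (new information independent of what is known) and 1 (new information fully determined by what is known).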

Domain	α (estimate)	δ (estimate)	p₀	Regime (K=50)
Chess (grandmaster)	0.06–0.10	0.02–0.04	0.35–0.55	Accelerating
Clinical medicine	0.03–0.06	0.03–0.06	0.20–0.40	Borderline
Intro. mathematics	0.02–0.05	0.04–0.08	0.15–0.30	Plateau → Linear
Factual recall	0.01–0.02	0.06–0.10	0.10–0.20	Plateau

Pedagogical corollary: Focusing on conceptual connections — not just information volume — is the critical lever for educational systems.

10 Falsifiable Predictions

Three testable predictions from the model

A theoretical framework is only valuable if it can be falsified. The model produces three classes of falsifiable predictions, testable with suitable longitudinal data.

Prediction I
Connection emphasis → higher α: Learners receiving connection-focused instruction (elaborative interrogation, analogies, concept mapping) will show higher $\alpha$ and a faster approach to the accelerating regime than learners receiving isolated-fact instruction with the same $r$.
Operationalisation: concept-map density or cued-recall latency as proxy for $\alpha$ in MOOC (Massive Open Online Course, e.g. Coursera) cohorts.

Prediction II
Spaced repetition → lower $\alpha_c$: Spaced repetition reduces effective $\delta$, lowering the threshold $\alpha_c = \delta/(r\,\mathbb{E}[I])$ (Corollary 3.3), so learners on spaced-repetition schedules will reach the accelerating regime at a lower $\alpha$ than those using massed practice.
Operationalisation: Anki/SuperMemo logs as a natural experiment for estimating $\delta_{\text{eff}}$ per learner.

Prediction III
Matthew Effect widening: In the accelerating regime ($\lambda > 0$), the variance of individual knowledge will grow exponentially at rate $2\lambda$ (Proposition 4.2), and learners with initially higher $K_0$ will continuously gain a larger absolute advantage, even with identical parameters $\alpha$, $r$, $\delta$.
Operationalisation: longitudinal IQR(K) analysis on datasets like Khan Academy or edX.
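A longitudinal IQR analysis of this kind could look like the sketch below. It runs on a synthetic cohort built for the example, not on Khan Academy or edX data, and the estimator (a least-squares slope of log IQR against time) is one possible choice rather than the paper's method. Since variance grows at rate $2\lambda$, a spread measure such as the IQR grows at rate $\lambda$, so the slope estimates $\lambda$.

```python
import math
import statistics

def iqr(values):
    """Interquartile range: spread of the middle 50% of the sample."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def log_slope(times, spreads):
    """Least-squares slope of log(spread) against time."""
    logs = [math.log(s) for s in spreads]
    t_bar, y_bar = statistics.fmean(times), statistics.fmean(logs)
    num = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, logs))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den

# SYNTHETIC cohort: learner i follows (50 + d_i)·e^{λt}, so the spread
# grows at rate λ by construction. Real data would be far noisier.
true_lam = 0.08
deviations = [d / 10 for d in range(-20, 21)]       # 41 hypothetical learners
times = list(range(0, 31, 5))
snapshots = [[(50.0 + d) * math.exp(true_lam * t) for d in deviations] for t in times]

estimated = log_slope(times, [iqr(snapshot) for snapshot in snapshots])
print(f"estimated lambda = {estimated:.4f} (true value {true_lam})")
```

A clearly positive slope would be consistent with the accelerating regime; a slope near zero or below would count against it.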

All three predictions are testable with existing naturalistic datasets — the hallmark of a scientific framework.

11 Educational Implications

What the model suggests for educational design

The framework generates concrete, testable predictions about which strategies should be most effective — those that push $\lambda$ above zero by maximising $p$ and $\alpha$.

Maximise insight probability $p$ — Explicitly highlight new-old relationships, use analogies, multiple contexts for the same concept.

Foundational knowledge first — Reach $r\alpha\cdot\mathbb{E}[I] > \delta$ before advancing. Strong foundations make each concept more likely to spark insight.

Reduce cognitive load — Lower $\delta$ shifts the regime towards accelerating growth, consistent with Cognitive Load Theory (Sweller et al.).

Distributed practice vs cramming — High $r$ over a short period doesn't compensate for low $\alpha\cdot\mathbb{E}[I]$. Distributed practice grows $K(t)$, raising future insight probability.

Interdisciplinary education — Cross-domain connections dramatically increase $\alpha$. Maths + biology → more likely insights in computational biology than either domain alone.

12 Empirical Grounding

Cognitive Psychology research supporting the model

Insight frequency: Individuals produce insightful solutions 20–60% of the time under controlled conditions (Kounios & Beeman, 2014), and these are often more correct than analytic ones (Salvi et al., 2016).

Chess expertise: Masters recognise patterns faster and more accurately than novices (Chase & Simon, 1973), which supports $p(t) = p_0 + \alpha K(t)$.

Deliberate practice: Expert performance involves seeing relationships invisible to novices (Ericsson, 2008), consistent with high $\alpha$ in expert domains.

Cognitive load theory: Environments emphasising conceptual connections produce greater gains than isolated skill practice (Sweller et al., 2011), in line with the model's prediction.

13 Limitations & Future Directions

Honest assessment of the model's limits

The model introduces significant abstractions. Acknowledging them doesn't weaken it — it precisely delineates where it holds and opens directions for extension.

Scalar knowledge: $K(t)$ models knowledge as a scalar, ignoring structure and domain-specificity. A future graph-valued $G(t)$ extension would capture knowledge's network nature.

Cognitive fatigue: $r(t)$ is treated as freely controllable, ignoring fatigue and sleep. In practice, sustained high-intensity learning degrades both $r(t)$ and $p(t)$, so the front-loading result is an upper bound, not a prescriptive schedule.

Within-domain only: The model doesn't address transfer between domains (positive or negative). A multi-domain extension with cross-domain $\alpha$ coefficients is needed.

Linear p(t) overflow: $p(t) = p_0 + \alpha K(t)$ allows $p(t) > 1$ for large $K$. The logistic saturation model (Eq. 7) fixes this; practical applications at high $K$ should use the saturating variant.

No empirical calibration: Parameters $\alpha$, $\delta$, $p_0$ in the domain estimates table are order-of-magnitude values, not empirically calibrated. Future work: Bayesian estimation from MOOC/Anki logs.

Proposed extensions: network-valued $G(t)$ · Bayesian calibration · multi-domain transfer · multi-agent collective dynamics · empirical validation with longitudinal datasets.

14 References

21 Cited Works · Cognitive Psychology, Information Theory & Optimal Control

Anderson, J. R. (1982) — Acquisition of cognitive skill. Psychological Review, 89(4):369–406.

Anderson, M. C. & Neely, J. H. (1996) — Interference and inhibition in memory retrieval. In Memory, Academic Press, pp. 237–313.

Chase, W. G. & Simon, H. A. (1973) — Perception in chess. Cognitive Psychology, 4(1):55–81.

Chi, M. T. H., Feltovich, P. J. & Glaser, R. (1981) — Categorization and representation of physics problems by experts and novices. Cognitive Science, 5(2):121–152.

Corbett, A. T. & Anderson, J. R. (1995) — Knowledge tracing: Modelling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4):253–278.

Cox, J. C., Ingersoll, J. E. & Ross, S. A. (1985) — A theory of the term structure of interest rates. Econometrica, 53(2):385–407.

Ebbinghaus, H. (1885) — Über das Gedächtnis. Duncker & Humblot, Leipzig.

Ericsson, K. A. (2008) — Deliberate practice and acquisition of expert performance. Academic Emergency Medicine, 15(11):988–994.

Itô, K. (1951) — On stochastic differential equations. Memoirs of the American Mathematical Society, 4:1–51.

Kornell, N. & Bjork, R. A. (2008) — Learning concepts and categories: Is spacing the 'enemy' of induction? Psychological Science, 19(6):585–592.

Kounios, J. & Beeman, M. (2014) — The cognitive neuroscience of insight. Annual Review of Psychology, 65:71–93.

Metcalfe, J. & Wiebe, D. (1987) — Intuition in insight and noninsight problem solving. Memory & Cognition, 15(3):238–246.

Newell, A. & Rosenbloom, P. S. (1981) — Mechanisms of skill acquisition and the law of practice. In Cognitive Skills and their Acquisition, Erlbaum, pp. 1–55.

Pashler, H. et al. (2007) — Enhancing learning and retarding forgetting: Choices and consequences. Psychonomic Bulletin & Review, 14(2):187–193.

Pontryagin, L. S. et al. (1962) — The Mathematical Theory of Optimal Processes. Wiley, New York.

Salvi, C. et al. (2016) — Insight solutions are correct more often than analytic solutions. Thinking & Reasoning, 22(4):443–460.

Schmidt, H. G. & Boshuizen, H. P. A. (1993) — On the origin of intermediate effects in clinical case recall. Memory & Cognition, 21(3):338–351.

Shannon, C. E. (1948) — A mathematical theory of communication. The Bell System Technical Journal, 27(3):379–423.

Simon, H. A. & Gilmartin, K. (1973) — A simulation of memory for chess positions. Cognitive Psychology, 5(1):29–46.

Sweller, J., Ayres, P. & Kalyuga, S. (2011) — Cognitive Load Theory. Springer, New York.

Wozniak, P. & Gorzelany, E. (1994) — Effect of the SuperMemo method on learning efficiency. Acta Neurobiologiae Experimentalis, 54:271–278.

The question «how to learn more effectively» has a quantitative answer: maximise α, not r. The sign of λ = rαE[I] − δ decides whether knowledge self-reinforces or hits a ceiling — and that's testable. Closed-form solutions, bang-bang optimal control with +55–96% gain from front-loading, and three falsifiable predictions make this framework testable, not merely descriptive.

Spilios Dimakopoulos · April 2026 · Published on Zenodo
