← Portfolio

Technical Report · Preprint under review · April 2026

Σε απλά λόγια: Έδειξα μαθηματικά πώς ένα σύστημα κοινωνικών παροχών που φαίνεται δίκαιο σε κάθε μεμονωμένη απόφαση, μπορεί με τον καιρό να γίνει άδικο συνολικά — και πρότεινα πώς να το εντοπίζεις πριν συμβεί.

ARIA: Accountable
Real-time Intelligence Arbiter

Algorithmic governance of social benefits with formal temporal-fairness guarantees: proof that a locally correct system can still produce systematic injustice over time.

Systems Engineering · AI Governance Formal system modeling 3 research questions — all 3 answered EU AI Act (Article 14)

Preprint ↗ Zenodo · preprint GitHub aria-tfas · 30/30 tests passing

Summary — for recruiters

Systems Engineer & AI Ethics Researcher · Formal Methods · Algorithmic Fairness · 2026

I designed a formal AI–human architecture that mathematically proves a system can make every single decision correctly and still produce systematic injustice over time.

ΚΥΡΙΑ ΕΥΡΗΜΑΤΑ Αποδείχθηκε επίσημα (μαθηματική επαγωγή) και επαληθεύτηκε σε n=100.000 εφαρμογές. Πλήρης αρχιτεκτονική 4 υποσυστημάτων με 30/30 unit tests και 1:1 αντιστοίχιση με EU AI Act Art. 14.

Formal Methods · Petri Nets SysML · BPMN 2.0 Python · Bootstrap CI STRIDE Security Analysis SSM / CATWOE EU AI Act Compliance

Abstract

The problem and the approach in one paragraph

Abstract

Public administrations deploy algorithmic welfare systems with no way to prove they stay fair over time: a system can satisfy every individual fairness rule and still produce systematic injustice at the population level. ARIA introduces TFAS (Temporally Fair Arbitration System — the five-condition framework this paper defines) — five independently checkable conditions — and proves the Temporal Injustice Theorem: for δ ≥ 1, decisions diverge monotonically across groups, regardless of local correctness. Verified both analytically (induction) and empirically (n=100,000, B=200 bootstrap, |G|=4 groups).

Το Πρόβλημα: Δομική Αόρατη Αδικία

Why local correctness doesn't guarantee global fairness

"A system can reject every application strictly by the rules, while denying entire communities support they're legally owed. No single decision is wrong — the injustice is structural and invisible without temporal analysis."

The temporal-injustice loop — 6 steps

Every step is locally correct — the injustice only emerges in the sequence.

Post-hoc audit failure

SyRI (Netherlands), Robodebt (Australia): bias surfaced years later — thousands of harmful decisions before any correction.

Static metrics fall short

Individual Fairness, Equal Opportunity, Counterfactual Fairness all define fairness as static — none model the human–retraining feedback loop.

ARIA's answer

TFAS introduces δ as a formal system parameter. The condition δ < 1 bridges individually fair decisions with systemic temporal fairness.

Το Κενό στη Βιβλιογραφία

No other definition combines all five dimensions at once

Algorithmic-fairness research covers many dimensions individually — no single definition combines all five, until TFAS.

Καλύπτεται Μερικώς Δεν καλύπτεται

Ορισμός	Temporal Stability temporal drift	Bounded Feedback δ as parameter	Human Loop human-in-the-loop	Auditability / Audit legal record	Demographic Privacy G separation
Individual Fairness [Dwork et al., 2012]
Equal Opportunity [Hardt et al., 2016]
Counterfactual Fairness [Kusner et al., 2017]
G/B-Fairness (Petri) [Fröschle, 2015]
Runtime Monitoring [Henzinger et al., 2023]
TFAS — ARIA Dimakopoulos, 2026

No other definition covers all five dimensions. The intersection of these five is ARIA's original contribution.

Ορισμός TFAS

Temporally Fair Arbitration System — 5 independently checkable conditions

Definition 4.1 — TFAS

$$\text{TFAS}(S) \iff LF \land TS \land BF \land AUD \land LS$$

System S satisfies TFAS if and only if all five conditions are satisfied simultaneously.

(i) LF — Local Fairness

Local Fairness

P(esc | eᵢ, f(eᵢ)) ⊥ group(eᵢ)

The escalation decision is independent of group membership — enforced architecturally by SS2.

Arch: SS2 · Gate #1

(ii) TS — Temporal Stability

Temporal Stability

r(gⱼ,t)/r(gₖ,t) ∈ [1−ε, 1+ε] ∀ t

Escalation rates across groups stay within a tolerance band ε ∈ (0, 0.15] — monitored by the SS1 using SPC (Statistical Process Control — control-chart-based statistical monitoring).

Arch: SS1, W · Gate #5

(iii) BF — Bounded Feedback

Bounded Feedback

|Δθ(g,t)| ≤ δ |Δθ(g,t−1)|, δ < 1

The central innovation: the first formal parameter modelling the human reviewer's contribution to retraining. Requires geometric convergence.

Arch: Thm. 5.1 · Gate #2

(iv) AUD — Auditability

Auditability

∀eᵢ: ∃ rec(eᵢ) ∈ SS4 ∧ notify(DPA) ≤ τ

Every decision is logged in a Merkle append-only log; the DPA (Data Protection Authority — the regulator responsible for enforcing data-protection law) is notified within τ=48h. Implements EU AI Act Art. 14.

Arch: SS4 · Gate #3

(v) LS — Level Separation

Level Separation — G ∈ SS1.inputs, G ∉ SS2.inputs

Demographic data G is stored exclusively in SS1 — never linked to individual decisions

Architectural privacy separation: G is used only for monitoring, never as a model feature. Statically verified on every deployment via Gate #1.

Arch: Gate #1 (static verification at every CI/CD run)

Temporal Injustice Theorem

Proof by mathematical induction — verified at n=100,000

The central theoretical result: local correctness is not equivalent to global temporal fairness. Even if every individual decision is unbiased, retraining feedback can still lead to irreversible divergence.

Theorem 5.1 — Temporal Injustice Theorem

$$\text{If } \delta \geq 1 \text{, then } \exists\, g_j, g_k \in G : |\Pi(g_j,t) - \Pi(g_k,t)| \to \infty \text{ as } t \to \infty$$

Independently of the local correctness of any individual transition. Base case: r(gⱼ,1) = r₀(1+δβ). Induction: r(gⱼ,t)/r(gₖ,t) = (1+δβ)ᵗ. For δ≥1 and β>0, the ratio grows geometrically without bound.

Corollary 5.1 — Stability Condition

$$\delta < 1 \implies |\Delta\theta(g,t)| \to 0 \text{ geometrically} \implies \text{thresholds converge}$$

TFAS(iii) δ<1 is necessary to prevent monotonic divergence: thresholds converge, giving SS1 time to intervene before TFAS(ii) is irreversibly violated.

Empirical verification — ratio r(A)/r(B) per time step (t = 0 → 20)

OLS (Ordinary Least Squares regression) slopes: δ=0.5 → 0.0067/step · δ=1.2 → 0.0163/step · R²>0.999 · p<10⁻⁴⁰ · 2.4× faster divergence · n=100,000, β=0.04, B=200 bootstrap, seed=42

The multi-group extension (|G|=4: migrants β=0.04, long-term unemployed β=0.02, people with disabilities β=0.05) confirms the theorem for |G|>2 — under δ=1.2, all three protected groups diverge simultaneously.

Αρχιτεκτονική Τεσσάρων Υποσυστημάτων

Each subsystem enforces exactly one group of TFAS conditions

SS1 — Bias Monitor

Output Distribution Monitoring

SysML BDD (Block Definition Diagram) · SPC Control Chart

Monitors per-group rates r(g,t) using a sliding window W=10,000 and CUSUM k=2.0 (a statistical test for detecting gradual drift). Under δ=0.5: 6 alarms across 120,000 decisions; under δ=1.2: persistent alarms.

Enforces: TFAS(ii) Temporal Stability

SS2 — Intervention Engine

Escalation Logic & Automaton

Finite Automaton · Stateflow

4 states: MONITOR → ESCALATE → SUSPEND / OVERRIDE. G never enters as input; SUSPEND halts retraining whenever SS1 raises a TFAS alarm.

Enforces: TFAS(i) Local Fairness · TFAS(v) Level Separation

SS3 — Human Interface

XAI Context & Anti-Automation-Bias HMI

UML · BPMN 2.0 · awaiting HMI (Human-Machine Interface) prototype

Provides XAI (Explainable AI) context and counterfactual explanations without G — addresses automation bias (~4% empirical). The only subsystem defined in SysML/BPMN without an executable implementation.

Enforces: TFAS(iii) Bounded Feedback (behavioural)

SS4 — Audit Logger

Append-only Merkle Audit Log

Python (executable · verified)

Every decision is recorded with a Merkle hash chain (a tamper-evident log: any edit breaks the chain and is detectable); every threshold change is logged with timestamp+admin_id. External notary + DPA read-only access. Implements EU AI Act Art. 14(4)(d).

Enforces: TFAS(iv) Auditability

Ερευνητικά Ερωτήματα & Απαντήσεις

Three research questions — all three answered

RQ1 ✓ Answered

Can human feedback be embedded as a formally bounded parameter?

Yes. TFAS(iii) is the first fairness definition that treats the human reviewer's retraining signal as an explicit, bounded parameter — grounded by the Theorem, estimated via proxy verification in CI/CD, and legally documentable in SS4.

TFAS(iii) — the first formal human-feedback parameter in the fairness literature

RQ2 ✓ Proven

Under what conditions does a locally correct system produce globally unjust outcomes?

The Temporal Injustice Theorem proves δ≥1 is sufficient for monotonic divergence. Important: δ<1 is necessary but not sufficient — for δ=0.9 and β≥0.07, the violation rate exceeds 50%.

δ=1.2 → 2.4× faster divergence · p<10⁻⁴⁰ · R²>0.999 · Final ratio T=20: 1.38 vs 1.19

RQ3 ✓ Verified

How can a 4-subsystem architecture implement, check and prove TFAS compliance?

1:1 TFAS ↔ SS ↔ CI/CD gate mapping. Executable Python implementations of SS1, SS2, SS4 verify that each subsystem enforces its TFAS condition in practice.

30/30 unit tests · 1:1 TFAS↔CI/CD · SS1+SS2+SS4 executable · SS3 in SysML/BPMN

CI/CD Deployment Gate

5 automated checks — 1:1 mapping to TFAS conditions

The deployment gate translates EU AI Act compliance into a verifiable engineering artefact — unlike Model Cards, Gate #5 blocks deployment without current verification.

Model Feature Inspection — static check that dem_grp ∉ the model's features

TFAS(v) Level Separation · every commit

Feedback Coefficient Config — fb_coef < 1.0, enforced in an externally configurable file

TFAS(iii) Bounded Feedback

Merkle Log Integrity — verifies the SS4 audit log's hash chain

TFAS(iv) Auditability · EU AI Act Art. 14(4)(d)

Parameter Bounds — verifies that θ ∈ [0.5, 0.95]

TFAS(i) Local Fairness

Demographic Smoke Test — |rA/rB −1| ≤ ε across n=1,000 synthetic applications

TFAS(ii) Temporal Stability · every deployment

Contributions

Five original contributions to the algorithmic-fairness literature

C1 · TFAS Definition

First formal definition with bounded feedback

TFAS is the first fairness definition that embeds the human reviewer's retraining contribution as an explicit parameter — rather than treating it as an exogenous disturbance.

C2 · Temporal Injustice Theorem

Proven both analytically and empirically

Mathematical induction + verification at n=100,000 (B=200 bootstrap). The first formal bridge between locally fair decisions and systemic temporal injustice.

C3 · ARIA Architecture

4 subsystems in SysML/BPMN 2.0

1:1 TFAS ↔ SS ↔ CI/CD gate mapping. Security analysis using STRIDE (a framework for identifying security threats; 13 threats identified). A verifiable engineering artefact for EU AI Act Art. 14 + GDPR (EU General Data Protection Regulation) Art. 22.

C4 · Sensitivity Analysis

15×10 grid (δ, β) space

Answers: for which combinations is runtime monitoring sufficient? Key finding: δ≤0.6 and the SUSPEND mechanism is the primary control — not just parameter choice.

C5 · Privacy-Preserving Fairness

Fairness verification without demographic exposure

Demographic data G is stored exclusively in SS1 — never linked to individual decisions. CI/CD Gate #1 architecturally verifies this separation on every deployment, resolving the privacy paradox [Veale & Binns, 2017].

Practical Implications

For public administrations, legal practitioners, and AI ethics researchers

Public Administrations: append-only log → Art. 14(4)(d), CI/CD gate → Art. 9 risk management, DPA notification → Art. 14(3)(b). The first architecture to address all three at once.

Citizens & Legal Practitioners: The audit log creates a legally defensible record — citizens can contest automated decisions (GDPR Art. 22), with every threshold change carrying a timestamp and admin ID.

AI Ethics Researchers: The (δ,β) sensitivity analysis offers concrete guidance. The theorem generalises beyond benefits — criminal law, medical AI, hiring — anywhere human judgments feed into retraining.

The "Fairness Washing" Risk: ARIA acknowledges its own risk — algorithmically satisfying the conditions while failing their spirit (e.g. gaming Gate #5). The CI/CD gate is a floor, not a ceiling.

Technical Appendix EN ONLY

Reproducibility · STRIDE security · Limitations — for technical reviewers

Methodology & Reproducibility — open-source · fixed seed 42

Systems Methodology

SSM (Soft Systems Methodology — a method for structuring complex, ill-defined problems into stages) / CATWOE (its analysis tool) — chosen because algorithmic governance is a "wicked problem" (no single correct framing); CATWOE forces an explicit worldview before technical design.

Empirical Setup

n=100,000 · T=20 steps · W=10,000 window · ε=0.10 · β=0.04 · rA₀=0.72 · rB₀=0.68 · Beta(5,5) · B=200 bootstrap · seed=42

Petri Net Verification

Verified that the system never deadlocks (liveness) and resource counts stay capped (k-boundedness), across T=20 cycles.

Sensitivity Analysis

15×10 grid (δ, β) · n=50,000 · B=30 per cell · 150 cells total.

STRIDE Security Analysis

13 threats analysed under STRIDE (a standard attack-category checklist); 4 high-priority: Spoofing (mTLS+SS4), Tampering (Merkle log + notary), Information Disclosure (Gate #1), Elevation of privilege (rate limit+WORM).

Code Repository

github.com/SpiliosDimakopoulos/aria-tfas · 6 scripts · 30/30 unit tests (~5s) · MIT license · seed=42

Limitations & Future Work

Petri Net scope: verified only for T=20 cycles, not an unlimited time horizon. Future work: symbolic model checking (SPIN/LTL).

Synthetic β: calibrated from [Alon-Barkat & Busuioc, 2023], not measured on a live deployment — a field study with a consenting administration is needed.

SS3 unverified: the only subsystem without an executable implementation. Future work: an HMI prototype with a usability study on real reviewers.

Static G: TFAS assumes fixed groups — legislative change requires re-verification. Future work: dynamic G with versioned TFAS certificates.

ARIA δεν αυτοματοποιεί αποφάσεις.
Αυτοματοποιεί το ερώτημα ποιος πρέπει να τις λάβει — και αποδεικνύει επίσημα τι συμβαίνει όταν αυτό το ερώτημα απαντάται σωστά, αλλά το σύστημα λειτουργεί αρκετά καιρό.

Spilios Dimakopoulos · Systems Engineering · April 2026

Back to Portfolio

ARIA: AccountableReal-time Intelligence Arbiter

ARIA: Accountable
Real-time Intelligence Arbiter