← Portfolio
Technical Report · Preprint under review · April 2026
Σε απλά λόγια: Έδειξα μαθηματικά πώς ένα σύστημα κοινωνικών παροχών που φαίνεται δίκαιο σε κάθε μεμονωμένη απόφαση, μπορεί με τον καιρό να γίνει άδικο συνολικά — και πρότεινα πώς να το εντοπίζεις πριν συμβεί.

ARIA: Accountable
Real-time Intelligence Arbiter

Algorithmic governance of social benefits with formal temporal-fairness guarantees: proof that a locally correct system can still produce systematic injustice over time.

Systems Engineering · AI Governance Formal system modeling 3 research questions — all 3 answered EU AI Act (Article 14)
5
TFAS conditions
100K
Applications verified
30/30
Unit tests passing
2.4×
Faster divergence δ=1.2
Art.14
EU AI Act mapped
Summary — for recruiters
Systems Engineer & AI Ethics Researcher · Formal Methods · Algorithmic Fairness · 2026
I designed a formal AI–human architecture that mathematically proves a system can make every single decision correctly and still produce systematic injustice over time.
ΚΥΡΙΑ ΕΥΡΗΜΑΤΑ Αποδείχθηκε επίσημα (μαθηματική επαγωγή) και επαληθεύτηκε σε n=100.000 εφαρμογές. Πλήρης αρχιτεκτονική 4 υποσυστημάτων με 30/30 unit tests και 1:1 αντιστοίχιση με EU AI Act Art. 14.
Formal Methods · Petri Nets SysML · BPMN 2.0 Python · Bootstrap CI STRIDE Security Analysis SSM / CATWOE EU AI Act Compliance
Abstract
The problem and the approach in one paragraph
Abstract

Public administrations deploy algorithmic welfare systems with no way to prove they stay fair over time: a system can satisfy every individual fairness rule and still produce systematic injustice at the population level. ARIA introduces TFAS (Temporally Fair Arbitration System — the five-condition framework this paper defines) — five independently checkable conditions — and proves the Temporal Injustice Theorem: for δ ≥ 1, decisions diverge monotonically across groups, regardless of local correctness. Verified both analytically (induction) and empirically (n=100,000, B=200 bootstrap, |G|=4 groups).

Το Πρόβλημα: Δομική Αόρατη Αδικία
Why local correctness doesn't guarantee global fairness
"A system can reject every application strictly by the rules, while denying entire communities support they're legally owed. No single decision is wrong — the injustice is structural and invisible without temporal analysis."
The temporal-injustice loop — 6 steps
δ · β feedback loop Κλιμάκωση Αμφίβολη αίτηση → αξιολογητή Συγγένεια β Ασύνειδη μεροληψία β ≈ 4% Training signal Έγκριση → θετικό σήμα Μοντέλο μαθαίνει Δημογρ. proxy ↔ έγκριση Κατώφλι μετατοπίζεται Λιγότερες κλιμακώσεις Μεροληψία ενισχύεται χωρίς λανθασμένη απόφαση
Every step is locally correct — the injustice only emerges in the sequence.
Post-hoc audit failure
SyRI (Netherlands), Robodebt (Australia): bias surfaced years later — thousands of harmful decisions before any correction.
Static metrics fall short
Individual Fairness, Equal Opportunity, Counterfactual Fairness all define fairness as static — none model the human–retraining feedback loop.
ARIA's answer
TFAS introduces δ as a formal system parameter. The condition δ < 1 bridges individually fair decisions with systemic temporal fairness.
Το Κενό στη Βιβλιογραφία
No other definition combines all five dimensions at once

Algorithmic-fairness research covers many dimensions individually — no single definition combines all five, until TFAS.

Καλύπτεται Μερικώς Δεν καλύπτεται
Ορισμός Temporal
Stability
temporal drift
Bounded
Feedback
δ as parameter
Human
Loop
human-in-the-loop
Auditability
/ Audit
legal record
Demographic
Privacy
G separation
Individual Fairness [Dwork et al., 2012]
Equal Opportunity [Hardt et al., 2016]
Counterfactual Fairness [Kusner et al., 2017]
G/B-Fairness (Petri) [Fröschle, 2015]
Runtime Monitoring [Henzinger et al., 2023]
TFAS — ARIA Dimakopoulos, 2026
No other definition covers all five dimensions. The intersection of these five is ARIA's original contribution.
Ορισμός TFAS
Temporally Fair Arbitration System — 5 independently checkable conditions
Definition 4.1 — TFAS
$$\text{TFAS}(S) \iff LF \land TS \land BF \land AUD \land LS$$
System S satisfies TFAS if and only if all five conditions are satisfied simultaneously.
(i) LF — Local Fairness
Local Fairness
P(esc | eᵢ, f(eᵢ)) ⊥ group(eᵢ)
The escalation decision is independent of group membership — enforced architecturally by SS2.
Arch: SS2 · Gate #1
(ii) TS — Temporal Stability
Temporal Stability
r(gⱼ,t)/r(gₖ,t) ∈ [1−ε, 1+ε] ∀ t
Escalation rates across groups stay within a tolerance band ε ∈ (0, 0.15] — monitored by the SS1 using SPC (Statistical Process Control — control-chart-based statistical monitoring).
Arch: SS1, W · Gate #5
(iii) BF — Bounded Feedback
Bounded Feedback
|Δθ(g,t)| ≤ δ |Δθ(g,t−1)|, δ < 1
The central innovation: the first formal parameter modelling the human reviewer's contribution to retraining. Requires geometric convergence.
Arch: Thm. 5.1 · Gate #2
(iv) AUD — Auditability
Auditability
∀eᵢ: ∃ rec(eᵢ) ∈ SS4 ∧ notify(DPA) ≤ τ
Every decision is logged in a Merkle append-only log; the DPA (Data Protection Authority — the regulator responsible for enforcing data-protection law) is notified within τ=48h. Implements EU AI Act Art. 14.
Arch: SS4 · Gate #3
(v) LS — Level Separation
Level Separation — G ∈ SS1.inputs, G ∉ SS2.inputs
Demographic data G is stored exclusively in SS1 — never linked to individual decisions
Architectural privacy separation: G is used only for monitoring, never as a model feature. Statically verified on every deployment via Gate #1.
Arch: Gate #1 (static verification at every CI/CD run)
Temporal Injustice Theorem
Proof by mathematical induction — verified at n=100,000

The central theoretical result: local correctness is not equivalent to global temporal fairness. Even if every individual decision is unbiased, retraining feedback can still lead to irreversible divergence.

Theorem 5.1 — Temporal Injustice Theorem
$$\text{If } \delta \geq 1 \text{, then } \exists\, g_j, g_k \in G : |\Pi(g_j,t) - \Pi(g_k,t)| \to \infty \text{ as } t \to \infty$$
Independently of the local correctness of any individual transition. Base case: r(gⱼ,1) = r₀(1+δβ). Induction: r(gⱼ,t)/r(gₖ,t) = (1+δβ)ᵗ. For δ≥1 and β>0, the ratio grows geometrically without bound.
Corollary 5.1 — Stability Condition
$$\delta < 1 \implies |\Delta\theta(g,t)| \to 0 \text{ geometrically} \implies \text{thresholds converge}$$
TFAS(iii) δ<1 is necessary to prevent monotonic divergence: thresholds converge, giving SS1 time to intervene before TFAS(ii) is irreversibly violated.
Empirical verification — ratio r(A)/r(B) per time step (t = 0 → 20)
TFAS(ii) zone 1.40 1.30 1.20 1.10 1.00 0 5 10 15 20 χρονικό βήμα t 1η παραβίαση TFAS(ii) t=2 1.19 1.40 δ = 0.5 (TFAS-compliant) δ = 1.2 (adversarial) — 2.4× ταχύτερη απόκλιση
OLS (Ordinary Least Squares regression) slopes: δ=0.5 → 0.0067/step · δ=1.2 → 0.0163/step · R²>0.999 · p<10⁻⁴⁰ · 2.4× faster divergence · n=100,000, β=0.04, B=200 bootstrap, seed=42

The multi-group extension (|G|=4: migrants β=0.04, long-term unemployed β=0.02, people with disabilities β=0.05) confirms the theorem for |G|>2 — under δ=1.2, all three protected groups diverge simultaneously.

Αρχιτεκτονική Τεσσάρων Υποσυστημάτων
Each subsystem enforces exactly one group of TFAS conditions
SS1 — Bias Monitor
Output Distribution Monitoring
SysML BDD (Block Definition Diagram) · SPC Control Chart
Monitors per-group rates r(g,t) using a sliding window W=10,000 and CUSUM k=2.0 (a statistical test for detecting gradual drift). Under δ=0.5: 6 alarms across 120,000 decisions; under δ=1.2: persistent alarms.
Enforces: TFAS(ii) Temporal Stability
SS2 — Intervention Engine
Escalation Logic & Automaton
Finite Automaton · Stateflow
4 states: MONITOR → ESCALATE → SUSPEND / OVERRIDE. G never enters as input; SUSPEND halts retraining whenever SS1 raises a TFAS alarm.
Enforces: TFAS(i) Local Fairness · TFAS(v) Level Separation
SS3 — Human Interface
XAI Context & Anti-Automation-Bias HMI
UML · BPMN 2.0 · awaiting HMI (Human-Machine Interface) prototype
Provides XAI (Explainable AI) context and counterfactual explanations without G — addresses automation bias (~4% empirical). The only subsystem defined in SysML/BPMN without an executable implementation.
Enforces: TFAS(iii) Bounded Feedback (behavioural)
SS4 — Audit Logger
Append-only Merkle Audit Log
Python (executable · verified)
Every decision is recorded with a Merkle hash chain (a tamper-evident log: any edit breaks the chain and is detectable); every threshold change is logged with timestamp+admin_id. External notary + DPA read-only access. Implements EU AI Act Art. 14(4)(d).
Enforces: TFAS(iv) Auditability
Ερευνητικά Ερωτήματα & Απαντήσεις
Three research questions — all three answered
RQ1 ✓ Answered
Can human feedback be embedded as a formally bounded parameter?
Yes. TFAS(iii) is the first fairness definition that treats the human reviewer's retraining signal as an explicit, bounded parameter — grounded by the Theorem, estimated via proxy verification in CI/CD, and legally documentable in SS4.
TFAS(iii) — the first formal human-feedback parameter in the fairness literature
RQ2 ✓ Proven
Under what conditions does a locally correct system produce globally unjust outcomes?
The Temporal Injustice Theorem proves δ≥1 is sufficient for monotonic divergence. Important: δ<1 is necessary but not sufficient — for δ=0.9 and β≥0.07, the violation rate exceeds 50%.
δ=1.2 → 2.4× faster divergence · p<10⁻⁴⁰ · R²>0.999 · Final ratio T=20: 1.38 vs 1.19
RQ3 ✓ Verified
How can a 4-subsystem architecture implement, check and prove TFAS compliance?
1:1 TFAS ↔ SS ↔ CI/CD gate mapping. Executable Python implementations of SS1, SS2, SS4 verify that each subsystem enforces its TFAS condition in practice.
30/30 unit tests · 1:1 TFAS↔CI/CD · SS1+SS2+SS4 executable · SS3 in SysML/BPMN
CI/CD Deployment Gate
5 automated checks — 1:1 mapping to TFAS conditions

The deployment gate translates EU AI Act compliance into a verifiable engineering artefact — unlike Model Cards, Gate #5 blocks deployment without current verification.

#1
Model Feature Inspection — static check that dem_grp ∉ the model's features
TFAS(v) Level Separation · every commit
#2
Feedback Coefficient Config — fb_coef < 1.0, enforced in an externally configurable file
TFAS(iii) Bounded Feedback
#3
Merkle Log Integrity — verifies the SS4 audit log's hash chain
TFAS(iv) Auditability · EU AI Act Art. 14(4)(d)
#4
Parameter Bounds — verifies that θ ∈ [0.5, 0.95]
TFAS(i) Local Fairness
#5
Demographic Smoke Test — |rA/rB −1| ≤ ε across n=1,000 synthetic applications
TFAS(ii) Temporal Stability · every deployment
Contributions
Five original contributions to the algorithmic-fairness literature
C1 · TFAS Definition
First formal definition with bounded feedback
TFAS is the first fairness definition that embeds the human reviewer's retraining contribution as an explicit parameter — rather than treating it as an exogenous disturbance.
C2 · Temporal Injustice Theorem
Proven both analytically and empirically
Mathematical induction + verification at n=100,000 (B=200 bootstrap). The first formal bridge between locally fair decisions and systemic temporal injustice.
C3 · ARIA Architecture
4 subsystems in SysML/BPMN 2.0
1:1 TFAS ↔ SS ↔ CI/CD gate mapping. Security analysis using STRIDE (a framework for identifying security threats; 13 threats identified). A verifiable engineering artefact for EU AI Act Art. 14 + GDPR (EU General Data Protection Regulation) Art. 22.
C4 · Sensitivity Analysis
15×10 grid (δ, β) space
Answers: for which combinations is runtime monitoring sufficient? Key finding: δ≤0.6 and the SUSPEND mechanism is the primary control — not just parameter choice.
C5 · Privacy-Preserving Fairness
Fairness verification without demographic exposure
Demographic data G is stored exclusively in SS1 — never linked to individual decisions. CI/CD Gate #1 architecturally verifies this separation on every deployment, resolving the privacy paradox [Veale & Binns, 2017].
Practical Implications
For public administrations, legal practitioners, and AI ethics researchers
P1
Public Administrations: append-only log → Art. 14(4)(d), CI/CD gate → Art. 9 risk management, DPA notification → Art. 14(3)(b). The first architecture to address all three at once.
P2
Citizens & Legal Practitioners: The audit log creates a legally defensible record — citizens can contest automated decisions (GDPR Art. 22), with every threshold change carrying a timestamp and admin ID.
P3
AI Ethics Researchers: The (δ,β) sensitivity analysis offers concrete guidance. The theorem generalises beyond benefits — criminal law, medical AI, hiring — anywhere human judgments feed into retraining.
P4
The "Fairness Washing" Risk: ARIA acknowledges its own risk — algorithmically satisfying the conditions while failing their spirit (e.g. gaming Gate #5). The CI/CD gate is a floor, not a ceiling.
Technical Appendix EN ONLY
Reproducibility · STRIDE security · Limitations — for technical reviewers
Methodology & Reproducibility — open-source · fixed seed 42
Systems Methodology
SSM (Soft Systems Methodology — a method for structuring complex, ill-defined problems into stages) / CATWOE (its analysis tool) — chosen because algorithmic governance is a "wicked problem" (no single correct framing); CATWOE forces an explicit worldview before technical design.
Empirical Setup
n=100,000 · T=20 steps · W=10,000 window · ε=0.10 · β=0.04 · rA₀=0.72 · rB₀=0.68 · Beta(5,5) · B=200 bootstrap · seed=42
Petri Net Verification
Verified that the system never deadlocks (liveness) and resource counts stay capped (k-boundedness), across T=20 cycles.
Sensitivity Analysis
15×10 grid (δ, β) · n=50,000 · B=30 per cell · 150 cells total.
STRIDE Security Analysis
13 threats analysed under STRIDE (a standard attack-category checklist); 4 high-priority: Spoofing (mTLS+SS4), Tampering (Merkle log + notary), Information Disclosure (Gate #1), Elevation of privilege (rate limit+WORM).
Code Repository
github.com/SpiliosDimakopoulos/aria-tfas · 6 scripts · 30/30 unit tests (~5s) · MIT license · seed=42
Limitations & Future Work
L1
Petri Net scope: verified only for T=20 cycles, not an unlimited time horizon. Future work: symbolic model checking (SPIN/LTL).
L2
Synthetic β: calibrated from [Alon-Barkat & Busuioc, 2023], not measured on a live deployment — a field study with a consenting administration is needed.
L3
SS3 unverified: the only subsystem without an executable implementation. Future work: an HMI prototype with a usability study on real reviewers.
L4
Static G: TFAS assumes fixed groups — legislative change requires re-verification. Future work: dynamic G with versioned TFAS certificates.

ARIA δεν αυτοματοποιεί αποφάσεις.
Αυτοματοποιεί το ερώτημα ποιος πρέπει να τις λάβει — και αποδεικνύει επίσημα τι συμβαίνει όταν αυτό το ερώτημα απαντάται σωστά, αλλά το σύστημα λειτουργεί αρκετά καιρό.

Spilios Dimakopoulos · Systems Engineering · April 2026
✓ COPIED