ICML 2026 · Educational AI Simulation

EduMirror

Modeling Educational Social Dynamics with Value-driven Multi-agent Simulation

Jingzhe Lin*1, Hengbin Yu*2, Yongdan Zeng*1,3, Fangwei Zhong1,✉

1 School of Artificial Intelligence, Beijing Normal University, Beijing, China.
2 School of Systems Science, Beijing Normal University, Beijing, China.
3 Information Hub, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China.
* Equal contribution. ✉ Corresponding author.

Abstract

A safer laboratory for educational social dynamics.

Figure: EduMirror overview of educational scenarios and intervention workflow.

EduMirror is a multi-agent simulator for studying educational social dynamics when real-world controlled experiments are ethically difficult and observational studies lack causal power. It combines a configurable library of education-oriented agents with value-driven behavior grounded in social value and intrinsic motivation. A dual-track measurement protocol uses LLMs to quantify both overt actions and latent psychological states, enabling structured in-silico research. Case studies on school bullying and group cooperation show that EduMirror can generate theory-aligned, measurable social phenomena for hypothesis testing in education.

Architecture

From theory to simulation traces.

Figure: EduMirror framework flowchart.

EduMirror follows a closed experimental workflow. Researchers first construct theory-grounded educational scenarios by integrating domain theory, defining context, profiling roles, and configuring measurement metrics. These settings initialize the Concordia-based simulation engine, where the Game Master manages scene setting, narration, rule enforcement, and time. Agents act through a value-driven cognitive architecture: profiles, traits, goals, memories, theory-of-mind, psychological needs, and social value orientation jointly guide the action planner to generate, evaluate, select, and reflect on behaviors. The resulting interaction traces flow into user toolkits, where LLM Raters and LLM Surveyors quantify explicit behaviors and implicit states, while intervention tools create parallel timelines for counterfactual comparison and visualization.
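
The generate, evaluate, select loop of the action planner can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the linear scoring rule, and the weights `svo_weight` and `need_weight` are hypothetical and are not taken from the EduMirror implementation, which relies on LLM-based components.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate action with hypothetical payoff estimates."""
    description: str
    self_gain: float   # estimated benefit to the acting agent
    other_gain: float  # estimated benefit to the other agents
    need_gain: float   # estimated satisfaction of the agent's own needs


def score(c: Candidate, svo_weight: float, need_weight: float) -> float:
    """Blend own payoff, others' payoff, and need satisfaction.

    The linear form is an assumption: svo_weight = 0 ignores others,
    svo_weight = 1 considers only others.
    """
    social = (1.0 - svo_weight) * c.self_gain + svo_weight * c.other_gain
    return social + need_weight * c.need_gain


def select_action(candidates, svo_weight, need_weight):
    """Evaluate every candidate and return the highest-scoring one."""
    return max(candidates, key=lambda c: score(c, svo_weight, need_weight))


candidates = [
    Candidate("join the teasing", self_gain=0.6, other_gain=-0.8, need_gain=0.1),
    Candidate("invite the classmate to the group", self_gain=0.2, other_gain=0.7, need_gain=0.5),
]

prosocial = select_action(candidates, svo_weight=0.7, need_weight=0.5)
individualist = select_action(candidates, svo_weight=0.0, need_weight=0.0)
print(prosocial.description)
print(individualist.description)
```

With these illustrative numbers, the same candidate set yields different choices for agents with different value orientations, which is the property the value-driven architecture is meant to expose.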

Computable scenario construction

EduMirror turns an abstract educational phenomenon into a measurable simulation package before running agents. The process keeps each scenario tied to theory, roles, metrics, and intervention logic.

Select grounding theory

Choose theories that explain the target phenomenon, such as social comparison, family stress, or social anxiety.

Identify constructs

Break the theory into measurable concepts, such as self-esteem, belonging, pressure, anxiety, or competition.

Profile agents

Map constructs into roles, traits, goals, formative memories, and initial psychological or social-value states.

Configure metrics

Operationalize outcomes with behavior rubrics and validated-scale-inspired survey probes for internal states.

Run comparisons

Generate matched timelines, apply interventions, and compare explicit behavior with latent psychological change.
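
The five steps above can be pictured as filling in one structured record per scenario. The sketch below is only an assumed layout: the class names `AgentProfile`, `Metric`, and `ScenarioSpec`, their fields, and the example values are hypothetical and do not reflect EduMirror's actual configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    role: str
    traits: list[str]
    goal: str
    initial_states: dict[str, float]  # construct name -> starting value


@dataclass
class Metric:
    construct: str
    track: str  # "behavior" (rubric-rated) or "survey" (internal-state probe)


@dataclass
class ScenarioSpec:
    phenomenon: str
    theory: str
    constructs: list[str]
    agents: list[AgentProfile] = field(default_factory=list)
    metrics: list[Metric] = field(default_factory=list)

    def unmeasured_constructs(self) -> list[str]:
        """Constructs named by the theory that no metric covers yet."""
        measured = {m.construct for m in self.metrics}
        return [c for c in self.constructs if c not in measured]


# Illustrative values only; not a scenario from the library.
spec = ScenarioSpec(
    phenomenon="exam-score comparison among classmates",
    theory="social comparison",
    constructs=["self-esteem", "competition"],
    agents=[AgentProfile("student", ["anxious"], "keep up with peers", {"self-esteem": 5.0})],
    metrics=[Metric("self-esteem", "survey")],
)
print(spec.unmeasured_constructs())
```

A check such as `unmeasured_constructs` is one way to keep the scenario tied to its theory: a construct without a metric shows up before any agents run.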

Scenario Library

A theory-grounded scenario library.

Peer & Group Dynamics: 7 scenarios · 35%
Individual Social Cognition: 5 scenarios · 25%
Classroom Culture: 3 scenarios · 15%
Home-School Dynamics: 5 scenarios · 25%

Each scenario is a computable research package.

EduMirror contains 20 pre-designed educational scenarios. In the paper, each scenario is specified by its social phenomenon, participating roles, number of agents, grounding theory, and measurement protocol. This lets the same simulation pipeline support both descriptive observation and controlled intervention experiments.
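
As a quick consistency check, the category shares follow directly from the per-category counts listed above. The variable names in this sketch are illustrative.

```python
# Scenario counts per category, as listed in the library overview.
category_counts = {
    "Peer & Group Dynamics": 7,
    "Individual Social Cognition": 5,
    "Classroom Culture": 3,
    "Home-School Dynamics": 5,
}

total = sum(category_counts.values())
shares = {name: round(100 * n / total) for name, n in category_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {category_counts[name]} of {total} ({pct}%)")
```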

Experiments

Evidence from realism, scale, and intervention tests.

Scenario-wide win-rate comparison

The heatmap summarizes pairwise post-hoc evaluations across seventeen educational scenarios: six representative settings, eight bullying simulations, and three social-interaction simulations. Each cell reports the column model's win rate against the row model, offering a compact system-level view of relative realism, contextual appropriateness, and human-likeness.
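
A win-rate matrix with this orientation (column model against row model) can be built from a list of pairwise judgments, as in the sketch below. The function name, the `(winner, loser)` input format, and the placeholder model names are assumptions for illustration; the judgments shown are toy data, not results from the paper.

```python
from collections import defaultdict


def win_rate_matrix(judgments, models):
    """Build {(row, col): win rate of col against row} from (winner, loser) pairs."""
    wins = defaultdict(int)
    for winner, loser in judgments:
        wins[(winner, loser)] += 1

    matrix = {}
    for row in models:
        for col in models:
            if row == col:
                continue  # a model is never compared with itself
            total = wins[(col, row)] + wins[(row, col)]
            # None marks pairs that were never compared.
            matrix[(row, col)] = wins[(col, row)] / total if total else None
    return matrix


# Toy judgments: model_a wins three of four comparisons against model_b.
toy = [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
matrix = win_rate_matrix(toy, ["model_a", "model_b", "model_c"])
print(matrix[("model_b", "model_a")])
print(matrix[("model_a", "model_b")])
```

In this layout the two cells for a pair sum to one whenever the pair was compared, so a column with uniformly high values marks a model that wins against most others.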

Scalability under larger groups

Agents  EduMirror  LLMob  BabyAGI  D2A   ReAct
5       4.80       4.25   4.10     3.35  2.35
15      4.18       3.60   3.57     3.53  2.93
30      4.03       3.83   3.86     3.12  2.41

The kindergarten scalability study scales the group from 5 to 15 and then 30 agents, and averages four rubric dimensions: naturalness, coherence, plausibility, and developmental typicality. EduMirror remains the top method at every group size, although its margin over the next-best baseline narrows at 30 agents (4.03 vs. 3.86). This suggests that the simulator can preserve age-appropriate classroom dynamics even as simultaneous child-agent interactions become denser.
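
The averaging step and the ranking claim can be checked with a short sketch. The `scores` values are copied from the table above; the `composite` helper, the dimension keys, and the example ratings passed to it are illustrative assumptions, since the per-dimension ratings are not listed here.

```python
from statistics import mean

RUBRIC = ("naturalness", "coherence", "plausibility", "developmental_typicality")


def composite(ratings):
    """Average the four rubric dimensions into one score."""
    return mean(ratings[d] for d in RUBRIC)


# Composite scores by group size, copied from the scalability table.
scores = {
    5: {"EduMirror": 4.80, "LLMob": 4.25, "BabyAGI": 4.10, "D2A": 3.35, "ReAct": 2.35},
    15: {"EduMirror": 4.18, "LLMob": 3.60, "BabyAGI": 3.57, "D2A": 3.53, "ReAct": 2.93},
    30: {"EduMirror": 4.03, "LLMob": 3.83, "BabyAGI": 3.86, "D2A": 3.12, "ReAct": 2.41},
}

# Best method and EduMirror's lead over the runner-up at each group size.
best = {n: max(row, key=row.get) for n, row in scores.items()}
margin = {
    n: round(row["EduMirror"] - max(v for k, v in row.items() if k != "EduMirror"), 2)
    for n, row in scores.items()
}

# Hypothetical per-dimension ratings, only to show the averaging step.
example = composite({"naturalness": 4, "coherence": 5, "plausibility": 4, "developmental_typicality": 5})
print(best, margin, example)
```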

Psychological trajectories in bullying simulation

Figure: Psychological need trajectories in the dormitory bullying scenario under different initial states.

The dormitory bullying case tracks five latent dimensions on a 0-10 scale across simulated time: Safety, Social Belonging, Esteem, Meaning & Growth, and Psychological Health. The comparison shows that healthier initial values make Alice more resilient, while vulnerable initial values accelerate deterioration under repeated exclusion, rumor-spreading, and confrontation.
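
One way to hold such trajectories is a per-step record of the five needs, each bounded to the 0-10 scale. The sketch below is a data-structure illustration only: the need keys, the helper names, and the hand-set deltas are assumptions, whereas in EduMirror the values are produced by the LLM-based measurement protocol rather than by fixed increments.

```python
NEEDS = ("safety", "social_belonging", "esteem", "meaning_growth", "psychological_health")


def clamp(value, low=0.0, high=10.0):
    """Keep a need value inside the 0-10 scale."""
    return max(low, min(high, value))


def apply_event(state, deltas):
    """Return a new state with the deltas applied; untouched needs stay as they are."""
    return {need: clamp(state[need] + deltas.get(need, 0.0)) for need in NEEDS}


def record_trajectory(initial, events):
    """Collect the state after every event, starting from the initial state."""
    trajectory = [dict(initial)]
    for deltas in events:
        trajectory.append(apply_event(trajectory[-1], deltas))
    return trajectory


# Hypothetical vulnerable starting point and a single exclusion event.
vulnerable = {need: 1.0 for need in NEEDS}
exclusion = {"social_belonging": -2.0, "esteem": -0.5}
trajectory = record_trajectory(vulnerable, [exclusion])
print(trajectory[-1])
```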

Intervention outcomes by strategy

Figure: Intervention boxplots across psychological needs, one panel per strategy: neglectful, authoritative-punitive, supportive-individual, and supportive-cooperative.

The intervention study creates matched bullying branches with different teacher goals: neglectful response, authoritative-punitive discipline, supportive-individual care, and supportive-cooperative class action. The distributions show a progression in effectiveness, with cooperative support improving the broadest set of psychological needs.
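
Matched branches of this kind can be produced by copying a shared snapshot and changing only the teacher's goal in each copy. The sketch below assumes a plain dictionary snapshot; the `branch` function, the snapshot keys, and the goal wording are hypothetical and do not describe EduMirror's intervention tools.

```python
import copy


def branch(snapshot, teacher_goals):
    """Create one independent copy of the snapshot per teacher goal."""
    branches = {}
    for name, goal in teacher_goals.items():
        state = copy.deepcopy(snapshot)  # deep copy so branches never share memories
        state["teacher_goal"] = goal
        branches[name] = state
    return branches


# Hypothetical snapshot taken just before the teacher responds.
snapshot = {"step": 12, "memories": ["rumor spread in the dormitory"], "teacher_goal": None}

teacher_goals = {
    "neglectful": "ignore the incident",
    "authoritative-punitive": "discipline the aggressors",
    "supportive-individual": "support the targeted student one-on-one",
    "supportive-cooperative": "organize a whole-class cooperative response",
}

branches = branch(snapshot, teacher_goals)
branches["neglectful"]["memories"].append("teacher looked away")
print(len(branches), snapshot["memories"])
```

Because every branch starts from the same copied state, differences that emerge later can be attributed to the teacher goal rather than to a different history.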

Counterfactual Outcomes

Citation

BibTeX

@inproceedings{edumirror2026,
  title     = {EduMirror: Modeling Educational Social Dynamics with Value-driven Multi-agent Simulation},
  author    = {Lin, Jingzhe and Yu, Hengbin and Zeng, Yongdan and Zhong, Fangwei},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning},
  year      = {2026},
  address   = {Seoul, South Korea},
  publisher = {PMLR}
}