Audit how LLM agents communicate: tone and explanations change decisions even when outcomes don't

May 17, 2025 · 7 min

Overview

Production Readiness

0.4

Novelty Score

0.6

Cost Impact Score

0.3

Citation Count

0

Authors

Ruta Binkyte

Links

Abstract / PDF

Why It Matters For Business

How agents phrase decisions affects cooperation and task success; monitoring and nudging tone and explanations reduces coordination failures and builds trust in agentic workflows.

Summary TLDR

The paper introduces a practical framework to measure "Interactional fairness" in multi-agent systems driven by large language models. Interactional fairness splits into Interpersonal fairness (respectful tone) and Informational fairness (explanation quality). The authors adapt human-survey tools (Colquitt's scales, Critical Incident Technique, journaling) into prompt-based tests and a JSON evaluation card. In a controlled negotiation study (24 conditions × 5 runs), respectful tone and clear justification raised acceptance rates and fairness ratings; context changed which signal mattered most (tone in collaborative settings, explanations in competitive ones). The framework is a low-cost, aud

Problem Statement

Existing fairness work for multi-agent systems focuses on outcomes and procedures. As agents talk more, how they speak and explain decisions becomes a separate, measurable fairness axis that can change cooperation and outcomes. We need a practical way to audit and debug communicative fairness in LLM-driven multi-agent systems.

Main Contribution

A conceptual adaptation of Interactional fairness (Interpersonal + Informational) for non-sentient LLM agents, treating fairness as observable communicative behavior.

A mixed-method evaluation pipeline: prompt-based Likert ratings, Critical Incident Technique sketches, Explanation Journaling, and a JSON Interactional Fairness Evaluation Card.

A controlled case study (resource negotiation) showing tone and justification affect acceptance decisions and that the relative importance of those cues shifts with task context.
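To make the evaluation card concrete, here is a minimal Python sketch of one card serialized to JSON. The class name `FairnessCard` and every field name are assumptions made for illustration; this summary does not reproduce the paper's actual schema. Only the two 1-5 Likert ratings and the accept/reject decision come from the metrics listed in this summary.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FairnessCard:
    """Hypothetical card layout; field names are illustrative, not the paper's schema."""
    context: str             # task framing, e.g. "collaborative" or "competitive"
    split: str               # proposed resource split, e.g. "5:5"
    interpersonal: int       # Likert 1-5 rating of respectful tone
    informational: int       # Likert 1-5 rating of explanation quality
    accepted: bool           # the responder's accept/reject decision
    incident_note: str = ""  # optional free-text Critical Incident sketch

    def __post_init__(self):
        # Reject ratings outside the 1-5 Likert range.
        for name in ("interpersonal", "informational"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be in 1..5, got {value}")

    def to_json(self) -> str:
        # Sorted keys keep logged cards easy to diff.
        return json.dumps(asdict(self), sort_keys=True)


# Placeholder values, not data from the study.
card = FairnessCard("collaborative", "5:5", 5, 4, True,
                    "Proposer thanked the partner and explained the split.")
print(card.to_json())
```

Validating the ratings at construction time keeps malformed self-evaluations out of the log, which matters when the rater is itself a prompted agent.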

Key Findings

Respectful tone and clear justification increase proposal acceptance even when resource splits are identical.

Numbers: High-High (5:5) acceptance = 1.0 (Table 3)

Distributional fairness (the proposed split) remains the strongest predictor, but communicative cues can partially offset inequality.

Numbers: Decision tree importance: split = 0.70 (collaborative) (Table 5)

Which interactional signal matters depends on task framing: tone matters more in collaboration, explanations matter more in competition.

Numbers: Informational importance = 0.33 in competitive vs. interpersonal = 0.30 in collaborative (Table 5)

Results

Acceptance rate for equal (5:5) proposals under High-High

Value: 1.0 (100%)

Decision Tree feature importance (collaborative)

Value: split = 0.70, interpersonal = 0.30, informational = 0.0

Logistic regression coefficient for split (Ridge, collaborative)

Value: split coef = -1.579 (acceptance falls as inequality grows)
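One way to read the split coefficient, assuming the standard logistic model in which a coefficient is the change in log-odds per one-unit increase of the feature: exponentiating it gives an odds ratio. How the split feature was encoded or scaled is not stated in this summary, so "one unit" is an assumption, and the Ridge penalty shrinks coefficients, so treat the number as a rough guide only.

```python
import math

# Coefficient reported for the split feature (Ridge, collaborative).
split_coef = -1.579

# In a logistic model, exp(coef) is the multiplicative change in the odds
# of acceptance for a one-unit increase in the feature.
odds_ratio = math.exp(split_coef)
print(f"odds ratio per unit of split: {odds_ratio:.3f}")
```

Under these assumptions, each additional unit of the split feature multiplies the odds of acceptance by roughly 0.21, with tone and explanation held fixed.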

Who Should Care

What To Try In 7 Days

Run a small negotiation test where agents use the Interactional Fairness Evaluation Card to log tone, explanation scores, and accept/reject decisions.

Add a prompt template that enforces a respectful opening line and a 1–2 sentence justification for proposals and measure acceptance change.

Track acceptance rate by context (collaborative vs competitive) to decide whether to emphasize tone or explanation in policies.
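The second and third steps can be sketched in a few lines of Python. The template wording, the function names `build_prompt` and `acceptance_by_context`, and the sample log are all hypothetical; adapt them to your own agent stack.

```python
from collections import defaultdict

# Hypothetical template: a respectful opening plus a short justification.
PROPOSAL_TEMPLATE = (
    "Open with one respectful sentence addressed to {partner}.\n"
    "Propose the split {split}.\n"
    "Then justify the proposal in 1-2 sentences."
)


def build_prompt(partner: str, split: str) -> str:
    """Fill the template for one proposal."""
    return PROPOSAL_TEMPLATE.format(partner=partner, split=split)


def acceptance_by_context(records):
    """Map each context to its acceptance rate.

    records: iterable of (context, accepted) pairs, where accepted is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # context -> [accepted, total]
    for context, accepted in records:
        counts[context][0] += int(accepted)
        counts[context][1] += 1
    return {context: acc / total for context, (acc, total) in counts.items()}


# Placeholder log, not data from the study.
log = [
    ("collaborative", True),
    ("collaborative", False),
    ("competitive", True),
]
print(build_prompt("Agent B", "6:4"))
print(acceptance_by_context(log))
```

Running the same aggregation before and after introducing the template gives the acceptance change per context, which is the signal for choosing between a tone-first and an explanation-first policy.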

Agent Features

Memory

  • one-shot / no long-term memory (study)
  • supports journaling for longitudinal logging

Tool Use

  • prompt templates
  • JSON evaluation card

Frameworks

  • Colquitt fairness scales (adapted)
  • Critical Incident Technique
  • Explanation Journaling

Is Agentic

true

Architectures

  • LLM-based agent (prompted LLM)

Collaboration

  • Agent Communication
  • Multi-agent Coordination

Reproducibility

Open Source Status

  • unknown

Risks & Boundaries

Limitations

  • Simple one-shot negotiation setup limits ecological validity for real multi-step systems.
  • Agents self-evaluate with prompts; this can introduce judge bias and circularity.
  • Small number of runs per condition (five) limits statistical power.
  • No human-in-the-loop validation in the reported study.
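To see why five runs per condition limits statistical power, consider a condition where all five runs end in acceptance. The sketch below computes a 95% Wilson score interval for that case. The choice of the Wilson interval is ours, not the paper's, and the 5-of-5 count is an illustrative assumption about a single condition.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (default z gives 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


low, high = wilson_interval(5, 5)
print(f"95% interval for 5/5 accepted: {low:.3f} to {high:.3f}")
```

Even a perfect 5-of-5 record leaves a lower bound of about 0.57, so an observed rate of 1.0 in one condition is compatible with a true acceptance rate well below 100%.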

When Not To Use

  • As the only fairness check for complex, long-running multi-agent deployments.
  • To infer agent sentience or moral understanding; the framework measures observable behavior only.
  • As a substitute for outcome-based fairness audits when resource distribution is the primary risk.

Failure Modes

  • Agents may be tuned to game the evaluation prompts without genuine improvement in cooperative behavior.
  • Context mismatch: a one-size-fits-all communication policy can harm performance when task framing changes.
  • Judge bias: prompted agents used as evaluators can reflect the same stylistic biases as proposers.

Core Entities

Models

  • GPT-4

Metrics

  • Likert interpersonal rating (1-5)
  • Likert informational rating (1-5)
  • accept/reject rate
  • Interactional fairness composite score

Context Entities

Metrics

  • acceptance rate by condition
  • feature importance from Decision Tree
  • logistic regression coefficients

Datasets

  • The Fair Divide (resource negotiation simulation)