Agentic chatbots need an 'interactional' ethics that centres on respect

January 17, 20246 min

Overview

Production Readiness

0.3

Novelty Score

0.6

Cost Impact Score

0.35

Citation Count

1

Authors

Lize Alberts, Geoff Keeling, Amanda McCroskery

Links

Abstract / PDF

Why It Matters For Business

Agentic conversational features can damage user trust, engagement, and wellbeing if systems ignore context and treat people as data points; fixing this protects brand trust and long-term product adoption.

Summary TLDR

This paper argues that current LLM ethics (helpful, honest, harmless) focuses on words and fails to capture situational, relational harms that arise when conversational systems act like social agents. It proposes 'interactional ethics' centred on respect, operationalised as duties to support users' autonomy, competence, and self-worth. The paper lists three classes of interactional harms (direct, influence, collective) and gives design suggestions: embed respectful assumptions, operationalise respect checks in self-evaluation, and keep/limit memory of sensitive user details.

Problem Statement

As conversational systems become proactive and agent-like, existing alignment criteria (helpful, honest, harmless) miss pragmatic, relational harms that arise in real interactions. We need an ethics that evaluates how systems treat people in context, not only the semantic content of outputs.

Main Contribution

Argues that agentic conversational AI should be evaluated as social actors, not only as output engines.

Defines three interactional harm types: direct (overt/covert), behaviour-influencing (misleading/manipulating), and collective (cumulative relational harms).

Proposes an operational lens of respect grounded in psychology: duties to affirm user autonomy, competence, and self-worth.

Maps interactional duties to concrete design implications: model transparency, selective memory, self-evaluation heuristics, and GUI vs CUI trade-offs.

Key Findings

Semantic-focused HHH alignment (helpful, honest, harmless) can miss situational disrespect.

Interactional harms cluster into three types: direct, behaviour-influencing, and collective.

Respect can be operationalised as duties to support autonomy, competence, and self-worth (grounded in psychological theory).

Who Should Care

What To Try In 7 Days

Audit conversational flows for potential interactional disrespect (tone, timing, assumptions).

Add lightweight memory rules: only store explicit user permissions and clear 'do-not-remember' flags.

Prototype a consent/negotiation UI that lets users set interaction style and memory preferences.

Agent Features

Memory

  • short-term memory (conversation context)
  • long-term memory (selective user facts)

Planning

  • proactive initiation (discussed as perceived agency)

Frameworks

  • Constitutional AI
  • Self-correction strategies
  • Person-centred care

Is Agentic

true

Reproducibility

Open Source Status

  • no

Risks & Boundaries

Limitations

  • Primarily conceptual: lacks original empirical tests or user studies.
  • Cultural variation and differing social norms are acknowledged but not operationalised.
  • Does not provide concrete automated metrics or classifiers to detect disrespect at scale.

When Not To Use

  • When you need narrow, task-focused performance metrics unrelated to ongoing social interaction.
  • In systems without any user-facing conversational role or without persistent user relationships.

Failure Modes

  • Over-personalisation that invades privacy or feels manipulative.
  • Selective memory leading to perceived insincerity or betrayal.
  • Rigid respect rules that reduce usefulness (e.g., refusing necessary blunt warnings).