Agentic chatbots need an 'interactional' ethics that centres on respect

January 17, 2024 | 6 min

Overview

Decision Snapshot: Needs Validation

Conceptually strong and grounded in psychology, but mostly theoretical with limited empirical validation for specific engineering choices.

Citations: 1

Evidence Strength: 0.55

Confidence: 0.80

Risk Signals: 8

Trust Signals

Findings with numeric evidence: 0/3

Findings with evidence refs: 3/3

Results with explicit delta: 0/0

Reproducibility

Status: No open assets linked

Open source: No

At A Glance

Cost impact: 35%

Production readiness: 30%

Novelty: 60%

Authors

Lize Alberts, Geoff Keeling, Amanda McCroskery

Why It Matters For Business

Agentic conversational features can damage user trust, engagement, and wellbeing if systems ignore context and treat people as data points; fixing this protects brand trust and long-term product adoption.

Summary TLDR

This paper argues that current LLM ethics (helpful, honest, harmless) focuses on the wording of outputs and fails to capture the situational, relational harms that arise when conversational systems act like social agents. It proposes an 'interactional ethics' centred on respect, operationalised as duties to support users' autonomy, competence, and self-worth. The paper lists three classes of interactional harms (direct, behaviour-influencing, collective) and gives three design suggestions: embed respectful assumptions, operationalise respect checks in self-evaluation, and control which sensitive user details are kept in memory.

Problem Statement

As conversational systems become proactive and agent-like, existing alignment criteria (helpful, honest, harmless) miss pragmatic, relational harms that arise in real interactions. We need an ethics that evaluates how systems treat people in context, not only the semantic content of outputs.

Main Contribution

Argues that agentic conversational AI should be evaluated as social actors, not only as output engines.

Defines three interactional harm types: direct (overt/covert), behaviour-influencing (misleading/manipulating), and collective (cumulative relational harms).

Key Findings

Semantic-focused HHH alignment (helpful, honest, harmless) can miss situational disrespect.

Practical Use: Add interactional checks (context, role, timing) to alignment pipelines rather than only filtering output text.

Evidence Ref: Abstract; Intro: critique of HHH criteria
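One way to picture an interactional check is as a second pass that looks at the situation a reply is sent into, not only at its text. The sketch below is illustrative and not taken from the paper: `InteractionContext`, `interactional_flags`, and every field name are hypothetical, and the three rules are simple stand-ins for the role, timing, and context checks mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionContext:
    """Hypothetical description of the situation a reply is sent into."""
    reply_kind: str            # e.g. "answer", "advice", "reminder"
    allowed_kinds: frozenset   # reply kinds appropriate to the system's role
    user_initiated: bool       # False when the system speaks first
    proactive_opt_in: bool     # user agreed to unprompted messages
    sensitive_topic: bool      # topic the user may not want raised


def interactional_flags(text_filter_passed: bool, ctx: InteractionContext) -> list:
    """Return the names of the checks that a candidate reply fails."""
    flags = []
    if not text_filter_passed:
        flags.append("content")   # the usual output-text filter
    if ctx.reply_kind not in ctx.allowed_kinds:
        flags.append("role")      # reply steps outside the system's role
    if not ctx.user_initiated and not ctx.proactive_opt_in:
        flags.append("timing")    # unsolicited proactive message
    if ctx.sensitive_topic and not ctx.user_initiated:
        flags.append("context")   # system raises a sensitive topic unprompted
    return flags


# A reply whose text is fine can still fail on timing and context.
unprompted = InteractionContext(
    reply_kind="reminder",
    allowed_kinds=frozenset({"answer", "reminder"}),
    user_initiated=False,
    proactive_opt_in=False,
    sensitive_topic=True,
)
unprompted_flags = interactional_flags(True, unprompted)
```

In this example the text filter passes, yet the reply is still flagged for timing and context, which is the gap the finding describes.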

Interactional harms cluster into three types: direct, behaviour-influencing, and collective.

Practical Use: Design evaluations and mitigations tailored to each harm type (e.g., tone controls for direct harms; citation/verification for misleading; memory controls for collective harms).

Evidence Ref: Table 1 and 'Social-interactional harms' section
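The pairing of harm types with mitigations can be kept as a small lookup table that an evaluation pipeline consults. This is a minimal sketch assuming the three example mitigations listed above; `HarmType`, `MITIGATIONS`, and `mitigation_plan` are hypothetical names, and a real taxonomy would likely need more entries per type.

```python
from enum import Enum


class HarmType(Enum):
    DIRECT = "direct"
    BEHAVIOUR_INFLUENCING = "behaviour-influencing"
    COLLECTIVE = "collective"


# Example mitigations from the summary above; extend per product.
MITIGATIONS = {
    HarmType.DIRECT: ["tone controls"],
    HarmType.BEHAVIOUR_INFLUENCING: ["citation/verification"],
    HarmType.COLLECTIVE: ["memory controls"],
}


def mitigation_plan(observed) -> list:
    """List mitigations for the observed harm types, in a stable order."""
    plan = []
    for harm in HarmType:  # iterate in definition order for determinism
        if harm in observed:
            plan.extend(m for m in MITIGATIONS[harm] if m not in plan)
    return plan


plan = mitigation_plan({HarmType.COLLECTIVE, HarmType.DIRECT})
```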

What To Try In 7 Days

Audit conversational flows for potential interactional disrespect (tone, timing, assumptions).

Add lightweight memory rules: store user details only with explicit permission, and honour clear 'do-not-remember' flags.

Prototype a consent/negotiation UI that lets users set interaction style and memory preferences.
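The memory rule and the preference settings above can be prototyped together in a few lines. The sketch below is one possible shape, not a design from the paper: `UserPreferences`, `MemoryStore`, and their fields are hypothetical, memory is assumed to be off by default, and a 'do-not-remember' flag is assumed to win over a general opt-in.

```python
from dataclasses import dataclass, field


@dataclass
class UserPreferences:
    """Hypothetical settings a consent/negotiation UI would write to."""
    interaction_style: str = "neutral"     # e.g. "neutral", "formal", "casual"
    allow_memory: bool = False             # explicit opt-in; off by default
    do_not_remember: set = field(default_factory=set)  # topics never to store


@dataclass
class MemoryStore:
    prefs: UserPreferences
    facts: dict = field(default_factory=dict)

    def remember(self, topic: str, fact: str) -> bool:
        """Store a fact only if the user opted in and did not flag the topic."""
        if not self.prefs.allow_memory:
            return False
        if topic in self.prefs.do_not_remember:
            return False
        self.facts[topic] = fact
        return True

    def forget(self, topic: str) -> None:
        """Flag a topic as do-not-remember and drop anything already stored."""
        self.prefs.do_not_remember.add(topic)
        self.facts.pop(topic, None)


prefs = UserPreferences(allow_memory=True)
store = MemoryStore(prefs)
stored = store.remember("diet", "vegetarian")
store.forget("diet")
stored_after_flag = store.remember("diet", "vegan")
```

Making `forget` both delete the fact and set the flag means a later attempt to store the same topic is refused rather than silently accepted.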

Agent Features

Memory
short-term memory (conversation context), long-term memory (selective user facts)
Planning
proactive initiation (discussed as perceived agency)
Frameworks
Constitutional AI, Self-correction strategies, Person-centred care
Is Agentic

Yes

Reproducibility

Code Available: No
Data Available: No
Open Source Status: No
License: Unknown

Risks & Boundaries

Limitations

Primarily conceptual: lacks original empirical tests or user studies.

Cultural variation and differing social norms are acknowledged but not operationalised.

When Not To Use

When you need narrow, task-focused performance metrics unrelated to ongoing social interaction.

In systems without any user-facing conversational role or without persistent user relationships.

Failure Modes

Over-personalisation that invades privacy or feels manipulative.

Selective memory leading to perceived insincerity or betrayal.