8. Digital Trust in the Age of AI

Chapter 1: Learn how pervasive consumer concerns about data privacy, unethical ad-driven business models, and the imbalance of power in digital interactions highlight the need for trust-building through transparency and regulation.

Chapter 2: Learn how understanding the digital consumer’s mind, influenced by neuroscience and behavioral economics, helps businesses build trust through transparency, personalization, and adapting to empowered consumer behaviors.

Chapter 3: Learn how the Iceberg Trust Model explains building trust in digital interactions by addressing visible trust cues and underlying constructs to reduce risks like information asymmetry and foster consumer confidence.

Chapter 4: Learn how trust has evolved from personal relationships to institutions and now to decentralized systems, emphasizing the role of technology and strategies to foster trust in AI and digital interactions.

Chapter 5: Learn that willingness to share personal data is highly contextual, varying based on data type, company-data fit, and cultural factors (Western nations requiring higher trust than China/India).

Chapter 6: Learn about the need to reclaim control over personal data and identity through innovative technologies like blockchain, address privacy concerns, and build trust in the digital economy.

Chapter 7: Learn how data privacy concerns, questionable ad-driven business models, and the need for transparency and regulation shape trust in the digital economy.

Chapter 8: Learn how AI’s rapid advancement and widespread adoption present both opportunities and challenges, requiring trust and ethical implementation for responsible deployment. Key concerns include privacy, accountability, transparency, bias, and regulatory adaptation, emphasizing the need for robust governance frameworks, explainable AI, and stakeholder trust to ensure AI’s positive societal impact.

The rapid advancement and widespread adoption of artificial intelligence (AI) marks a pivotal moment in human technological evolution. As a general-purpose technology, AI’s transformative impact extends across industries and deeply penetrates everyday life. Recent research by McKinsey & Company (2024) indicates that 72% of organizations have already implemented at least one AI technology, highlighting the technology’s pervasive influence in contemporary society.

As Levin (2024) articulates in his groundbreaking work on diverse intelligence, the fundamental nature of change and adaptation suggests that persistence in current forms is not only impossible but potentially undesirable. Levin’s research at Tufts University emphasizes AI’s potential role as a bridge toward understanding and developing diverse forms of intelligence, which could prove crucial for humanity’s future development. However, this rapid integration of AI technologies into societal frameworks brings significant trust and ethical implementation challenges. Holweg (2022) notes a concerning trend: as AI becomes more prevalent, instances of applications that violate established social norms and values have increased proportionally. This observation underscores the critical importance of developing robust frameworks for digital trust in AI systems.

Nature of change of an AI robot - Generated by AI

Developing digital trust in AI systems requires a multi-faceted approach that considers technical reliability, ethical implications, and societal impact. This includes establishing transparent frameworks for AI development, implementing robust safety measures, and ensuring equitable access to AI benefits across society. Users and stakeholders should have access to clear explanations of how AI systems operate and make decisions. This can be achieved through techniques such as explainable AI (XAI), which aims to make AI decision-making processes interpretable to humans (Arrieta et al., 2020). The challenge lies not merely in advancing AI capabilities but in doing so in a manner that maintains and strengthens social trust while promoting responsible innovation.

The performance trajectory of AI models provides both promise and pause for consideration. Recent comprehensive analyses (Bengio et al., 2024) demonstrate remarkable progress in AI capabilities across various benchmarks from 1998 to 2024. Particularly noteworthy is the rapid progression from relatively poor performance to surpassing human expert levels in specific domains. This acceleration in capabilities, while impressive, heightens the urgency of addressing trust-related concerns. As Hassabis (2022) emphasizes, AI’s beneficial or harmful impact, like that of any powerful technology, ultimately depends on societal implementation and governance. This perspective aligns with findings from Kiela et al. (2021), who documented significant improvements in AI performance across multiple benchmarks, including MNIST, Switchboard, ImageNet, and various natural language processing tasks.

The performance of AI models on various benchmarks has advanced rapidly. It is important to note that some earlier results used machine learning AI models that are not general-purpose models. On some recent benchmarks, models progressed within a short period from having poor performance to surpassing the performance of human subjects who are often experts (Bengio et al., 2024; Bengio et al., 2025). This rapid evolution from specialized systems to more general-purpose AI models marks a significant milestone in artificial intelligence development, warranting careful consideration of its implications.

The advancement of general-purpose AI systems faces multiple possible trajectories, from slow to extremely rapid progress, with expert opinions and evidence supporting various scenarios (Bengio et al., 2025). Recent improvements have been primarily driven by exponential increases in compute (4x/year), training data (2.5x/year), and energy usage (3x/year), alongside the adoption of more effective scaling approaches such as ‘chains of thought’ methodology. While it appears feasible for AI developers to continue exponentially increasing resources for training through 2026 (reaching 100x more training compute than 2023) and potentially through 2030 (reaching 10,000x more), bottlenecks in data, chip production, financial capital, and energy supply may make maintaining this pace infeasible after the 2020s. Policymakers face the dual challenge of monitoring these rapid advancements while developing adaptive risk management frameworks that can respond effectively to increasing capabilities and their associated risks.
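
As a rough order-of-magnitude check on these projections (a sketch that simply compounds the quoted 4x/year compute growth annually and ignores the other contributing factors mentioned above):

```python
# Rough consistency check for the compute-scaling figures quoted above.
# Assumption (illustrative): training compute grows ~4x per year, compounding annually.
GROWTH_PER_YEAR = 4

for horizon_years, quoted in [(3, 100), (7, 10_000)]:   # 2023->2026 and 2023->2030
    projected = GROWTH_PER_YEAR ** horizon_years
    print(f"{horizon_years} years: ~{projected:,}x compute vs. quoted ~{quoted:,}x")

# 3 years -> ~64x and 7 years -> ~16,384x: the same order of magnitude as the
# ~100x and ~10,000x figures, which also fold in other scaling improvements.
```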

AI performance vs human performance on select benchmarks (Bengio et al., 2025)

1. New worlds, new problems

In response to these challenges, the iceberg.digital project aims to promote sustainable data access and develop trustworthy AI systems that users can confidently rely upon daily.

Privacy intrusion (50%), algorithmic bias (30%), the problem of explainability (14%)
AI systems with potential to transfer societal biases - Generated by AI

Technical implementation introduces additional AI-specific biases. These include deployment bias (wrong application context), measurement bias (distorted success metrics), and label bias (prejudiced data categorization). Particularly concerning is automation bias, where humans tend to overestimate AI capabilities and neglect critical evaluation of outputs (Mishina et al., 2012). As noted in recent analyses of AI failures, adequate human oversight remains a crucial regulatory requirement to prevent automation bias.

Several types of biases affect AI systems:

 

  • Deployment Bias: Incorrect application context leading to misaligned outputs
  • Overfitting: Excessive alignment with sample data, reducing generalizability
  • Selection Bias: Unbalanced data sampling that skews results
  • Automation Bias: Overreliance on technology at the expense of human judgment
  • Label Bias: Prejudiced assigned data categories
  • Measurement Bias: Distorted success metrics
  • Algorithmic Bias: Distorted coding approaches

Prevention of these biases requires a comprehensive approach. As demonstrated by the cases analyzed by Holweg et al. (2022), organizations must implement regular system audits, maintain diverse development teams, and establish robust testing protocols. However, the most crucial element remains human oversight – a key regulatory requirement that helps prevent automation bias and ensures AI systems remain accountable to human judgment and ethical considerations.

The framework for AI risks provides a clear, structured categorization of the primary threats and issues associated with AI.

2. Navigating Risks and Trust Challenges in an Evolving Technological Environment

The rapid advancement of artificial intelligence technologies has created a complex landscape where regulatory frameworks struggle to keep pace with technological innovation. This disconnect has led to significant compliance risks and trust challenges that organizations must navigate while implementing AI systems. The prevailing view is that AI has progressed far more rapidly than regulators have been able to address its associated risks. This analysis explores the key complications and tensions that characterize the current state of AI technology implementation.

Compliance Risks and Technical Challenges

 

The implementation of AI systems presents multiple compliance risks that organizations must address. According to recent audit findings, these risks manifest in several critical areas. Inaccurate or imprecise results can lead to a loss of customer trust and potential legal liability, particularly when AI systems influence accounting processes through Robotic Process Automation (RPA). Privacy protection threats, including violations of the “right to be forgotten”, present significant legal challenges that organizations must navigate.

 

 

A particularly concerning aspect is the potential for social discrimination and injustice through inappropriate practices and biases related to race, gender, ethnicity, age, and income. The complexity of learning algorithms, which often operate as “black boxes,” further complicates accountability and oversight. This lack of transparency can lead to a loss of accountability in decision-making processes and difficulties maintaining human oversight of AI systems. Furthermore, there are growing concerns about the potential for manipulation or malicious use of AI systems, including criminal activities, interference in democratic processes, and economic optimization that may harm societal interests.

  • Fairness & Non-discrimination: must promote inclusivity & mitigate biases.
  • Privacy: must respect privacy, ensuring data protection & user control.
  • Accountability: mechanisms to distribute responsibility & provide remedies.
  • Transparency & Explainability: clarity on usage and decision-making.
  • Safety & Security: safeguarded against unauthorized interference.
  • Professional Responsibility: act with integrity and long-term foresight.
  • Human Control of Technology: critical decisions remain under human oversight.
  • Promotion of Human Values: align with fundamental human values and well-being.

A Map of Ethical and Rights-Based Approaches to Principles for AI

Capability: Given that most stakeholders have limited access to information such as an organization’s resources, the appropriateness of its algorithms, and the quality of its training data, they tend to judge the capability of the organization by whether the AI system works, that is, how accurate, reliable, and robust it is. “Capability reputation is about whether the firm makes a good product or provides a good service” (Bundy, 2021).

Character: However, stakeholders’ AI concerns often are not primarily focused on these technical aspects. Instead, they are focused on what AI strategies reveal about the values and priorities of the organization itself. “Character reputation is about integrity. Does it do good things in terms of corporate social responsibility? Does it take care of its stakeholders?” (Bundy, 2021).
Prevalence of AI failure modes by stakeholders’ perception of firms’ capability and character. (The size of the area denotes prevalence.)

Park & Rogan (2019) suggest that in the aftermath of adverse events, potential exchange partners prioritize a firm’s capability reputation over its character reputation, implying that capability reputation serves as a buffer in relationship formation.

 

Conversely, existing exchange partners place greater emphasis on a firm’s character reputation rather than its capability reputation, making them less inclined to terminate relationships with organizations that possess strong character reputations.

 

Furthermore, the buffering effects of both capability and character reputations are significantly diminished when adverse events arise from factors within the firm’s control.

Social adaptation and institutional trust gain traction
Trust within human-machine collectives depends on the perceived consensus about cooperative norms (Makovi et al., 2023)

The evolving trust dynamics within the iceberg model affect generations in distinct ways. Research by Hoffmann et al. (2014) suggests that Digital Natives, while strongly influenced by brand perception and associated signals, demonstrate greater receptivity to structural trust-building measures in technological environments. In contrast, older generations – Digital Immigrants –  exhibit more difficulty in adapting to new “situation normality” paradigms, typically prioritizing a careful assessment of the risk-benefit ratio before establishing trust. This generational divide in trust formation mechanisms suggests that organizations may need to adopt differentiated approaches to building and maintaining trust across different age demographics.

A multi-faceted approach is required for trust-building

3. From Principles to Action: Ensuring Trustworthy AI

The implementation of trustworthy artificial intelligence (TAI) requires a comprehensive approach that bridges theoretical principles with practical action. As Bryson (2020) argues, AI development produces material artefacts that require the same rigorous auditing, governance, and legislation as other manufactured products. This perspective frames our understanding of moving from abstract principles to concrete implementation of trustworthy AI systems.

Framework and Principles

 

As detailed above in this text, the European Union’s framework for trustworthy AI formed the foundation for the EU AI Act, establishing four fundamental principles: respect for human autonomy, prevention of harm, fairness, and explicability. These principles are operationalized through seven key requirements: human agency and oversight, technical robustness and safety, privacy and data governance, transparency, diversity and non-discrimination, societal and environmental well-being, and accountability.

 

System trustworthiness properties, as outlined by Pavlidis (2021), emphasize that a trustworthy system must be capable of meeting not only stated customer needs but also unstated and unanticipated requirements. This understanding is particularly crucial given Bengio et al.’s (2024) warning that powerful frontier AI systems should not be presumed safe until their safety has been demonstrated, highlighting the need for proactive safety measures.

AI and machine learning occur through design and produce a material artefact; auditing, governance, and legislation should be applied to correct sloppy or inadequate manufacturing, just as we do with other products (Bryson, 2020).

Despite widespread discussions on AI ethics, there remains a significant gap between ethical principles and real-world AI practices. Developers and companies often fail to integrate ethics into their workflows. Studies show that even when explicitly instructed, software engineers and AI practitioners do not change their development practices to align with ethical guidelines, revealing a systemic issue in which ethics remain theoretical rather than actionable (McNamara et al., 2018; Vakkuri et al., 2019). This is a worrying situation because organisations are the first line of defence in protecting personal privacy, avoiding biases, and ensuring that nudging is not overused or misused (Sattlegger & Nitesh, 2024).

While efforts are underway to operationalize AI ethics – such as through ethics boards and collaborative codes – translating high-level principles into concrete technological implementations remains a daunting challenge (Munn, 2022). Clearly, these admirable principles do not enforce themselves, nor are there any tangible penalties for violating them (Calo, 2021). This is further complicated by competing ethical demands, technical constraints, and unresolved questions of fairness, privacy, and accountability, which require ongoing social, political, and technical engagement rather than simplistic technical fixes.

The figure presents key properties of trustworthy systems as identified in the literature, highlighting dimensions such as tangibility, transparency, reliability, and other behavioural and structural characteristics that influence cognitive trust in AI (Glikson, 2020). Our model incorporates these attributes, aligning with the framework proposed by Pavlidis (2021), who, in turn, builds upon the foundational work of Hoffman et al. (2006).

Two key aims emerge: transparency, which ensures visibility into how AI systems function through tools like auditing frameworks and model reporting, and accountability, which establishes mechanisms to address harms through governance, enforcement, and community-driven redress. A broad range of stakeholders – including developers, managers, business leaders, policymakers, and professional organizations – must collaborate to integrate ethical AI practices into design, oversight, and regulation. By combining transparency with accountability, this approach moves beyond abstract ethical principles toward practical and enforceable AI governance.

Properties of trustworthy systems and operationalizable aims

While ethical principles and guidelines have been widely discussed (Floridi & Cowls, 2019; Jobin et al., 2019), they lack enforceability, rendering them ineffective in regulating AI systems. Regulatory approaches may fail to provide adequate guidance for building public trust in AI. Instead, a shift toward enforceable AI governance frameworks – away from toothless principles – is necessary to align legal structures with the evolving nature of human-machine collaboration (Calo, 2021). Without tangible penalties for violations, organizations and developers may have little incentive to adhere to them (Mittelstadt et al., 2016).

Given the rapid integration of AI into various domains, we face two possible paths: A) Ignore AI’s transformative potential and allow technological advancements to unfold without legal adaptation. B) Recognize that AI fundamentally alters human affordances, requiring an evolution of legal structures to account for new human-AI interactions (Binns, 2018). This chapter advocates for the latter approach and outlines the necessary first steps: Enforceable AI governance requires the elaboration of three dimensions:

1 Human-AI Synergy:

Understanding Roles and Enhancing Collaboration

 

AI should augment, not replace, human decision-making—especially in high-risk areas such as healthcare, finance, and criminal justice. Effective collaboration requires clear role definition, explainability, and human oversight to ensure AI remains an assistant rather than an unchecked decision-maker.

2 AI Alignment & Transparency:

Refining, Guiding, & Managing AI Systems

 

AI must be aligned with human values, transparent in its operations, and continuously monitored to ensure fairness, reliability, and adaptability. This requires real-time oversight, dynamic governance, and continuous improvement through transparent AI development practices.

3 Accountability & Oversight:

Establishing Responsibility & Redress Mechanisms

 

AI accountability demands clear ownership, governance structures, and legal frameworks to ensure responsible use, prevent harm, and enable redress for affected individuals. Organizations must embed legal, technical, and operational mechanisms to enforce accountability across AI development and deployment.

Building on the figure “Properties of trustworthy systems and operationalizable aims,” it is essential to operationalize our relatively abstract principles further for trustworthy AI. The objectives of transparency and accountability must be addressed through a multidimensional approach encompassing legal, technical, and human perspectives. Each dimension comprises a non-exhaustive set of instruments that collectively contribute to establishing a genuinely enforceable AI governance framework.

Aims and instruments enabling trustworthy systems

1. Human-AI Synergy: Understanding Roles and Enhancing Collaboration

A critical aspect of AI regulation is understanding how humans should collaborate with AI systems. Emerging research suggests that AI should augment rather than replace human decision-making (Shneiderman, 2020). Thus, regulatory efforts should focus on:
  • Establishing clear boundaries of AI decision-making authority (Russell, 2019).
  • Ensuring human oversight in critical applications such as healthcare and legal decision-making (Danks & London, 2017).

Know your place

Discussions on the future of human labour are pervasive and widely debated across various disciplines. Rather than viewing AI as a replacement for human labour, a more nuanced understanding reveals its potential as a catalyst for more meaningful and fulfilling work. This perspective emphasizes AI’s capacity to handle repetitive and mundane tasks, freeing humans to focus on activities that leverage their unique qualities, such as creativity, critical thinking, and emotional intelligence.

 

Daugherty and Wilson (2024) present a comprehensive framework for understanding this human-AI relationship through their “Missing Middle” model. This model identifies three distinct categories of work: human-only activities, human-machine hybrid activities, and machine-only activities. Human-only activities encompass leadership, empathy, creativity, and judgment – capabilities that remain uniquely human. Machine-only activities include transactions, iterations, predictions, and adaptations, where AI’s computational power excels.

The Human + Machine Model by Daugherty & Wilson (2024)

The most interesting developments occur in the hybrid space, what Daugherty and Wilson term “the Missing Middle.” This space is divided into two key areas: humans complementing machines and machines augmenting human capabilities. In the first category, humans serve as trainers, explainers, and sustainers. Trainers educate AI systems on appropriate values and performance parameters, ensuring that these systems reflect desired behaviours and ethical principles. Explainers or translators bridge the communication gap between technical and business domains, making complex AI systems comprehensible to non-technical stakeholders. Sustainers focus on maintaining system quality and ensuring long-term value through principles of explainability, accountability, fairness, and symmetry.

 

On the other side of this hybrid space, AI systems enhance human capabilities through amplification, interaction, and embodiment. Amplification allows professionals to process and analyze vast amounts of data, enabling more informed strategic decision-making. Interaction capabilities, particularly through natural language processing, create more intuitive human-machine interfaces. Embodiment represents the physical collaboration between humans and robots in shared workspaces, enabled by advanced sensors and actuators.

 

This framework suggests that the future of work lies not in competition between humans and machines but in their complementary collaboration. Organizations that understand and effectively implement this model can create more productive and fulfilling work environments, where both human and artificial intelligence contribute their unique strengths to achieve superior outcomes.

Actionable instruments supporting human-AI synergy

 

Enhancing human-AI synergy involves implementing actionable instruments that foster effective collaboration between humans and AI systems. We recommend putting the human at the centre of technology design. This is the discipline of Human-centered Explainable AI (HCXAI), which “develops a holistic understanding of ‘who’ the human is by considering the interplay of values, interpersonal dynamics, and the socially situated nature of AI systems” (Ehsan & Riedl, 2020).

Human-in-the-Loop (HITL) Decision-Making

 

Integrating human oversight into AI processes ensures that critical decisions, especially in high-stakes domains, are reviewed and validated by humans (Van Rooy & Vaes, 2024; Bansal et al., 2019). This approach leverages the strengths of both human judgment and AI efficiency.

 

Implementation Strategies:

 

  • Risk-Based Oversight: Implement tiered oversight models where the level of human intervention corresponds to the associated risk of AI decisions.
  • Decision Thresholds: Establish specific criteria that dictate when human intervention is necessary, ensuring that AI operates within predefined boundaries.
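
A minimal sketch of how risk-based oversight and decision thresholds can be combined in practice; the risk tiers, threshold values, and review queue below are illustrative assumptions rather than prescribed values:

```python
# Minimal sketch: route AI decisions to humans based on risk tier and model confidence.
# The tiers, thresholds, and queue are illustrative assumptions.
from dataclasses import dataclass

# Higher-risk tiers demand higher model confidence before the AI may act alone.
CONFIDENCE_THRESHOLDS = {"low": 0.70, "medium": 0.90, "high": 1.01}  # "high" always escalates

@dataclass
class Decision:
    case_id: str
    risk_tier: str      # "low" | "medium" | "high"
    prediction: str
    confidence: float   # model's estimated probability for its prediction

def route(decision: Decision, review_queue: list) -> str:
    """Auto-apply the AI decision only if confidence clears the tier's threshold."""
    if decision.confidence >= CONFIDENCE_THRESHOLDS[decision.risk_tier]:
        return "auto_applied"
    review_queue.append(decision)       # human-in-the-loop review
    return "sent_to_human_review"

queue = []
print(route(Decision("loan-001", "medium", "approve", 0.95), queue))  # auto_applied
print(route(Decision("loan-002", "high", "approve", 0.99), queue))    # sent_to_human_review
```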

AI Explainability and Interpretability

 

Ensuring that AI systems are transparent and their decision-making processes are understandable to human collaborators is crucial for trust and effective collaboration (Hemmer et al., 2024).

 

Implementation Strategies:

 

  • Model Documentation: Utilize standardized documentation methods, such as datasheets and model cards, to provide comprehensive information about AI models’ purposes, data sources, performance metrics, and limitations.
  • Explainability Tools: Employ tools like SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) to elucidate AI decision-making processes.
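
As an illustration of such tooling, a minimal SHAP-based explanation of a scikit-learn model might look as follows. This is a sketch assuming the open-source shap and scikit-learn packages; exact plotting behaviour varies between versions, and LIME follows a similar pattern:

```python
# Minimal sketch: explain a tabular regression model with SHAP.
# Assumes the open-source `shap` and `scikit-learn` packages are installed.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

data = load_diabetes(as_frame=True)
X, y = data.data, data.target

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])   # (n_samples, n_features)

# Global view: which features drive the model's predictions and in which direction.
shap.summary_plot(shap_values, X.iloc[:100])
```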

Human–Computer Interaction (HCI) for AI Systems Design

 

HCI for AI systems focuses on designing interfaces and interactions that optimize human engagement, usability, and trustworthiness in AI-powered technologies (Ehsan & Riedl, 2020). Ensuring that AI systems are transparent, understandable, and reliable is crucial to fostering user confidence.

 

Implementation Strategies:

 

  • User-Centered Design (UCD): Design AI around user needs through research, prototyping, and usability testing. Measure AI success based on usability, not just accuracy.
  • Trust Calibration & Confidence Management:
    Confidence scores: show how certain the AI is about its predictions. Adaptive trust levels: let users customize the degree of AI automation based on their trust preferences.
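
A minimal sketch of trust calibration in code: each prediction is surfaced together with a confidence score, and a user-selected trust level decides whether the system may act autonomously. The level names and thresholds are illustrative assumptions:

```python
# Minimal sketch: show a confidence score with each prediction and let the user
# pick how much automation they are comfortable with. Names and levels are illustrative.
import numpy as np

TRUST_LEVELS = {          # user-selected preference -> minimum confidence for automation
    "cautious": 0.95,
    "balanced": 0.85,
    "hands_off": 0.70,
}

def softmax(logits: np.ndarray) -> np.ndarray:
    exps = np.exp(logits - logits.max())
    return exps / exps.sum()

def present(logits: np.ndarray, labels: list, user_level: str) -> dict:
    probs = softmax(logits)
    top = int(np.argmax(probs))
    confidence = float(probs[top])
    return {
        "prediction": labels[top],
        "confidence": round(confidence, 3),               # always shown to the user
        "auto_apply": confidence >= TRUST_LEVELS[user_level],
    }

print(present(np.array([2.1, 0.3, -1.0]), ["approve", "refer", "reject"], "balanced"))
# -> {'prediction': 'approve', 'confidence': 0.826, 'auto_apply': False}
```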

 

Context-Aware AI Systems

 

Developing AI systems that adapt to the context and expertise level of human users enhances collaboration and ensures that AI provides relevant support without overwhelming the user (Zheng et al., 2023).

 

Implementation Strategies:

 

  • User Profiling: Design AI systems that can assess and adapt to the user’s knowledge level, providing assistance that complements the user’s expertise.
  • Dynamic Interaction Models: Create interfaces that allow users to adjust the level of AI assistance based on the task complexity and their comfort level.

Training and AI Literacy Programs

 

Educating stakeholders about AI capabilities, limitations, and ethical considerations fosters a culture of informed collaboration and trust in AI systems (Sturm et al., 2021).

 

Implementation Strategies:

 

  • Comprehensive Training Programs: Develop curricula that cover AI fundamentals, ethical issues, and practical applications tailored to different stakeholder groups.
  • Continuous Learning Platforms: Implement platforms that provide ongoing education and updates on AI developments, ensuring stakeholders remain informed about the latest advancements and best practices.

Ethical Role-Based AI Governance

 

Defining clear roles and responsibilities within AI systems ensures that ethical considerations are integrated into AI operations, preventing misuse and promoting accountability (Te’eni et al., 2023).

 

Implementation Strategies:

 

  • Responsibility Matrices: Develop frameworks that delineate the responsibilities of AI systems and human operators, ensuring clarity in decision-making processes.
  • Ethical Guidelines: Establish and enforce ethical guidelines that govern AI behaviour, aligning with organizational values and societal norms.
Interactive HCI design steps mind map

Key Components of HCI-led Design for AI Systems

 

A. Initial HCI perspective: The design process addresses four fundamental challenges: control, interpretability, agency, and governance. Control focuses on how users can guide and influence AI system behaviour. Interpretability ensures users can understand system decisions and actions. Agency determines the balance of autonomy between the user and the system. Governance establishes frameworks for responsible AI deployment and usage (Shneiderman, 2020).

 

B. Problem definition and constraints: A solution-neutral problem statement helps avoid premature commitment to specific technologies. This approach considers various abstraction levels and system boundaries, incorporating user-elicited requirements alongside technical, business, and regulatory constraints. The target audience analysis considers behavioural, technological, and demographic factors influencing system adoption and usage.

 

C to F. Function design and function automation. Design decomposition helps manage complexity by breaking down overall system functionality into manageable components. The automation framework follows a systematic approach to determining appropriate levels of automation, considering both human and AI capabilities. This includes strategies for maximizing automation efficiency while maintaining human oversight where necessary (Parasuraman et al., 2000).

 

G. Interaction strategy. Mixed-initiative interaction, as described by Horvitz (1999), provides a framework for balancing system autonomy with user control. This approach considers utility, balance, control, and uncertainty in determining when and how the system should take initiative. The concept of alignment and teaming (Bansal et al., 2019) emphasizes the importance of creating effective human-AI partnerships.

 

H. to I. Interpretability and control. Modern AI systems require multiple levels of interpretability evaluation: application-grounded (with domain experts), human-grounded (with general users), and functionally-grounded (technical evaluation). The H-metaphor for shared control, inspired by horse-rider interaction, provides an intuitive framework for understanding different control modes. This includes “loose rein” control for familiar situations and “tight rein” control for uncertain or critical scenarios (Flemisch et al., 2003).

 

J. Risk assessment and safety. A comprehensive approach to risk assessment considers the dynamic system boundary, including user interfaces, applicable rules, principals (personas), agents, regulations, economic factors, and training data. The framework employs various assessment methods, such as SWIFT (Structured What-If Technique) and FMEA (Failure Mode and Effects Analysis), to identify and mitigate potential risks.

 

K. Ethics and Governance. Ethical considerations are integrated throughout the design process, with particular attention to potential human errors using the Skills, Rules, and Knowledge (SRK) model (Rasmussen, 1983). This helps ensure AI systems support human capabilities rather than introducing new risks or limitations.

2. AI Alignment & Transparency: Refining, Guiding, & Managing AI Systems

AI Steering: Improving and Controlling AI Systems

 

Regulation should focus on AI’s impact and on mechanisms for continuous oversight and improvement. Adaptive regulatory frameworks, similar to those used in finance and biotechnology, may provide a viable model (Baldwin et al., 2012). Methods such as AI auditing, explainability requirements, and dynamic regulatory sandboxes should be explored (Veale et al., 2018). Ensuring that AI remains controllable and that human agency is preserved (Rahwan et al., 2019) will be critical to establishing effective oversight mechanisms.

A Maturity-Based Approach to Implementation

 

The journey of implementing and improving AI systems follows a natural progression that aligns with organizational maturity and resource availability. Understanding this progression is crucial for organizations to make informed decisions about their AI development strategy (Bommasani et al., 2021).

1 Prompt Engineering: Entry-Level Enhancement

Prompt engineering represents the most accessible starting point for AI improvement, requiring relatively minimal technical infrastructure and investment. This approach focuses on optimizing the interaction with existing AI models through carefully crafted inputs. Effective prompt engineering can significantly improve model performance without model modifications, though improvements vary greatly depending on the specific task and context (Liu et al., 2022).

 

Complexity: Low to Medium
Time Investment: Days to Weeks

Required Expertise: Domain knowledge and basic AI understanding

Infrastructure Needs: Minimal
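
A minimal, model-agnostic sketch of the kind of structured prompt this stage relies on; the template fields, wording, and example task are illustrative, and the resulting string can be sent to whichever language-model endpoint an organization uses:

```python
# Minimal sketch: a reusable prompt template with role, context, constraints and a
# 'chain of thought' style instruction. Fields and wording are illustrative only.
PROMPT_TEMPLATE = """You are a {role}.

Context:
{context}

Task:
{task}

Constraints:
- Answer only from the context above; say "unknown" if the context is insufficient.
- Think through the problem step by step before giving the final answer.
- Return the final answer as JSON with the keys: answer, reasoning, confidence.
"""

def build_prompt(role: str, context: str, task: str) -> str:
    return PROMPT_TEMPLATE.format(role=role, context=context, task=task)

prompt = build_prompt(
    role="compliance analyst for a retail bank",
    context="Policy excerpt: personal data may be retained for at most 24 months...",
    task="May marketing data collected in January 2022 still be stored in June 2024?",
)
print(prompt)  # the string is then passed to the chosen LLM endpoint
```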

2 Retrieval Augmented Generation (RAG): Intermediate Enhancement

RAG represents a significant step up in complexity while offering substantial improvements in AI system performance. This approach combines the power of large language models with custom knowledge bases, enabling more accurate and contextually relevant outputs (Lewis et al., 2020).

 

Complexity: Medium to High
Time Investment: Weeks to Months

Required Expertise: Software engineering, data engineering, ML basics

Infrastructure Needs: Moderate

  • Vector databases
  • Document processing pipelines
  • API integration capabilities
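
A minimal sketch of the RAG pattern, assuming the open-source sentence-transformers package for embeddings; the documents, model name, and prompt wording are illustrative:

```python
# Minimal RAG sketch: embed a small document store, retrieve the best-matching
# passages for a query, and build an augmented prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Refunds are possible within 30 days of purchase with a valid receipt.",
    "Our support team is available Monday to Friday, 9:00-17:00 CET.",
    "Premium subscribers receive free shipping on all orders.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = encoder.encode(documents, normalize_embeddings=True)  # unit vectors

def retrieve(query: str, k: int = 2) -> list:
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q                      # cosine similarity via dot product
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def build_augmented_prompt(query: str) -> str:
    context = "\n".join(f"- {passage}" for passage in retrieve(query))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

print(build_augmented_prompt("Can I get my money back after three weeks?"))
# The augmented prompt is then passed to the organization's language model of choice.
```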

3 Fine-tuning Open Source Models: Advanced Implementation

Fine-tuning represents a more sophisticated approach requiring substantial technical expertise and computational resources. This method allows organizations to adapt existing models to specific use cases while leveraging pre-trained capabilities (Wei et al., 2022).

 

Complexity: High
Time Investment: Months

Required Expertise: Machine learning engineering, MLOps

Infrastructure Needs: Substantial

  • GPU/TPU resources
  • Training infrastructure
  • Model versioning systems
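
A minimal fine-tuning sketch using the open-source Hugging Face stack (transformers and datasets); the base model, dataset, and hyperparameters are illustrative, and real projects add evaluation, experiment tracking, and model versioning on top:

```python
# Minimal fine-tuning sketch with the Hugging Face Trainer.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("imdb")  # small illustrative classification task

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="./finetuned-model",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].select(range(500)),
)
trainer.train()
trainer.save_model("./finetuned-model")  # the artefact that then enters version control
```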

4 Pre-training Custom Models: Expert-Level Implementation

Pre-training custom models represents the most complex and resource-intensive approach to AI improvement. This method provides maximum flexibility and potential for innovation but requires significant organizational maturity in AI capabilities (Brown et al., 2020).

 

Complexity: Very High
Time Investment: Months to Years

Required Expertise: Advanced ML research, distributed systems

Infrastructure Needs: Extensive

  • Large-scale distributed computing
  • Substantial data storage
  • Advanced monitoring systems
AI improvement measures in context of maturity and complexity

Actionable instruments supporting AI alignment & transparency

 

AI must be aligned with human values, transparent in its operations, and continuously monitored to ensure fairness, reliability, and adaptability. This requires real-time oversight, dynamic governance, and continuous improvement through transparent AI development practices.

Algorithmic Impact Assessments (AIA)

 

AIAs help evaluate the ethical, social, and safety risks of AI models before deployment, ensuring alignment with organizational and societal values (Reisman et al., 2018).

 

Implementation Strategies:

 

  • Develop structured evaluation frameworks to analyze the risks of AI models before they go live.
  • Conduct pre-deployment and post-deployment AIAs to assess long-term AI behaviour.
  • Require organizations to publicly disclose AI impact statements for high-risk AI applications.

Bias, Fairness & Robustness Audits

 

These audits ensure that AI models operate without bias, maintain fairness, and perform reliably across diverse conditions (Mitchell et al., 2019).

 

Implementation Strategies:

 

  • Conduct automated bias detection and fairness testing using standardized fairness metrics (e.g., demographic parity, equalized odds).
  • Utilize AI fairness tools like IBM AI Fairness 360 and Google’s What-If Tool.
  • Perform regular adversarial testing and stress testing to assess AI robustness.
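
To make the standardized fairness metrics mentioned above concrete, here is a minimal sketch computing the demographic parity difference and an equalized-odds (true-positive-rate) gap on illustrative data:

```python
# Minimal sketch: compute two common fairness metrics on illustrative data.
# `y_true` are ground-truth outcomes, `y_pred` model decisions, `group` a protected attribute.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 0, 0, 0])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

def selection_rate(pred, mask):
    return pred[mask].mean()

def tpr(true, pred, mask):  # true-positive rate within a group
    positives = mask & (true == 1)
    return pred[positives].mean() if positives.any() else float("nan")

a, b = group == "A", group == "B"

# Demographic parity difference: gap in positive-decision rates between groups.
dp_diff = abs(selection_rate(y_pred, a) - selection_rate(y_pred, b))

# Equalized odds (here only the TPR component): gap in true-positive rates.
eo_gap = abs(tpr(y_true, y_pred, a) - tpr(y_true, y_pred, b))

print(f"demographic parity difference: {dp_diff:.2f}")
print(f"equal-opportunity (TPR) gap:   {eo_gap:.2f}")
```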

 

AI Model Version Control & Continuous Monitoring

 

Tracking model updates, training data modifications, and decision logic ensures accountability and consistency in AI behaviour (Amershi et al., 2019).

 

Implementation Strategies:

 

  • Use version control systems to track changes in model architecture and training datasets.
  • Implement automated monitoring dashboards that flag deviations in AI behaviour.
  • Set up automated alerts for shifts in AI model performance due to concept drift or unexpected correlations.
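
A minimal monitoring sketch that flags potential drift by comparing a feature's live distribution against its training-time distribution; the data, statistical test, and alert threshold are illustrative assumptions:

```python
# Minimal sketch: flag possible data drift by comparing the live distribution
# of a feature against its training distribution.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(seed=0)
training_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)   # snapshot at training time
live_feature     = rng.normal(loc=0.4, scale=1.1, size=1_000)   # recent production traffic

statistic, p_value = ks_2samp(training_feature, live_feature)

ALERT_P_VALUE = 0.01   # illustrative threshold for raising a drift alert
if p_value < ALERT_P_VALUE:
    print(f"DRIFT ALERT: KS statistic {statistic:.3f}, p-value {p_value:.2g} "
          f"- route to the monitoring dashboard / on-call reviewer")
else:
    print("No significant drift detected for this feature")
```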

 

Data Provenance & Traceability

 

Ensuring transparent data sourcing, preprocessing, and labelling prevents hidden biases and maintains data integrity (Gebru et al., 2021).

 

Implementation Strategies:

 

  • Maintain audit logs for every stage of data processing.
  • Use blockchain or cryptographic hash functions to track data lineage.
  • Require data documentation (Datasheets for Datasets) to disclose potential biases and limitations.
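
A minimal sketch of hash-based lineage tracking: each pipeline step is recorded with SHA-256 digests of its input and output artefacts and chained to the previous entry, giving a tamper-evident audit trail. File paths and step names are illustrative:

```python
# Minimal sketch: tamper-evident lineage entries chained via SHA-256 hashes.
import hashlib
import json
from datetime import datetime, timezone

def sha256_of_file(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def lineage_entry(step: str, input_path: str, output_path: str, previous_hash: str) -> dict:
    entry = {
        "step": step,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_sha256": sha256_of_file(input_path),
        "output_sha256": sha256_of_file(output_path),
        "previous_entry_hash": previous_hash,      # chains entries, like a simple ledger
    }
    entry["entry_hash"] = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
    return entry

# Example usage: append one entry per pipeline stage to an audit log (paths are placeholders).
# log.append(lineage_entry("deduplicate", "raw.csv", "clean.csv", previous_hash=log[-1]["entry_hash"]))
```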

Transparency Reports & Disclosure Requirements

 

AI systems that impact critical decisions should have publicly available transparency reports that explain decision logic, risk assessments, and system limitations (Raji et al., 2021).

 

Implementation Strategies:

 

  • Organizations must publish AI transparency reports detailing their models’ intended purpose, data sources, fairness measures, and real-world performance.
  • Ensure public disclosure of AI system limitations in critical domains (e.g., predictive policing, facial recognition).
  • Develop AI fact sheets that summarize key ethical concerns and technical limitations.
  • Implement sector-specific disclosure requirements in finance, healthcare, criminal justice, and hiring systems.
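
A minimal sketch of what a machine-readable AI fact sheet might contain; the field names follow the spirit of model cards and datasheets, and all values are illustrative:

```python
# Minimal sketch of a machine-readable AI fact sheet / transparency record.
import json

fact_sheet = {
    "system_name": "credit-risk-scorer",           # illustrative system
    "intended_purpose": "Support (not replace) analysts in consumer credit decisions",
    "out_of_scope_uses": ["fully automated rejection without human review"],
    "data_sources": ["internal loan history 2015-2023", "credit bureau records"],
    "fairness_measures": {
        "metrics": ["demographic parity difference", "equal opportunity gap"],
        "last_audit": "2025-01-15",
    },
    "known_limitations": ["performance degrades for applicants with thin credit files"],
    "human_oversight": "All declines are reviewed by a credit officer",
    "contact_for_redress": "ai-governance@example.org",
}

with open("fact_sheet.json", "w", encoding="utf-8") as fh:
    json.dump(fact_sheet, fh, indent=2)              # published alongside the transparency report
print(json.dumps(fact_sheet, indent=2))
```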

Stress Testing & Adversarial Resilience

 

AI should be stress-tested under extreme conditions to ensure robustness, security, and fairness in real-world applications (Carlini et al., 2019).

 

Implementation Strategies:

 

  • Perform synthetic stress tests where AI is exposed to worst-case scenarios.
  • Use adversarial training techniques to prevent AI manipulation.
  • Conduct external red teaming exercises to identify vulnerabilities in AI models.
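
A minimal sketch of a perturbation-based stress test in the fast-gradient-sign (FGSM) style, assuming PyTorch; the toy model and data are illustrative, and real red-teaming uses much richer attack suites:

```python
# Minimal sketch: measure how much a classifier's accuracy drops under small
# adversarial (FGSM-style) perturbations. Toy model and data are illustrative.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(20, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
x = torch.randn(256, 20)                       # stand-in for real feature vectors
y = torch.randint(0, 2, (256,))

def accuracy(inputs):
    with torch.no_grad():
        return (model(inputs).argmax(dim=1) == y).float().mean().item()

def fgsm(inputs, labels, epsilon=0.1):
    inputs = inputs.clone().requires_grad_(True)
    loss = F.cross_entropy(model(inputs), labels)
    loss.backward()
    return (inputs + epsilon * inputs.grad.sign()).detach()   # worst-case small perturbation

clean_acc = accuracy(x)
adv_acc = accuracy(fgsm(x, y, epsilon=0.1))
print(f"accuracy on clean inputs:       {clean_acc:.2f}")
print(f"accuracy under FGSM (eps=0.1):  {adv_acc:.2f}")   # large drops flag fragile models
```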

3. Accountability & Oversight: Establishing Responsibility & Redress Mechanisms

Actionable instruments supporting accountability & oversight

 

AI accountability requires clear ownership, governance structures, legal frameworks and accountability mechanisms to ensure responsible use, prevent harm, and enable redress for affected individuals (Zerilli et al., 2019). Organizations must embed legal, technical, and operational mechanisms to enforce accountability across AI development and deployment.

Legal Liability & Redress Frameworks

 

Establish clear accountability structures for AI-related harms, ensuring individuals and organizations can seek redress when AI systems cause unintended consequences (Ge & Zhu, 2024).

 

Implementation Strategies:

 

  • Define legal liability assignments for developers, deployers, and users when AI systems cause harm.
  • Establish dispute resolution mechanisms for AI-related incidents, including mediation and legal redress.
  • Align liability frameworks with existing regulatory requirements (e.g., EU AI Act, GDPR).

Internal & External AI Audits & Certifications

 

Ethics-based auditing (EBA) is a systematic approach to evaluate an entity’s past or current actions to ensure alignment with ethical principles or standards (Mökander & Floridi, 2022). Independent audits enhance trust, transparency, and regulatory compliance, preventing unchecked AI risks (Raji et al., 2022).

 

Implementation Strategies:

 

  • Adopt a systematic process for evaluating the organization’s past and present behaviour to ensure alignment with relevant principles or norms.
  • Mandate third-party AI audits to validate AI compliance with transparency, fairness, and ethical standards.
  • Develop industry certifications for ethical AI, similar to ISO 42001 or NIST AI RMF.
  • Require companies to disclose audit results to ensure public transparency.

User Rights & AI Recourse Mechanisms

 

Allow individuals affected by AI decisions to contest, appeal, and seek remedies for unjust outcomes (Wachter et al., 2017).

 

Implementation Strategies:

 

  • Implement right-to-appeal processes for users who experience negative AI-driven outcomes (e.g., denied loans, unfair job rejections).
  • Require explainability mechanisms (e.g., model cards, SHAP, LIME) so users understand why a decision was made.
  • Ensure corrective measures are in place, enabling organizations to reverse or adjust AI decisions when errors occur.

AI Ethics Review Boards

 

Independent oversight committees ensure that AI deployments align with ethical principles, regulatory requirements, and social responsibility (Lechterman, 2023).

 

Implementation Strategies:

 

  • Establish multidisciplinary AI Ethics Review Boards within organizations and regulatory bodies.
  • Require review boards to assess major AI deployments before public release, ensuring ethical compliance.
  • Promote public participation in AI governance, allowing external experts and affected communities to contribute.

Whistleblower & Incident Reporting Mechanisms

 

Provide secure channels for employees and users to report AI-related harms, biases, or ethical concerns (Brundage et al., 2020).

 

Implementation Strategies:

 

  • Implement confidential reporting systems within organizations for AI-related ethical concerns.
  • Establish government-backed AI ethics hotlines for public whistleblowing related to AI misuse.
  • Ensure legal protections for whistleblowers who expose AI-related discrimination or harm.

Regulatory Alignment & Compliance

 

Ensure AI systems adhere to international and sector-specific regulatory frameworks, including ISO 42001 (AI Management Systems), the EU AI Act, GDPR, and NIST AI RMF, with transparent audits and compliance reporting (Jobin et al., 2019; European Commission, 2021).

 

Implementation Strategies:

 

  • AI Regulatory Compliance Audits
  • Impact-Based AI Regulation Frameworks
  • Transparency in AI Compliance Reporting
  • AI Risk Mitigation & Legal Accountability
  • Refer also to the tool below

The Challenge of AI Accountability and Regulatory Compliance

 

Establishing accountability and building effective regulatory compliance frameworks for AI systems presents unique challenges due to several interconnected factors. Krafft et al. (2020) identify three fundamental challenges: the complexity of AI systems, the dynamic nature of AI development, and the distributed responsibility across multiple stakeholders.

 

The first major challenge lies in the technical complexity of AI systems. Modern AI, particularly deep learning models, often operates as “black boxes” where the relationship between inputs and outputs isn’t easily interpretable. Rudin (2019) highlights how this opacity makes it difficult to attribute responsibility when systems produce undesired outcomes. Traditional accountability mechanisms, designed for deterministic systems, struggle to address the probabilistic nature of AI decision-making.

 

A second critical challenge involves the dynamic and evolving nature of AI systems. Unlike traditional software, many AI systems continue to learn and adapt after deployment. Morley et al. (2019) highlight how this creates particular challenges for translating ethical principles into practice, as system behaviours may change over time, making it difficult to maintain consistent accountability frameworks.

 

The distributed nature of AI development and deployment further complicates accountability. Modern AI systems often involve multiple stakeholders: data providers, model developers, system integrators, and end-users. Cath et al. (2017) point out how this distributed responsibility creates challenges in attributing accountability when issues arise. Traditional legal and regulatory frameworks, designed for clear lines of responsibility, struggle to address these complex webs of interaction.

Nevertheless, building self-reinforcing regulatory compliance frameworks is essential for further developing and applying the technology.  The following key barriers must be addressed and overcome (Singh et al., 2022):

 

    1. The rapid pace of AI advancement often outpaces regulatory development
    2. The global nature of AI development makes it difficult to enforce consistent standards
    3. The tension between innovation and regulation creates resistance to comprehensive frameworks

 

Despite these challenges, recent research suggests potential paths forward. Floridi et al. (2020) propose a “design for values” approach that embeds ethical considerations and compliance requirements into the development process itself. This proactive approach might help create more naturally self-reinforcing regulatory systems.

 

In this “value-led” sense, start by using our resources map:


It all starts with a better understanding of digital trust.
