
I. Introduction: From Legal Prediction to Judicial Assistance
The integration of artificial intelligence into legal decision-making is no longer speculative. Large language models (LLMs), trained on massive corpora of text and capable of sophisticated pattern recognition, now demonstrate unprecedented competence in core legal tasks such as extracting issues from pleadings, mapping statutory provisions, identifying relevant precedents, predicting outcomes, and drafting reasoned opinions1. Empirical studies report increasingly high accuracy in predicting judicial decisions, raising serious questions about the potential role of such systems in adjudication. Katz, Bommarito, and Blackman famously achieved over 70% accuracy in predicting U.S. Supreme Court outcomes2, while subsequent transformer-based architectures have further improved accuracy in legal document classification and case outcome prediction.
For adjudication forums grappling with chronic delays and ever-expanding dockets, these developments appear almost providential. India’s commercial tribunals are a case in point: statutory timelines are routinely breached, commercial disputes are mired in procedural congestion, and tribunal members are required to adjudicate highly technical matters under intense time pressure. In such settings, LLMs promise tangible assistance by functioning as advanced research clerks and institutional decision-support systems.
II. Structural and Ethical Challenges
- The Interpretation Problem: Why Prediction Is Not Adjudication
Despite their predictive power, LLMs confront a deep structural limitation: they do not interpret law in the normative sense.
A. Statistical Correlation vs Normative Judgment
Legal adjudication is not merely about forecasting outcomes but about engaging in normative judgment. Surden3 cautions that machine learning systems lack the capacity for purposive interpretation, moral reasoning, and counterfactual analysis that sophisticated adjudication requires. LLMs infer probable outcomes based on statistical regularities, but adjudication requires normative evaluation, deciding what ought to be done given competing values, purposes, and social consequences.
Kleinberg et al.4 underscore this limitation through the “policy layer” problem: while algorithms can predict risk, they cannot determine acceptable risk thresholds without normative input. In insolvency adjudication, for instance, deciding whether to approve a resolution plan involves value judgments about creditor fairness, economic revival, and distributive equity, judgments irreducible to historical data.
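The “policy layer” point can be illustrated with a minimal sketch. Everything in it is hypothetical: the plan names, the predicted failure probabilities, and the two tolerance values are invented for illustration and do not come from any cited study or real system. The sketch shows only that identical predictions yield different decisions once a risk tolerance is chosen, and that the choice of tolerance is an input the model does not supply.

```python
from typing import Dict, List

# Hypothetical model output: predicted probability that each resolution
# plan fails. The values are illustrative assumptions, not real data.
predicted_failure_risk: Dict[str, float] = {
    "plan_a": 0.15,
    "plan_b": 0.35,
    "plan_c": 0.55,
}


def plans_within_tolerance(risks: Dict[str, float],
                           max_acceptable_risk: float) -> List[str]:
    """Return the plans whose predicted risk does not exceed the tolerance.

    The tolerance is a normative choice made outside the model; the
    prediction step has no way of deriving it from historical data.
    """
    return sorted(name for name, risk in risks.items()
                  if risk <= max_acceptable_risk)


# Two adjudicators with the same predictions but different (assumed)
# views on how much risk of failure is acceptable.
cautious = plans_within_tolerance(predicted_failure_risk, 0.20)
permissive = plans_within_tolerance(predicted_failure_risk, 0.40)

print("Cautious tolerance approves:", cautious)
print("Permissive tolerance approves:", permissive)
```

The predictions never change between the two calls; only the threshold does. Which threshold is appropriate depends on judgments about creditor fairness, economic revival, and distributive equity, which is where the adjudicator's normative role lies.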
B. Counterfactual and Teleological Reasoning
Judicial reasoning often turns on counterfactuals (“but for this breach…”) and purposive interpretation. LLMs, trained on past text, struggle with genuinely novel factual matrices or purposive departures from precedent, precisely the cases that define appellate and constitutional jurisprudence and tribunal innovation.
In the Indian context, evolving doctrines under the IBC, such as the treatment of government dues, avoidance transactions, or group insolvency, require forward-looking interpretation that cannot be reliably extrapolated from past decisions alone.
- The Legitimacy Problem: Opacity, Reasons, and Accountability
A. Algorithmic Opacity and Judicial Reasoning
Burrell5 identifies three forms of algorithmic opacity: intentional secrecy, technical complexity, and interpretive opacity. LLMs embody all three. Their internal reasoning processes are not transparent even to their designers, making it difficult to reconstruct why a particular output was generated and leaving that output without any meaningful explanation.
This opacity poses a direct challenge to judicial legitimacy. Courts and tribunals derive authority not merely from outcomes, but from reasoned justification. Judgments must be intelligible, contestable, and subject to appellate scrutiny.
B. Incompleteness and Legal Disputes
Doshi-Velez et al.6 argue that interpretability becomes essential when problems are incompletely specified, exactly the condition characterizing most legal disputes. Legal adjudication involves open-textured standards (“reasonableness”, “good faith”, “public interest”) that cannot be exhaustively formalised.
An LLM-generated conclusion, even if statistically accurate, lacks the deliberative transparency necessary for interrogation, challenge, and appellate review under Articles 226 and 227 of the Indian Constitution.
- The Bias Reproduction Problem: Historical Data as Structural Injustice
A. Learning from a Biased Past
Angwin et al.’s exposé7 of racial bias in criminal risk assessment tools illustrates a broader concern: algorithms trained on historical decisions inherit, perpetuate, and may amplify historical injustices and systemic biases, even when designers do not explicitly encode discriminatory rules. Selbst et al.8 expand this critique by identifying “abstraction traps”: the simplifications necessary to build technical systems often strip away social context, embed structural inequalities into apparently neutral models, and render algorithmic fairness elusive in sociotechnical systems. Unlike earlier tools, LLMs can reproduce bias in persuasive prose, making discriminatory patterns harder to detect and contest. The risk is not merely unfair outcomes, but epistemic capture, where biased patterns are normalized through repeated algorithmic articulation of “what the law is.”
For example, LLMs trained on Indian judicial data risk reproducing systemic biases embedded in past decision-making, whether against MSMEs in insolvency, against operational creditors, or against certain classes of litigants.
B. Automation Bias
Stevenson9 provides an empirical look at how risk assessment tools operate in real courtrooms. Judges do not blindly follow algorithms; they interpret, adapt, and sometimes resist them. Yet even when formally advisory, algorithmic outputs exert a gravitational pull on decision-making.
This phenomenon, often termed automation bias, is especially concerning with LLMs. Because their outputs are discursive rather than numeric, they integrate seamlessly into legal workflows: bench memos, draft opinions, research notes. Over time, this may subtly reshape judicial cognition, privileging statistically typical reasoning over creative or counter-majoritarian interpretation.

