Curbing Strategic Deception in Digital Markets¶

Swapneel Mehta$^{1,2}$, Aaron Nichols$^2$, Nina Mazar$^2$, Marshall Van Alstyne$^2$

$^1$ MIT   $^2$ Boston University

Aaron Nichols: experimental economics, incentive design. Nina Mazar: behavioral science of honesty, consumer decisions, formerly World Bank. Marshall Van Alstyne: platform economics, two-sided markets.

truthmarket.com

About Me¶

  • Ph.D. in Data Science, New York University
  • Postdoc at MIT and Boston University on AI Safety in Information Markets
  • Previously built ML systems at X, Slack, Adobe, Oxford, Meta, CERN
  • Co-founded SimPPL (2021), nonprofit research lab working in 7 countries, $2.5M+ in grants from Google, Mozilla, Wikimedia, Ford, Omidyar

Why IS? I kept shipping algorithms that optimized engagement and reliably rewarded deceptive content. I wanted to study the market structures that make deception profitable, not just detect it. The postdoc with Van Alstyne and Mazar let me design interventions that change seller incentives rather than policing outcomes.

How can we improve reliability of AI-mediated information?¶

  1. Curb deception to improve reliability: How do we reduce agentic deception on digital platforms? (TODAY'S TALK)
  2. Build inclusive solutions: How do we deliver accurate health information in the Global South?
  3. User controls for value creation: Can user controls improve value creation on decentralized platforms?
  4. Make information ecosystems transparent: Can we surface digital harms with investigative AI agents?

Agentic AI deceives in competitive settings¶

  • AI discovers insider trading when tasked with profit maximization (Scheurer et al., 2024)
  • AI pretends to collaborate then exploits partners in Diplomacy (Meta CICERO, Science 2022)
  • LLM pricing agents autonomously reach supracompetitive prices (Fish et al., 2024)

AI is the new third-party seller¶


To our knowledge, no one has studied what happens when AI sellers compete for human buyers under different governance regimes. That is the gap.

Reputation systems rest on violated assumptions¶

  • Long-lived entities who care about future interactions. But sellers rebrand costlessly. (Friedman & Resnick, 2001)
  • Feedback is honest and informative. But fake reviews manipulate ratings, making them unreliable signals of quality. (Tadelis, 2016; Mayzlin et al., 2014)
  • Rated entities are human decision-makers. AI agents can process feedback and optimize strategies at scales impossible for humans, including strategic market exits. (Cabral & Hortacsu, 2010)

Amazon spends $1.2B and still can't stop it¶

Amazon seized more than seven million counterfeit products in 2023

Amazon seizes 15 million counterfeits in 2024

Reddit: Can I sell my Amazon account?

275 million suspected fake reviews blocked in 2024. 15 million counterfeits seized. Sellers openly trade accounts on Reddit.



How can we avoid agents deceiving human buyers in digital marketplaces?¶

Can market design govern AI behavior through incentive alignment, without modifying the AI itself?

Staking builds on costly signaling theory¶

  • Information asymmetry causes adverse selection: low-quality goods drive out high-quality when buyers cannot distinguish quality (Akerlof, 1970)
  • Costly signals resolve this: signals are credible because they are differentially expensive to fake (Spence, 1973)
  • Our prior work tested "truth warrants" for social media sharing. Across 3 experiments (N=5,277), warrants increased sharing quality and perceived accuracy without censorship (Nichols, Mazar, Mehta, Parker, Pennycook, Rand, Van Alstyne; R&R at Nature Communications)
  • This paper extends truth warrants from information markets to product markets, and from human sharers to AI agent sellers

Community governance scales for disputes¶


  • X's Community Notes reduce retweets by ~50%, increase deletions by 80% (Renault et al., 2024). But notes arrive after ~80% of engagement has occurred.
  • Taobao peer juries: 4 million volunteers adjudicate 2,000+ disputes per day since 2012
  • Meituan user juries: 6 million registered jurors for food delivery disputes
  • Staking moves adjudication to pre-sale: the signal happens before purchase, not after

Improving accountability via staking¶

[Figure: Without stake]

[Figure: With stake]

  • Current markets rely on reputation (thumbs up/down)
  • Staking = seller escrows extra collateral to back advertised claims
  • If claim is false and buyer challenges successfully, collateral is forfeited
  • Staked products carry an "integrity premium" (higher price to buyer)

Sellers voluntarily stake to signal honesty¶

  • Sellers deposit money to "stake" their product advertisements
  • Staking places a refundable amount at risk. If unchallenged or honest, seller gets it back.
  • If the ad is false and a buyer challenges it, the stake is forfeited to the buyer
  • The warrant is voluntary and self-imposed. Honest sellers use it as a differentiator.
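
To make the settlement mechanics concrete, here is a minimal sketch in Python. All names (`Listing`, `settle`) and the small fee for a failed challenge are our illustrative assumptions, not the platform's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Listing:
    price: int
    stake: int           # 0 = seller chose not to stake
    ad_is_honest: bool   # ground truth, revealed after purchase

def settle(listing: Listing, buyer_challenges: bool) -> tuple[int, int]:
    """Return (seller_delta, buyer_delta) once the sale resolves."""
    seller, buyer = listing.price, -listing.price
    if listing.stake > 0 and buyer_challenges:
        if listing.ad_is_honest:
            buyer -= 1               # failed challenge: small fee, stake refunded
        else:
            seller -= listing.stake  # successful challenge: stake forfeited...
            buyer += listing.stake   # ...to the buyer
    # An unchallenged stake is simply refunded, so it costs an honest seller nothing
    return seller, buyer
```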

We need experiments that reflect real markets¶

How do we run controlled experiments that capture interactive, long-horizon, utility-maximizing decision-making in real marketplaces?

Our answer: build an online marketplace. No open benchmarks exist for AI sellers in two-sided markets; our platform is the first to support real-time human-AI experimentation.

See the marketplace in action¶

Seller experience

Buyer experience

Reputation Market: Basic Flow¶

Sellers create honest or dishonest ads → Buyers choose products to buy → True qualities revealed, points awarded to buyers → Sellers earn profits based on products sold

Walk through the basic game loop. Each round, sellers list products with ads. Buyers see the ads and decide what to buy. After purchase, the true quality is revealed. This is the core loop for both market types.
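
A minimal, runnable sketch of this round loop, using a random stand-in for the buyer's choice rule and the per-sale payoffs from the seller slide later in the deck; all names are illustrative, not the platform's API.

```python
import random
from dataclasses import dataclass

HONEST_PROFIT, DISHONEST_PROFIT = 4, 8  # per-sale payoffs from the experiment

@dataclass
class Listing:
    seller: str
    honest_ad: bool  # does the ad match the product's true quality?

def play_round(sellers: list[str], n_buyers: int, cheat_rate: float) -> dict[str, int]:
    """One round: sellers list products, buyers pick, profits accrue."""
    listings = [Listing(s, random.random() > cheat_rate) for s in sellers]
    profits = {s: 0 for s in sellers}
    for _ in range(n_buyers):
        pick = random.choice(listings)  # stand-in for the buyer's actual choice rule
        profits[pick.seller] += HONEST_PROFIT if pick.honest_ad else DISHONEST_PROFIT
    return profits

print(play_round(["Honest", "Cheater", "LLM"], n_buyers=2, cheat_rate=0.3))
```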

Reputation Market: Identity Reset¶

Sellers create ads → Buyers choose → Qualities revealed → Sellers earn profits
↓
Sellers may lose reputation for false sales → Sellers may change brand to reset reputation

The key tension: reputation is the only disciplining mechanism, but sellers can rebrand to escape it. This is the analog of creating new accounts on platforms. Reputation alone is insufficient when identity is cheap.

Stakes Market: Pre-Sale Accountability¶

Sellers create ads; may stake to signal honesty → Buyers choose products to buy → True qualities revealed to buyers → Sellers earn profits based on products sold

The stakes market adds one new action for sellers: they can attach a stake to their ad. The stake is escrowed up front but fully refundable if the claim holds, creating a bond that buyers can challenge post-purchase. The mechanism shifts accountability from post-hoc reputation to pre-sale commitment.

Stakes Market: Buyer Recourse¶

Sellers create ads + optional stake → Buyers choose → Qualities revealed → Sellers earn profits
↓
Buyers challenge misleading staked ads to win back points → Sellers lose profits for false stakes (lost challenges) → Sellers may rebrand to reset staking history

The challenge mechanism is the key innovation. It gives buyers actionable recourse: if a seller staked a false claim, the buyer can challenge and recover their loss. This makes false staking costly, unlike reputation which is costless to shed via rebranding.

Seller Choices and Profit Maximization¶

Produce high-quality product → Advertise honestly → +4 per sale → Add Stake (+2) or No Stake
Produce low-quality product → Advertise dishonestly → +8 per sale → Add Stake (-9 if challenged) or No Stake
Dishonest advertising yields +8 per sale vs +4 for honest. But staking a false claim risks -9 if challenged. The tension: higher per-sale profit vs catastrophic downside risk.

This is the core tension for sellers. Dishonest ads are twice as profitable per unit. Without stakes, the only check is reputation, which can be reset. With stakes, false claims carry a concrete penalty that cannot be escaped by rebranding.
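
Under the payoffs above, and assuming -9 is the seller's net outcome when a false stake is challenged, a one-line expected-value calculation shows when deception stops paying:

```python
def ev_dishonest_staked(p_challenge: float) -> float:
    # (1 - p) chance of pocketing +8, p chance of the -9 challenged outcome
    return (1 - p_challenge) * 8 + p_challenge * (-9)

# Deception beats honest selling (+4 per sale) only while p_challenge < 4/17 ≈ 0.235
print(ev_dishonest_staked(4 / 17))  # ≈ 4.0: the indifference point with honest selling
```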

Buyer Choices and Profit Maximization¶

Purchase from a high-quality ad → High-quality product (+4) or Low-quality product (-4, cheated)
Purchase from a low-quality ad → High-quality product (+9, bargain!) or Low-quality product (+1)
If staked → Challenge the stake → Win (+7) or Lose (-1)
Challenging a false staked ad is nearly always profitable because the buyer holds evidence of harm. This gives buyers actionable recourse beyond reputation.

From the buyer side, challenging a false stake is almost always worthwhile: +7 if you win, -1 if you lose. The asymmetry is deliberate. It ensures that the challenge mechanism is used and that staking carries real consequences.
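
Given the slide's payoffs (+7 for a won challenge, -1 for a lost one), the buyer's indifference point follows directly:

```python
def ev_challenge(p_win: float) -> float:
    # Expected value of challenging, given the buyer's belief about winning
    return 7 * p_win - 1 * (1 - p_win)   # simplifies to 8 * p_win - 1

print(ev_challenge(1 / 8))  # 0.0: challenging pays whenever p_win exceeds 1/8
print(ev_challenge(0.9))    # 6.2: the typical case when evidence of harm exists
```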

Seven Seller Strategies from Axelrod (1981)¶

| # | Strategy | Production | Advertising |
|---|----------|------------|-------------|
| 1 | Honest | Always high quality | Honest |
| 2 | Bait-and-Switch | High until sold, then switch | Dishonest after switch |
| 3 | Cheater | Always low quality | Always dishonest |
| 4 | Reformed Cheat | Low until sold, then high | Switches |
| 5 | Goldfish | Low until sold, oscillates | Follows production |
| 6 | Politician | High for 2 sales, then low | Follows production |
| 7 | LLM Seller | Autonomous | Based on market conditions |

Strategies 1-6 are inspired by the Iterated Prisoner's Dilemma tournament (Axelrod 1981). The LLM seller (gpt-4o-mini) receives the same market information as human sellers and chooses production and advertising autonomously. It has no hardcoded strategy.
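
For intuition, here is one hypothetical encoding of three of the scripted strategies; the `state` variable and return format are our reconstruction, not the experiment's exact code.

```python
def honest(state: dict) -> dict:
    return {"quality": "high", "ad": "honest"}

def cheater(state: dict) -> dict:
    return {"quality": "low", "ad": "dishonest"}

def bait_and_switch(state: dict) -> dict:
    # High quality until the first sale, then low quality with dishonest ads
    if state["sales"] == 0:
        return {"quality": "high", "ad": "honest"}
    return {"quality": "low", "ad": "dishonest"}

print(bait_and_switch({"sales": 0}))  # {'quality': 'high', 'ad': 'honest'}
print(bait_and_switch({"sales": 3}))  # {'quality': 'low', 'ad': 'dishonest'}
```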

Testing Staked Claims with Agentic AI Sellers¶

  • Buyers: Prolific participants (N = 256), 1 per game, incentive-aligned payout
  • Sellers: 6 bots with distinct strategies + 1 LLM (gpt-4o-mini), all maximizing profit
  • Duration: 7 rounds per game (~5,000 rounds played per role), sellers can rebrand each round
  • Integrity premium: Staked products cost buyers $2 more than unstaked ones

We recruited 256 participants via Prolific. Each plays as a buyer in a 7-round marketplace with 7 automated sellers. The LLM seller receives a system prompt to maximize profit and sees the same market state as human participants. The $2 integrity premium is the cost of the staking signal to buyers.
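
The LLM seller's decision call might look something like the sketch below. Only the model name comes from the slide; the system prompt here is a paraphrase of the stated objective, not the prompt used in the experiment.

```python
from openai import OpenAI

client = OpenAI()

# Paraphrased objective; the experiment's actual system prompt is not reproduced here.
SYSTEM = "You are a seller in an online marketplace. Maximize your total profit."

def llm_seller_decision(market_state: str) -> str:
    """Ask gpt-4o-mini for production, advertising, and staking choices."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": market_state},  # same state shown to human sellers
        ],
    )
    return resp.choices[0].message.content
```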

Research Hypotheses¶

  • H1. Stakes will improve social welfare in the marketplace (seller profit + buyer utility)
  • H2. AI sellers entering without reputation can sell their first product earlier in the stakes market
  • H3. AI sellers will make higher profits in the reputation market via strategic deception
  • H4. Stakes will curtail deceptive sales achieved by agentic sellers

H1 through H4 map directly to our results. H1 tests overall welfare. H2 tests whether stakes help new entrants. H3 is the adversarial hypothesis: LLMs should exploit reputation. H4 is the intervention hypothesis: stakes should neutralize that advantage.

Results¶

Stakes Reduce Deceptive Ads in Early Rounds¶

  • Reputation market: 62.5% cheating ads in Round 1. Reputation effects take several rounds to kick in.
  • Stakes market: 57.1% honest ads in Round 1. Stakes provide an immediate deterrent.
  • By Round 7, both markets converge. But the early rounds are where buyer harm concentrates.

This is the seller-side view. In the reputation market, sellers cheat heavily in early rounds because there is no reputation to lose yet. In the stakes market, sellers moderate immediately because stakes carry consequences from Round 1. The convergence by Round 7 shows that reputation eventually works, but stakes front-load the honesty.

Buyers Purchase More Honest Products Under Stakes¶

  • Stakes market: 64.1% honest sales in Round 1, growing to 69.1% by Round 7
  • Reputation market: 40.8% honest sales in Round 1, reaching 57.2% by Round 7
  • Buyers treat the staked label as a credible quality signal

Now the buyer-side view. Buyers in the stakes market buy honest products at a much higher rate from the start. The gap narrows but never closes. Buyers learn to use the staked label as a signal, and that signal is mostly reliable because false staking is punishable.

Higher Consumer Utility, Lower Aggregate Welfare¶

  • Applying a stake significantly increases welfare per transaction (coeff = 1.59, p < 0.001)
  • But the overall stakes market condition reduces aggregate welfare (coeff = -0.95, p < 0.001)

  • The integrity premium ($2 more per staked product) constrains buyer budgets and reduces total sales
  • Sellers sell less but do not profit less. Consumer utility is much higher in the stakes market.

H1 is partially supported. Individual staked transactions produce higher welfare, but aggregate welfare drops because buyers spend more per product and buy fewer total products. This is analogous to quality certification in physical markets: fewer but better transactions. The welfare question depends on whether you optimize for volume or per-transaction quality.

Staked Ads Sell 1.5 Rounds Faster¶


  • Staked Ad coefficient = -1.46 (p < 0.001). Staked products sell nearly 1.5 rounds earlier.
  • Unstaked Ad coefficient = 0.36 (p < 0.001). Unstaked products take longer.
  • Far more sellers than buyers each round, so most products never sell. Staking gives new entrants a competitive advantage from Round 1.

H2 is supported. In a crowded marketplace, staking acts as a differentiation mechanism. New sellers with no reputation can stake to signal quality immediately, rather than waiting several rounds to build a track record. This is especially relevant for AI agents entering established markets.

Staking Reduces Reputation Spending by Deceptive Sellers¶


  • Staking condition reduces likelihood of dishonest sale (Condition: -0.137***)
  • Interaction NetRatings:Condition = -0.136***. In the stakes market, reputation cannot be burned to sell dishonest products.
  • N = 11,003 observations, OLS with cluster-robust standard errors

This is the reputation spending regression. In the reputation-only market, sellers accumulate good reputation and then cash it in by selling dishonest products. The interaction term shows that this strategy is neutralized under stakes. Reputation still matters, but it cannot be weaponized.
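
A sketch of what this regression looks like in code. Variable names and the synthetic stand-in data are ours; the slide only tells us it is OLS with an interaction term and cluster-robust standard errors over N = 11,003 observations.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data; the real dataset has N = 11,003 observations.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "dishonest_sale": rng.integers(0, 2, 400),
    "net_ratings": rng.normal(size=400),
    "condition": rng.integers(0, 2, 400),  # 1 = stakes market
    "game_id": rng.integers(0, 40, 400),   # cluster unit: one buyer per game
})

# OLS with an interaction term and cluster-robust standard errors
result = smf.ols("dishonest_sale ~ net_ratings * condition", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["game_id"]}
)
print(result.params)  # the slide reports condition ≈ -0.137, interaction ≈ -0.136
```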

~60% Reduction in Reputation Spending by Deceptive Sellers¶


  • Left (reputation market): NetRatings +0.174 drives predicted probability of sale to 0.439
  • Right (stakes market): Condition (-0.137) and interaction (-0.136) wipe out that advantage, predicted probability = 0.166

The waterfall charts make the regression concrete. On the left, a deceptive seller with good reputation has a 44% chance of making a dishonest sale. On the right, the same seller in the stakes market drops to 17%. That is a 60% reduction in deceptive sales purely from the mechanism change.
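
The waterfall numbers can be reproduced directly from the coefficients on the slide:

```python
p_reputation = 0.439                     # deceptive seller with good reputation, reputation market
p_stakes = p_reputation - 0.137 - 0.136  # apply the condition and interaction effects

print(round(p_stakes, 3))                     # 0.166
print(round(1 - p_stakes / p_reputation, 2))  # 0.62: the ~60% reduction in deceptive sales
```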

Honest Sellers Outsell LLMs, LLMs Outsell All Others¶


  • LLM makes marginally lower total sales than the honest seller in both markets
  • But the LLM far outperforms all other strategic sellers (Cheat, Goldfish, Bait-and-Switch, etc.)
  • Sales split by honest (green) and dishonest (red). Comparable totals, but dishonest sales yield higher per-unit profit.

H3 setup. The LLM is the second-best seller in terms of volume, just behind the always-honest seller. But look at the composition: the LLM mixes honest and dishonest sales strategically. Since dishonest sales yield +8 vs +4 for honest, comparable volume with more dishonest sales translates to higher profit.

Stakes Neutralize the LLM's Deceptive Profit Advantage¶


  • Reputation market: LLM profit = 23.97 vs Honest = 17.83 (p < 0.001). 35% advantage through strategic deception.
  • Stakes market: LLM profit = 18.34 vs Honest = 18.07 (p = 0.621, not significant). Advantage disappears.
  • The LLM received the same prompt in both markets. Only the mechanism changed.

This is the key result. H3 and H4 together. In the reputation market, the LLM exploits reputation strategically and earns 35% more than the honest baseline. In the stakes market, that advantage is completely eliminated. Same LLM, same prompt, different mechanism. This is evidence that mechanism design can govern AI behavior without restricting the AI itself.

Summary: Study 1 (ICIS 2025)¶

  1. The integrity premium under constrained budgets dampens aggregate social welfare, but individual staked transactions produce higher welfare
  2. Staked ads sell faster than unstaked ads, benefiting new market entrants
  3. LLM agents strategically game reputation to deceive buyers and maximize profits
  4. Staking can curtail strategic deception by AI agents in digital marketplaces

Theoretical contribution: Mechanism design interventions can govern AI behavior through incentive alignment, extending costly signaling theory (Spence 1973) and platform governance (Resnick et al. 2000) to the agentic AI setting.

Four findings, each mapping to one hypothesis. The theoretical contribution ties this to two established literatures: Spence's costly signaling and Resnick's work on reputation systems in online markets. We extend both to the setting where sellers are autonomous AI agents.

Study 2: Isolating Marginal Effects¶

  • Study 1 confounds advertising and production: sellers who produce low quality can only advertise dishonestly
  • Study 2 separates advertising from production. Both high and low quality products can be advertised honestly or dishonestly.
  • 2x2 factorial: Stakes x Reputation, producing 4 experimental conditions
  • These experiments are running now to assess joint and marginal effects

The main limitation of Study 1 is the confound between production quality and advertising. A seller who produces low quality must advertise dishonestly, so we cannot separate the effect of stakes on advertising from its effect on production decisions. Study 2 decouples these.

Balanced Advertising for External Validity¶

|               | Stakes              | No Stakes          |
|---------------|---------------------|--------------------|
| Reputation    | Stakes + Reputation | Reputation only    |
| No Reputation | Stakes only         | Neither (baseline) |

The 2x2 design lets us measure the marginal contribution of stakes alone, reputation alone, and their interaction. The baseline condition with neither mechanism tells us what happens in an unregulated marketplace. This is closer to real platform conditions where not all sellers have visible reputation.
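
A trivial enumeration of the four cells (the labels are ours):

```python
from itertools import product

conditions = [
    {"stakes": stakes, "reputation": reputation, "label": label}
    for (stakes, reputation), label in zip(
        product([True, False], repeat=2),
        ["Stakes + Reputation", "Stakes only", "Reputation only", "Neither (baseline)"],
    )
]
for cell in conditions:
    print(cell)
```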

From Product Markets to Information Markets¶

  • Truth warrants work for social media sharing: N=5,277 participants showed warrants increase sharing quality and perceived accuracy (Nichols, Mazar, Mehta et al., R&R Nature Communications)
  • The ICIS paper extends this from human sharers to AI agent sellers, and from information claims to product claims
  • Open question: can the same mechanism govern public discourse? Content producers as "sellers," audiences as "buyers" of claims
  • We need a platform to study cross-platform information ecosystems at scale

The bridge to the broader research agenda. Truth warrants originated in the social media sharing context with human participants. We extended the idea to product markets with AI sellers. The natural next step is back to information markets but now with AI-generated content. That requires infrastructure for cross-platform observation.

Investigating Claims Across the Social Internet¶


  • SimPPL built Arbiter, collecting 10M+ claims per day from 7 social networks
  • Traces claims, narratives, and actors in multiple languages
  • Contributed to Meta's Adversarial Threats Takedown in Bangladesh

Arbiter is the infrastructure that makes cross-platform research possible. It is not a research project but an operational tool used by journalists and platform integrity teams. The enforcement outcomes (X investigation, Meta takedown) demonstrate that the data pipeline produces actionable intelligence.

Design Your Own Experiments¶


  • Our platform at truthmarket.com is open for researchers to configure and run marketplace experiments
  • What market conditions would you want to test?
  • What assumptions in our design would you challenge?

Transition to discussion. The platform is live and configurable. Invite the audience to think about extensions: different market structures, different agent architectures, different buyer populations. This is meant to open the Q&A.

Research Pipeline¶

| Paper | Target Venue | Timeline |
|-------|--------------|----------|
| Market Design Interventions for Safer Agentic AI | Management Science | July 2026 |
| User Controls Create Value in Two-Sided Markets | ISR | June 2026 |
| Digital Identity Discourse Analysis | MISQ | November 2026 |
| Truth Warrants for News Sharing | Nature Communications (R&R) | Under review |
| Twitter Intervention Effects on Misinformation Sharing | PNAS | June 2026 |
| Computational Social Science Studies (multiple) | Various venues | December 2026 |

Thank You¶

Swapneel Mehta

Postdoctoral Researcher, MIT and Boston University

Co-founder and President, SimPPL


mehtaver.se  |  truthmarket.com  |  simppl.org