The fun thing about watching people play a game you helped build is that you get a fresh perspective you no longer have yourself. I would walk around the room between rounds, catching bits of conversation, watching decisions being made in real time. When the session ended and players filled out their exit surveys, I would sometimes find myself reading their responses as they wrote — small glimpses into how each person understood the game and their own performance.
For the past several months I have been building and running experiments at the Platform Governance Lab, a research group focused on understanding how digital marketplaces work — and how they fail. The core question we are investigating: what mechanisms actually prevent fraud in online markets? To study this, we designed a marketplace simulation — a game where participants take on the role of sellers and buyers over multiple rounds. As a seller, each round you decide what quality product to list and whether to represent it honestly. The game mirrors the structure of real platforms like eBay or Amazon, but in a controlled environment where we can observe every decision.
We ran four versions of the market, varying two fraud-deterrence mechanisms: staking — where sellers escrow money alongside a listing, forfeiting it if caught misrepresenting their product — and reputation, the familiar system of reviews and ratings, with an added option to rebrand and re-enter with a fresh score. Every combination was tested across four weeks, with small refinements to the game design between each session to give players more flexibility and help them engage more meaningfully with the mechanisms.
The Problem

We needed to compare fraud rates across 4 market variants, 4 sessions, and 161 individual sellers — all at once. And for each of those 161 people, we also needed to understand how they performed, what strategy they relied on, and whether they actually understood the game they were playing. Standard charts force you to pick one dimension. We needed all of them.
Each round, within their provided budget, a seller could pursue any of the following strategies:
- Cheating — advertise high quality, but produce low quality.
- Honest High — advertise high quality and produce high quality.
- Honest Low — advertise low quality and produce low quality.
- Irrational — advertise low quality but produce high quality — which ends up costing the seller money.
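The four strategies fall straight out of the two binary choices a seller makes each round: what to advertise and what to produce. A minimal sketch of that classification, and of how per-round choices roll up into the strategy proportions the chart displays — the function names and the "high"/"low" encoding are illustrative, not taken from the lab's actual codebase:

```javascript
// Hypothetical sketch: map one round's (advertised, produced) pair to a
// strategy label. Encoding and names are my own, for illustration only.
function classifyStrategy(advertised, produced) {
  if (advertised === "high" && produced === "low") return "cheating";
  if (advertised === "high" && produced === "high") return "honest-high";
  if (advertised === "low" && produced === "low") return "honest-low";
  return "irrational"; // advertise low, produce high: quality given away at a loss
}

// Roll a seller's rounds up into the strategy proportions shown in a wedge.
function strategyProportions(rounds) {
  const counts = {};
  for (const r of rounds) {
    const s = classifyStrategy(r.advertised, r.produced);
    counts[s] = (counts[s] || 0) + 1;
  }
  return Object.fromEntries(
    Object.entries(counts).map(([s, n]) => [s, n / rounds.length])
  );
}
```

A seller who cheated in two of four rounds, for example, comes out with a cheating proportion of 0.5 — the number that determines where their wedge sorts within the week.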
A nested sunburst that combines all five layers.
I built a nested sunburst — a circular chart organized from the outside in. The outer ring shows the market variant. Inside that, the week. Inside that, each individual seller as a wedge, with color layers showing their strategy proportions, sorted within each week by cheating proportion. Their final score sits in the gap between the seller wedges and the week ring.
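The actual chart uses d3.partition to compute the ring geometry, but the core idea of the seller layer — sort sellers within each week by cheating proportion, then slice the week's arc into equal wedges — can be sketched in a few lines of plain JavaScript. Field names here are illustrative:

```javascript
// Hypothetical sketch of the seller-wedge layout: within one week's arc
// (weekStart..weekEnd, in radians), sellers are sorted most-cheating-first
// and given equal angular slices. The real chart derives this via d3.partition.
function layoutSellers(sellers, weekStart, weekEnd) {
  const sorted = [...sellers].sort((a, b) => b.cheating - a.cheating);
  const step = (weekEnd - weekStart) / sorted.length;
  return sorted.map((s, i) => ({
    id: s.id,
    startAngle: weekStart + i * step,
    endAngle: weekStart + (i + 1) * step,
  }));
}
```

Sorting by cheating proportion is what makes each week readable at a glance: the red mass collects at one end of the arc, so its thickness becomes a direct visual measure of how much fraud that session produced.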
The part that made it actually useful: hover over any seller's wedge and everything else fades. The tooltip shows their player ID, their score, their strategy breakdown as a bar, and — at the bottom, in italics — exactly what they wrote in the exit survey. The chart answers all three questions simultaneously: you can read market-level patterns at a glance, track how strategy distributions shift across weeks, and drill into any individual to compare their data against their own explanation of it.
The chart can also be rotated to bring any strategy to the center — useful for comparing how a particular behaviour is distributed across all four market variants at once.
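Mechanically, rotation is nothing more than a constant angular offset applied to every wedge before drawing — d3.arc tolerates angles outside [0, 2π), so no wrapping is needed. A minimal sketch, with illustrative field names:

```javascript
// Hypothetical sketch: rotate the whole chart by shifting every wedge's
// angles by the same offset (radians). d3.arc accepts unwrapped angles.
function rotate(wedges, offset) {
  return wedges.map(w => ({
    ...w,
    startAngle: w.startAngle + offset,
    endAngle: w.endAngle + offset,
  }));
}
```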
The Observations

Patterns that only became visible once everything was in one place.
Several things become immediately clear that would take significant effort to piece together from a bunch of separate tables. The No Staking · No Reputation quadrant is visually the most red across all four weeks — consistently the highest concentration of cheating in the dataset. When there is no financial penalty and no reputational cost, fraud is the dominant strategy. The Staking · No Reputation quadrant shows a notably different picture: cheating drops sharply, and the red is visually thinner and more scattered across weeks. Staking alone, without reputation, appears to be the stronger deterrent of the two.
The Staking · Reputation quadrant shows something more gradual: a visible shift across the four weeks, with later weeks — particularly week 4 — carrying more blue than red compared to the earlier sessions. This suggests that as the game design matured and players became more familiar with the mechanics, the combination of both mechanisms started to have a compounding effect on honest behaviour. The No Staking · Reputation quadrant is more mixed — reputation alone does not produce a clean pattern, and the presence of the rebrand option means some players were able to exploit it strategically, cheating and re-entering with a clean slate.
Twenty minutes with the chart was also enough to group every player into one of four broad types — not by running a clustering algorithm, but simply by reading the visual.
Some of the individual responses were genuinely worth pausing on. Here are a few that stood out:
“At first when I had no reputation, it was hard to get people to buy my products even when they were advertised as high quality. I felt like I had no power without the stake to convince buyers to purchase my products.”
This player scored only 20 points — but their reasoning was the sharpest in the dataset. With 63% cheating and 37% honest-high, they were far from conservative. What makes the response remarkable is the articulation: at the start of a market, reputation is neutral and means nothing. Staking is the only credible signal available because it costs something. They felt that gap directly, and named it. The score doesn't reflect the quality of the thinking.
“Ability to recognize when to place high vs low.”
One hundred and two points — and the response is almost understated. The data tells the fuller story: 39% honest-high, 41% honest-low, and only 20% cheating. This player read the market condition each round and adjusted accordingly. That kind of adaptive switching across strategies is exactly the kind of behaviour that is hard to see in a table but immediately visible in the chart.
“My strength was being able to build up the reputation for the product.”
A score of 26 with 42% cheating and 47% honest-high in a no-staking, reputation-only market. My interpretation: produce honestly in the early rounds to build a good reputation, then cheat and exit — re-entering under a fresh brand name to reset the score and repeat the cycle. In a market with no staking, nothing stops that loop. The chart shows the pattern; the tooltip tells you what they thought they were doing.
That is what the visual gave me: not just a summary of what 161 people did, but answers to questions that mattered — which market conditions produced the most fraud, whether our design changes were working, and whether individual players actually understood what they were doing. The messy, multi-dimensional data became navigable. The market-level patterns became visible at a glance. And a dataset that would have taken days to work through became a 20-minute conversation.
Built with D3.js. Data from in-person sessions at the Platform Governance Lab.