Led four parallel research tracks (experience audit, 30+ app competitive study, user feedback synthesis, and telemetry analysis) to build a prioritised roadmap. Reframed chart drop-off from a UX issue into a product adoption crisis, forming the strategic foundation for every charting feature shipped.
Introduction
How four parallel research tracks reframed Excel's charting problem — from a UX fix into a product adoption crisis.
Excel has ~400M users. Only ~8M create charts — a 98% drop-off. I led four parallel research tracks (experience audit, 30+ app compete study, user feedback synthesis, telemetry analysis) and converged them into a framework that reframed the problem: we weren't fixing a UX issue — we were solving a product adoption crisis hiding inside a legacy feature.
That reframe, and the prioritised roadmap it produced, became the foundation for every feature we built. The individual case studies (Modern Default Colours, Copilot Chart Insights, AI Recommendations) each tell their own story, but every one answers a question this work asked first.
Starting Point:
No Brief, New Team, Loud Pain
September 2024. A new team, no roadmap, charting generating ~30% of all Excel Web feedback frowns. The context that shaped the research structure.
September 2024. Newly formed team. No roadmap. Charting was generating ~30% of all Excel Web feedback frowns. The instinct was to jump in and fix the loudest complaints.
Instead, I structured four parallel research tracks — each answering a different question:
| Track | Question |
|---|---|
| Experience Audit | Where exactly does it hurt — in our own hands? |
| Telemetry & Funnel | How bad is it at scale? |
| User Feedback | Is our assessment real, or just our opinion? |
| Compete Study (30+ apps) | What does "good" look like — and where's the whitespace? |
Track 1:
Experience Audit
First-person walkthrough of the full chart creation journey — screen-recorded, timestamped, catalogued across 60+ friction points.
We used Excel to make charts from scratch — screen-recorded everything, timestamped every friction point across three scenarios.
| Scenario | Desktop | Web | Key Friction |
|---|---|---|---|
| Insert a chart | ~50s, 12+ clicks | ~30s, 7+ clicks | 3 decisions before any visual feedback. Hit-and-trial loop. |
| Basic formatting | ~80s, 20+ clicks | ~60s, 20+ clicks | Controls scattered across 4 surfaces. 20 clicks for basics. |
| Add elements | ~10s, 5+ clicks | ~24s, 8+ clicks | Web 2.4× slower. Inconsistent undo killed experimentation. |
We also catalogued 60+ craft issues — no snapping guides, overlapping labels, jargon-heavy UI ("bounds," "intervals"), and a Format Task Pane described internally as "extremely long and difficult to navigate."
Track 2:
The Funnel
The telemetry that reframed everything — 98% drop-off, 40% deletion, and the data that forced a strategic pivot.
| Stage | Users | Drop-off (vs. MAU) |
|---|---|---|
| Excel MAU | ~400M | — |
| Aware of charts | ~368M | -8% |
| Creating charts | ~8M | -98% |
Within that ~8M: 40% deleted in the same session. 22% produced blank charts. 5–9 minutes and 200+ clicks per presentable chart.
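The arithmetic behind those figures is worth making explicit. A minimal sketch using the rounded stage counts above; the script and its names are illustrative, not the actual telemetry pipeline:

```python
# Funnel arithmetic from the rounded stage counts above.
# Illustrative only: not the real telemetry pipeline.
stages = [
    ("Excel MAU", 400_000_000),
    ("Aware of charts", 368_000_000),
    ("Creating charts", 8_000_000),
]

mau = stages[0][1]
for name, count in stages:
    print(f"{name:<17} {count / 1e6:>5.0f}M  drop vs MAU: {1 - count / mau:.0%}")

# Within the ~8M creators, applying the observed rates:
creators = 8_000_000
print(f"Deleted same session (40%): ~{0.40 * creators / 1e6:.1f}M")
print(f"Blank charts (22%):         ~{0.22 * creators / 1e6:.1f}M")
```

Run it and the table reproduces itself: 8% of MAU never become aware, and 98% never create, leaving roughly 3.2M same-session deletions and 1.8M blank-chart sessions inside the surviving 2%.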
The pivot this forced: We'd been hired to fix customization for ~10M existing users. The funnel proved the real opportunity was the 360M who never tried. I led the hardest conversation of the project — telling partners "the thing we were hired to do is the wrong priority" — backed by three data points: the 98% drop, the 40% deletion rate, and the compete landscape. The reframe: not "abandon customization" but resequence — fix defaults first.
Leadership aligned. The roadmap shifted.
Track 3:
User Feedback & the 0% That Settled Every Debate
Thousands of OCV signals and one usability study — the 0% axis configuration rate that ended every internal debate.
OCV themes from thousands of signals:
"Making a graph in the web version is painful."
"UI is cluttered... hard to find things."
"Web has less functionality than offline."
"Default chart looks outdated."
"Can't edit charts easily. Can't add a trend line."
Past research confirmed: trial-and-error was the dominant pattern, users searched externally (ChatGPT, social media) for guidance, and when presentations mattered, they exported to Canva. Even experts were frustrated.
The Ease-of-Use Study gave us the number that settled every internal debate:
| Task | Success | Ease (/5) |
|---|---|---|
| Configure axis | 0% | — |
| Set bounds/intervals | 43% | 1.6 |
| Change line colour | 86% | 3.7 |
| Add data labels | 57% | 3.2 |
| Overall | 47% | 1.6 |
Zero percent could configure a chart axis on Web. And the right-click path, at 86% success, was the strongest route tested, validating that context-sensitive controls beat menu navigation.
Track 4:
Compete Study — 30+ Apps
Across AI-native, design-focused, and BI platforms — mapping where Excel sat on the ease-complexity matrix and where the gap was.
I pushed for breadth because Excel doesn't just compete with Google Sheets — it competes with every tool users reach for when they think "I need to visualize this."
| Competitor | Key Learning | Applied As |
|---|---|---|
| Napkin.ai | One-click, zero decisions | Sample data, AI recommendations |
| Canva | Template-first, beautiful defaults | Modern defaults that look polished instantly |
| Pitch | On-chart context toolbar | Floating toolbars for direct manipulation |
| Google Sheets | Explore suggests charts proactively | Contextual nudges on data selection |
| Power BI / Tableau | Auto-insights alongside charts | Copilot chart insights |
| Flourish | Vibrant modern aesthetics | Updated palette, typography, contrast |
The strategic insight: The upper-right quadrant — high ease AND high data complexity — was nearly empty. Every tool traded one for the other. Excel had the data power (400M MAU, formulas, enterprise trust). It just needed to move rightward on ease. Copilot was the bridge.
The compete study also decoded OCV: "I use Canva because it looks better" = aesthetic gap. "Charts suck" = first-impression problem. "I wish it told me what's interesting" = intelligence gap.
Convergence:
Four Streams → One Framework
The moment four conflicting research signals collapsed into one framework: Pre-Insert → Insert → Post-Insert.
By this point we had four research tracks each pulling in slightly different directions:
The experience audit said: "Fix the 20-click formatting journey, the fragmented surfaces, the 60+ craft bugs."
The funnel said: "Forget formatting — nobody's even making charts. Fix adoption."
OCV and usability said: "Everything is painful. Charts look bad. Web is behind Desktop. Customization is impossible."
The compete study said: "Everyone's ahead of you on ease, defaults, and intelligence. But nobody owns ease + complexity."
We developed a framework that divided the entire charting journey into three stages, each with its own user need, design interventions, success metrics, and priority level.
| Phase | User Question | What Research Said | Direction | Metric |
|---|---|---|---|---|
| Pre-Insert | How do I start? | 98% drop before insertion. No triggers. | Nudges, previews, sample data, AI recs | Insertion rate |
| Insert 🔴 P0 | Does this look good enough to keep? | 40% deleted. 0% could customize. Default IS the product. | Modern colours, typography, smart defaults | Chart Kept Rate |
| Post-Insert | What does this mean? | 20+ clicks for config. Zero insight generation. | Copilot insights, AI recs, floating toolbars | Retention, frown reduction |
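To make the success metrics concrete, here is a minimal sketch of how the two headline numbers could be computed from session telemetry. Everything in it (the session schema, field names, and functions) is a hypothetical illustration; this write-up doesn't describe Excel's actual instrumentation.

```python
# Hypothetical sketch of the framework's two headline metrics.
# The Session schema and field names are illustrative assumptions,
# not Excel's real telemetry instrumentation.
from dataclasses import dataclass

@dataclass
class Session:
    selected_data: bool = False           # user selected a data range
    charts_inserted: int = 0              # charts created in this session
    charts_deleted_same_session: int = 0  # of those, deleted before session end

def insertion_rate(sessions: list[Session]) -> float:
    """Pre-Insert metric: share of data-selection sessions that insert a chart."""
    eligible = [s for s in sessions if s.selected_data]
    return sum(s.charts_inserted > 0 for s in eligible) / len(eligible)

def chart_kept_rate(sessions: list[Session]) -> float:
    """Insert metric: share of inserted charts still alive at session end."""
    inserted = sum(s.charts_inserted for s in sessions)
    deleted = sum(s.charts_deleted_same_session for s in sessions)
    return (inserted - deleted) / inserted
```

Plugging in the study's 40% same-session deletion rate, chart_kept_rate would sit near 0.60, which is exactly the number the Insert-phase P0 bets aim to move.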
The Roadmap
Every initiative prioritised, sequenced, and tied back to specific evidence — the framework that drove FY25-26 investment.
| Initiative | Phase | Priority | Timeline | Why |
|---|---|---|---|---|
| Modern Default Colours | Insert | P0 | FY25 H1 | 100% reach. Fixes 40% deletion. Low-med effort. |
| Copilot Chart Insights | Post-Insert | P0 | FY25 H2 | Fills compete gap. Drives dual flywheel (charts + Copilot). |
| Contextual Nudges | Pre-Insert | P0 | FY25 H1 | Attacks 98% discovery drop-off. |
| AI Design Recommendations | Post-Insert | P1 | FY26 H1 | Builds on modern defaults foundation. |
| Sample Data / Cold Start | Pre-Insert | P1 | FY26 H1 | Solves 22% blank chart sessions. |
| AI Chart Recommendations | Insert | P1 | FY26 H1 | Addresses 43% generic Copilot requests. |
| Craft Bug Fixes (30+) | All | Shield | FY25→ | Can't build AI on a broken foundation. |
Each initiative traced straight back to the research:
Modern Colours ← Audit (dated output) + Compete (Flourish/Canva) + OCV ("looks outdated") + Funnel (40% deletion)
Copilot Insights ← Compete (Power BI generates insights; Excel had zero) + OCV ("tell me what's interesting") + Usability (0% axis config)
AI Recommendations ← Audit (hit-and-trial) + Compete (Napkin one-click) + Copilot data (43% generic requests)
Sample Data ← Funnel (22% blank sessions) + Audit (blank chart dead-end) + Compete (every modern tool shows something immediately)
What This Enabled
The research produced more than a roadmap: it produced organisational momentum.
My Reflections
The hardest part of this project wasn't any single design decision. It was operating in an environment with no clear direction, multiple competing priorities, a legacy codebase with deep technical debt, and a team that was new to each other and to the problem space.
My job in that context wasn't to have answers. It was to build the framework that makes answers discoverable. The Pre-Insert → Insert → Post-Insert model emerged from weeks of systematically absorbing data from four different tracks until the structure became obvious.
There's a temptation at the senior level to focus on big strategic moves and delegate the small stuff. This project proved that craft is strategy. Each of those 60+ craft bugs was individually trivial. But their cumulative effect across a 200+ click journey is the experience users describe as "painful." You can't separate the forest from the trees when users are tripping on every root.
Every prioritisation argument in this project was backed by specific evidence: the 98% funnel drop, the 40% same-session deletion rate, the 0% axis-configuration success.
Data didn't eliminate disagreement. But it moved debates from "I think" to "the evidence shows." That's a fundamentally different — and more productive — conversation.
Internal research had identified charting pain points for years. What changed the conversation was showing how competitors solved the same problems — and how far ahead they'd gotten. The compete study didn't introduce new information so much as it made existing information undeniable. When you can show leadership a side-by-side of Excel's default chart vs. Canva's, the argument makes itself.
This foundational research — the experience audit, the telemetry deep-dive, the user feedback synthesis, and the 30+ app competitive study — is what made everything that followed possible. The individual feature case studies (Modern Default Colours, Copilot Chart Insights, AI Recommendations) each tell their own story. But every one of them answers a question this work asked first.