Paul Sherman
April 15, 2026

Mandated Enthusiasm: Bonuses, Bubbles, and the View from the Grid

P4 - Senior UX Researcher, Software

A senior UX researcher on one-week sprint cycles describes an organization simultaneously investing in custom AI research tools and mandating AI usage through performance bonuses, while fabricated pain points flow unchecked through vibe-coded PRDs and data centers consume farmland in their community.

My bonus, my performance, is attached to how much I use AI at work. So I have to [use it]... if I don't I might not get my bonus.

P4: Survey Data and Session Summary

Survey Responses

Question: Response

Age: 25-34
Education: Master's degree
Role / Level: Individual contributor
Job title: Senior UX Researcher
Years of experience: 8-15 years
Organization description: Make an agentic AI tax return software for accountants/CPA firms to automate their clients tax returns (Tax industry, but software company)
Industry: Other or not sure
Individual AI tools used: Text generation (creating documents, emails, summaries), Media creation (images, audio, video), Search and information retrieval, Data analysis and synthesis, Workflow automation and process automation, AI prototyping/vibe coding
Organizational AI tools: Internal search and knowledge summarization, Code generation and developer tools
AI adoption involvement: No direct involvement in adoption or deployment (mostly a user of a deployed AI system)
Biggest work win with AI: I am on a one week research sprint for my new product (Ready to Review). We went from PowerPoint to MVP in under 10 months and the field of tax is complex with each CPA being mostly mistrusting of AI and is extremely skeptical of AI agents. Accuracy is important. With our constant feedback, we were able to launch features that even convinced detractors. However, since our product has a lot of friction on the front-end, it is been difficult to get adoption (migrating clients over; supporting enough source documents). Because of this short timeline, I have to host 5-7 customer calls every week and get the report done in a day. I clean up the transcripts with a Transcript cleaner that our doctor UX researcher on our team created and then throw those into a chain in Claude to look for themes and make a draft report. I generally have a skeleton of the top themes, but use this draft supplementally. I then write my report and have Claude edit to be concise. Also, this one week research sprint put a lot of stress on design (who had to have designs ready each week), so v0 and FigmaMake have been a gamechanger for us.
Biggest work disappointment with AI: I have one tool that our Data Analyst built for us that is a Vector Search + Reranking based on Claude that analyzes all sales and pre-demo calls uploaded to Gong (what our sales team hosts their calls). Me and my product manager can set filters and ask any questions. Although it is built on Claude, it has hallucinated, so I spend more time having to fact-check, which you guessed it, my product manager does not do. Also, when prompting it, you get different results each time. It will not quantify how it ranks it and comes up with the top ten themes, but does not feel like it is actually holistically looking across all calls. It is better than nothing, but it is not ideal or accurate.
Organization's biggest AI success: [organization] has actually been quite thoughtful and I have been impressed with how they have invested in tools. One of the doctors (super smart dude, Yeti Li) has been reserved to only make tools for us fellow researchers to be faster in our jobs for this year. Happy to show you the roadmap of what we will get built for us. Q1 is the transcript screener and Q2 will be a Notetaker that integrates with Claude workspaces with the quotes on sticky notes.
Organization's biggest AI challenge: In a big corporation, everyone is scared that we are utilizing a tool that will replace us. They cut a significant portion of product and rehired the bottom of the barrel devs in India (who are not as competent as the former US employees and the India devs lost so much context/internal knowledge that never got passed over since they cut whole departments). TR announced they will want 50 percent of all code written to be done agentically. However, devs are frustrated because they rather just write it perfectly the first time and KNOW where an error is then have to find it later. Also, it is hard to run tests to explain why it failed - we just implemented a new DataDog but for AI tool, but it is confusing. Because this large corporation has very specialized roles, folks who are UX Content Designers or UX Accessibility Designers get very nervous as they build tools in Cursor that help automate their processes that they will get cut in 1-3 years.

Background

P4 is a senior UX researcher at a large software maker of business productivity tools, working on an agentic AI product that automates tax return preparation for accountants and CPA firms. At the time of the interview, they were operating on one-week research sprint cycles: designs finalized on Friday, customer sessions on Tuesday and Wednesday, analysis on Thursday, stakeholder debrief on Friday morning. The pace is intense enough that P4 described it as making you "want to really pant."

The organization has made substantial investments in researcher-specific AI tooling. A PhD researcher on the team has been reassigned full-time to build custom tools for the research function, including a transcript cleaner (delivered in Q1) and an upcoming notetaker that integrates with Claude workspaces (planned for Q2). P4 also has access to an internal platform that provides Claude-based prompt chains and an LLM marketplace, and the organization recently issued Claude licenses to all employees. The tooling investment is real, but so is the pressure: the organization has set an OKR requiring researchers to document 10 hours per week of time savings from AI tools by end of year, and P4's performance bonus is tied to AI usage metrics.

The session was the longest in the study at over an hour and included screen-sharing of P4's internal tools and workflows, making it the most operationally detailed interview so far. It was also the most politically candid, with P4 describing data center construction in their community, grid stress their spouse witnesses as a lineman, and a generational anxiety about whether the AI investment bubble will collapse.

Key Findings

The Role-Based Trust Gradient

P4 articulated a trust continuum that runs across professional roles within the organization. Researchers occupy the high-verification end, performing a second pass on all AI output. Developers fall roughly in the middle. Product managers and marketers occupy the low-verification end, tending to accept AI output without substantive review.

This pattern manifested concretely in P4's description of their product manager using the customer insights explorer. P4 observed that the PM would "spin anything to it being okay" and, when asked whether she had verified a source, responded "I don't have time to check it." The trust gradient is not a matter of individual personality but of role-based accountability norms: researchers are trained to verify, and their professional identity depends on the accuracy of their analysis. Product managers face different incentive structures.

"If everybody was on a scale of who it is, research is more on I'm going to do my second pass. We're probably on the highest end and then product managers are way over here."

Hallucination Across the Stack

P4 encountered fabricated AI output in three distinct contexts: a customer insights explorer that hallucinated an entire quote, vibe-coded PRDs containing "hallucinated pain points" disconnected from user evidence, and an LLM that silently skipped input documents without disclosing the gap. Each failure mode is different (fabrication, confabulation, silent omission), but the downstream effect is the same: unreliable output flowing into decisions.
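Of the three failure modes, silent omission is the easiest to guard against mechanically. A minimal sketch, assuming a prompt that instructs the model to tag every claim with a `[doc: ...]` citation (the citation format and file names here are hypothetical, not P4's actual tooling): diff the set of documents supplied against the set the output actually cites.

```python
# Minimal sketch: detect when an LLM output silently omits source documents.
# Assumes the prompt instructs the model to cite sources as [doc: <name>];
# the citation convention and file names below are invented for illustration.
import re

def cited_documents(llm_output: str) -> set[str]:
    """Extract document names from [doc: ...] citation markers."""
    return set(re.findall(r"\[doc:\s*([^\]]+)\]", llm_output))

def coverage_report(supplied: set[str], llm_output: str) -> dict:
    cited = cited_documents(llm_output)
    return {
        "supplied": len(supplied),
        "cited": len(cited & supplied),
        "missing": sorted(supplied - cited),  # never referenced at all
    }

# Seven transcripts attached, as in the study P4 mentions...
supplied = {f"transcript_{i}.txt" for i in range(1, 8)}
output = ("Theme 1: onboarding friction [doc: transcript_2.txt] "
          "[doc: transcript_5.txt]")
report = coverage_report(supplied, output)
print(report["missing"])  # the five transcripts the model never cited
```

The check cannot prove the model *read* a cited document, but an empty `missing` list at least rules out the "attached seven, referenced five" gap going unnoticed.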

The PRD hallucination is particularly consequential. Product managers use AI to generate requirements documents that include sections for "secondary research and customer research," but the cited pain points are invented. Because the documents look polished, they pass review without scrutiny. P4's frustration centers on the gap between appearance and substance.

"A lot of them now are making them look really cool and have vibe coding but I don't think they ever go back in and add anything just whatever they prompted and told our customers and there's parts where it says supposed to have secondary research and customer research and it's just making up pain points in there and so everything looks put together and there's a lot of words on a page but nobody's still going in for that second layer."
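P4's complaint that the insights explorer never cites more than about ten sources, however many calls match, is consistent with how retrieve-then-rerank pipelines are typically built: a fixed top-k cutoff, not the corpus, bounds what can ever surface. A hypothetical sketch under that assumption (all data, names, and scoring invented; P4's actual tool is not documented here):

```python
# Hypothetical sketch of a retrieve-then-rerank pipeline of the kind P4's
# insights explorer appears to be. Everything here is invented for
# illustration. The point: a fixed top_k cap, not the number of relevant
# calls, determines how many sources the final answer can cite.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vec, call_vecs, top_k=50):
    # First stage: rank every call by embedding similarity to the query.
    scored = sorted(call_vecs.items(),
                    key=lambda kv: dot(query_vec, kv[1]), reverse=True)
    return [name for name, _ in scored[:top_k]]

def rerank(candidates, top_k=10):
    # Second stage: a (stubbed) reranker keeps only top_k candidates,
    # so at most top_k sources can ever be cited downstream.
    return candidates[:top_k]

# 849 calls match the filters, as in P4's session...
calls = {f"call_{i}": [i / 849, 1 - i / 849] for i in range(849)}
cited = rerank(retrieve([1.0, 0.0], calls))
print(len(cited))  # 10
```

If this is roughly the architecture, P4's intuition is right: the tool is not "holistically looking across all calls" when it answers; it is summarizing whatever survives two hard cutoffs.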

Tool Proliferation Without Absorption

The organization is deploying AI tools at a pace that exceeds its capacity to train employees on them. P4 listed their internal AI platform, Claude licenses, Cursor, Figma Make, Linear, and v0 as tools that have been thrown at various teams. The research team is relatively well-supported, with structured training sessions, pilot studies, and a job-to-be-done mapping exercise that identified where AI could reduce pain points. Designers, by contrast, are "literally just being told by their manager like, 'Here's Figma. Go play with it.'"

The churn compounds the problem. P4 described cycling through "Rive mania to Figma Make mania and now we're on Cursor mania," each wave requiring new learning before the previous tool has been fully integrated. The gap here is not between official and shadow IT (as in P3's organization), but between the volume of tools deployed and the organizational capacity to absorb them meaningfully.

"We already went from Rive mania to Figma Make mania and now we're on like Cursor mania. It seems like there is always a new tool and you have to almost use all of them to feel comfortable."

Mandated Adoption and Perverse Incentives

P4's organization has tied AI usage to measurable performance targets. The OKR calls for 10 hours per week of documented time savings. P4's individual performance bonus is attached to how much they use AI at work. Before P4 had figured out how to integrate AI meaningfully into their workflow, they were generating unnecessary queries (asking for grocery lists, "other dumb stuff") to hit the usage target, while simultaneously feeling guilty about the environmental cost of each query.

This is structurally distinct from expectation escalation. The organization is not just expecting more output; it is measuring and rewarding AI usage itself as a KPI, regardless of whether that usage produces value. The 50% agentic code mandate for developers creates a parallel dynamic: developers report that AI-generated code requires more cleanup time than writing it from scratch, but the mandate persists.

"My bonus, my performance, is attached to how much I use AI at work. So I have to [use it]... if I don't I might not get my bonus. So at first before I was really figuring out how to do it in my workflow. I was just asking it for my grocery list and other dumb stuff and I felt bad because everybody tells you one search is dumping out a water bottle and I'm like oh no I have to do so many searches a day or else I don't get my bonus."

The View from the Grid

The session took an unusually personal turn when P4 described the physical infrastructure of AI in their community. Two data centers are under construction near their home, consuming farmland and straining water resources. P4's spouse works as a lineman for the local electrical company and sees the grid stress firsthand. A friend who poured concrete for one of the data centers told P4 that the builders are already designing the facility to convert into a warehouse if the market shifts.

P4 described a split between their professional persona ("wear a mask on LinkedIn and like I love AI") and their personal convictions about the environmental and social costs. Their hope for the future is that "it pops a little bit," not that the underlying technology disappears, but that the hype and bloatedness deflate. They compared the current moment to the dot-com bust but noted it feels "more apocalyptic" because of the environmental impact, workforce displacement, and the reach of misinformation through AI-generated content.

"Our friend who does the concrete for the [nearby city] data center that there was a big push to get that closed, but there's just not very many laws to protect the rights of what people want. They're already building it in a way that they're like, 'Well, we can turn this into a warehouse or maybe this would just be an Amazon warehouse afterwards.' So they're already kind of predicting like the people that are building it are already like this bubble might pop."

Emerging Themes

Theme: Description. Key quote.

Trust Calibration: Deliberate, ongoing practices for evaluating AI trustworthiness on a spectrum. Key quote: "Research is more on I'm going to do my second pass. We're probably on the highest end and then product managers are way over here."
Hallucination Frustration: Disappointment at AI confidently producing fabricated content. Key quote: "Right off the bat, the first time I used this like a month ago, it hallucinated a whole quote."
Expectation Escalation: AI enabling faster delivery while simultaneously raising stakeholder expectations. Key quote: "I think they want us to be like down 10 hours of work a week with these tools by the end of the year."
Corporate Tooling Gap: Mismatch between tools deployed and organizational capacity to absorb them. Key quote: "They are throwing every single tool our way. And I feel bad for our designers because they have even more."
Job Security Anxiety: Fear that AI will reduce headcount, prompting career anxiety. Key quote: "Are they still going to have me in three years or will there just be less of us?"
Apprenticeship Erosion: Concern that skipping the "second layer" of review eliminates what juniors would apprentice into. Key quote: "Everybody's like, 'Looks cool.' And then I'm like, 'No, but read it.' Does any of this [make sense]?"
Infrastructure Anxiety: Concern about AI's physical and environmental costs, grounded in direct personal proximity. Key quote: "We're just seeing all these horror stories of people running out of water and we know they're coming for the Midwest because of our water."
Organizational AI Adoption Challenges: Organizations struggling to find an effective path forward with AI, from arbitrary code targets to bonuses tied to usage metrics. Key quote: "They're saying, 'Oh, we want 50% of code to be written by AI.'... it just would have been faster."

Interview Transcript

00:06:14

Paul: What was the first AI tool you remember trying and what were you hoping it would do for you?

00:06:14

P4: Yeah, that part's always tricky because I feel like there's been a lot of features that have like technically been AI that we want to categorize as AI. Especially when I was at my former company, [former company], and it seemed like any type of marketing or like SEO buzz, you almost had to say something to say, I don't like get it in like other like hits and other blog posts. I would take it. So I would say technically like the OCR capabilities were like AI that was in some of our Snagit products for like grabbing the text from a screen capture image and the video editing audio portion.

00:07:33

P4: So, we had this tool called Audiate, which like integrates with Camtasia, which now they're just like smashing those two things together. So, you might not even really find the word Audiate, but essentially Audiate was a tool where it's kind of like the video editing to script platform where you're able to edit just the words out of your videos, like all the ums and ahs, like instantly. So it kind of builds on, like a lot of people forget that like a whole bunch of like transcripts and auto captions, that's still like a lot powered by AI. Or at least like one form of it. So, those are probably like the first tools. And I would say it's only been a muscle that I've been flexing recently. Due to like the speed of I'm on one week sprint cycles here at a large software publisher on my product, which makes you want to like really pant. So I like have to use AI to like at least help me get drafts or like clean up a report. And I just lean on Claude. I was using like RAG chains with ChatGPT, but I like the way Claude like words things better.

I'm on one week sprint cycles here at [organization] on my product, which makes you want to really pant. So I have to use AI to at least help me get drafts or clean up a report.

00:08:32

P4: I'm pure research thankfully. They're giving me tools to vibe code. But because I'm on one week's sprint cycles, I already told him like, "Guys, if I am making the helping you make the prototype and still having to like turn around this test, no." So, I'm like put my foot down there, but I'm seeing co-workers who are vibe coding like CXO dashboards and things like that. So, I know I could use vibe coding to make like fun dashboards or like artifacts with my research, but I haven't haven't touched that yet.

00:08:32

Paul: Okay. What do you think has been your biggest win, success, or efficiency gain from using AI tools in your work?

00:09:45

P4: Definitely using it to help me keep up with this intense pace. so for like well first like just AI in general like the product I work on is fully agentic AI. So that's been cool to like try to like see my company hold a high standard for like accuracy cuz we work in tax which you know it's not good enough for it to be 96% accurate because like one thing can ripple effect across a whole messy complicated tax return. So that's been good that like everyone here is very much like trust and verify keep the human in the loop. I feel like there are a lot of products that are building without that thought behind it. They're like, "Let's just ship fast and break stuff, but ours is like, no, no, this has to be nearly I know we can't be perfect, but we're trying to be nearly perfect." so that's just like a win for the culture on my team. And then the second thing is definitely I'll share my screen because it's always like easy for me.

00:10:53

P4: Let me pull up. Actually, I probably have to log into our Zscaler. I have some of those things open as like examples that I had in my free screener.

00:10:53

Paul: Great.

00:10:53

P4: so the first one is just like using like a classic chain, which we have this thing called [internal AI platform], which is like a large software publisher approved tools. they just literally just got everyone like last week Claude licenses which we have like all these tools in [internal AI platform] that we could like choose from. So I am kind of confused.

00:10:53

Paul: It sounds like there's a proliferation of tools that are sanctioned.

00:10:53

P4: They are throwing every single tool our way. And I feel bad for our designers because they have even more. like for example linear and then there's cursor and then we have all the figma makes all like it's oh let me allow what's to I'm going to revisit and try to share screen entire screen.

They are throwing every single tool our way. And I feel bad for our designers because they have even more. Like for example Linear and then there's Cursor and then we have all the Figma Make.

00:12:15

P4: So this kind of like we have like you know all of these options that you can hop into. So, like for quick things, I'll just like always like hop into Claude and like ask it, you know, my random questions and upload documents. And then we have these like you can like have these prompt builders that we all have. I haven't played around with our marketplace, but essentially if I can do I have mine favorite. Okay, I don't know why it says all chains, but if I can go in and then find and transcripts. I might put it in the description, but essentially that's what I was doing. It'll take a second to load.

00:12:15

Paul: What is this?

00:13:25

P4: Yeah. Oops. Yeah. So, this is like we got to take a class on it and I don't really like to fully remember how it works, but you essentially have your like saved system prompt and you with these chains. I guess the benefit of running it through one of these chains compared to going back to one of these [internal AI platform] experiences and tossing it into there is that you can do more transcripts at a time and then like switch the LLM on the back end. So I could be like, oh, what does Claude say? What does ChatGPT say? I think that's what it means. One of the other things I wanted to share with you is, kind of like how we're like working through this stuff. So, we've one thing that has been a pain point for other users not necessarily me cuz I think it's been going okay is the transcript cleanup, especially if you're tossing like 27 transcripts at a time.

00:14:39

P4: We just have one of we have a really smart guy, he's a researcher on our team, but he got kind of moved on this team that is just creating AI tools for us researchers. So he's still a researcher like us. But that's another thing that I think a large software publisher is totally making that investment in us of like we're going to pull people from your team just to make custom things for the team, which is great because I'm on one-week research sprints. I don't want to waste time experimenting. I just want it to work.

P4: And I want to still like practice with the transcript cleaner, but I don't Yeah, I don't have time to teach myself these things, especially as we already went from like Rive mania to Figma Make mania and now we're on like Cursor mania. It seems like there is always a new tool and you have to almost use all of them to like feel comfortable.

We already went from Rive mania to Figma Make mania and now we're on like Cursor mania. It seems like there is always a new tool and you have to almost use all of them to feel comfortable.

00:15:38

Paul: Is that person that you mentioned, is this someone who's not assigned to a product team like you are, but is more in a centralized, center of excellence or corporate function?

00:15:38

P4: He was on kind of like our like the future discovery work of our products. And I think he'll still get pulled into those if there's like they need the bandwidth. But like right now he's attached to only trying to like help us hit our OKRs of like reducing work for us other researchers. So he got kind of pulled into like yeah he's always been kind of on like the forefront type of stuff which was for our products and he still will like come in time to time if there's like a future of audit project he'll come in and like kind of be a lead on it but he won't actually have to like facilitate all the sessions.

00:15:38

Paul: How did this come about? I'm looking at your team's SharePoint right now and I see that you've got a research team page and it's labeled AI plus UXR and then there's samples and you also showed me some tools and workflow chains.

00:16:37

P4: Which this is so great.

00:16:37

Paul: Is this something that was top down driven, more bottom up, or a little bit of both?

00:16:37

P4: I think it was like [person] being good at this stuff and then maybe not having as many projects on his plate and then from the top up of them being like we're going to set our OKRs and we know our head of leadership like all the way to the CEO is like we want we're we're taking bets on this.

00:17:35

P4: we're going to invest in this heavily, but we want you guys to start measuring that in your OKRs. So for example, I think they want us to be like down 10 hours of work a week with these tools by the end of the year. So then they were like

I think they want us to be like down 10 hours of work a week with these tools by the end of the year.

00:17:35

Paul: So explicitly an efficiency gain. When you say down 10 hours, they're saying we want you to document having saved 10 hours of work.

00:17:35

P4: Yeah. Yeah. With these type of tools, which I was like, okay, 10 hours. I'm not. And again, it's all like perceived time, right? And we all want to hit our OKRs. So, I feel a little weird about how they're keeping track of it. But I like that at least for us, thanks to having like Yeti have the time and the bandwidth to do this. Like my designers are not getting something as thorough as this.

00:18:26

P4: Like we've had like training sessions as a group. We've he's done it like little mini pilot sessions before we even get to the training sessions as a group. So, we're like super lucky because my designers are literally just being told by their manager like, "Here's Figma. Go play with it." and they always like talk about like how cool it looks, but I don't think their training gets as like practical or for example, I don't think theirs gets as attached to like a use case as much. So for example, like this job map is awesome because of course the researchers are going to like attach everything to a use case. So we they had like surveys of like where our pain points were. I know this image is kind of small but essentially like you can see like where our core jobs to be done and like where we wanted like AI to like intervene which actually is the hyperlink. No, it's not clicking up to that. oh actually it's here.

My designers are not getting something as thorough as this. We've done little mini pilot sessions before we even get to the training sessions as a group. So we're super lucky because my designers are literally just being told by their manager like, "Here's Figma. Go play with it."

00:19:34

P4: Yeah. And I yeah this so they're really like thinking through that thought process which has made they were just they were trying to convince all of us that unfortunately when you are throwing things into like any type of even like Claude, like if you attach seven, we have they had were doing plenty of studies where it's not actually reading all seven. It might have only referenced five. So that's been an issue for us is like trusting to be like, okay, did it actually analyze all the calls? Which leads me into one of the other tools that our data analysts made, which is this one, which is like you, so you'll see this customer insights explorer. And this is probably like the worst, well, not the worst, but it's like the least reliable tool I use. So I'm assuming it's like a RAG chain, vector search plus reranking, that's like based on Claude.

They were doing plenty of studies where it's not actually reading all seven. It might have only referenced five. So that's been an issue for us is trusting to be like, okay, did it actually analyze all the calls?

00:20:43

P4: So again I don't know why it's not as reliable but this is searching I can pump it up and like rank I can do like final results after. So if I just want like the top 10 themes and then I can like set filters and I would do like all customer calls but just make sure they're external. So this would include like customer experience like those customer success managers and then this would be sales. So if I want customer success and sales that's what that filter means. and then I would just do it for my product which is called R2R. And then I could ask it like okay like hey say but I say what are the top pain points? and usually I this is what my PM does. She has a very short prompt. I'm always like, don't be verbose. Make sure to add like I have like a copy and paste that I will pull from like a prompt library that I've already written.

00:21:44

P4: That's a paragraph and then it will rewrite like some of the things. So I just like in ready to review. that does that to me. so it'll like rewrite the query and then it will supposedly search across all of those. So it's finds 849 calls. So, there's a ton of calls that are happening, which I'm glad we have this and not have this because there's no way I would go have time to go through 849 calls. But like right off the bat, the first time I used this like a month ago, it like hallucinated a whole quote.

Right off the bat, the first time I used this like a month ago, it hallucinated a whole quote.

00:21:44

Paul: This is interesting what you're showing me and I'm going to do a little bit of restating just for the transcript because I don't want to share your org's work on a video to the world. So this has taken in calls to support and someone set up a front end on your intranet and it's using RAG chains and some LLM in the background to let you query the calls and set some filtering parameters around what type of calls that are maybe categorized by tag.

00:23:15

Paul: I love the irony of you as a researcher derailing me because it's this is really good stuff and I want to make sure that we keep going with this. But what I'm seeing here and hearing is that yeah, it kind of works, but you don't 100% trust it because you're not sure that it's picking up the right type of calls or amount of calls. What do you think is going? What's the failure mode? Or are you not sure?

00:23:15

P4: Yeah. So I saw that there was like back when I was at [former company] like I took pride in using Dovetail as our research repository. I don't know if you've like used Dovetail in the past, but essentially we were so customer-call, video-heavy at [former company] that like we uploaded everything into one and I would spend a good amount of time like tagging everything.

00:24:11

P4: So then when there was the moment which I know Dovetail has like AI tags now it wasn't very good when I left but it's like getting better. So they like we would have it and we I don't know if maybe my tagging helped, but their AI was like so good on the back end of and then I could go to like my tag library and be like, see this is every time somebody talked about removing and I don't know in their video just making something up. So I liked having a qualitative number to my quantitative analysis and I don't think LLM's are good with numbers yet or like citing things in that way that I like. So essentially like there were 849 calls, but why if this is the number one theme, why do I only see two things cited? I want to see 80 things cited. And I don't know if it's because of how he the our analyst who made it is also like is it because it's 10?

00:25:16

P4: Like I've tried it with like 50 and then it's only gives me like 50 results. It doesn't like ever do more than like 10 sources is what I found. And that just isn't compelling for me to be like you're giving me three or then why does this one have three quotes? I don't know. So, I know it says that it has this like reranking. I just don't understand. And then when I have like looked into them, it's just not very accurate. There's been times where it's like it's picking up what the saleserson is saying, not the actual participant. or it's loosening a quote. or and then because there's all this like foggess, I spend more time clicking into these and like just like reading or re-watching the video to like get a feel for like what they were actually meaning.

00:25:16

Paul: You're doing a lot of pogo-sticking, in a way.

00:26:15

P4: Yeah. And I have a very sensitive product manager who's like she would she can spin anything to it being okay and that we don't have to improve it. So that's really hard for me because I know she's coming in here and saying like what are the wins and then if she doesn't find win she's going to like twist it and then I'm like wait did you check that? And she's like I don't have time to check it. So that's the one I'm at least glad it's better than nothing. But yeah, I don't know. This is something and I've reached out to the guy who created that this and he hasn't responded to me at all.

She can spin anything to it being okay and that we don't have to improve it. So that's really hard for me because I know she's coming in here and saying like what are the wins and then if she doesn't find win she's going to like twist it and then I'm like wait did you check that? And she's like I don't have time to check it.

00:26:15

Paul: Interesting. I really appreciate you showing this to me.

P4: Yeah.

00:26:15

Paul: I've used NotebookLM in a way that has given me high confidence that it's citing correctly. So it's using the appropriate sources and citing appropriately and using my tagging and thematic structure. But I don't want to fill up this chat with that.

00:27:13

P4: Oh, okay.

00:27:13

Paul: But I'm happy to talk to you about it because it was pretty cool and even though I was using capabilities from a few months back, it's only better now.

00:27:13

P4: Yeah, that's great. I've gotten good things out of that too.

00:27:13

P4: Again, when I've used either those chains I showed you, or attached things in here, to do a first draft of my reports — I could show you one report that I barely tweaked. Generally, I write my skeleton of a report and then use AI to fill in a couple of quotes, or maybe something that's kind of important. And it makes sense that that was the easiest report.

00:27:13

P4: I guess it's more of a blog post — that's what we call these — but this one was essentially nearly 100% written by AI.

00:28:10

P4: So I didn't have to spend very long writing a report for that one week of research.

00:28:10

Paul: When you run a session, do you do session-level quick summary?

00:28:10

P4: Yeah. So my typical weekly schedule: we try to figure out what we're doing hopefully two weeks in advance, but essentially, that Friday I'll be in a jam session with the product manager and the designer — and if they don't have the designs ready, we're all vibe coding live together. We're usually still tweaking that on Monday. Now I've gotten to a schedule where we do half-day sessions on Tuesdays, because I was having an issue with sample size. So I'll run customer sessions on Tuesdays and Wednesdays to hopefully get five to six people. Then I spend all of Thursday analyzing and writing a little blog post like the one you just saw, and on Friday morning we have the debrief with all the stakeholders.

00:29:18

P4: And then we just rinse and repeat, every single week.

00:29:18

Paul: Okay. The reason I asked whether you do session-level summaries is that I've found, using NotebookLM, that it developed the themes without going off into hallucination land when I fed it both the transcripts and my one-to-two-paragraph Slack summaries — which I'd write at the end of each session, just a human-only brain dump, basically. And I've always been curious about how much that's done in other organizations.

00:29:18

P4: I know they say to do smaller slices, you know, to avoid that. And a lot of my teammates are running calls through it to give the summary notes to their stakeholders in between sessions. Unfortunately, I think those are too long for stakeholders to even read. I just do the key decision questions and then maybe two bullet points.

00:30:26

P4: But I do that because I take notes live during the session, in a spreadsheet. So I'm a little bit different — I feel like I am the AI. But I know it's super easy if they want a summary of all our stuff; you can just use Copilot, because we upload everything to SharePoint. So yeah.

00:30:26

Paul: You've talked about the wins, the efficiency gains, that you and your organization have seen. And you talked about a disappointment, something that wasn't working as well as you'd like — that's when you showed me the call transcript front end for the support calls. What do you think has been the organization's biggest disappointment or failure, without bringing the dirty laundry out into the light? Have you encountered a time when there was an AI rollout and it just didn't work as advertised, or as hoped?

00:31:11

P4: Yeah, I think we're still kind of there. There have been times when we've rolled out products to our customers that didn't have the level of accuracy the tax market needs, and that's been disappointing. They even named one of them the same name as the product we just launched, which is just confusing — they called theirs Review Ready, and mine's Ready to Review. They probably could have thought about that a little harder, because the two launched within two years of each other and they're completely different. So that's a bit messy. But I would say — and I put this in my screener — it's more that the devs are really concerned about it.

00:32:32

P4: They see it more as a threat to their jobs — they have a fear of job replacement, and the quality has not stayed consistent. I'm in this enormous corporate office, and there used to be way more people here; then they cut two-thirds of the staff. So overnight, entire departments with internal knowledge of really old code — and you know how important that internal knowledge is, of why old code is so quirky — a lot of that just disappeared when they cut everybody. Then they rehired in India, and I don't want this to sound ethnocentric — it's not that we think Americans are better; that's not what I'm saying. It's that, to get those capital gains, they hired at the lowest rung of the salary range they could over there. So it's like they tried saving money, and then saving money even more, and they're just not getting as high a quality as they could.

00:33:42

P4: So now they're saying, "Oh, we want 50% of code to be written by AI." And I have some lone-ranger, [midwest US city]-located developers who are like: I already spend so much time cleaning up this low-quality code from our overseas colleagues, and now I have even crappier AI code to review — it would have been faster to just write it. So one side of the coin is people seeing it as job replacement, or just a way to be faster, but the quality isn't there. Everybody's very specialized in a large corporation, which was a new culture change for me — I liked being ambidextrous at my former place — but here you have somebody who just does the wording on the page, the content designer; then you have your UX designer; and I'm only a researcher, and I don't know why there doesn't get to be more of us. So you have some folks who are worried that people just go with the first bit of slop they see.

They're saying, "Oh, we want 50% of code to be written by AI." And I have some of my locally located developers who are like, I already spend so much time cleaning up this low-quality code from our overseas colleagues and now I have even crappier code in my AI that they have to review and they're like, it just would have been faster.

00:34:58

P4: So my accessibility and content designers are a little concerned. As accessibility builds this really cool thing in Cursor — an agent they've trained to remind everybody to be accessible — they're like, okay, well, are they still going to have me in three years, or will there just be fewer of us? So that's a real fear. And the research program here at a large software publisher is pretty robust; I think our downside is moving slowly, but with rolling research, you can't get faster than a week. But yeah, everybody has this fear that people will just decide the first thing is good enough. People are even arguing that AI-moderated research is better than human researchers, which — I don't know about that. So we'll see how it goes. I know it's going to disrupt a lot, and I think organizations that don't value quality are going to cut corners — but those are companies I don't want to work for.

So my accessibility and content designers are a little concerned that as accessibility builds this really cool thing in Cursor to remind everybody to be accessible — this agent they've trained — they're like okay well are they still going to have me in three years or will there just be fewer of us?
People are even arguing that AI-moderated research is better than human researchers, which I'm like, I don't know about that.

00:36:04

Paul: You mentioned AI-moderated research, and I'm already at the point where I watch some sci-fi short on YouTube and I'm not sure whether I'm looking at human actors or AI-generated ones anymore, because that shininess we all saw in 2024 and 2025 is now pretty much gone — you see real pores and realistic expressions — and I don't like that.

00:36:04

P4: Yeah.

00:36:04

Paul: Is there anything you've completely stopped doing now because AI does it for you? And, flip side of that, is there anything you've started doing that you wouldn't have done a year or two ago?

00:37:11

P4: Yeah — I don't think I've stopped doing anything, but I'm always reaching for it now. Before, I would just use it like a Google search — I'd just ask it questions — but now I'm trying to use it at every stage of the process. So it's like, okay, I come up with the draft of my recruitment screener, but then I'm going to throw it in there and ask it to help me buff up the screener and come up with what the recruitment spiel at the beginning should be. It's always: I put down my splat of ideas and then have it fluff them up, at every step.

00:38:07

Paul: So ideating and iterating on it. Yeah, that's also how I've been using it.

00:38:07

P4: I haven't created any agents to fully do a task for me yet. I know we'll get a note-taking task and then a report-writing output — those were on Yeti's roadmap to have by the end of the year, so that will be cool. But a lot of researchers I know don't take notes live during sessions, or they have to go back in afterwards, so that's probably going to save them more time than it would me. Also, I can't get away with as much silliness here as I did at [former company] — I used to make very silly shareouts; a lot of them were video, and I would have a bald cap on and be like an SNL character. So I can't get quite as silly here.

00:39:00

P4: And a lot of my reports are written first in that blog-post style, but then I have to go make an academic report later, which is a hassle. So if AI could automate that, that would be cool. But for right now, it's just kind of a co-editor.

00:39:00

Paul: Right. Okay.

00:39:00

P4: That's all it really is.

00:39:00

Paul: Let's talk about disclosing AI use. I think everyone's had the experience where you look at someone's content — in their personal life or their work product — and it just feels like low-effort, phoned-in AI slop. What norms and unwritten rules are forming, if any, around disclosing AI use and making sure it has the quality you would expect of a human effort?

00:40:02

P4: That's a good question, and nobody has set those norms yet. I still feel like it could go either way. You'll have some folks who are like, "Whoa, this looks really good," and they're like, "Thanks, I vibe coded it," and everybody compliments them for it. For me, though — for example, we have PRDs, which is... I don't even know what it stands for in dev language, but essentially it's what the product manager makes of the requirements for design. A lot of them now are making those look really cool with vibe coding, but I don't think they ever go back in and add anything beyond whatever they prompted. And there are parts that are supposed to have secondary research and customer research, and it's just making up pain points in there. So everything looks put together, and there are a lot of words on the page, but nobody's going in for that second layer — and that's the part where I'm disappointed.

A lot of them now are making them look really cool with vibe coding, but I don't think they ever go back in and add anything beyond whatever they prompted, and there's parts where it's supposed to have secondary research and customer research and it's just making up pain points in there, and so everything looks put together and there's a lot of words on a page but nobody's going in for that second layer.

00:41:18

Everybody's like, 'Looks cool.' And then I'm like, 'No, but read it.' Does any of this [make sense]?

00:41:18

Paul: Have you ever gotten any pushback from the developers, who presumably have to consume those documents to build? Is that coming up at all — do you think the devs even read the documents?

P4: No. I just feel like we all get... yeah, I think we all get stuck in more meetings, or more design jams, where instead of somebody saying, "Hey, I'm going to make sure I get secondary research from Aaron instead of this made-up gobbledygook" — because I'm like, where is it pulling from? It's not connected to my research bank — we're all just doing it on the fly, anecdotally, which I don't love either.

00:42:09

Paul: This is directly related to my other question about trust, which is how do you decide when and whether to trust AI?

00:42:09

P4: I verify. Even starting with a transcript cleaner first, to be like, hey, we know the transcript output that comes from [Microsoft] Teams is only kind of okay. UserZoom is a platform we use that's trash — their transcripts suck most of the time; I don't think they've updated their software in forever, and they just keep merging. So okay, I guess this is going to be the enshittification, as they call it, of a lot of merging companies.

00:43:21

P4: So essentially we have the transcript cleaner first, to get something better. But like I said, I had no idea that if I dropped seven transcripts into Claude, it wasn't always cleaning them up, or referencing all of them in the report. So there are little things we can do, like telling it to cite how many of the linked documents it read — you can try that, and it'll be like, oops, sorry, here are the other two. So there have been some ways we've checked it on our side, and I like that our team started so small instead of going for the biggest thing first; it seems like we're taking our steps that way. But then, yeah, our designers feel like they're getting their toes stepped on a little, with the product managers also vibe coding. The thing with AI is that it falls down pretty quick — again, style over substance. It looks cool, but if you're making just a very small feature, it makes that feature so prominent in the design that you're like, this is the smallest feature — you still need a designer to go through and say, actually, no, this is what the workflow is. So there's a healthy bit of skepticism. If everybody was on a scale of who's going to do a second pass, research is probably at the highest end, and product managers are way over here. I don't interact with the devs, but from what I've heard from the devs I have lunch with, they're probably actually more in the middle — even though studies show you get more accuracy for code than you do for other things.

If everybody was on a scale of who's going to do a second pass, research is probably at the highest end, and product managers are way over here.

00:44:28

Paul: So, just to clarify: when you say researchers are over here and product managers are over there — that's a continuum of trust?

P4: Yes — trust, slash, whether you're actually going to go back in and change something. As for my product managers and my marketing team...

00:45:26

Paul: Yeah. Okay.

00:45:26

P4: I would put them right over there, where they still think whatever came out first is cool and they're going to go with it.

00:45:26

Paul: Cool. Let me see — we covered trust, we covered norms developing. I was going to ask how you feel AI is changing how you approach solving problems, but you've demonstrated that; you've shown me. So I want to skip that in favor of my wrap-up questions, which are: how is the increasing presence of AI in the world, both work and personal, making you

00:47:24

P4: I definitely am using it more than I would like to.

00:47:24

Paul: feel?

00:47:24

P4: I would say — my friend group, people who think the way I do politically, it's a swear word around them. They don't think it's cool; they don't buy the hype. I live in [midwest US city], and there's a data center going in in [city], which is where a large software publisher's headquarters is, and another data center going in in [neighboring city], which is technically the city I live in. So we're seeing all these horror stories of people running out of water, and we know they're coming for the Midwest because of our water, and it makes us worried that it's all some big dumb bubble. Also, my husband works for our electrical company as a lineman, so he already sees how stressed the grid is from people just flicking on their air conditioning in the summer. And a lot of these data centers get a free pass on our utilities, whether it's water or power, without building their own substation — because that would cost way too much money.

00:48:38

P4: And I'm just like, will they even be around in 10 years? Our friend does the concrete for the [city] data center — there was a big push to get that one stopped, but there just aren't very many laws to protect what people want. The [city] data center is huge; if you look it up, it's acres and acres and acres of farmland. But they're building it in a way where they're like, "Well, we can turn this into a warehouse — maybe it would just be an Amazon warehouse afterwards." So the people building it are already predicting this bubble might pop. Personally, it feels like you have to wear a mask on LinkedIn — "I love AI" — but in my personal life, I'm seeing... I love Perfect Union, one of those independent reporting accounts you can find on Instagram or YouTube.

I live in [midwest US city] and there is a data center getting put in [city], which is where [organization] headquarters is, and a data center being put in [neighboring city], which is technically the city I live in. And so we're seeing all these horror stories of people running out of water and we know they're coming for the Midwest because of our water and it makes us worried that it's all some big dumb bubble.
My husband works for our electrical company as a lineman. So he already sees how stressed out the grid is from people just flicking on their air conditioning in the summer. And a lot of these data centers get a free pass at a lot of our utilities, whether it be water or power, without building their own substation because that would cost way too much money.
Our friend does the concrete for the [nearby city] data center — there was a big push to get that closed, but there's just not very many laws to protect the rights of what people want. They're building it in a way that they're like, 'Well, we can turn this into a warehouse or maybe this would just be an Amazon warehouse afterwards.' So the people that are building it are already predicting this bubble might pop.

00:49:41

P4: You see all this reporting of stuff that's happening — like how, because they're not building their own substation, they have to use all these awful generators that can increase asthma and all this other bad stuff for people with health issues. So environmentally, and in terms of what it means for how people live — I don't love that. And we have pushed within our company: hey, can we make sure we're doing carbon offsets, anything we can do. And they always say some corporate spiel, but I don't really feel like there's truth or anything cited behind it. I even have a dear friend who's an immigration lawyer in Chicago, and he uses one of a large software publisher's legal tools for AI research — CoCounsel is what it's called — and sure, it saves him time, but he still describes the AI he uses (and he still does use it) as crap. That's how he views it. A lot of people in my generation — he's maybe six years younger, so a little closer to Gen Z; I'm a millennial — view it that negatively. They think it's that off-putting. And even though my husband's a lineman, he does animation on the side, so there are cool things to do with it. But if anybody is even a little bit environmentally minded, or leans politically one way — although that's not quite true, because my parents are boomers who voted for Trump, and my mom is extremely against data centers. So I think this can be something both sides of the aisle agree on. But yeah, I would say it's hard.

00:50:50

Paul: Interesting.

00:50:50

P4: It's hard, because I'm still going to ask it. And actually, my bonus — my performance — is attached to how much I use AI at work. So I have to; if I don't, I might not get my bonus. Which is strange.

00:51:45

P4: So at first, before I'd really figured out how to work it into my workflow, I was just asking it for my grocery list and other dumb stuff, and I felt bad, because everybody tells you one search is dumping out a water bottle, and I'm like, oh no, I have to do so many searches a day or else I don't get my bonus. So, I don't know. That's all.

My bonus, my performance, is attached to how much I use AI at work. So I have to [use it]... if I don't I might not get my bonus. At first, before I'd really figured out how to work it into my workflow, I was just asking it for my grocery list and other dumb stuff, and I felt bad because everybody tells you one search is dumping out a water bottle, and I'm like, oh no, I have to do so many searches a day or else I don't get my bonus.

00:51:45

Paul: What's your single biggest concern or fear? And what do you think is the most significant breakthrough or positive outcome AI might enable within the next decade? So: single biggest fear first, then single biggest hope.

00:51:45

P4: I guess my biggest fear is that we will never have the regulation we need. There are so many ways you could say, all right — if you want to build in my town, then build your own substation, and we're going to monitor how much water you use. We could have done that as a community, but it didn't get pushed through because of the political climate. And I fear a lot for children's safety with AI — I have two little ones — so I'll just use that one.

00:52:35

P4: It's so fucked up what people can do with that type of technology — deepfakes and everything. So yeah, I just think there needs to be an awareness of the misinformation, of what can happen, across all political systems. We saw what happened with the violence in Myanmar because of Facebook, and I just think we haven't learned anything — especially given the amount of misinformation and AI slop in those videos, and how good deepfakes are.

00:53:39

P4: There's just so much that should be regulated. You could think of anything and come up with a law, and we have zero laws — and even at the state level, there isn't a federal law for us to point to and say, well, I want this because we federally banned that. So I just think regulation would be sweet, and I'm trying not to be a hypocrite — I'm still using it. But I just wish that, as a whole, we could regulate it, because I think if we set aside how we voted and talked about it clearly, we'd say, "Yeah, we should." You were seeing that movement of states trying to come up with their own laws — California, Michigan — we were all coming up with our own laws, but there needs to be regulation to protect people.

00:54:31

Paul: How about biggest hope? You know, what's your biggest positive hope for outcome when it comes to AI?

00:54:31

P4: Oh my gosh. I hope it pops a little bit.

00:54:31

Paul: That's fair.

00:54:31

P4: I just think there's a lot of hype and bloat to it, and I want to see some of it pop. I think there's still going to be great underlying technology, but I want the savior aspect of it to chill out a little. It's cool, but it's not everything. And until we see how much ROI these companies actually need — is it really the coolest thing? We can still be offering good value in other parts of our economy.

AI Use Disclosure

I used AI to analyze the data collected via interviews and surveys. How?

  • I took notes after each session.
  • I fed those notes to several AIs, along with the moderator guide, project proposal, session transcript, the participant's survey responses, and a codebook of tags and themes I've been iterating as I collect data.
  • I prompted each to write a background, findings, and emerging themes section.
  • Then I iterated on each AI's draft, challenging the AI where appropriate and removing what I'm euphemistically calling "hallucinatory content" :-).
  • I collected each AI's drafts, added them to the project I've set up in Claude Cowork, and prompted it to draft the background, findings, and emerging themes section, pushing back as appropriate.
  • Then I edited the content, because "human in the loop" means "I have final edit." At least to me it does.
  • I then published each session writeup.
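For readers who want a concrete picture, the per-session steps above can be sketched roughly in Python. Everything here is illustrative — the function names, prompt wording, and the idea of matching source names by string are my assumptions about one way to do it, not the actual tooling:

```python
def build_session_prompt(guide: str, codebook: str, transcript: str, notes: str) -> str:
    """Assemble one analysis prompt from the session materials:
    moderator guide, codebook of tags/themes, transcript, post-session notes."""
    return (
        "Moderator guide:\n" + guide + "\n\n"
        "Codebook of tags and themes:\n" + codebook + "\n\n"
        "Session transcript:\n" + transcript + "\n\n"
        "Moderator notes:\n" + notes + "\n\n"
        "Write three sections: Background, Findings, Emerging Themes. "
        "Cite the source document for every claim."
    )

def uncited_sources(draft: str, source_names: list[str]) -> list[str]:
    """Return the supplied sources a draft never mentions — the
    'did you actually reference all seven transcripts?' spot check
    P4 describes running against AI-drafted reports."""
    return [name for name in source_names if name not in draft]
```

So after collecting a draft, `uncited_sources(draft, ["session1.vtt", "session2.vtt"])` lists the transcripts the draft never names — a cheap first pass before the human read-through that "human in the loop" actually requires.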

There's a bit more to it, but I'm trying to keep this short. Reach out if you want to talk about my AI-assisted workflow, which I'm still evolving as I go.