Building an Effective Data Strategy via UX
by StratMinds
Full Transcript: John Whaley
Yeah, anyway, so we're talking about user experience, and I think the one true differentiator for AI companies in the future is going to be data and data strategy. Think about it: once GPT-5 comes out, anything that is publicly accessible on the web will be crawled, parsed, and completely available to anybody out there using it.
So for anything that's out there, any small team using GPT-5 is going to come and eat your lunch, unless you have some clear differentiation in the data you have, something that is somehow unique. And it's not only about having some data at one point in time; you need a continuous stream of data, continuously updated.
And so if you think of data as a strategic asset, the question is, well, how do you get access to the best data? My assertion is that the best way is to integrate data collection into the user experience, into the natural use of the product. If natural usage is generating unique data and unique insights for you, that's going to be a huge advantage.
I'm sure you've probably heard this before: data is the new oil. But that's not a perfect analogy.
The key with data is differentiation and targeting. Data is not actually a commodity. There's commodity data, but that's boring; that's the stuff everyone else is going to have. What you want is data that's really unique. I think the winners in this space are going to be the ones with the best data flywheel, the ones that build effective feedback loops.
When you're evaluating AI-based products, or ones differentiating based on data and AI, you want to find the ones that get that feedback loop right. They're going to be the winners in the future.
It's also important to understand where you can use data. You can use data to train a brand new neural network or fine-tune a model, but you can also use it to provide much more personalized experiences that make your product stickier and more compelling. You can also use it within your company to make more data-driven decisions. Understanding where it's useful to do that, and where it isn't, becomes really important.
And so if you want a strong data strategy, if you want data to be a strategic asset for you, how do you get that? I think the best way is not by buying data from data brokers and things like that. It has to be through product engagement.
When you're talking about data, the most important thing is to actually have a product which is compelling for users, that people actually want to use. Because if you don't have that, then your whole strategy is going to be really hard. So you have to make sure that you do have great product engagement, great user engagement. And then, once you have those insights, you're actually able to do a much better job and improve things.
The most important part of this is making it so that gathering feedback and data from users becomes effortless. I'm sure you've all seen situations where products don't do this well: I'm trying to do something, and the product interrupts me to ask for some type of feedback. I don't want to do that; I'm trying to do this other thing. Why are you asking me for feedback right now?
Or you ask in a place where the user is in a very emotional state, where they're upset about something. These are not the best places to ask for feedback. So the best approach is, number one, make it inherent: capture feedback seamlessly during product usage, so that you don't disrupt the user experience but can still gather data.
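As an illustration, inherent capture can be as simple as logging signals users already emit during normal use. Everything in this sketch, the event names, the fields, the file sink, is a hypothetical example, not anything from the talk:

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class ImplicitSignal:
    session_id: str
    output_id: str        # which AI-generated output this refers to
    event: str            # e.g. "accepted", "copied", "regenerated", "abandoned"
    dwell_seconds: float  # how long the output stayed on screen
    timestamp: float

def log_signal(signal: ImplicitSignal, sink) -> None:
    """Append one implicit-feedback event to a log sink (file, queue, ...)."""
    sink.write(json.dumps(asdict(signal)) + "\n")

# Fire these from UI handlers you already have; the user never sees a survey.
with open("implicit_feedback.jsonl", "a") as sink:
    log_signal(ImplicitSignal("s1", "out42", "regenerated", 3.5, time.time()), sink)
```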
This is also part of the power of large-scale models now: you can take large amounts of data, summarize it in interesting ways, and extract really unique insights from large quantities of it.
Another way to think about it: if you want to capture this, the feedback loop needs to be inherent in your product. It's not just, "Hey, I generated this thing for you, here it is," and then I don't know if it was any good. I don't know how you used it. Did you edit it later? I don't really know.
If you're able to build those kinds of editing and collaboration tools directly into the product, you can use that as an additional data source: "Okay, here are the parts where we got it right; here are the parts the user went and edited." That becomes extremely valuable information for building and differentiating the product.
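For instance, if you store both the AI draft and the version the user finally kept, a simple diff gives you exactly that signal. A minimal sketch using Python's difflib, with made-up data:

```python
import difflib

def edit_feedback(ai_draft: str, user_final: str) -> dict:
    """Compare the AI draft with what the user kept; the edits are the signal."""
    a, b = ai_draft.split(), user_final.split()
    matcher = difflib.SequenceMatcher(None, a, b)
    changed = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op != "equal":
            changed.append({"was": " ".join(a[a1:a2]),
                            "became": " ".join(b[b1:b2])})
    # retention ~1.0 means the user kept the draft almost verbatim
    return {"retention": round(matcher.ratio(), 2), "changed": changed}

print(edit_feedback("We met last March in Boston",
                    "We met last May in Boston at the conference"))
```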
And the last part is: whenever users give you some type of feedback, give them some type of instant gratification. One good example: we work with a product that fills in information for recruiters to review. There are little things the AI gets wrong, and it triggers people's OCD: "Oh no, this thing is wrong, I have to correct it." The product makes it super easy to correct, and they immediately see the fix. They feel a sense of satisfaction when they go and make it all correct.
Because if it was, "Oh, fill out this form and file this ticket and we'll fix this data later," no, that's not good enough. You want that instant gratification to encourage users to actually give you feedback.
There are a lot of different ways you can do this. You can have users label things, or you can ask for discrete feedback: "Hey, is this good or bad?" I'm sure you've seen the feature in ChatGPT where it generates two answers
and asks which one is better, that sort of thing. But LLMs open up new possibilities: you can actually ask the user in natural language, "Hey, what was your intent here? How did we get this wrong? We want to improve."
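In practice those feedback shapes can live side by side in one record. Here's a minimal sketch; the schema and field names are illustrative assumptions, not anything from the talk:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FeedbackRecord:
    output_id: str
    thumbs: Optional[bool] = None         # discrete: good / bad
    preferred_over: Optional[str] = None  # pairwise: id of the losing response
    followup_text: Optional[str] = None   # free text: "what was your intent?"

# Pairwise records like the second one are the raw material that
# preference-tuning approaches (RLHF-style reward models) train on.
records = [
    FeedbackRecord("out_a", thumbs=True),
    FeedbackRecord("out_a", preferred_over="out_b"),
    FeedbackRecord("out_b", followup_text="I wanted a shorter, more formal reply."),
]
```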
In the past, it would have been way too hard to use all that data, because you'd need a human to review it. A human is not going to read 10,000 or 50,000 reviews; maybe Anton did back in the WhatsApp days, but normally you can't do that. With LLMs, though, you can; they're actually really, really good at this stuff.
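As an illustration, a map-reduce pass over reviews with an LLM might look like this. `call_llm` here is a stand-in for whatever completion API you use, not a real library function:

```python
from typing import Callable, List

def summarize_reviews(reviews: List[str],
                      call_llm: Callable[[str], str],
                      batch_size: int = 50) -> str:
    """Map-reduce over review batches: summarize each batch, then merge."""
    partials = []
    for i in range(0, len(reviews), batch_size):
        batch = "\n".join(reviews[i:i + batch_size])
        partials.append(call_llm(
            "List the recurring complaints and requests in these reviews:\n"
            + batch))
    return call_llm("Merge these partial summaries into the top overall themes:\n"
                    + "\n".join(partials))
```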
I think the most useful pieces are the ones on the right, the passive learning: you just observe user behavior and gather data implicit in the usage of the platform. You also find the cases where users end up overriding the AI-generated output, and learn from the changes they make as a feedback mechanism for what you're getting right and wrong.
Like I mentioned, LLMs allow for these natural follow-up questions, so you can capture a lot more subtlety. There's a question of quantitative versus qualitative feedback that you're trying to get.
This is where LLMs can really understand not just the what, but the why behind it. If you do it in the right way, it's engaging for the user; it doesn't feel like extra steps they have to do, and there's some satisfaction in it for them. Then you can use this as enhanced user feedback. And again, you can use LLMs to take that huge amount of unstructured data and summarize and extract the key points from it.
Another important point is that when you think about product, the product is not just the externally facing product that you sell to customers. You also have to think about internal tools. These can be as important or even more important from the data standpoint.
One example: at Redcoat AI, we make a system that helps find social engineering attacks. We use LLMs to analyze text and transcripts and ask: does this look like a social engineering attack or not? Interestingly, we spent about the same amount of time on our internal data-labeling tool as we did on our external customer-facing tool, just making it a great experience for all the people involved in data labeling. We built a lot of features in there tailored very much toward that.
Because again, we viewed the data as a strategic asset. If we can have the world's best data set of social engineering scam examples, with detailed labels and so on, that becomes a huge advantage. So when you're investing in these internal tools, it's usually worthwhile to make sure they're great, especially when you consider the data they're going to generate. That small attention to detail becomes really important.
I mean, once upon a time there used to be these programs called spell checkers. Separate software. You'd write your document, then run it through the spell checker, and it would tell you everywhere your spelling or grammar was wrong. Then a new innovation came along, really simple: do it in real time, with the little red squiggly line underneath when you misspell something. And overnight, as soon as that feature shipped, the standalone spell checker and grammar checker industry just disappeared, because this was so much better, right?
And so there's a lot you can do when you think about how to get better data from the user.
You can explore things around gamification. Gamification is sometimes a loaded term, but it just means making the feedback process more enjoyable for users and encouraging them to participate. These things can be very effective. Again, you want users to naturally give you good feedback, good data, and good results through the natural use of the product, and you want them to get a little dopamine hit every time they give that bit of feedback. If you can achieve that, you're in a really good spot.
And then there's immediate feedback, like correcting data in real time. Not, "Oh, you submit this thing and we'll get back to you someday in the future." No: I immediately get some positive feedback, and that really reinforces the loop.
Another thing to think about: it's not just about quantity of data, it's also about quality. Imagine you make photo-editing software. What you really care about is the edits of professional photographers, people who edit photos professionally. What do they do? Not the amateurs, right? So you can't treat all data as equal. It's not just, "Well, we have data from 10,000 users or 100,000 users."
It's like, well, actually, probably what you want is like 100 experts and not the 100,000 amateurs, right? And the other thing that's interesting around this is that data from consistent users is often more valuable than data from inconsistent ones.
Finally, longitudinal data, data you've collected over a long period of time, becomes extremely valuable because it's very hard to get. The only way to get it is to actually spend the time. So if somebody comes after you trying to build the same thing, and you have a history of, say, a year of data from your users while they're just getting started, there's no way for them to short-circuit or fast-track that. They can get scale, but they can't get that longitudinal data. That becomes super important. So it's really important to understand the long-term value of your product, and long-term retention, and how to maintain that retention, right?
When you're looking at this, you want to do things like cross-validation across the labels from different users, to understand who your best, most valuable users are. And if you have additional metadata, you can use what you know about a user to say, hey, this one is an expert or not.
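One simple form of that cross-validation is scoring each labeler by agreement with the per-item majority vote. A minimal sketch, with illustrative data shapes:

```python
from collections import Counter, defaultdict

def labeler_agreement(labels: dict) -> dict:
    """labels: {item_id: {user_id: label}} -> {user_id: agreement with majority}."""
    majority = {item: Counter(votes.values()).most_common(1)[0][0]
                for item, votes in labels.items()}
    hits, totals = defaultdict(int), defaultdict(int)
    for item, votes in labels.items():
        for user, label in votes.items():
            totals[user] += 1
            hits[user] += (label == majority[item])
    return {user: hits[user] / totals[user] for user in totals}

# High scorers are candidates for your "100 experts"; weight or filter by this.
print(labeler_agreement({
    "t1": {"ann": "scam", "bob": "scam", "cal": "benign"},
    "t2": {"ann": "scam", "bob": "benign", "cal": "benign"},
}))
```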
When you're thinking about it, it's not about volume of data. You shouldn't be focused on how many terabytes you've collected.
It's much more about diversity and quality than just quantity. I also want to talk about bias and diversity in data, because this is a really important point.
I'll give you one example from my last company, UnifyID. We made a product that took the sensor data from your phone and used it for authentication. We would do gait analysis based on the motion data from your phone: as you walk, we'd say, is this you or not? We also used other behavioral data like touchscreen and location and a bunch of other stuff.
We launched on stage at TechCrunch Disrupt and got a huge amount of fanfare. We got like 20,000 users who downloaded our app. The app, literally all it did was take all your sensor data and upload it to the cloud. That was the only value it had. But these were all early adopters; they just wanted to be part of this exclusive thing and this exciting company.
Anyway, we gathered a huge data set from those 20,000 users. We used it to train models and saw really high accuracy rates. Things were going super well. So we went for wide distribution and started to sell to some companies, including Samsung. Then Samsung started to test it, and the performance was terrible. It was horrible.
And then we dug into it: "Wait a second. We have 20,000 users. That should be more than enough data points to train this, and we got great accuracy." But who were those users exactly? They were the people who follow TechCrunch Disrupt. It was like 95% male, almost everybody from the Bay Area, all in a particular age range. They were not representative.
So when Samsung tested it, they were testing on middle-aged women in Korea, whose behavior was entirely different from the tech people in the Bay Area. That was a big eye-opening moment for us.
The way we ended up solving that was to become much more thoughtful and strategic about how we acquired data. We partnered with a bunch of different apps whose demographics complemented the ones we already had. We needed more middle-aged women, so we partnered with a virtual slot machine app: "Hey, if you donate your data in this authorized way, you get some extra tokens to play the game." And so we got a bunch of middle-aged women.
We had another one where we needed more blue-collar workers.
So we partnered with a road-surveying tool app, used by all these people doing surveying out in the field, and we were able to get a bunch of people like that. Ultimately, we ended up with about 35 million users, and performance went way up across a much more diverse set of people.
There are a lot of examples like that. You can really fool yourself; you can believe your system works better than it actually does if you don't have a lot of diversity in your data. You also get bias and other problems that can lead to it working fine for one group, but not working well when you try to scale and grow to different, diverse sets of users.
It's also important to understand that there's inherent bias in many of these generative AI tools and models. Whether you're using OpenAI or Llama or Anthropic or any other system, they try to do this type of alignment correction, and they don't do a great job. It's just a hard problem. But the bias is real; it's inherent in there.
You know the classic example: because LLMs are doing next-token prediction, you can say, "A woman's place is in the..." and ask what the next word should be. Statistically speaking, across all the text published on the Internet and Reddit and all over the place, what should come next? There's a natural bias there.
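You can see the mechanism with a toy next-word counter; the corpus and counts below are made up purely to mirror the example:

```python
from collections import Counter

# 98 skewed continuations vs. 2 alternatives, standing in for web-scale text.
corpus = ["a woman's place is in the home"] * 98 + \
         ["a woman's place is in the boardroom"] * 2
next_word = Counter(line.split()[-1] for line in corpus)
# A frequency-based predictor simply reproduces the skew in its training data.
print(next_word.most_common())  # [('home', 98), ('boardroom', 2)]
```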
So understand that there's inherent bias in those models, and you can't just rely on the vendors to solve the problem for you. You are responsible for your product, so you're going to have to address it. Even if vendors try all these alignment techniques, that's not going to solve the problem for you. You're responsible for solving it, right?
And so the last point here: building diversity into a dataset is really hard, but doing hard things is what creates differentiation and value. You should be working on hard things, especially now, when anybody can go to these hackathons and in a day or two build seemingly amazing things, because they're built on the back of GPT-4 and other generative AI tools. It's also a lot easier to build a company now; in the past maybe you needed a dozen employees.
Now you can get away with one or two, because you can use AI tools for a lot of those things. And this will continue to be the case. What that means is, when you're building things, especially as a startup, you want to focus on something where you have real differentiation. One of those things is solving hard problems, and this is an example of a hard problem. If you do that, that becomes real value.
As we look forward to the conference, I just wanted to bring up a few things. Number one, I think we can all do better than just conversational chatbots. ChatGPT is amazing; there's a reason it became the fastest product ever to reach 100 million users, because of its simplicity. But a chatbot is not a great interface for a lot of uses. If we can build things that are more interesting and interactive, that would be better.
But on the flip side, let's not recreate Clippy. So many things I see are like, "Oh, we're going to make this agent that watches what you're doing and makes suggestions." That's really just Clippy: "Oh, you're trying to write a letter, let me help you with that." People don't want that.
You have to understand where that fell down, and then embrace more creativity. Again, with AI agents, I think there's an opportunity to be much more interactive. All the chatbots are extremely passive right now: you tell it the thing you're doing and it just responds.
If you think about what good humans do, the best humans know when to ask questions and what questions to ask. I think there's a huge opportunity there for AI to do the same type of thing when you're interacting with it. Much like a great human interviewer would do.
Finally, consider data strategy as you interact here and hear different ideas. Think about how you can incorporate data strategy into products. When you're evaluating user experience ideas, do it through the lens of the value of the data those ideas will create. If you can say, "By doing it this way, we're going to create a hugely valuable data set that is highly differentiated versus anyone else's," that is going to be a lot more valuable.
Anyway, this is me. You can connect with me on LinkedIn. And thanks to a handful of people, including Lana Free, for helping me with the presentation.
So the great thing about having such a small conference, 70-ish people here, is that we can make it super interactive. We're actually going to give all the speakers at least five minutes for questions, depending on whether they want to talk. You can also find the speakers later; it's a bit of a tradition here. You can email them or talk to them. But let's take maybe a few questions and then we'll move over to Soojin.
By the way, we introduced Cura earlier; he designed the badge. Your LinkedIn QR code is already in there, so if you want to use it, it's really cool. But anyway, let's do the questions.
"Yeah. Thank you for the presentation. A question on bias, bias of the models. Do you have any thoughts on how to be biased as we use different elements? Also, do you have any understanding of how biases compare across different models today?"
"Yeah, I mean, this is a really hard problem. The reason that these large-scale models have such amazing emergent behavior is because they ingest huge amounts of data, like massive amounts of data, like on the order of like the sum of digitized human knowledge, like on that order. And it becomes impossible to filter that in any meaningful way, like, you know, in terms of before the training process. So that stuff is getting into the training process also because training these things are extremely expensive. And it's not like you can't go back. You can't be like, oh, oops, we accidentally included this like, you know, this data that we shouldn't have. Let's go back in time. You can't do that. Right." Here's the lightly copyedited version of the transcript, preserving all original content:
So what they typically do is try to add these kinds of band-aid fixes afterwards; there are a bunch of different techniques they use. But as you've seen from all the jailbreaks and other things people have found, none of them really work in a reliable way. And that's the real world we live in. Right.
It's much better to just understand the nature of these models: here's what they're good at, here's what they're not good at, here are the biases inherent in them, and then use them with that in mind. This is not going to be a case where someone builds a magic model that's perfectly aligned. First of all: aligned to whom? That's a real question, and there are disagreements about who these models should be aligned toward. Right.
And they build these things to be extremely general, so you have to understand that's what you're getting with OpenAI and the others. They try to build guardrails around them, and you see those guardrails continuously fail. I tried this recently: I was trying to generate an image of a group of investors with DALL·E, and no matter what I did, the investors were always male. I told it, make it a diverse group, make it half women, things like that.
It just refused to do it, because in its data set, 98 or 99 percent of the time you're talking about an investor, it's male. So that's what it's going to generate. It's really hard to correct that, and sometimes when you correct it, you overcorrect. These problems are really tricky. My recommendation is to figure out how to deal with bias not in the context of these general models, but in the context of your specific product.
Thank you. We'll take one more question.
Yeah. Actually, I wasn't going to ask a question, just add a few comments to your answer. As we're seeing, there are a lot of general use cases for these technologies. But as we move forward, I think very bespoke use cases are going to come to life more, as you see with the GPTs of the world, whether it's for creativity or learning something or education. And so creating more bespoke data sets and making sure that we have a diverse human...
Oh, no, totally. I think this is definitely the future. There's a controversy about this: one opinion is that there will be one model to rule them all, and everyone's going to use it. Personally, I think there's a lot more value in smaller, specialized models, both in terms of cost and in terms of the self-determination of having control over them.
And there's more stuff like models running at the edge, or running closer to the user, for privacy reasons and scalability and a bunch of other things. That's where having your own data sets becomes really valuable, in a world where you can't just rely on GPT for everything.
Most of the time when I see startups starting out, they start with OpenAI, or, I mean, now there are a few other models that are comparably good. They start there and build a proof of concept. But when they're building the real product, they rarely keep using those. They start to build their own data sets, they start to do fine-tuning and techniques like that, because ultimately they want to own their own models. They want to own their own future. So this becomes really important.
Are we out of time, or do we have...? We'll take one more question. One more, yeah, one more question.
Is there something unique about humans in solving for bias, or could we potentially use synthetic data that's not biased, or could we have, like, robot red teams? How can we make this more than just... I worry about the diversity of the loop, because that starts to look like a data center.
No, and this is a real problem.
I mean, there are these shortcuts we'd like to take, like having an AI generate the data for you. The problem is that the errors compound, and you eventually end up with model collapse. A small error becomes an error in what your model generates, which then gets reinforced and propagated, and reinforced and propagated, until eventually it collapses.
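You can simulate that intuition in a few lines: retrain on your own samples and watch diversity drain away. This is a toy illustration of the resampling effect, not a faithful model of LLM training:

```python
import random

random.seed(0)
data = list(range(1000))  # stand-in for a diverse human-written training set
for generation in range(8):
    # "Model output" is a sample of the current data; retrain on it.
    data = [random.choice(data) for _ in range(1000)]
    # Sampling with replacement loses some examples every generation.
    print(generation, "distinct examples left:", len(set(data)))
```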
We've had this exponential rise in the capabilities of the models. I think the next big frontier, which is just starting to be tapped right now, is video. A picture is worth a thousand words; imagine taking all the semantic information in all of YouTube and every digitized video ever. There's a lot of knowledge in there that is not currently in LLMs.
But beyond that, where do we get more? We're going to start to run out of digitized human data. There's only a limited number of humans, generating only so much data, and we're not going to be able to keep riding that exponential curve. So I think we'll hit, not a plateau, but a slowdown at some point where we're running low on data.
So then we have to start working on other techniques, reinforcement learning and self-learning, like the things done with AlphaGo and those other types of systems. There's maybe hope there to generate more useful data that doesn't come from humans.
But yeah, the performance of all those LLMs is banked on basically the entirety of human knowledge until now, on the fact that language has all this inherent structure that is very informative about the way the world works. We've crawled the entire web. We've digitized every book. Where do we get more data now? That becomes a real challenge as you scale these models up to multi-trillion parameters and beyond.
Anyway, cool. Thank you.
At StratMinds, we stand by the conviction that the winners of the AI race will be determined by great UX.
As we push the boundaries of what's possible with AI, we're laser-focused on thoughtfully designing solutions that blend right into the real world and people's daily lives - solutions that genuinely benefit humans in meaningful ways.
Builders
Builders, founders, and product leaders actively creating new AI products and solutions, with a deep focus on user empathy.
Leaders
UX leaders and experts - designers, researchers, engineers - working on AI projects and shaping exceptional AI experiences.
Investors
Investors and VC firms at the forefront of AI.
AI × UX
Summit by:
StratMinds
Who is Speaking?
We've brought together a unique group of speakers, including AI builders, UX and product leaders, and forward-thinking investors.
Portal AI
Ride Home AI fund
Google Gemini
Metalab
Slang AI
Tripp
& Redcoat AI
Stanford University
Google DeepMind
Grammy Award winner
Google Empathy Lab Founder
Blossom
Lazarev.
Chroma
Resilient Moment
Metalab Ventures
of STRATMINDS