Accomplish your goals at scale
Alright. Welcome, Dmitry. Would love if you would take a few minutes to introduce yourself.
Yes, thank you. I graduated with a PhD in machine learning from UC Irvine, and I became a scientist at NEC research in New Jersey, where I worked on personalization systems and published quite a few articles. Then I moved to Yahoo, where I was working for Yahoo Shopping, and this is how, I guess, my career took off. From Yahoo, I went on to Yandex, which is a Russian search engine company, [where I] was working on relevance. Then I was doing relevance and research for ads at a company called Criteo, a French company that I think is just awesome from the viewpoint of its mentality that “everything should be experimented with.” We were A/B testing a whole lot of ideas, and there was this no-fear mentality, which I think is just absolutely awesome. And most recently, I was leading a B team at Walmart Labs, doing personalization and advertising, so I was the head of technology proposals of these directives. So, this is me.
That is an awesome trajectory, and the thought process that—take the large Fortune 100—they are deploying lots of data science teams that are working on various models of various types. I’m curious to know: how do you experiment, A/B test, do rapid testing, with those models in production? What is the infrastructure that these companies are putting in place to test models in combination with each other? With historical data, current data, streaming data, so on and so forth, there needs to be a platform in place—at scale—to be able to do that. So what’s the thought process there?
Yeah, so I think, Debjani, you’re absolutely right. The platform is extremely important, and what is also, I think, important is that you have enough data coming through your front end to be able to see the traffic and test. Unfortunately, in a lot of cases, this is one of the major obstacles, because I think a lot of companies get caught up in the idea that, “Yes, we need to A/B test and put infrastructure in place,” and you will see it in a lot of big players. Now the issue is this: take, for instance, the homepage of a large, heavily trafficked e-commerce website. You may have enough traffic there to figure out whether to show the results of model X’s prediction or model Y’s prediction. Personalization, I think, is a great example. But then you take an item page, or probably even better, you take an obscure item page. There may not be enough traffic there, and so you would inevitably run into issues where you either need to wait a long time to get to any kind of statistical significance, or you really need to be looking at a major improvement that will show up with much less data. So I think there is still, even in personalization, in applications that we’re dealing with on a day-to-day basis, a thought process where people would just look at an idea and say, “Yes, go ahead and deploy.” And I will tell you that some of these ideas actually make a lot of sense.
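To put rough numbers on the traffic point, here is a minimal sketch of the standard two-proportion sample-size approximation. The function names, the 3% baseline conversion rate, and the traffic figures are illustrative assumptions, not anything stated in the interview.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_arm(base_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed in EACH arm of a two-sided A/B test.

    base_rate      -- conversion rate of the current model (assumed value)
    relative_lift  -- lift the new model is hoped to deliver, e.g. 0.05 = +5%
    """
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_significance(daily_visitors, base_rate, relative_lift):
    """Days of traffic needed if visitors are split evenly across two arms."""
    return ceil(2 * visitors_per_arm(base_rate, relative_lift) / daily_visitors)


if __name__ == "__main__":
    # Hypothetical traffic levels: a busy homepage vs. an obscure item page.
    for page, traffic in [("homepage", 1_000_000), ("obscure item page", 500)]:
        print(page, days_to_significance(traffic, base_rate=0.03, relative_lift=0.05))
```

Under these assumptions, the required sample grows roughly with the inverse square of the lift, which is why a low-traffic page can only confirm a major improvement in a reasonable amount of time.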
One example I’ll give you, very simply. When people shop, lots of times what they want is: they want to have some of their options defaulted to. For example, I want to search just one P data, and I want to, let’s say, [select] Prime delivery (if it were Amazon), and I want to default to a certain payment method, and I always want to apply my rewards points. And so, if the retailer is forcing you every time, time and time again, to make those choices, I think the level of satisfaction for you as a customer goes down immediately. Versus, if all of that is sort of pre-built for you. And you know, this is clearly something minute, it’s not a big deal. But the total experience is actually integral (all of those small things), and if more of them are actually set in the right places, then you are actually in the game.
And actually, Debjani, there is also an interesting question that I think we may want to mention here: where the whole deal with personalization is going in the future. I think it’s interesting because there are two trends. One of them is that companies are becoming progressively aware that having first-party data, and having the data at scale, is an extreme advantage, and that they’re sitting on a goldmine where they can leverage that data to be significantly more relevant for their customers in a lot of ways. The counter trend is that customers are becoming progressively more aware that, you know, they could enjoy privacy and not share their data: don’t sell, don’t track, just don’t. And so I think where this is going to end up is that the companies who actually have first-party data and who are able to leverage it to be relevant for the customers are the ones who are going to win the prize, ultimately. And so I think this is where we will see the progression of it: that the education of customers will take place, and that they will only be willing and able to shop at the sites where volunteering their data to [those sites] actually yields them the benefit of speed, of convenience, of “saving money and living better,” as we used to say at Walmart. And I think this kind of motto is so very true across the board. Unless I, as a customer, feel the benefit, why would I volunteer my data? And I think more and more customers are going to be progressively more aware of that.
That’s an interesting one. I have a question for you. So we are obviously thinking a lot about personalization, and you’ve spent years on personalization, at the leading companies. Personalization has been a long journey, but what’s next?
Again, Debjani, it’s an awesome, awesome question. And maybe an interesting example I’ll give you is this: I was listening to somebody who I recently got really fascinated by. His name is Chris Voss. He used to be a lead FBI negotiator for hostage situations in the U.S., and you can go on YouTube; he’s also teaching a MasterClass, The Art of Negotiation, and I think it’s just excellent to hear from him. What he’s referring to, either in his talks or in his book (which I also find extremely fascinating: [Never Split the Difference]), is the fact that in every situation there [are] a number of what he’s calling black swans. And if you know at least one, or maybe two of them, it actually helps you clarify the situation to a level where you can drive it to a point where it is very easily resolvable, or much [more easily] resolvable than if you don’t know this information. And so, this is the analogy that I like to use for personalization. I think if we were able to read people’s minds, in a good way, to help them, we’d be indispensable; they would love us for that.
Take the following analogy: Let’s imagine that you want to collect a grocery basket with some products that you need to buy. You can go into the store and browse aisle to aisle. I don’t know how long it takes you, Debjani; it takes me quite a bit, maybe thirty to forty-five minutes, even if I know the layout, and you also know that they change the positioning of the products. Now, given all of that, where is my convenience? My convenience is that I don’t want to spend thirty or forty-five minutes doing this. I’d rather somebody did it for me, and my shopping, at least as far as groceries are concerned, is extremely predictable. So I would claim that you can take my Costco shopping and, with little tidbits here and there, pretty much predict my basket. I don’t need to go and pick it up, I don’t need to actually build it online; you can pre-build it. And so the pipe dream there could be that I log into a website once a week, and I have my grocery list pretty much set up for me, and all I need to do is say approve (well, maybe toss out a yogurt and add pickles), and I’m done. Is it too much to ask for, Debjani? I don’t know. It would seem that if you are a customer who has volunteered a lot of data to this website, and the website knows the repetitive behavior and can also build a model of it, based on not just this one customer but everybody it has seen, then it should be able to do much better over time, and quite possibly limit the time of basket checkout [and reorder] for grocery, if not to zero, then to a significantly lower time than forty-five minutes.
Debjani, the question that should be asked: Is it true today? The answer is it’s not true today. I don’t think it’s true pretty much anywhere, so if you were to check out a basket online, even if you had done this multiple times before, it probably will take you more or less the same time as it would take you to actually go into the store. I mean, obviously there is an advantage of delivery and what-not, but I still want the ultimate convenience: trying to predict the basket, in this case, and almost read your mind. Because this will, Debjani, definitely create a “Wow” effect.
So, ultimately, it’s all a question of relevance, and relevance—in this case—means that I can read your mind and predict the basket, and you’re able to enjoy a lot of convenience by checking out very fast. Do people understand that that actually requires a lot of data? Well, that’s the rep[utation], and so I think it’s something that can go up in trend very quickly, if the company gets data-rich. It also can go down very quickly, if we cannot produce that convenience, and the product, in terms of personalization in basket building, is not very convenient. Then I don’t think anybody would want to shop there because it’s more convenient to actually go into [stores]. So I think it’s something that the companies need to pay a whole lot of attention to because I think the personalization there is a cornerstone that may easily define success and failure, depending on how well it is deployed.
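As a rough illustration of the pre-built basket idea, the sketch below proposes every item that appeared in at least a given share of one customer's past orders. This is a deliberately naive frequency baseline, not the method used at any company mentioned here; the function name, the 60% cutoff, and the sample orders are assumptions, and the cross-customer model described above is not covered.

```python
from collections import Counter


def prebuild_basket(past_orders, min_share=0.6):
    """Return items found in at least `min_share` of the customer's past orders.

    past_orders -- list of orders, each an iterable of item names
    min_share   -- hypothetical cutoff; 0.6 means "bought in 60% of orders"
    """
    if not past_orders:
        return []  # no history, nothing to predict
    # Count each item at most once per order so bulk purchases do not skew it.
    counts = Counter(item for order in past_orders for item in set(order))
    threshold = min_share * len(past_orders)
    return sorted(item for item, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    history = [
        ["milk", "eggs", "yogurt"],
        ["milk", "eggs"],
        ["milk", "pickles", "eggs"],
        ["milk", "yogurt"],
    ]
    print(prebuild_basket(history))
```

The customer would then only approve the list or make small edits, such as tossing out the yogurt and adding pickles. The empty-history branch is also where customers with very little data end up, which is why such a baseline only helps data-rich shoppers.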
But let me ask you this, Dmitry, that—what you describe—is obviously a need, a recognized need, and the bigger players do have that data. Why hasn’t it been done? I don’t see technologically there being too much of a barrier. What is stopping the big players from doing this?
No, so there is nothing stopping [them], Debjani. What I think is happening in a lot of cases is that you will see a power law distribution within the customers, and that power law ultimately means that there are a lot of customers with very little data. And if there is very little data, there is very little I can do for you, to help you in your customer journey. And so, what I would imagine, Debjani, still, is that there is definitely space in personalization to tackle big problems. There are other problems in this upper-funnel realm, where there are lots of customers with very little data, and I need to drive them down the funnel, even from a perspective of data collection, because if I have more data on a particular customer, obviously I can do a better job for them: personalizing their experience, showing relevant content, and getting them more engaged with me than they otherwise would be.
And so I think these are the problems, the big problems, that are still there that need to be tackled. But I would also say that there is time and space, and probably there should be more time and space, allocated to the problems that are really simple and transparent, that will drive customer satisfaction so much higher. You and I were talking about this problem the other day when we spoke: customers collecting their baskets and being a few dollars shy of the free shipping limit. This is a big issue because this is the time when you will see the customers just getting stuck. It’s like, “I had put my two items that I really want into a basket, and I need to pay $5.99 for shipping, and I’m only two bucks away. What do I do?” And so, if the website isn’t reading into that situation and understanding that it’s a really simple thing… You need to show a pack of gum, and hopefully that pack of gum is actually personalized, so it doesn’t have to be gum; it could be candy or whatever. There is probably also a consideration that it shouldn’t ship from a different DC, because then you are going to create a split shipment, and that’s not going to be viable for business at the end of the day. So, there are certain constraints, but subject to these constraints, if you realize that this is a problem, you can significantly improve the customer satisfaction and flow, probably improving the checkouts and GMV by a large percentage. And so, I would say that these are, in a way, Debjani, those black swans that we were talking about, which is: Do you realize that this is the issue? Because if you realize that something is an issue, oftentimes, fixing it is easy.
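The free-shipping example can be sketched as a small constrained selection: find an add-on item that closes the gap, ships from a DC already used by the basket (so no split shipment), and scores highest for this customer. Everything here is a hypothetical illustration: the `Item` fields, the `affinity` personalization score, and the sample catalog are assumptions, and prices are kept in cents to avoid floating-point rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    price_cents: int
    dc: str                 # distribution center the item ships from
    affinity: float = 0.0   # hypothetical personalization score, higher is better


def suggest_filler(basket, catalog, free_shipping_min_cents):
    """Pick one add-on that lifts the basket to the free shipping limit.

    Returns None when the basket already qualifies or nothing fits.
    """
    gap = free_shipping_min_cents - sum(i.price_cents for i in basket)
    if gap <= 0:
        return None  # already at or above the limit
    basket_dcs = {i.dc for i in basket}
    in_basket = {i.name for i in basket}
    candidates = [
        c for c in catalog
        if c.price_cents >= gap       # must actually close the gap
        and c.dc in basket_dcs        # avoid creating a split shipment
        and c.name not in in_basket   # do not re-suggest what is already there
    ]
    if not candidates:
        return None
    # Prefer the most relevant item; break ties with the cheaper one.
    return max(candidates, key=lambda c: (c.affinity, -c.price_cents))


if __name__ == "__main__":
    basket = [Item("shoes", 2000, "DC1"), Item("socks", 1300, "DC1")]
    catalog = [
        Item("gum", 150, "DC1", 0.9),    # relevant but too cheap to close the gap
        Item("candy", 250, "DC1", 0.7),
        Item("mints", 300, "DC2", 0.95), # would ship from a different DC
        Item("chips", 400, "DC1", 0.5),
    ]
    print(suggest_filler(basket, catalog, free_shipping_min_cents=3500))
```

The point of the sketch matches the interview: once the situation is recognized, the fix is a few filters and a sort, and the hard part is noticing the problem in the first place.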
So I think what you are getting at, which I personally love, is that it is about really getting in the consumer’s mindset—contextually, in that moment—and being able to add value to that journey, in that moment. So ZineOne and all of us are thinking about it in the thought process of understanding the consumer from a genome perspective, at an individual level, and predicting outcomes. Thanks for taking the time, and we will talk soon.
Sounds good, Debjani. Thank you.
-Dmitry Pavlov, Former VP of Personalization and Advertising Technology at Walmart Labs