Rockset Podcast Episode 10: Self-service data analytics with Grafbase CEO, Fredrik Bjork

Fredrik Bjork has traversed the data space, creating a gaming social network, scaling The RealReal's marketplace, and now investing in data startups. Hear how the data experience has changed with each endeavor and what trends he's most bullish on in this podcast.

Giovanni Tropeano:

Welcome to the real-time analytics podcast Why Wait? The Rise of Real-Time Analytics, hosted by Rockset. We invite everyone from engineering, business, and thought leaders to analytics specialists to talk about their world, providing insights into what your peers are doing to improve data and application analytics. I'm your host, Gio Tropeano of Rockset. I'm here with my cohost Dhruba Borthakur, the co-founder and CTO of Rockset. Dhruba, it's always great to have you on the podcast. How have you been?

Dhruba Borthakur:

Yeah. Good to be here. We have a great guest today. I'm excited to talk to him.

Giovanni Tropeano:

Absolutely. Absolutely. Our guest today is Fredrik Bjork, and he has an interesting set of milestones in his career. In 2010, Fredrik founded Avatars United, which is a gaming social network. It was sold to Second Life's social network, which is actually super relevant now with the work-from-home boom and the gaming explosion. Since then, as CTO of The RealReal, the luxury brand and online resale marketplace, Fredrik scaled The RealReal from $12 million in 2013 to $1 billion in an IPO in 2019. And he's now an angel investor with over 10 investments to date.

Giovanni Tropeano:

Fredrik is currently investing in the serverless space and development tools, data and infrastructure, with one of his most recent investments being Meroxa. So, Fredrik, super excited to have you here today. Can't wait for this conversation. Welcome to the real-time analytics podcast.

Fredrik Bjork:

Thank you. I'm excited to be here and talk about real-time analytics and more.

Giovanni Tropeano:

Absolutely. Before we jump into that, you're based in Stockholm, Sweden. You've traveled a bit, and you've worked around the globe, having been in Spain and in Redwood City, if I'm not mistaken, or at least in the San Francisco area. I used to live in Redwood City as well. It's a great area to either work in or commute to. The RealReal was out of San Francisco. You graduated from Rochester Institute of Technology, so you've spent your time in various areas in the US and around the world. What would you say is the biggest difference between living in the US and living in Spain and Sweden?

Fredrik Bjork:

I mean, it's an interesting question. I suppose you could think about quality of life, work-life balance. I see the US being on one extreme where you have the least amount of flexibility and vacation, but also probably the biggest possibilities to really make it big there. Right? So, if you think about that lever. And then you have Sweden somewhere in the middle where you have pretty good quality of life. Work-life balance is pretty good.

Fredrik Bjork:

But then Spain, it's all pleasure. So, I would say it's probably harder to make it there. Of course, there's a lot of cool startups there. But generally, when I was there, it was just enjoying life. Right? So it's the spectrum of really enjoying life versus really enjoying working.

Giovanni Tropeano:

Oh, I get it. It's the Southern European way. I spent a lot of time in Italy, and I know what that's like. Finding anybody that wants to work in August is like a diamond in the rough.

Fredrik Bjork:

It doesn't work. I mean, I like to compare. If you think about e-commerce, in San Francisco, arguably, you can get things in an hour. Right? If you want toothpaste, you'll get it. And people are quite nice. The customer experience is good. Coming back to Sweden, it was not as good as I remembered it. Then in Spain, where I was earlier this summer, yeah, it's even worse. So, that's something you get used to, and it's hard to forget how good it was in the US.

Giovanni Tropeano:

Moving to analytics then, what were some of the game-changing ways that you used real-time analytics at The RealReal? How could real-time analytics in the luxury goods industry impact the business?

Fredrik Bjork:

Yeah. So, if you talk about The RealReal for a minute, they have a really interesting business model where every single SKU is unique. Right? Which means that it's either available or it's sold. There's no depth. Right? So, you can't have 10,000 of the same, because it's not. It belongs to one person, and it's one condition. It will be transacted upon, and then it's gone. Right? So, you got to make sure that your data is real-time.

Fredrik Bjork:

So, that's one challenge, which is actually a pretty hard challenge, because then you start to think about how you recommend something to someone that's available to them... Because you have stores. Right? You have online, and you have brick and mortar. They're all working in this real-time inventory. So, it's a pretty hard problem, with tens of thousands of items going out every day. So, you have millions of items to work with.

Fredrik Bjork:

And then you have to figure out, okay. How do you recommend something to someone with relevant analytics? Right? So, that's where, as many people do, you start to invest in that. Right? You have to figure out a framework. You have to build infrastructure to get behavioral data, right, so purchase history, browsing history, returns, size, all that stuff, and make sure it gets presented to you when you recommend items. So, that's something we achieved, but it was not seamless. And the time to market is quite long.

Giovanni Tropeano:

Just a quick question about that. They were unique SKUs. When you say that, you mean like, if I wanted to put up my wife's old Gucci belt or something or even my Gucci belt, would that SKU be managed by me the user, or would that then be managed by the system itself at The RealReal in the backend?

Fredrik Bjork:

Yeah. So, it's consignment based. Right? So you send in your item to us. We create the SKU, and then it's available at the price we set. Then you get a cut. Right? So, we manage it for you.

Giovanni Tropeano:

You manage everything. Okay. Got it. Understood.

Dhruba Borthakur:

It's also-

Giovanni Tropeano:

It's not peer-to-peer. Yeah, sorry.

Dhruba Borthakur:

Fredrik, I'm guessing it's also possible that every piece of data is different, right, because you have different types of goods, maybe people selling there, different forms of data. And then maybe some days you have a lot of demand on your backend systems. Some days maybe it's less. Generally, I've heard that doing this kind of real-time analytics on e-commerce, where the demand is fluctuating and the data changes rapidly, is very difficult from a people side of things. Do you need a lot of people to manage these real-time analytics systems? If you want to make these real-time analytics systems mission critical, how hard is it from the people side of things, or the human resources side of things, to keep them up and alive and ready to run 24/7?

Fredrik Bjork:

I'd say the biggest investment is just getting everything specced out, the schema. Where does the data live? What's the output? Is it a GraphQL endpoint? Is it a REST API? Does it live in S3, or is it in GCP? From reading from Kafka to getting all of that stuff to work, because usually it's siloed, distributed teams. Just getting it working end-to-end is probably the biggest investment.

Fredrik Bjork:

But once everything is up and running, at least at The RealReal, if you have auto scaling, all that stuff, generally maintaining it is probably not so bad. But, yeah. Every time we make a change, it is time consuming. But I'd say the biggest investment is just getting it to work. So, time to market is the biggest concern, I would say, like democratizing it. Because once you have it up and running, it is an API that other people could use. But if you have to do that every time, that's months of work. Right? It's not minutes or days. It's months. So, that's the biggest concern I had.

Dhruba Borthakur:

Yeah. You mentioned something about schema change or data changing in format and shape. In that case, you probably need to have somebody go massage the data again to be useful for a real-time app in the backend. And that, I think, is the point that you are trying to make, right, saying that time to market is critical. So, you need people doing some of these things.

Fredrik Bjork:

Yeah.

Dhruba Borthakur:

I saw a similar kind of challenge when I was at Facebook building a lot of social applications. Right? So, they are also... I mean, real-time analytics was very important for us at the time. But even with your background, you have experience in building social applications. Then you also have great experience in building a really huge e-commerce backend. We know that for social applications, real-time analytics was super useful and impactful.

Dhruba Borthakur:

How much is the impact for real-time analytics, not just analytics, but real-time analytics on the e-commerce business or the e-commerce industry? Can you give us some examples or tidbits or opinions about how real-time analytics is useful in a global e-commerce perspective?

Fredrik Bjork:

Well, I think it's the rate that you can innovate. Right? So, let's say you can... For us at The RealReal, we had brick and mortar stores where we actually built our own point of sale app that was actually customized, fully native by us so that we could control the user experience. And in there, we could recommend the products that a customer was looking at, but maybe it was not available in the store. But it is available online or in another store.

Fredrik Bjork:

And that is actually... The average order value is high. So, if you could just sell one more item from that iPad point of sale app, that's an increase in your sales. Right? So, absolutely. Our ability to leverage real-time analytics to build features is essential, right, to move faster and to innovate on a monthly basis. Right?

Dhruba Borthakur:

So, you're talking mostly about personalization, probably when a customer visits your website or your application. Can you give them personalized real-time analytics?

Fredrik Bjork:

Exactly. Yeah. So, recommendations, but based upon their previous browsing or purchase history, but also in their size, right, because you don't want to recommend something that's not like a shoe or a [inaudible 00:11:38] because every single SKU is unique. Right? So, we might only have that other thing in another size, but then we should not recommend it to you. Right? So, it's multifaceted, pretty complex searches. Anytime you need to join data with another silo, then that's days or weeks of work to get it to the real-time API.
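
To make the filter Fredrik describes concrete, here is a minimal sketch in Python. It is not The RealReal's implementation: the `Item` fields, the `recommend` function, and the sample inventory are all hypothetical, and a real system would query a live inventory store rather than an in-memory list. It only shows the idea that a unique SKU is recommended when it is still available, in the shopper's size, and in a category drawn from their history.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One unique SKU (hypothetical shape): it is either available or sold."""
    sku: str
    category: str
    size: str
    available: bool


def recommend(items, preferred_categories, shopper_size, limit=3):
    """Return up to `limit` SKUs that are available, in the shopper's size,
    and in a category taken from their browsing or purchase history."""
    matches = [
        item.sku
        for item in items
        if item.available
        and item.size == shopper_size
        and item.category in preferred_categories
    ]
    return matches[:limit]


# Hypothetical inventory snapshot for illustration only.
inventory = [
    Item("sku-1", "shoes", "38", True),
    Item("sku-2", "shoes", "40", True),   # wrong size
    Item("sku-3", "shoes", "38", False),  # already sold
    Item("sku-4", "belts", "38", True),   # category not in history
]

print(recommend(inventory, {"shoes"}, "38"))  # prints ['sku-1']
```

Because every SKU is unique, the `available` flag has to be fresh at query time, which is why the inventory behind a filter like this needs to be real-time.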

Dhruba Borthakur:

That's a good point that you make because you talked about joining different silos of data. Right? Some data could be in your database, and some data could be in your streaming system. Then you probably want to join them together. Is that what you are hinting at?

Fredrik Bjork:

Yes. Exactly. Yeah.

Dhruba Borthakur:

Yeah. I think, I mean, real-time analytics is difficult; but I think you also have a lot of general-purpose analytics, which is maybe not very real-time but something which can crunch a lot of past historical data for the user, or the browsing history of the user, which is in the past. And for e-commerce applications, do you think that these two go hand-in-hand: the ability to scan or process a large set of data that has been generated historically, and the ability to join it with the most recent data or the most real-time data that is being produced as the person is looking for an item? Is it important to be able to get a mesh or a join between these two data sets to give them a good experience on your website?

Fredrik Bjork:

I mean, absolutely. I think it's common now that you have multiple data stores of the customer. Right? If you have a 360 of your customer's data, some of it's going to live in maybe a Postgres database. Some of it maybe lives in Snowflake or BigQuery, another one in Salesforce. Right? Joining them, I don't want to do that. It's something you have to spend a lot of engineering time on, putting it somewhere where you can join it easily. So, I'd rather use a managed product, if I can, just to move faster.
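
As a rough illustration of the customer 360 join mentioned above, the sketch below merges records for one customer from three sources. Everything here is an assumption for illustration: plain dictionaries stand in for the transactional database, the warehouse, and the CRM, and the `customer_360` function and its field names are hypothetical. In practice each lookup would be a call to a different system, which is where the engineering time goes.

```python
def customer_360(customer_id, orders_db, warehouse, crm):
    """Combine what each silo knows about one customer into a single view.

    Each argument after `customer_id` is a dict keyed by customer id,
    standing in for a separate data store. Missing data becomes an empty
    list or None instead of raising an error.
    """
    return {
        "customer_id": customer_id,
        # Recent orders, e.g. from a transactional database.
        "orders": orders_db.get(customer_id, []),
        # Historical aggregate, e.g. from an analytical warehouse.
        "lifetime_value": warehouse.get(customer_id, {}).get("lifetime_value"),
        # Relationship data, e.g. from a CRM.
        "account_owner": crm.get(customer_id, {}).get("account_owner"),
    }


# Hypothetical sample data for illustration only.
orders_db = {"c-1": ["order-10", "order-11"]}
warehouse = {"c-1": {"lifetime_value": 4200}}
crm = {"c-1": {"account_owner": "sam"}}

print(customer_360("c-1", orders_db, warehouse, crm))
```

The join itself is trivial once the data is reachable in one place; the cost Fredrik points to is in getting it there.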

Dhruba Borthakur:

Moving faster is definitely something I think you see from most users of analytics systems, right, whether it's real-time or not. But I was chatting about this with Gio the other day, and we were just chatting about some general-purpose data challenges that Gio was mentioning. What was the exact thing that you were asking me, Gio, the other day?

Giovanni Tropeano:

So, the fact that this is an analytics podcast, a lot of developers listen to us, right, and are curious about what their peers are facing. So, not only have you been involved in leading teams and heading organizations, engineering teams, but you're also investing in companies that are helping companies manage their data. What are some data challenges that engineering teams are facing today? What are you hearing from teams that you've worked with, teams that you've led, and teams that you're running; and how are they overcoming those data challenges?

Fredrik Bjork:

I mean, I think some of them we've already discussed, but it's just the communication overhead. Say you have an organization where you have maybe a data engineering team that owns the analytical data, and then you have engineering teams or squads or pods owning the real-time data, so to speak, whether it's in MySQL or Postgres or Dynamo. Getting them to talk is usually where the breakdowns happen or the time consumption happens, and they have to agree upon a standardized format to know how to consume that data, and then schema. Is there a schema even? Right? Maybe the schema changes. No one knows about it. Are they using Avro for Kafka or not?

Fredrik Bjork:

Those are all things that have to be in place before you can really start to move faster. Right? So, the first few iterations are going to go a lot slower; but once you're up and running, it's usually faster. But still there's infrastructure. Right? So, you've got to deploy an API. For a data engineering team, that might not be their strength. Right? They're probably not writing GraphQL queries. Now, they need to learn how to deploy that. What's available to them? Can they use Lambda or Kubernetes? And that might be a whole new thing.

Fredrik Bjork:

So, who does what is usually a pretty big time sink. Ideally, we're all full-stack. But in practice, that's not how it works because you can't have front-end developers doing the data engineers' work and vice versa. So, that's where I would say the biggest challenges are. And the way to overcome that is, well, I guess, A, use a managed solution if there is one. Use a framework or document what you do. But in general, try to build reusable systems. Right?

Fredrik Bjork:

So, ideally getting to self-serve where you can automate as much as possible so that if I want to display data from two or three different data sources, it should not take weeks. It should take me a few minutes to start testing and then a few days to get to staging and maybe a week or two to production. But that's for sure not happening in reality, at least not in my experience.

Giovanni Tropeano:

I like to ask folks that are guests, right, if you could give one piece of advice, what would it be? I think that was a good piece of advice that you just gave. So, you just answered that question without me asking it. In order to do that though, what are some steps from an engineering leadership perspective? Right. You've been a director of engineering. You've been a CTO. What advice then would you give to leaders to take those steps? Right? Because it's the leaders that have to kind of understand the challenges and make life easier for their teams.

Fredrik Bjork:

Yeah. I mean, I'd say this one is more challenging, because where does the data live? Who reports to who? Sometimes BI or data engineering, the data team, lives under marketing or even under the CFO. That could create a lot of friction, or things can go quite a bit slower. Maybe they should move under engineering instead? What does that mean? That's a pretty significant obstacle to overcome.

Fredrik Bjork:

I guess. Yeah. There's probably more things that you can do to improve it. Obviously, documentation, automating it, and achieving the self-serve, right, that should be the goal so that data isn't just sitting there siloed and is protected by three or four people and people are scared to ask them. No. Data is for everyone, and we own it. Right? There's no he or she or it. It's just we. Data is for us to use, and then go get it.

Giovanni Tropeano:

Yeah. So, it's almost like the operational side of it is facilitating that, streamlining it, providing access to the right people at the right time so that they have the self-service to be able to do something with the data versus waiting around. I have a project to build, a tool to build; and I can't build it because I'm going back and forth with trying to figure out access. We hear that a lot in both large and small enterprises.

Giovanni Tropeano:

And you mentioned the hierarchy. Where does the data live? Who owns the data, documentation, automation? I mean, these are all things that engineering leaders need to answer, questions that they need to answer and solve in order to make their data culture move at the speed of real-time.

Fredrik Bjork:

Yeah. I mean, I like to think of it as usually you have analysts or even business users with SQL skills. Whether it's Tableau or Looker, they can query the analytical data. It should be the same for real-time use cases with features. Right? A developer or product manager should be able to say, "All right. I want to use this in my feature," and it should be readily available. If it's not, I mean, that's an opportunity to invest in. But it's exciting to see, because I'd say it's a fairly recent trend, maybe the last three years, where it's actually happening.

Giovanni Tropeano:

That's super interesting. Well, that'll do it for this episode of the Rockset real-time analytics podcast. Fredrik and Dhruba, thank you so much for your time and your insights today. I really appreciate it. If you are listening to this podcast and you found the episode insightful, please share the episode. Help us share the thought leadership and the insights on real-time analytics with your team, your peers. So, why wait?

Giovanni Tropeano:

The real-time analytics podcast is brought to you by Rockset. At Rockset, we're building a real-time analytics cloud-based platform. Check us out at rockset.com. If you want to try real-time analytics today, feel free to give us a shot. You can sign up for free for two weeks with the $300 free trial credits. Once again, thank you for joining, gentlemen. Thank you to the audience for listening, and stay tuned for our next episode. Cheers.

Dhruba Borthakur:

Thanks a lot. Bye, now.

Fredrik Bjork:

Thank you. Bye.
