It's theCUBE live in Las Vegas, covering AWS re:Invent 2022. This is the first full day of coverage. We will be here tomorrow and Thursday, but we started last night, so hopefully you've caught some of those interviews. Lisa Martin here in Vegas with Paul Gillin. Paul, it's great to be back. We just saw a tweet from a very reliable source saying that there are upwards of 70,000 people here at re:Invent '22. I think there's 70,000 people just in that aisle right there. I think so. It's been great so far. What are some of the things that you have been excited about today? Data. I just see data everywhere, which very much relates to our next guest. Companies are realizing the value of data, the strategic value of data; we need to treat it as an asset rather than just exhaust. I see a lot of focus on app development here and building scalable applications. Now developers have to sort of reorient themselves toward building around the set of cloud-native primitives, and I think we'll see some amazing applications come out of that. Absolutely we will. We're pleased to welcome back one of our alumni to the program. Shinji Kim joins us, the CEO and founder of Select Star. Welcome back, Shinji. It's great to have you. Thanks for being back. Great to be back. So for the audience who may not know much about Select Star, before we start digging into all of the good stuff, give us a little overview about what the company does and what differentiates you. Sure. So Select Star is an automated data discovery platform. It acts like a Google for data scientists, data analysts and data engineers, helping them find and understand their data better. A lot of companies today, like what you mentioned, Paul, have hundreds or thousands of database tables and are swimming through a large volume and variety of data.
And it's getting harder and harder for people who want to utilize data, make decisions around data and analyze data to truly have the full context of where this data came from, who's using it inside the company or what other analysis might have been done. So Select Star's role in this case is that we connect to different data warehouses and BI tools, wherever the data is actually being used inside the company, and bring all the usage analytics, the pipelines and the models into one place. So anyone can search through what's available and how the data has been created, used and analyzed within the company. That's why we say it's kind of like your Google for data. What are some of the biggest challenges to doing that? I mean, you've got data squirreled away in lots of corners of the organization: Excel spreadsheets, thumb drives, cloud storage accounts. How granular do you get and what's the difficulty of finding all this data? So today we focus primarily on cloud data warehouses and data lakes. This includes data warehouses like Redshift, Snowflake, BigQuery, Databricks and S3 buckets, where a lot of the data from different sources is arriving, because this is one area where a lot of analysis is now being done. This is the place where you can join other data sets within the same infrastructure umbrella. And so that is one portion that we always integrate with. The other part that we also integrate a lot with is the BI tools. So whether that's Tableau, Power BI or Looker, where you are running analyses and building reports and dashboards, we will pull out which analyses have been done and which business stakeholders are consuming that data through those tools. So Lisa, you also asked about the differentiation.
I would say one of the biggest differentiators that we have in the market today is that we were born in the cloud, so it's a very cloud-native, fully managed SaaS service, and it's really focused on the user experience of how easily anyone can search and understand data through Select Star. In the past, data catalogs as a sector have been primarily focused on inventorying all your enterprise data, which sits in many disparate sources. So it was more focused on the technical aspect of the metadata. At the same time, this quote-unquote enterprise data catalog is now important and needed even for smaller companies, because they are dealing with tons of data. Another part that we also see is more of a democratization of data. Many different types of users are utilizing data, whether they are fully technical or not. So we put an emphasis on how to make our user interface as intuitive as possible for business users or non-technical users, while also bringing out as much context as possible from the metadata and the logs that we have access to, to surface these insights for our customers. Got it. What was the impetus or the catalyst to launch the business just a couple of years ago? Yeah, so prior to this, I had another data startup called Concord Systems. We focused on a distributed stream processing framework. I sold the company to Akamai, and the product is now called IoT Edge Connect. Through Akamai, I started working with a lot of enterprises in automotive and consumer electronics. And this is where I saw a lot of the issues starting to happen when enterprises try to use their data. With the help of a lot of cloud providers, the collection, storage and processing of data, and scaling all of that, is not going to be as much of a challenge anymore.
At the same time, I got to realize that a lot of enterprises were sitting on top of a ton of data that they may not know how to utilize, or even know how to give access to, because they are not 100% sure what's really inside. And more and more companies, as they build up their cloud data warehouse infrastructure, are starting to run into the same issue. So this was a gap in the market that I wanted to fill, and that's why I started the company. I'm fascinated with some of the mechanics of doing that in March of 2020, when lockdowns were happening worldwide. You're starting a company, you have to get funding, you have to hire people, you don't have a team in place presumably, so you have to build that as free to core. How did you do all that? Yeah, that was definitely a lot of work, just like starting from scratch. But I had been growing this idea for, I would say, three or four months prior. I had a few other ideas. Basically, after Akamai, I took some time off, and then when I decided I wanted to start another company, there were a number of ideas that I was going around with. And so in late 2019, I was talking to a lot of different potential customers and users to learn a little bit more about whether my hypothesis around data discovery was true or not, and that kind of led into starting to build prototypes and designs and showing them around to see if there was interest. So it was only after all those validations and conversations were in place that I truly decided that I was going to start another company, and the timing just happened to be the end of February or early March. So that's kind of how it happened. At the same time, I'm very lucky that I had a number of investors that I kept in touch with, and I kept them posted on how this process was going.
And that's why, even though during the pandemic it was definitely not an easy thing to raise our initial seed round, we were able to close it and then move on to really start building the product in 2020. Now you were also entering a market where there are quite a few competitors already. What has been your strategy for sort of getting a foot in the door, getting some name recognition for your company, other than being on theCUBE? Yes, this is certainly part of it. So I think there are a few things. One is that when I was doing my market research, and even today, there are a lot of customers out there looking for an easier, faster time-to-value solution in the market. Today, existing players and legacy players have a whole suite of platforms. However, the implementation time for those platforms takes six months or longer, and they are not necessarily built for a lot of users to use. They are built for database administrators or more technical people, so customers end up finding their, quote, unquote, data governance project not necessarily succeeding or getting as much value out of it as they were hoping for. So this is an area where we really try to fill the gaps, because with us, from day one, you will be able to see all the usage analysis, what your data models look like, and the analysis right up front. And this is one part that a lot of our customers really like, and some of those customers have moved from the legacy players to Select Star, too. Interesting. So you're actually taking business from some of the legacy guys and gals that may not be able to move as fast and quickly as you can. But I'd love to hear, you know, every company these days has to be a data company. Whether it's a grocery store or obviously a bank or a car dealership, there's no choice anymore.
As consumers, we have this expectation that we're going to be able to get what we want self-service, so these companies have to figure out where all the data is, what the insights are, what they say, and how they can act on that quickly. And that's a big challenge, to enable organizations to be able to see what it is that they have, where the value is, and where the liability is as well. Give me a favorite customer story example that you think really highlights the value of what Select Star is delivering. Sure. So one customer that we helped and have been working with closely is Pitney Bowes. It's one of the oldest companies, a 100-year-old company in logistics and manufacturing. They have tons of IoT data that they collect from parcels, all the tracking and all the manufacturing that they run. They moved to a cloud data warehouse recently, I would say a couple of years ago, and this is where their challenge around managing data really started, because they have many different teams accessing the data warehouse, and different teams may be creating things that have already been created before. It's not clear to the other teams, and there was no single source of truth that they could manage. So as they were starting to look into implementing a data mesh architecture, they adopted Select Star. Being a very large and also mature company, they considered a lot of other legacy solutions in the market as well, but they decided to give it a try with Select Star, mainly because of all the automated data modeling and documentation that we were able to provide up front. With the implementation of Select Star, they now claim that they save more than 30 hours a month for every person on the data management team, and we have a case study about that. So this is one place where we save a lot of time for the data team as well as all the consumers that data teams serve.
I have to ask you this, as a successful woman in technology, a field that has not been very inviting to women over the years: what do you think this industry has to do better in terms of bringing along girls and young women, particularly in secondary school, to encourage them to pursue careers in science and technology? Like what could they do better? What could this industry do? What do these people, these 70,000 people here, need to do better? Of which maybe 15% are female. Yeah, so actually I do see a lot more women and minorities in the data analytics field, which is always great to see, also bridging the gap between the technology and the business point of view. If there is one takeaway, I feel like just making more opportunities for everyone to participate is always great. I feel like, as in any industry, a lot of people tend to congregate with people that they know or in more closed groups, but having more inclusive, open groups that are inviting regardless of level or gender is definitely something that needs to be encouraged more overall in the industry. I agree, I think the inclusivity is so important, but it also needs to be intentional. We've done a lot of chatting with women in tech lately and we've been talking about this very topic, and they all talk about inclusivity, diversity and equity, but it needs to be intentional by companies to be able to do that. Right, and I think that in a way, if you were to put it as "women in tech," then I feel like that's also making it more exclusive. I think it's better when it's focused on the industry problem or the subject matter, but then intentionally inviting more women and minorities to participate, so that there's more exchange with more diverse attendees at the events.
That's a great point, and I hope, to your point, that one day we're able to get there, where we don't have to call out women in tech and it is just a much more even playing field. I hope, like you, that we're on our way to doing that. But it's amazing that, as Paul brought up, you started the company during the pandemic. Also, as a female founder, getting funding is incredibly difficult, so kudos to you for all the successes that you've had. Tell us what's next for Select Star before we get to that last question. Yeah, we have a lot of exciting features that have been recently released and that are also coming up. First and foremost, we have an auto-documentation feature that we recently released. We have a fairly sophisticated data lineage function that parses through activity logs and SQL queries to show you what the data pipeline models look like. This allows you to tell what the dependencies of different tables and dashboards are, so you can plan your migration or any changes that might happen in the data warehouse and nothing breaks whenever these changes happen. We went one step further than that to understand how the data replication actually happens, and based on that, we are now able to detect which data sets are duplicated and how each field might have changed its data values. If the data actually stays the same, then we can also propagate the same documentation as well as tagging. So this is particularly useful if you are doing something like PII tagging. You just mark one thing once, and based on the data model we will also tag the rest of the PII that it's associated with. So that's one part. The second part is more on the security and data governance front. We are releasing policy-based access control, where you can define who can see what data in the catalog based on their team, tags and how you want to define the model. So this allows more enterprises to have different teams work together.
And last but not least, we have more integrations that we're releasing. We have an upgraded integration now with Redshift, so that there is an easy CloudFormation template to get it set up, and we have now added Databricks and Power BI as well. So there is lots of stuff coming up. Man, you have accomplished a lot in two and a half years, Shinji. My goodness. Last question for you. Describing Select Star in a bumper sticker: what would that bumper sticker say? So this is on our website, but yes, "automated data catalog in 15 minutes" is what I would call it. 15 minutes, that's awesome. Thank you so much for joining us back on the program, reintroducing our audience to Select Star, and again, congratulations on the successes that you've had. You have to come back, because what you're creating is a flywheel and I can't wait to see where it goes. Awesome, thanks so much for having me here. Our pleasure. For Shinji Kim and Paul Gillin, I'm Lisa Martin. You're watching theCUBE, the leader in live enterprise and emerging tech coverage.
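
The lineage and PII-tagging idea described in the interview (parse SQL queries to learn which columns feed which, then mark a column once and carry the tag downstream) can be illustrated with a small Python sketch. This is not Select Star's implementation, which the interview does not detail. The query log, the table and column names, and the helpers `build_column_lineage` and `propagate_tag` are all hypothetical, and the toy regex only understands single-source `CREATE TABLE ... AS SELECT` statements with plain column names.

```python
import re
from collections import defaultdict, deque

# Hypothetical query log; a real warehouse activity log is far richer.
QUERY_LOG = [
    "CREATE TABLE staging.users AS SELECT id, email FROM raw.users",
    "CREATE TABLE analytics.signups AS SELECT email, created_at FROM staging.users",
    "CREATE TABLE analytics.orders AS SELECT order_id, total FROM raw.orders",
]

# Toy pattern (an assumption): one target, one source, unaliased columns.
CTAS = re.compile(
    r"CREATE\s+TABLE\s+(?P<target>[\w.]+)\s+AS\s+SELECT\s+(?P<cols>.+?)"
    r"\s+FROM\s+(?P<source>[\w.]+)",
    re.IGNORECASE,
)


def build_column_lineage(queries):
    """Map each (table, column) to the downstream (table, column) pairs copied from it."""
    downstream = defaultdict(set)
    for query in queries:
        match = CTAS.search(query)
        if match is None:
            continue  # skip statements the toy parser does not understand
        source, target = match["source"], match["target"]
        for col in match["cols"].split(","):
            col = col.strip()
            downstream[(source, col)].add((target, col))
    return downstream


def propagate_tag(downstream, start):
    """Breadth-first walk: return every column reachable from `start`, including it."""
    tagged = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, ()):
            if child not in tagged:
                tagged.add(child)
                queue.append(child)
    return tagged


lineage = build_column_lineage(QUERY_LOG)
# Mark raw.users.email as PII once; copies downstream inherit the tag.
pii_columns = propagate_tag(lineage, ("raw.users", "email"))
for table, column in sorted(pii_columns):
    print(f"PII: {table}.{column}")
```

In this sketch the tag follows a column only when it is copied under the same name. A production system would need a real SQL parser, plus a check that the values are actually unchanged, before propagating documentation or tags.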