Hi everyone, I am Kumar Shruan and I work as a data engineer at DataWeave, a company that provides competitive intelligence as a service to brands and retailers. At DataWeave we crawl millions of products across different e-commerce sites and match them to provide competitive insights to our clients. For example, an online fashion retailer might want consumable information on the pricing of certain products listed across their competitors: how often the prices vary, what sort of discounts are being offered, and whether these products are even promoted by their competitors. To provide this kind of information, you first need to identify those products and match them across different e-commerce sites. Typically, text-based information like titles and descriptions is used for product matching, but in the fashion vertical these titles and descriptions are not detailed enough to capture all the variations. For example, on Myntra you can have more than 1,000 products named "white check collar shirt". So we use deep learning algorithms to do image matching in the fashion vertical.

So, what is image matching? Given millions of products listed across different e-commerce sites, it is finding the products whose images look similar, or exactly the same, as a given seed image. If you look into the overall image matching process, it can be divided into two subproblems. The first is identifying the semantic and categorical attributes associated with an image: semantic attributes like what type of sleeves or what type of neck is present in a given image, and categorical aspects like whether it belongs to the Batman category or the Superman category. Apart from that, there are fine-grained, minute features associated with an image which these semantic and categorical attributes fail to capture.
For example, how two striped shirts are different, what the differences between two graphic shirts are, or how two floral shirts differ from each other. With recent developments in deep learning, deep learning architectures are working quite well at capturing these variations in an image. If you look at how people presently do image matching, they generally train a deep learning model to identify both the semantic and categorical aspects of an image, and at the same time identify the fine-grained features associated with it. But compared to taking a deep learning architecture like AlexNet, Inception, or VGGNet and extracting fine-grained features from it, developing a model that captures every type of semantic attribute that can be associated with an image is both complex and expensive. Complex in the sense that we would be building a model which knows all types of patterns, all types of sleeves, and all types of necklines, and that too at a very large scale. While most people can afford big GPU clusters, generating the training dataset is still hard, and even the accuracy of such a model is questionable, because here we would be trying to capture every variation present across geographies and replicate that in the training dataset. One thing which is unique when you work with e-commerce sites is the availability of a lot of titles and descriptions associated with a given product. So at DataWeave we asked ourselves: can we use these titles and descriptions along with the deep learning features to boost our image matching capability in the fashion vertical? The solution we arrived at was Solr, an open-source enterprise search platform well known for its NLP capabilities.
Like any big data search platform, it provides a real-time indexing mechanism out of the box. Solr is a standalone full-text search platform powered by the Lucene library, and it provides REST-like APIs for most queries over its data. So, first of all, how are we going to use Solr, which is meant for natural language processing, to store deep learning features? We extracted the final-layer feature vector from AlexNet, Inception, or VGGNet and indexed it as a multi-valued numerical list.

While indexing, we also wanted to group products that share similar-looking images in terms of their deep learning features. There are two reasons for doing that. First, we wanted to save some time in the complex re-ranking process, which I am going to discuss later, and at the same time we wanted a way to group similar-looking products together. One data structure that solves this problem is a hash function. So we started experimenting with different hash functions known to take a numerical feature vector as input and group similar-looking products: locality-sensitive hashing, kernelized locality-sensitive hashing for images, and self-taught hashing. Locality-sensitive hashing and kernelized locality-sensitive hashing try to group products by dividing the very high-dimensional feature space into multiple buckets. Self-taught hashing is unique in the sense that it uses a combination of both supervised and unsupervised algorithms. As you can see on the slide, for a given corpus there is an unsupervised mechanism to learn the possible hash bits to be associated with each document, and then a supervised mechanism to learn the mapping from the document corpus to these hash bits. Once trained, this model can be used as a hash function for any new incoming document.
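The bucketing idea behind locality-sensitive hashing can be sketched in a few lines. This is an illustrative random-hyperplane variant, not the self-taught hashing the talk actually settles on; all names here are made up for the sketch:

```python
import random

def make_planes(n_bits, dim, seed=0):
    # One random Gaussian hyperplane per hash bit.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_hash(vec, planes):
    # Each bit records which side of a hyperplane the feature vector falls on;
    # nearby vectors tend to fall on the same sides and so share a bucket.
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )
```

Vectors that point in similar directions produce the same bit pattern, so the bit tuple can serve directly as a bucket key.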
So, for our use case self-taught hashing worked best, and we indexed the self-taught hash bits along with the deep learning features and textual descriptions into Solr. Why self-taught hashing? It is a semantic hashing technique, in the sense that we want to create buckets of products which share similar semantic attributes. For example, you can have a bucket of similar-looking t-shirts and a completely different bucket for shoes and flip-flops. Self-taught hashing also gives you more control over the hash function generation: you explicitly define the neighborhoods for a given product, and it uses a binarized Laplacian Eigenmap decomposition to generate the hash bits. And since a machine learning algorithm is involved, you can expect that similar-looking image features will be mapped to the same hash bucket.

Once we had indexed the textual descriptions, the deep learning features, and the corresponding self-taught hash bits, we wanted a proper search mechanism over the indexed dataset. For this task we needed a new request handler to handle all incoming requests. Whenever a request comes in, we create a bucket of similar-looking products based on hash bounds over the self-taught hash bits we have already indexed. Then we narrow that bucket down to a small subset of documents based on attributes extracted from titles and descriptions. For example, you can explicitly ask for "white check collar" to be mentioned in the document. Once you have narrowed down the subset, you re-rank the candidates against the seed image of the query, based on the deep learning features already indexed in the database. One unique thing about this mechanism is the tight coupling between the stored dataset and the algorithm.
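As a minimal sketch, a document combining the three kinds of fields might be assembled like this before being POSTed to Solr's /update handler. The field names and the hash encoding are my assumptions for illustration, not the actual schema from the talk:

```python
import json

def build_solr_doc(product_id, title, features, hash_bits):
    # Hypothetical field names: the feature vector goes into a multi-valued
    # numeric field and the self-taught hash bits into an integer field,
    # alongside the textual title.
    return {
        "id": product_id,
        "title_t": title,
        "dl_features_fs": [float(x) for x in features],        # multi-valued floats
        "sth_hash_i": int("".join(str(b) for b in hash_bits), 2),  # bits packed as int
    }

doc = build_solr_doc("sku-123", "white check collar shirt", [0.12, 0.98], [0, 1, 1, 0])
payload = json.dumps([doc])
# This JSON list would then be sent to e.g.
# http://localhost:8983/solr/products/update?commit=true
```

Packing the hash bits into an integer is one convenient choice because it makes range filters over hash bounds trivial to express in Solr.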
So, instead of bringing the data from one server to another, we have moved the whole computation engine to the data. We save a lot of the latency that would otherwise come from fetching the deep learning features, which are lists of 2048 doubles.

Now I want to show you some examples where this combined text-and-image mechanism helps. For a given Batman seed shirt, if you go purely by image matching you might land on a Superman shirt which looks very similar to your seed image. With pure text matching you might land on a product which belongs to the Batman category but has a completely different image; it might just be that Batman, as a human figure, is present in that image. But when you combine both text and image, you land on an exactly matching product. Similarly, there can be a denim shirt where the defining attribute is stars. Image matching alone might land on a denim shirt that looks very similar to your seed image, and text matching might land on products where the attribute is present but the image looks completely different. With both together, you land on an exactly matching product.

The following is a simple request call to the /select request handler which comes with Solr. As you can see, in the query part I asked for "Batman" to be mentioned in the title, with some basic filters on the MRP. In the response section the scores are not normalized, so you don't know how similar each product is to the given seed query, and just by looking at the titles you can't decide which product is similar or how similar it is going to be. Next, here is the same search through the mechanism we have developed.
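The re-ranking step described here, cosine similarity between the seed's feature vector and each candidate's, with a deep learning cutoff, can be sketched as follows; the function names and default cutoff are illustrative:

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rerank(seed_vec, candidates, dl_cutoff=0.8):
    # candidates: list of (doc_id, feature_vector) pairs from the hash bucket.
    # Drop anything below the cutoff, then sort best-first by similarity.
    scored = [(doc_id, cosine(seed_vec, vec)) for doc_id, vec in candidates]
    kept = [(d, s) for d, s in scored if s >= dl_cutoff]
    return sorted(kept, key=lambda t: t[1], reverse=True)
```

Because the scores are cosines, they are already normalized to [-1, 1], which is exactly the property the talk contrasts against Solr's raw /select scores.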
So, here we made a query on the URL, which is the unique identifier for the given product, and we defined a few basic filters on the MRP and a deep learning cutoff, meaning the overall similarity should be at least this much. As you can see, although we are getting similar documents, and this score is a normalized one comparing the candidate with the seed, there is still scope for filtering: for instance, I might additionally require a term like "Batman" to be mentioned in the title.

In the last example, I have explicitly asked for "Batman" and "black" to be mentioned in the title, and HL and HU are the lower and upper hash bounds over the indexed hash; here it is 0 to 4. Similarly, there are filters on the price, and DLC is the deep learning cutoff: the cosine similarity computed between the indexed candidate and the seed image should be at least this much. As you can see in the response part, "black" and "Batman" are mentioned in the titles, and the scores give you an idea of how similar each candidate is to the seed.

Any questions? If anyone has a question, please raise your hand. A couple of hands are raised over here. We don't have a lot of time for questions; we only have time for one.

[Audience question, inaudible]

Okay, so we haven't tried it yet, but we will experiment with it.

[Audience question about using deep features alone]

So if you look at the overall process, we basically index the subset we want to search over, and also that seed image. The seed, as you can see on the slide, is a unique identifier for the URL we want to search against. Internally we fetch its deep learning features and, based on the hash bounds, we know the possible candidates that are going to be close to the given seed. Then we compute cosine similarity between the seed's deep learning features and the candidates'.
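A call to such a handler might be assembled roughly like this. The parameter names seed, hl, hu, and dlc come from the talk, but the handler path, the filter syntax, and everything else about the interface are assumptions for the sketch:

```python
from urllib.parse import urlencode

def build_match_query(base_url, seed_url, hl, hu, dlc, title_terms=None):
    # seed: unique product URL; hl/hu: lower and upper hash bounds;
    # dlc: minimum cosine similarity between candidate and seed features.
    params = {"seed": seed_url, "hl": hl, "hu": hu, "dlc": dlc}
    if title_terms:
        # Require every term to appear in the title (Lucene-style filter query).
        params["fq"] = "title:(%s)" % " AND ".join(title_terms)
    return base_url + "?" + urlencode(params)

url = build_match_query(
    "http://localhost:8983/solr/products/imagematch",  # hypothetical handler
    "http://example.com/p/batman-tee",
    hl=0, hu=4, dlc=0.8,
    title_terms=["batman", "black"],
)
```

This mirrors the example query in the talk: hash bounds of 0 to 4, a deep learning cutoff, and "Batman" and "black" required in the title.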
So while indexing itself, we pass the image through the deep learning architecture and extract the features from its final layer.

[Audience question]

Yeah, so obviously you can't do an O(N) search over all the data you have; that is why there is also the self-taught hash mechanism. For any given seed image, we first find the likely search bucket, the images that are going to be most similar to your seed, and then we re-rank them: among the 10 or 20 documents in that bucket, which one is the most similar. And the unique thing here is that you can even refine that bucket. If, say, 200 products have landed in your bucket, you can ask for certain attributes to be present in the title or description, narrow down that subset, and finally re-rank the remaining documents.

[Audience question about preprocessing]

So before finding the hash bits and before extracting the deep learning features, we do a set of preprocessing steps on the images. We remove the background, and if the product is, say, a t-shirt, we focus on just the t-shirt part of it, and then we extract the deep learning features. So a lot of preprocessing goes on before the deep learning features and hash bits are computed, but it's not strictly necessary. If, for example, two people are present in a given image, we identify each person, identify the clothing parts, and extract those regions separately. We have an object detection mechanism which gives us a confidence score for each detected clothing part, so we can index both of these feature sets together into Solr.

[Audience question] Hello, you talked earlier about the hash function; what kind of function do you use for that?
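The "focus on just the clothing part" step can be illustrated with a trivial crop helper; the detector itself, which would supply the bounding box and its confidence score, is out of scope and the representation here is a plain nested list standing in for an image:

```python
def crop_to_detection(image, box):
    # image: list of pixel rows; box: (x0, y0, x1, y1) from a clothing
    # detector. Features would then be extracted from just this region
    # rather than from the full image.
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]
```

With multiple detections (say, two people in one image), each box is cropped and featurized separately, and both feature sets can be indexed for the same product.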
So in our use case, the function needs to group similar-looking products into the same bucket. One hash function well known for this purpose is locality-sensitive hashing, but the problem with locality-sensitive hashing is that you don't have much control over the hash function generation, and we needed a mechanism where the correlation between the deep learning features could be preserved. That's why we went with self-taught hashing.

[Audience question]

So the hash is not the only filter that operates on the documents; we have multiple filters, built on the titles and descriptions. For example, when trying to find an exactly matching product, we are interested in the same color of product, and you can easily find that from the title and description, because most of the time on e-commerce sites these attributes are present. Generally we go for a high-recall system with the self-taught hashing mechanism, we want at least 90% or 92% recall at the self-taught hash stage, and then we narrow it down: if 2,000 or 3,000 products are present in your bucket, we filter out those where "black" or "blue" is not mentioned. Once you have a small subset, you can re-rank it, because re-ranking is computationally very expensive.

[Moderator] Just one question please, we have about one minute left.

It's basically a unique identifier for that product; it can be a hash value of the product. HL and HU are the lower and upper hash bounds that you want applied over the products.

[Audience question about improving on Solr]

So one reason we went with Solr is that it provides all of the text-based NLP capability we wanted out of the box.
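The narrowing-down step, dropping bucket candidates whose text does not mention a required attribute such as a colour, might look like this sketch; the field names are illustrative:

```python
def filter_by_attributes(candidates, required_terms):
    # Keep only candidates whose title or description mentions every
    # required attribute term (case-insensitive substring match).
    terms = [t.lower() for t in required_terms]
    kept = []
    for doc in candidates:
        text = (doc.get("title", "") + " " + doc.get("description", "")).lower()
        if all(t in text for t in terms):
            kept.append(doc)
    return kept
```

In the pipeline described, this cheap text pass shrinks a bucket of a few thousand hash-recalled products down to the handful that are worth the expensive cosine re-ranking.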
So we haven't experimented with the plastic search but I think that present mechanism can be implemented in that also. Okay, the accuracy. So if you look into our company, we have a human in the loop process. So what we basically do, we find out a small subset of document and we show it to the human and they have the capability to basically search to the whole database based on the attributes which they think that it's a unique identifier for that given product. And once they identify that the product is exact matching, they choose that. So using human in the loop basically it's more than 90%. I think you might need to take this offline if you have another talk coming up. And we're just going to set up now. Thank you very much. I think next time we're going to have to put you in first slightly longer session.