Inspiration

When I'm shopping for something that I hope will last me anywhere from a few months to a decade, I invest quite a bit of time in the research process before biting the bullet. I get hung up on making sure I'm getting everything I need from whatever that product might be, and that I'm not missing out on a better (or cheaper!) alternative. In the past I've gone as far as making spreadsheets to navigate purchasing options for new software.

But that process is way more burdensome than it has to be. It means manually digging through pages for descriptions of software capabilities or the construction material of a desk, and spending countless hours poring over reviews, just to find every little difference that I should be aware of as a consumer.

That's where Due Diligence comes in. It does some of that research for you, jumpstarting the process of picking a new product. We don't try to replace the process entirely; instead, the app gives you a starting point for comparison and a handy place to begin shopping.

What it does

Due Diligence lets you throw out some search terms and gives you a digestible comparison of the resulting products and their specifications. It produces a readable table comparing a few options you might be interested in, listing their prices and highlighting the traits where they differ. That's where Due Diligence excels: it looks for products that share a classification and can therefore be compared, then places their differences side by side in a tabular view so you can easily spot the features you care about.
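To make the "highlighting differences" idea concrete, here is a minimal sketch of how differing traits could be picked out of a set of product specifications. It is not the app's actual code: the function names `differing_traits` and `comparison_rows`, the dictionary layout, and the sample desks are all assumptions made up for illustration.

```python
def differing_traits(products):
    """Return trait names whose values are not identical across all products."""
    traits = []
    for specs in products.values():
        for trait in specs:
            if trait not in traits:
                traits.append(trait)  # keep first-seen order
    differing = []
    for trait in traits:
        # A trait missing from one product shows up as None, so it counts as a difference.
        values = {specs.get(trait) for specs in products.values()}
        if len(values) > 1:
            differing.append(trait)
    return differing


def comparison_rows(products):
    """Build table rows: a header row, then one row per differing trait."""
    names = list(products)
    rows = [["trait"] + names]
    for trait in differing_traits(products):
        rows.append([trait] + [str(products[n].get(trait, "-")) for n in names])
    return rows


# Hypothetical sample data; real specs would come from the scraped listings.
example = {
    "Desk A": {"price": "$120", "material": "oak", "width": "48 in"},
    "Desk B": {"price": "$95", "material": "pine", "width": "48 in"},
}
for row in comparison_rows(example):
    print(" | ".join(row))
```

In this sample, the shared width is left out of the rows because both desks have the same value, which is the behaviour the comparison table is aiming for.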

A screenshot of a table generated by Due Diligence for the query "gucci slides"

How we built it

Due Diligence makes use of a wide array of technologies to make this possible.

  • Flask with Python 3.9 handles routing and basic server-side logic.
  • HTML, JavaScript and LessCSS (compiled to CSS on push by a GitHub Actions workflow) make up the frontend.
  • The Schema UI framework provides some handy layout components and CSS boilerplate that made the frontend a little smoother to implement.
  • Oxylabs' E-Commerce Scraper is used to snag raw data from Google Shopping search results.
  • That raw data is then manipulated and run through the Microsoft Semantic Kernel, with GPT-3.5 Turbo, to extract relevant specifications from product descriptions and put them in a machine-readable format.
  • Google App Engine provides a serverless hosting solution for the final Flask app.
  • I used Google's Cloud Build and GitHub Actions for CI/CD, to get recent versions of the app deployed as soon as I pushed new commits.
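The extraction step in the list above, turning a product description into a machine-readable format, can be sketched roughly as follows. This is only an illustration under assumptions: it assumes the machine-readable format is a flat JSON object, and `build_prompt` and `parse_specs` are hypothetical helper names. It deliberately leaves out the Semantic Kernel and Oxylabs calls themselves and only shows the prompt text and the defensive parsing of a model reply.

```python
import json


def build_prompt(title, description):
    """Assemble a hypothetical extraction prompt for one product."""
    return (
        "Extract the product's specifications as a flat JSON object "
        "mapping trait names to short string values. Reply with JSON only.\n"
        f"Title: {title}\n"
        f"Description: {description}"
    )


def parse_specs(raw_reply):
    """Parse a model reply into a dict of specs, tolerating stray text.

    Returns an empty dict when no JSON object can be recovered.
    """
    start = raw_reply.find("{")
    end = raw_reply.rfind("}")
    if start == -1 or end <= start:
        return {}
    try:
        data = json.loads(raw_reply[start:end + 1])
    except json.JSONDecodeError:
        return {}
    if not isinstance(data, dict):
        return {}
    # Normalise keys and values to strings so they can be compared later.
    return {str(k).strip().lower(): str(v).strip() for k, v in data.items()}


# A made-up reply with extra chatter around the JSON, as models sometimes produce.
reply = 'Sure! {"Material": "rubber", "Size": 10}'
print(parse_specs(reply))
```

Normalising the keys to lowercase strings is a design choice for this sketch: it makes it more likely that the same trait extracted from two different listings ends up under the same name.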

Challenges we ran into

Almost the entire tech stack used for Due Diligence was new to me. I have some experience with Python from starting out in CS, but I'd never used anything like the Semantic Kernel, and with only a surface-level understanding of LLMs, I spent a lot of time drafting prompts even for basic tasks.

I also spent a lot of time trying to balance the amount of information I could throw at the Semantic Kernel for better results against the time that processing then consumed. I still haven't found a great balance; even with small datasets (I have the app limited to 5 products per query right now), processing can sometimes take 10 to 20 seconds.

Accomplishments that we're proud of

Due Diligence makes use of a ton of different technologies, almost all of which were new to me just two days ago, and so I'm happy I managed to complete the project in the first place. That said, I'm also really happy with most of the results the app gives me. There's definitely some room for fine-tuning (and plenty of room for optimization), but for 36 hours, I think it's a great start.

What we learned

I learned to work with large language models, and I learned how much of a headache it can be to nudge them in the right direction through a bunch of marginal changes to prompt wording. But at the end of the day it was a really cool experience, and I'm already looking into other projects I can apply this tech to. I also learned how to pace myself for this hackathon--I spent most of Friday night trying to learn React before realizing that my pace wasn't going to work, so I doubled back and switched to a completely new technology stack to finish the project.

What's next for Due Diligence

There's of course plenty of optimization to be done, and I'd like to see what changes I can make to the LLM processing part of the application that would allow it to handle bigger datasets (with far more than 5 products simultaneously) and run much, much faster. I think fragmenting my prompts into a bunch of smaller functions--like extracting specifications from products individually, then comparing them--could be a good start.
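One way the "smaller functions" idea might look is sketched below: each product gets its own small extraction call, and the calls run concurrently before any comparison happens. This is an assumption about a possible direction, not something the app does today. `extract_all` and `fake_extract` are hypothetical names, and `fake_extract` is a stand-in for a real LLM request so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_all(products, extract_one, max_workers=5):
    """Run a per-product extraction function concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_one, products))


def fake_extract(product):
    """Hypothetical stand-in for one small LLM call per product."""
    word_count = len(product["description"].split())
    return {"name": product["name"], "words": str(word_count)}


# Made-up products; in the app these would come from the scraped search results.
products = [
    {"name": "Slide A", "description": "rubber sole, logo strap"},
    {"name": "Slide B", "description": "leather strap"},
]
specs = extract_all(products, fake_extract)
print(specs)
```

Threads are used here on the assumption that each extraction is dominated by waiting on a network response; if that holds, the total time should be closer to the slowest single call than to the sum of all of them.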

There were also a few features I had planned from the outset that didn't make it into the final version; I wanted a lists page where users could save comparisons to refer to later, and possibly a history tab where they could view past searches and build off of them. I'm planning to keep working on this for a while and see where it takes me.
