Inspiration
Publicly available reports exist in droves. Multinational companies and big hedge funds hire hundreds of analysts to study such reports and extract insights from them. For individuals or small investment firms, however, it is simply impossible to keep up with the latest trends and investment insights buried in these publicly available reports.
That is what inspired me to build a toolbox that lets organizations handle knowledge management efficiently - meaning less chaos and more productivity. In short, it would be like having a secret weapon that makes small firms and individuals sharper and more organized! To do this, I kept the following in mind:
- Analyze trends using AI
- Reports uploaded by users should be analyzed by AI and made readily available
- Trends should be extracted from them during the processing pipeline
- Streamline research through a chatbot
- Help users chat with each report individually and understand trends deeper
- It should also let users chat with the entire report repository and get real-time suggestions and trends
What it does
Here are some key features of the project:
- Trend Aggregation and Curation:
- Collects trends from various sources (reports, articles, blogs, etc.).
- Curates and organizes them into relevant categories (technology, health, finance, etc.).
- Search and Filtering:
- Allows users to search for specific trends or topics.
- Provides filters based on time, relevance, and popularity.
- AI-Driven Chatbot:
- An intelligent chatbot that understands natural language: users can ask questions about trends, reports, or any related topic.
- The chatbot can analyze trends and provide insights, for example: why a trend is emerging, its impact on industries or society, and how an emerging trend may affect users' investment portfolios and decisions.
- Report Summaries:
- Extracts key points from reports.
- Provides concise summaries for busy users.
How we built it
We built the platform with a React frontend, a Flask backend, and a Postgres database. Langchain was used liberally to create embeddings, summarize reports, and generate AI trend insights; it also drives the AI chat conversations. Databricks Workflows run the AI jobs, and Databricks-served models create the embeddings and run inference.
Architecturally, the product can be divided into two parts:
Upload a report:
- On upload, the frontend sends a request to the backend, which saves basic report metadata and triggers a Databricks Workflow.
- The workflow starts a job, spins up a cluster, and runs a Python/Spark notebook.
- Inside the notebook, a Databricks Volume and database are created to store the PDF files, Delta tables are created to store the PDF chunks, indexes are created to store the vector embeddings, and the DBRX and Llama 70B models are used to create embeddings and generate trends.
- The LLM chain summarizes the report, generates the top 5 trends, and runs sentiment analysis to estimate an overall greed index and the top stocks likely to be impacted.
- Finally, the trend data is formatted by Langchain and sent back to our backend, which stores it in the database.
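To make the notebook steps concrete, here is a minimal sketch of the chunking and prompting stages. All names and sizes here are illustrative assumptions, not our production code: the real pipeline uses Langchain splitters, and the embedding/LLM calls go to Databricks-served DBRX and Llama 70B endpoints (shown only as a prompt template here).

```python
def chunk_report(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split extracted PDF text into overlapping chunks for the Delta table.

    Overlap keeps sentences that straddle a boundary retrievable from
    either neighboring chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical prompt for the trend-generation step; the actual wording
# lives in our Langchain chain.
TREND_PROMPT = (
    "Summarize the report below, list the top 5 emerging trends, "
    "estimate an overall greed index (0-100), and name the stocks "
    "most likely to be impacted.\n\nReport:\n{report}"
)


def build_trend_prompt(chunks: list[str]) -> str:
    """Assemble the prompt sent to the served DBRX / Llama 70B endpoint."""
    return TREND_PROMPT.format(report="\n".join(chunks))
```

Each chunk produced this way is written to a Delta table, embedded, and added to the vector index; the assembled prompt drives the summary and trend generation.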
Chat with reports:
- The frontend calls the backend, which uses the Databricks VectorSearchClient to retrieve similar documents.
- Using those documents plus the chat history, a RAG chain runs chat inference with the Llama 70B model.
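The two steps above can be sketched as follows. The prompt-assembly function is a simplified stand-in for our Langchain chain, and the commented retrieval/inference calls are assumptions based on the Databricks Vector Search SDK, not verbatim from our backend:

```python
def build_rag_prompt(
    question: str,
    docs: list[str],
    history: list[tuple[str, str]],
) -> str:
    """Combine retrieved report chunks and chat history into one prompt."""
    context = "\n---\n".join(docs)
    dialogue = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in history)
    return (
        "Answer the question using only the report excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Conversation so far:\n{dialogue}\n\n"
        f"User: {question}\nAssistant:"
    )


# In the backend, roughly this sequence runs next (assumed API names,
# modeled on the Databricks Vector Search SDK):
#   from databricks.vector_search.client import VectorSearchClient
#   index = VectorSearchClient().get_index(endpoint_name=..., index_name=...)
#   hits = index.similarity_search(query_text=question,
#                                  columns=["chunk"], num_results=5)
#   prompt = build_rag_prompt(question, retrieved_chunks, chat_history)
#   answer = query_served_llama70b(prompt)  # served Llama 70B endpoint
```

Grounding the model in retrieved excerpts this way is what lets the chat answer questions about a specific report instead of hallucinating from general knowledge.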
Challenges we ran into
Three major challenges we ran into:
- The Databricks Community Edition doesn't support personal access tokens (PATs), which took some time to work around.
- Not all regions offer pay-per-use foundation models, so when we deployed in the Ireland region we weren't able to use the Databricks foundation models.
- Serving our own models during the trial period was a strict no-no, and for good reason: it prevents runaway bills from a barrage of API calls. Fortunately, the existing foundation models were good enough to run a RAG model on Databricks.
Accomplishments that we're proud of
Two accomplishments we are proud of:
- Building the product end-to-end and seeing it work as intended is something we are very excited about.
- Exploring the Databricks platform, learning about the DBRX models, and diving deeper into AI were also valuable takeaways from working on our project during the hackathon.
What we learned
- We learned a lot about the Databricks platform, its AI capabilities, vector search databases, workflows, and Python notebooks.
- We took a deep dive into the performance of the DBRX foundation models and used them directly in our project.
- We also learned how to build a full-fledged AI project end-to-end.
What's next for OnlyTrends
- Implement features like forums, discussion boards, and user-generated content.
- Foster collaboration among users by encouraging discussions, knowledge sharing, and community-driven content.
- Visualize trends through charts, graphs, and interactive dashboards.
- Notifications and Alerts:
- Notify users when new reports or trends are added.
- Customizable alerts based on user preferences.
- Integration with External Tools:
- Link to external resources (original reports, research papers, etc.).
- Connect with other platforms (social media, news aggregators).
Built With
- databricks
- flask
- javascript
- langchain
- python
- react