Inspiration

We both always wanted to learn a new language, specifically Japanese. So we have tried many things to help us improve upon our abilities, such as consuming a lot of Japanese content, taking classes and even playing a lot of Duolingo! However, it is a well known fact that in order to truly improve one needs to gain confidence in speaking the new Language, but what can you do if you are in one of the following 2 scenarios?

Scenario 1: You are too shy to start trying to speak to fluent speaker, afriad of what they might think. Scenario 2: You don't know anyone (or is not comfortable) who is fluent enough for you to truly learn to speak the language.

So we came up with PolyGlot Pals!

What it does

It is a chatbot which acts as a "friend" to let the user learn to speak a language while being in a safe space without any worry about being judged.

How we built it

Polyglot Pals uses AzureAi's OpenAi Chat completion in the Backend, along with some prompt engineering to generate responses in the language of choice (as of right now only supports Japanese).

Frontend

The frontend of the app is built just using a simple HTML, CSS and Javascript which depicts a minimalistic chatbot interface. This allows the user to both type their responses and also speak their responses to get a more realistic feeling of interacting with an actual person. In addition to that there is also a Text-To-Speech button below every response allowing the user to request for the message to be spoken to simulate a conversation further.

Backend

The backend is a simple Flask app, which takes the GET/POST requests from the Frontend, such as the voice input which then gets sent to AzureAI's OpenAi's Whisper to get a transcript from the audio input of the user and adds it into the textbox as a text entry to be reviewed before being sent into the chat.

It also takes the input of the text/chat which then gets add to the prompting history to allow remembering of the previous conversations in order to allow the app to have a conversation with the user. Which then gets send to OpenAi's chat completion and gets return/shown as a response to the user on the chat interface.

Finally, it also implements a Text-To-Speech functionality which allows the user to request for a message to be spoken. The request is made to Azure's cognitive services for speech, which then synthesizes the responses and returns an audio that reads the message,

Web Dev Ops

The app is hosted on Microsofts Azure services, using a simple Azure app hosting. Which is also where all the server-side connection and hosting of the app is allowing us to leverage the capabilities and functions that are readily available on Microsoft Azure Services.

Accomplishments that we're proud of

Having built the project in essentially 3 days, we are extremely proud of how seamlessly the app runs (assuming traffic is minimal due to our single server setup currently). As during testing I was able to having decent conversational practice with the chat bot in Japanese in order further my reading, listening and speaking capabilities of the language while actually getting responses and being asked questions without any worry about being judged. Achieving our main goal of the project perfectly!

What we learned

Coming into this Hackathon, neither of us had any experiences with full-stack development, as being Data Scientist we usually only work with the backend, training models and performing analysis on data. Which made it easy for us to understand the capabilities of the LLM and how to prompt it well. But we particular struggled with the deployment of the app on Azure and also the frontend development of the app, as it was not our strong suit. Through this project however, we were able to learn how to connect an app to the web and also how the connections between the backends and the frontend of applications work.

What's next for PolyGlot Pal

The next steps would be to implement the ability to select one languages ability between beginner, intermediate and advanced, which would be done by simply more prompt engineering in order to let the LLM/AI model understand the level of the responses it should be creating.

In addition to that, we will also be adding support for more languages then just Japanese, which also the ability to highlight certain phrases and text to show the translations to English (kind of like the one on Duolingo). Plus for languages like Japanese display furigana above all Kanji to make the readability of more beginners learners easy.

Finally, we would also like to implement a SQL backend and a User system to allow the chatbot to remember your previous response even after you logged of for the day so that you are able to continue your conversations after logging off.

Built With

  • azure
  • azure-cognitive-services
  • flask
  • gpt
  • openai
  • whisper
Share this project:

Updates