Inspiration

I played around with Azure AI Studio. It gave us the option to deploy a web app with GPT-3.5, but I wanted to deploy a web app with GPT-4 and Azure Vision.

So I looked for Azure GitHub repos that had already implemented most of the chat interface, and I found AzureChat: https://github.com/microsoft/azurechat

My plan was to implement TTS (text-to-speech) and video analysis on top of AzureChat.

That way, if you wanted to create your own ChatGPT, you would have two features that ChatGPT currently does not offer (well, it does, but not with this level of customization).

What it does

It lets you talk to OpenAI models, the main one being GPT-3.5, and when you get the response you can run text-to-speech on it.

You can change the TTS voice, its speed, and its emotion to your liking.
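For the curious, here is a minimal sketch of how those three knobs map onto Azure Speech SSML; the voice name, rate, and style below are example values, not the exact ones in my code:

```typescript
// Sketch: build an SSML string that controls voice, speed, and emotion.
// "en-US-JennyNeural", "1.2", and "cheerful" are example values; Azure's docs
// list which styles each neural voice actually supports.
function buildSsml(
  text: string,
  voice = "en-US-JennyNeural",
  rate = "1.2",
  style = "cheerful"
): string {
  return `<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts"
       xml:lang="en-US">
  <voice name="${voice}">
    <mstts:express-as style="${style}">
      <prosody rate="${rate}">${text}</prosody>
    </mstts:express-as>
  </voice>
</speak>`;
}
```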

How we built it

I used AzureChat (https://github.com/microsoft/azurechat) and added code on top of it for TTS via the Azure Speech API.
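The added code boils down to something like this sketch using the microsoft-cognitiveservices-speech-sdk package (the env var names are placeholders, not necessarily what the repo uses):

```typescript
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

// Sketch: synthesize a reply with the Azure Speech SDK.
// AZURE_SPEECH_KEY / AZURE_SPEECH_REGION are placeholder env var names.
function speak(ssml: string) {
  const config = sdk.SpeechConfig.fromSubscription(
    process.env.AZURE_SPEECH_KEY!,
    process.env.AZURE_SPEECH_REGION!
  );
  const synthesizer = new sdk.SpeechSynthesizer(
    config,
    sdk.AudioConfig.fromDefaultSpeakerOutput()
  );
  synthesizer.speakSsmlAsync(
    ssml,
    (result) => {
      // Close the synthesizer whether synthesis completed or was canceled.
      synthesizer.close();
      if (result.reason !== sdk.ResultReason.SynthesizingAudioCompleted) {
        console.error("TTS failed:", result.errorDetails);
      }
    },
    (err) => {
      synthesizer.close();
      console.error("TTS error:", err);
    }
  );
}
```

speakSsmlAsync takes SSML like the buildSsml output above, so the voice, speed, and emotion settings flow straight into the synthesis call.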

Challenges we ran into

Sadly, I could not implement video analysis. I started the project too late.

Accomplishments that we're proud of

I'm proud I was able to implement the TTS part. It can be fun and useful for reading mundane or exciting text out loud.

I could do this with ChatGPT, but ChatGPT does not have the voices I like, and, critically, as far as I know it doesn't offer an easy interface for changing the speed of the voice.

What we learned

I learned about Azure; this was my first time using it. I learned how to replace a provisioned Cosmos DB with a serverless one to save costs, how to use the Speech API, and how to use Deploy to Azure and manage resource groups and the like.
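As an illustration of the serverless switch, here is a hedged sketch with the @azure/arm-cosmosdb management SDK; I made the change through the Azure portal, and every name below is a placeholder, but the essential difference is the EnableServerless capability:

```typescript
import { DefaultAzureCredential } from "@azure/identity";
import { CosmosDBManagementClient } from "@azure/arm-cosmosdb";

// Sketch: create a serverless Cosmos DB account instead of a provisioned one.
// Subscription, resource group, and account names are placeholders.
async function createServerlessCosmos() {
  const client = new CosmosDBManagementClient(
    new DefaultAzureCredential(),
    "<subscription-id>"
  );
  await client.databaseAccounts.beginCreateOrUpdateAndWait(
    "my-resource-group",
    "my-cosmos-account",
    {
      location: "eastus",
      locations: [{ locationName: "eastus" }],
      databaseAccountOfferType: "Standard",
      capabilities: [{ name: "EnableServerless" }], // this flag makes it serverless
    }
  );
}
```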

What's next for AzureChat - TTS features

I really want to do video analysis.

If I could figure out video analysis, I would do some fine-tuning and potentially try to make a SaaS out of it.

There are also other bugs: sometimes the model tries to generate images, so I need to fix that flow.

Oh, and a big one would be auto-TTS, where responses are read aloud automatically.
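A rough sketch of what that could look like in the Next.js UI; useAutoRead and the Message shape are hypothetical, not AzureChat's actual types:

```typescript
import { useEffect, useRef } from "react";

type Message = { role: "user" | "assistant"; content: string };

// Hypothetical hook: call speak() once whenever a new assistant message arrives.
function useAutoRead(messages: Message[], speak: (text: string) => void) {
  const spokenCount = useRef(0);
  useEffect(() => {
    const last = messages[messages.length - 1];
    if (last?.role === "assistant" && messages.length > spokenCount.current) {
      spokenCount.current = messages.length;
      speak(last.content);
    }
  }, [messages, speak]);
}

export default useAutoRead;
```

With streaming responses, the hook would also need to wait until the message has finished streaming before speaking.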

And maybe add a GPT-3.5 API call that gives the auto-read a different emotion for each sentence.
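Something like this hypothetical sketch; the prompt, the style list, and labelEmotions are all made up for illustration, and since AzureChat talks to Azure OpenAI the client wiring would differ:

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // assumes OPENAI_API_KEY is set

// Hypothetical: ask GPT-3.5 to pick an Azure neural voice style per sentence.
// Each returned style would then go into the SSML's mstts:express-as element.
async function labelEmotions(sentences: string[]): Promise<string[]> {
  const res = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      {
        role: "system",
        content:
          "For each input line, answer with exactly one word per line from: cheerful, sad, excited, calm, angry.",
      },
      { role: "user", content: sentences.join("\n") },
    ],
  });
  return (res.choices[0].message.content ?? "")
    .split("\n")
    .map((s) => s.trim());
}
```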

DEPLOYMENT DETAILS

How to test: fork the GitHub repo, then follow the docs or this YouTube tutorial: https://www.youtube.com/watch?v=rDLgkkCMBPY&t=855s&ab_channel=UnscriptedCoding

Following the YouTube tutorial or the repo docs takes about an hour.

You can also do local development. In that case you would still deploy the resources on Azure, then run npm run dev locally (e.g. in VS Code).

The fast path: deploy the resources on Azure, clone the repo locally, substitute the environment variables, and run it.

If you want your own website and auth system, it will take longer because of the deployment with GitHub Actions, credentials, etc.

If the judges want me to open up auth to them or to outsiders (in a safe way), I'm all ears.

Built With

  • azure
  • azure-speech
  • cosmos-db
  • nextjs