Google Cloud Platform

Building Zubro – a voice-first English tutor on Google’s Gemini Live API

What is Zubro?

Zubro - a European bison with headphones and a green scarf

I wrote this article as my entry to the Gemini Live Agent Challenge hackathon. #GeminiLiveAgentChallenge

Zubro is a voice-first English learning companion. You talk, Zubro listens, responds, corrects your grammar, and adapts to your level – all through real-time voice conversation. No typing, no multiple choice, no textbook exercises. Just a conversation with a patient AI tutor who happens to be a European bison wearing a green scarf.

A quick note on the name. In Polish, żubr (pronounced roughly “zhoobr”) means European bison – a national symbol and the largest land animal in Europe. I took the dot above the ż in the original Polish spelling, moved it to the end of the word, and got zubro. A slightly mangled version of a Polish word, used to teach English. Felt appropriate.


Unlocking Unstructured Data Potential with Google Gemini 1.0 Pro

In today’s digital era, businesses across the globe are inundated with vast oceans of unstructured data. From emails and documents to social media posts and beyond, this data holds invaluable insights that can drive innovation, enhance customer satisfaction, and streamline operations. However, the sheer volume and complexity of unstructured data present significant challenges in terms of analysis and information retrieval. Traditional data processing tools often fall short when faced with the nuanced, irregular, and often unpredictable nature of this data.

Enter Google Gemini 1.0 Pro, a cutting-edge generative AI model. In this article I propose a way of using such models to navigate the labyrinth of unstructured data with far less effort. With Gemini 1.0 Pro, businesses can rework their data analysis processes and uncover the useful information buried in the digital textual chaos.


Integrating Serverless Apps with NoSQL Database and LLMs: Building a ‘Shopper’ Chat-Bot with PaLM 2 and LangChain

In the ever-evolving landscape of technology, the synergy between serverless architectures, NoSQL databases, and Large Language Models (LLMs) is opening new frontiers in application development. This article delves into the integration of these cutting-edge technologies using Google’s PaLM 2 and the LangChain framework, demonstrated through the development of a ‘shopper’ chat-bot.

In this entry I will describe an example I am preparing to showcase the ReAct (Reasoning & Acting) paradigm for Large Language Models and to show how serverless apps can be incorporated into GenAI-powered applications.

Shopper architecture

So here it is – the shopper architecture. Fairly straightforward. We are going to use Firestore as our NoSQL database, three Cloud Functions that accept API calls to list or modify the content of the database, and three Python tools that will be used by a LangChain agent powered by the PaLM 2 large language model. But I’m getting ahead of myself. Let’s start step by step.
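
Before going step by step, here is a minimal offline sketch of the tool layer in that architecture. It is an illustration under stated assumptions, not the article's actual code: the in-memory `ShoppingStore` stands in for Firestore, the tool names `list_items`, `add_item` and `remove_item` are hypothetical (the text only says the functions list or modify the database), and the scripted `run_actions` loop stands in for the agent, which in a ReAct setup would pick each action after reading the previous observation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ShoppingStore:
    """In-memory stand-in for the Firestore collection (assumption)."""
    items: Dict[str, int] = field(default_factory=dict)


def make_tools(store: ShoppingStore) -> Dict[str, Callable[[str], str]]:
    """Build three string-in, string-out tools, one per hypothetical Cloud Function."""

    def list_items(_: str) -> str:
        # Would call the "list" function over HTTP in a deployed setup.
        if not store.items:
            return "The list is empty."
        return ", ".join(f"{name} x{qty}" for name, qty in sorted(store.items.items()))

    def add_item(name: str) -> str:
        # Normalise the name so "Milk" and "milk" count as the same item.
        name = name.strip().lower()
        store.items[name] = store.items.get(name, 0) + 1
        return f"Added {name}."

    def remove_item(name: str) -> str:
        name = name.strip().lower()
        if name not in store.items:
            return f"{name} is not on the list."
        del store.items[name]
        return f"Removed {name}."

    return {"list_items": list_items, "add_item": add_item, "remove_item": remove_item}


def run_actions(
    tools: Dict[str, Callable[[str], str]],
    actions: List[Tuple[str, str]],
) -> List[str]:
    """Execute (tool_name, tool_input) pairs and collect the observations.

    The actions are scripted here so the sketch runs without a model;
    an agent would generate them one at a time.
    """
    observations = []
    for tool_name, tool_input in actions:
        tool = tools.get(tool_name)
        observations.append(tool(tool_input) if tool else f"Unknown tool: {tool_name}")
    return observations


if __name__ == "__main__":
    demo_tools = make_tools(ShoppingStore())
    script = [("add_item", "Milk"), ("add_item", "bread"), ("list_items", "")]
    for observation in run_actions(demo_tools, script):
        print(observation)
```

Keeping every tool as a plain function that takes and returns a string is a deliberate simplification: it mirrors the text-in, text-out shape that agent frameworks commonly expect from tools, and it lets the in-memory store be swapped for real HTTP calls without touching the dispatch loop.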
