Have you ever wished you had a personal guide through the vast and intricate world of Shakespeare's works? Well, now you can! Introducing Virgil, your AI-powered companion for all things Shakespearean.

Fig.1 Virgil application answering questions about the character Portia

This blog post will take you on a journey through the creation of Virgil, a search application built using Vertex AI Search and Conversation and Gemini models. We'll explore how Virgil leverages a datastore brimming with Shakespeare's plays and poems to answer your questions, provide insightful analysis, and even help you find those elusive quotes (which was the whole reason for starting this project). The technique presented in this post is called grounding a model.

Grounding overview  |  Generative AI on Vertex AI  |  Google Cloud

Data Sources: Fueling Virgil's Knowledge

As with any AI application, the cornerstone of Virgil's intelligence lies in the data it's exposed to. For this project, we'll be utilizing the complete works of William Shakespeare, sourced from The Tech, MIT's esteemed and long-running newspaper.

The Complete Works of William Shakespeare

It's important to note that while The Tech is a reputable source, double-checking the accuracy and completeness of the Shakespearean text against other established sources is always a good practice.

https://github.com/TheMITTech/shakespeare/

By leveraging this comprehensive collection of Shakespeare's works, we ensure that Virgil has the knowledge necessary to answer your questions and provide insightful literary analysis.
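For the double-checking mentioned above, one lightweight option is to spot-check a downloaded play for a line you already know. The sketch below is only an illustration, not part of Virgil itself: it strips the HTML tags with Python's standard library and looks for a quote while ignoring case and spacing. The helper names (TextExtractor, html_to_text, contains_quote) are made up for this example, and the sample markup is a simplified stand-in for the real pages.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text content of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.chunks.append(data)


def html_to_text(html):
    """Return the page text with all whitespace runs collapsed to one space."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def contains_quote(html, quote):
    """True if the quote appears in the page text, ignoring case and spacing."""
    needle = " ".join(quote.split()).lower()
    return needle in html_to_text(html).lower()


# Simplified stand-in for a play page (an assumption, not the real markup).
sample = (
    "<html><body><h3>ACT IV</h3>"
    "<blockquote>The quality of mercy<br>is not strain'd</blockquote>"
    "</body></html>"
)
found = contains_quote(sample, "the quality of mercy is not strain'd")
```

Running a handful of well-known lines per play through a check like this will not prove the text is complete, but it catches empty or truncated files before they reach the datastore.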

Building a Datastore from Shakespearean HTML

Now that we have our data source, it's time to create a datastore that will allow Virgil to efficiently access and process the information. Here's how we can do it:

Let's first set the stage by creating a Cloud Storage bucket on Google Cloud Platform (GCP) to hold our Shakespearean play HTML files.

Uploading the HTML Files:

Once your bucket is created, click on its name to access it. You can either drag and drop your HTML files directly into the bucket interface or use the "Upload Files" button. Make sure all the play HTML files are uploaded successfully.

Fig.2 Example of a GCP Bucket with the merchant-of-venice.html
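If you have many plays, dragging files into the console gets tedious. As an alternative, here is a minimal sketch of the same upload done from Python. It assumes the google-cloud-storage package is installed, that application default credentials are configured, and that the bucket already exists. The function names, the "plays" prefix, and the bucket name in the usage note are hypothetical choices for this example.

```python
from pathlib import Path


def plan_uploads(local_dir, prefix="plays"):
    """Pair each local .html file with the object name it would get in the bucket."""
    root = Path(local_dir)
    return [
        (path, f"{prefix}/{path.name}")
        for path in sorted(root.glob("*.html"))
    ]


def upload_plays(bucket_name, local_dir, prefix="plays"):
    """Upload every planned file to the bucket.

    Assumes google-cloud-storage is installed and credentials are set up.
    """
    # Imported here so plan_uploads can be used without the library installed.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket(bucket_name)
    for path, blob_name in plan_uploads(local_dir, prefix):
        blob = bucket.blob(blob_name)
        blob.upload_from_filename(str(path), content_type="text/html")
        print(f"uploaded gs://{bucket_name}/{blob_name}")
```

Calling upload_plays("my-shakespeare-bucket", "./shakespeare") would then copy every HTML file in that folder into the bucket. Running plan_uploads on its own first is a cheap way to confirm which files will be sent before anything touches Cloud Storage.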

Creating a Search & Conversation Datastore from Cloud Storage

Now that our Shakespearean play data is neatly organized in a GCP bucket, it's time to create a Search & Conversation datastore to unlock its potential for search and analysis. Here's how to do it using the Google Cloud console:

  1. Accessing Data Stores: Go to the Google Cloud console and navigate to the Search and Conversation page. Click on the Data Stores tab.