This year, all the hype is squarely centered on Generative AI (Gen AI), and why not? It’s transformative in so many ways. I’m certainly a fan and haven’t been this excited by the possibilities of Gen AI since Blockchain. Ok, that wasn’t that long ago. 😉 Anyway, if you’ve been researching or building Generative AI applications, then you’ve likely heard the term Retrieval-Augmented Generation (RAG) in your travels. RAG is a critical component if you plan on incorporating your own private data into your app, but it can be a complex and confusing subject. Questions such as ‘What does it do for my app?’ and ‘How do I go about building it?’ are often overlooked in favor of exploring the deep technical details of vectors, chunking and semantic search. These topics might be foreign right now, but fear not my friends, I’m here to answer all these questions as we dive into another blog. As we reinvent application interfaces with natural language, techniques like RAG offer a way to address the complexities of text-based processing and semantic search. In this post, we will explore the intricacies of RAG, discuss why Amazon’s Bedrock Knowledge Bases capability feels like a marriage made in heaven for building RAG, and demonstrate the power of RAG using some good old-fashioned sports data.
Before we begin, let me first point out that this is not the only Gen AI post on ‘Getting Nerdy in 30‘. For those looking for a primer on Amazon’s Bedrock services, our Introduction to Amazon Bedrock Knowledge Bases and Agents is a great start. Alternatively, if you’re getting your feet wet with LLM prompting, I highly recommend you check out Tips and Best Practices to get the most out of Bedrock Prompt Engineering. Both are great reads and I’m sure you’ll learn a few points from each.
OK enough waffle, grab your favorite beverage and follow along as we Get Nerdy in 30!!
What is RAG?
Retrieval-Augmented Generation (RAG) is an approach in the field of natural language processing (NLP) that blends the strengths of two major components to improve the quality and relevance of generated text: it first retrieves related information from a large database or corpus, then uses that context to inform the generation of a relevant response. These two components are the retriever and the generator. Let’s define each.
The retriever component is responsible for selecting relevant documents or pieces of information from a large dataset based on the query. This retrieval is typically powered by semantic search techniques, which go beyond simple keyword matching to understand the meaning and context of the query. The retrieved documents are then provided as additional context to the generator.
The generator, often a large language model like those provided by Amazon, Meta, Anthropic and others, takes the original input along with the retrieved documents and produces a more informed and contextually relevant output. This output benefits from the knowledge encoded in the retrieved documents, allowing the model to generate responses that are deeply informed by relevant external information.
RAG has been applied in various NLP tasks, including question answering, content creation, and conversational systems, demonstrating significant improvements in the quality and factual accuracy of the generated text. By effectively combining vast knowledge retrieval capabilities with the advanced generation skills of large language models, RAG systems represent a powerful approach to building more knowledgeable, accurate, and context-aware NLP applications.
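To make the retriever/generator split concrete, here is a minimal, conceptual sketch in Python. The `retrieve` and `generate` functions are deliberately dumb placeholders (with made-up dummy data) standing in for a real vector search and a real LLM call; the point is simply that retrieved text is stitched into the prompt before generation.

```python
from typing import List

def retrieve(query: str, top_k: int = 2) -> List[str]:
    # Placeholder retriever: a real system would run a semantic (vector)
    # search over your private corpus and return the most relevant chunks.
    corpus = [
        "Quarterback A threw for 4,500 passing yards in 2023.",   # dummy data
        "Quarterback B threw 15 interceptions in 2023.",          # dummy data
    ]
    return corpus[:top_k]

def generate(prompt: str) -> str:
    # Placeholder generator: a real system would send the prompt to an LLM.
    return f"(LLM answer based on the prompt below)\n{prompt}"

def answer_with_rag(question: str) -> str:
    passages = retrieve(question)                 # 1. retrieve relevant context
    context = "\n".join(passages)                 # 2. augment the prompt with it
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\nQuestion: {question}"
    )
    return generate(prompt)                       # 3. generate a grounded answer

print(answer_with_rag("Which quarterback has the most passing yards?"))
```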
Why do we need to simplify things?
To answer this question, we need to spend a few minutes understanding the concepts relevant to RAG. Let’s go a bit deeper and talk about chunking, vectors and the vector database.
Chunking
As the name suggests, Chunking refers to the process of breaking down large sets of information or data into smaller, more manageable pieces, known as “chunks.”
In the context of information retrieval and natural language processing, chunking breaks down our text into smaller, meaningful units like phrases or sentences. This is particularly useful in tasks like parsing, where understanding the structure of sentences is crucial, or in indexing and retrieval, where smaller units of text are easier to manage and search through. There are various chunking strategies you can employ, and we will dive into these a little later, but for now let’s just settle on the fact that chunking is necessary and forms a critical part of the process.
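As a rough illustration, here is a minimal fixed-size chunking sketch. The word-based splitting and overlap values are illustrative only; real chunkers (including the one Bedrock applies for you) work on tokens and are more sophisticated.

```python
from typing import List

def chunk_text(text: str, chunk_size: int = 300, overlap: int = 30) -> List[str]:
    """Split text into overlapping word-based chunks (illustrative only)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
    return chunks

sample = "Chunking breaks long documents into smaller overlapping pieces. " * 50
print(len(chunk_text(sample)), "chunks produced")
```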
Vectors
A vector (or embedding) in the context of RAG is just a mathematical representation of your data in a format that machines can process effectively. For example, in a natural language processing or image recognition use case, vectors are used to represent words, sentences, or images in a high-dimensional space. Each dimension in these vectors can capture some aspect of the data, like the meaning of a word or the intensity of a pixel. This representation allows computers to perform operations on these vectors, enabling tasks such as searching, classifying, or generating text and images.
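To make that tangible, here is a small boto3 sketch that turns a sentence into a vector using Amazon’s Titan text embedding model (the same family of model we select later in the walkthrough). Treat the model ID and region as assumptions; adjust them for your account and check you have model access enabled.

```python
import json
import boto3

# Assumes AWS credentials are configured and Titan Embeddings access is enabled.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-embed-text-v1",   # Titan text embedding model
    body=json.dumps({"inputText": "Which quarterback has the most passing yards?"}),
)

embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding), "dimensions")          # a long list of floats representing the text
```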
Both vectors and chunking are fundamental to the workings of RAG, where large amounts of high-dimensional data need to be efficiently processed, managed, and transformed into meaningful outputs.
Vector Databases
Once we have our vectors, we need somewhere to store them; this is where vector databases come in. Unlike traditional transactional databases that excel with structured data, a vector database addresses the unique challenges posed by high-dimensional and unstructured data. This type of database is perfect for similarity searches, which are crucial in applications like recommendation systems, search engines and content discovery. It can quickly find items that are “closest” to another vector representation, enabling functionality like “find similar words or phrases”. Finally, when you need information, the system first turns your query (phrases or sentences) into vectors (see above). It then searches the vector database to find matches or similarities to the vectors created from your question. The vector database quickly sifts through millions of these, finds the best matches and retrieves the corresponding information to form your response. Using this approach means the response is generally more accurate, detailed and helpful.
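Under the hood, “closest” usually means something like cosine similarity between vectors. Here is a tiny, self-contained sketch of the idea with toy three-dimensional vectors; a real vector database does this at scale with approximate nearest-neighbour indexes rather than brute force.

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" - real ones have hundreds or thousands of dimensions.
corpus = {
    "passing yards leader": [0.9, 0.1, 0.2],
    "most interceptions":   [0.1, 0.8, 0.3],
}
query = [0.85, 0.15, 0.25]   # pretend embedding of "who threw for the most yards?"

best = max(corpus, key=lambda chunk: cosine_similarity(query, corpus[chunk]))
print("Closest chunk:", best)
```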
So…..
As you can see, getting RAG up and running isn’t a trivial exercise and requires some specialized skills in data engineering, database setup and query access patterns. These skills sit well outside the day-to-day experience of most application engineers, so packaging these components up into a simplified, managed service that abstracts the complexities away has to be a good thing… enter Bedrock Knowledge Bases!!
What is Amazon Bedrock Knowledge Bases?
Taking the definition straight from the source….
Knowledge bases for Amazon Bedrock provides you the capability of amassing data sources into a repository of information. With knowledge bases, you can easily build an application that takes advantage of retrieval augmented generation (RAG), a technique in which the retrieval of information from data sources augments the generation of model responses. A knowledge base can be used not only to answer user queries, but also to augment prompts provided to foundation models by providing context to the prompt. Knowledge base responses also come with citations, such that users can find further information by looking up the exact text that a response is based on and also check that the response makes sense and is factually correct.
So to paraphrase, it’s a fully managed RAG implementation that simplifies the setup, ingestion, processing and retrieval of private datasets. Again, if you missed the shameless plug above, I highly recommend you go read Max’s blog for more information.
Ok, so now we know what RAG is and that Amazon can help us build a RAG implementation. Let’s explore a real-world use case and demonstrate how simple this is with Amazon’s Bedrock Knowledge Bases service.
Scenario
Let’s imagine you operate a Sports Streaming service and you want to improve audience engagement by incorporating player or team statistics into your broadcast. You have a large dataset of statistical information already but you want to expose this in a way that allows your viewers to ask questions during the coverage like….
”Which Quarterback has the most passing yards ?”
Or
“Which Quarterback threw the most intercepts?”
This might be accomplished through voice-to-text, a virtual keyboard or selecting from a drop-down. The implementation specifics matter less than whether this is possible at all. Generally speaking, you won’t store the information in this question-and-answer form, so you need a search capability that can semantically find the relevant information and avoid the limitations of keyword-only searching. Let’s build this with Amazon Bedrock Knowledge Bases.
Step 1 – Create an S3 Bucket
Why are we creating an S3 bucket, you might ask? Well, Amazon Bedrock Knowledge Bases uses S3 as its source data store. Documents uploaded into this S3 bucket are processed by the Knowledge Base into vectors (embeddings) and then written to our vector store. We’ll need this bucket name when we create the Knowledge Base, so it makes sense to do this as the first step.
I’m going to assume you know how to create an S3 bucket so I’m not going to provide the steps here. I will however provide a link if you need a refresher;
https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html
Note: When creating your S3 bucket, ensure it’s created in the same region you plan to use Bedrock Knowledge Bases.
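If you’d rather script it, here is a quick boto3 sketch. The bucket name is a made-up example (bucket names must be globally unique), and note how the region is pinned to the one we’ll use for Bedrock.

```python
import boto3

region = "us-east-1"                      # must match the region you'll use for Bedrock
bucket = "get-nerdy-sports-stats-data"    # hypothetical, globally unique name required

s3 = boto3.client("s3", region_name=region)
if region == "us-east-1":
    s3.create_bucket(Bucket=bucket)       # us-east-1 rejects a LocationConstraint
else:
    s3.create_bucket(
        Bucket=bucket,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
```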
Step 2 – Prepare some data
Now that we have a source data store, we need to populate it with the statistics we want our Knowledge Base to query. Amazon Bedrock Knowledge Bases supports a large array of document types, listed here. In our example we are going to use the NFL 2023 passing stats, which are available here. To make things simple, I copied and pasted the first page of stats into an Excel sheet and saved it as QBStats.xls
Upload this file into your S3 Bucket
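You can do this through the console, or with a one-liner if you prefer the SDK; the bucket name and local path below are the same hypothetical examples as above.

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")
# Upload the stats sheet to the source data bucket created in Step 1.
s3.upload_file("QBStats.xls", "get-nerdy-sports-stats-data", "QBStats.xls")
```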
Step 3 – Create your Knowledge Base
Ok, now with the prerequisites in place, let’s go build our Knowledge Base. If you’re already on the Amazon Bedrock console, select the Knowledge bases option in the left-hand navigation menu.
Click the [Create Knowledge base] button on the right

Give your Knowledge Base a name, e.g. [get-nerdy-sports-stats-kb], and provide a description like
Streaming Service Sports Statistics

As this is our first Knowledge Base you can let AWS create your Service Role and leave the Tags blank for now. Click [Next]
Next we need to give our data source a name and select the S3 bucket we created previously. For the sake of this blog, we will leave the Advanced Options at their defaults for now, so just click [Next].

NOTE: Before we move on, it’s worth calling out that skipping the Advanced Options means we are happy with AWS managing the KMS key used for encryption and with the default chunking strategy, which creates chunks of up to 300 tokens. In the real world, these settings may need to be adjusted based on your specific needs and data formats. For example, we have three options when it comes to chunking strategies, and choosing the right one depends on how your data is partitioned. If you’re unsure, or you’re ingesting large documents of text, use the default option (300 tokens). If your data is partitioned in a consistent fashion with fixed lengths, select Fixed-size chunking (20 – 8192 tokens for Titan) and set the number of tokens needed. Remember, a token is roughly four characters of English text.
Alternatively, if your data is already partitioned as individual files and not co-mingled in any way, then selecting No chunking might be the right approach.
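For reference, if you were creating the data source via the API rather than the console, the chunking strategy is expressed roughly along these lines. This is a sketch using the bedrock-agent create_data_source call; the Knowledge Base ID, data source name and bucket ARN are placeholders.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

# Placeholder IDs/ARNs - substitute your own Knowledge Base and bucket.
bedrock_agent.create_data_source(
    knowledgeBaseId="KBID1234",
    name="get-nerdy-sports-stats-source",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::get-nerdy-sports-stats-data"},
    },
    vectorIngestionConfiguration={
        "chunkingConfiguration": {
            "chunkingStrategy": "FIXED_SIZE",
            "fixedSizeChunkingConfiguration": {
                "maxTokens": 300,          # roughly the console default
                "overlapPercentage": 20,
            },
        }
    },
)
```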
On the next page, select the embedding model we want to use. Remember, an embedding model is a specific type of model whose job is to analyse our chunks of data and convert them into vectors (embeddings). In this case the default Amazon Titan Embeddings model will work just fine, so let’s go ahead and select that.
At the bottom of this step you will also notice we can select our vector database. Choosing the right vector database is a topic that deserves its own blog and is well beyond the scope of this guide, so for now just remember that Amazon Bedrock Knowledge Bases gives you options including OpenSearch (the default), Aurora, Pinecone and Redis.
For our demonstration we will stick with the default OpenSearch database engine, so click [Next].

Finally review the settings and click [Create Knowledge Base].
This begins the process of creating an OpenSearch-based vector store and setting up an ingestion pipeline to process our data source, which contains our QBStats.xls file.
Step 4 – Sync your data source
After a few short minutes you should see a green notification banner at the top of the AWS Console indicating the Knowledge Base has been created successfully. Go ahead and press the [Sync] button to create and store your vectors.

After selecting [Sync], take note of the Data Sources section below. Make sure you don’t see any sync warnings, as these indicate a file or permissions issue. If everything is clear, we should be ready to test.
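The console Sync button maps to an ingestion job under the hood. If you ever want to trigger it from code, say on a schedule after fresh stats land in S3, it looks roughly like this; the IDs are placeholders.

```python
import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-east-1")

# Kick off a sync (ingestion job) for our data source - IDs are placeholders.
job = bedrock_agent.start_ingestion_job(
    knowledgeBaseId="KBID1234",
    dataSourceId="DSID5678",
)
print(job["ingestionJob"]["status"])   # e.g. STARTING
```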

Step 5 – Test your Knowledge Base
Ok, so let’s just pause and take a moment to reflect on how easy and fast that was. We went from nothing to a fully managed RAG implementation in less than 10 minutes. That’s pretty impressive!!
Now we have everything in place, lets showcase the power of RAG in the context of a chatbot.
On the right-hand side you will notice a section of the screen that allows us to test our Knowledge Base. As we mentioned earlier, Amazon Bedrock Knowledge Bases combines the power of private data with the general knowledge of large language models, so the first step to testing is to select the LLM we want to use.

For the sake of the demonstration, let’s select the latest Claude model available, which at the time of writing was Claude 3 Sonnet.

After selecting [Apply], let’s try our first query. Type the following into the chat window:
Which player has the most passing yards ?
After a few seconds, you should receive a result similar to this.
Isn’t that amazing? We were able to ask a simple question in natural language, and our Knowledge Base was able to interpret it in the context of our columnar data and respond with a natural language answer, citations included!!
No queries, filtering or complex mapping of our data to keywords.
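The test panel is handy, but your streaming app would call the same capability through the API. Below is a minimal sketch using the bedrock-agent-runtime RetrieveAndGenerate call; the Knowledge Base ID and model ARN are placeholders you would swap for your own.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "Which player has the most passing yards?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KBID1234",   # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)

print(response["output"]["text"])            # the natural-language answer
for citation in response.get("citations", []):
    print(citation)                          # the source chunks backing the answer
```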
Ok let’s try another search
Who threw the most interceptions ?
After a few seconds, you should receive a result similar to this.
Again, our Knowledge Base was able to interpret the question, analyse the data and determine that the INT column represents interceptions before returning an answer.
As you can see, the ease with which we can create a semantic search capability over some rudimentary data with a few clicks is simply outstanding, but this is just scratching the surface of what’s possible. Next up, I’ll incorporate Amazon Bedrock Agents into the mix to help us augment our data, so stay tuned.
Conclusion
As we demonstrated today, RAG is a very powerful approach to handling the complexities of natural language processing in common search use cases, and until very recently it was a challenging component to design and build. With the introduction of Amazon Bedrock Knowledge Bases, the barriers to entry are removed and the complexities are short-circuited by a fully managed service. Whether it’s improved search, semantic Q&A chatbots or creating a user-friendly interface into your private data with the power of LLMs, Amazon Bedrock Knowledge Bases is by far the simplest way to build your RAG capabilities and get started in minutes.
As mentioned above, this blog will have a follow-up entitled ‘Orchestration and Augmentation with Amazon Bedrock Agents’, where we will extend our stats Knowledge Base with some specific business rules and integrations to give our stats super powers. Be sure to subscribe to Get Nerdy in 30 so you don’t miss it!!





