Unlock data insights with Amazon SageMaker and Amazon CodeWhisperer
Good afternoon, everyone. Thank you so much for joining this session. We have a value-packed agenda today. But before I get started, I was wondering: who is here at re:Invent for the first time? Can you raise your hand? Oh wow, that's actually quite a few of you. You know, this is my third time at re:Invent, and I can tell you it only gets better every year. If you have any questions, please feel free to ask after the session. Linda or I will be happy to help, whether with advanced tips or just suggesting fun things to do.
My name is Victoria, and I'm joined today by my colleague Linda. We are both part of the AWS Developer Relations team, and to tell you the truth, we really love our job because it gives us an opportunity to speak at events like this, create educational content, and, more importantly, bring your feedback back to our product teams so we can continue to drive innovation together. So if you have any feedback on the products, please come and chat with us; we will be happy to hear it.
Before we get started, I want to ask you another question. Do you remember a time when you had to go to a library and go through multiple books just to find one piece of information? Yeah, I see some people know that well. Even now, we may not use hard copies anymore, but when we prepare for professional certifications or do research, we are still trying to find specific numbers and statistics, and we go through multiple white papers, documents, and websites.
Now imagine: what if you could take all of this information, the documents, papers, and websites, convert it to a machine-readable format, supplement a generative AI foundation model with it, and get your answers instantly? No more researching and googling to find answers; you have all the information available right away. That is exactly what we will be building today. We will review the generative AI stack, and we have prepared a couple of demos.
So let me show you the agenda. First, we will talk about what it really takes to unlock insights from your data using generative AI services; it's not only the foundation models, it really involves several components. We'll look at the different integrations, and then we'll get to the demos with some live coding. Linda will first show how to use Amazon SageMaker and Amazon Kendra to unlock insights from data stored in your account or on a website. Then I will show you a different approach, also using retrieval augmented generation (we will talk later about what that means), utilizing Amazon Bedrock and vector databases. At the end, we will summarize these approaches, talk about their pros and cons, and share some key takeaways and resources so you can start building your own solutions.
There has recently been a lot of buzz about generative AI, right? It's everywhere you go: gen AI, gen AI. But what's really the difference between traditional AI and gen AI? In the past, when we worked on machine learning or AI projects, it took several months to build one particular model, because we had to do data preparation and data labeling, then train the model, then tune it, all to produce a model for one particular task, say object recognition.
With generative AI, because the models are trained on terabytes of data and have billions of parameters, they can address a wide variety of tasks. What we will be talking about today is how to use your domain-specific enterprise data to customize generative AI foundation models and unlock insights from it.
So let's look at the stack. It involves three layers. At the top, we have user-facing applications, such as Amazon Alexa or ChatGPT. Those applications are powered by foundation models, which we can categorize into two primary groups. First are proprietary models, like Amazon Titan or Anthropic Claude; we will actually be using both of those in today's demos. Then there are open-source models, which we can deploy and self-manage (we will be using Amazon SageMaker for this) and which can be accessed via model hubs like Hugging Face or SageMaker JumpStart; today we will be using JumpStart. And at the bottom, we have cloud providers that give us the platforms, tools, and hardware to train and run those models, for example AWS Inferentia and Trainium chips, as well as GPU accelerators.
Now, how do we typically interact with these models? We start with a prompt. The prompt might include several components. First, instructions: "You are a data scientist." Then context: "You are at a meeting with a customer." And then the user input, which is basically a question: "Provide us metrics for a proof of concept." This prompt goes into the LLM, and the LLM generates an output response.
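To make that concrete, here is a minimal sketch (in Python, the language we use in the demos) of assembling those three components into the single string the model receives; the wording of each part is just the slide's illustration, not a required format.

```python
# Illustrative only: the three prompt components from the slide,
# concatenated into the single string that gets sent to the LLM.
instructions = "You are a data scientist."               # instructions
context = "You are at a meeting with a customer."        # context
question = "Provide us metrics for a proof of concept."  # user input

prompt = f"{instructions}\n{context}\n{question}"
print(prompt)  # this combined text is the model's input
```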
It almost sounds like magic, right? We have this dialogue with the foundation model. But in reality, it's just an advancement in technology, and it's not perfect. Here's an example. I'm asking, "Explain what is Amazon Bedrock," a service that became generally available in October. The model provides an answer: it says it's an open-source model hub. For somebody who's not familiar with Amazon Bedrock, that might sound like a plausible answer, but it's entirely fictional. The reason is what's called hallucination: models are trained on static data and construct their answers based only on the data available to them. And you can see here, I'm asking, "Are you sure?" and getting the response, "You're right, my previous answer was incorrect." So it's important to put some kind of guardrails and quality assurance in place and not blindly rely on these foundation models.
Besides guardrails, there are ways to customize models so we get more accurate answers. There are three ways to do it. The first is prompt engineering: we simply shape the prompt, include some keywords and context, and make the model more aware of how it should respond, for example by saying "think step by step." The second is fine-tuning: we provide the model an additional dataset and further pretrain it. The third is information retrieval, and that's what we will focus on today. With information retrieval, we are not updating the weights of the model; rather, we supplement the model with an external data source. That's really the focus of today's session.
So let's look at how this updates our technical stack. You can now see a new layer here: orchestration. This orchestration layer allows us to connect our foundation model to external data sources and have, for example, a question-and-answer chain that goes back and forth. Quite a few developer tools are also emerging. In today's demo, we will be leveraging CodeWhisperer, and you will see how Linda is going to code in plain English; she's not going to write a single line of code herself. She will only be providing the prompts, and CodeWhisperer will generate the code.
At this point, I want to turn it over to Linda so she can share a demo and you can see things in action. She will show how to build highly accurate Q&A systems utilizing Amazon Kendra and Amazon SageMaker.
Thank you so much for the insightful overview of the gen AI tech stack. As Victoria mentioned, we also talked a bit about hallucinations and how we can mitigate them using RAG, by bringing in this enterprise data.
Before I get into that slide: Victoria mentioned foundation models, and we said we'll show you two demos today, both using a RAG approach. But how do we access foundation models? We'll cover a few services for the first demo, and one way to access them is Amazon SageMaker. SageMaker empowers machine learning engineers and data scientists and simplifies the whole process: model building, training, and deployment. At a high level, this diagram shows that it gives you flexibility and control over the infrastructure and model management. SageMaker offers a lot of different foundation models, both proprietary and open source, along with built-in algorithms and prebuilt machine learning solutions. One thing we will use in our demo is SageMaker endpoints; it makes it very easy to deploy these machine learning models as endpoints. It also helps with fine-tuning, as Victoria mentioned (we saw there are many ways to improve and customize foundation models), and it lets you choose your compute so you can balance performance and cost. That's a high-level overview of SageMaker; again, we're just going through a few of the services we will use in the upcoming demo.
Now that we've had a quick, broad overview of SageMaker, I want to shift your focus to SageMaker JumpStart. I love that name, JumpStart. This is for when you want to get started quickly, and I don't know about you, but anytime you tell me I could get started quickly building something like this, I say: absolutely.
It also has some new generative AI capabilities, and essentially this is how it works. As a first step, you choose a foundation model; there are over 150 pre-trained models available with just a few clicks, including industry-leading models from AI21 Labs, Hugging Face, and Stability AI. And with the click of a button, it will give you the code you need to deploy the model.
The next step is that you can try out the model and explore it. If you just want to play around with it, you can, or you can deploy it as an endpoint. You can also fine-tune the model to improve and customize it. And one thing to note: all the data stays in your AWS account, which ensures security and privacy. So it really is a way to jump-start your process.
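If you'd rather do the same thing programmatically than through the console, here is a minimal sketch using the SageMaker Python SDK's JumpStart wrapper; the model ID is one of the published Llama 2 identifiers and may vary by SDK version and region, so treat it as an assumption.

```python
# A sketch of deploying a JumpStart foundation model from code.
# Assumes the sagemaker SDK v2 and an execution role with SageMaker permissions.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b")  # assumed model ID
predictor = model.deploy(accept_eula=True)  # Llama 2 is gated, so the EULA flag is required
print(predictor.endpoint_name)  # the real-time endpoint you can now invoke
```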
So that's an overview of service one of the three we want to quickly cover before our demo. The next service I want to introduce is Amazon Kendra. Amazon Kendra is a highly accurate, intelligent enterprise search service powered by machine learning. How are we using Kendra? We're going to use it to augment our LLM with enterprise data. Kendra is going to help us search through all that enterprise data, because if you have this data, you need to be able to search through it, and Kendra is optimized to do exactly that. It also has a lot of new generative AI capabilities that were recently launched; we'll get to those in our demo.
And we're using it for a few reasons. Number one, it's accurate: it's very good at reading comprehension, it's context-aware, it's pre-trained on 14 domains, and it continuously improves. The second thing, and my favorite part: it is very easy to use. It has something called connectors. What are connectors? Pretty much, with the click of a button, you're able to connect to different data sources with no coding. It has 50-plus connectors (you'll see this; it's nicer visually, but this is the theory version), and the list keeps expanding. So let's say you want to connect to an S3 bucket, or you want to web-crawl something: there's a connector for it. You fill out a few prompts, and you've connected the source to Kendra, which gives you an index; all you have to do is target that index, and you can have all these different sources behind it, with machine-learning-powered search. It will intelligently search through the enterprise data you give it, and it makes it much easier to bring that data in.
Third, it's secure: data is encrypted in transit and at rest.
And the last reason, which you will also see in the demo: it's very easy to integrate Kendra with large language models. There are some recent updates to Kendra, including a generative AI feature called the Retrieve API, which is optimized for building gen AI applications and is especially useful for RAG workflows. That's what we're going to be showing; a minimal sketch of the API follows below. So that's an overview of Kendra, service two.
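To give you a feel for it before the demo, here is a minimal sketch of calling the Retrieve API with boto3; the index ID and query are placeholders.

```python
# A sketch of the Kendra Retrieve API, which returns passages suited for RAG.
import boto3

kendra = boto3.client("kendra")
response = kendra.retrieve(
    IndexId="YOUR-KENDRA-INDEX-ID",  # placeholder: the index you create in the console
    QueryText="What is the US inflation rate in 2022?",
)

# Each result item is a passage of source text plus the document it came from.
for item in response["ResultItems"][:3]:
    print(item["DocumentTitle"], "->", item["Content"][:120])
```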
So, one more service we're going to cover before we get building, because of course we love to build. This is the third service we'll be using today: Amazon CodeWhisperer. It improves developer productivity; it's an AI coding companion trained on billions of lines of code, and it gives suggestions right in your IDE. It has many great features, but one of my favorites is built-in security scans, so it finds vulnerabilities. Two, it has reference tracking: if it suggests open-source code, it can flag that for you, and then you can decide whether you want to attribute it or use it at all. And my favorite part: it is free for individual use, so you can use it for free in any of your favorite IDEs, like VS Code or SageMaker Studio. That, to me, makes it very easy to get started. It supports 15-plus programming languages and many different IDEs. There's also a recent customization capability: if you're in an organization with an enterprise account, it can be trained on your internal code base, so it will give you suggestions that are much more tailored to your organization. Cool.
So those are the three services we covered: SageMaker (plus the feature we'll use, SageMaker JumpStart), Kendra, and CodeWhisperer.
Before we get into the demo, I want to ask you all a question. Let's do a quick poll. We're going to allow 45 to 60 seconds; if you could scan the QR code, we'll ask: have you used an AI coding companion in the past? Just to get a pulse of the room. I'll give you a few seconds to answer. You'll get to not hear my voice for about 45 seconds, which is nice. Maybe we can bring in some of the poll results so we can see them; it'll show live.
OK, it looks like around 60 to 70 percent. All right, so more than half of you have used AI coding companions, and some of you have not. Either way, you'll see tips and tricks specifically for CodeWhisperer coming up.
Let's go back to the slide, because we're going to do one more poll; we'd love to see what programming languages you all use. One more poll and then we get into the demo. If you could scan the QR code again: what is your preferred programming language?
OK, look: Python is the winner. Good, because we're using Python; I was a little nervous, by the way. So we have 114 votes for Python, 23 votes for JavaScript, 34 for Java, and 32 for "other." What's the other? Feel free to call it out. All right, we love to see it. The beauty of AI coding companions is that you can use so many languages without even having to know the syntax, although as builders we like knowing it too.
Now I think we're ready to go into the demo, right? So, should we get building? Let's get into it.
So what are we going to build? Our first demo is the RAG approach with Amazon Kendra. Anytime you see "RAG approach," we're saying we want to use external data sources to augment our application and our large language model. Because, as Victoria showed, a general large language model might not know the answer to something, and we're customizing it using this RAG approach with enterprise data, your customer data.
And what data are we using today? For the purposes of this demo, US inflation data; you can see a little screenshot of it over there. It's going to be CSV data, PDF data, and a few others. Essentially, here's how it's going to work (see my face over there): imagine yourself typing in a SageMaker Studio notebook, and we're going to query and ask. Oh, one more thing: see that square on the side over there? Those are all the data types that Kendra supports, just so you're aware, because of course the tools you use matter for your use case, and it depends on the data you're working with.
So, back to the flow of what we're building. I'm the user here; imagine yourself. You're going to ask, "What is the inflation rate in 2022?" That is going to start a Q&A chain with LangChain (the nice parrot over there). What is that going to do? It's going to go to the Kendra index and search through the enterprise data we provided and connected to Kendra, to find context based on our question. "Oh, inflation data. OK, let me go find what's related to that" in the enterprise data that Kendra helps us search through, because it's an intelligent search service. Then it's going to take that context and return it as part of a prompt to a large language model deployed by SageMaker; we're going to use Llama 2 for the purposes of this demo. We'll do it through a SageMaker endpoint, and the response will come back to our notebook.
So that's what we're building in demo one. I guess let's start building. OK, we're going to start with SageMaker JumpStart. You see there's a foundation models button, and you're going to see how quickly you can get started. We're going to choose Llama 2, 7 billion parameters. It's going to generate a notebook, and that notebook has all the code we need to deploy the model as an endpoint, plus the payload format. So with the click of a button, it gives you all the code you need to start building.
Now we're going to open a different notebook, a new one, in SageMaker Studio. We have CodeWhisperer already enabled, and we're going to use this inflation data. We're going to download it as a CSV: US inflation data. This is what it looks like.
And the first step in any machine learning project: you've got to prepare and clean the data, right? So what we're going to do is import some libraries and download some of this data locally (if you put the CSV data in an S3 bucket, you'd have it there). And now we're going to use CodeWhisperer to clean this data; that's all we're going to do.
We're going to get, say, the first 14 records of the CSV data. And as you can see, I don't need to remember any of the syntax: I am coding in plain English, natural language, and it auto-completes the code for me. You can do it in two ways: either write a comment and it gives you code, or start typing and it autocompletes. All I have to do is press Tab to accept. Here I'm cleaning the data: I have some NaN values I want to remove, so I ask it to remove those NaN values and it drops them, or I tell it to drop the column. Here I have some NaN values I want to change to zero, so I ask it to do that too. You're really able to code much faster; I don't need to search through anything or remember the syntax. A sketch of what these generated cells look like follows below.
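For reference, the cleaning cells looked roughly like this; CodeWhisperer generated the code under each comment, and the file and column names here are assumptions for illustration.

```python
# A sketch of the comment-driven cleaning steps; names are illustrative.
import pandas as pd

# load the US inflation CSV data
df = pd.read_csv("us_inflation_data.csv")

# get the first 14 records of the CSV data
print(df.head(14))

# drop a column we don't need (assumed column name)
df = df.drop(columns=["notes"])

# replace the remaining NaN values with zero
df = df.fillna(0)
```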
Now, let's say I want to create a visualization in my notebook. I'm going to write a comment; it's really all about the logic, that's the main thing. I'm going to ask it to build a chart that shows the rate change over time, and look: it already generates the imports I need. I didn't need to think through the libraries. It generates a few lines of code, and I'm able to get this visualization. Now, this one's a little basic, so maybe I want to make it nicer. Let's also ask for rate change over time plus annual change, and make it a little fancier. I can add labels and titles, and it starts generating multiple lines of code, importing what it needs; you can have whole blocks of code generated, and you can press the up and down arrows with CodeWhisperer to cycle through other options.
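The generated chart cell looked roughly like this; again, the column names are assumptions, and CodeWhisperer pulled in the matplotlib import on its own.

```python
# A sketch of the chart generated from the comment
# "build chart that shows rate change over time and annual change".
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(df["year"], df["rate_change"], label="Rate change")      # assumed columns
ax.plot(df["year"], df["annual_change"], label="Annual change")  # assumed columns
ax.set_xlabel("Year")
ax.set_ylabel("Percent")
ax.set_title("US inflation rate change over time")
ax.legend()
plt.show()
```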
So let's say I've cleaned and prepared my data. Great. Now, the beauty of CodeWhisperer is that it's optimized for AWS services, so it works really well with them. Here I'm asking it to save my data to an S3 bucket: I finished cleaning the data and I want to upload it. So I write "copy file to S3 bucket," and there we go. Let's check that it works: upload it, then verify, and we see it's in the S3 bucket. Great.
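The upload step it produced is essentially the standard boto3 call; the bucket name and keys here are placeholders.

```python
# A sketch of the "copy file to S3 bucket" step and its verification.
import boto3

s3 = boto3.client("s3")
s3.upload_file(
    "us_inflation_data_clean.csv",            # local file
    "my-demo-bucket",                         # placeholder bucket name
    "data/us_inflation_data_clean.csv",       # object key
)

# verify the object landed in the bucket
listing = s3.list_objects_v2(Bucket="my-demo-bucket", Prefix="data/")
print([obj["Key"] for obj in listing.get("Contents", [])])
```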
OK. So we just prepared our data; that was all local, right? We took some data from an S3 bucket and downloaded it. Imagine you have enterprise data that maybe isn't public; you might want to clean it and do all of that with CodeWhisperer. Now, the next step is to utilize this data with a large language model, right?
So the next step is to deploy the Llama 2 model and invoke the endpoint. All the code in this part actually comes from the SageMaker JumpStart notebook we already generated. We're going to deploy the model as a SageMaker endpoint; SageMaker gives you a full suite for this. And here I can verify: here's my endpoint. This is what it looks like; you have an endpoint, and you're now able to target it.
Now we're going to go back, build a payload, send it to that endpoint, and actually ask: what is the US inflation rate in 2022? Let's see if it knows the answer. Give it a moment... and it tells me, no, it doesn't have that data. Why? It's not connected to my enterprise data, and a general large language model is not always up to date with that kind of information.
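Invoking the endpoint looks roughly like this; the endpoint name is a placeholder, and the payload shape follows the JumpStart text-generation notebook (the chat variant uses a slightly different format).

```python
# A sketch of querying the deployed Llama 2 endpoint before RAG is wired in.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")
payload = {
    "inputs": "What is the US inflation rate in 2022?",
    "parameters": {"max_new_tokens": 128, "temperature": 0.6},
}
response = runtime.invoke_endpoint(
    EndpointName="YOUR-LLAMA2-ENDPOINT",   # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
    CustomAttributes="accept_eula=true",   # Llama 2 requires EULA acceptance
)
print(json.loads(response["Body"].read()))  # without enterprise data, no useful answer
```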
So we want to add a few things. Let's look for a second at the architecture of what we've already done. We deployed a SageMaker endpoint with Llama 2, we asked it what the inflation rate is, and we got back a response that didn't have the enterprise data. We also cleaned some data; it was CSV data, that's lovely. But now we need to connect it to Kendra. We haven't done that yet, so let's do it right now.
We're going to set up Kendra now, and we're going to add a few other data types as well. Kendra is a many-to-one kind of service: you can have one index with all these different types of data behind it, so you only have to target that one index. How do we do that? Remember my favorite part about Kendra: connectors. You have 50-plus connectors, no code, and you can connect to all these external data sources very easily.
We were using a CSV file in an S3 bucket. But let's say, for the purposes of this demo, I have US inflation data on a website, and I want to scrape that data off the website so I don't have to keep checking whether anything was updated there (you can even schedule recurring crawls). All I had to do was fill out that form, and I have a web crawler for the data on that website. So if you have data you want to crawl, you can just add it as a connector.
Here it's going to be the same data we just used in the CSV file, but it could be a different type of data. And you can see this is my Kendra index, with all these different data sources attached.
Now, let's say I also want to add a PDF, because I want to see if this works with unstructured data too, whether it can read tables. This is a Starbucks annual report, so we're going to add it to our index as well; we're just collecting some other data sources to see if the application understands them and can be customized with them. It's a 100-plus-page PDF. Let's see if this works.
All right, so what have we done so far? We deployed the SageMaker endpoint, we asked what the inflation rate is, we set up Kendra, and we cleaned our data.
But now the best part: the RAG approach. We want to search through Kendra and then use that data when we ask our large language model, "What is the US inflation rate in 2022?" And the best part is that Kendra has these new generative AI tools; essentially it's one API call, the Retrieve API, built exactly for this. One API call and we're able to do it. So let's do that now. In part three of this notebook, the RAG workflow, we combine Kendra with our large language model. We're saying: hey, use this Retrieve API, get the data from Kendra, and ask the large language model the question again.
So let's actually query: what is the US inflation rate in 2022? And look at that: it starts the Q&A chain, gets the context from the information we gave it in Kendra, finishes the chain, and gives us back the answer: 8%.
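Under the hood, the chain looks roughly like this with LangChain's Kendra retriever and SageMaker endpoint wrapper; the IDs are placeholders, and the content handler must match your model's actual payload and response format (the JumpStart Llama 2 text-generation model returned a "generation" field at the time), so treat those details as assumptions.

```python
# A sketch of the Kendra + SageMaker RAG chain from the demo.
import json
from langchain.chains import RetrievalQA
from langchain.retrievers import AmazonKendraRetriever
from langchain.llms import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler

class Llama2Handler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt, model_kwargs):
        return json.dumps({"inputs": prompt, "parameters": model_kwargs}).encode("utf-8")

    def transform_output(self, output):
        # assumed response shape for the JumpStart Llama 2 text-generation model
        return json.loads(output.read())[0]["generation"]

llm = SagemakerEndpoint(
    endpoint_name="YOUR-LLAMA2-ENDPOINT",    # placeholder
    region_name="us-east-1",
    content_handler=Llama2Handler(),
    model_kwargs={"max_new_tokens": 256},
    endpoint_kwargs={"CustomAttributes": "accept_eula=true"},
)
retriever = AmazonKendraRetriever(index_id="YOUR-KENDRA-INDEX-ID")  # uses the Retrieve API
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, chain_type="stuff")
print(qa.run("What is the US inflation rate in 2022?"))  # now answered from Kendra context
```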
Now let's verify that this is actually the case. We look here on the website: it is 8%, and it was able to read a table. Great. I didn't even need to prepare some of this data, because that came from the web crawler.
Now, remember we also added a PDF for the purposes of this demo; we want to see how it works with different data sources. We're going to ask how many Starbucks stores opened in 2022, and the answer was 437. If we look at the PDF, that figure was in a table in an unstructured, 100-plus-page document, and it got it to us very quickly. So this is an example of how you can really augment your applications to be optimized for your business and your customers. That's the point: data is the difference between a general application and something that is customized.
So with that, I'm going to pass it back to Victoria, who will show you another demo covering other use cases for the RAG approach. Thank you, Linda. I really loved your example, how you showed different data types, and in particular the web crawler: so easy to use, all you have to do is provide the link.
Now I'm going to talk about a different demo, in which we will be utilizing vector databases. Why vector databases? Vector databases have been gaining popularity with the rise of generative AI, and there are several reasons for this.
First of all, they are optimized for similarity search. Where fast retrieval speed is critical (and you'll see some metrics later: the query executes in about 0.1 seconds), that's where you might want to start utilizing vector databases.
Also, some vector databases can store different data modalities. Linda shared the table earlier with Amazon Kendra; you saw it's mostly documents, right? But what if you have audio or video files? Maybe some of you work for a healthcare provider and have imaging data, clinical data, or sequence data. That kind of data can be converted to vectors and stored in vector databases.
Vector databases also handle high-dimensional data well; where traditional databases struggle, that's another place you can utilize them.
So let's go back to our technical stack. Here you can now see data platforms. In what I'll show next, we will be using a third-party vector database, because we want to show how, besides utilizing Amazon services, you can connect to and extend your application with other enterprise systems.
We will be using Pinecone, which is quite popular nowadays. You can also think about other enterprise systems, say Salesforce. Really, as an enterprise, in order to maximize the value of generative AI, you have to think about how to build pipelines to your tools, applications, and data sources.
There are a lot more tools I should add to this stack; however, we only have 45 minutes, so we cannot cover every single one. Of course, setting up guardrails and thinking about policy and security are all important questions when it comes to building generative AI applications.
If you remember, earlier I asked the question, "What is Amazon Bedrock?" and got a wrong answer. So let's fix this, and we will do it by providing documentation from the Amazon Bedrock website. We will download the documentation, convert it to vector embeddings, and see if we get the right answer. Earlier, Linda described how to access foundation models with Amazon SageMaker, but we recently announced Bedrock, so why not use Bedrock to ask what Bedrock is?
So what is Bedrock? Unlike SageMaker, where you have to host and deploy the model yourself (Linda had to do those steps), with Bedrock you can simply use an API to access different models. You have one API, and all you have to provide is the name of the model; then you can start interacting with and trying different models. There are quite a few models available: AI21 Labs, Stability AI, Anthropic. We will be using Anthropic now. And what's really good is that Bedrock is serverless: when you don't use Bedrock, you don't pay for it. With SageMaker, once you deploy an endpoint, you pay for it whether you're using it or not. So Bedrock is cost-effective, and you don't have to manage infrastructure.
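Calling a model through Bedrock is a single runtime API call. Here is a minimal sketch with boto3, using the Anthropic Claude request format Bedrock expected at the time of this talk.

```python
# A sketch of invoking Anthropic Claude 2 through the Bedrock runtime API.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
body = json.dumps({
    "prompt": "\n\nHuman: Explain what is Amazon Bedrock.\n\nAssistant:",
    "max_tokens_to_sample": 300,
})
response = bedrock.invoke_model(modelId="anthropic.claude-v2", body=body)
print(json.loads(response["body"].read())["completion"])
```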
So how do you get started with Amazon Bedrock? First of all, you can simply use the UI. There is what's called a playground, where you can select a model, adjust hyperparameters, and experiment to find which model really fits your use case. There isn't one model that's going to rule the world and that we all have to use; you really have to identify which model is best for you, and also which is best cost-optimized. You could use a 70-billion-parameter Llama model or a 7-billion-parameter one, so you have to be aware of the cost.
Second, with Bedrock you can also customize models. If you have data in S3, you can use it as a training dataset and a validation dataset and further pretrain the model. You can also connect to different data sources, and that's what we'll be doing with the RAG approach: we are not going to modify the model's weights, we are just going to connect it to external resources.
And if you watched the keynote today, Adam announced that Agents for Amazon Bedrock are now generally available. For tasks that require orchestrating complex steps, that's where you can build agents. And what's really good is that the data you're utilizing is not shared with the foundation model providers, so you can customize and utilize your data with respect for privacy and security.
Now, let's look at the architecture of what we're going to build. First, we're going to take the documents. Unfortunately, unlike what Linda showed with Kendra, we cannot simply start utilizing these documents: we first have to convert them to vectors, and to do that we use an embedding model. As you're probably aware, each foundation model has a limit in terms of tokens: when we provide the prompt and the document, there is a certain number of tokens we can utilize. So we cannot simply convert whole documents to vectors; we have to split them first. In our case, we will split them into chunks of 1,000 tokens, and we will be using the Amazon Titan embedding model, which supports up to 8,000 tokens. Once we split the documents, we get vectors. This is a very primitive representation of what a vector is, just to show you: the picture is three dimensions, while we will be using 1,536 dimensions today. But you get the idea: similar objects cluster together. Here we have fruits on one side and animals on the other. Similar text will be clustered together, and similar images too. And when we execute a query, we are really doing a similarity search, and it returns these similar objects.
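The chunking and embedding steps look roughly like this with LangChain; note that this splitter counts characters rather than tokens, so the chunk_size here is a rough stand-in for the 1,000-token chunks described above, and the PDF file name is a placeholder.

```python
# A sketch of loading, chunking, and embedding the Bedrock documentation.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings

docs = PyPDFLoader("bedrock-user-guide.pdf").load()   # placeholder local file
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")
vector = embeddings.embed_query(chunks[0].page_content)
print(len(vector))  # Titan text embeddings have 1,536 dimensions
```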
Those objects will be represented as vectors and stored in a vector database. We will be using Pinecone, but you can also use open-source options, Amazon RDS with pgvector, and many other databases. Once we have prepared the data, we will use LangChain to act as an orchestrator between Bedrock and our vector store. We'll use LangChain to get the relevant information, and we can specify how many passages we want returned.
Then, once we get this context, we'll send it to Bedrock, Bedrock will provide a response, and I'll get the answer in SageMaker Studio; just as Linda ran her demo in SageMaker Studio, I'm doing mine there too.
So let's get to building. The first thing we'll do is go to Bedrock. With Bedrock, you need to enable the models: you go to the model access section and enable them. Once you've enabled them, you can go to the playground in the console and start experimenting. You can see here I selected the text playground with Claude 2, and you can adjust different hyperparameters and see how the model responds, so you know whether this is a good model for your use case or not. You can see it streamed an answer, and it actually said Amazon Bedrock is an internal system developed for Amazon, which is obviously incorrect. So let's fix this. We'll copy the API call, insert it into SageMaker Studio, and you can see it gives me the same wrong answer.
So now we have to fix this. The second step is to prepare our data. We'll use the Amazon Titan embedding model; that's what's specified here. Now, what do we do? We have to download the documents and convert them to vectors. We simply provide the two links to the documents; now they're stored locally, and we're ready to start chunking them.
This next notebook cell just chunks the documents, splitting them into 1,000 tokens per chunk; you can see what a chunk looks like, about two rows. Once we have chunks, let's initialize Pinecone. Pinecone has a simple API; for the demo, you can create one free index, and you can see which parameters are suggested depending on the embedding model. That's basically it: you initialize it, and then to start utilizing the index, all you have to do is provide the index name.
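Initializing Pinecone looked roughly like this with the v2 client that was current at the time; the API key and environment come from your Pinecone console and are placeholders here.

```python
# A sketch of creating and connecting to a Pinecone index for the demo.
import pinecone

pinecone.init(api_key="YOUR-API-KEY", environment="YOUR-ENVIRONMENT")   # placeholders
pinecone.create_index("bedrock-docs", dimension=1536, metric="cosine")  # matches Titan's 1,536 dims
index = pinecone.Index("bedrock-docs")
```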
Now let's convert just one chunk to see what the vector looks like; you can see the mathematical representation. Then let's convert the two documents. It's going to take a bit of time, and the more documents you have, the longer it takes. You can see it took about one minute and four seconds to convert two PDF documents of about 150 pages each.
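Embedding every chunk and upserting it into the index is one call through LangChain's vector store wrapper; this sketch reuses the chunks and embeddings from above.

```python
# A sketch of pushing all document chunks into Pinecone as vectors.
from langchain.vectorstores import Pinecone

vectorstore = Pinecone.from_documents(chunks, embeddings, index_name="bedrock-docs")
```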
So now it's in the vector database. You can see it in the console, and you can also see the metrics here; on the very right side, there's a spike of inserts, showing how your database is being used.
OK, so what have we done so far? We took the documents, downloaded them, chunked them into 1,000 tokens each, converted them to vectors, and stored them in Pinecone. Now we have to do the orchestration: use similarity search with LangChain and pass the retrieved information to Bedrock, and we'll see if we get the right response this time.
Now I'm just running a query: return the three most relevant passages. We can use this information to do the search, and now we get our answer: Amazon Bedrock is used to access foundation models via an API. It's a bit short, right? We want more details, and that's where we can use prompt engineering to say we really want more detail; you can also adjust and tune the parameters.
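The retrieval and answer steps look roughly like this, assuming LangChain's Bedrock LLM wrapper; k=3 asks the vector store for the three most similar chunks, matching the query above.

```python
# A sketch of the similarity search plus the Bedrock-backed QA chain.
from langchain.llms import Bedrock
from langchain.chains import RetrievalQA

llm = Bedrock(model_id="anthropic.claude-v2")
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})  # return three relevant passages
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, chain_type="stuff")
print(qa.run("What is Amazon Bedrock?"))  # now grounded in the documentation
```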
Now you can see it provides a more descriptive answer, talking about the playgrounds we saw earlier, and you can keep going. You don't have to use Bedrock with RAG; you can use it with or without RAG and compare the answers. Here I'm asking what the key features are, and without RAG I'm told Bedrock is a blockchain, which obviously isn't correct.
Then, with the same question, what are the key features, we can see the key features coming from the documents. I know both demos moved fairly fast and we had to split them up, so let's summarize the difference between the two.
First, Linda showed how you can access foundation models using SageMaker. With SageMaker, she had to first deploy the foundation model as an endpoint and then invoke that endpoint. Then she used Kendra, where she didn't have to do any data prep: she simply said, this is my index, these are my data sources, and by querying the Kendra index she started getting information back.
In the second demo, we showed how you can use Bedrock and skip the step of deploying the model: instead, you use Amazon Bedrock, the serverless service, and simply use the API to access a foundation model. Then we used a vector database for the similarity search. At one point, the demo was supposed to show that the query took 0.1 seconds; it was really, really fast.
If you need to store and access different modalities of data, like images and text, that's when you might want to use vector databases. So: one approach is serverless, and the other you have to manage. Kendra is really easy to use; with vector databases, you have to do the data prep yourself. And of course, you don't have to pair the services the way we did: you could use Bedrock with Kendra, or SageMaker with vector databases. We just tried to cover as many different components as possible in today's session to show you the different approaches available to you.
Now I want to ask Linda to summarize some of the key takeaways from this session and share some resources, so you can start building your own applications.
Thank you so much, Victoria, and thank you for showing the differences between the RAG approaches. We covered a lot today, so if you walk away with anything, let it be these three takeaways.
One: foundation models are powerful but have limitations. We saw hallucinations, and we were able to mitigate them using the RAG approach, one of the three customization methods Victoria mentioned. Two: you want to work backwards to select the right approach for your use case. We showed you two different demos; it really depends on the type of data you're working with and what you're trying to build. One demo used vector databases, the other used Kendra, depending on the type of data you have and the use case that's right for you. So you have to work backwards from that.
And three: your data is the differentiator. That is the difference between a general AI application and one that is customized for your use case and your business. That's the main thing. And of course, the best way to learn is to build.
So we want to give you a few resources to get started if you want to build with the RAG approach and get hands-on experience. These workshops are end-to-end; you can take a screenshot with all the QR codes. The first workshop, on SageMaker, aligns with what we covered in the first demo. The second, on Bedrock, aligns with what Victoria covered in the second demo. And the third is on CodeWhisperer. They go deeper, and you can experiment with the labs there. Very helpful, and we hope you get started building. We hope you enjoyed this session.
Thank you. And of course, connect with us on LinkedIn; if you build anything, tag us. And please complete the session survey if you can.