Google Cloud presents: AI Tools for Your Research

Thank you for the wonderful introduction, and thanks everyone — it's a pleasure to be here. As mentioned, I'm Russ Goldenbroit. I've been with Google for about eight years now, and most of that time I've spent in the education technology space around Google Cloud. With the advent of generative AI, my focus has shifted more toward AI solutions, cybersecurity, and a lot of research computing, which is all part of what we'll cover today.

With me is Anand — he's the muscle, essentially our generative AI specialist. We briefly touched on the agenda. I'm going to assume that not everyone here is coming from the same background, so I'll move quickly through some foundational things you need to know in order to follow everything else we'll discuss around AI and the different services you can use for different use cases. I'll try to breeze through that, because for some of you it may be trivial and you probably already know it. We'll work through these topics, and at the end we have something called Cloud Labs, which I'll leave some time to explain. These are essentially micro environments that let you play around in Google Cloud and test some of the concepts we'll be going through. They're fairly open-ended — honestly, you can do whatever you want in the console — but bear in mind it's a step-by-step instructional environment meant to help you understand how to use some of the tools and services. Lastly, I know some of you had questions prior to the session, so we'll have time at the end for questions, and anything we don't get to I'm happy to answer afterwards or offline, asynchronously. Pleasure to meet you all, and with that we can get started.

I'm going to give a brief introduction to Google Cloud and cloud computing as a whole. For those who aren't familiar with cloud computing, the concept practically speaking goes back years, to the advent of things like Salesforce.com: offering applications and infrastructure without you having to keep the physical hardware on site. That brings a multitude of positives. For one, you no longer need thousands of servers in your building to get access to that computing power. The same goes for storage, databases — really most of what makes up computing today. It's also a pay-as-you-go model. While we do have the pleasure of working with institutions like Columbia University, which provides access to students in a more provisioned way, there's nothing that stops any individual or consumer in the world from going into Google Cloud, attaching a credit card, and saying, I want to create my own app or server. The last piece is scalability. Decades ago — and even today at organizations with more legacy setups — you'd have data centers filled with hardware, and if you needed more, you can imagine a procurement cycle of weeks and then more months to install everything and make it available to people like yourselves at the university for research or any other computational work. With cloud, that's not necessary. It's highly scalable: you point and click, and you get what you want, essentially. There's a lot more behind the scenes that makes that possible, and I'm not going to go through all of it, but I did want to paint the picture of what Google Cloud is.
For those who aren't familiar — and having worked here for years, I still don't know all of the hundreds of different services Google Cloud offers today — at a very high level, these are things you're probably familiar with as a consumer: Gmail, Maps, YouTube. They all run on Google's back end the same way any business can spin up an application; that's where it started, with these consumer applications. What you're seeing in the center, all these icons, are the various services provided by Google Cloud, and this is just a snapshot — in no way an exhaustive list. As you walk through it, compute and storage give you the basic infrastructure needed to run any type of application or workload. Under big data — if there are any analysts in the room — things like BigQuery are our data warehouses and analytics platforms that let you run analyses with SQL. Machine learning, which is the basis of a lot of what we'll talk about today, is a set of services built on top of everything you see here; I'll let Anand talk through most of that. We're not going to say as much about API management tools, identity, and security, but these are the supporting services without which the other pieces of infrastructure and analytics platforms wouldn't function, or would be insecure. So at a very high level, this is what you should think of when you think about Google Cloud Platform as a provider of services.

I've worked with students, faculty, and researchers for a long time, and there are many different things I've seen faculty, staff, and students use Google Cloud for. Some of them may surprise you — for example, web and mobile app development. A lot of universities have incubators trying to create applications and spin them off as side projects. More importantly, and much of what we'll talk about today, is data science. For anybody who does data science work, this is a huge piece of what Google Cloud offers — we'll talk about Jupyter notebooks, but also BigQuery and other ETL tools that analysts use to conduct different types of statistical analysis. Obviously machine learning, which kind of goes without saying — TensorFlow began at Google, if you're familiar with TensorFlow — but I'll table that for when Anand talks. And then even some of the newer areas like blockchain or game development: these are things being done at a lot of companies and universities today. So that gives you a sense of the breadth of use cases universities and students have.

I thought this slide would be interesting for students or anybody who wants to learn more about cloud in general. This is what Google offers in terms of training: for example, as a practitioner or somebody interested in learning more about infrastructure, data management, analytics, or machine learning.
There's basically a path of self-paced learning you can go through, with hands-on labs, that lays the foundational skills you might need to take on the roles you see on the right. And again, this is just a cutout — there are many more. Especially as you're thinking through your career and what you might do afterwards, you can take these self-paced courses to lay the foundations you'd need to succeed in those kinds of roles. So those are available as well.

Now we're going to dive into the specifics, because a lot of what we'll focus on today is data analytics, machine learning, and AI. Quick show of hands — who knows what Colab is? Okay, most people. I'll say at a very high level what it is, and then we'll jump into the actual platform to show you how to use it, what kinds of things you can do, and why it's so powerful. For those who aren't familiar, Colab is, at a very high level, a way for data analysts and anyone who just needs to run code to quickly build what we call notebooks — notebooks that are easy to collaborate on and share. There have been many flavors of Jupyter and these kinds of Python notebooks over the years, and it's come to the point where enterprises and organizations have begun to create their own versions of them. Many of you are probably familiar with the free version of Colab, because it's free, and that's what a lot of consumers use. There are also paid versions of the consumer product — Colab Pro and Pro+, I believe — and what those consumer versions offer is essentially notebooks with bigger and bigger engines behind them: access to additional resources that can accelerate your analytics and give you insights faster.

We're not going to talk much about the consumer version, though; we're going to talk about the Google Cloud Platform versions. There are two flavors of Colab in GCP specifically. There's something called Workbench, which we won't spend as much time on, because Workbench is the flavor more suited to larger enterprises and organizations trying to build and deploy models with applications. What I'm going to talk a lot about — and what I think students and researchers in general get a lot of practical use out of — is Colab Enterprise. There are a couple of main reasons this version of, essentially, Jupyter notebooks is used by the research community. The first and most important piece is shareability. It inherits the way everything in Google Cloud Platform is shareable, using a concept called Identity and Access Management. What that allows you to do is create a notebook — some analysis you're running — and very easily share it.
You hit Share and enter someone else's email address to give them some level of access to that same notebook — to collaborate, or just to see what you did. You can really put anything you want inside these notebooks, but it strongly encourages collaboration. The other piece is the variety of GCP capabilities you see here that make it a more enterprise-ready solution. The consumer version, for example, doesn't have the security functionality that Google Cloud Platform provides. The consumer version works with Drive, and that's about the extent of it; there aren't controls like limiting data exfiltration — meaning, say you had something in a notebook that you didn't want to leave your environment, maybe because it's sensitive or it's your own work you don't want exposed. You couldn't really stop that with the consumer version if an account were compromised. I'm not going to get into all of it, but the idea is that there are a number of very enterprise-level controls in place to secure your solution. The other piece is that the consumer version is very basic: you get what you're given — a very small machine that the notebook runs on top of — and that's what you have to work with. If you're doing a very complex analysis, it could take forever to execute. But imagine you had the power of cloud behind it: say you're trying to train a model, run inference with a machine learning model, or do something else that requires a very large amount of compute, or access to GPUs and so on. That's what Colab Enterprise provides — it allows you to customize the back end that runs and accelerates your analytics and research. So I'll pause there. Those are the main incentives; there are others as well, but there's only so much time.

I'm going to quickly walk through it for those who haven't seen it. Looking at the time — looks like we have time. Okay, so I'll give a brief introduction to Colab Enterprise, and really to the Google Cloud console generally, if you've never opened it. What you're seeing is the Colab Enterprise feature. Can you see it? Okay. There are a number of tools and services in Google Cloud Platform today — if you go to this hamburger menu and hit "View all products," it brings up essentially all the services Google Cloud is composed of. I've clicked into the ones we're going over today, but I wanted you to understand that you really have access to a wide variety of services. Now, within Colab Enterprise there are a number of different things you can do. I'm only going to go over them at a high level and show you a few, but imagine yourself as an analyst or researcher coming into this platform.
There are a couple of things here that I think are unique — things you wouldn't be exposed to in a basic consumer product. One is region. With cloud computing there's a concept called data residency: where is your compute, your storage, your data — this notebook — actually sitting? You can spin it up in many different locations around the globe where Google's data centers are. That gives you the chance to use resources closer to you, so things run with less latency, and there are also use cases — for example, regulated or government-type research — where you're actually required to keep all of your data and storage in a specific location. This isn't available in the consumer version, which is part of why Colab Enterprise exists. At the top here you'll see "My notebooks" and "Shared with me." This is pretty self-explanatory, but when I was talking about Identity and Access Management, this is how easy it is to share notebooks with your collaborators, whether they're sitting in the same room with you or on the other side of the world; it gives them a level of access that you control to the notebook you've created. Here are some notebooks I've created as test environments, and it's really as easy as clicking into a notebook to open up the tabs that contain the code, or whatever it is you were doing with it.

A few quick concepts, in case people aren't familiar. There's a concept called runtimes. In this example I have a default runtime. A runtime is essentially the engine running your notebook: underneath it — abstracted away from you as someone using a Python notebook — is a virtual machine with disk and storage, sitting somewhere on a network you never see. This is where you create the runtimes your notebooks will run on. And if we go back to notebooks and click into my notebook, you can very easily see the runtimes available to you: you can connect to an existing runtime, or create a new one, which again is some combination of compute, disk, and whatever else you associate with it. I'm not going to actually spin up resources, so I won't spend time on that, but I wanted everyone to know what runtimes are. Runtime templates are exactly what they sound like: in many cases, for repetitive research and analytics, you'll end up needing a certain type of machine with a GPU attached, and instead of recreating it every single time, templates make it easy to create those machines for your runtimes. And of course — this part may or may not be as interesting to everyone here — many times when you're working in Python and doing analytics, you're going to run into issues and need to troubleshoot what happened. That's where monitoring and the Executions tab come in: it shows you the jobs you're running, or have scheduled to run, for your notebooks. This is also part of why it's the enterprise version — you can't really do this
very well with the free consumer version. The last piece is Schedules, which is pretty cool — think of a schedule as something like a cron job. If you have some code you want to run periodically, notebooks give you a very easy way to schedule those jobs at whatever frequency you want, and this is where you create them. That's pretty much how simple it is. There's a lot more you can do in terms of functionality, but I wanted to show you the really basic version of how easy and fast it is to execute some analytics.

AUDIENCE MEMBER: Is the runtime priced based on the different resources that you're using, or is it based on time?

That's a good question. The cost is really based on the back-end resources the runtime is supported by: the machine shape, the amount of persistent disk, the other resources — and if you're using GPUs, that's really where the bulk of the money goes. One thing you'll find is that if you don't spin the notebook down, it will continue to consume resources in the back end, because that's what's supporting it. But once it goes idle, it will disconnect and shut down at some point, so there are some failsafes built in. Still, if you're not using the notebook, it's best to close it down so it's not consuming resources unnecessarily. Hopefully that answers the question.

Cool. So, very basic, but if you're an analyst or developer who has worked with Python before, there are a couple of things I want to highlight here. You can run cells in many different ways — I'm just going to point and click to make it easy — and it's very easy to create notebooks that contain code; it's a way to segment your code. What we're doing here is what we'd do in any Python file: we authenticate, and then we import some libraries we want to use. I've already created some things, but I want to make sure you understand some of the power this brings compared with a more basic version. If you remember, we talked a little bit about BigQuery and some of the data analytics platforms Google has. The great thing about Colab — and Anand will probably go into this a little more — is that it connects to a variety of the other Google Cloud services. So, for example, if I'm storing a lot of tables in BigQuery for analytics, or whatever research I'm working on, I can simply connect to it and run a query against that service from a Python notebook, without having to go into the BigQuery console or make an API call externally from my laptop. It's really a platform that makes it easy to connect to other services. In this particular case it's a public dataset in BigQuery, and what this cell is doing is running a query against it and returning a DataFrame — for this example it's trivial, meaningless information, but it's really just to show you that you can connect to a variety of different Google Cloud services.
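If you want to reproduce that kind of cell yourself, here's a minimal sketch, assuming the google-cloud-bigquery client library and one of the BigQuery public datasets; the project ID is a placeholder, and in Colab Enterprise the runtime's service account handles authentication.

```python
from google.cloud import bigquery

client = bigquery.Client(project="your-project-id")  # placeholder project ID

query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""

# Run the query in BigQuery and pull the results back as a pandas DataFrame.
df = client.query(query).to_dataframe()
print(df)
```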
Now, if we keep going down — as a data analyst you may be more familiar with things like pandas or other analysis libraries.

AUDIENCE MEMBER: If I get data from a vendor, do I need to download it and then upload it separately, or can I pull it in directly? How do you generally explore it?

There are so many possibilities there. It depends on where the data is: for example, if the vendor or whoever you're buying the data from has an API they've made available to you, you can call it from the notebook. But you can also take the data and store it somewhere else in Google Cloud — whether that's BigQuery, or an object-level store like Cloud Storage, or one of the many other options — and then run queries and things against it.

So there are a lot of options available. Again, this is just some light analytics. And I like this piece: for those who aren't as familiar with Colab, there are a number of libraries that let you create plots and more interactive visualizations — with Plotly, for example — that many people are simply unaware of. That's what this is using in the back end. It's nice, because it actually shows you the time each cell takes to complete and whether any errors occurred.

You'll see some warnings here that I'm not worried about, because the cell still completes for my purposes. But there are some really powerful things you can do as an analyst, and visualizations you can create, to understand the data you're working with. I like to think of Colab Enterprise as a higher-level, more customizable way to do that.
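As a tiny illustration of the kind of interactive plotting mentioned above — a hedged example with made-up numbers, not the data from the demo — a Plotly Express cell in a notebook might look like this:

```python
import pandas as pd
import plotly.express as px

# Made-up example data purely for illustration.
df = pd.DataFrame({
    "year": [2019, 2020, 2021, 2022, 2023],
    "submissions": [120, 180, 240, 310, 400],
})

# Renders an interactive line chart inline in the notebook (hover, zoom, pan).
fig = px.line(df, x="year", y="submissions", markers=True,
              title="Example interactive visualization")
fig.show()
```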

You can imagine, though, that if I were working with hundreds of terabytes of data — this is a very small dataset — a query like this could take much longer to run. That's where a lot of the value of Colab Enterprise comes in.

It allows you to custom-fit the resources underneath to the size of your data and the number of queries you need to run, because quite honestly, you don't want to click this and then wait 24 hours for it to complete. So that's one thing.

And then at the very end — again, it's very easy to work with machine learning libraries. This cell uses scikit-learn to create a random synthetic dataset, then runs a machine learning algorithm on top of it and makes some predictions. It's as simple as installing the libraries and writing the Python code, and you end up with results in a matter of seconds.
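A hedged sketch of what a cell like that typically contains — the dataset is synthetic and the particular classifier is just an example, not necessarily the one used in the demo:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Generate a random synthetic classification dataset.
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a model and make predictions -- this finishes in seconds on a small runtime.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```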

The thing I like to point out here is that this is the heart of what people use GPUs for. If you're doing inference, training a model, or doing anything related to machine learning, that's for the most part a compute-heavy workload, and this is where people create runtimes with various GPUs attached.

You can do that in Colab Enterprise as well, to help accelerate these kinds of classification problems and similar workloads. And again, just to show how Colab Enterprise connects to a variety of different GCP services: a lot of people are familiar with Google Translate. There's an enterprise version of it — essentially an API — that a lot of businesses and organizations use for translation.

This cell is just showing me, as a user, connecting to that Translation API and asking it to translate a number of different phrases. The machine is doing this in real time, calling the API and getting results back — and here's another warning that I don't care about.

It doesn't stop it from working. It took about 10 seconds to run, and when you come down here you get your results, which you can do with as you please: it starts with the English, assigns a confidence for the detected language, and translates the phrases into other languages.
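Roughly, a cell like that might look like the following, assuming the google-cloud-translate client library (the v2 interface); the phrases and target language are placeholders:

```python
from google.cloud import translate_v2 as translate

client = translate.Client()
phrases = ["Good morning", "Where is the library?", "Thank you very much"]

for phrase in phrases:
    # Detect the source language (with a confidence score), then translate it.
    detection = client.detect_language(phrase)
    result = client.translate(phrase, target_language="es")
    print(f"{phrase} -> {result['translatedText']} "
          f"(detected {detection['language']}, confidence {detection['confidence']:.2f})")
```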

So this is all possible with Colab Enterprise. Yes?

AUDIENCE MEMBER 2: Does Colab update the APIs and libraries it's using automatically, or do you do that yourself?

No, that's on your side — you would have to update them yourself.

That's basically Colab. Oh — you mean the libraries? Yeah.

Yeah. AUDIENCE MEMBER 5: So if you haven't pinned the requirements yet, what would it be pulling in that case? Unless you don't need it yet.

Right. And with that, I think that's where I'll finish with Colab Enterprise and hand it over to Anand to talk through more AI-focused development.

I don't actually know — it should be available as a service in the future, but I'm not sure.

AUDIENCE MEMBER 6: What are some considerations for how to size a runtime and update it? If I want to write some code and run some analysis in a notebook, how do I know what machine to use for it?

Right — good question.

The normal way this is done in organizations and enterprises is a process of benchmarking. There's a myriad of different machines, with various processors and so on behind them, and if you go to Google Cloud's public documentation — really any of the public documentation —

you're going to see a variety of machine types — M1, M2, E2, all these different families — and underneath each one it gives you a general idea of what people use that machine type for. Are you trying to run something compute-heavy? Something memory-heavy? Something that may require GPUs? That helps you segment initially which machine types to look at and whether you're going to need GPUs.

Then you'll most likely start, cost-wise, with something small and see how long things take. This is the process of benchmarking. Over time and with experience, you get more of an idea of what fits your particular type of work.

You'll run it and know very quickly if something is taking hours, and that's when people scale up the size of their machine — the amount of compute and memory, or whether the workload needs GPUs — to meet their need. Technically, nobody's stopping anybody from using a huge machine to run something very simple.

It's just very expensive. So what you end up doing, after you've initially segmented, is start with probably the smallest machine types, really benchmark, and work your way up to align cost with time to insight. Yeah.
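If you'd rather browse the same information programmatically instead of in the docs, here's a rough sketch with the google-cloud-compute library; the project and zone are placeholders:

```python
from google.cloud import compute_v1

client = compute_v1.MachineTypesClient()

# List machine types in one zone with their vCPU and memory shapes.
for mt in client.list(project="your-project-id", zone="us-central1-a"):
    print(f"{mt.name}: {mt.guest_cpus} vCPUs, {mt.memory_mb} MB RAM")
```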

You good on that? Yep, I'm good. I just pasted the link for what Russ was talking about — essentially it lets you see the machine types you can choose and what each one is intended for.

It should be in the Zoom chat. The Zoom chat isn't shared with everybody, but I'll make sure it gets to you. Thank you.

Cool. So: a quick intro to AI. Again, my name is Anand, and I'm going to keep my introduction real quick.

I've been at Google for about a year and a half now, but prior to joining Google I was with Microsoft for about 15 years, working with the same customer base: state and local government and education. All right.

So let's start here. How many of you have used any sort of generative AI model? Show of hands. And if you had to pick, maybe anybody wants to call out your favorite model, what works best? Yeah.

Anyone else? No, you can't say ChatGPT, because I use ChatGPT too. All right, cool. So this is a question we get asked often when we go to our customers: hey, it looks like GPT-4 beat the benchmarks for Gemini, and then Gemini releases a new model tomorrow.

It's better than ChatGPT, and then there's Claude. How do you actually pick the right AI model? This is a question most customers have, and as developers and practitioners we have it too. But I want to take a step back and ask: are we asking the right question? Because right now — and I think you'd agree — we're at a stage in this AI lifecycle that is still very nascent.

We're probably in the beginning stages of it, and there are obviously going to be models that come out tomorrow, or a week later, that keep getting better. So how should we think about this? Should the deciding factor be the model, or should it be more the platform and the way we deploy things? That's why I want to introduce Vertex AI, which is Google's end-to-end machine learning platform.

What makes it unique is basically these three building blocks. I'll talk about all three, keep the slides to a minimum, and try to show my console as much as possible.

The first one is Model Garden. I talked about different models being available: with Model Garden we have a bunch of first-party models — obviously Google's Gemini models that you can pick and choose from, 1.0, 1.5, and so on.

We also have open models that Google developed, like Gemma — I'm not sure if you've heard of Gemma — so there are first-party open-source models as well.

But we don't stop there. You can also bring in external open-source models. We recently announced an integration with Hugging Face — how many of you are familiar with Hugging Face? Awesome.

You can deploy Hugging Face models directly into Vertex AI with the click of a button. You also have the option of bringing third-party models into the platform, like Anthropic's Claude. Now, I know what you might be thinking: what's the advantage of bringing a model into Vertex AI? Why can't I deploy it where it is? I'll talk about why that matters in the coming slides.

So that's the first one. The second is Agent Builder and Model Builder. The approach Google has taken with generative AI is that there's a spectrum of audiences: there are people who want low-code or no-code — business users who should still have access to AI and the models behind it — so we're trying to democratize that.

I'll show you step by step how that gets created — we'll actually try to create an AI chat app for today's slides. And at the other end, the deep end of it, if you're a developer who wants to access the APIs and set all of the configurations and safety filters yourself, that's also possible.

So you can pick and choose where you land on the spectrum, and that's one of the capabilities we bring. And again, Model Builder is an end-to-end MLOps pipeline. Cool.

These are Google's foundation models as of now — I'm sure in the next week it's going to change. We have Gemini 1.0 and 1.5 Pro, which are multimodal: they don't just understand text, they can also reason over videos, images, code, audio, and so on. I downloaded a video from the Columbia website, but if you have a video you want me to analyze, we can do that live.

We have other models as well: Imagen — I think we're on Imagen 3 now — which is text-to-image generation, like DALL-E and Midjourney; Codey for code generation; and Chirp for speech-to-text. So those are the first-party models.

And this is the variety of models beyond that, open source as well as third party — I think as of now we're sitting at about 170 models available in Model Garden — and you can pick and choose based on your use case. Now, going back to the question I raised earlier: why would you want to bring a model into Google? There are a few advantages. First, we have certain safety configurations and responsible AI controls that we build in.

Most of those customizations and configurations get applied automatically when you bring a model in. For example, say you have Llama and you want to deploy it to Vertex AI: most of the safety configurations and safety filters we make available for our first-party models automatically apply to third-party and open-source models as well.

So that's one advantage. There are more, one of them being price. If you look at our stack — I think it was back in 2017 that Google declared itself an AI-first company — we've been thinking about this for a while, and we have something called the Tensor Processing Unit, or TPU, which is the equivalent of a GPU but purpose-built and tuned for machine learning models.

What that means for you is that, on both cost and performance, it can be significantly better than GPUs. And when you bring these models into Vertex AI, you get the benefit of running them on TPUs, so the same models run more cost-effectively and with better performance. The other advantage is that you get the end-to-end machine learning and MLOps pipeline — all the tools you'd need to build with.

It's all in a single, enterprise-ready platform. I know I've been speaking for a while — any questions? We can make this interactive.

That's a great question. There are two ways to address this in Google Cloud. One of them — let me see if I have that slide.

I don't think I have the exact one, but let's go with this. To your point — the first part of the question, on sensitive and secure data. We have a consumer-facing version as well, called AI Studio, but in the enterprise version, Vertex AI, when we bring in a model we freeze the model weights, and that foundation model is deployed within your instance.

Once that's deployed, none of the data you send it — say you have PDFs, BigQuery tables, structured data — leaves the boundary of your Google Cloud environment, which in this case would be Columbia University's GCP environment. The data is encrypted at rest and in transit, and anything you feed the model — data, prompts, queries, analytics, audit logs — remains inside the context of your Google Cloud project. Nothing goes back to the foundation model to train it. And say you're done with the project and decide it doesn't really work for you: it all just gets torn down.

Everything resides within your environment and gets torn down with it. So that's one way it's protected. The second option is, if you want to make sure that even the frozen-weight model never sees sensitive data, we have something called DLP — Data Loss Prevention. What that does is, say a user is sending the bot sensitive information — they type in their SSN, their date of birth — which you don't want the model to receive: DLP will automatically redact that information before the prompt is sent on to the foundation model to get the response.
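As a rough sketch of that redaction step, assuming the google-cloud-dlp client library — the project ID, info types, and sample text are placeholders, and a production setup would tune the inspect and de-identify configs:

```python
from google.cloud import dlp_v2

dlp = dlp_v2.DlpServiceClient()
parent = "projects/your-project-id"  # placeholder

item = {"value": "My SSN is 123-45-6789 and my date of birth is 01/02/1990."}
inspect_config = {
    "info_types": [{"name": "US_SOCIAL_SECURITY_NUMBER"}, {"name": "DATE_OF_BIRTH"}]
}
# Replace each finding with its info-type name, e.g. [US_SOCIAL_SECURITY_NUMBER].
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {"primitive_transformation": {"replace_with_info_type_config": {}}}
        ]
    }
}

response = dlp.deidentify_content(
    request={
        "parent": parent,
        "inspect_config": inspect_config,
        "deidentify_config": deidentify_config,
        "item": item,
    }
)
print(response.item.value)  # the redacted text is what would go on to the model
```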

There are also ways to feed it mock data in place of the real values, and when the response comes back it gets correlated again — so there are architectures for doing that. Great question. Anything else? Any other questions?

AUDIENCE MEMBER: One thing I wanted to ask — you were talking about how Vertex can analyze not just text, but also pictures and video.

I'm just wondering how advanced the video analysis is, and whether it differs from consumer to enterprise, where we might have to use enterprise. Say I upload a video and then ask it whether something is running in the frame, or what it's holding — can it tell me that?

Yeah — actually we'll do that today; all of us can. We'll try it in a follow-along approach.

Everybody can follow along with what I'm doing. We have these capabilities in the enterprise version, which means it's more secure and has all of the responsible AI controls. But to your point, to try it out on the consumer side, we have AI Studio — aistudio.google.com — where everybody can log in.

I actually took a video from your website, but you can analyze another video too. To go back to your first question, on accuracy: one thing I've found from working with customers is that specifically for multimodal capabilities — video and images — the Gemini models seem to perform much better than some of the competitors out there. One use case: we're working with certain law enforcement agencies where you have really long videos, say four hours of footage, and you want to find the exact point when an incident happens.

You can ask the model the question, and it will reason not just on the audio being spoken but on what's actually happening in the video, and tell you: at 11 minutes and 46 seconds, here's the incident.

So that level of reasoning capability is available. Also, just to add on to that point: part of the question was that, with the enterprise version of really any of the publicly available services, there are additional features and fewer restraints — so I believe the context window is lower on the consumer side? Yes, the context window is going to be lower — think the difference between handling a five-minute video and handling hours of video.

So there are a lot of those kinds of limits on the consumer side versus the Enterprise side — differences in capability and scale, like how large a video you can upload. That's one example, but there are also Enterprise features, like security, that aren't part of the Studio at all.

Right, exactly. Primarily the difference is the security aspect — data is stored and governed in a better way, with all the responsible AI filters — and then the capabilities.

On the consumer side I think we have a pretty generous token limit, but in the Enterprise version the token limit isn't really going to hold you back. Yeah, I think there was a question — I'll come to you next.

AUDIENCE MEMBER: This is slightly unrelated, but what are the advantages of using Google's SQL services like BigQuery or Firestore versus just getting a virtual machine from Compute Engine and running Postgres on it?

Yeah, that's a longer conversation, but I'll give you a quick rundown. BigQuery and Cloud SQL versus running Postgres yourself: the primary difference is that the first is a managed service. When we say managed service, that means you don't really have to worry about updating it, and the same goes for scalability. If you're running a Postgres server yourself, you have to make sure it can handle the load; if it needs to scale, you have to add instances, handle replication, and also plan for disaster recovery.

All of those components are built in and abstracted away from you because it's a managed service — you just consume the data warehousing capability.

AUDIENCE MEMBER: That's it? Is it a lot faster as well?

I'm sorry, I didn't get that.

Is it a lot faster as well? Yeah — BigQuery is one of our products that really performs very well, BigQuery and GKE. So from a latency standpoint it's very good. Go ahead.

AUDIENCE MEMBER: My first question is, what is the main difference between Gemini Pro and Ultra? And the second question: the problem I have is that when I want a large amount of data processed, I have to upload it each time. I was wondering, with Gemini, whether there's any way to point it at my cloud storage so I can avoid having to upload anything in the process?

Great questions. On the first one, the difference between Pro and Ultra is the number of parameters they're trained on.

It's the model size, if you will. I think Pro is around 60 billion — I don't know the exact numbers — but Ultra is a lot bigger in terms of the parameters it's tuned on.

That said, what we're finding now — at least if you're following the latest research — is that after a certain point, adding parameters isn't really going to give us better answers or better output. So the approach the industry as a whole is taking is, instead of adding parameters, to take smaller models and use what we call agentic workflows. That means you let the model reason. For example, I can ask the model: hey, what's the weather in New York City? The model itself wouldn't know that, because it isn't trained on real-time data, but it can reason that the user is asking about the weather and that it needs to call a real-time weather API to get that information.

I would have to call a real-time API, weather API, and get that information from it. So that's the approach that the industry is going towards. Yeah, so if I had to pick today, I think the pros are most popular model, yeah.

And on your second question — yes, we have that capability, and it's one of the demos I'll show. You can have a Cloud Storage bucket — and it doesn't have to be a Cloud Storage bucket; say you have BigQuery, or another third-party API — you just point it to that, and then you have an interface where you can talk to that data. You don't have to upload it every single time.

It automatically understands it. Sorry, I think there was a question? Okay. Sorry.

AUDIENCE MEMBER: I want to ask about hybrid setups — say you have local storage on-premises, whether for security or other reasons; there are pros and cons to it. Performance-wise, you just spoke about how you can load data via an API. Do you have any experience with that? How does it work when people keep storage local and do the compute in the cloud?

Yeah, that's an interesting question. We do have certain customers who want to take that approach. The one thing I would say is, if it's for security reasons — the data can't leave the premises — even though the data stays on-prem, when we build the AI on our side we take that data and build an index: we build vector embeddings out of it.

So some level of information about the data is going to live in Google Cloud, and that's a conversation I'm having right now with our security and compliance teams: even though it's not the raw data, the vector embeddings are in the cloud — is that okay from a compliance standpoint? So that's one consideration.

The second aspect: yes, from a latency standpoint it is going to be a concern. But the most common approach — the easy way — is if the on-prem storage can expose an API we can call; that would give you the best of both worlds. All right.

Any other questions? Cool.

AUDIENCE MEMBER: Can you use Vertex to chain models?

You can, in the sense that Russ showed a notebook earlier.

Within that notebook, you can use any of the orchestration frameworks — LangChain, LlamaIndex, or OneTwo; we have something called OneTwo that's similar to LangChain — and chain together any models. You can host all of those models within GCP and then chain them in the notebook itself.
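One hedged way to chain two models in a notebook, assuming the langchain-google-vertexai integration; the prompts, topic, and model names are illustrative:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_vertexai import ChatVertexAI

draft_model = ChatVertexAI(model_name="gemini-1.5-flash")
review_model = ChatVertexAI(model_name="gemini-1.5-pro")

# First model drafts, second model revises -- a simple two-step chain.
draft = (
    ChatPromptTemplate.from_template("Write a short abstract about {topic}.")
    | draft_model
    | StrOutputParser()
)
review = (
    ChatPromptTemplate.from_template("Tighten this abstract to three sentences:\n{draft}")
    | review_model
    | StrOutputParser()
)

chain = {"draft": draft} | review
print(chain.invoke({"topic": "glacier melt and sea-level rise"}))
```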

AUDIENCE MEMBER: So basically in Vertex, when you deploy a model, it's exposed as something like an endpoint?

Exactly. Yep.

AUDIENCE MEMBER: Does Vertex also calculate bias, or evaluate the responses to the questions that are asked? Or is Vertex just the model that you're using?

So as a platform, we have certain configurations.

I'll show you as we speak — these are some of the safety settings: hate speech, dangerous content, and so on.

You can block some, or block most. Again, this applies to the input.

Let me see if I have it — I might not have it in the UI, but to answer your question: we give a confidence score for every prompt that comes in, as well as for the output. We'll say, this message scores 0.5 on hate speech, and you have the option of setting a threshold — say 0.6 — that decides whether the response gets sent to the customer, and if it doesn't pass, we just display a default message saying this isn't appropriate content.
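A hedged sketch of setting those thresholds through the vertexai SDK — the categories, thresholds, and model name here are examples, not recommended values:

```python
from vertexai.generative_models import (
    GenerativeModel, HarmBlockThreshold, HarmCategory, SafetySetting,
)

safety_settings = [
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,   # block most
    ),
    SafetySetting(
        category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=HarmBlockThreshold.BLOCK_ONLY_HIGH,        # block only the worst
    ),
]

model = GenerativeModel("gemini-1.5-pro", safety_settings=safety_settings)
response = model.generate_content("Tell me about campus safety resources.")

# Every response also carries per-category safety ratings you can inspect and act on.
print(response.candidates[0].safety_ratings)
```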

So you can control all of that. Cool. All right — the next thing I wanted to show, I think we touched on this already.

It just emphasizes that when we talk about openness, we don't just mean open-source models — we have agents, framework and language integrations, and openness down to the compute itself, with both GPUs and TPUs. Yeah.

Instead of showing slides, let me walk through this. Are you able to see it? This is Model Garden, within Vertex AI, where you can pick and choose models.

You can see some of the foundation models we have — Gemini 1.5 Pro, Vision, and also Claude, et cetera. And if you look here, we have the option of deploying models directly from Hugging Face; that's an option we announced recently.

And you can pick and choose: for example, say I want a vision model that does segmentation — you can see DeepLab V3 is one of the models available.

So that's Model Garden. Now let me show you — this is a sample chat, and you can follow along if you'd like.

What I'm showing right now is Vertex AI, the enterprise version. There's also a consumer version that doesn't require Google Cloud, where you can play around with prompts.

The reason Vertex AI Studio (or AI Studio) gets used is that when you're creating AI applications, you usually need some trial and error to get the right prompt and make sure the right tools and settings are used. So you use this as a playground, and once you have something that works well, you can get the code directly from here, in whatever language you need.

Say you have a Node.js or Python application — you can take this and deploy it to Cloud Run as is, without any customization.

You get a ready-made bot with all of your custom prompts and configurations. So if you want to follow along, go to aistudio.google.com — you can sign in with your Gmail account. — For some of the models, yes.

For example, Gemma is our open model: you can fine-tune it within Vertex AI and then download the model back out. But Gemini is a proprietary model.

So you can fine-tune it, but you wouldn't be able to download it to your local machine. Absolutely. Absolutely.

You can do that.


That's fine. So I'm in the Studio here, and you have different models to choose from — you can choose the experimental version, for instance. I like Flash.

Our Flash model is the latest multimodal model, which can take, I think, up to 10 million input tokens, while at the same time the latency is significantly lower than anybody else on the market. Now, this here is what we call temperature. How many of you are familiar with the temperature setting? It's just a fancy way of saying how much the model is allowed to make things up — most large language models hallucinate.

Hallucination means that if you ask it a question it doesn't know the answer to, it's going to answer confidently anyway. It's like my six-year-old: even if you don't know the answer, you'll answer it confidently.

So you can set the temperature: the lower it is, the more deterministic the output; the higher it is, the more creative. You can also set a maximum output token count. We'll talk about grounding in the next piece.
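The same knobs are available from code; a small sketch with the vertexai SDK, where the prompt and values are just examples:

```python
from vertexai.generative_models import GenerationConfig, GenerativeModel

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Summarize what a runtime template is in two sentences.",
    generation_config=GenerationConfig(
        temperature=0.2,        # low temperature: more deterministic, less made up
        max_output_tokens=256,  # cap on the length of the answer
    ),
)
print(response.text)
```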

And these are the safety settings you can pick and choose — how strictly you want to configure them. You only see four categories in the UI, but when you call it through the API, from notebooks, you have a lot more configuration available and you can set specific thresholds as well.

Right. So now — let's do a YouTube URL, something of yours that we can analyze. Let's do this.

Actually, something that I own. All right, so let me do this.

Since the video is too large, let's do this in AI Studio. And again, you'll be able to do this yourselves.

I can ask it questions now. I can say: hey, summarize the video for me in bullet points. Let's see — it's still extracting.

What this is doing is exercising the multimodal capabilities. Let me run this — hopefully it works. It's going to take a bit longer, because it's a video being analyzed and we're doing it on the fly. There was a question earlier about doing some of the analysis ahead of time — you can pre-process it beforehand as well.

There we go — it describes the video: it's a Columbia University tour, the guide is a sophomore, et cetera. It actually got the meat of it. Now, what you can do is customize this however you want.

You can fine-tune it, you can make the output follow a certain format. And what's new is that you can also do code execution: you can say, extract the fields from my first prompt and then call another API or function to retrieve the values.

So that's something that can be done. And once it's working, I can click to get the code, or open it in Colab directly and then deploy it in my enterprise environment. Cool.
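For reference, the equivalent call from a notebook might look roughly like this, assuming the vertexai SDK and a video already sitting in a Cloud Storage bucket (the bucket path is a placeholder):

```python
from vertexai.generative_models import GenerativeModel, Part

model = GenerativeModel("gemini-1.5-flash")
video = Part.from_uri("gs://your-bucket/campus-tour.mp4", mime_type="video/mp4")

# Mix the video and a text instruction in a single multimodal prompt.
response = model.generate_content([video, "Summarize the video for me in bullet points."])
print(response.text)
```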

Now let's say I ask this question: who is the president of the USA? Oh — it's trying to see if it's in the video.

So it got the president. Now, I know you might be thinking: this is a basic question you could just type into Google Search — why ask the AI chatbot? The reason I ask is that in the next section we're going to talk about how you can ground these models with specific data. Let me go back to my slides.

We talked about this: how large language models can make things up when they're not 100% sure.

We call that hallucination. Why does it happen? A couple of reasons. One is that the training data comes from all over the internet. For example, if I ask a Columbia University question, I want the answer to come from an authoritative source.

I don't want the answer to come from a random Twitter thread or Reddit post, because chances are it won't be accurate — and that's part of what we mean by hallucination. How many of you have actually run into an LLM hallucinating on you? All right, cool. So how do we change this? Even when I asked who the president of the USA is, it said Joe Biden. But if I had an AI agent or chatbot for Columbia University, I would want it to answer only questions that can be answered from that particular dataset.

For anything outside the dataset, it should say: I don't have access to that information; I'm not allowed to answer that. That's something we see with most of the customers we work with — they want that kind of grounding capability.

That's where Vertex AI Search comes in. With Vertex AI Search, you define a corpus of information — the dataset — and the back end takes care of the rest. How many of you are familiar with Retrieval-Augmented Generation, the RAG approach? Cool. That's the approach we take: what happens in the back end is retrieval-augmented generation.

So what does retrieval-augmented generation mean? Right now, without anything extra, if I ask the chatbot a question it looks through what it already knows and returns an answer. With retrieval augmentation, the first thing that happens when I ask a question is a semantic search. Say I have 50 documents.

The search finds which particular paragraphs of which documents are most relevant to the question the user asked — it might be five different paragraphs. And again, it's not doing a keyword search.

It's doing a semantic search: it looks at the meaning of your question and finds the relevant chunks of information. And up to this point, the LLM isn't involved.

It's just doing a search — and as Google Search shows, we've been doing that for a long time, so the search piece works really well.

It extracts that information — those five chunks — and sends it to the large language model. Then we tell the model: you only have this information; go ahead and answer from it.

So that's retrieval-augmented generation in a nutshell. To make this happen, here's the list of things that need to happen — the first half of it: you need to process the data.

You need to embed it, index it, retrieve, rank the results, and then generate. This is great, and we have seen customers adopt the DIY method for RAG.
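To make the DIY half concrete, here's a minimal retrieval sketch, assuming the Vertex AI text-embedding model and a tiny in-memory corpus; a real system would add chunking, a vector database, and the generation step:

```python
import numpy as np
from vertexai.language_models import TextEmbeddingModel

embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")

# Illustrative document chunks (made up for the example).
chunks = [
    "The core curriculum includes Literature Humanities and Contemporary Civilization.",
    "The library stays open 24 hours during finals week.",
    "Graduate housing applications open in early March.",
]
chunk_vecs = np.array([e.values for e in embedder.get_embeddings(chunks)])

query = "When can I apply for graduate housing?"
q_vec = np.array(embedder.get_embeddings([query])[0].values)

# Semantic search: cosine similarity between the query and every chunk.
scores = chunk_vecs @ q_vec / (
    np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec)
)
best_chunk = chunks[int(scores.argmax())]

# The best chunk(s) would then be passed to the LLM as context to generate the answer.
print(best_chunk)
```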

But what we're focusing on today is the second piece: Vertex AI Search, which abstracts away all of the complexity of building a retrieval-augmented generation system like that. It takes care of grounding; it takes care of indexing.

All you have to do as a user is point it to the data source, and the rest is taken care of. And when I say it's simple, it's actually really simple — I'll show you in the UI.

Yeah, so this is something we talked about, out-of-the-box experience. You know what? I'm not going to go through the slides. I'll show you directly on the portal.

Let's build something. What I have here is a couple of PDF documents I took from your website, and I've uploaded them — let me show you — into a Cloud Storage (GCS) bucket, which is essentially the equivalent of an S3 bucket. All I've done is take a couple of PDF documents and put them here.

This is unstructured data — it's not in a SQL kind of format. The next thing I do is go to Agent Builder; this is where I'll create the app itself. I create an app.

The first option is choosing the type of app: I can have it in the format of a search experience, a chatbot, or an agent. If we have time, we'll get to the agent piece, but for now I'm going to go with search and name it Columbia.

So here you see I get to choose an existing data store or create one. For now, I'm going to create a data store from scratch, right? When I say create a data store, this is the list of options that are available to me. So we have website.

So I can directly point it to a website and say index everything that's within this website. What that would do is that once the chatbot is built, it's only going to return results from within that website, and if you ask it for any information outside of that, it's going to say, I don't know the answer, right? All of that grounding happens there. We also have BigQuery.

I know there was a question around structured data. So you can point it to a BigQuery table and then say, hey, sorry, go ahead. How about Google Scholar? Could we point it at that? So, yeah, you can do that.

But the only thing is, with respect to websites, we want to make sure that you own the domain, right? Like if I want to index Columbia's website, I'm not allowed to; we have to prove ownership of the domain before we can do that. With respect to Google Scholar, I don't know if we have the ability to point it directly at Google Scholar, but you could set up a process where, any time a new page gets published, you extract that information and put it into the index.

Does that make sense? Any other questions so far? So, yeah, you can point it to BigQuery. You can also use Cloud Storage, where you can upload any kind of documents.

It can be your Excel, your PDF, Docx, Doc, PPT. All of that can be uploaded here. You see this.

Now what I'm going to do is I'm going to point it to my source, which is in this case Columbia Anand. Continue, give it a name, and that's it. That's essentially what's needed.

I'll select this one and then say create. So it literally took me four to five minutes to build this enterprise-ready, grounded AI agent. Okay, the only thing is, after we build it, we give it some time to index everything in the backend, depending on the data size.

For example, certain websites, we have seen it takes four hours, five hours. For certain PDF documents, it's a couple of minutes, but yeah. Yeah, this is great.

I actually did this at my past job with a different cloud service, and what we were facing a lot was that oftentimes the data itself was structured in such a way that you had to build and switch the backend embedding depending on the data type or the specific data we wanted to embed for that company. So is there a way with Vertex AI Search to customize that, or is this part all out of the box? So that's a good question. We do have certain configurations that you can set from the console itself, right? For example, how I want the answer, with follow-ups, which models, how many results I want returned. I can also customize the summary in the sense that I can give it a persona in the prompt.

Those things I can do right from the UI, but from a chunking standpoint, how the data gets split up, those are controlled through the APIs themselves. Yeah. So out of the box, you're just pointing it at the data.

Correct. And it creates the index and the embeddings in that default form. Exactly.

So if you wanted to customize all that, you'd actually have to go through the API steps. Yep. So let me show you one more thing.

That's a good question. For documents, we have some level of flexibility. So let me, I think it's this one.

So, okay, I think this one might not have it, but yeah, to answer your question: some of the options are available in the UI, and the rest you can configure through the API. Yeah. Because I'm assuming it probably performs very well with just a normal PDF.

Right. But the second you introduce tables or other types of data, it probably struggles; it doesn't know what's referring to what. Yeah, that's a great observation.

So that's something that we have also found: with respect to tables and graphs, it wasn't as accurate. That's what I was trying to find; maybe this particular project of mine doesn't have the configuration, but right now we also make that available through the UI.

So what that means is, if I upload docs with tables, I have an option for document chunking and parsing where I can set the limit on, you know, the paragraph size. Exactly, yeah. I can do that.
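To make the chunk-size idea concrete, here is a generic sketch of what a paragraph-based chunker with a size cap does. This is only an illustration of the concept, not the actual Vertex AI Search parser, and the file name is made up.

```python
# Illustrative chunker: group paragraphs into chunks capped at a rough
# word budget, the same knob the chunk-size setting in the UI controls.
def chunk_document(text: str, max_words: int = 200) -> list[str]:
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        words = len(para.split())
        if count + words > max_words and current:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks

doc = open("research-overview.txt").read()   # hypothetical file
for i, chunk in enumerate(chunk_document(doc)):
    print(i, len(chunk.split()), "words")
```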

Right. Exactly, yeah. So I think this one is done, the one that I created before.

So these are the options. So this is the two documents. I'm just trying to see what's the best question that I can add.

So while I'm looking through this document for questions to ask, any questions that you guys have so far? All right, so let's see. You can see that, I mean, I just asked a random question; I haven't really read the document I uploaded in detail.

But if you do ask a question, it's going to give you this answer. Now, what's key about this answer, the summary, is that it's being pulled from different documents. It takes the meaning of your query.

It goes back, does the semantic search, finds the closest vector match, gets the relevant documents, and then summarizes them using the large language model. Now, if I click on it, I can see that this particular segment of the paragraph was actually derived from this document. I can click on the document and it's going to take me to that, right? So from a grounding standpoint, you can see that it performs very well.
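Once the data store has finished indexing, you can issue the same kind of query programmatically. Below is a rough sketch against the Discovery Engine API that backs Vertex AI Search; the project and engine IDs are placeholders, and exact field names can vary a bit across SDK versions.

```python
# Query a Vertex AI Search app and request a grounded summary with
# citations. Project / engine IDs are hypothetical.
from google.cloud import discoveryengine_v1 as discoveryengine

project_id = "my-project"           # hypothetical
engine_id = "columbia-search-app"   # hypothetical app / engine ID

serving_config = (
    f"projects/{project_id}/locations/global/collections/default_collection/"
    f"engines/{engine_id}/servingConfigs/default_config"
)

client = discoveryengine.SearchServiceClient()
request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query="What does the uploaded report say about Sputnik?",
    page_size=5,
    content_search_spec=discoveryengine.SearchRequest.ContentSearchSpec(
        summary_spec=discoveryengine.SearchRequest.ContentSearchSpec.SummarySpec(
            summary_result_count=5,
            include_citations=True,   # lets you trace each claim to a source doc
        )
    ),
)

response = client.search(request)
print(response.summary.summary_text)     # the grounded summary
for result in response:                  # the matched source documents
    print(result.document.name)
```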

Let me ask a question. Who is, hopefully this is the right answer. Sometimes it bombs on me.

So yeah, it says we don't have a summary because, again, that information isn't in the data it's grounded on. Even though it's a very basic question that you would expect most large language models to know, in this case what we have built is a retrieval augmented system, which is grounded on your enterprise data.

Cool. Yes, go ahead. Have you found that it does better or worse with different prompting techniques? So, yes.

Certain things are very tricky. I'll give you an example from a customer we're working with. They have their small business site, which is basically indexed, right? And the questions would be about restaurants. Now, the customer would ask a question about a restaurant and then follow it up saying, how do I open the restaurant? Also, write me a poem about, you know, Star Trek, which is not really relevant. We call that prompt injection, right? It seems like a valid question, but then it's followed up with something else which is not valid.

So that's where intelligent prompting techniques come in, to ensure that, hey, the complete question is relevant to what you're answering. There are many examples of prompt injection that we're seeing, but yeah, prompt engineering definitely helps. And one of the cool things we have seen is that we've also used Gemini itself to basically come up with the prompt for us.

We can say, hey, here's what I'm trying to do, here's where the answer gets leaked and it goes out to the internet, how do I make sure that doesn't happen? And Gemini has been really good at giving us the right prompt so that we can go back and edit it.
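As a rough illustration of that kind of prompt hardening (not the customer's actual prompt), a system instruction like the one below tells the model to refuse the off-topic part of a mixed request; the project, model name, and wording are all assumptions made for the sketch.

```python
# Sketch of a guardrail system instruction against prompt injection.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # hypothetical project

guardrail = (
    "You answer questions about opening and running a small restaurant, "
    "using only the provided business documents. If any part of the user's "
    "request is off-topic (poems, general trivia, etc.), decline that part "
    "and answer only the on-topic portion."
)

model = GenerativeModel("gemini-1.5-flash", system_instruction=[guardrail])
resp = model.generate_content(
    "How do I open a restaurant? Also write me a poem about Star Trek."
)
print(resp.text)  # should answer the first part and decline the second
```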

Something about Sputnik. Yeah, so essentially that's the demo that I had to show for you guys. Anything else that you guys wanted to see or any open questions? Go ahead.

I'm still a little bit curious about the database services that you offer. Can you tell me more about Cloud SQL versus BigQuery versus Firestore? Cloud SQL and BigQuery I can tell you about; Firestore I don't know that much. Both Cloud SQL and BigQuery are for structured data, right? But where you would use Cloud SQL more is when you're doing a lot of transactions against the data itself and it's a live database; that's where we would prefer Cloud SQL.

BigQuery we have seen used more in a data warehousing kind of setup, where you want to run reports and analytics and you want it to be the central warehousing solution that pulls feeds in from different environments. So if you have a web application that needs to constantly refresh based on new data coming into a database, then you would probably choose Firestore? Right, exactly.

So what about Firestore? I'm not really sure. Well, Firestore is mostly made for lightweight applications. Whereas Cloud SQL is more of an enterprise offering in terms of features and the volume of transactions and queries it can handle, Firebase in general, and everything associated with it, is used very frequently for lightweight applications, prototyping, things along those lines.

So generally not something you'd build production development on? I'm sorry? Generally not something you'd build production development on? Not if you're planning on whatever application you're building scaling and being used more and more.

But if it's something simple, then sure. Cool. So I have some more slides about how semantic search works.

Or I can show you a demo of how we can build agents. Which one would you guys want? Research. There's a Google Scholar component.

The grant tool. The grant tool has a Google Scholar connection. It has a... Cool.

Cool. It's a good showcase of an application built on top. But first, let me pull it up.

Sorry? It's generated. It's the first one. Oh, where is it? Oh, there it is.

Is this probably it? What is it? It should be there. Yeah, it's just loading. Yeah.

It's just kind of an example. So.

Go ahead. Yeah, so. Essentially, I think the data source for this is NIH.

Yeah, it's using. SEM-NIH. Sure.

So this is a platform, an application that's built on top of a lot of the services we've been talking about. Its main purpose is to help researchers summarize a lot of the papers out there when they're conducting research. Because obviously, the way to do this historically was that every individual researcher would read stacks and stacks of papers to really understand how to write their own paper and shape their own research. And so what you can do is, here's an example where it's pointed at the United Nations, all of their papers, and I say, this is what my research is about, find everything that's related to it. And then you can see it has a confidence score associated with each result, so it's actually doing, in the back end, well, it's an index, and I thought this was cool because it shows a group of scholars.

Yeah, essentially, so what it's doing is, I can, sorry, go ahead. Is it PubMed? Yep, it's PubMed. So what it's doing is I can give it a particular summary of what I'm looking for within the grants, and it's going to search through every available grant.

Now, like Russ said, it's going to give me a confidence score of how much of what I need matches what's currently on the grant itself, and then I can add in my scholar ID, my Google Scholar ID in this case, and once I do that, it's going to customize everything based on this particular researcher's interests and background. It's going to say, hey, based on your interests, what's available, and what you want to do, here's the list of things that we see as the best match. Again, what this is, is an example of the art of the possible with some of the Lego building blocks that we talked about, right? So in the back end, it's using BigQuery, it's using Vertex AI Search, et cetera.

So it gives me the summary of the grant itself, and then I can also use it to generate the application. I know there was a question on prompts. Here it says, hey, you're a university researcher, you help to write proposals, right? So that's the prompt that's given.

And if I click search, now you see that it's automatically filling up my response for me. So this is a Gen AI response for the grant proposal. I can just change it with my name and institution.

I think that's a good point. So what is being displayed right now is not real-time data, right? It's data that was fed in at a particular point in time. Now, you could change this; that's what I was talking about earlier, where we have this agentic framework and we can say, any time Volcker publishes a new article, take that into consideration and cite those sources.

But you have to choose a data source, basically, right? Right, correct. And make sure that you're limiting it to what you want. Yep, exactly.

Okay, there's still a lot of questions. Right. I think this is an application prototype demo, something that we've seen universities, research institutions use as a tool to enable researchers to be more productive, apply for more grants, get more money, versus the way that it's done really currently and historically.

So this isn't something, this isn't a Google product. This is something that can be customized and built based on the needs of the university, the research community, whatever. It's just sort of demoing the architecture.

Yeah, I think this one pulls the researcher's information. In general, yes, you could connect this to Google Scholar; that's what this is doing currently in terms of connecting this particular individual to Google Scholar and everything associated with them.

But it's customizable. So it's really, the main point is that you can reach out to various data sources, whether it's PubMed, Google Scholar, other things, control what it's reaching out to and define how it's going to use that to present the researcher with options. Thank you.

Sir, when you're talking about retrieval augmented generation, so hypothetically we have access to Web of Science. So I have my list of papers in a specific area that I'm interested in, and I download their abstracts and the list of papers from Web of Science in an Excel file or something, and then set up something like retrieval augmented generation using that dataset, which I can then pull in to give me more insights into structuring this section of my paper? 100%.

Actually, I mean, I know there's no access here, but if you had access, we could try that right now. Yes, if it's a particular piece of content like an Excel file, where you have the summaries within the Excel or even the PDF itself, you can upload the documents that you downloaded, and then you can ask it questions based on that. You can say, hey, what are the key characteristics of this? We have seen this used across different settings, even from an AI tutor perspective, right? We've seen professors upload their entire course materials to this kind of retrieval augmented system, and then the students, not really replacing the professor, but essentially using it as always-available office hours.

If I have a question about that particular course, I can always go back to the bot, and it's going to cite the exact document. Yeah. So I think we have time.

One thing I wanted to show, I know I talked about this, right, the retrieval augmented system. Where we are at this point in time in the Gen AI space is that we treat this basically like a tool.

So the LLM in itself, the large language model, has foundational knowledge. But with just that foundational knowledge, it's not really able to do much. So what we're doing right now is we basically connect these tools to the large language model so that it can make a determination.

In your case, you might have the Web of Science data store attached, and I could also have real-time library availability of the books; all of these can be different tools that are attached. And when you ask the bot a question, it's going to know that, OK, this particular question relates to the library, and it's going to extract the information from the library data. So that's where we're seeing the reasoning capabilities of the LLM.

That's what we are seeing as most powerful right now. Instead of it knowing all the answers, it's basically able to get the answers from everyone. You can think of it more like us.

We don't know all the answers, but we have different people. We can collaborate and then get the right answers. So that's the analogy that I would say.

Yeah, with that, I just want to show you how we would build agents on the Google side. Let me create an agent. The approach that we have taken is that we want to make this as close to low-code or no-code as possible.

So here's my agent, right? I would give it a name, and I would give it the goal. This goal is me defining what the agent is supposed to do.

Your goal is to make sure you give the end customer the itinerary for a travel plan. I can define those goals. And inside this, I would give it instructions: I would say, here are the step-by-step instructions on how you would go about handling this.

And this is where I talked about the different tools that we have. So let me go to the tools option I can create. By default, we have a code interpreter tool.

If you guys have been following the news around this, you'd see that LLMs are really bad at math. I know there's this thing asking how many r's are in "strawberry." Most of the models couldn't answer it, which is why OpenAI actually released a version called Strawberry.

But there's a reason why it couldn't answer that question, right? It's good at reasoning and it's good at natural language tasks, but it's not really good at math or at executing a particular piece of code, which is why we now attach those things to the large language model. You can see that the code interpreter is automatically attached, but you can create more tools.

So one of them is an OpenAPI specification. You can define any function, which could be residing in your on-prem environment, in Azure, anywhere you host a particular API.

You'll be able to query the information through that API as long as you define the OpenAPI spec for it. The second one is the data store, which is similar to what we created just now, a search data store. And the third one is what we call function calling.

Essentially, we can extract the values. For example, if I ask it, hey, what time is line A going to arrive on the morning of September 27th? That's a bunch of unstructured text that I as a consumer give to the model. But the model is going to understand the information it needs, like line A, which is a parameter.

It extracts that relevant information and then makes the call to the API. So, yeah, you would basically augment the LLM with tools using this page.
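For anyone who wants to see what that parameter extraction looks like in code, here is a hedged sketch of function calling with the Vertex AI SDK. The get_arrival_time function, its schema, and the project and model names are made up for illustration.

```python
# Function-calling sketch: the model pulls "line A" and the date out of
# free text and proposes a call to our hypothetical transit API.
import vertexai
from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Tool

vertexai.init(project="my-project", location="us-central1")  # hypothetical project

get_arrival_time = FunctionDeclaration(
    name="get_arrival_time",  # hypothetical backend function
    description="Look up the next arrival time for a transit line on a given date.",
    parameters={
        "type": "object",
        "properties": {
            "line": {"type": "string", "description": "Transit line, e.g. 'A'"},
            "date": {"type": "string", "description": "Date in YYYY-MM-DD"},
        },
        "required": ["line", "date"],
    },
)

model = GenerativeModel(
    "gemini-1.5-flash",
    tools=[Tool(function_declarations=[get_arrival_time])],
)
resp = model.generate_content(
    "What time is line A going to arrive on the morning of September 27th?"
)

call = resp.candidates[0].content.parts[0].function_call
print(call.name, dict(call.args))  # e.g. get_arrival_time {'line': 'A', 'date': '...'}
# Your code would then call the real API with these arguments and send the
# result back to the model for a natural-language answer.
```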

And then you would say, hey, use this tool. I don't have any custom tools here, but let's say I give it the tool name; in this case, I have the code interpreter, and I would say, use this for any questions relating to Python examples.

So, yeah, those are examples of how you would give it natural language instructions and augment it with tools. So is the instruction for modifying the model, or is it for the user to know what the model does? This is for modifying the model. So that's, again, the differentiator.

What we have done is, there are tools out there, like CrewAI and even LangGraph; those are agent orchestration tools that are available open source. But you would have to know some level of coding and APIs to get this done.

The differentiator here is that you as a business user, anybody, can just define it in natural language and add the tools in there. Is this, would you say, a platform designed only for conversational models? No.

So we have different kinds of models, which is why I was also showing the Model Garden. We have not just conversational models. I know generative AI and conversational AI have sort of overtaken things, but we also have the traditional vision models, et cetera.

So this Model Garden has all of the available models here. Do you have components like named entity recognition? We have recognition models for video. So these are the models that we have: the tag recognizer, the product recognizer.

Again, this is just limited to what's available in the marketplace. But if you choose to deploy from Hugging Face, you can bring that in. Do you have the calendar? So you can just bring up the calendar piece.

Yeah. So what I've gone ahead and done is taken everybody who signed up through the Columbia University registration and basically put together a number of different labs that expose you to some of the concepts that…[Video ends early due to screenshare stopping.]

September 19, 2024

GCP led a workshop at Columbia for researchers on how to leverage the latest AI tools for research, including Google Colab Enterprise, Gemini multimodal, and Vertex AI.