Accessing “ACCESS”: A Free NSF-Funded HPC Resource
Transcript
Hello and welcome to Accessing ACCESS: a Free NSF-Funded HPC Resource, part of our Lunch and Learn speaker series. My name is Elizabeth Kwon and I'm an Embedded Research Computing Specialist on the Research Computing Services team. Our Research Computing Services, or RCS, team helps provide technology resources to support Columbia's researchers and faculty with their compute needs, one of those resources being Access.
We also work in parallel with our partner team, High Performance Computing, to offer shared HPC clusters and services to Columbia's researchers. The goal of today's Lunch and Learn is to raise awareness and enhance our understanding of available resources, enabling researchers and educators to effectively utilize them in advancing their research efforts. So I'm first going to talk about what Access is and the different use cases at Columbia University, the Access roadmap at Columbia, some general terminology to get you acquainted, allocation types and prerequisites, and how someone can get started on Access.
I'll go through a live demo of navigating Jetstream 2, which is one of the resources available on Access, then some additional resources and opportunities that may be of interest, and then questions. I will also say, before we begin the live demo, that as general multitasking I will take a minute to look at the Zoom chat to see if there are any questions coming up.
My team on RCS will also be monitoring the meeting chat, so if you have any questions, feel free to just paste them in there. So what is Access? Access is an acronym; it stands for Advanced Cyberinfrastructure Coordination Ecosystem: Services and Support, formerly known as XSEDE. It's a National Science Foundation program to help researchers and educators utilize the nationwide collection of supercomputing systems.
If we break down the acronym a little bit further: advanced cyberinfrastructure refers to computing that goes beyond your laptop; that could mean supercomputers such as our on-prem HPCs, for example. It's any collection of resources that support complex, large-scale computing. Coordination ecosystem: Access is a multitude of awards that are working together, and with other NSF-funded resources, to provide for end-user communities.
Services and support: the services they provide are things like account provisioning, user support, trainings, operations support, metrics, and reporting. So back in 2019, NSF, through the Office of Advanced Cyberinfrastructure, published a blueprint for national cyberinfrastructure coordination services for accelerating science and engineering in the 21st century. Their vision called for the broad availability and innovative use of an agile, integrated, robust, really well-thought-out, sustainable CI ecosystem that can drive new thinking and transform discoveries in all different areas of science and engineering research.
And from that came Access. So Access is a program that's funded by the U.S. government that helps researchers use powerful supercomputers, data storage systems, and advanced software to solve complex scientific problems. You can kind of think of it as like a library of supercomputers and tools that scientists and engineers across the country can check out to run their experiments or analyze big data.
So Access provides a wide range of resources and services: compute resources, which are high-performance computing clusters that are CPU or GPU based; storage resources, which are data storage systems for storing and managing large amounts of data; and specialized support, which covers additional tools and resources to help streamline research.
So now, almost any computing application that requires more than a desktop or laptop could qualify as needing an advanced computing system. Some examples include AI, machine learning, big data analysis, etc. Other ways to estimate when and why you should use advanced computing resources: if your tasks for research or coursework should take minutes but are taking hours or days to compute, if your laptop regularly freezes due to high computational loads, or if your laptop's CPU, memory, and storage are constantly being maxed out.
And Access also enables research across almost all fields of study, from natural sciences like chemistry, to formal sciences like math, to social sciences. And to reiterate, Access resources are available at no cost, but the application process can be a bit competitive depending on the compute resources needed. So there are three different types of allocations that Columbia researchers and educators can use, in order of ease of acquisition and the amount of resources available.
And yes, thank you, Jess, for pointing out something I also did want to state: some people might have questions like, if we have on-prem HPC resources at Columbia, why would someone be interested in Access? To that, I will say that I strongly recommend first checking out our on-prem HPC resources. Researchers and PIs who have grants or funds can utilize not only our HPC resources, but they also have direct access to our HPC support team. You'll have the ability to purchase either partial or full servers, and you'll have complete access for the five-year lifespan of the hardware. However, not everyone has funds or supporting grants.
So this is where Access comes in. It's available again at no cost to researchers. It removes the entry barrier to access compute resources.
And for more information, Jess, one of my teammates on this call, provided some links with information on how you can access these shared HPC clusters at Columbia University. So moving forward, looking at some of the use cases, Columbia researchers and affiliates are currently using Access in the following research fields: material simulations in geophysics and planetary sciences, cosmological simulations studying dark matter and dark energy, fundamental and applied studies of turbulent flow phenomena, housing market economics studying the impact of real estate agents on housing prices, and many more. In terms of general trends in supercomputing, we're seeing that applications of high-performance computing are expanding in all different fields.
What used to be utilized in only STEM-specific fields is now being used in things like weather simulations and forecasting, molecular dynamics, digital twins for visualizing real-world objects, for example, city mapping, and many more. I want to take a step back now to take a look at the roadmap of Access at Columbia University. Note there might be a few terms here that you're not quite familiar with, but I'm going to jump into this first and then I'll explain some of the terminology in the next slide.
So I'm going to start with XSEDE, the Extreme Science and Engineering Discovery Environment, which launched back in 2011. It was a national collaborative initiative that provided high-performance computing resources and support for research in science and engineering, and it existed before Access. Columbia first started using XSEDE back in 2017, and in 2022, XSEDE was retired and succeeded by Access.
So this type of program isn't something new at Columbia University; it's been here for quite some time. The core difference is that Access provides a more modernized and decentralized approach to providing compute resources than its predecessor. So in 2022, our allocation was renamed the Columbia Discover allocation under the new Access program, and in September 2023, our team submitted a proposal to request more Access credits, which was approved for a 750,000-credit supplement, for a total of 1.5 million Access credits.
And back in August 2024, we then renewed our Discover allocation here at Columbia for another year. So yeah, I just used a lot of terminology that might seem really confusing, so I'm going to walk through it now. An allocation refers to an Access project and the resource units that are provisioned to that project. To get resources on Access, you first need to get an Access project, or allocation.
Campus Champions are institutional representatives who are familiar with the Access program. They manage allocations and offer guidance in getting started with resource allocations. And for reference, I am a Campus Champion for Columbia University, so you can always reach out to me if you have any questions about Access.
Next, credits. These are the currency allocated in Access that allows you to use a certain amount of resources. Generally, an Access credit is about one hour of computing or one gigabyte of storage. Amounts can vary by resource.
So for example, a GPU hour delivers more computing, so that might cost you more credits. And with regard to navigating credits and how much they are worth, Access does have an exchange calculator on their web platform with more information on the exchange rates for different resource providers. Next, resources: this refers to the compute resources that are provided by an organization.
And then finally, resource provider. A resource provider is an organization that provides compute or storage resources. The majority of the computational resources provided via Access, including computing, data storage, software, etc., are operated under separate agreements between the NSF and the institution.
So Access is your gateway to gaining access to and getting support for these resources. On Access, there are four types of allocations you can request. They are the Explore, Discover, Accelerate, and Maximize.
These are the project types that one can request to get resources. So Columbia University has a shared Discover allocation on Access, which includes over 20 different resources across different resource providers. Users can join our Columbia Discover allocation to use the compute resources for things like small-scale testing and benchmarking.
And these resources have varying computing systems: different hardware, core counts, memory sizes, accelerators, data storage systems, software libraries, and more. So there's a lot to leverage. And anyone who's a researcher or educator at Columbia University can request to join our Columbia Discover allocation.
If you want to lead your own separate allocation, like your own separate Explore or Discover allocation, you must be a U.S.-based researcher or educator at the graduate student level or higher. So one thing I do want to note about Access is that these resources are for non-sensitive data use only. Research projects must be handling non-sensitive data in order to utilize Access resources, because this is an open science program, fundamentally for fostering open public research.
So if your research does involve sensitive data, I highly recommend taking a look at alternative resources that our RCS team can support, such as the Secure Data Enclave, the SDE, or the Columbia Data Platform, the CDP. And thank you, yes, we have those resources and links in the chat. So, I talked about the four different allocation types that represent four different project tiers.
So I'm going to break them down a little bit further. If someone does want to get their own allocation for their research, this is going to be a helpful guide to understanding each tier and the associated requirements. So if we take the Explore allocation type as an example, this will give you a maximum of 400,000 credits.
The project duration will generally be for the length of the supporting grant, if you have one, or 12 months. Explore allocations are reviewed all throughout the year, so you can put in a request at any time. And the general requirements are that you're just going to need a project title and an abstract, like four to five sentences, explaining what you want to do.
So Explore, this is best suited for newer community members looking to kind of dip their toes into Access waters. In terms of general requirements, Access reps and resource providers, they're going to go and review and see why you want to use Access, what are you trying to do with the resources and see if it's a good fit. If we go up the next level, we go to the Discover allocation, you can get a maximum of 1.5 million credits on here.
And again, it's the same amount for the project duration, requests are reviewed all throughout the year. And the next, you know, additional requirement is you're going to need a one page proposal, instead of just the project title and abstract. So there's a little bit more that's included.
And Discover is more catered toward moderate-scale research computing, or even a large classroom environment for teaching purposes. Going up to the next level, Accelerate, you can get a maximum of 3 million credits. Again, same project duration, and also reviewed all throughout the year.
However, this now requires a three-page proposal. And there is an Access panel merit review, meaning the request undergoes a review by the Access review board, due to the size of the allocation and the sheer amount of resources being provided. And then finally, we have the Maximize allocation.
So this doesn't have a specific number or limit of resource credits; it's entirely dependent on the scope and size of the research needs. The project duration has to be 12 months for this one; it's very specific. And these requests are reviewed every six months.
This is going to require a maximum 10-page proposal. And again, like with Accelerate, it goes through an Access panel merit review. So if you're interested in Access resources but haven't fully fleshed out your compute needs, you can try using the Columbia Discover allocation for, again, small-scale testing and benchmarking to get a better sense of what resources you'll need.
And you can start with an Explore allocation once you realize you need to get your own separate allocation, and as your needs increase, you can request and scale your project up to a Discover allocation with the respective documentation requirements. So the best thing to do is start small and then upgrade later, to save yourself the headache of having to provide more documentation than needed. And eligible individuals can reach out to our RCS department to request to join our Columbia Discover allocation; Columbia Campus Champions such as myself, and also Xenia Radova, senior manager of RCS, can readily approve and add users to the allocation based on individual compute needs.
So how does one learn more or join our Columbia Discover allocation? There's two ways, through our National High Performance Computing Access Service page, and also the ServiceNow catalog intake form. So our team has developed the Access Service page on our website, which will be kind of your one-stop shop for everyone to learn about Access, what resources are available, and other related resources. I highly recommend using the QR code that's displayed here to take a look at our page and see what Access offers.
So I'm just going to give a brief second in case anyone wants to whip out their phone and take that screenshot. And then for the second one, I want to give a quick shout out and thank you to Alan Raghunath and his team for helping us create the new Access service in ServiceNow and the catalog intake form, which researchers can fill out to submit a request to join our Columbia Discover allocation specifically. This streamlined our process at CUIT for approving new requests, avoiding all of the back and forth that we would normally do, and also saving researchers a ton of time.
So this can be found on the previous page using the make a request blue button here. So in addition to research use cases, instructors can request their own Explore or Discover allocation for classroom and educational purposes depending on how many resource units are needed for the class. So once you choose your correct project size and submit the request, students can then register and create their own Access ID to join.
So how does one create an Access project? To get started with your own Explore, Discover, or other project, you're first going to have to create and register your Access account. To create your Access ID, you're going to go to access-ci.org and create your account. Your Access account will help you review your projects, allocations, and more.
So that's going to be the platform to help understand everything in terms of what your project is and what resources you're going to get. Step two will be select your project type based on your compute needs. If you're not sure how your needs translate into resource units, then you can first get started with an Explore Access account.
So after trying out different resources, again, you can always upgrade later to a larger project once you understand how your work corresponds to resource usage. Step three, choosing resources. Your most difficult choice will probably be choosing the resource or resources where you're going to conduct your work.
So Access has a resource catalog that will help you find the resources that suit your needs. The catalog has basic descriptions of each resource and its recommended uses. And while you're filling out the form, you'll also be able to add co-PIs, allocation managers, and other users to your allocation project.
Step four, you're going to prepare and submit your allocation requests with respective documentation materials. To complete an Explore Access request, again, you're going to need that project title. You're going to need that short abstract.
If you have any supporting grants, you can always include that information as well, and then a CV or resume. For an Explore Access request, you'll generally hear about the outcome within seven to ten business days.
And if you work with us and you're interested in our Columbia Discover allocation, we already have this Discover allocation that we manage, so we can readily provide you access to it. And then finally, step five: once you've been approved and receive your Access credits, you can exchange them for time on the resources that you want.
As a general statistic, about 95% of Access allocation requests are approved. The majority of those in the remaining 5% are people who were either unfortunately ineligible or missing some documentation in the application process, in which case Access reps will reach out to clarify and give you a chance to resubmit or upload any documentation that was missed. For troubleshooting and additional questions, Access users can reach out to support representatives using the ticketing system available on the Access web platform.
It goes without saying that you will need an Access ID to access the system. Here you can ask as general or specific a question as needed. If you have a specific troubleshooting question about one of the resources, the ticketing system will let you communicate with one of the support reps from the respective resource providers.
We also have a Q&A bot, which is a new feature that can help quickly resolve most general questions about Access resources. And Access also has a knowledge base filled with resources that were contributed by other CI professionals: researchers, research computing facilitators, engineers, and many more. These Access community members regularly contribute to grow this knowledge base for others to use.
And on Access, I want to highlight that two of the most popular resource providers used across Access, and also specifically at Columbia University, are Indiana University's Jetstream 2 and the Pittsburgh Supercomputing Center's Bridges 2. So, more information on both of these resources. Indiana University's Jetstream 2 is a flexible, user-friendly cloud computing environment that's designed and housed within Indiana University. Jetstream 2 is one of the many resources that are part of the Access platform and is designed for researchers ranging from those with minimal high-performance computing experience to those interested in exploring cloud-native resources.
There are currently over 2,500 active users being supported on this resource, with an 8-petaflop cloud computing system and 17.2 petabytes of storage. It has over 400 compute nodes with AMD third-generation EPYC (Milan) CPUs, 128 cores and 512 gigabytes of RAM each, and 90 GPU nodes with four NVIDIA A100s each. So, why would someone want to use Jetstream 2? In addition to the fact that all these resources are, again, available at no cost through Access, it's a virtual cloud computing environment where users can create their own virtual machines with on-demand resources, meaning there are no queues or runtime limits.
Users also have full admin access to install software as needed for their research compute needs. These instances are easy to spin up and you can create a virtual machine whenever you need one so you're not waiting for some approval in order to use the resources. Jetstream instances also have full access to the internet and persistent IPs.
And again, just to note, Jetstream 2 is not primarily a storage device but a temporary location to stage your data for analysis. Bridges 2 is a high-performance computing resource at the Pittsburgh Supercomputing Center. It's designed to support a wide range of research needs, from traditional computational tasks to emerging fields like AI and machine learning.
There are three main types of compute nodes on Bridges 2. The regular memory nodes provide extremely powerful general-purpose computing for pre- and post-processing, AI inferencing, and more. The extreme memory nodes provide four terabytes of shared memory for statistics, graph analytics, genome sequence assembly, and more. And the GPU nodes provide exceptional performance and scalability for deep learning and accelerated computing.
Why would someone use the PSC Bridges 2? Bridges 2 is mainly optimized for machine learning, including support for deep learning frameworks like TensorFlow and PyTorch, as examples. It provides nodes with high memory capabilities, which are ideal for memory-intensive applications. You'll also have access to public data sets such as the 2019 novel coronavirus, AlphaFold, CosmaFold, and many others.
So, you know, anyone who has a Bridges 2 account will have access to these. All right. And now, you know, we'll be going into the live demo on JetStream 2. I am going to just take a minute here to not only get a sip of water but also take a quick look at the chat if there are any questions.
I don't think we have anything open in the chat. I will- some folks were asking me about slides and we will be providing those afterwards along with a recording. But yeah, if you have any live questions also feel free to raise them now before the demo.
Yes, thanks Jess. Yeah, if anyone has any questions at this moment feel free to unmute yourself. And if not, we will then go ahead into the demo.
Okay, I will take the silence as no questions at this moment. All right, so now before we begin, I'm going to give some additional information on Jetstream 2. Jetstream 2 is built on OpenStack cloud computing infrastructure, and users can interact with it through one of three graphical interfaces: Exosphere, Horizon, and Cacao.
So in this tutorial, we'll be focusing on Exosphere, which is the primary and most user-friendly interface. Horizon has additional, more advanced tools for people who are more experienced and have been using Jetstream 2 for a while. And then Cacao is the next level further, which is predominantly used for building clusters.
So to log into Jetstream 2, I'm just going to open up a new tab, and you'll go to jetstream2.exosphere.app. I put a typo in there, so let me go there again. Okay, so to log into Exosphere you'll go to the Jetstream 2 Exosphere web platform, and for first-time users logging in, you're going to see an empty screen here showing that you're not logged into any allocations yet.
So you'll need to log in by clicking the plus sign here to add your allocation and sign into your access account. So I'm already signed in so I wouldn't need to do this but this is where you would start. So going back, you know, you'll then see all the different possible allocations that you've been added to.
The allocation here, TRA120004, represents our Columbia Discover allocation, but if you request your own Access Explore or Discover allocation, for example, on Jetstream 2, it will populate right here. So once you click to open up your allocation, you're going to be presented with an overview page with several different cards showing your instances, volumes, and some additional information. On Exosphere, users can create their own VMs in various sizes and flavors depending on the level of compute they need, and it also comes with one terabyte of storage by default.
The default storage system in JetStream 2 are volumes which you can attach to an instance and there you are going to read and write your data from your programs and then finally when you're done, you detach the volumes. So let's start with first creating an instance. What you're going to do is you're going to go to this red create button up here, click on it, and go to instance.
So instances are the computers that exist virtually in the cloud in Exosphere, and then you're going to be prompted for which OS you want. Think of these as images: they are the install disks used to boot up your machine, and they have all of the software that's used to run your machine. So here I'm just going to click Ubuntu with the latest version, and then you're going to be prompted to provide some details, such as a name.
So I'm going to give the name access lunch and learn, and then you're going to be prompted for the flavor of your instance. The flavor refers to the size, in terms of how many compute resources you want on the instance. So keep in mind the service unit, or SU, cost is based on the number of CPUs.
So one SU represents one CPU per hour of compute. So I think I'm just going to go with the m3.small, and then if we scroll down further you can choose a root disk size. I'm just going to keep it at the 20 gigabytes for now, and then it's going to ask how many instances.
So right now I'm just going to leave it at one, but let's say that you're using Jetstream 2 for a classroom environment for educational purposes, and you have 20 students. What you could do here is go up and choose 20, for example, and this is going to boot up 20 instances rather than having you manually go through and create one instance at a time. So I'm going to choose one for now.
I'm going to enable the web desktop. So we want to utilize this feature which I'll go into momentarily in this demo and then I'm going to click create. So it's now creating the instance.
So right now, if you go to this instance card, you'll see access lunch and learn is building. If I click here on the access lunch and learn instance I just created, you're going to see some information, and you'll see the state that it's currently in. So right now, where my mouse is hovering, it's currently in the building stage, and it's going to go through a few phases here, where it's going to change from building, to running setup, to ready. And inside here, in the info card section, you'll see some general information: who created it, what OS, the flavor, and the burn rate.
So here, if I scroll down, you'll then see some additional information about interactions, credentials, and volumes. I'm going to step back for a second: as you look here in the instance section, you'll see that by default Exosphere filters to instances you've created. So if your instance ever errors out in this process when you're creating one, the best thing you can do is just build a new one.
Just delete and rebuild. It'll only take like a couple minutes. And again, so right now it's still building. So I click on that.
Yeah, and also another important thing to point out is the burn rate. So here, the m3.small I chose has a burn rate of two SUs per hour. That's just to keep in mind how many resources, or how many resource credits, you're using.
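To make that burn rate concrete, here is a small back-of-the-envelope sketch in Python; the only input is the two SUs per hour figure for the m3.small from this demo, and larger or GPU flavors simply scale these numbers up.

```python
# Rough SU consumption for an instance that is left running around the clock.
# The 2 SUs/hour burn rate is the m3.small from the demo; other flavors differ.
burn_rate_su_per_hour = 2

per_day = burn_rate_su_per_hour * 24
per_week = per_day * 7
per_month = per_day * 30

print(f"Per day:   {per_day} SUs")     # 48
print(f"Per week:  {per_week} SUs")    # 336
print(f"Per month: {per_month} SUs")   # 1440
```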
And in terms of credentials here, it's going to get a default hostname and a public IP address; again, it's open for various use case purposes. You'll see it shows a username and passphrase here, and the username is exouser by default. So if you ever want to get into your instance using SSH, this is the username you can use. And then, once this instance is done building, it's going to give you the passphrase information here.
And you can use that to SSH in. And then under interactions here, this is where you're going to actually interact with your instance. Exosphere has a lot of handy features, like the web shell and web desktop, for users who are not familiar with a command-line interface. So once it's finished building, this is where you're going to be able to select those features.
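For those who prefer the command line over the web interfaces, here is a minimal sketch of the SSH route just mentioned, using Python and Paramiko. The IP address and passphrase are placeholders that you would copy from the instance's credentials card; exouser is the default username noted above.

```python
# Minimal sketch: run a command on a Jetstream 2 instance over SSH.
# The hostname and password below are placeholders; copy the real public IP
# and passphrase from the instance's credentials card in Exosphere.
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # fine for a demo; pin host keys in real use
client.connect(
    hostname="198.51.100.10",                     # placeholder public IP
    username="exouser",                           # default username on Exosphere-built instances
    password="passphrase-from-credentials-card",  # placeholder passphrase
)

_, stdout, _ = client.exec_command("hostname && uptime")
print(stdout.read().decode())
client.close()
```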
So now, right now, it's currently building. And what I'm going to do in the meantime, is I actually pre built another instance. So we're going to use that one going forward.
And here, you can see that it has a green ready sign. So I'm going to click through here. Again, you'll see the default information up here, and you'll see some information about resource usage now that your instance is built.
And then again, all the other information down here. So now I'm going to go into that web desktop feature that I mentioned earlier. So I'm going to click on web desktop here in the interactions card.
And what you'll see is that it opens up a new tab with the full desktop. So if you're familiar with the Ubuntu default desktop, this might look a little bit familiar. And even if you're not familiar with Linux, or even Windows, it's fairly straightforward to navigate.
So I'm going to move over to these different icons, you know, for example, I can click on files, and this is going to open up your file browser. And you know, you can navigate to different folders here. And what you can do is you can also create folders.
So for example, I'm going to create just a new folder called test folder here. And let's see, you can open up other apps. So let's say I want to open up Firefox.
So again, this is open to the internet. So I'm opening up Firefox in the VM, and I want to navigate to docs.jetstream-cloud.org. This is actually going to direct me to Indiana University's Jetstream 2 documentation page, which is extremely helpful for getting a rundown of how to navigate the Exosphere, Horizon, and Cacao platforms.
So I'm going to close out of this. You'll also see by default, there's a few different apps that are here, like RStudio, MATLAB, they also have like Anaconda and a bunch of other software that's already pre installed. So you know, you can keep that in mind.
And the featured images on Jetstream 2 are kept up to date. So for security purposes, if you know, you need to create your own firewalls for any reason, you can, but by default, everything is open. So let's say you want to upload some files to Jetstream 2. What you can do, which is a neat trick is you can do Ctrl Alt Shift, if you're on a Windows machine, or Ctrl Command Shift, if you're on a Mac, which opens up a sidebar with additional features.
So Jetstream uses something called Apache Guacamole to display this virtual desktop. And some browsers might not support direct copy and paste, so you can use this tool to copy and paste.
So here, you can upload files to your instance from your local computer. By clicking here, I want to find my home folder, and then exouser, that's the username for this machine. And then here's that test folder that I created earlier.
And here, you can click on upload files. And then you can just upload whatever files that you need to. So I'm going to click cancel, go back, and I'm going to close out of this.
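As an alternative to the Guacamole upload sidebar, you could also push files to the instance over SFTP. Here is a hedged Paramiko sketch; the IP, passphrase, and file paths are placeholders, and the remote folder mirrors the test folder from the demo.

```python
# Hedged alternative to the Guacamole upload sidebar: copy a file over SFTP.
# The IP, passphrase, and file paths are placeholders.
import paramiko

transport = paramiko.Transport(("198.51.100.10", 22))   # placeholder public IP
transport.connect(username="exouser", password="passphrase-from-credentials-card")
sftp = paramiko.SFTPClient.from_transport(transport)

# Upload a local file into the folder created earlier in the demo.
sftp.put("results.csv", "/home/exouser/test folder/results.csv")

sftp.close()
transport.close()
```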
So that's how you can upload files. Now to go into volumes. So you can think of volumes as like a flash drive of sorts, you store data on it, and you can move it between different instances.
And you can have multiple volumes attached to an instance, but each volume can only attach to one instance at a time. So we have our instance, and we want to create a volume. How do we do that? So what you do is you scroll back up, go to that red Create button.
And we're going to click on volume here. And then give it a name. So I'm going to call it lunch, learn volume.
I already created it for this one. So let me just do test. Fine.
And I'm just going to give it a default 10 gigabytes and click Create. So this created the volume. And I'm going to now go back to my instance.
So I'm going to go back to my CUIT lunch and learn instance. And then you'll see there's a section here called volumes, and I'm going to click attach volume.
And then you can select whichever volume you have, and then click attach. And then I'm going to go back to my instance. And if we scroll down to that volumes card, you'll see that it's attached here.
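Everything in this demo is done through the Exosphere GUI, but since Jetstream 2 is built on OpenStack, the same volume workflow could in principle be scripted. Here is a hedged sketch using the OpenStack SDK's cloud layer, assuming you have set up application credentials and a clouds.yaml entry, named jetstream2 here purely for illustration.

```python
# Hedged sketch of the create/attach/detach volume workflow via the OpenStack SDK.
# Assumes application credentials are configured in clouds.yaml under a
# (hypothetical) cloud name "jetstream2"; names mirror the GUI demo.
import openstack

conn = openstack.connect(cloud="jetstream2")

# Create a 10 GB volume, mirroring the GUI step above.
volume = conn.create_volume(size=10, name="lunch-learn-volume", wait=True)

# Attach it to the instance created earlier.
server = conn.get_server("access lunch and learn")
conn.attach_volume(server, volume, wait=True)

# ... read and write data from your programs ...

# Detach once the analysis is done.
conn.detach_volume(server, volume, wait=True)
```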
So now let's say we also want to create some data in this volume. So I'm going to create a new folder using the web shell feature here and create a file inside the folder that we created. So this is going to open up a new tab when you click on web shell, wait for that to load up.
And I am going to create a folder called my-folder. I am then going to create a text file called hello, put the content "Hello world" in it, and make sure that it gets put into my-folder with the name hello.txt. Okay. So now that I've run that, if we go back into the web desktop, so here, I'm back here, and I go to files.
And if I go into this my-folder that was created, we'll see there's that hello text document that was created. If I open that up, there's Hello world. So everything worked out.
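For reference, the web shell steps just shown amount to roughly the following. This is a Python sketch of the same actions, since the exact shell commands were not captured in the transcript; the folder and file names follow the demo.

```python
# Recreate the demo's folder and file from a script on the instance.
# Assumes it is run as exouser, so Path.home() is /home/exouser.
from pathlib import Path

folder = Path.home() / "my-folder"
folder.mkdir(exist_ok=True)                          # create the folder
(folder / "hello.txt").write_text("Hello world\n")   # create the file with its content

print((folder / "hello.txt").read_text())            # prints: Hello world
```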
And now, if we want to transfer data again, we can use the methods that were mentioned previously, either from your local machine or from the internet, as alternatives. And let's say we've run our work, we've run our analysis. Once you're done analyzing all of the data that you need to, you can then go ahead and detach a volume from an instance.
So the first thing to do is make sure you close out of any web desktop or web shell sessions that you have open. Once you've done that, we're going to navigate back to the volumes page. Here, you see there's this little arrow icon that points to the volume attached to your instance, and we're going to click on detach.
And then it's going to give you a warning message, just to make sure, in case of any potential data loss, and I'm going to click detach. And then it's detached. And if I go back to my instance and back to the page here, there's no volume attached.
So that works. One final important note I did want to make is about instance management and SU consumption. For as long as an instance is running, even if you aren't actively using it, it's still consuming service units, those SUs.
And again, in reference to that burn rate up here, at two SUs per hour for this instance that I have: if your research isn't running a server or service that needs to be online 24/7, we do strongly recommend shelving your instances to avoid rapidly burning through SUs. Shelving automatically creates an image of your instance that can be used to create a new, identical instance at a later time, when you unshelve it. So, by shelving an instance, your SU consumption goes to zero. You can shelve an instance by going to the actions item here.
And then you'll see there's a button called shelve. So I'm going to click that, and then click Yes. And in the process of shelving it right here.
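To put rough numbers on why shelving matters, here is a small illustrative comparison; the two SUs per hour burn rate is the m3.small from this demo, and the eight-hours-a-weekday usage pattern is just an assumption.

```python
# Illustrative comparison: leaving an instance running versus shelving it
# outside of an assumed 8-hours-per-weekday working pattern.
burn_rate = 2          # SUs per hour (m3.small from the demo)
weeks = 4

always_on = burn_rate * 24 * 7 * weeks   # never shelved
shelved = burn_rate * 8 * 5 * weeks      # shelved nights and weekends

savings = always_on - shelved
print(f"Always on: {always_on} SUs over {weeks} weeks")                 # 1344
print(f"Shelved:   {shelved} SUs over {weeks} weeks")                   # 320
print(f"Savings:   {savings} SUs ({100 * savings / always_on:.0f}%)")   # 76%
```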
And then, we just want to ensure that we are fully optimizing and efficiently maximizing resources on our shared Discover allocation, which, again, is shared amongst Columbia researchers. But if you have your own Explore or Discover allocation, we want to make sure that you're also maximizing usage of the resources that are provisioned to you, so you don't run through them too quickly. So that is a quick sneak peek of what you can do on Jetstream 2.
So I will now go back to our page. And I'm going to take a quick look in case there are any questions. But I think Helene already sort of addressed Darko's question.
But I think that I know I have an insider knowledge that you'll probably get into this. But he was asking, if you have heavy GPU usage, and you're like tapping into the NVIDIA GPUs, would CUDA be enabled for AI and ML learning? Or is there a better option for that? Yeah, so good question. So there are different resource providers that do have, you know, users that have heavy GPU usage.
So it really depends on the different resource providers. Let me see if I can quickly go to the access-ci.org page to redirect you to where you need to go. So this is Access's general platform that has all the information that you need.
If you go to the resources tab, you can browse the different resources here. And then, you know, I want to filter by GPU specific resources. So I'm going to choose GPU compute.
So here, Darko, you're going to get flagged all of the different resources that have GPU capabilities. So I highly recommend, and I'm going to copy and paste this into the chat, taking a look to see if these resources meet your needs, and also what OSs are available.
So for example, for Jetstream 2, it has, you know, Ubuntu, you know, it's primarily a Linux environment. But you know, not all of these resources have the same OSs. Some take on additional OSs.
So you would have to kind of filter and go through here to find out what best suits your needs. Okay. And are there any other questions? And feel free to unmute as well if you have any questions.
Sorry, there was one about Jetstream asking, if you're using Jetstream 2 every day, should you shelve it each day and then reboot it each morning? I would recommend doing that, just because overnight it will keep burning through the SUs you have. Especially if you have your own Explore or Discover allocation, which has far fewer Access credits than an Accelerate or Maximize, you're just burning through potential credits, or core hours, you could have been using on the CPU or GPU resources. So it's better to shelve instances when you're not using them, save yourself some time and some resources, and then unshelve them when you're ready to use them again.
And yes, it saves electricity; one of the core philosophies or principles of Access is that we want to be sustainable, and that includes energy conservation. All right, so let me go back to our demo. And again, these are great questions.
And I'm going to go on. So I had a video demo just in case things didn't work out with our instances and everything, but everything worked out perfectly, which is great. But, since we're on the topic of Access, I also want to share a few other relevant resources that might be interesting and catch people's eye. NAIRR, which is the National Artificial Intelligence Research Resource, is another program led by NSF that aims to connect U.S. researchers and educators to the computational, data, and training resources that are needed to advance AI research, or research that employs AI.
So a NAIRR task force was established back in 2020 and launched in 2021. The NAIRR pilot aims to bridge the gap in accessibility of AI resources and tools for the broader research and education communities, to advance trustworthy AI while protecting privacy, civil rights, and civil liberties. And many of NAIRR's resources are part of the Access program, and Campus Champion representatives at CUIT can help provide guidance to researchers who are interested in navigating access to NAIRR resources.
There's also the NAIRR Secure program, co-led by the Department of Energy and the National Institutes of Health, NIH, as part of the NAIRR pilot. NAIRR Secure will enable research that involves sensitive data, which requires special handling and protections. And in terms of eligibility, this call is open to proposals by U.S.-based researchers and educators from U.S.-based institutions specifically.
So that's academic institutions, nonprofits, federal agencies, and startups and small businesses with federal grants. And another core requirement is that projects on NAIRR have to cover cross-cutting or domain-focused areas of AI: advancing AI methods that enable scientific discovery, or using large-scale models to explore complex data sets interactively.
So you might be wondering when someone should use Access and when they should use NAIRR resources. NAIRR is specifically for projects related to AI in research and education, and is recommended for research conducted for the duration of the pilot program. So this pilot program will go on until January 2026.
Access is recommended for all different types of projects and fields with the caveat that it is non-sensitive data. So if needed, you can renew. You can also extend your access allocations for the duration of, you know, whatever research needs you have.
If you have a research supporting grant, you can always extend it to parallel that, so it's recommended for more long-term use. NAIRR is also holding a workshop called AI Unlocked, Empowering Higher Education to Research and Discovery, that's going to be held April 2nd to 3rd in Denver, Colorado.
And the primary goal of this workshop is to connect US-based higher education affiliates with valuable information and resources and training to deepen their understanding of how to leverage AI in their current work while also equipping people with the skills necessary to advance their careers and achieve specific professional objectives. If interested, I highly recommend applying. And with, you know, advancing computing resources for research, I'd be remiss if I also didn't talk about Empire AI.
So Empire AI is a New York State initiative that was launched by Governor Kathy Hochul to advance research aimed at addressing major societal challenges for the public good. The program focuses on fostering collaboration between public and private institutions to drive innovation, advance economic development, and position New York as a leader in AI. It aims to support AI research, talent development, and the growth of AI-driven businesses.
So myself and my colleagues on the Research Computing Service HPC team, Max Short and Al Tucker, giving them a quick shout out, we're currently all working as points of contact for Columbia University in this statewide initiative. And, you know, researchers with Empire AI have access to 13 HGX nodes, eight H100s, which are shared between Columbia, NYU, Cornell, CUNY, SUNY, and RPI. But finally, now, if you have any additional research computing questions, or you want to learn more about access, I highly recommend reaching out and contacting us at rcs.columbia.edu. You can also check out our Research Computing Services page and a lot of the other resources and links that were shared in the meeting chat.
But other than that, thank you so much for attending and learning more about access. All right. And, you know, again, I know in the interest of time, we have 10 minutes left.
If anyone has any open-ended questions or things they want to learn more about, feel free to unmute yourself and ask here. And yes, the meeting recording is going to be posted on our RCS video library as well. And I see Alan has a question.
Yeah, for Empire AI, how does it compare to Access? And is it live enough that people can start using it? It seems like you don't have to go through the application process the way you do with Access. All right. So, Access versus Empire AI: for Access, it's kind of available whenever you need it.
It's reviewed all throughout the year. For Empire AI, the beginning pilot phase, using what's called their alpha system, has already started. So back in fall 2024, they had an open call for project proposals, which have already been reviewed and approved.
And since then, we have actually been working with the Empire AI technical team and have been soft onboarding different projects and PIs onto Empire AI. So it's currently in the works and it's being used. I would definitely love to talk more about it also, if I can do another lunch and learn, but it's, you know, specifically, again, for people that want to use and research AI related initiatives.
Sounds like something that would be interesting for Darko. Okay. Thank you.
Yeah. Any other questions? Well, if there aren't any other questions, again, thank you all for attending. I will give you all back eight minutes, and I will put my UNI at columbia.edu in the chat as well, in case anyone wants to directly ask any questions, but definitely reach out to us also at [email protected]. Okay. Thank you, everyone.
We also work in parallel with our partner team, High Performance Computing, to offer shared HPC clusters and services to Columbia's researchers. The goal of today's Lunch and Learn is to raise awareness and enhance our understanding of available resources, enabling researchers and educators to effectively utilize them in advancing their research efforts. So I'm first going to be explaining and talking about what is Access and what are the different use cases at Columbia University, Access Roadmap at Columbia University, what are some general terminology to help you kind of get acquainted with, allocation types and prerequisites, how can someone get started on Access.
I'll go through a live demo of Navigating Jetstream 2, which is one of the resources that are available on Access. Some additional resources and opportunities that may be of interest and then questions. And I also will say before we begin the live demo with just general multitasking, I will take a minute to also look at the Zoom chat to see if there are any questions coming up.
My team on RCS will also be kind of monitoring the meeting chat, so if you have any questions, feel free to just paste them in there. So what is Access? Access is an acronym, it stands for the Advanced Cyber Infrastructure Coordination System Services and Support, formerly known as XSEED. It's a science foundation to help researchers and educators utilize the nationwide collection of supercomputing systems.
If we kind of break down the acronym a little bit further, Advanced Cyber Infrastructure, this refers to computing that kind of goes beyond your laptop, that could mean supercomputers such as our on-prem HPCs, for example. It's any collection of resources that support complex, large-scale computing. Coordination ecosystem, Access is a multitude of awards that are working together and with other NSF-funded resources providing to end-user communities.
Services and support, services that they provide are things like account provisioning, user support, trainings, operating support, metrics, and reporting. So back in 2019, NSF through the Office of Advanced Cyber Infrastructure published a blueprint for National Cyber Infrastructure Coordination Services for accelerating science and engineering in the 21st century. Their vision called for the broad availability and innovative use of an agile, integrated, robust, like really well-thought-out, sustainable CI ecosystem that can drive new thinking and to transform discoveries in all different areas of sciences and engineering research.
And from that came Access. So Access is a program that's funded by the U.S. government that helps researchers use powerful supercomputers, data storage systems, and advanced software to solve complex scientific problems. You can kind of think of it as like a library of supercomputers and tools that scientists and engineers across the country can check out to run their experiments or analyze big data.
So Access provides a wide range of resources and services such as compute resources. These are the high-performance computing clusters that are CPU or GPU-based, storage resources, data storage systems for storing and managing large amounts of data, specialized support. So these are like additional tools and resources to help streamline research.
So no, almost any computer or any computer application that requires more than a desktop or laptop could qualify as needing an advanced computing system. Some examples include AI, machine learning, big data analysis, etc. Other ways to estimate when and why should you use advanced computing resources are if your tasks for research and their coursework should take minutes but are taking, you know, hours or days to compute, if your laptop regularly freezes due to high computational loads, if your laptop's CPU memory limitations and storage requirements are constantly being maxed out.
And Access also enables research across almost all fields of study from natural sciences like chemistry to formal sciences like math to social sciences. And to reiterate, Access resources are available at no cost, but the application process can be a bit competitive depending on the compute resources needed. So there's three different types of allocations that Columbia researchers and educators can use in order of ease of acquisition and the amount of resources available.
And yes, thank you, Jess, for pointing out something I also did want to state is that, you know, some people might have questions about like, you know, if we have on-prem HPC resources at Columbia, you might ask, why would someone be interested in Access? To that, I will say that I, you know, I strongly first recommend, you know, checking out our on-prem HPC resources for researchers, PIs who have grants or funds can utilize not only our HPC resources but they also have direct access to our HPC support team. So you'll have the ability to purchase either partial or full servers and you'll have complete access to the five-year lifespan of the hardware. However, you know, not everyone has funds or supporting grants.
So this is where Access comes in. It's available again at no cost to researchers. It removes the entry barrier to access compute resources.
And for more information, Jess, one of my teammates on this call provided some links with some information on how he can kind of access these shared HP clusters at Columbia University. So moving forward, looking at some of the use cases, Columbia researchers and affiliates are currently using Access in the following research fields in material simulations in geophysics and planetary sciences, cosmological simulations studying dark matter and dark energy, fundamental applied studies of turbulent flow phenomena, housing market economics studying the impact of real estate agents on housing prices, and many more. In terms of general trends in supercomputing, we're seeing that applications of high-performance computing are expanding in all different fields.
What used to be utilized in only STEM-specific fields are not being used in things like weather simulations and forecasting, molecular dynamics, digital twins for visualizing real-world objects, for example, like city mapping, and many more. I want to take a step back now to kind of take a look at the roadmap of Access at Columbia University. Note there might be a few terms here that you're not quite familiar with, but I'm gonna jump into this first and then I'll explain some of the terminology in the next slide.
So I'm going to start with XSEED, which is the Extreme Science and Engineering Discovery Environment, which launched back in 2011. It was a national collaborative initiative that provided high-performance computing resources and support for research in science and engineering that existed before Access. Columbia first started using XSEED back in 2017, and in 2022, XSEED was reprogrammed and succeeded by Access.
So this type of program isn't something new at Columbia University, it's been here for quite some time. The core difference is that Access provides a more modernized and decentralized approach to providing compute resources than its predecessor. So in 2022, our allocation was renamed to the Columbia Discover Allocation under the new Access program, and in September 2023, our team submitted a proposal to request more Access credits, which were approved of 750,000 credits supplement for a total of 1.5 million Access credits.
And back in August 2024, we then renewed our Discover allocation here at Columbia for another year. So yeah, I just used a lot of terminology that might seem really confusing, so I'm going to an Access project and the resource units that are provisioned to this project. To get resources on Access, you first need to get an Access project or allocation.
Campus Champions are institutional representatives that are familiar with the Access program. They manage allocations and offer guidance in getting started with the resource allocations. And for reference, I am a Campus Champion for Columbia University, so you can always reach out to me if you have any questions about Access credits.
These are the currency that's allocated in Access that allows to use a certain amount of resources. Generally, an Access credit is about one hour or one gigabyte of storage. Amounts can vary by resource.
So for example, a GPU hour delivers more computing, so that might cost you more credits. And with regards to navigating credits and how much they are worth, Access does have a exchange calculator on their web platform with more information on the exchange rates for different resource providers. This refers to the compute resources that are provided by an organization.
And then finally, resource provider. Resource provider is an organization that provides compute or storage resources. The majority of the computational resources provided via Access, including computing, data storage, software, etc., they are operated on their separate agreements between the NSF and the institution.
So Access is your gateway to gaining access to and getting support for these resources. On Access, there are four types of allocations you can request. They are the Explore, Discover, Accelerate, and Maximize.
These are the project types that one can request to get resources. So Columbia University has a shared Discover allocation on Access, which includes over 20 different resources across different resource providers. Users can join our Columbia Discover allocation to use the compute resources for things like small-scale testing and benchmarking.
And these resources, they have varying computing systems, different hardware, core counts, memory sizes, different accelerators, and more data storage systems, software libraries. So there's a lot to leverage. And anyone who's a researcher or educator at Columbia University can request to join our Columbia Discover allocation.
If you want to lead your own separate allocation, like your own separate Explore or Discover allocation, you must be a U.S.-based researcher or educator at the graduate student level or higher. So one thing I do want to know about Access is that these resources are for non-sensitive data use only. Research projects must be handling non-sensitive data in order to utilize Access resources because this is an open science program, fundamentally for fostering open public research.
So if your research does involve sensitive data, I highly recommend taking a look at alternative resources that our RCS team can support, such as the Secure Data Enclave, the SDE, or the Columbia Data Platform, the CDV. And we will, and thank you, yes, we have the resources and links in the chat. So, you know, I talked about the four different allocation types that represent four different project tiers.
So I'm going to break them down a little bit further. If someone does want to get their own allocation for their research, this is going to be a helpful guide in understanding each tier and the associated requirements. If we take a look at the example of the Explore allocation type, this will give you a maximum of 400,000 credits.
The project duration will generally be the length of the supporting grant, if you have one, or 12 months. Explore allocations are reviewed all throughout the year, so you can put in a request at any time. And the general requirements are that you're just going to need a project title and an abstract, like four to five sentences, explaining what you want to do.
So Explore, this is best suited for newer community members looking to dip their toes into the Access waters. In terms of general requirements, Access reps and resource providers are going to review why you want to use Access and what you're trying to do with the resources, and see if it's a good fit. If we go up to the next level, the Discover allocation, you can get a maximum of 1.5 million credits.
And again, the project duration is the same, and requests are reviewed all throughout the year. The additional requirement is that you're going to need a one-page proposal instead of just the project title and abstract. So there's a little bit more that's included.
And Discover is more catered to moderate-scale research computing, or even a large classroom environment for teaching purposes. Going up to the next level, Accelerate, you can get a maximum of 3 million credits. Again, same project duration, also reviewed all throughout the year.
However, this now requires a three-page proposal, and there is an Access panel merit review, meaning it undergoes a review by the Access review board due to the size of the allocation and the sheer amount of resources being provided. And then finally, we have the Maximize allocation.
So this one doesn't have a specific limit on resource credits; it's entirely dependent on the scope and size of the research needs. The project duration has to be 12 months for this one; it's very specific. And these requests are reviewed every six months.
This is going to require a 10-page maximum proposal, and again, like with Accelerate, it goes through an Access panel merit review. So if you're interested in Access resources but haven't fully fleshed out your compute needs, you can try using the Columbia Discover allocation for, again, small-scale testing and benchmarking to get a better sense of what resources you'll need.
And you can start with an Explore allocation once you realize you need your own separate allocation, and as your needs increase, you can request and scale your project up to a Discover allocation with the respective documentation requirements. So the best thing to do is start small and then upgrade later, to save yourself the headache of having to provide more documentation than needed. And eligible individuals can reach out to our RCS department to request to join our Columbia Discover allocation; Columbia Campus Champions such as myself, and also Xenia Radova, senior manager of RCS, can readily approve and add users to the allocation based on individual compute needs.
So how does one learn more or join our Columbia Discover allocation? There are two ways: through our National High Performance Computing Access Service page, and also the ServiceNow catalog intake form. Our team has developed the Access Service page on our website, which will be your one-stop shop to learn about Access, what resources are available, and other related resources. I highly recommend using the QR code that's displayed here to take a look at our page and see what Access offers.
So I'm just going to give a brief second in case anyone wants to whip out their phones and take that screenshot. And for the second one, I want to give a quick shout-out and thank you to Alan Raghunath and his team for helping us create the new Access Service in ServiceNow and the catalog intake form, which researchers can fill out to submit a request to join our Columbia Discover allocation specifically. This streamlined our process at CUIT for approving new requests, avoiding all of the back and forth that we would normally do and also saving researchers a ton of time.
So this can be found on the previous page using the blue Make a Request button here. In addition to research use cases, instructors can request their own Explore or Discover allocation for classroom and educational purposes, depending on how many resource units are needed for the class. Once you choose the correct project size and submit the request, students can then register and create their own Access ID to join.
So how does one create an Access project? To get started with your own Explore or Discover project, you're first going to have to create and register your Access account. To create your Access ID, you're going to log into access-ci.org and create your account. Your Access account will help you review your projects, allocations, and more.
So that's going to be the platform to help understand everything in terms of what your project is and what resources you're going to get. Step two will be select your project type based on your compute needs. If you're not sure how your needs translate into resource units, then you can first get started with an Explore Access account.
So after trying out different resources, again, you can always upgrade later to a larger project once you understand how your work corresponds to resource usage. Step three, choosing resources. Your most difficult choice will probably be choosing the resource or resources where you're going to conduct your work.
So Access has a resource catalog that will help you find the resources that suit your needs. The catalog has basic descriptions of each resource and its recommended uses. And while you're filling out the form, you'll also be able to add co-PIs, allocation managers, and other users to your allocation project.
Step four, you're going to prepare and submit your allocation requests with respective documentation materials. To complete an Explore Access request, again, you're going to need that project title. You're going to need that short abstract.
If you have any supporting grants, you can always include that information as well, and then a CV or resume. For an Explore Access request, you'll generally hear about the outcome within seven to ten business days.
And if you work with us and you're interested in our Columbia Discover allocation, we already have this Discover allocation that we manage, so we can readily provide you access to it. And then finally, step five: once you've been approved and received your Access credits, you can then exchange them for time on the resources that you want.
As a general statistic, about 95% of Access allocation requests are approved. The majority of the remaining 5% are people who were either unfortunately ineligible or were missing some documentation in the application process, in which case Access reps will reach out to clarify and give you a chance to resubmit or upload any documentation that was missed. For troubleshooting and additional questions, Access users can reach out to support representatives using the ticketing system available on the Access web platform.
It goes without saying that you will need an Access ID to access the system. Here you can ask as general or as specific a question as needed. If you have a specific troubleshooting question about one of the resources, through the ticketing system you'll be able to communicate with one of the support reps from the respective resource providers.
We also have a Q&A bot, which is a new feature that can help quickly resolve most general questions about Access resources. And Access also has a knowledge base filled with resources that were contributed by other CI professionals, so researchers, research computing facilitators, engineers, and many more people. Access community members regularly contribute to grow this knowledge base for others to use.
And on Access, I want to highlight that two of the most popular resource providers used across Access, and also specifically at Columbia University, are Indiana University's Jetstream 2 and the Pittsburgh Supercomputing Center's Bridges 2. So, some more information on both of these resources. Indiana University's Jetstream 2 is a flexible, user-friendly cloud computing environment that's designed and housed within Indiana University. Jetstream 2 is one of the many resources that are part of the Access platform, and it is primarily designed for researchers ranging from those with minimal high-performance computing experience to those interested in exploring cloud-native resources.
There are currently over 2,500 active users being supported on this resource, with an 8-petaflop cloud computing system and 17.2 petabytes of storage. It has over 400 compute nodes with AMD third-generation EPYC Milan CPUs, each with 128 cores and 512 gigabytes of RAM, and 90 GPU nodes with four NVIDIA A100s each. So why would someone want to use Jetstream 2? In addition to the fact that all of these resources are available at no cost through Access, it's a virtual cloud computing environment where users can create their own virtual machines with on-demand resources, meaning there are no queues or runtime limits.
Users also have full admin access to install software as needed for their research compute needs. These instances are easy to spin up and you can create a virtual machine whenever you need one so you're not waiting for some approval in order to use the resources. Jetstream instances also have full access to the internet and persistent IPs.
And again, just to note, Jetstream 2 is not primarily a storage device but a temporary location to stage your data for analysis. Bridges 2 is a high-performance computing resource at the Pittsburgh Supercomputing Center. It's designed to support a wide range of research, from traditional computational tasks to emerging fields like AI and machine learning.
There are three main types of compute nodes on Bridges 2. The regular memory nodes provide extremely powerful general-purpose computing for things like pre- and post-processing, AI inferencing, and more. The extreme memory nodes provide four terabytes of shared memory for statistics, graph analytics, genome sequence assembly, and more. And there are GPU nodes which provide exceptional performance and scalability for deep learning and accelerated computing.
Why would someone use the PSC Bridges 2? Bridges 2 is mainly optimized for machine learning, including support for deep learning frameworks like TensorFlow and PyTorch, as examples. It provides nodes with high memory capabilities, which are ideal for memory-intensive applications. You'll also have access to public data sets such as the 2019 novel coronavirus, AlphaFold, CosmaFold, and many others, and anyone who has a Bridges 2 account will have access to these.
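One thing to keep in mind is that, unlike Jetstream 2's on-demand virtual machines, Bridges 2 is a traditional batch system scheduled with Slurm, so work is typically submitted as a job script. Just as a minimal sketch of what that can look like, with the caveat that the partition name and module name below are placeholder assumptions and the exact values are in PSC's Bridges 2 documentation:

    #!/bin/bash
    #SBATCH --job-name=train-model
    #SBATCH --partition=GPU-shared   # assumed partition name; confirm in the Bridges 2 docs
    #SBATCH --gpus=1                 # request a single GPU
    #SBATCH --time=01:00:00          # one hour of wall time

    module load anaconda3            # module names vary by system; this is a placeholder
    python train.py                  # your own TensorFlow or PyTorch training script

You would save that as something like job.sh, submit it with "sbatch job.sh", and Slurm queues and runs it once the requested resources free up.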
All right. And now we'll be going into the live demo on Jetstream 2. I am going to just take a minute here to not only get a sip of water but also take a quick look at the chat to see if there are any questions.
I don't think we have anything open in the chat. Some folks were asking me about slides, and we will be providing those afterwards along with a recording. But yeah, if you have any live questions, also feel free to raise them now before the demo.
Yes, thanks Jess. Yeah, if anyone has any questions at this moment feel free to unmute yourself. And if not, we will then go ahead into the demo.
Okay, I will take the silence as no questions at this moment. All right, so now before we begin, I'm going to give some additional information on Jetstream 2. Jetstream 2 is built on OpenStack cloud computing infrastructure, and users can interact with it through one of three graphical interfaces: Exosphere, Horizon, and Cacao.
So in this tutorial we'll be focusing on Exosphere, which is the primary and most user-friendly interface. Horizon does have additional, more advanced tools for people that are more experienced and have been using Jetstream 2 for a while. And then Cacao is the next level further, which is predominantly used for building clusters.
So to log into Jetstream 2, I'm just going to open up a new tab and go to jetstream2.exosphere.app. Oh, I put a typo in there, so let me go there again. Okay, so to log into Exosphere you'll go to the Jetstream 2 Exosphere web platform, and for first-time users that are logging in, you're going to see an empty screen here because you're not logged into any allocations yet.
So you'll need to log in by clicking the plus sign here to add your allocation and sign into your access account. So I'm already signed in so I wouldn't need to do this but this is where you would start. So going back, you know, you'll then see all the different possible allocations that you've been added to.
The allocation here, TRA120004, represents our Columbia Discover allocation, but if you request your own Access Explore or Discover allocation on Jetstream 2, for example, it will populate right here. So once you click to open up your allocation, you're going to be presented with an overview page with several different cards showing your instances, volumes, and some more additional information. On Exosphere, users can create their own VMs in various sizes and flavors depending on the level of compute they need, and it also comes with one terabyte of storage by default.
The default storage system in JetStream 2 are volumes which you can attach to an instance and there you are going to read and write your data from your programs and then finally when you're done, you detach the volumes. So let's start with first creating an instance. What you're going to do is you're going to go to this red create button up here, click on it, and go to instance.
So instances are the computers that exist virtually in the cloud in Exosphere, and then you're going to be prompted for which OS you want. Think of these as images: they're the install disks used to boot up your machine, and they have all of the software that's used to run it. So here I'm just going to click Ubuntu with the latest version, and then you're going to be prompted to provide some details such as a name.
So I'm going to give it the name Access Lunch and Learn, and then you're going to be prompted for the flavor of your instance. The flavor refers to the size, as in how many compute resources you want on the instance. So keep in mind the service unit, or SU, cost is tied to the number of CPUs.
So one SU represents one CPU per hour of compute. I think I'm just going to go with the m3.small, and then if we scroll down further you can choose a root disk size. I'm just going to keep it at 20 gigabytes for now, and then it's going to ask how many instances.
So right now I'm just going to leave it at one, but let's say you're using Jetstream 2 in a classroom environment for educational purposes and you have 20 students. What you could do here is go up and choose 20, for example, and this is going to boot up 20 instances rather than having you manually go through and create one instance at a time. So I'm going to choose one for now.
I'm going to enable the web desktop. So we want to utilize this feature which I'll go into momentarily in this demo and then I'm going to click create. So it's now creating the instance.
So right now, if you go to this instance card, you'll see Access Lunch and Learn, and it's building. If I click through to the Access Lunch and Learn instance I just created, you're going to see some information, and you'll see the state that it's currently in. So right now, where my mouse is hovering, it's in the building stage, and it's going to go through a few phases here, changing from building, to running setup, to ready. And inside the info card section you'll see some general information: who created it, what OS, the flavor, and the burn rate.
If I scroll down, you'll then see some additional information about interactions, credentials, and volumes. I'm going to step back for a second: as you look here in the instance section, you'll see that by default Exosphere filters to instances you've created. And if your instance ever errors out while you're creating one, the best thing you can do is just build a new one.
Just delete and rebuild. It'll only take like a couple minutes. And again, so right now it's still building. So I click on that.
Yeah, and another important thing to point out is the burn rate. The m3.small I chose here has a burn rate of two SUs per hour. That's just to keep in mind how many resource credits you're using.
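Just as a quick back-of-the-envelope calculation based on that two-SUs-per-hour number, here's what forgetting about an instance for a month would cost:

    BURN_RATE=2              # SUs per hour for this m3.small flavor
    HOURS_PER_DAY=24
    DAYS=30
    echo "SUs consumed if left running for a month: $((BURN_RATE * HOURS_PER_DAY * DAYS))"
    # prints 1440, which is why shelving idle instances (covered later in the demo) matters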
And in terms of credentials here, it's going to get a default hostname and a public IP address, again open for various use cases. You'll see it shows a username and password here, and the username is exouser by default. So if you ever want to get into your instance using SSH, this is the username you'd use, and once the instance is done building, it's going to give you the passphrase information here.
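Just as a rough sketch of what that looks like from your own terminal, assuming the default exouser account and the public IP shown in the credentials card (the address below is only a placeholder):

    # Replace 149.0.0.1 with the public IP shown in your instance's credentials card.
    ssh [email protected]
    # You'll be prompted for the passphrase that Exosphere displays once the instance is ready.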
Then under Interactions here, this is where you're going to actually interact with your instance. Exosphere has a lot of handy features like the Web Shell and Web Desktop for users who are not familiar with a command-line interface. So once it's finished building, this is where you're going to be able to select those features.
So right now it's currently building, and what I'm going to do in the meantime is use another instance that I pre-built. So we're going to use that one going forward.
And here, you can see that it has a green ready sign, so I'm going to click through. Again, you'll see the default information up here, and you'll see some information about resource usage now that the instance is built.
And then again, all the other information down here. So now I'm going to go into that web desktop feature that I mentioned earlier. So I'm going to click on web desktop here in the interactions card.
And what you'll see is it opens up a new tab with the full desktop. So if you're familiar with the default Ubuntu desktop, this might look a little bit familiar. And even if you're not familiar with Linux, or even with Windows, it's fairly straightforward to navigate.
So I'm going to move over to these different icons, you know, for example, I can click on files, and this is going to open up your file browser. And you know, you can navigate to different folders here. And what you can do is you can also create folders.
So for example, I'm going to create just a new folder called test folder here. And let's see, you can open up other apps. So let's say I want to open up Firefox.
So again, this is open to the internet, so I'm opening up Firefox in the VM, and I want to navigate to docs.jetstream-cloud.org. This is going to direct me to Indiana University's Jetstream 2 documentation page, which is extremely helpful for getting a rundown of how to navigate the Exosphere, Horizon, and Cacao platforms.
So I'm going to close out of this. You'll also see by default, there's a few different apps that are here, like RStudio, MATLAB, they also have like Anaconda and a bunch of other software that's already pre installed. So you know, you can keep that in mind.
And the featured images on Jetstream 2 are kept up to date. For security purposes, if you need to create your own firewalls for any reason, you can, but by default everything is open. So let's say you want to upload some files to Jetstream 2. What you can do, which is a neat trick, is press Ctrl+Alt+Shift if you're on a Windows machine, or Ctrl+Command+Shift if you're on a Mac, which opens up a sidebar with additional features.
Jetstream uses something called Apache Guacamole to display this virtual desktop, and some browsers might not support direct copy and paste, so you can use this tool to copy and paste.
So here, you can upload files to your instance from your local computer by clicking here. I want to find my home folder, and then exouser, that's the username for this machine. And here's that test folder that I created earlier.
And here, you can click on upload files. And then you can just upload whatever files that you need to. So I'm going to click cancel, go back, and I'm going to close out of this.
So that's how you can upload files. Now to go into volumes. So you can think of volumes as like a flash drive of sorts, you store data on it, and you can move it between different instances.
And you can have multiple volumes attached to an instance, but each volume can only attach to one instance at a time. So we have our instance, and we want to create a volume. How do we do that? So what you do is you scroll back up, go to that red Create button.
And we're going to click on volume here and give it a name. So I'm going to call it lunch-learn-volume.
Oh, I already created that one for this demo, so let me just call it test. Fine.
And I'm just going to give it a default 10 gigabytes and click Create. So this created the volume. And I'm going to now go back to my instance.
So I'm going to go back to my CIT lunch learn instance. And then you'll see there's a section here called volumes. And I'm going to click attach volume.
And then you can select whichever volume you have, and then click attach. And then I'm going to go back to my instance. And if we scroll down to that volumes card, you'll see that it's attached here.
So now let's say we also want to create some data in this volume. So I'm going to create a new folder using the web shell feature here and create a file inside the folder that we created. So this is going to open up a new tab when you click on web shell, wait for that to load up.
And I am going to create a folder called my-folder, then create a text file called hello.txt inside that folder with the content "Hello world" in it. The commands I'm running in the Web Shell look roughly like this:
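(A rough transcription of those commands; the exact file name is my reading of the demo.)

    mkdir my-folder                              # create the folder
    echo "Hello world" > my-folder/hello.txt     # create the text file with its content
    cat my-folder/hello.txt                      # quick check: prints "Hello world"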
Okay. So now that I've run that, if we go back into the Web Desktop, I'm back here, and I go to Files.
And if I go into this my-folder that was created, we'll see that hello text document that was created. If I open that up, there's "Hello world". So everything worked out.
And if we want to transfer data, we can use the methods that were mentioned previously, either from your local machine or from the internet, as alternatives. And let's say we've run our work, we've run our analysis. Once you're done analyzing all of the data that you need to, you can then go ahead and detach a volume from an instance.
So the first thing to do is make sure you close out of any Web Desktop or Web Shell tabs that you have open. Then we're going to navigate back to the volumes page. Here you see this little arrow icon that shows which volume is attached to your instance, and we're going to click on detach.
And then it's going to give you a warning message just to make you aware of any potential data loss, and I'm going to click detach. And then it's detached. And if I go back to my instance and back to the page here, there's no volume attached. So that works.
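As a side note, everything we just did with volumes through the Exosphere GUI can also be scripted with the standard OpenStack command-line client, assuming you've set up CLI credentials for the allocation; the instance and volume names below are just placeholders:

    # Assumes the OpenStack CLI is installed and your Jetstream 2 credentials are sourced.
    openstack volume create --size 10 test-volume            # create a 10 GB volume
    openstack server add volume my-instance test-volume      # attach it to an instance
    openstack server remove volume my-instance test-volume   # detach it when you're done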
One final important note I did want to make is about instance management and SU consumption. For as long as an instance is running, even if you aren't actively using it, it's still consuming service units, those SUs.
And again, in reference to that burn rate up here, it's two SUs per hour for this instance that I have. If your instance isn't being used for a server or service that needs to be online 24/7, we do strongly recommend shelving your instances to avoid rapidly burning through SUs. Shelving automatically creates an image of your instance that's used to create a new, identical instance at a later time, when you can then unshelve it. So by shelving an instance, your SU consumption goes to zero. You can shelve an instance by going to the Actions item here.
And then you'll see there's a button called shelve. So I'm going to click that, and then click Yes. And it's in the process of shelving right here.
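And if you ever want to script this instead of clicking through the GUI, the OpenStack CLI has equivalent commands (again assuming CLI credentials are set up, with a placeholder instance name):

    openstack server shelve my-instance     # stop the instance; SU consumption drops to zero
    openstack server unshelve my-instance   # bring it back when you need it again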
And then we just want to ensure that we are fully optimizing and efficiently maximizing resources on our shared Discover allocation, which, again, is shared amongst Columbia researchers. But if you have your own Explore or Discover allocation, we also want to make sure that you're maximizing usage of the resources that are provisioned to you so you don't run through them too quickly. So that is a quick sneak peek of what you can do on Jetstream 2.
So I will now go back to our page. And I'm going to take a quick look in case there are any questions. But I think Helene already sort of addressed Darko's question.
But I think, and I have some insider knowledge that you'll probably get into this, he was asking: if you have heavy GPU usage and you're tapping into the NVIDIA GPUs, would CUDA be enabled for AI and ML? Or is there a better option for that? Yeah, so good question. So there are different resource providers that do support users with heavy GPU usage.
So it really depends on the different resource providers. Let me see if I can quickly go to the access-ci.org page to redirect you to where you need to go. So this is Access's general platform that has all the information that you need.
If you go to the resources tab, you can browse the different resources here. And then, you know, I want to filter by GPU specific resources. So I'm going to choose GPU compute.
So here, Darko, you're going to see flagged all of the different resources that have GPU capabilities. So I highly recommend, and I'm going to copy and paste this into the chat, taking a look to see if these resources meet your needs, and also what OSs are available.
So for example, for Jetstream 2, it has, you know, Ubuntu, you know, it's primarily a Linux environment. But you know, not all of these resources have the same OSs. Some take on additional OSs.
So you would have to kind of filter and go through here to find out what best suits your needs. Okay. And are there any other questions? And feel free to unmute as well if you have any questions.
Sorry, there was one about Jetstream asking: if you're using Jetstream 2 every day, should you shelve it each day and then reboot it each morning? I would recommend doing that, just because otherwise it will keep burning through the SUs you have overnight. Especially if you have your own Explore or Discover allocation, which has much fewer Access credits than an Accelerate or Maximize, you're just burning through credits, or core hours, you could have been using on the CPU or GPU resources. So it's better to shelve instances when you're not using them, save yourself some time and some resources, and then unshelve them when you're ready to use them again.
And yes, it saves electricity. One of the core philosophies or principles of Access is that we want to be sustainable, and that includes energy preservation. All right, so let me go back to our demo. And again, these are great questions.
And I'm going to go on. So I had a video demo just in case things didn't work out with our instances and everything, but everything worked out perfectly, which is great. But since we're on the topic of Access, I also want to share a few other relevant resources that might be interesting and catch people's eye. NAIRR, which is the National Artificial Intelligence Research Resource, is another program led by NSF that aims to connect US researchers and educators to the computational, data, and training resources that are needed to advance AI research, or research that employs AI.
So a NAIRR task force was established back in 2020 and launched in 2021. The NAIRR pilot aims to bridge the gap in accessibility of AI resources and tools for the broader research and education communities to advance trustworthy AI, protecting privacy, civil rights, and civil liberties. And many of NAIRR's resources are part of the Access program, and Campus Champion representatives at CUIT can help provide guidance to researchers who are interested in navigating access to NAIRR resources.
There's also the NAIRR Secure program that's co-led by the Department of Energy and the National Institutes of Health, NIH, as part of the NAIRR pilot. NAIRR Secure will enable research that involves sensitive data, which requires special handling and protections. And in terms of eligibility, this call is open to proposals by US-based researchers and educators from US-based institutions specifically.
So this includes academic institutions, nonprofits, federal agencies, startups, and small businesses with federal grants. Another core requirement is that projects on NAIRR have to cover cross-cutting or domain-focused areas of AI, so advancing AI methods that enable scientific discovery, or using large-scale models to explore complex data sets interactively.
So you might be wondering when someone should use Access and when they should use NAIRR resources. NAIRR is specifically for projects related to AI research and education and is recommended for research conducted for the duration of the pilot program. This pilot program will go on until January 2026.
Access is recommended for all different types of projects and fields, with the caveat that the data must be non-sensitive. If needed, you can renew or extend your Access allocations for the duration of whatever research needs you have.
If you have a research supporting grant, you can always extend it to run in parallel with that, so it's recommended for more long-term use. NAIRR is also holding a workshop called AI Unlocked, Empowering Higher Education to Research and Discovery, that's going to be held April 2nd to 3rd in Denver, Colorado.
And the primary goal of this workshop is to connect US-based higher education affiliates with valuable information and resources and training to deepen their understanding of how to leverage AI in their current work while also equipping people with the skills necessary to advance their careers and achieve specific professional objectives. If interested, I highly recommend applying. And with, you know, advancing computing resources for research, I'd be remiss if I also didn't talk about Empire AI.
So Empire AI is a New York State initiative that was launched by Governor Kathy Hochul to advance research aimed at addressing major societal challenges for the public good. The program focuses on fostering collaboration between public and private institutions to drive innovation, advance economic development, and position New York as a leader in AI. It aims to support AI research, talent development, and the growth of AI-driven businesses.
So myself and my colleagues on the Research Computing Services and HPC teams, Max Short and Al Tucker, giving them a quick shout-out, are currently all working as points of contact for Columbia University in this statewide initiative. And researchers with Empire AI have access to 13 HGX nodes, with eight H100 GPUs each, which are shared between Columbia, NYU, Cornell, CUNY, SUNY, and RPI. Finally, if you have any additional research computing questions, or you want to learn more about Access, I highly recommend reaching out and contacting us at rcs.columbia.edu. You can also check out our Research Computing Services page and a lot of the other resources and links that were shared in the meeting chat.
But other than that, thank you so much for attending and learning more about access. All right. And, you know, again, I know in the interest of time, we have 10 minutes left.
If anyone has any open-ended questions or things they want to learn more about, feel free to unmute yourself and ask here. And yes, the meeting recording is also going to be posted on our RCS video library. And I see Alan has a question.
Yeah, for Empire AI, how does it compare to Access? And is it live enough that people can start using it? It seems like you don't have to go through the application process the way you do with Access. All right. So for Access, it's kind of available whenever you need it.
It's reviewed all throughout the year. For Empire AI, the beginning pilot phase of using their Alpha system, as it's called, has already started. Back in fall 2024, they had an open call for project proposals, which have already been reviewed and approved.
And since then, we have actually been working with the Empire AI technical team and have been soft-onboarding different projects and PIs onto Empire AI. So it's currently in the works and it's being used. I would definitely love to talk more about it, if I can do another Lunch and Learn, but it's specifically, again, for people that want to research and use AI-related initiatives.
Sounds like something that would be interesting for Darko. Okay. Thank you.
Yeah. Any other questions? Well, if there aren't any other questions, again, thank you all for attending, and I will give you all back eight minutes. I will also put my UNI at columbia.edu address in the chat, in case anyone wants to directly ask any questions, but definitely reach out to us also at [email protected]. Okay. Thank you, everyone.