Researcher Resource Digest: Tools & services for Columbia researchers

The Researcher Resource Digest is a quarterly newsletter that spotlights systems and solutions that empower Columbia's researchers.

Together, the Office of the EVPR and CUIT Research Services want to share with you the tools that the University offers to facilitate and enhance your research work.

Archives

Check out back editions of the digest to review systems and services whose rollout you may have missed!

Extended queue-time for AI jobs on HPC

Columbia’s perpetual HPC cluster, Insomnia, just got a boost for AI research. CUIT HPC has implemented updated Slurm queues to better support large-model training, GPU workflows, and increasingly complex machine-learning workloads. The long queue now supports 7-day jobs, a new 14-day burst preemptible queue allows extended runs on idle group-owned hardware, and a high-priority hpc_test queue is available for workflow validation on owner nodes.

These enhancements are fully live and require no changes to existing pipelines.
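
For pipelines that choose a queue programmatically, the new limits can be captured in a small helper. This is a minimal sketch only: the partition names `long` and `burst` are assumed spellings of the queues described above, the function names are hypothetical, and hpc_test is left out because its time limit is not stated here. Check the CUIT HPC documentation for the actual names and limits.

```python
from datetime import timedelta

# Wall-time limits taken from the announcement above. The partition
# names "long" and "burst" are assumptions, not confirmed Slurm names.
QUEUE_LIMITS = {
    "long": timedelta(days=7),
    "burst": timedelta(days=14),  # preemptible: jobs may be interrupted
}


def pick_queue(walltime, allow_preemption=False):
    """Return a queue whose limit covers walltime, or None if none fits."""
    if walltime <= QUEUE_LIMITS["long"]:
        return "long"
    if allow_preemption and walltime <= QUEUE_LIMITS["burst"]:
        return "burst"
    return None


def slurm_time(walltime):
    """Format a timedelta as a Slurm D-HH:MM:SS time string."""
    total = int(walltime.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"
```

A job that needs ten days would only fit the burst queue, so it should checkpoint regularly in case it is preempted.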

Explore HPC purchase options


 

Columbia Data Club Sprint: Explore R, Python & GIS

Before the semester ramps up, Research Data Services is compressing six Data Club sessions into three weeks of Wednesday and Thursday afternoon gatherings. Join to learn advanced research data techniques with Python and R, or in a GIS environment.

Sign up for upcoming sessions


 

Gemini and NotebookLM available now!

Google Gemini and NotebookLM are now available to the Morningside Columbia community at no cost. Gemini supports writing, coding, data analysis, and research workflows, while NotebookLM helps you synthesize and question your own uploaded documents. Both are provided through a secure, enterprise-grade environment. 

Google will also be coming to campus in March for a workshop on using natural language code assistance tools to build data cleaning and processing pipelines – sign up for the Foundations for Research Computing mailing list to receive registration information!

Features, use cases & security info


 

Claude logo

Claude for Education Coming Soon

Claude for Education is coming soon to Columbia! Claude is an AI assistant created by Anthropic that supports writing, analysis, coding, and problem-solving. For $300/year, you will be able to access the latest Claude models in an environment with enterprise-grade security. Join the January 30 AI Community of Practice to learn more and see live demos.

Learn more about upcoming Claude service


 

NAIRR Pilot

Consider NAIRR to bolster your AI research

Supporting 500+ projects, the National Artificial Intelligence Research Resource Pilot is on its way to becoming a sustained HPC resource for NSF grant-holders. If you need access to advanced computing systems for your AI-driven research, join CUIT RCS for a webinar overview of NAIRR on October 29, and stay in the loop as NAIRR's offerings mature by subscribing to their newsletter.

Register for NAIRR webinar


 

IPUMS logo

It's here: IPUMS dataset now available

The IPUMS dataset is now available to all Columbia researchers. Covering global census and survey data from 1850–1950, IPUMS integrates information across time and space to support demographic, social, and economic research.

This restricted-use dataset is managed by the Columbia Population Research Center (CPRC) and hosted on the Columbia Data Platform (CDP). Apply for access by completing a project application and research agreement found on the CDP website (sign in and then click "Join Organization").

Start exploring IPUMS data


 

MarketScan: Data You Can Trust

Discover MarketScan data on CDP 

The Merative™ MarketScan® Research Databases have also arrived on the Columbia Data Platform. Curious? Join the 9/30/25 webinar to review the dataset's features and learn how you can leverage CDP for computing, storage, and discovery. These comprehensive clinical and claims data on nearly 300M patients, including lab results and mortality data, are often used for longitudinal studies, benchmarking, and healthcare policy impact analysis.

Faculty, staff, and students from all Columbia Schools are eligible to use the data. PIs and course instructors should check with their Department Administrator to see if they are part of an active user group. 

Questions? Email [email protected].

Register for the MarketScan webinar


 

Columbia Data Club

Columbia Data Club Relaunches

Research Data Services has updated its popular Data Club series this semester! This fall's meetings will cover a broader scope of topics than in years past, with sessions ranging from "Introduction to Python" and "Text Mining with TDM Studio and ChatGPT" to "Using American Indian and Alaska Native Census Data".

Meetings are held weekly on Thursdays in Lehman 215, and will be recorded in case you're unable to make it!

Sign up for Data Club mailing list

Explore BioRender Premium 

CUIT RCS is considering adding BioRender to our portfolio of discounted research software. BioRender has 50K customizable icons, helping labs quickly generate publication-quality figures for papers, grant submissions, and posters. Microbiology, immunology, and neuroscience labs across Columbia already collaborate using BioRender's cloud platform, and if there is sufficient interest, RCS will establish a group subscription for discounted licenses.

This summer, BioRender has extended Premium access to everyone at Columbia for free so your lab can explore the tool fully. Log in with your UNI, check out a webinar, and bookmark the order form to commit to a discounted license by August 29!

Order discounted BioRender licenses


 

Centralized REDCap now at Columbia!

This summer, the VP&S Office for Research launched a centralized REDCap instance that supports affordable and easy-to-use tools for research data collection and management. This new REDCap service is available to all researchers across Columbia. The VP&S REDCap provides new functionality for secure collection and management of clinical data from Epic, compliance with Columbia's organizational policies, and access to support and training from within the university. 

Join VP&S REDCap


 

Introducing CU CHAT: Columbia's new AI assistant

Explore AI securely with CU CHAT (CUIT’s new hosted toolkit). Built on LibreChat, CHAT lets you tap multiple models: OpenAI today, with Google Gemini and Anthropic Claude coming later this year. Replacing CU-GPT, CU CHAT offers a free daily quota, plus pay-as-you-go upgrades as new models arrive.

Check out CHAT


 

Expanded support for Secure Data Enclave 

New to handling sensitive data? If you'd like to conduct research using sensitive data but don't have the IT infrastructure to support your analysis, the Secure Data Enclave (SDE) has you covered! Columbia's SDE is RSAM-certified for PHI, RHI, and PII data storage and analysis, and requires a trained Data Security Officer (DSO) to facilitate secure data transfers. CUIT now offers DSO as a service, allowing PIs to contract with an RCS-trained IT professional to securely handle the onboarding and offboarding of your sensitive data.

Explore DSO as a service


 

"marketscan: data you can trust"

Unlock powerful health data for your research! 

Struggling with health data for benchmarking or AI model testing? Great news! The Merative™ MarketScan® Research Databases are now more accessible than ever for CU researchers. Comprehensive clinical and claims data on nearly 300 million patients, now including lab results and mortality data, will be made available on the Columbia Data Platform. Faculty from all Columbia Schools are eligible to use the data. Pre-docs, post-docs, and students may be eligible to use the data for certain projects too! PIs and course instructors should check with their Department Administrator to see if they are part of an active user group.

Email VP&S to learn more


 

Yellow warning traffic sign with exclamation point

Have graduates leaving your lab? Transfer their LabArchives notebooks! 

Graduates will lose UNI access to their LabArchives electronic notebooks shortly after commencement. If their notebook data is needed for your lab's research, be sure to ask them to transfer the ownership of the notebook to the lab's PI or manager before May 21, 2025.


 

Undergraduate researcher seated at a lab desk

Get research support this summer — Host an undergrad researcher 

Support your lab and mentor future scientists by hosting an undergraduate researcher this summer. Students can assist with projects, contribute fresh ideas, and gain hands-on experience. Post your opportunity on the Undergraduate Research Opportunities Platform (UROP) by May 21 to connect with motivated Columbia students seeking summer internships.

Questions? Email us at [email protected].

Submit a summer research opportunity


 

Globus logo

Expanded Globus support now available 

RCS now offers a dedicated, managed Globus endpoint for researchers who need a collection to transfer data to and from but lack the IT resources or expertise to configure an endpoint (server) to host it. Once established, this is a speedy solution for moving data between the SRCPAC HPC cluster and Box, AWS S3, Google Drive, and other supported cloud storage apps. You can even set up automatic repeating data syncs!

Ask about a Globus collection


 

Level up with workshops!

Need a refresher on R or SQL, or want to expand your Overleaf or SnapGene skills? Sign up for trainings crafted especially for Columbia's research community. Sessions range from the introductory "Machine Learning 4 Everyone" to the bespoke "Google Earth Engine Boot Camp". The Libraries' Data Club is even doing a Python deep-dive throughout the semester (including Polars and Xarray). Most sessions are free!

Bookmark the training calendar


 

AICoP is back!

The AI: Community of Practice at Columbia (AICoP) fosters collaboration and innovation in AI and machine learning. Open to faculty, researchers, and staff, the group hosts workshops, discussions, and projects exploring AI’s impact. February's session will feature Google showcasing tools like AI Studio and NotebookLM.

AICoP has been integral to Columbia's ChatGPT for Education rollout, and is excited to share that Anthropic’s Claude for Enterprise will be available at Columbia soon. Stay tuned!

Join AICoP


 

Photo of HPC server towers with colorful wires and lights lit up

Educational & Free HPC options move to Insomnia

Arts & Sciences, SEAS, and the Office of the EVPR have partnered to strengthen Columbia's free and educational HPC tiers. Now with a full dedicated node on SRCPAC's recently launched Insomnia cluster, users who qualify will share access to a server with 80 cores, 512 GB of memory, a 480 GB SSD, and 100 GB+ of high-speed interconnectivity. The partner groups have committed to adding a node each year, increasing capacity.

The education tier is ideal for course instructors teaching computational content, while the free tier offers researchers, postdocs, and graduate students access to HPC. Online documentation and recorded trainings are available.

Explore HPC options


 

Hierarchy of researcher needs, with Infrastructure at the base and Self-driven research at the top

Columbia faculty committee submits computing report

This summer, the Research Computing and Data Infrastructure Faculty Committee submitted a report with recommendations for future computing resources and staffing at Columbia, especially in the context of AI. 

Read the report


 

HPC server towers

CUIMC C2B2 installs upgraded HPC cluster

This fall, CUIMC launched an upgraded HPC cluster in the Center for Computational Biology and Bioinformatics (C2B2). With the support of a $2M grant from NIH, this new Dell HPC cluster features 12K CPU cores using the latest AMD EPYC processors, over 1M CUDA cores from NVIDIA GPUs, and an NVIDIA Superchip GH200. It also includes 64 compute nodes with large memory capacities (0.75–1.5 TB) and pre-installed software such as GCC, Python, R, MPI, MATLAB, and BLAST. This upgrade, the largest in 20 years, is designed for real-time data analysis, AI/ML workflows, and neural network applications.

Follow for more info & upcoming live demos


 

Computer user looking at text on two monitors

Explore AI/ML with Columbia Data Platform

Use the Columbia Data Platform (CDP), the University’s comprehensive solution for research data analysis and management, to streamline your next research project. CDP offers cloud computing resources, including GPU capabilities ideal for AI/ML tasks, such as automated text mining, audio transcription, and text anonymization. Additionally, CDP provides essential services like secure data storage, data discovery, real-time analysis with visual tools or APIs, collaboration features, and archiving, all accessible through a single web-based portal. Interested? Schedule a consultation to discuss further.

I'm interested


 

Insomnia HPC cluster logo: two googly eyes on a server tower

Buy into Insomnia, SRCPAC's shared HPC, at any time!

Columbia's research groups can now join or increase their share in the University's shared HPC cluster, without the wait! Orders for GPU and CPU servers were previously taken annually, but you can now submit requests for Insomnia orders as soon as your computing needs solidify. Server orders will be placed at least quarterly, based on demand, and the online server menu includes updated pricing and lead time estimates. 

Explore the Insomnia server menu


 

Image of computer terminal with text "ubuntu: ~$ sudo"

Linux-based SDE is here!

CUIT's Secure Data Enclave (SDE) platform is now accessible via Linux! Traditionally available via Windows, the SDE now offers Linux users secure Red Hat virtual machines certified for PII, PHI, and RHI data. Ideal for remote analysis and collaboration, the SDE is accessible to any project members with Columbia VPN access.

Review SDE specs


 

ClipArt image of networked avatar heads interconnected in a circle

Integrate AI into your workflow with AI Consulting Services

Introducing AI Consulting Services from CUIT’s Emerging Technologies Consortium (ETC): Researchers can email to receive personalized AI/ML guidance on their projects, including advice on techniques for enhancing studies, streamlining data analysis, and improving research outcomes.

CUIT is also developing AI-enabled research tools. We welcome your perspective in our feedback survey.

Request an AI consult


 

Over-the-shoulder image of a computer monitor displaying a busy calendar

Expand your research horizons

With a new academic year rapidly approaching, groups across Columbia have organized a range of workshops and trainings for researchers. From R 101 for Social Scientists to NVIDIA's 5 Ways to Accelerate Your Computing with GPUs, there is something for all skill levels and disciplines. Register now and bookmark our round-up of research-relevant sessions from across the University!

Check out upcoming sessions


 

NSF and ACCESS logos

Try out NSF's ACCESS HPC

ACCESS is an NSF-funded program of HPC resources available for FREE through merit-based applications. You can easily test ACCESS HPC resources with Columbia's Discover Allocation, which CUIT's ACCESS representatives can readily approve. ACCESS is a great option if you want to try new resources (CPU, GPU), need to run jobs for more than SRCPAC's current 5-day maximum wall time, have a small project, or simply don't have an HPC budget.

Learn More & Request ACCESS


 

ChatGPT Enterprise logo

Available Now: Columbia ChatGPT Enterprise

Columbia ChatGPT Enterprise is a walled garden designed to prioritize data privacy and is equipped with faster speeds (up to 2x), a larger 128k context window, and no message caps. No data or customer prompts are used for training models. Advanced features include custom GPTs, DALL·E 3, browsing, voice, and advanced data analytics.

More Info & Order Form


 

abstract pixelated image of grey and green hexagons

Apply Today for a Unique Data Grant

Columbia's Libraries are committed to helping researchers gain access to datasets for their research. To broaden access to these expensive resources, the Unique Data Grant program connects researchers with Libraries staff with expertise in collections acquisitions for one-time purchases of unique datasets. Columbia faculty, students and staff can submit an expression of interest by May 30, 2024.

Learn More & Apply for Dataset


 

Globus logo over a data transfer arrow connecting Point A to Point B

Go BIG with Globus

Transfer GB- and TB-sized data quickly, reliably, and securely for free! CUIT's Globus subscription includes unlimited endpoints, including your laptop, HPC node, AWS bucket, Google Drive, SharePoint, Box and more. Use Globus to send data efficiently between systems and across organizations, within and outside of Columbia.

Learn More


 

Columbia logo atop "Emerging Technologies: AI Community of Practice"

AI Community of Practice

Join the Emerging Technologies' AI: Community of Practice (AICoP) to delve into AI/ML through collaborative discussion, learning, and application. Our mission is to demystify AI, spur innovation, and approach challenges with an AI-centric perspective. AICoP meets on the 4th Friday of the month at 9:30 am ET.

I'm interested


 

Aerial view looking down on a cluttered table with several laptops and smartphones

Unlock software deals

Did someone say Adobe discount?! On top of University-sponsored free software, Columbia has negotiated reduced prices on many analysis and publication programs, including favorites like Adobe Creative Cloud, MATLAB, and GraphPad Prism. We're rounding up the research-relevant options all in one place!
 

Explore research apps


 

Overleaf Professional logo

Overleaf for all!

5,000 Columbia users can't be wrong. CUIT teamed up with the Libraries this summer to bring Overleaf Pro licenses to everyone at Columbia. Overleaf is the leading collaborative LaTeX editor with templates for publishing scientific documents, papers, bibliographies, and more. A Pro license includes unlimited collaborators, version tracking, and offline editing.

Create or link your Overleaf account


 

Two young students in lab coats running an experiment

Find undergrad talent

Undergrads now have a dedicated database of research opportunities, surfacing open positions across all of Columbia's campuses. Post your open research role today to attract bright, enthusiastic applicants who are eager to learn!

Learn more about the Undergrad Research Opportunity portal


 

Pie chart from the MyGrants dashboard: Budget Overview

Visualize your grant spending

Use the MyGrants dashboard to track your grant finances at a glance! MyGrants makes it easier for Columbia Principal Investigators to monitor their expenses in real time, see currently available funds, and plan for the future by visualizing sponsor-committed funds as listed in their Notice of Award.

Learn more & log into MyGrants

CU Research Software Roundup

Columbia deals on popular analysis and publication software (plus a few transfer/storage bonuses), available to active researchers, faculty, and students. Email us at [email protected] with any additions or updates!

Offerings change frequently, so also check: