Empire AI
Empire AI provides high-performance compute resources to eligible PIs across a consortium of New York State research institutions to advance safe, equitable, and responsible AI research for the public good.
Examples of Existing Columbia Project Categories on Alpha
- Foundation Models and Generative AI
- AI for Scientific Discovery and Simulation
- Natural Language Processing and Understanding
- Machine Learning Methodology and Applications
- And more…
Timeline
Empire AI was launched in April 2024, marking the beginning of a multi-phase initiative to expand AI research capacity across New York State. By October 2024, the Alpha system became operational at the University at Buffalo, followed by the Beta system award and the conceptual design of the Gamma system in January 2025.
The Beta system is scheduled to come online at the end of May 2026, and construction of the Gamma facility will begin in early 2027. The Gamma facility is expected to open in mid-to-late 2027, with the full Gamma system becoming operational in the first half of 2028.
Alpha Hardware Specifications
The Alpha system initially consisted of 13 HGX nodes with 8 H100 80GB GPUs per node and 4 PB of storage across DDN and VAST. In Fall 2025, Alpha was upgraded to Alpha++, which now has 24 HGX nodes:
24 HGX Nodes
- Nodes alphagpu01 - alphagpu18: 8 H100 80GB GPUs per node
- Nodes alphagpu19 - alphagpu24: 8 H200 80GB GPUs per node
- 10 400Gb/s ConnectX-7 NIC cards (8 for IB and 2 for Ethernet)
- 30TB NVMe caching space
- 2TB of system memory
- x86 processor
- Storage: 2PB of DDN; 2PB of VAST (all flash)
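As a quick check on the node list above, the GPU totals for Alpha++ can be tallied with a few lines of Python. This is only an illustrative sketch: the dictionary layout and variable names are assumptions made for the example, and the node counts are read off the alphagpu01 - alphagpu24 ranges.

```python
# Node counts and GPUs per node, read from the hardware list above.
node_groups = {
    "H100": {"nodes": 18, "gpus_per_node": 8},  # alphagpu01 - alphagpu18
    "H200": {"nodes": 6, "gpus_per_node": 8},   # alphagpu19 - alphagpu24
}

# GPUs per type = number of nodes * GPUs per node.
gpus_by_type = {
    name: group["nodes"] * group["gpus_per_node"]
    for name, group in node_groups.items()
}
total_gpus = sum(gpus_by_type.values())

print(gpus_by_type)  # {'H100': 144, 'H200': 48}
print(total_gpus)    # 192
```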
Research: There are currently over 120 active projects and 320 users across all the institutions on Empire AI.
Beta Hardware Specifications
Compute: NVIDIA DGX GB200 SuperPOD
- 4 NVL72 racks with 288 B200 GPUs for AI applications
- 60 Grace-Grace Superchip nodes for data processing and general HPC use
Storage: ~10 PB DDN; ~20 PB VAST all flash
Software: NVIDIA AI Enterprise Software
Institutional Allocation & Service Units
Empire AI provides a range of compute nodes and storage options, each with varying performance levels and costs. To ensure consistency in measuring and managing usage, all resources are expressed in a unified metric known as Service Units (SUs).
Alpha and Beta will deliver 6.3M SUs per year, with each institutional allocation set at 700,000 SUs per year.
With the Beta phase, all compute resources are assigned a value in Service Units, for example:
- 1 Alpha GPU hour = 1 SU
- 1 Beta GPU hour = 2 SU
- 1 Grace-Grace node hour = 0.5 SU
- 1 TB-month = 8.333 SU, up to 100 TB; data over 100 TB will be free but allocated and managed separately
- Priority queue access: 2x the above
- Shared resource access: 0.5x the above
- The conversion factors above are termed multipliers.
- Cost of a job in SUs = duration * amount of resources * multiplier
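The formula above can be sketched in Python as follows. This is a minimal illustration, not an official Empire AI calculator. The function names `job_cost_su` and `storage_cost_su`, the dictionary keys, and the "standard" queue name are hypothetical. The sketch also assumes that the priority and shared factors multiply the base resource rate, and that only the first 100 TB of storage is billed.

```python
# Base rates in SUs per resource-hour, taken from the list above.
BASE_RATE_SU = {
    "alpha_gpu": 1.0,         # 1 Alpha GPU hour = 1 SU
    "beta_gpu": 2.0,          # 1 Beta GPU hour = 2 SU
    "grace_grace_node": 0.5,  # 1 Grace-Grace node hour = 0.5 SU
}

# Queue factors from the list above; "standard" (1x) is an assumed name.
QUEUE_FACTOR = {"standard": 1.0, "priority": 2.0, "shared": 0.5}

STORAGE_SU_PER_TB_MONTH = 8.333
BILLED_STORAGE_CAP_TB = 100  # assumption: storage above this is not billed


def job_cost_su(resource, hours, count, queue="standard"):
    """Cost of a job in SUs = duration * amount of resources * multiplier."""
    multiplier = BASE_RATE_SU[resource] * QUEUE_FACTOR[queue]
    return hours * count * multiplier


def storage_cost_su(terabytes, months):
    """Storage cost in SUs, billing only the first 100 TB (assumed reading)."""
    billed_tb = min(terabytes, BILLED_STORAGE_CAP_TB)
    return billed_tb * months * STORAGE_SU_PER_TB_MONTH


# Example: 8 Beta GPUs for 10 hours in the priority queue.
print(job_cost_su("beta_gpu", hours=10, count=8, queue="priority"))  # 320.0
```

Under these assumptions, the same 8-GPU, 10-hour job would cost 160 SUs in the standard queue and 80 SUs with shared resource access.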
Cost Per Service Unit
For Empire AI, 1 SU will equate to $0.50. For Columbia users, the University will subsidize 50% of the cost, so the effective rate is $0.25/SU. Individual Schools and Departments may provide additional subsidies to further reduce the effective SU cost for their users.
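To make the pricing concrete, here is a small Python sketch that converts SUs to dollars for a Columbia user. The function name `cost_usd` and the `extra_subsidy` parameter are hypothetical. How a School or Department subsidy is actually applied is not specified above, so the sketch simply assumes it is a further fractional discount on the $0.25/SU rate.

```python
LIST_PRICE_PER_SU = 0.50  # dollars per SU, from the text above
COLUMBIA_SUBSIDY = 0.50   # fraction of the cost covered by the University


def cost_usd(service_units, extra_subsidy=0.0):
    """Dollar cost to a Columbia user for a given number of SUs.

    extra_subsidy is a hypothetical additional fraction (0 to 1) of the
    already-subsidized price covered by a School or Department.
    """
    rate = LIST_PRICE_PER_SU * (1.0 - COLUMBIA_SUBSIDY)  # $0.25 per SU
    return service_units * rate * (1.0 - extra_subsidy)


# Example: a job that consumed 320 SUs, with no extra subsidy.
print(cost_usd(320))  # 80.0
```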
Eligibility
Only those who hold a full-time appointment in a PI-eligible position at Columbia University may submit project proposals for Empire AI. For details on PI eligibility, see: https://research.columbia.edu/pi-eligibility-sponsored-projects
Contact Us
Please send any inquiries to [email protected]. A ServiceNow ticket will be generated and a member of the Columbia Empire AI Support Team will be able to assist further.
Frequently Asked Questions
A Service Unit (SU) is the standard unit of accounting that Empire AI uses to measure and manage resource usage across its systems.
Because Empire AI includes different types of compute nodes (Alpha GPUs, Beta GPUs, Grace-Grace nodes) and storage, each with different performance characteristics and costs, the SU provides a common currency that allows all usage to be compared fairly, regardless of the underlying hardware or service.
Institutions and projects will not be charged for more than their allocation. Projects and institutions that have exceeded allocations will only have access to free, low-priority, preemptible queues (that are otherwise inaccessible) and will be requested to reduce their storage footprint as described in the Empire AI terms and conditions. This policy provides a smooth end to projects, helps ensure full system utilization, and encourages projects to promptly use allocations. Users with no allocation will be transitioned off the system after some period of time, again as described in the terms and conditions.
While Beta will not be HIPAA compliant when it first starts operations later this year, we anticipate compliance by late 2026 for a subset of the system.
