Columbia Data Platform — CDP

A cloud-based solution for research data storage, discovery, analysis, collaboration, and archiving

The Columbia Data Platform (CDP) service is currently being piloted by CUIT. Please check this page for updates on a wider rollout.

About

CDP uses Google Cloud Platform for storage and compute and is powered by Redivis.

Features
  • Data storage and discovery
    • Upload numeric, text, structured, and unstructured data through file uploads or APIs
    • Curate and tag rich metadata for easy discovery
    • Search across datasets, metadata, and variables (see examples of public datasets on CDP)
  • Data analysis and exploration
    • Filter, merge, and analyze billions of records in real time
    • Use a visual interface or APIs for analysis
    • Bring disparate datasets together into mashups with ease
  • Collaboration
    • Analyze, visualize, and share data within Columbia and beyond
    • Set granular access controls on datasets, projects, and analyses
    • Track versions and usage automatically
    • Export data and visualizations to multiple formats
  • Data archive (coming soon!)
    • Move datasets between hot, cold, and archive storage tiers to reduce cost
    • Use persistent links to archived data for reproducibility
    • Access data instantly at any time