Research Data Transfer — Globus

Secure, efficient and reliable file transfer service for large, non-sensitive data transfers within Columbia and to external collaborators.

Globus is a robust, cloud-based file transfer service, developed by the University of Chicago and designed to move many large files. Columbia's Globus subscription includes Globus Connect and Open Access.

Why Globus?

Fast

If you transfer large files or large collections of files (TBs, or even PBs) that take 15+ minutes, Globus is highly recommended to speed up your data transfers. Globus is an efficient alternative to utilities such as scp, sftp, and rsync over ssh, which are best suited to small datasets.
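As a rough way to judge whether a transfer is likely to cross the 15-minute mark mentioned above, the arithmetic can be sketched in Python. This is only an estimate under stated assumptions: the function names, the 80% link-efficiency default, and the use of decimal gigabytes are illustrative choices, not part of Globus.

```python
def estimated_transfer_minutes(size_gb: float, link_mbps: float,
                               efficiency: float = 0.8) -> float:
    """Rough transfer time in minutes for size_gb (decimal GB) over a link_mbps link.

    efficiency is an assumed fraction of the nominal link speed actually achieved.
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0, efficiency in (0, 1]")
    size_megabits = size_gb * 1000 * 8          # GB -> MB -> megabits
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 60


def likely_long_transfer(size_gb: float, link_mbps: float,
                         threshold_minutes: float = 15.0) -> bool:
    """True when the estimate meets or exceeds the 15-minute rule of thumb."""
    return estimated_transfer_minutes(size_gb, link_mbps) >= threshold_minutes
```

For example, `likely_long_transfer(1000, 1000)` evaluates a 1 TB dataset on a nominal 1 Gbps link; real throughput depends on the constraints listed in the FAQ.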

Reliable

If your data transfers may be interrupted due to an unreliable connection or exceeded disk quota, Globus is a great solution since it automatically resumes your data transfer in the case of temporary disconnections.
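To illustrate what resuming a transfer means in general, here is a minimal local-file sketch in Python. It is not how Globus implements resumption; it simply assumes the bytes already present at the destination are a valid prefix of the source and continues from that offset. The function name and chunk size are hypothetical.

```python
import os


def resumable_copy(src: str, dst: str, chunk_size: int = 1024 * 1024) -> int:
    """Copy src to dst, continuing from however many bytes dst already holds.

    Returns the number of bytes copied during this call.
    Assumes any existing bytes in dst are a correct prefix of src.
    """
    total = os.path.getsize(src)
    offset = os.path.getsize(dst) if os.path.exists(dst) else 0
    if offset > total:
        offset = 0  # destination is longer than the source: start over

    # "r+b" keeps existing bytes; "wb" creates or truncates the file.
    mode = "r+b" if offset else "wb"
    with open(src, "rb") as fin, open(dst, mode) as fout:
        fin.seek(offset)
        fout.seek(offset)
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
    return total - offset
```

Calling the function again after an interruption copies only the remaining bytes, which is the behavior the paragraph above describes at a much larger scale.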

Secure

Globus integrates with the grid security infrastructure and adds encryption to both the data and control channels for moving data between two endpoints (e.g. your computer, HPC clusters, Google Drive, OneDrive, etc.). As a result, the data moves directly between the source and destination endpoints and cannot be accessed or stored by Globus, only by the GridFTP servers running on your managed endpoints. 

NOTE: At this time, sensitive data of any kind is not permitted with Columbia's Globus subscription. Globus is not certified for sensitive data and may only be used for non-sensitive, non-confidential, and otherwise unrestricted data, as specified by the Columbia University Data Classification Policy.

Convenient

With Columbia's Globus Connect and Open Access subscriptions, you can create a data-sharing endpoint on almost any device: your laptop or personal desktop, campus HPC clusters, lab servers, Google Drive, Amazon S3 bucket, Box, OneDrive, and more.

Collaborative

You can securely transfer data both in and outside of Columbia using Globus. The basic Globus transfer service is free for all non-profit organizations, so transferring data to external collaborators outside of Columbia is likely free for them as well!

Globus terminology

High-level definitions for quick reference; see Handling Collections vs Endpoints if more detail is needed.


How do I get started with Globus?

If you are new to Globus, follow these steps to create your Columbia Globus account:

  1. Navigate to the Globus login page.
  2. Select Columbia University from the drop-down (you can type the first letters to narrow results).
  3. Log in with your UNI and UNI password, and authenticate with Duo.
  4. Select your preferred permission level for releasing your account information to Globus. Many users select the middle option.
  5. To upgrade your account to Globus Plus, see below.

If you already have a Globus account from another organization, log in as described above and choose Link to an existing account. The Identity Linking Tutorial explains in detail how Identity Linking works.

  1. Request access to the Columbia University Standard subscription.
  2. While you wait to be approved, download Globus Connect Personal to set up a data transfer endpoint on your own Mac, Windows or Linux system. 
  3. Optional: Follow Globus' tutorial to practice sharing data.
  4. Optional: If you plan to share data from your computer directly to another Globus user, you must enable sharing in your Globus Connect Personal app. Click the Globus app icon (in the upper-right menu bar on Macs, or the lower-right toolbar on Windows), select Preferences, choose the Access section, and check the Sharable box.

1. Log into Globus with your @columbia.edu identity.

2. Open Globus Connect Personal on your computer (see above to install GCP).

3. Navigate to the File Manager in Globus from the left-hand navigation panel.

4. Enter the name of your Globus Connect Personal collection at the top of the left panel (or vice versa). Tip: the name of your collection can also be found under Bookmarks --> Your Collections.

5. Enter "CUIT Ginsburg Google Drive" at the top of the right panel (or vice versa).

Globus File Manager screen with Personal Collection and CUIT Ginsburg Google Drive collections entered as endpoints

6. On the left, select the file(s) you would like to transfer.

7. On the right, select the destination folder for the files (MyDrive is the top-level location for LionMail Drive). If you don't select a specific folder, the file(s) will be placed in the top-level Drive location.

8. Click the Start button at the top of the panel you are sending data from. You will see a pop-up indicating that the transfer is in progress.

9. You will receive an automated email from Globus Notification when the transfer is complete. You can also monitor progress on the Activity page in Globus (accessible from the left-hand navigation panel).

Globus File Manager with left-hand Start button circled and "Transfer request submitted successfully" pop-up on right
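The same kind of transfer can also be requested from a terminal with the Globus command-line interface, which is not covered in the walkthrough above. The sketch below only assembles a `globus transfer` command as a list of arguments; it does not run it. It assumes the CLI is installed and you have already run `globus login`. The collection IDs and paths are hypothetical placeholders, and the helper name is illustrative.

```python
import shlex


def build_transfer_command(source_id: str, source_path: str,
                           dest_id: str, dest_path: str,
                           label: str, recursive: bool = False) -> list:
    """Assemble (but do not execute) a `globus transfer` command line."""
    cmd = [
        "globus", "transfer",
        f"{source_id}:{source_path}",   # source collection ID and path
        f"{dest_id}:{dest_path}",       # destination collection ID and path
        "--label", label,               # label shown on the Activity page
    ]
    if recursive:
        cmd.append("--recursive")       # needed when transferring a directory
    return cmd


# Hypothetical placeholder IDs; substitute your own collection UUIDs.
example = build_transfer_command(
    "SOURCE-COLLECTION-UUID", "/data/run1/",
    "DEST-COLLECTION-UUID", "/backup/run1/",
    label="run1 backup", recursive=True,
)
print(shlex.join(example))
```

Printing the command first makes it easy to review the source and destination before submitting anything; the list could then be passed to `subprocess.run` if you choose to execute it.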

1. Log into Globus with your @columbia.edu identity.

2. Open Globus Connect Personal on your computer (see above to install GCP).

3. Navigate to the File Manager in Globus from the left-hand navigation panel.

4. Enter the name of your Globus Connect Personal collection at the top of the left panel (or vice versa). Tip: the name of your collection can also be found under Bookmarks --> Your Collections.

5. Search for the name of the CUIT HPC cluster at the top of the right panel (or vice versa). All users that have an HPC account will have automatic access to their cluster's collection.

Globus Web App page asking for permission to connect to CUIT HPC cluster collection

6. Once you select the cluster, you will need to authenticate your HPC account within Globus. Click Allow.

7. On the left, select the file(s) you would like to transfer.

8. On the right, specify the destination for the files.

9. Click the Start button at the top of the panel you are sending data from. You will see a pop-up indicating that the transfer is in progress.

10. You will receive an automated email from Globus Notification when the transfer is complete. You can also monitor progress on the Activity page in Globus (accessible from the left-hand navigation panel).
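If you prefer to check on a transfer from a script rather than waiting for the email or watching the Activity page, the general pattern is a polling loop. The sketch below is a generic illustration: `fetch_status` is a hypothetical callable you would supply (for example, a wrapper around whatever tool you use to query Globus), and the loop assumes the task status strings ACTIVE, INACTIVE, SUCCEEDED, and FAILED.

```python
import time
from typing import Callable

# Statuses after which the task will not change any further.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def wait_for_task(fetch_status: Callable[[], str],
                  poll_seconds: float = 30.0,
                  max_polls: int = 120,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll fetch_status() until a terminal status or max_polls is reached.

    Returns the last status seen, which may still be non-terminal
    if the polling budget ran out.
    """
    status = fetch_status()
    polls = 1
    while status not in TERMINAL_STATUSES and polls < max_polls:
        sleep(poll_seconds)
        status = fetch_status()
        polls += 1
    return status
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays; in normal use the default `time.sleep` is fine.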

FAQ

Globus uses GridFTP, a high-performance extension to FTP optimized for high-bandwidth, wide-area networks. It provides more reliable, higher-performance file transfer and synchronization than ftp, scp, or rsync. GridFTP automatically tunes parameters to maximize bandwidth by selecting the most appropriate settings for concurrency and parallelism on every transfer task.

That said, Globus transfers are still subject to your local environment's constraints, including:

  • Local network speed (check your current speed here)
  • Endpoints: Transfers involving a personal endpoint are likely to be slower than transfers between institutional endpoints. If you are transferring to a storage device, SSDs with USB-C or USB 3.0 connectors are recommended to optimize speed (rather than HDDs or SSDs with older connectors).
  • Resources: The load or available resources (RAM, CPU, etc.) of the source and destination collections
  • Storage systems: The performance of the source and destination storage systems
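To give a feel for the parallelism mentioned in the answer above, the following Python sketch splits one file into byte ranges and copies them with several worker threads. It is a local-disk illustration of the general idea only, not GridFTP itself and not how Globus chooses its settings; the function names and the default of four streams are assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(total: int, streams: int) -> list:
    """Split [0, total) into up to `streams` contiguous (offset, length) ranges."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    base, extra = divmod(total, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def _copy_range(src: str, dst: str, offset: int, length: int) -> None:
    # Each worker opens its own handles and writes only its own range.
    # The whole range is read into memory, which is acceptable for a sketch.
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))


def parallel_copy(src: str, dst: str, streams: int = 4) -> None:
    """Copy src to dst using several threads, one byte range per task."""
    total = os.path.getsize(src)
    with open(dst, "wb") as fout:
        fout.truncate(total)            # pre-size the destination file
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [pool.submit(_copy_range, src, dst, off, n)
                   for off, n in byte_ranges(total, streams)]
        for future in futures:
            future.result()             # re-raise any worker error
```

Whether more streams actually help depends on the same constraints listed above: a slow disk or a saturated network link will cap throughput regardless of how the work is divided.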