Secure Data Enclave — SDE
A secure virtual desktop research environment.
The Secure Data Enclave (SDE) provides Columbia researchers with a highly secure, remotely accessible, virtual Windows 10 or Red Hat Linux desktop environment to collaboratively analyze and store sensitive data (PII, RHI, and PHI). The SDE functions as a virtual, remote-friendly alternative to traditional "cold-room" computing environments, in which a physical computer disconnected from the internet is used for sensitive data analysis.
Using Citrix remote desktop, researchers can work on sensitive data and collaborate with other members of their project simultaneously. SDE projects are logically isolated; researchers can access only the data explicitly uploaded to their own SDE project's virtual environment. Moreover, they are restricted from internet access and can only reach applications installed within the SDE (e.g. Stata, R, Python). Data can be transferred to and from the system only by the project's designated "Data Security Officer" (DSO), a role that every project is required to fill. All data is securely wiped after project retirement in compliance with DoD 5220.22-M standards.

Announcements
- RCS now offers a Data Security Officer (DSO) service for project teams that are unable to identify a qualified Columbia staff member to serve as their project's DSO. RCS' DSO is a trained IT professional who can securely support the onboarding and offboarding of sensitive data -- learn more about DSO as a service!
SDE Details
The SDE is HITRUST certified by CUIMC Security as HIPAA-compliant, and is certified for the storage and analysis of PII and PHI data. Users can reference the CUIMC Security RSAM registration ID number (3868), which confirms the SDE's certification by CUIMC Security for HIPAA and sensitive-data compliance.
Additionally, the SDE has been approved for use with popular restricted datasets, including the Bureau of Labor Statistics National Longitudinal Surveys (NLSY) datasets, University of North Carolina Longitudinal Study of Adolescent Health (Add Health) datasets, and European Commission Eurostat restricted economic datasets.
Researchers on the SDE have the option to choose either a virtual Windows 10 or a Red Hat Linux desktop system, each powered by 4 cores of an Intel I8462Y CPU and 16GB of RAM.
Storage allocation is partitioned between shared and individual user storage. Standard projects receive:
- 100 GB of raw data storage in a shared Data Directory
- 100 GB of collaborative workspace in a shared Group Work Directory
Individual users are each provided with:
- 50 GB in each individual Working Directory
- 2 GB in each individual Home Directory designated for code files
- 5 GB in each individual Output Directory, for staging files that need to be relocated from the SDE
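As a rough illustration of how these allocations add up, the Python sketch below totals the standard storage for a project. The function name `total_storage_gb` is hypothetical, and the sketch assumes that every account counted receives the full individual allocation; whether the DSO account does is not stated here, so treat the result as an estimate.

```python
# Standard SDE storage allocations in GB, taken from the list above.
SHARED_GB = {"data": 100, "group_work": 100}
PER_USER_GB = {"working": 50, "home": 2, "output": 5}


def total_storage_gb(num_users: int) -> int:
    """Estimate total standard storage (GB) for a project.

    Assumption: each of the `num_users` accounts receives the full
    individual allocation (Working + Home + Output directories).
    """
    if num_users < 0:
        raise ValueError("num_users must be non-negative")
    shared = sum(SHARED_GB.values())      # 200 GB shared per project
    per_user = sum(PER_USER_GB.values())  # 57 GB per individual user
    return shared + num_users * per_user


print(total_storage_gb(4))  # 200 + 4 * 57 = 428
```

Under these assumptions, a project with four researchers comes to 428 GB of standard storage.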
If increases in CPU, RAM, or storage resources are necessary, reach out to [email protected] to request a review by RCS. Such changes may incur additional fees to acquire and provide said resources.
The Research Computing Services (RCS) team handles software installations and updates. Currently, the SDE supports many research analysis packages, including Stata, R, Python, STAN, and QGIS. Other programs, depending on licensing availability, have included SPSS, SAS, Matlab, and more (note that the cost of user licenses must be paid by the project owner).
The standard offering is five accounts: 1-2 Principal Investigators (PIs), 2-3 Research Assistants (RAs), and 1 Data Security Officer (DSO). For reasons of both security and system access volume, we ask project applicants to limit project membership to as few users as necessary. If more researchers require access, they can be accommodated, but there may be additional costs.
Users must have a UNI and VPN access to use the SDE. Outside collaborators can get a UNI and VPN access through appropriate department-level HR status. CUIT’s RCS team manages accounts for the SDE for Columbia-affiliated users.
Researchers using the SDE system must identify a Data Security Officer (DSO).
The DSO should not be a researcher with data analysis responsibilities on the project. The DSO is responsible for:
- Loading and removing the restricted-use data to/from the SDE
- Retrieving output on behalf of their project members
- Ensuring that all materials exported from the SDE do not violate the data use agreement or their project’s data handling requirements
- Training and assisting researchers on securely accessing and managing their project data stored on the SDE
It is important to identify the DSO before committing to using the SDE, because the DSO will need to be added to the project's IRB protocol (if applicable) and possibly the data provider's data usage agreement (DUA).
How do I find a qualified DSO?
Many IT staff members at Columbia are capable of serving as a qualified DSO. The role requires a moderate understanding of computer infrastructure, availability to assist the project team during business hours, and a commitment to learning, understanding, and upholding the security requirements of the given SDE project's specific dataset. There are two common DSO types:
- Staff member from the researchers' local IT group (Support Staff, System Administrators)
- CUIT Research Computing Services' DSO service: Researchers can contract with CUIT for DSO services by one of the RCS staff members (for an annual fee). CUIT's DSO has undergone thorough training from Columbia's IRB and CUIT on sensitive data handling procedures, and will be available during Columbia's standard business hours.
Regardless of whether your DSO is from your local IT group or from CUIT, the DSO's responsibilities and liabilities are limited to those of their role as defined in the MOU (if applicable) and the DSO User Agreement of the SDE Project. While the DSO facilitates secure data storage and movement, security and compliance are shared responsibilities. Thus, all users of the project are equally responsible for the security and compliance required of the project. All SDE users must understand and adhere to any relevant legal, regulatory, or contractual requirements on the SDE.
The SDE is priced at $1,000 per project, per year, which includes up to four user accounts, and one Data Security Officer account. Discounts are available for bulk project purchases.
If you are interested in contracting with CUIT RCS for a Data Security Officer, then an additional $275/year will be applied to the project fee (for standard projects with up to four researcher accounts). You can opt into CUIT's DSO service by selecting it when submitting your SDE project application. A sample of the DSO Service MOU is available upon request.
Contact [email protected] to discuss.
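The fees above can be summarized in a short Python sketch. It covers only the standard project size described here (up to four researcher accounts plus one DSO account); the function name `annual_cost_usd` is hypothetical, and bulk discounts and fees for additional users are not modeled because no figures are given for them.

```python
BASE_FEE_USD = 1_000          # per project, per year
CUIT_DSO_FEE_USD = 275        # optional CUIT RCS DSO service, per year
MAX_STANDARD_RESEARCHERS = 4  # researcher accounts included in the base fee


def annual_cost_usd(num_researchers: int, use_cuit_dso: bool = False) -> int:
    """Return the yearly fee for a standard SDE project.

    Assumption: projects outside the standard size are priced case by case,
    so this sketch refuses to guess and raises instead.
    """
    if not 1 <= num_researchers <= MAX_STANDARD_RESEARCHERS:
        raise ValueError("non-standard project size: contact RCS for pricing")
    return BASE_FEE_USD + (CUIT_DSO_FEE_USD if use_cuit_dso else 0)


print(annual_cost_usd(4))                     # 1000
print(annual_cost_usd(4, use_cuit_dso=True))  # 1275
```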
Project Onboarding Process
Send an email to [email protected] to get started. Please provide:
- Your UNI and department/school at Columbia
- The name of your data-provider and the type of sensitive data you expect to receive (e.g. PHI, RHI)
- The name(s) of your PI (if not yourself)
- Any questions you may have
A member of CUIT's Research Services department will get back to you to go over your information and review the SDE requirements and restrictions.
After confirming the SDE is a good fit for your project, you will need to gather updated paperwork:
- Proof of data provider approval for using the SDE (if data provider is a non-Columbia entity). Typically this is in the form of a data agreement (DUA or DAA), modified to stipulate that the SDE will be used and signed by both the data provider and SPA; if you are based in Teachers College or Barnard, the SPA signature can be replaced by a representative from your school. If no such formal approval exists, some sort of written approval by the data provider must be acquired.
Generally, you should discuss with the data provider what data security information they need to include in their DUA/DAA. Please reach out to Research Services at [email protected] for the SDE Data Security Plan and assistance with language to provide Data Providers and Columbia IRB.
If your data provider is within Columbia, please provide documentation of this.
- Proof of IRB approval* for using the SDE. If you have an existing IRB protocol, it must be modified to stipulate the SDE is being used. For the "System ID numbers" sub-question, you should reference the CUIMC Security RSAM registration ID number, 3868, which confirms the SDE was certified by CUIMC Security for HIPAA compliance. You should also add all users that will be accessing the SDE to the IRB protocol, including any Research Staff and the DSO.
After approval, you can provide proof via a PDF copy of the approval email or the downloaded protocol "data sheet".
*Alternatively, you can provide proof of IRB protocol exemption.
To formally apply for the SDE, complete and submit this Qualtrics SDE application form. The form requires:
- Baseline project information
- Names and contact information for all PI(s), Research Staff, and Data Security Officer (DSO)*
- Document upload: Proof of data provider approval with SPA signature for using the SDE (if data provider is a Columbia entity, attach proof of this instead)
- Document upload: Proof of IRB approval/exemption for using the SDE
*You may choose to use a DSO from CUIT Research Computing Services for an additional annual fee
After receiving a complete application, Research Services will generate a custom SDE User Agreement. All PI(s), Research Staff, and your DSO must sign this document (Adobe Signature is accepted).
This agreement, along with your ARC chartstring information for payment ($1,000/year/project), should be uploaded to this Qualtrics user agreement and payment form.
RCS will set up a time to provide your project's researchers with training on how to use the SDE. Training covers the basics of accessing the SDE and conducting analysis on it, as well as an overview of the data security measures in place and expectations for their enforcement.
After training, your DSO is permitted to upload your project's data in a manner compliant with the agreement of the data provider. At this point, PIs and Research Staff may begin their analysis on the SDE.
If you haven't requested to have your project retired and deleted within a year's time, CUIT will reach out to confirm if you'd like to extend your SDE contract for another year, at the annual fee.
FAQ
Yes, multiple users can log in at once. Each user receives their own SDE account and connects to their Windows or Linux desktop via Citrix.
Only the users on your project will have access to the same data. Each project's data storage is split among several drives, some of which are individual and some of which are shared among the project.
Yes, Access is available along with the entire MS Office Suite. You can see a list of typically available programs here and we can discuss installing special programs (if you have a license) if needed.
No. The SDE is completely isolated; there are no network capabilities on SDE machines.
No. Because there is no network connection (the virtual machines are air-gapped), it is not possible to connect to applications over the network.
Yes. Nearly all projects that use restricted datasets need an IRB protocol. If you believe your project is an exception, you can confirm that by asking to have your project approved by the IRB as exempt and providing that documentation.
At some point after leaving Columbia (graduation, retirement, moving jobs), a user's VPN access is rescinded and they will no longer be able to access the SDE. To maintain access, the user should speak to their local HR person about how they can maintain UNI access with VPN privileges after they leave Columbia.
Yes, as long as the departing user's data is properly removed, and the new user is properly onboarded.
No. Your DUA needs to contain the signatures of SPA (or Barnard/TC representative, if that is the case) and a representative from the data provider.
The SDE requires you to change your password every 90 days. Once your password has expired, you will be notified at your next login that it needs to be reset.
To reset your expired password:
- Log in to the SDE with your previous password.
- You will see the expired password message. Now, type in a new password that is 16+ alphanumeric characters (it will ask you to enter the new password twice). (Tip: Keep note of your previous password until you have successfully logged in with the new password!)
- Return to the login screen and use the new password to log in.
- If the new password does not work after the reset, try the previous password again. If the old password works, your new password was not accepted because it did not meet the complexity requirement: a minimum length of 16 alphanumeric characters.
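To avoid a rejected reset, a candidate password can be checked locally before it is entered. The Python sketch below is only an illustration of the rules as stated on this page: it assumes "alphanumeric" means ASCII letters and digits only, which may be stricter or looser than what the SDE actually enforces. The function names are hypothetical.

```python
from datetime import date, timedelta

MIN_LENGTH = 16                        # minimum length stated above
MAX_PASSWORD_AGE = timedelta(days=90)  # rotation period stated above


def meets_stated_rule(password: str) -> bool:
    """Check the rule as described on this page.

    Assumption: only ASCII letters and digits count as alphanumeric.
    The SDE's real policy may differ (e.g. it may permit symbols).
    """
    return (
        len(password) >= MIN_LENGTH
        and password.isascii()
        and password.isalnum()
    )


def expires_on(last_changed: date) -> date:
    """Return the date on which a password set on `last_changed` expires."""
    return last_changed + MAX_PASSWORD_AGE


print(meets_stated_rule("short1"))       # False: fewer than 16 characters
print(meets_stated_rule("a1" * 8))       # True: 16 letters and digits
print(expires_on(date(2024, 1, 1)))      # 2024-03-31
```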