The Data Storage Finder is designed to help you find storage solutions to meet your needs. It is not intended to be a complete or comprehensive catalog of storage services available at Penn State. If you need to transfer data, especially for research purposes, please visit this page to find out more about Globus at Penn State.

 

Descriptions of Terms

Data Storage Comparison Table


Service Amazon Web Services (AWS) - Simple Storage Service (S3) Data Commons Departmental Penn State Access Account Storage Space (PASS) GitLab Google Cloud Platform (GCP) - Object or Blob Storage Google Drive Kaltura Local Server - Files Microsoft Azure - Azure Blob Storage Microsoft Azure - Azure Files OneDrive for Business Penn State File Storage (PSFS) PreVeil REDCap Roar Active Storage Roar Nearline Storage ScholarSphere Secure Enclave (Roar) Secure Enclave SharePoint SharePoint Stream UDrive File Storage Service VM Hosting VM Hosting Secure
Contact Information
Description

Flexible and scalable object storage for use with AWS Cloud services and on-premises resources.

A public access institutional repository for research data created by Penn State faculty, graduate students, and their collaborators.

On-premises disk space/file storage space for use by departments at Penn State.

An open-source software platform where members of the Penn State Community can share and collaborate on code.

Flexible and scalable object storage for use with Google Cloud Platform (GCP) and on-premises resources.

Although still active, this service is no longer recommended due to vendor-enforced quotas. Select this service and see Limits in the comparison table below.

Enterprise-level, cloud-based media management platform for storing, publishing, and streaming videos, video collections, and other media.

On-premises file-based storage provided by local Information Technology teams.

Flexible and scalable object storage for Microsoft Azure and on-premises resources.

Scalable cloud-based file storage service for Microsoft Azure and on-premises resources.

Cloud-based storage designed for business—access, share, and collaborate on all your personal work files from anywhere.

Provides Penn State faculty and staff with on-premises network attached storage (NAS), allowing them to store, access, and share files within the Penn State network.

Cloud-based storage designed for research project data at the high or restricted information classification levels.

A secure, web-based application for databases and online surveys for research purposes.

On-premises file-based active storage for research provided by the Institute for Computational and Data Sciences.

On-premises file-based archival storage for research provided by the Institute for Computational and Data Sciences.

An institutional repository for storing all types of scholarly materials, including publications, instructional materials, creative works, and research data.

Secured and vetted cluster for research data at Penn State.

Secured and vetted cloud-based storage using a special version of Microsoft's SharePoint technology.

Web-based platform for sharing files and content. Provided through Penn State's Office 365 agreement with Microsoft.

Cloud-based enterprise video-sharing service. Provided through Penn State's Office 365 agreement with Microsoft.

On-premises file service provided via Windows servers for Penn State academic/lab users.

Cost-effective, reliable Virtual Machines (VM) for departments, colleges, and research units at Penn State.

Secured and vetted virtual infrastructure located on-premises at Penn State.

Cost

$0.023/GB/Month* Varies based on storage tier, duplication, and region.

There is no fee for storing public access research data on the Data Commons.

$0.07/GB/month. Billing is currently suspended; please budget with the understanding that billing will resume.

None for git.psu.edu, but you may incur charges if you set up a VM for runners.

$0.026/GB/Month* Varies based on storage tier, duplication, and region.

No cost. Covered under Penn State's subscription.

No cost. Covered under Penn State's subscription.

Based on any agreement with service provider (e.g., local IT department).

Varies based on storage tier, replication, region, and whether reserved capacity is used. Example: locally redundant tiered Blob storage in US East regions: Hot LRS - $0.0172/GB/Month, Cool LRS - $0.0036/GB/Month, Archive LRS - $0.00098/GB/Month.

$0.06/used GiB/Month in Standard tier. $0.20/used GiB/Month in Premium tier.

No cost. Covered under Penn State's Subscription.

Primary Storage: $7.50/TB allocated/month. Replication Add: $5.50/TB allocated/month. Backup Add: $11/TB consumed/month (UBI rate). Billing is done quarterly.
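As a rough illustration of how the PSFS rates above combine, here is a minimal sketch; the function name and parameters are ours for illustration, not part of the service:

```python
# Hypothetical helper illustrating the PSFS rates listed above.
# Rates: primary $7.50/TB allocated, replication $5.50/TB allocated,
# backup $11/TB consumed (UBI rate); billing is done quarterly.

def psfs_monthly_cost(allocated_tb, replicated=False, backup_consumed_tb=0.0):
    """Estimate the monthly PSFS charge in dollars."""
    cost = allocated_tb * 7.50                # primary storage (allocated)
    if replicated:
        cost += allocated_tb * 5.50           # replication add-on (allocated)
    cost += backup_consumed_tb * 11.0         # backup add-on (consumed)
    return cost

# Example: 10 TB allocated, replicated, 4 TB consumed by backups
print(psfs_monthly_cost(10, replicated=True, backup_consumed_tb=4))  # 174.0
```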

Currently covered centrally, though with a limited number of available licenses.

No cost.

Active Storage (data that will be worked with and retrieved often) - $6.67/TB/Month.

Nearline (data that will be archived and/or accessed less frequently)/Archive - $1.25/TB/month.

The University provides this service at no cost to current Penn State faculty, students, and staff and to emeriti faculty.

Active Storage (data that will be worked with and retrieved often) - $6.67/TB/Month.

No cost. Covered under Penn State's subscription.

No cost. Covered under Penn State's subscription.

No cost. Covered under Penn State's subscription.

Commodity service paid for by Penn State IT.

$0.04/GB/Month + Cost of host VM.

$0.04/GB/Month + Cost of host VM.

Types Active, Nearline, Archival, Object Repository, File Active, File Repository, File Active, Nearline, Archival, Object Active, File Active, File Active, File Active, Nearline, Archival, Object Active, File Active, File Active, Nearline, Archival, File Active, File Active, Database Active, File Nearline, File Repository, File Active, File Active, File Active, File Active, File Active, File Active, File Active, Block
Classification Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required) Low (Level 1) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required), Restricted (Level 4- ATO Required) Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required) Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2) High (Level 3 - ATO Required), Restricted (Level 4- ATO Required) Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required) Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required), Restricted (Level 4- ATO Required) Low (Level 1), Moderate (Level 2), High (Level 3 - ATO Required), Restricted (Level 4- ATO Required) Low (Level 1) High (Level 3 - ATO Required), Restricted (Level 4- ATO Required) High (Level 3 - ATO Required) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2) Low (Level 1), Moderate (Level 2) High (Level 3 - ATO Required), Restricted (Level 4- ATO Required)
Eligibility Faculty, Staff Faculty, Staff, Graduate Students Faculty, Staff Faculty, Staff, Students, Sponsored Accounts, Service Accounts Faculty, Staff Faculty, Staff, Students Faculty, Staff, Students, Affiliates Faculty, Staff Faculty, Staff Faculty, Staff, Students Faculty, Staff Faculty, Staff Faculty, Staff Faculty, Departments, Colleges, Institutes, Centers Faculty, Departments, Colleges, Institutes, Centers Faculty, Staff, Students, Emeriti Faculty Faculty, Staff Faculty, Staff Faculty, Staff, Students Faculty, Staff, Students Faculty, Students Faculty, Staff Faculty, Staff
Common Uses Backups, user content, data warehousing, bulk data storage, map/reduce workload storage, web content storage, archival data. Research data, models, instruments, supplemental information, and related materials created by Penn State faculty, staff, graduate students, and their research collaborators. General file storage Source Code Control, Code Collaboration, Documentation, Automation Backups, user content, data warehousing, bulk data storage, map/reduce workload storage, web content storage, archival data. Personal Files, Sharing, Collaboration Media (video and audio files) Generalized file storage Backups, user content, data warehousing, bulk data storage, map/reduce workload storage, web content storage, archival data. Replace or supplement on-premises file servers, lift and shift file storage to the cloud, simplify cloud development. Personal Files Generalized file storage Research project data at the high or restricted information classification levels. Surveys and databases primarily related to research studies and operations Research Data Research Data Content primarily of scholarly import, including curricular materials and creative works produced in support of Penn State's teaching, learning, and research mission. Projects requiring strict data controls Sensitive Data Project Files, Unit Files, Group Files Media (video and audio files) Academic data Personal Files, Unit Files, Project Files Projects requiring strict data controls
Technical Complexity Medium Low Medium Medium Medium Low Low Medium Medium Medium Low Medium Medium Medium Medium Medium Medium High High Low Low Medium High High
Limits

Individual objects are limited to 5 TB in size.

Data Commons is a mediated service. There are no limits for data storage. Data Commons staff work with data providers to ensure the most effective manner for handling data. This is especially true for large, complex data sets.

Minimum of 15 GB, up to 500 GB; beyond 500 GB open a SNOW request to validate resources are available.

- max projects/user: 100
- max attachment size: 100 MB
- max import size: 50 MB
- max artifacts size: 2048 MB
- default artifact expiration: 30 days

Individual objects are limited to 5 TB in size; each bucket has unlimited growth.

Google Workspace for Education customers have a specific pooled (institution-wide) storage limit that has been significantly reduced by Google and will be reduced again in the future. Due to these changes, Penn State no longer recommends Google Drive for storage. Contact googleworkspace@psu.edu for more information.  For informational purposes, Google Drive limits are posted on this page.

No posted file size limits.

Configuration dependent

Individual objects are limited to 4.7 TB in size; each storage account is limited to 5 PiB.

Individual files in a file share are limited to 1 TiB; the maximum size of a file share is 100 TiB.

250 GB upload file size limit. More info available on this page.

Min: 50 GB; Max: contact support. Currently, allocations must be less than 100 TB.

No posted file size limits.

Users get:
Home: 10 GB
Work: 128 GB
Scratch: 1 Million files (anything older than 30 days is deleted)

Storage is allocated in 5 TB chunks, with a minimum allocation of 5 TB.

Single file uploads via browser cannot exceed 500 MB; Folder or group file uploads via browser cannot exceed 1 GB; Dropbox uploads cannot exceed 1.9 GB; Box uploads cannot exceed 5 GB

Storage is allocated in 5 TB chunks, with a minimum allocation of 5 TB.

250 GB upload file size limit. More info available on this page.

250 GB upload file size limit. More info available on this page.

50 GB upload file size limit.

1 GB max allocation

Min: 5 GB
Max: Contact Support

Min: 5 GB Max: Contact Support

Sharing Users with a Penn State or College of Medicine User ID and External Users. Open/Public Users with a Penn State or College of Medicine User ID. Users with a Penn State or College of Medicine User ID. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID. Configuration dependent. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID. Users with a Penn State or College of Medicine User ID. Open/Public Users with a Penn State or College of Medicine User ID. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID and External Users. Users with a Penn State or College of Medicine User ID. Configuration dependent. Configuration dependent.
Data Protection Backups/snapshots and replication are available. Snapshots on the appliance in UP datacenter for short-term recovery. Backups via UBI (redundant copies including offsite to Hershey datacenter). Daily Veeam backups. Backend database is load balanced Backups via multiple file versions. Service provider replicates data. Data is replicated and backed up. Configuration dependent. Backups/snapshots and replication are available. Backups and replication are available. Backups via multiple file versions. Service provider replicates data. Snapshots of configurable duration; Backups for additional fee to UBI; Replication for additional fee to UTC. Service provider replicates data. Active storage backed up to TSM for 90 days. Archive is not backed up. Older Archive data is stored on tape. Active storage backed up to TSM for 90 days. Archive is not backed up. Older Archive data is stored on tape. Backups via multiple file versions. Service provider replicates data. Backups via multiple file versions. Service provider replicates data. Every day at 4:00 a.m. a “snapshot” of the entire UDrive is taken on UBackup, which holds a real-time copy of UDrive data in addition to several weeks of daily snapshots. Single copy, located in UTC datacenter (30 days) Single copy, located in UTC datacenter (30 days)
Access HTTPS, API HTTP, HTTPS, FTP, SFTP, APIs, REST Services (where applicable). SFTP, SSH (via lxcluster), NFSv3, SMBv2, HTTPS (via PASS Explorer) HTTPS, SSH, API HTTPS, API Google Drive Client, HTTPS, API HTTPS, API FTP, FTPS, SFTP, NFS, SMB HTTPS, API SMB, NFS (in private preview), API OneDrive for Business Client, HTTPS, API SMB, NFS Sync, HTTPS, External HTTPS, API SSH, Globus, NFS, SMB (Active only), Aspera SSH, Globus, NFS, SMB (Active only), Aspera HTTP, HTTPS SSH, Globus, NFS, SMB (Active only), Aspera HTTPS HTTPS, API HTTPS SMB, HTTPS (via PASS Explorer), WebFiles Configuration Dependent Configuration Dependent
Transfer rate Max of 25 Gbps to EC2. Speed varies to points outside of AWS based on the storage region and storage tier. Connection Dependent Connection Dependent; no more than 10 Gbps and probably significantly less. No SLA guarantees on disk throughput. Connection Dependent Bandwidth varies based on tier, region, and key distribution and caching configuration. Connection Dependent Connection Dependent Connection Dependent Up to 500 requests per second for a single blob in standard tier. Speed varies based on storage tier (standard vs premium). Up to 300 MiB/s and 10,000 IOPS per file share in standard tier. Connection Dependent Connection dependent; Archive tier performance < 100 Mbps read/write speed with ~50 ms latency. Connection Dependent Connection Dependent Connection Dependent Connection Dependent Connection Dependent Connection Dependent Connection Dependent Connection Dependent Connection Dependent Connection Dependent Connection Dependent; no more than 10 Gbps and probably significantly less. No SLA guarantees on disk throughput. Connection Dependent; no more than 10 Gbps and probably significantly less. No SLA guarantees on disk throughput.
Notes

Penn State is not charged ingress fees under our contract. Data egress fees are waived up to 15% for each AWS account's total monthly spend. Data can be automatically moved to colder storage tiers based on policy to reduce monthly costs at a greater expense of retrieval. Mounts can be performed with the FUSE filesystem or by utilizing the Storage Gateway service for CIFS, iSCSI, or VTL for backup software. Contact Penn State Cloud Services for more information.
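The policy-based tiering mentioned above is expressed as an S3 lifecycle configuration. The sketch below shows what such a rule document might look like; the rule ID, prefix, and day thresholds are illustrative examples, not Penn State defaults:

```python
import json

# Illustrative S3 lifecycle rule: transition objects to colder storage
# tiers over time to reduce monthly cost, at a greater expense of retrieval.
# The prefix and day thresholds here are examples only.
lifecycle = {
    "Rules": [
        {
            "ID": "tier-down-archive-data",
            "Filter": {"Prefix": "archive/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # infrequent access
                {"Days": 90, "StorageClass": "GLACIER"},      # cold storage
            ],
        }
    ]
}

# With boto3, this document would be applied to a bucket roughly as:
#   s3 = boto3.client("s3")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=lifecycle)
print(json.dumps(lifecycle, indent=2))
```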

Data Commons provides public access to data. Any data on the Data Commons must be able to be open and publicly accessed with no restrictions.

Please note that we are actively considering moving this service to hosted GitHub. While CI/CD functionality is included in Gitlab, we do not currently provide runners or a platform for runners. Help at support@git.psu.edu.

Penn State is not charged ingress fees under our contract. Data egress fees are waived up to 15% of PSU's total monthly spend. Data can be automatically moved to colder storage tiers based on policy to reduce monthly costs at a greater expense of retrieval. Mounts can be performed with the FUSE filesystem. 10 Gbps public network peering is in place with Google. Contact Penn State Cloud Services for more information.

API permissions may not be available to all users.

Find out more about how to use Kaltura by visiting its support page.

Penn State is not charged ingress fees under our contract. Data egress fees are waived up to 15% of PSU's total monthly spend. Data can be automatically moved to colder storage tiers based on policy to reduce monthly costs at a greater expense of retrieval. Mounts can be performed with the FUSE filesystem. Private network peering is possible using the Azure Private Link service. Contact Penn State Cloud Services for more information.

Penn State is not charged ingress fees under our contract. Data egress fees are waived up to 15% of PSU's total monthly spend. Private network peering is possible using the Azure Private Link service. Contact Penn State Cloud Services for more information.

Penn State IT — Information Security is piloting OneDrive policies that will support information classification levels 3 and 4. Supports differential file sync.

Contact ORIS or Penn State IT Information Security about potentially using PreVeil to store research project data at the High (Level 3 - ATO Required) or Restricted (Level 4 - ATO Required) information classification levels.

Contact Penn State IT - Information Security about storing High (Level 3) or Restricted (Level 4) data in REDCap.

Users get: Home: 10 GB; Work: 128 GB; Scratch: 1 million files (anything older than 30 days is deleted).

Deposit of content to ScholarSphere grants Penn State University non-exclusive rights for preserving that content and, if you choose, disseminating that content (see our "Preservation Policy" below).

Contact Penn State IT — Information Security to create an Enclave Secure SharePoint folder that supports High (Level 3) or Restricted (Level 4) data.

Supports differential file sync.

See the Knowledge Base article, Managing UDrive Storage Utilization.

Requires a virtual machine (VM) to host the storage; VMs start at $2.00/month. Reachable via the Research Network.

Requires a virtual machine (VM) to host the storage. VM requires an ATO to be processed with Penn State IT - Information Security.

About the Data Storage Finder

The application was adapted from the University of Michigan's Data Storage Finder. The original creators are the Cornell University Research Data Management Service Group and the Cornell Information Technologies Custom Development Group (2018).