[case study] Data Crunching Servers – Duke Physics Department

Scientific research can produce huge volumes of data for analysis, a fact the Physics Department at Duke University knows well. Each of the department’s 12-plus research groups exploring nuclear physics, particle physics, and other exotic fields generates a terabyte of data every month.

To meet the storage challenge this data presented, the research groups used direct-attached storage. However, this approach limited storage to the size of each computer’s hard drive and impeded data sharing among researchers. It also barely kept pace with the intense data bursts generated by the groups’ experiments.

Read more below about how Duke’s Physics Department deployed high-speed JetStor RAID Arrays to absorb “bursty” data generation and enable the storing and sharing of large data sets.

JetStor® RAID Arrays Support Cutting-Edge Physics Research at Duke University

THE ORGANIZATION

At the Duke University Physics Department in Durham, North Carolina, some 600 faculty, graduate students, postdocs, researchers, and visiting scholars work in such fields as nuclear physics, condensed matter physics, high energy physics, photon physics, and quantum optics. The department has 1,500 computers and workstations in facilities like the new 280,000 square-foot French Family Science Center, which has laboratories for genomics, biological chemistry, materials science, nanoscience, physical biology, and bioinformatics.

THE CHALLENGE

Research at Duke’s physics department generates vast amounts of data that must be stored for analysis and peer review. Not all the data are quantitative, however. Research groups, such as the biophysics workgroup investigating the dynamics of cardiac muscles, often rely on high-speed cameras that produce huge quantities of images very rapidly. Just one research group can create a terabyte of data each month, and the Duke Physics Department has over 12 such groups. Additionally, unlike commercial enterprises, which often produce fairly consistent levels of data over the course of weeks, the physics department generates data that is “bursty.” Sophisticated, often custom-built applications may lie dormant for periods and then rapidly churn out data for the duration of experiments.
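As a rough back-of-the-envelope illustration of these figures, the sketch below multiplies the quoted terabyte per group per month by 12 groups and converts the result to an average rate. The names are hypothetical, and the sketch assumes decimal terabytes, a 30-day month, and exactly 12 groups, none of which the department specifies. The low average it produces suggests why burst throughput, rather than average volume, is the harder requirement.

```python
# Rough sizing sketch; constant and function names are illustrative only.
GROUPS = 12                  # the case study says "over 12"; 12 is a lower bound
TB_PER_GROUP_PER_MONTH = 1   # figure quoted in the case study
BYTES_PER_TB = 10**12        # assumption: decimal terabytes
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month


def monthly_volume_tb(groups: int = GROUPS,
                      tb_per_group: float = TB_PER_GROUP_PER_MONTH) -> float:
    """Total data generated per month across all groups, in TB."""
    return groups * tb_per_group


def average_rate_mb_per_s(volume_tb: float) -> float:
    """Average ingest rate in MB/s if the monthly volume arrived evenly."""
    return volume_tb * BYTES_PER_TB / SECONDS_PER_MONTH / 10**6


if __name__ == "__main__":
    monthly = monthly_volume_tb()
    print(f"Monthly volume: {monthly} TB")
    print(f"Yearly volume:  {monthly * 12} TB")
    print(f"Average rate:   {average_rate_mb_per_s(monthly):.2f} MB/s")
```

Spread evenly, 12 TB a month is only a few megabytes per second; an experiment that produces its share in hours instead of weeks needs far more than that while it runs.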

The Duke Physics Department initially met its substantial storage needs by using direct-attached storage (DAS), in which large hard drives were directly linked to computers without a network in between. Stored data, however, were accessible only from the attached computer and the amounts of data preserved were limited by the capacities of each hard drive. Although investigative science is highly collaborative, researchers were unable to easily share data, which impeded analyses and ensuing discoveries.

“We required a more advanced storage strategy, one worthy of the science we conduct,” said Jimmy Dorff, senior IT manager for the Duke University Physics Department. “We needed solutions that are scalable and fast enough to ingest large troves of data very quickly.”

THE SOLUTION

Ten JetStor SAS 516iS 16-bay iSCSI RAID Arrays and JetStor SATA 416iS 16-bay iSCSI RAID Arrays with 2 TB disks from Advanced Computer & Network Corporation (AC&NC).

SYSTEM CONFIGURATION

  • JetStor SAS 516iS iSCSI RAID Arrays with gigabit iSCSI links to a Dell PowerConnect 5424 switch
  • JetStor SATA 416iS iSCSI RAID Arrays with gigabit iSCSI links to a Dell PowerConnect 5424 switch
  • Dell iSCSI PowerConnect 5424 Optimized Switch
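To put the gigabit links in this configuration in perspective, the following sketch estimates how long a single link would take to move a given amount of data. The function name and the `efficiency` parameter are hypothetical. An efficiency of 1.0 represents the theoretical line rate; real iSCSI throughput is lower because of protocol overhead, and the case study does not report measured figures.

```python
def transfer_hours(data_tb: float, link_gbps: float = 1.0,
                   efficiency: float = 1.0) -> float:
    """Hours needed to move data_tb decimal terabytes over one link.

    efficiency is the assumed fraction of the line rate actually
    available for payload (1.0 = theoretical maximum).
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    bits = data_tb * 10**12 * 8                      # payload size in bits
    seconds = bits / (link_gbps * 10**9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # One terabyte over a single gigabit link at the theoretical line rate.
    print(f"{transfer_hours(1.0):.2f} hours")
```

At the theoretical rate a gigabit link carries 125 MB per second, so a full terabyte takes a little over two hours on one link. Spreading traffic across several arrays and links is one way a SAN of this kind can absorb larger bursts.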

BENEFITS IMMEDIATELY REALIZED

By clustering ten JetStor RAID Arrays into an iSCSI storage area network (SAN), the Duke Physics Department gained a storage infrastructure that fully supports its ongoing scientific research. The theoretical nuclear physics group, for example, stores its experimental data on five JetStor platforms, and another group using high-speed cameras to investigate particle flows saves some 20,000 high-resolution images on JetStor solutions. Physicists working with national laboratories like Fermilab and Brookhaven deploy the devices to house large data sets locally to expedite analyses of research results.

“We built a robust SAN without the costs and complexities of Fibre Channel by using the iSCSI connectivity of the JetStors,” said Dorff. “Because data is striped across multiple disks in each array, the JetStors offer the throughput demanded by even the very bursty data production of our lab work.”

The physics department also relies on the JetStor arrays to back up Linux servers and a Mac OS server that uses Apple’s Time Machine application to mirror data on Mac laptops. Administrators even use the JetStor systems to support the department’s web site. “We can allocate storage to workgroups as needed and add capacity without disrupting the production environment,” added Dorff. “Our JetStors also ensure no data is lost, which is vital because repeating experiments to reacquire lost data is expensive and time consuming. Our physicists can now perform the most rigorous lab work with confidence that storage will never be a bottleneck or impediment.”

HOW WE DID IT

To build its storage environment, the Duke Physics Department attached its JetStor RAID Arrays to a Dell iSCSI PowerConnect 5424 Optimized Switch using iSCSI Gigabit Ethernet links. The switch connects with the same bandwidth to a variety of servers, mostly Dell and Sun with one Mac system, and to the department’s 10 Gigabit Ethernet production network. The JetStor platforms are configured for RAID 6, which stripes data at the block level with double parity and so avoids data loss even if two disks fail within an array.

The department provisions them with 2 terabyte disks to attain over 100 terabytes of storage capacity. “Between the bandwidth on our network and the throughput of our JetStors, we can support extreme data generation,” said Dorff. “When a workgroup is created or a data-intensive experiment is conducted, we can ensure that researchers have access to fast, reliable storage.”
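The capacity arithmetic behind a RAID 6 deployment like this one can be sketched as follows. RAID 6 reserves the equivalent of two disks per array for parity, so an array of n disks offers n - 2 disks of usable space. The sketch assumes that every one of the 16 bays holds a 2 TB disk, that each array is a single RAID 6 group with no hot spares, and that terabytes are decimal. The case study confirms only the 16-bay chassis, the 2 TB disks, the ten arrays, and a total of over 100 terabytes, so the result is an upper bound under these assumptions and not the department's actual figure. The function name is hypothetical.

```python
def raid6_usable_tb(disks_per_array: int, disk_tb: float) -> float:
    """Usable capacity of one RAID 6 array, in TB.

    RAID 6 spends two disks' worth of space on parity and needs
    at least four disks.
    """
    if disks_per_array < 4:
        raise ValueError("RAID 6 requires at least four disks")
    return (disks_per_array - 2) * disk_tb


if __name__ == "__main__":
    ARRAYS = 10   # from the case study
    BAYS = 16     # from the case study; assumed fully populated
    DISK_TB = 2   # from the case study

    per_array = raid6_usable_tb(BAYS, DISK_TB)
    print(f"Usable per array: {per_array} TB")
    print(f"Usable in total:  {per_array * ARRAYS} TB")
    print(f"Raw in total:     {BAYS * DISK_TB * ARRAYS} TB")
```

Under these assumptions each array yields 28 TB of usable space from 32 TB raw. Arrays that are only partly populated, or that set disks aside as hot spares, would yield less.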

Administrators use JetStor RAID Manager, a web-based application, to manage the storage systems. They can easily start up or shut down any array with readily accessible controls on the devices, and can quickly identify any disk within an array that might be malfunctioning. “This enables us to quickly adapt storage to our very fluid research environment,” Dorff concluded.



Shared iSCSI Storage Data Recovery


Shared iSCSI storage typically refers to a shared network drive that a group of people can access. Many businesses now use shared iSCSI storage for data sharing and recovery. This type of storage platform lets everyone who has the necessary information and password work on the same file.

Many video companies use this kind of iSCSI storage because it is convenient and fast, and the platform simplifies iSCSI storage for video. A video editor anywhere in the world can access a file as long as he or she knows the file name. You can then check on the progress of a task without transferring the file to where your editors are. Communication is simpler as well: you don’t have to call or schedule a meeting with your video editors, you just check on the progress of the task on the platform.

Many people who work online also use virtualized shared storage. It saves time and effort in storing and accessing all the files needed. For those who outsource their tasks, the main advantage of shared iSCSI storage is monitoring. You can easily check which files are ready and which are not. Likewise, your outsourced staff can easily track the tasks that need to be done or changed.

Local businesses can run iSCSI shared storage over an Ethernet line or LAN. Backup is also less complex, since only the shared iSCSI storage is backed up and not the rest of your files. Likewise, you can opt to use virtualized storage for your shared files only, to safeguard your data. With an intelligent iSCSI backup storage system, all your shared data is safe and secure.

 
© 1994-2015, Advanced Computer & Network Corporation. All Rights Reserved.