Virtual Private Network for PB-level open data in CASEarth Program

 2020-12-15

Background

Launched on January 1, 2018, the Big Earth Data Science Engineering Program (CASEarth) is a strategic priority research program funded by the Chinese Academy of Sciences. Its overall objective is to establish an International Science Center for Big Earth Data. To date, the CASEarth multidisciplinary platform has released a total of 5 PB of research data: 1.8 PB of earth observation data, 2.6 PB of biological and ecological data, 0.4 PB of atmospheric and ocean data, and 0.2 PB of geographic and ground observation data. It also provides online access to 490,000 records from the stratigraphy and paleontology database, 3.6 million records from the China biological species list, 420,000 microbial data records, and 1 billion omics data records.

CASEarth homepage

CASEarth Data Sharing and Service Portal

Research demands

Scientists from more than 129 research institutions are involved in the CASEarth Program. Their research areas include, but are not limited to, high energy physics, computer science, space science, geography and geology, atmospheric science, marine science, biology, and ecology.

The CASEarth project faces various challenges, such as remote data capture and curation, online data computing, long-term data archiving, and disaster recovery. For data transfer, continuing to share each institute's general-purpose bandwidth would mean traffic congestion, packet loss, and transmission delays. In addition, the project cannot make full use of bandwidth that it has to share with other users: competing for network resources leads to poor network quality and low efficiency. Even with a full 1 Gbps of bandwidth, exchanging 1 PB of data would take at least 105 days under ideal conditions. An enhanced data network for massive data exchange has therefore become an urgent need in the CASEarth program.
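The transfer-time estimate above can be checked with a short calculation. The following is a minimal sketch, not the authors' method: the function name, the constants, and the `efficiency` parameter are illustrative assumptions. The source does not say which unit convention it uses. Reading 1 PB as 2^50 bytes over a 10^9 bit/s link gives about 104 days, close to the quoted figure, while reading 1 PB as 10^15 bytes gives about 93 days.

```python
SECONDS_PER_DAY = 86_400

# Unit conventions (assumptions; the article does not state which one it uses).
PB_DECIMAL = 10**15   # 1 PB as 10^15 bytes
PB_BINARY = 2**50     # 1 PB as 2^50 bytes (strictly, 1 PiB)
GBPS = 10**9          # 1 Gbps as 10^9 bits per second


def transfer_days(data_bytes: float, link_bps: float, efficiency: float = 1.0) -> float:
    """Return the days needed to move data_bytes over a link of link_bps bits/s.

    efficiency is the hypothetical fraction of the link rate actually achieved
    (1.0 means an ideal, fully dedicated link with no protocol overhead).
    """
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bps * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"1 PB (10^15 B) at 1 Gbps: {transfer_days(PB_DECIMAL, GBPS):.1f} days")  # 92.6
    print(f"1 PB (2^50 B) at 1 Gbps:  {transfer_days(PB_BINARY, GBPS):.1f} days")   # 104.2
```

Either way, a single shared 1 Gbps link puts petabyte-scale exchange on the order of months, which is the point the paragraph makes.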

CSTCloud solutions

Through detailed discussions with selected institutions within the CASEarth program, such as the Aerospace Information Research Institute, CAS, the Institute of Atmospheric Physics, CAS, and the Institute of Microbiology, CAS, the CSTCloud team gained an in-depth understanding of their data transmission needs, identified the bottlenecks encountered at each institute, and jointly reached a practical construction plan. To smooth the construction of the private network and better coordinate the multiple node units along the route, optical cables were deployed. In response to the demand for massive data transfer, CSTCloud launched the "Technology Cloud Communication" service, with which a private network for secure, high-speed data transmission can be built quickly for a variety of scientific research applications. 24/7 customer support is also available for the VPN services to ensure stability and sustainability.

The "CASEarth Data Transmission Private Network" provides 10-gigabit data transmission links connecting the Aerospace Information Research Institute, CAS, the Institute of Atmospheric Physics, CAS, the Institute of Microbiology, CAS, the Computer Network Information Center, CAS, the Institute of Tibetan Plateau Research, CAS, the Beijing Institute of Genomics, CAS, and the National Science Library, CAS. To measure data exchange performance, the CSTCloud "Unified Operation Management Platform" is used for monitoring, operating, and maintaining the private network. The compound VPN solution has greatly changed the way data used to be exchanged. According to the monitoring metrics, the maximum data transmission rate in actual use cases can exceed 6 Gbps.

Through this cooperation, CSTCloud has customized and launched special services for different application scenarios. It will continue to strengthen its data exchange capability through technical research on, and application of, SD-WAN, virtual private networks, and big data transmission tools, so that tailored services remain available to the broad scientific community.

[1] CASEarth. About us. Available at http://english.casearth.com/index.php?option=com_content&view=article&id=66&Itemid=161