Datacenter Structure

Introduction

The datacenter provides the capability to host my own servers at an affordable cost in the long run. These servers include high-performance computing nodes, high-capacity storage servers, a cold data storage system (tape), general-purpose servers, and hypervisors.

This part is divided into three categories: Domains, Bare-Metal & Virtual Machines, and Interconnection.

Active Directory Domains

Currently, the network is primarily separated into two domains: ad1 and ad2.

structure of active directory ad1 and ad2

ad1

ad1 includes all servers that are dedicated to genetic research projects. It also safeguards the core services for the entire network, such as the CA policy server and the hypervisors. This is the real production environment and has its own set of proxies & firewalls. Depending on project requirements, servers and virtual machines can be included in or excluded from ad1 dynamically.

Currently, ad1 has the following servers:

Model | CPU | Amount | Purpose | Type
DELL R630 | 2 * Intel Xeon E5-2640 V4 20C40T @2.4G | 1 | File preprocessing & downloading | Bare-Metal
DELL R630 | 2 * Intel Xeon E5-2640 V4 20C40T @2.4G | 1 | Result compressing & tape writing | Bare-Metal
DELL R630 | 2 * Intel Xeon E5-2640 V4 20C40T @2.4G | 1 | Genetic analysis | Bare-Metal
DELL R740 | 2 * Intel Xeon Platinum 8269CY 52C104T @2.5G | 2 | Genetic analysis | Bare-Metal
DELL R740 | 2 * Intel Xeon Platinum 8269CY 52C104T @2.5G | 1 | Hypervisor | Bare-Metal
Supermicro X10QBL | 4 * Intel Xeon E7-8892V2 60C120T @2.8G | 1 | Hypervisor | Bare-Metal
HP DL380 G9 | 2 * Intel Xeon E5-2650 V4 24C48T @2.2G | 3 | Storage node | Bare-Metal
VM | 4 Core | 2 | Domain Controller | Virtual Machine
VM | 4 Core | 1 | Cloud Drive | Virtual Machine
VM | ... | ... | More VMs omitted for security reasons | Virtual Machine

For details on how these servers work together in the project, please check out the Genetic Analysis Project.

ad2

ad2 includes all servers that are for general, personal, or experimental use. It contains all "less-critical" services and is usually used as an "experimental field" for ad1. Before a new group policy is deployed to ad1, it is deployed to ad2 first. Once the new configuration is proven to be stable, it can then be deployed to ad1.

Currently, ad2 has the following servers (the table header was missing in the original layout and is restated below):

Model CPU Amount Purpose Type
HP DL380 G9 2 * Intel Xeon E5-2650 V4 24C48T @2.2G 1 Storage node Bare-Metal
VM 4 Core 2 Domain Controller Virtual Machine
VM 4 Core 1 ASL Cloud Drive Virtual Machine
VM 2 Core 1 Web Primary Proxy Virtual Machine
VM 2 Core 1 Web forum Virtual Machine
VM 2 Core 1 Web static resources Virtual Machine
VM 2 Core 1 Web analytics Virtual Machine
VM 2 Core 1 Web mail Virtual Machine
VM ... ... More VM omitted due to security reasons. Virtual Machine

Bare-Metal Servers and Virtual Machines

Introduction

A mix of bare-metal servers and virtual machines is used to fulfill our needs. For scientific computations that require the highest possible performance, bare-metal servers are assigned to maximize efficiency. For needs like internal proxies or temporary experimental environments, a bare-metal server is obviously overkill; for such purposes, we create a virtual machine for each application or requirement.

Bare-Metal Servers

Bare-metal servers are physical servers with a generic OS installed, each serving (mostly) a single purpose. This is most often seen in the genetic research project, where an intensive amount of data needs to be downloaded, analyzed, and compressed.

Whenever the purpose of a bare-metal server changes, we re-install the OS remotely via the management interface and reconfigure it for its new task.
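The exact re-imaging workflow isn't described here, so the following is only a minimal sketch, assuming the servers expose a standard IPMI-capable management interface (iDRAC/iLO/BMC) and that the reinstall itself is driven by a PXE-booted installer. The BMC address and credentials are placeholders.

```python
import subprocess

# Hypothetical BMC address and credentials; replace with the real
# management-interface details of the target server.
BMC_HOST = "10.0.0.21"
BMC_USER = "admin"
BMC_PASS = "changeme"

def ipmi(*args):
    """Run an ipmitool command against the server's BMC over the network."""
    cmd = ["ipmitool", "-I", "lanplus",
           "-H", BMC_HOST, "-U", BMC_USER, "-P", BMC_PASS] + list(args)
    subprocess.run(cmd, check=True)

# Boot from the network on the next start so the PXE installer can
# re-image the machine, then power-cycle it to kick off the reinstall.
ipmi("chassis", "bootdev", "pxe")
ipmi("chassis", "power", "cycle")
```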

rack of hardware servers

Virtual Machines

Virtual machines are much more flexible. Although all the hypervisors are governed under ad1, the virtual machines they host are not necessarily.

Currently, we create a virtual machine whenever a new application needs to be deployed. This includes a new website, a new experimental environment, or simply a new service such as a database or email.

When there are enough resources, a virtual machine is more flexible than a Docker container. Whenever an application is taken out of service, the virtual machine hosting it is deleted from the system.
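The hypervisor platform isn't named in this section, so the snippet below is only an illustrative sketch of that create/retire lifecycle using the libvirt Python bindings on a KVM host; the domain name "web-forum" and its XML definition file are hypothetical.

```python
import libvirt  # pip install libvirt-python; assumes a KVM/libvirt hypervisor

conn = libvirt.open("qemu:///system")

# Define and start a new VM from a (hypothetical) domain XML description
# whose <name> element is "web-forum".
with open("web-forum.xml") as f:
    dom = conn.defineXML(f.read())
dom.create()

# When the application is retired, remove its VM entirely.
dom = conn.lookupByName("web-forum")
if dom.isActive():
    dom.destroy()   # force power-off
dom.undefine()      # delete the VM definition from the hypervisor
conn.close()
```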

screenshot of asl web virtual machines

Virtual Machine Overhead

Unlike a Docker container, each VM has its own complete OS. This creates overhead when the same OS is installed on multiple virtual machines. Fortunately, a common Linux distribution like Debian or Ubuntu can be installed on a 10GB disk without problems, and in most cases 10GB is enough for the base system plus newly installed software & dependencies. For application data that is much larger, a remote file system is mounted; that is, all the VMs share the space of the storage nodes. This greatly minimizes wasted disk space. Since each hypervisor is equipped with TBs of SSD storage, each one can host up to hundreds of virtual machines.
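As a rough sanity check on the "hundreds of virtual machines per hypervisor" figure, the arithmetic below assumes a hypothetical 4 TB of local SSD per hypervisor together with the 10GB base disk mentioned above; application data sits on the storage nodes and does not consume local SSD.

```python
# Rough capacity estimate: base OS disks only, since application data is
# mounted from the storage nodes over the network.
ssd_capacity_gb = 4 * 1024   # assumed 4 TB of local SSD per hypervisor
base_disk_gb = 10            # Debian/Ubuntu base install per VM

max_vms = ssd_capacity_gb // base_disk_gb
print(f"~{max_vms} VMs per hypervisor before RAM and CPU become the limit")
# ~409 VMs per hypervisor before RAM and CPU become the limit
```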

mounted network drives on debian 10 system.

Interconnection

Network

All the servers are connected over full-fiber 40G Ethernet. For the storage nodes, RDMA (RoCE) is configured to avoid the performance penalty of CPU-bound network processing. All the physical servers are plugged directly into the core 40G switch with MPO fiber.

Management interfaces and all other downstream equipment are plugged into an H3C switch with a total of 40G uplink (bonded from 4 x 10G SFP ports). This ensures that the network can always deliver its maximum performance to any equipment at any time.
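One quick way to verify that a storage node is actually set up for RoCE is to read the kernel's RDMA device information from sysfs. The sketch below assumes a Linux storage node and simply reports each RDMA port's link layer (which should be Ethernet for RoCE rather than InfiniBand) and its state.

```python
from pathlib import Path

# RDMA devices registered with the kernel appear under /sys/class/infiniband,
# including RoCE NICs (RDMA over Converged Ethernet).
for dev in sorted(Path("/sys/class/infiniband").glob("*")):
    for port in sorted((dev / "ports").glob("*")):
        link_layer = (port / "link_layer").read_text().strip()
        state = (port / "state").read_text().strip()
        print(f"{dev.name} port {port.name}: link_layer={link_layer}, {state}")
        # Expect link_layer=Ethernet for RoCE and a state like "4: ACTIVE"
```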

rack of network equipment

VLAN Isolation

Different VLANs are created and assigned per purpose and per machine. This improves the security of most servers. However, the VLAN configuration is not complete yet, and more information will be available once that work is finished.
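Since the switch-side VLAN layout is still being finalized, only the host side is sketched here: creating a tagged VLAN sub-interface on a Linux server with iproute2. The parent interface, VLAN ID, and address are hypothetical placeholders.

```python
import subprocess

# Hypothetical values: physical NIC, VLAN ID, and the address used on it.
PARENT_IF = "eth0"
VLAN_ID = 10
ADDRESS = "192.168.10.21/24"

def ip(*args):
    subprocess.run(["ip", *args], check=True)

# Create the tagged sub-interface (eth0.10), bring it up, and address it.
ip("link", "add", "link", PARENT_IF, "name", f"{PARENT_IF}.{VLAN_ID}",
   "type", "vlan", "id", str(VLAN_ID))
ip("link", "set", f"{PARENT_IF}.{VLAN_ID}", "up")
ip("addr", "add", ADDRESS, "dev", f"{PARENT_IF}.{VLAN_ID}")
```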

Gateway

There are a total of 3 firewall zones, each corresponding to a different route to the Internet. These match the 3 primary access lines of the datacenter. Each route has its own access policy, and the routes serve as backups for each other. Whenever a service is set up, it is assigned an access policy that lets it use all or any of the routes.

Structure of gateway access policy

Typically, an application with intensive traffic (like the web drive) uses a different route than applications that are less intensive but carry more control data (like a service monitor or console). This ensures the stability of the most critical services by avoiding packet loss at the gateway.
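The gateway platform isn't specified here; as one way such a per-service split could be expressed on a Linux gateway, the sketch below uses iproute2 policy routing to send traffic from a hypothetical bulk-traffic subnet and a hypothetical control-traffic subnet out different uplinks. All subnets, gateway addresses, and table numbers are placeholders.

```python
import subprocess

def ip(*args):
    subprocess.run(["ip", *args], check=True)

# Hypothetical addressing: heavy-traffic services live in 192.168.20.0/24 and
# leave via the bulk uplink; monitoring/console services in 192.168.30.0/24
# use the control uplink.
ROUTES = [
    ("192.168.20.0/24", "10.1.1.1", "101"),  # bulk traffic uplink
    ("192.168.30.0/24", "10.2.2.1", "102"),  # control traffic uplink
]

for src_subnet, gateway, table in ROUTES:
    # Per-uplink routing table with its own default route...
    ip("route", "add", "default", "via", gateway, "table", table)
    # ...and a rule that steers traffic from the service subnet into it.
    ip("rule", "add", "from", src_subnet, "table", table)
```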

Future Work

Currently, there are two focus areas for the datacenter.

These plans, however, require further study of their potential impact on the structure and performance, and have thus been assigned to a separate project.