Security and sensitive data usage
We provide access to bare-metal servers with root access. Resources are allocated per individual order; no sharing or virtualization is used. Servers with NVMe disks support the hardware-level Secure Erase operation to ensure the data is erased securely (it is applied by default after each order is released, but the user can also run it manually).
Our GPU Servers features
We use Dell EMC and HPE servers. Some important features that are available:
PLX switches between CPU and GPU. They provide 16 PCIe 3.0 lanes per GPU even if the number of GPUs per server is more than 4. They also make it possible to share GPU memory over the 16-lane PCIe bus within one server.
One host. All P100 and V100 GPUs are connected to a single CPU even if the server has 2 CPUs. This simplifies access to the GPUs from your software, with no need to optimize it for multiple-CPU-to-multiple-GPU connections.
NVLink. All P100 and V100 GPUs are interconnected using NVLink technology with up to 80 GB/s (P100) and 150 GB/s (V100) of bandwidth respectively.
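The aggregate figures above follow from per-link NVLink bandwidth. A minimal sketch of the arithmetic, assuming NVIDIA's public per-link figures of 20 GB/s × 4 links for the P100 (NVLink 1.0) and 25 GB/s × 6 links for the V100 (NVLink 2.0):

```python
# Aggregate NVLink bandwidth per GPU (one direction), derived from
# per-link figures for NVLink 1.0 (P100) and NVLink 2.0 (V100).
def nvlink_bandwidth(links: int, gb_per_link: float) -> float:
    """Total one-directional bandwidth in GB/s."""
    return links * gb_per_link

p100 = nvlink_bandwidth(links=4, gb_per_link=20.0)  # P100: 4 links x 20 GB/s
v100 = nvlink_bandwidth(links=6, gb_per_link=25.0)  # V100: 6 links x 25 GB/s
print(p100, v100)  # 80.0 150.0
```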
Our GPU Server Cluster features
Modern GPU clusters require specific technologies to meet the requirements of the latest frameworks.
RDMA over InfiniBand. InfiniBand is a transport technology used instead of Ethernet, specially optimized for Remote Direct Memory Access (RDMA), a protocol for accessing another node's GPU memory. Classic Ethernet networking has comparably high latency for memory-access requests because each request passes through four CPU queues (two on the local host and two on the remote one).

When nodes are linked over InfiniBand and the network adapter is installed in a PLX switch PCIe port, another node's GPU memory can be requested without CPU involvement, so the communication path looks like: GPU - PLX - IB adapter - IB switch - IB adapter - PLX switch - GPU. This provides the minimum possible latency and also makes it possible to broadcast new model parameters to all GPUs in the cluster using InfiniBand's hardware-level broadcasting (check out the visualization by Microsoft for the DeepSpeed technology link ).

Our SXM-based servers can be grouped into a cluster on request. The network cards have two 40 Gb/s ports supporting InfiniBand or Ethernet protocols.
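To put the 40 Gb/s ports in perspective, here is a back-of-the-envelope sketch of transfer time for a hypothetical payload size; it ignores protocol overhead, latency, and congestion:

```python
# Rough time to move a model-parameter payload over one 40 Gb/s
# InfiniBand port. Ignores protocol overhead, latency, and congestion.
def transfer_seconds(payload_gb: float, link_gbps: float) -> float:
    """payload_gb is in gigabytes; link_gbps is link speed in gigabits/s."""
    return payload_gb * 8 / link_gbps  # convert GB to Gb, divide by rate

# Example: a hypothetical 1 GB set of model parameters over a 40 Gb/s link.
print(transfer_seconds(1.0, 40.0))  # 0.2
```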
Our datacenter is equipped with GPUs with the following specs:
| GPU Type | CUDA Cores | Tensor Cores | Memory | Memory Bandwidth, GB/s | Half-precision performance, TFLOPS | Single-precision performance, TFLOPS | Double-precision performance, TFLOPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GTX 1080 Ti | 3584 | - | 11 GB GDDR5X | 484 | 0.16 | 10.6 | 0.33 |
| Tesla K80 (2 chips) | 4992 | - | 24 GB GDDR5 | 480 | N/A | 8.73 | 2.91 |
| Tesla P100 SXM2 | 3584 | - | 16 GB HBM2 | 732 | 21.2 | 10.6 | 5.3 |
| Tesla V100 SXM2 16 | 5120 | 640 | 16 GB HBM2 | 900 | 31.4 | 15.7 | 7.8 |
| Tesla V100 SXM2 32 | 5120 | 640 | 32 GB HBM2 | 900 | 31.4 | 15.7 | 7.8 |
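The single-precision numbers in the table follow from CUDA cores × 2 FLOPs per core per cycle (one fused multiply-add) × boost clock. A sketch of the arithmetic, assuming boost clocks of roughly 1.48 GHz (P100) and 1.53 GHz (V100) from NVIDIA's public specifications:

```python
# FP32 peak = CUDA cores * 2 FLOPs/cycle (fused multiply-add) * boost clock.
def fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    """Peak single-precision throughput in TFLOPS."""
    return cuda_cores * 2 * boost_ghz / 1000.0  # GFLOPS -> TFLOPS

print(round(fp32_tflops(3584, 1.48), 1))  # P100 -> 10.6
print(round(fp32_tflops(5120, 1.53), 1))  # V100 -> 15.7
```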