Unrelated components are disabled due to vulnerability concerns.
The major problem hindering availability when deploying GitLab on multiple nodes is that GitLab needs redundant block storage.
There are two components that require shared block storage. The first is Sidekiq, a distributed job executor: when a user wants to retrieve a temporary file generated by a background job, Nginx cannot guarantee that the request is routed to the same node that produced the file.
The other is Gitaly, the filesystem layer for Git, which has officially ended support for NFS, Lustre, GlusterFS, multi-attach EBS, and EFS. However, virtual block storage such as Cinder, EBS, or VMDK is still supported.
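As a rough illustration, the following `gitlab.rb` sketch keeps Gitaly's repository storage on a mounted block-storage volume instead of NFS. The mount point `/mnt/gitlab-data` is a placeholder, and the exact keys depend on your Omnibus GitLab version:

```ruby
# /etc/gitlab/gitlab.rb (Omnibus GitLab) -- sketch only.
# Assumes an EBS/Cinder/VMDK volume is formatted and mounted at /mnt/gitlab-data.

# Keep Gitaly repositories on the block-storage mount rather than on
# NFS/GlusterFS, which Gitaly no longer supports.
git_data_dirs({
  "default" => {
    "path" => "/mnt/gitlab-data/git-data"
  }
})
```

Run `gitlab-ctl reconfigure` after editing for the change to take effect.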
Features such as CI/CD, GitOps, KAS (Kubernetes agent server), and artifacts are disabled in our self-hosted GitLab, out of concern for segregation of duties (SoD) and security. Instead, we opt for other well-known alternatives (JFrog, Jenkins, Gerrit) to keep the GitLab instances simple.
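As a hedged sketch of what this looks like in Omnibus GitLab, the snippet below turns off the components we do not use; the key names come from the `gitlab.rb` template and may differ between versions:

```ruby
# /etc/gitlab/gitlab.rb -- sketch of disabling unused components.
gitlab_kas['enable'] = false      # KAS (Kubernetes agent server)
gitlab_pages['enable'] = false    # GitLab Pages
registry['enable'] = false        # built-in container registry

gitlab_rails['artifacts_enabled'] = false  # CI/CD job artifacts
# Create new projects with the built-in CI/CD (builds) feature disabled.
gitlab_rails['gitlab_default_projects_features_builds'] = false
```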
We have 3K monthly active users on our self-hosted GitLab instances, which serve source code management and CI/CD builds (run by Jenkins). We continuously monitor the GitLab instances with Grafana to identify performance issues.
According to our statistics, GitLab hosting has the following requirements:
- CPU: GitLab is not a CPU-bound application; CPU usage peaks only during merge requests. However, avoid oversold CPU instances, as high-performance network adapters require better specs.
- Memory: GitLab requires between 8 GB and 128 GB of memory; besides the application's own memory, the operating system caches files in memory pages.
- Networking: GitLab requires high-performance I/O during code pulls and pushes.
- Local storage: GitLab requires high-performance storage while diffing code; a cloud SSD disk is recommended.
In summary, I'd recommend deploying on a 4-vCPU/32 GB or 8-vCPU/64 GB instance with large cloud SSD disks for local storage.
Besides the machine specs, we also found that running on bare metal was not efficient.
Since there is no silver bullet that eliminates every single point of failure and disaster, storage replication is required for GitLab.
However, dedicated storage, including cloud EBS, is a zone-specific resource that is only redundant within a single availability zone. Disaster tolerance across regions is out of our scope: Git operations are latency sensitive, and replicating data over a WAN would slow down I/O performance.
Here are some Git hosting solutions.
| | GitLab Enterprise | Wandisco's Gerrit | Hot standby GitLab |
|---|---|---|---|
| Free | N | N | Y (OSS) |
| Docs | GitLab Geo replication | Wandisco's Multisite solution | Here |
| Minimum local nodes | 5 | 1 * | 2 |
| Supports local storage | Y | Y | N (dedicated storage required) |
| High availability | Y | Y | Generally |
| Disaster tolerance | Y (with GitLab Geo) | Y | N |
| Comments | Gitaly Cluster requires an additional Postgres | Ecosystem change | Switchover is not automatic |
You need to pay for an enterprise license to get the premium service.
It works fine on both bare metal and virtualized machines, even with local storage.
By choosing Gerrit, you have to switch the ecosystem from GitLab to Gerrit. I believe Gerrit is a reasonable choice if your team is migrating from Subversion to Git.
To survive compute node disruptions, we believe a GitLab node should never rely on its local storage; instead, it should access centralized, dedicated storage nodes over fiber, such as NetApp/OceanStor hardware or a Cinder-backed virtualization platform.
The diagram shows a hot standby architecture. The secondary VM connects directly to the primary VM through the Gitaly client over gRPC, rather than to a disk.
You can create one or more secondary VMs to minimize the downtime risk.
Compared with a single-node deployment, when the primary VM goes down, the Rails application on the secondary node, which is backed by the database, keeps running even though the code becomes inaccessible.
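To make the wiring concrete, here is a minimal sketch of the secondary VM's `gitlab.rb` pointing its Gitaly client at the primary over gRPC; the hostname, port, and token are placeholders, and the exact keys depend on the GitLab version:

```ruby
# /etc/gitlab/gitlab.rb on the SECONDARY VM -- sketch only.
# 'gitlab-primary.internal', port 8075, and the token are placeholders.

# Route all repository access to the primary's Gitaly over gRPC (TCP)
# instead of reading from a local or attached disk.
git_data_dirs({
  "default" => { "gitaly_address" => "tcp://gitlab-primary.internal:8075" }
})
gitlab_rails['gitaly_token'] = 'CHANGE_ME_SHARED_SECRET'

# On the PRIMARY VM, Gitaly must listen on the network with the same token:
#   gitaly['listen_addr'] = '0.0.0.0:8075'
#   gitaly['auth_token'] = 'CHANGE_ME_SHARED_SECRET'
```

During a switchover, the dedicated storage has to be reattached and this address repointed, which is why the failover is not automatic.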
There are three approaches to setting up highly available Git hosting. The hot standby solution comes with tradeoffs, but it requires no additional license. I hope you find the design useful when deploying GitLab services.