Prior to GitLab 15, we implemented GitLab high availability on shared NFS, despite occasional latency errors when accessing the filesystem.
Since GitLab 16, Gitaly, the service that provides storage access to Git repositories, has officially ended support for shared filesystems, including NFS, Lustre, GlusterFS, multi-attach EBS, and EFS. The only supported storage is block storage, such as Cinder, EBS, or VMDK.
Due to segregation of duties (SoD) and vulnerability concerns, we opted to keep our GitLab deployment simple. Features such as CI/CD, GitOps, KAS (the Kubernetes agent server), and artifacts are disabled in our self-hosted GitLab. Instead, we rely on other well-known alternatives (JFrog, Jenkins) to reduce the workload on GitLab.
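On an Omnibus installation, this kind of trimming is done in `/etc/gitlab/gitlab.rb`. A sketch of the idea is below; the exact keys vary between GitLab versions, so treat these as examples to verify against your version's documentation rather than a drop-in config:

```ruby
# /etc/gitlab/gitlab.rb -- disable features delegated to other tools.
# Key names are examples; verify them against your Omnibus GitLab version.

gitlab_kas['enable'] = false                # Kubernetes agent server (KAS)
registry['enable'] = false                  # built-in container registry
gitlab_rails['artifacts_enabled'] = false   # CI artifact storage

# Turn off CI/CD ("builds") by default for new projects.
gitlab_rails['gitlab_default_projects_features_builds'] = false
```

Run `sudo gitlab-ctl reconfigure` after editing for the changes to take effect.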
Here are some Git hosting solutions for comparison:
| | GitLab Enterprise | Wandisco's Gerrit | Hot standby GitLab |
|---|---|---|---|
| License | Commercial | Commercial | OSS |
| Docs | GitLab Geo replication | Wandisco's Multisite solution | Here |
| Minimum nodes | 5 | 3 | 2 |
| Local storage support | Yes | Yes | No (centralized storage required) |
| AZ replication | Yes (Gitaly Cluster) | Yes (Paxos) | Partial |
| Regional replication | Yes (GitLab Geo) | Yes (WAN replication) | No |
| Comments | Gitaly Cluster requires an additional PostgreSQL instance | Requires ecosystem migration | Switchover is not automatic |
You need to purchase enterprise licenses to access the premium features of the first two options.
It works fine on both bare metal and virtual machines, even with local storage. The remaining issue is that there will still be downtime during upgrades.
Wandisco introduced Paxos, a consensus algorithm for distributed systems, on top of Git to replicate repositories across multiple continents. The approach has been proven over decades of use in their Subversion product.
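To give a feel for what Paxos provides, here is a toy single-decree Paxos sketch, my own illustration rather than anything from Wandisco's implementation. Three acceptors agree on one value through the classic two-phase quorum exchange; networking and failure handling are omitted:

```python
# Toy single-decree Paxos: no networking, no crash recovery.
# Illustrates why a majority quorum yields a single agreed value.

class Acceptor:
    def __init__(self):
        self.promised = -1      # highest proposal number promised
        self.accepted = None    # (number, value) accepted so far, if any

    def prepare(self, n):
        # Phase 1b: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        # Phase 2b: accept unless a higher-numbered promise was made.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    # Phase 1a: gather promises from a majority.
    promises = [a.prepare(n) for a in acceptors]
    granted = [acc for ok, acc in promises if ok]
    if len(granted) <= len(acceptors) // 2:
        return None  # stale proposal number, no quorum
    # If any acceptor already accepted a value, propose that value instead;
    # this is what makes an agreed value stick across competing proposers.
    prior = [acc for acc in granted if acc is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2a: a value is chosen once a majority accepts it.
    votes = sum(a.accept(n, value) for a in acceptors)
    return value if votes > len(acceptors) // 2 else None


acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, 1, "commit-abc123"))  # the value is chosen
```

A later proposer with a higher proposal number learns the already-accepted value during phase 1 and re-proposes it, which is how replicas in different regions converge on the same repository state.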
To adopt Gerrit by Wandisco, consider the following:
To avoid disruptions to the computing nodes, we believe each node should access dedicated storage nodes over fiber connections, such as NetApp/OceanStor hardware or a Cinder virtualization platform.
The diagram shows a hot standby architecture along with disaster tolerance in another region.
Here is a more detailed explanation:
Compared with a single-node deployment, when the primary VM becomes unavailable, the Rails application on the secondary node, backed by the same database, continues to run.
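Since switchover is not automatic in this design, an operator (or a watchdog script) has to decide when the primary is really down. A minimal sketch of that decision logic, assuming the primary is polled via its `/-/readiness` endpoint, is below; the threshold value and the polling mechanism are illustrative choices, not GitLab tooling:

```python
# Hypothetical switchover-decision helper: promote the secondary only
# after several consecutive health-check failures, so a transient
# network blip does not trigger an unnecessary failover.
from typing import Iterable


def should_promote(checks: Iterable[bool], threshold: int = 3) -> bool:
    """Return True once `threshold` consecutive health checks fail."""
    streak = 0
    for healthy in checks:
        streak = 0 if healthy else streak + 1
        if streak >= threshold:
            return True
    return False


# One transient failure: stay on the primary.
print(should_promote([True, False, True, True]))    # False
# Sustained outage: promote the secondary.
print(should_promote([True, False, False, False]))  # True
```

In practice each boolean would come from an HTTP probe of the primary (for example, treating any non-200 response from `/-/readiness` as a failure) run on a fixed interval.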
Unlike Jenkins or SonarQube, GitLab has no LTS version that receives only security and bug fixes, so we must keep the instance up to date roughly every three months. According to GitLab's version policy, new releases may include new features, and no one can guarantee that new features are free from vulnerabilities. In other words, we are locked into a frequent upgrade cycle.
New features can remain experimental (e.g., AI chat, Kubernetes deployment) for an extended period, so you may have to wait through several iterations before relying on them.
Each upgrade, every three months, requires extensive preparation.
While no solution eliminates all single points of failure, the hot-standby solution offers a license-free option with tradeoffs. I hope you find the design useful when deploying GitLab services.