Prior to GitLab 15, we ran GitLab in a highly available setup on shared NFS, despite occasional latency errors when accessing the filesystem.
Since GitLab 16, Gitaly, GitLab's Git storage layer, has officially ended support for shared filesystems, including NFS, Lustre, GlusterFS, multi-attach EBS, and EFS. The only supported storage is block storage, such as Cinder, EBS, or VMDK.
Due to segregation of duties (SoD) and vulnerability concerns, we opted to keep our GitLab deployment simple. Features such as CI/CD, GitOps, KAS (the Kubernetes agent server), and artifacts are disabled in our self-hosted GitLab. Instead, we rely on other well-known alternatives (JFrog, Jenkins, Gerrit) to reduce the workload on GitLab.
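For illustration, most of these subsystems can be switched off in the Omnibus configuration file. The sketch below uses `gitlab.rb` keys that exist in recent releases, but key names vary between versions, so verify them against your own `/etc/gitlab/gitlab.rb` template before applying anything.

```sh
# Sketch: disable unused subsystems in /etc/gitlab/gitlab.rb
# (key names may differ between GitLab versions -- verify before use).
sudo tee -a /etc/gitlab/gitlab.rb <<'EOF'
gitlab_kas['enable'] = false                                     # Kubernetes agent server (KAS)
gitlab_rails['artifacts_enabled'] = false                        # CI/CD artifacts
gitlab_rails['gitlab_default_projects_features_builds'] = false  # pipelines off for new projects
registry['enable'] = false                                       # container registry
gitlab_pages['enable'] = false                                   # GitLab Pages
EOF
sudo gitlab-ctl reconfigure
```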
Here is a comparison of highly available Git hosting solutions.
| | GitLab Enterprise | WANdisco's Gerrit | Hot standby GitLab |
|---|---|---|---|
| License | Commercial | Commercial | OSS |
| Docs | GitLab Geo replication | WANdisco's multisite solution | Here |
| Minimum Nodes | 5 | 3 | 2 |
| Local Storage Support | Yes | Yes | No (centralized storage required) |
| Zone Replication | Yes (Gitaly Cluster) | Yes (Paxos) | Partial |
| Regional Replication | Yes (GitLab Geo) | Yes (WAN replication) | No |
| Comments | Gitaly Cluster requires an additional PostgreSQL instance | Requires ecosystem migration | Switchover is not automatic |
You need to purchase enterprise licenses to access these premium services.
GitLab Enterprise works fine on both bare metal and virtual machines, even with local storage.
WANdisco brought Paxos, a consensus algorithm for distributed systems, to Git in order to replicate repositories across multiple continents. Its performance has been proven over decades of use in their products.
To adopt Gerrit by WANdisco, consider the following:
To shield the compute nodes from storage disruptions, we believe each node should access dedicated storage nodes over fiber connections, such as NetApp/OceanStor hardware or a Cinder-based virtualization platform.
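As a sketch of what that looks like on an OpenStack platform, a dedicated Cinder volume could be provisioned and attached to the GitLab VM as follows; the size, volume type, server name, device path, and mount point are assumptions for illustration only.

```sh
# Sketch: provision a Cinder block volume and attach it to the GitLab VM
# (size, volume type, server name, and device path are placeholders).
openstack volume create --size 500 --type ssd gitlab-data
openstack server add volume gitlab-primary gitlab-data --device /dev/vdb

# On the VM: format and mount the device for GitLab's data directory.
sudo mkfs.xfs /dev/vdb
sudo mkdir -p /var/opt/gitlab
sudo mount /dev/vdb /var/opt/gitlab
```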
The diagram shows a hot standby architecture along with disaster tolerance in another region.
Here is a more detailed explanation:
Compared with a single-node deployment, when the primary VM becomes unavailable, the Rails application on the secondary node, backed by the shared database and storage, continues to run.
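Because switchover is not automatic in this design, failover comes down to a short manual runbook. The sketch below assumes an OpenStack-attached data volume and placeholder resource names; adapt it to whatever storage and DNS/VIP mechanism you actually use.

```sh
# Sketch of a manual switchover (all resource names are placeholders).
# 1. Move the shared data volume from the failed primary to the secondary.
openstack server remove volume gitlab-primary gitlab-data
openstack server add volume gitlab-secondary gitlab-data --device /dev/vdb

# 2. On the secondary: mount the data volume and start GitLab services.
sudo mount /dev/vdb /var/opt/gitlab
sudo gitlab-ctl reconfigure
sudo gitlab-ctl start

# 3. Redirect users, e.g. by updating the DNS record or virtual IP
#    that points at the GitLab endpoint.
```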
As a platform engineer with extensive hands-on experience across a wide range of DevOps products, I believe that GitLab's "all-in-one" strategy is a significant design flaw: an overly broad scope leads to immature features.
Managing a GitLab instance is complex due to its underlying RPC-based architecture.
The structure inside GitLab's Docker container is disorganized; I have never seen another product run Chef automation on a single node.
Unlike Jenkins or SonarQube, GitLab has no LTS release line that receives only security and bug fixes, so we must keep the instance up to date roughly every three months. Moreover, under GitLab's version policy, new releases may ship new features, and no one can guarantee that new features are free from vulnerabilities. In other words, we are locked into a frequent upgrade cycle.
New features (e.g., AI chat, Kubernetes deployment) can stay in an evolving state across versions for an extended period, so you may have to wait through several iterations before they mature.
Each upgrade, arriving every three months, requires extensive preparation.
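To give a sense of that preparation, a minimal routine for an Omnibus installation might look like the sketch below; the pinned version is a placeholder, and the official upgrade path between your current and target versions must be checked first.

```sh
# Sketch of a routine Omnibus upgrade (the pinned version is a placeholder).
sudo gitlab-backup create                        # back up repositories and the database
sudo mkdir -p /root/gitlab-config-backup
sudo cp /etc/gitlab/gitlab.rb /etc/gitlab/gitlab-secrets.json /root/gitlab-config-backup/
sudo apt-get update
sudo apt-get install gitlab-ee=16.11.1-ee.0      # pin an explicit version on the upgrade path
sudo gitlab-ctl reconfigure
sudo gitlab-rake gitlab:check                    # post-upgrade health check
```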
The following issues are unrelated to typical technical vulnerabilities (such as XSS or CSRF); they stem from the underlying design of roles and permissions.
- It is hard to distinguish the `Guest` and `Reporter` roles without a cheat sheet.
- The `:admin_project` role grants extensive permission checkpoints.

Some design issues have been discussed in another article.
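One illustration of how opaque the role model is: a member's effective role is exposed through the REST API only as a numeric access level that you have to map back to a role yourself. The host, project ID, user ID, and token below are placeholders.

```sh
# Sketch: look up a member's access level via the members API
# (host, project ID, user ID, and token are placeholders).
# Access levels map to roles: 10=Guest, 20=Reporter, 30=Developer,
# 40=Maintainer, 50=Owner.
curl --header "PRIVATE-TOKEN: <your_access_token>" \
  "https://gitlab.example.com/api/v4/projects/42/members/all/1234"
```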
We are still using GitLab rather than Gerrit because we have accumulated a significant investment in the GitLab API, especially around the merge request workflow. Besides that investment, there are other benefits as well.
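Much of that investment is automation built on the merge request endpoints. As a hypothetical illustration (host, project ID, merge request IID, and token are placeholders), the core of such a workflow is just a couple of REST calls:

```sh
# Sketch: list open merge requests and accept one via the REST API
# (host, project ID, merge request IID, and token are placeholders).
curl --header "PRIVATE-TOKEN: <your_access_token>" \
  "https://gitlab.example.com/api/v4/projects/42/merge_requests?state=opened"

curl --request PUT --header "PRIVATE-TOKEN: <your_access_token>" \
  "https://gitlab.example.com/api/v4/projects/42/merge_requests/7/merge"
```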
While no solution eliminates all single points of failure, the hot-standby approach offers a license-free option with acceptable tradeoffs. I hope you find the design useful when deploying GitLab services.