Before GitLab 15, we achieved GitLab availability via shared NFS, despite occasional latency errors when accessing the filesystem.
Since GitLab 16, Gitaly, GitLab's Git storage layer, has officially ended support for shared filesystems, including NFS, Lustre, GlusterFS, multi-attach EBS, and EFS. The only supported storage is block storage, such as Cinder volumes, EBS, or VMDK disks.
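For illustration, here is a minimal `gitlab.rb` sketch that puts Gitaly's repository storage on a locally attached block volume. The mount point `/mnt/git-data` is an assumption, and `git_data_dirs` is the long-standing Omnibus key (newer releases move storage settings elsewhere, so check the Gitaly docs for your version):

```ruby
# /etc/gitlab/gitlab.rb -- sketch: Gitaly storage on a locally attached block volume
# (an EBS/Cinder/VMDK disk is assumed to be mounted at /mnt/git-data).
git_data_dirs({
  "default" => { "path" => "/mnt/git-data" }
})
```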
Due to segregation-of-duties (SoD) and vulnerability concerns, we opt to keep GitLab simple. Features such as CI/CD, GitOps, KAS (the Kubernetes agent server), and artifacts are disabled on our self-hosted GitLab, as sketched below. Instead, we rely on other well-known alternatives (JFrog, Jenkins, Gerrit) to reduce the workload on GitLab.
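As a sketch of that hardening, assuming the Omnibus package, the following `gitlab.rb` settings turn off KAS, artifact storage, and CI/CD for new projects; the exact set of features to disable depends on your deployment:

```ruby
# /etc/gitlab/gitlab.rb -- disable features that external tools already cover
gitlab_kas['enable'] = false                                     # Kubernetes agent server (KAS/GitOps)
gitlab_rails['artifacts_enabled'] = false                        # CI/CD artifact storage
gitlab_rails['gitlab_default_projects_features_builds'] = false  # new projects start with CI/CD off
```

Run `sudo gitlab-ctl reconfigure` afterwards to apply the changes.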
Here is a comparison of some Git hosting solutions:
| | GitLab Enterprise | WANdisco's Gerrit | Hot-standby GitLab |
|---|---|---|---|
| License | Commercial | Commercial | OSS |
| Docs | GitLab Geo replication | WANdisco's multisite solution | Here |
| Minimum nodes | 5 | 3 | 2 |
| Local storage support | Yes | Yes | No (centralized storage required) |
| High availability | Yes (Gitaly Cluster) | Yes (Paxos) | Partial |
| Disaster tolerance | Yes (GitLab Geo) | Yes (WAN replication) | No |
| Comments | Gitaly Cluster requires an additional PostgreSQL instance | Requires ecosystem migration | Switchover is not automatic |
For GitLab Enterprise, you need to pay for an enterprise license to get the premium service.
It works fine on both bare metal and virtual machines, even with local storage.
WANdisco applied Paxos, a consensus algorithm for distributed systems, to Git in order to replicate repositories across multiple continents. The performance has been proven in their products for decades.
To adopt WANdisco's Gerrit, the following should be taken into account:
To avoid disruptions to compute nodes, we believe each node should never rely on its local storage; instead, it should reach dedicated storage nodes over fiber connections, such as NetApp/OceanStor hardware or a Cinder virtualization platform.
The diagram shows a hot-standby architecture: the secondary VM connects to the primary VM directly with a Gitaly client over gRPC, rather than mounting a shared disk.
Here is an explanation:
Compared with a single-node deployment, when the primary VM is down, the Rails application on the secondary node, which is database-backed, keeps running even though the code itself becomes unreachable. A configuration sketch follows.
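Here is a minimal sketch of the secondary VM's `gitlab.rb` under this design. The hostname and token are placeholders, and the keys follow the classic Omnibus syntax (newer releases are migrating away from `git_data_dirs`, so verify against the docs for your version):

```ruby
# Secondary VM's /etc/gitlab/gitlab.rb -- hot-standby sketch.
# Hostname and token are placeholders; adjust to your environment.
gitaly['enable'] = false  # the standby runs no Gitaly of its own
git_data_dirs({
  "default" => { "gitaly_address" => "tcp://gitlab-primary.internal:8075" }
})
gitlab_rails['gitaly_token'] = 'shared-gitaly-secret'  # must match the primary's token

# On the primary VM, Gitaly must listen on TCP so the standby can reach it:
# gitaly['listen_addr'] = '0.0.0.0:8075'
# gitaly['auth_token'] = 'shared-gitaly-secret'
```

During a switchover, the standby has to enable its own Gitaly and attach the centralized storage, which is why the process is not automatic.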
As a platform engineer with hands-on experience across a vast number of DevOps products, I believe the "all-in-one" strategy is the worst choice GitLab Inc. has ever made: an overly broad scope leads to immature features.
To manage the GitLab instance, I studied the underlying RPC-based architecture and found it complicated.
The structure inside GitLab's Docker container is disorganized; I have never seen another product run Chef automation on a single node.
Unlike Jenkins or SonarQube, GitLab offers no LTS release for security and bug fixes, so we have to keep the instance up to date every three months. Moreover, according to GitLab's version policy, new releases can include new features, and no one can guarantee that new features are free of vulnerabilities. In other words, we are stuck in a frequent upgrade cycle.
New features can stay in an ongoing state (e.g., AI chat, Kubernetes deployment) for a long period, so you may have to wait through several iterations.
Each of these quarterly upgrades requires extensive preparation.
The following issues are unrelated to typical technical vulnerabilities (such as XSS or CSRF); they stem from the underlying design of roles and permissions.
- It is hard to tell the Guest and Reporter roles apart without a cheat sheet.
- The `:admin_project` permission undertakes too many checkpoints.

Some design issues have been discussed in another article.
We are still using GitLab rather than Gerrit because we have accumulated a large investment in the GitLab API, especially around the merge request workflow. Beyond that investment, GitLab offers other benefits as well.
While no solution eliminates all single points of failure, the hot-standby design offers a license-free option with acceptable tradeoffs. I hope you find it useful when deploying GitLab services.