title date draft image categories tags
Garage distributed object storage via Ansible 2023-02-27T10:12:54+02:00 false uploads/logos/garage.png
english
homelab
devops

I recently built a beginner-friendly Ansible playbook for Garage, an S3-compatible distributed object storage.

What is garage-docker-ansible-deploy?

Garage is an open-source distributed object storage service tailored for self-hosting. The Ansible playbook garage-docker-ansible-deploy helps you set up such a cluster.

It comes with "batteries included", so it will automatically install Docker and set up a reverse proxy (Traefik).

You may be familiar with some related Ansible playbooks that this playbook is based on.

These playbooks are masterfully maintained by spantaleev and the community. I copied the design and reuse roles, for example to install Traefik.

Opinionated Design

Garage is very flexible software that can serve a lot of use cases. The playbook is opinionated in the sense that it reduces the flexibility of Garage in favor of an easy deployment that should serve common use cases. The playbook currently encourages a layout where:

  • 1 garage data node is used per physical drive that should be used by the cluster
  • 1 gateway node is used per host to make redundant setups possible

Each host is assumed to have a public IPv4/IPv6 address, and every node should have a dedicated subdomain, plus one additional subdomain per gateway on the host.
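The layout rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the playbook works internally: the function `build_nodes` is hypothetical, and the naming and port scheme (`garage-node1.example.com`, 3901 for the gateway, 3911 and 3921 for the data nodes) is an assumption that simply follows the example configuration in this post.

```python
from typing import Dict, List


def build_nodes(drive_mounts: List[str], domain: str) -> List[Dict[str, object]]:
    """Return one gateway entry plus one data-node entry per drive (hypothetical helper)."""
    # One gateway per host; the gateway also gets the subdomain for the S3 API.
    nodes: List[Dict[str, object]] = [{
        "name": "gateway1",
        "gateway": True,
        "rpc_bind_port": 3901,
        "node_addr": f"garage-gw1.{domain}",
        "s3_api_addr": f"s3.{domain}",
    }]
    # One data node per physical drive, each with its own port and subdomain.
    for index, mount in enumerate(drive_mounts, start=1):
        base_path = f"{mount}/garage/node{index}"
        nodes.append({
            "name": f"node{index}",
            "gateway": False,
            "metadata_path": f"{base_path}/metadata",
            "data_path": f"{base_path}/data",
            "rpc_bind_port": 3901 + 10 * index,  # assumed numbering: 3911, 3921, ...
            "node_addr": f"garage-node{index}.{domain}",
        })
    return nodes


if __name__ == "__main__":
    for node in build_nodes(["/media/drive1", "/media/drive2"], "example.com"):
        print(node["name"], node["rpc_bind_port"], node["node_addr"])
```

For a host with two drives this yields three entries: one gateway and two data nodes, each with a distinct RPC port and subdomain.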

When all of this comes together, a Garage host might look something like this:

{{< figure src="/uploads/design_scheme.png" width="100%" caption="Example layout with one host that has 2 nodes (as it has two drives where data will be stored)" alt="A garage node with one gateway node and 2 data nodes that expose the ports 3901, 3911 and 3912. A Traefik server exposes port 443. Everything is contained within server1 that has IP 42.42.42.42" >}}

You need to configure the DNS records to point to server1 yourself; the playbook takes care of everything else with the following configuration:

garage_garage_node1_base_path: "/media/drive1/garage/node1"
garage_garage_node2_base_path: "/media/drive2/garage/node2"
garage_garage_nodes:
  - name: "gateway1"
    metadata_path: "{{ garage_garage_meta_path }}/gw1"
    data_path: "{{ garage_garage_data_path }}/gw1"
    gateway: true
    rpc_bind_port: 3901
    node_addr: "garage-gw1.example.com"
    s3_api_addr: "s3.example.com"
  - name: "node1"
    gateway: false
    capacity: 3
    metadata_path: "{{ garage_garage_node1_base_path }}/metadata"
    data_path: "{{ garage_garage_node1_base_path }}/data"
    rpc_bind_port: 3911
    node_addr: "garage-node1.example.com"
  - name: "node2"
    gateway: false
    capacity: 3
    metadata_path: "{{ garage_garage_node2_base_path }}/metadata"
    data_path: "{{ garage_garage_node2_base_path }}/data"
    rpc_bind_port: 3921
    node_addr: "garage-node2.example.com"
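Since the DNS records are the one part you have to set up by hand, it can help to list every hostname the configuration above refers to and to check that no two nodes share an RPC port. The following is a minimal sketch in Python under a few assumptions: the node list is copied by hand from the configuration above (paths omitted), and the helpers `required_hostnames` and `duplicate_ports` are hypothetical and not part of the playbook.

```python
from collections import Counter
from typing import Dict, List

# Trimmed, hand-copied version of garage_garage_nodes from the configuration above.
NODES: List[Dict[str, object]] = [
    {"name": "gateway1", "rpc_bind_port": 3901,
     "node_addr": "garage-gw1.example.com", "s3_api_addr": "s3.example.com"},
    {"name": "node1", "rpc_bind_port": 3911, "node_addr": "garage-node1.example.com"},
    {"name": "node2", "rpc_bind_port": 3921, "node_addr": "garage-node2.example.com"},
]


def required_hostnames(nodes: List[Dict[str, object]]) -> List[str]:
    """Collect every hostname that needs a DNS record pointing at the host."""
    names: List[str] = []
    for node in nodes:
        names.append(str(node["node_addr"]))
        # Only gateway entries carry an S3 API address in this example.
        if "s3_api_addr" in node:
            names.append(str(node["s3_api_addr"]))
    return names


def duplicate_ports(nodes: List[Dict[str, object]]) -> List[int]:
    """Return RPC ports that are used by more than one node on the host."""
    counts = Counter(int(node["rpc_bind_port"]) for node in nodes)
    return sorted(port for port, count in counts.items() if count > 1)


if __name__ == "__main__":
    print("DNS records needed for:", ", ".join(required_hostnames(NODES)))
    print("Duplicate RPC ports:", duplicate_ports(NODES))
```

For the example host this lists four hostnames (three node addresses plus `s3.example.com`) and reports no duplicate ports.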

Limitations

While the playbook should of course be reusable and fairly modular, it will never be a solution for all use cases. The playbook does not cover:

  • Setting up domains (but there are instructions)
  • Detailed management of buckets and keys: there are basic features to create buckets and access keys, but further management is out of scope for the playbook
  • Connecting nodes via (mesh) VPN as mentioned in the project documentation

Getting started

Be aware that the playbook is not yet widely used, so I don't have much more to go on than my own experience. I am happy to help if you hit any bumps in the road.

{{< chat garage >}}