Prepare Gitea migration (from GKE to Kosmos server) #147

Closed
opened 2020-03-26 23:11:51 +00:00 by raucao · 13 comments
Owner

As discussed, GKE doesn't give us enough benefits at the moment to justify the high costs, especially as we don't even get zero-downtime deployments, due to not being able to attach volumes to more than one pod/node.

Rough idea was to get another metal box at Hetzner, which could then also host Kosmos Drone CI.

Author
Owner

After everyone in the last call agreed on the Hetzner move, I just ordered a used box (from the auction system), which I thought was the best value for what we need. As it's the middle of the night on a weekend in Germany, it'll be ready by tomorrow or Monday.

It's an Intel Core i7 4770 with two 240GB SSDs and 32GB of RAM. This should be more than enough to run all of Gitea, Drone CI, and Jitsi Meet.

I think it would also be cool to use this to add a second ejabberd node, as that should be pretty easy with Erlang (or maybe not, but then we know what it involves). And if someone wants to dive into Postgres replication, it'd be awesome to have a warm standby server, so that in case of the main server going down, we could easily deploy all apps elsewhere and switch to the standby, with zero data loss from backup intervals.
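For the ejabberd idea, joining a second node is essentially one command once both nodes share the same Erlang cookie. A sketch (the node name is just an example):

```
# On the new node, after copying the Erlang cookie from the first one:
$ ejabberdctl join_cluster 'ejabberd@andromeda.kosmos.org'
$ ejabberdctl list_cluster
```

The Mnesia tables are then replicated between the nodes, which is what makes this "pretty easy with Erlang" in theory.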

greg self-assigned this 2020-04-07 14:45:43 +00:00
Owner

I have found instructions to perform a MySQL dump from a GKE node to a local file (https://medium.com/@madushagunasekara/export-mysql-db-dump-from-kubernetes-pod-and-restore-mysql-db-on-kubernetes-pod-6f4ecc6b5a64) and did a successful dump.

I'm now looking into exporting the git repositories from the persistent volume.
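For reference, the dump boils down to something like this (a sketch; the pod name, user, and password are placeholders):

```
# Dump the Gitea MySQL database from inside the pod to a local file.
$ kubectl exec gitea-mysql-#{pod_id} -n default -- \
    mysqldump -u gitea --password=#{password} gitea > gitea-db.sql
```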

Owner

Exporting the data is also straightforward:

```
kubectl cp gitea-server-#{pod_id}:/data ./backup
```
Owner

I looked into running Gitea and Drone using Docker Compose (we're running Drone CI using Docker Compose at 5apps, but for Kosmos we're using Gitea instead of GitHub for auth).

The part that's a bit tricky is running SSH for Gitea on port 22 on the host machine. I won't have time to finish this today, but I'm close.
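One way around it would be to move the host's sshd to a different port and give port 22 to the Gitea container. A hypothetical compose excerpt, not necessarily what we'd end up with:

```
# Hypothetical docker-compose.yml excerpt; assumes the host sshd
# was moved to e.g. port 2222 so the container can bind port 22
services:
  gitea:
    image: gitea/gitea:latest
    ports:
      - "22:22"     # Gitea's built-in SSH server
      - "3000:3000" # web UI
```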

Author
Owner

To be honest, I'd prefer not to use Docker on normal servers wherever possible. It adds another layer of complexity, both for running the thing and for preparing it for deployment (you can't go straight from source to server).

Gitea deployment only requires a database server and a single executable, which you build or download and then run. And building it ourselves on the server would automatically solve the problem of creating custom builds, for when we e.g. want to deploy our own fixes or additions from a different branch.
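For reference, building it ourselves would be along these lines (a sketch based on the upstream build docs; the branch name is made up):

```
$ git clone https://github.com/go-gitea/gitea.git
$ cd gitea
$ git checkout kosmos-fixes          # example: our own branch with custom changes
$ TAGS="bindata" make build          # produces a single `gitea` binary
```

The `bindata` tag embeds templates and assets into the binary, so the one executable really is all that needs deploying.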

(Drone CI is a different beast, of course, and Docker Compose seems like the best way to go to me.)

Author
Owner

This could come in handy:

https://discourse.gitea.io/t/migrate-gitea-db-from-mariadb-to-postgresql/2072/3

By the way, I think we should use only the Postgres master on Andromeda from both machines, and set up a hot standby (and read-only) replica on Centaurus. This way, we can recover all apps that use Postgres very easily, and without using outdated backup downloads, when one of the machines goes down.

Owner

Yeah, deploying Gitea without Docker makes more sense; I will do that.

Switching to Postgres also sounds good.

Owner

I got Gitea to run in a VM, deployed by Chef. I'm running a PostgreSQL server with TLS, with the cert generated by Let's Encrypt using DNS auth via Gandi, like we already do for the XMPP server. I have also successfully used the `gitea dump` command with the postgres format; it's cool that it's included.

Now I'm researching [warm standby](https://www.postgresql.org/docs/10/warm-standby.html)

Author
Owner

Just a hint: the linked doc is for an outdated version of Postgres. The replication options have changed since then:

https://www.postgresql.org/docs/12/warm-standby.html

Edit: there are also various tools for helping with replication and failover, e.g. https://repmgr.org/

Owner

Thanks, I got replication in Postgres 12 to work and it is [much better](https://www.percona.com/blog/2019/10/11/how-to-set-up-streaming-replication-in-postgresql-12/). Its options are part of the normal config, there's no more `recovery.conf` in the data directory, and the initial sync can now set up the standby. PostgreSQL provides official packages for 12 on Ubuntu 18.04, so that's easy to set up using the existing upstream cookbook.

We should upgrade from Postgres 10 to 12 before setting up replication on the new server; I think that would be the easiest route. For our setup with just one database, [pg_upgrade](https://www.postgresql.org/docs/12/pgupgrade.html) looks like a good fit. I remember using it before, probably to switch from PostgreSQL 9 to 10.
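For later reference, the Postgres 12 setup would roughly look like this (a sketch; the hostnames and the replication user are examples, not our actual config):

```
# On Andromeda (primary), postgresql.conf:
#   wal_level = replica
#   max_wal_senders = 10

# On Centaurus (standby): clone the primary's data directory.
# -R writes primary_conninfo into postgresql.auto.conf and creates
# standby.signal, which starts the server in standby mode.
$ sudo -u postgres pg_basebackup -h andromeda.kosmos.org \
    -U replicator -D /var/lib/postgresql/12/main -P -R
```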

Author
Owner

Sounds great!

Yeah, I have to use `pg_upgrade` on my laptop for every major version. For Arch, it looks like this: https://wiki.archlinux.org/index.php/PostgreSQL#Upgrading_PostgreSQL
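On Ubuntu 18.04 with the PGDG packages, the equivalent would be roughly this (a sketch using the default Debian cluster paths; both clusters need to be stopped first):

```
$ sudo systemctl stop postgresql
$ sudo -u postgres /usr/lib/postgresql/12/bin/pg_upgrade \
    -b /usr/lib/postgresql/10/bin  -B /usr/lib/postgresql/12/bin \
    -d /var/lib/postgresql/10/main -D /var/lib/postgresql/12/main \
    -o '-c config_file=/etc/postgresql/10/main/postgresql.conf' \
    -O '-c config_file=/etc/postgresql/12/main/postgresql.conf'
```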

raucao added a new dependency 2020-04-25 10:13:57 +00:00
Owner

Copied from the PR:

I think we can finally migrate Gitea to Centaurus (with the DB on Andromeda as the master). Tomorrow I will check everything on Andromeda and Centaurus, then we can pick a time and date to do the switch. It will involve a DNS switch; in preparation for this I have just lowered the TTL on gitea.kosmos.org to 300s, the lowest Gandi supports. It was previously set to 1800s.

Here are my notes for the dump/import:

## Perform a dump

From https://discourse.gitea.io/t/migrate-gitea-db-from-mariadb-to-postgresql/2072/3

```
$ kubectl exec gitea-server-5d57fc877d-ghvps -n default -- /app/gitea/gitea dump -d postgres -c /data/gitea/conf/app.ini -f /data/gitea/gitea-dump.zip
$ kubectl cp gitea-server-5d57fc877d-ghvps:/data/gitea/gitea-dump.zip gitea-dump.zip
$ kubectl exec gitea-server-5d57fc877d-ghvps -n default -- /bin/rm /data/gitea/gitea-dump.zip
```

## Import the dump

### on Andromeda:

```
$ sudo su - postgres -c "psql gitea < gitea-db.sql"
```

### on Centaurus:

SCP and unzip gitea-dump.zip:

```
$ scp gitea-dump.zip centaurus.kosmos.org:
$ ssh centaurus.kosmos.org
```

```
$ sudo systemctl stop gitea
$ mkdir dump; mv gitea-dump.zip dump/; (cd dump && unzip gitea-dump.zip)
$ sudo cp -R dump/repositories /home/git/gitea-repositories
# Copy the content of `data`:
$ sudo cp -R dump/data/sessions /var/lib/gitea/
$ sudo cp -R dump/data/indexers /var/lib/gitea/data/
$ sudo cp -R dump/data/attachments /var/lib/gitea/data/
$ sudo cp -R dump/data/avatars /var/lib/gitea/data/
$ sudo cp -R dump/data/repo-avatars /var/lib/gitea/data/
$ sudo rm -rf /var/lib/gitea/data/queues; sudo cp -R dump/data/queues /var/lib/gitea/data/
$ sudo chown -R git:git /var/lib/gitea/data/
$ sudo systemctl start gitea
```

To add the public keys from the database to the `/home/git/.ssh/authorized_keys` file, allowing users to access git@gitea.kosmos.org:

```
$ sudo su - git -c "/usr/local/bin/gitea admin regenerate keys --config /etc/gitea/app.ini"
```

This performs the same operation as running the `Update the '.ssh/authorized_keys' file with Gitea SSH keys. (Not needed for the built-in SSH server.)` maintenance task from the Gitea admin dashboard.

Owner

Copied this question from the PR:

>> As we're importing a database dump, the /home/git/.ssh/authorized_keys file on Centaurus will be empty at first. This admin task generates the content of the file with the users' public keys. It is then managed by Gitea, so new keys are automatically added, deleted keys deleted, etc.
>
> Yes, that's obvious from the original post. But what is this task? It is just English text in your post, but is it a script somewhere? How is it run?

Originally, this was a maintenance task executed from the Gitea admin dashboard, so a link on https://gitea.kosmos.org/admin.

I have found a better way: this is also available as a CLI command, and I have added the line to the checklist above:

```
$ sudo su - git -c "/usr/local/bin/gitea admin regenerate keys --config /etc/gitea/app.ini"
```
greg changed title from Migrate Gitea off of GKE to Prepare Gitea migration (from GKE to Kosmos server) 2020-06-02 11:51:56 +00:00
Reference: kosmos/chef#147