Encrypt PostgreSQL data directory #166

Merged
raucao merged 11 commits from feature/pg_encfs into master 2020-06-08 15:02:59 +00:00
Owner

encfs always runs a configuration assistant when creating a new volume, so this needs to be done manually:

systemctl stop postgresql@12-main
mv /var/lib/postgresql /var/lib/postgresql.old
encfs /var/lib/postgresql_encrypted /var/lib/postgresql --public
Pick p (paranoia mode) and enter the password from the data bag twice

mv /var/lib/postgresql.old/* /var/lib/postgresql/
systemctl start postgresql@12-main

This is running on centaurus and is mounted automatically on boot by a systemd unit.

Refs #129

Owner

Mostly LGTM. But what good is encrypting a volume when you leave the encryption password lying around on the hard drive?

(This is meant to make it impossible to rip the drive out of our machine and get to the user data.)

Author
Owner

Another option is to not start the PostgreSQL service on boot, and instead run a script on boot that prompts for the encrypted volume password and then starts PostgreSQL.
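
A minimal sketch of such an unlock script, assuming the directories and unit name from the first comment. The function names are hypothetical, and `--public --stdinpass` are the encfs flags already used elsewhere in this PR:

```shell
#!/bin/sh
# Sketch only: paths and unit name follow this thread, function names are hypothetical.
ENC_DIR="${ENC_DIR:-/var/lib/postgresql_encrypted}"
MOUNT_DIR="${MOUNT_DIR:-/var/lib/postgresql}"
PG_UNIT="${PG_UNIT:-postgresql@12-main}"

# Prompt on stderr and read one line from stdin without echoing it.
read_password() {
  printf 'encfs password: ' >&2
  stty -echo 2>/dev/null || true   # fails harmlessly when stdin is not a terminal
  IFS= read -r password || true
  stty echo 2>/dev/null || true
  printf '\n' >&2
  printf '%s' "$password"
}

# Mount the encrypted directory, then start PostgreSQL only if the mount succeeded.
unlock_and_start() {
  password=$(read_password) || return 1
  printf '%s\n' "$password" | encfs "$ENC_DIR" "$MOUNT_DIR" --public --stdinpass || return 1
  systemctl start "$PG_UNIT"
}
```

Calling `unlock_and_start` from a root shell after boot would be enough for a single volume. The password never touches the disk, which addresses the concern raised above.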

Owner

Yes, and ideally that should happen only once for all volumes (or we only have one data volume to begin with). I.e. we also want encryption for any other user data, like Gitea repos, XMPP uploads, remoteStorage files, and so on.

raucao added a new dependency 2020-06-04 10:50:23 +00:00
Author
Owner

I took another look at this issue, and I'm starting to think a full disk encryption setup would make more sense than encFS directories, something similar to https://github.com/TheReal1604/disk-encryption-hetzner/blob/master/ubuntu/ubuntu_swraid_lvm_luks.md

greg changed title from Encrypt the Postgresql data dir on the replica (centaurus) to WIP: Encrypt the Postgresql data dir on the replica (centaurus) 2020-06-04 17:46:12 +00:00
Author
Owner

I have pushed a proof of concept that creates a /var/lib/local/encrypted_data encfs dir and mounts it to /mnt/data. This is done using a systemd unit that prompts for the encryption password and then starts the PostgreSQL unit. See the last commit.

Edit: I have also found a way to automate the encfs dir creation.

Owner

I think using a service unit for encfs may not be the right approach for this. It is not a running service, like e.g. postgres, but only mounts a directory once.

Here's a nice overview of the different types of units available:

https://www.computernetworkingnotes.com/linux-tutorials/systemd-units-explained-with-types-and-states.html

Owner

I think it would be interesting to try path-based activation of units. This would basically map 1:1 to the human understanding of "start the postgres service as soon as path /mnt/data/postgres becomes available".

Owner

BTW, I just noticed that encfs and gitea are the only two cookbooks using an underscore as the separator (kosmos_encfs), while all the others use a hyphen (e.g. kosmos-postgresql).

Owner

I just tried this branch by adding the recipes for a postgres master and encfs to the Vagrant config's runlist. However, Chef runs fail, claiming there's no postgres user:

cannot determine user id for 'postgres', does the user exist on this system?

I cannot find this user being created in a site-cookbook either.

Owner

... adding a postgres system user to the default recipe fixes the problem. However, when trying to unlock encfs, it does not accept the password from the data bag:

mount_encfs[2065]: Error decoding volume key, password incorrect

I think this code is probably too prone to errors:

  command <<-EOF
echo "y\\\n
y\\\n
p\\\n
#{encfs_password}\\\n
#{encfs_password}\\\n
" | encfs #{encrypted_directory} #{mount_directory} --public --stdinpass
  EOF

Edit: also, a reboot makes the Postgres service fail instead of waiting:

postgresql@12-main[2230]: Error: /mnt/data/postgresql/12/main is not accessible or does not exist

I'm going to try to use the path as a trigger instead and actually make it wait until the path exists.
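
One way to make the answer piping less fragile is to let `printf` emit one argument per line, so nothing depends on backslash escapes surviving the Ruby heredoc and the shell. The sketch below simply mirrors the answer sequence from the recipe (two `y`, then `p`, then the password twice). Whether encfs consumes exactly these lines with `--stdinpass` is an assumption carried over from that code, and the function names are hypothetical:

```shell
#!/bin/sh
# Emit the setup answers one per line. '%s\n' prints each argument verbatim,
# so passwords containing $, quotes or backslashes are not reinterpreted.
encfs_setup_answers() {
  printf '%s\n' y y p "$1" "$1"
}

# Hypothetical wrapper: $1 = encrypted dir, $2 = mount point, $3 = password.
create_encfs_volume() {
  encfs_setup_answers "$3" | encfs "$1" "$2" --public --stdinpass
}
```

In a Chef recipe the password would still need to be shell-quoted before being interpolated into the command string.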

Owner

I keep running into issues with the code here. There's actually no cluster created with the correct datadir, but all the files are created in the default directory. So starting the process later fails with it complaining that the datadir is not a valid cluster directory. However, trying pg_createcluster then correctly fails, stating that the cluster config already exists.
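
A quick way to spot this kind of mismatch is to compare the data directory registered for the cluster with the expected one. The sketch below assumes the Debian `pg_lsclusters` tool, that `-h` suppresses the header, that the data directory is the sixth column, and that paths contain no spaces. The function names are hypothetical:

```shell
#!/bin/sh
# Print the data directory registered for cluster <version> <name>.
cluster_datadir() {
  pg_lsclusters -h | awk -v ver="$1" -v name="$2" '$1 == ver && $2 == name { print $6 }'
}

# Succeed only if the registered data directory equals the expected one ($3).
cluster_uses_datadir() {
  [ "$(cluster_datadir "$1" "$2")" = "$3" ]
}
```

For example, `cluster_uses_datadir 12 main /mnt/data/postgresql/12/main` would fail if the cluster was created in the default directory.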

Owner

All solved! I have everything running correctly now, with the cluster created in the encrypted data dir, and the services started by path units.

@greg I'm not experienced with writing Chef resources, but I think a resource in the encfs cookbook would make most sense, so that you simply describe that you want to wait for the encrypted dir to start a certain systemd service in the recipe of the respective service.

That is, for Postgres e.g., we'd only do something along the lines of:

create_encfs_path_activation_unit_for 'postgresql@12-main.service'

I have added a template for this to the encfs site cookbook, which looks like this:

[Unit]
Description=Start <%= @service_unit %> when encrypted data directory is mounted

[Path]
PathExists=/tmp/data-dir-mounted.txt
Unit=<%= @service_unit %>

[Install]
WantedBy=multi-user.target
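
For trying the template outside of Chef, the same unit can be produced with a small shell function. This is only a sketch: the function name is hypothetical, and the marker path default is the one from the template above.

```shell
#!/bin/sh
# Print a path unit that starts the given service once the marker path exists.
# $1 = service unit name, $2 = optional marker path.
render_path_unit() {
  service_unit=$1
  marker_path=${2:-/tmp/data-dir-mounted.txt}
  cat <<EOF
[Unit]
Description=Start ${service_unit} when encrypted data directory is mounted

[Path]
PathExists=${marker_path}
Unit=${service_unit}

[Install]
WantedBy=multi-user.target
EOF
}
```

`render_path_unit 'postgresql@12-main.service'` prints the unit to stdout. Redirecting it into a `.path` file under /etc/systemd/system and enabling that unit would roughly correspond to what the proposed resource does.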
Owner

... I had to fork the postgresql cookbook in order to add the possibility to correctly use custom data dirs on Debian-based systems:

https://github.com/67P/postgresql/commit/9389178e116528ad22b1d05e93ca93c2a485c727

I have added the site cookbook as a git submodule. So when you pull the branch, just do a `git submodule update --init`.
Owner

I have pushed everything. See new commits.

raucao self-assigned this 2020-06-07 10:51:08 +00:00
greg was assigned by raucao 2020-06-07 10:51:10 +00:00
raucao changed title from WIP: Encrypt the Postgresql data dir on the replica (centaurus) to WIP: Encrypt PostgreSQL data directory 2020-06-07 10:57:12 +00:00
greg reviewed 2020-06-08 13:38:00 +00:00
@ -44,0 +46,4 @@
# This service is a dependency that will auto-start our cluster service on
# boot if it's enabled, so we disable it explicitly
service "postgresql" do
action :disable
Author
Owner

postgresql is a dummy service; it only runs /bin/true. The service to disable is the content of the postgresql_service variable (postgresql@12-main), so this can be moved above.

raucao changed title from WIP: Encrypt PostgreSQL data directory to Encrypt PostgreSQL data directory 2020-06-08 15:02:37 +00:00
raucao closed this pull request 2020-06-08 15:02:59 +00:00
raucao deleted branch feature/pg_encfs 2020-06-08 15:03:07 +00:00
Reference: kosmos/chef#166