WIP: Set up ingress with Let's Encrypt certificates using cert-manager #26

Closed
greg wants to merge 5 commits from feature/ingress into master
greg commented 3 years ago
Owner

This is using haproxy-ingress to support forwarding SSH on port 22.

Since we're using cert-manager with Ingress to get Let's Encrypt certs, we're not using the Let's Encrypt functionality that's part of Gitea. To run this we need to change the config file: have Gitea run on port 3000 as HTTP and disable all the Let's Encrypt config keys. Currently `gitea-ingress.yaml` uses the `letsencrypt-staging` ClusterIssuer.
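For reference, the `letsencrypt-staging` ClusterIssuer could look like this sketch (written against the current `cert-manager.io/v1` schema rather than whatever API version was current at the time; the email address, secret name, and ingress class are assumptions):

```yaml
# Sketch of a staging ClusterIssuer for cert-manager
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # Let's Encrypt staging endpoint: untrusted certs, but generous rate limits
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: haproxy
```

Switching to the production issuer would then only mean pointing `server` at the production ACME directory and referencing a different issuer name from the Ingress annotation.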

This has been tested on a local Kubernetes cluster using Docker for Mac

Owner

Next step is testing with a fresh Gitea on GKE then?

Poster
Owner

Almost there; now I have a permission issue running this with a fresh Gitea on GKE inside its own namespace. Reading up on RBAC to fix the error I'm getting on the ingress pod (`service with name gitea-test/ingress-default-backend found: services "ingress-default-backend" is forbidden: User "system:serviceaccount:gitea-test:default" cannot get resource "services" in API group "" in the namespace "gitea-test"`)

Poster
Owner

I fixed my RBAC troubles: I was missing the `serviceAccountName` in the deployment, see https://github.com/jcmoraisjr/haproxy-ingress/tree/master/examples/rbac#usage.

Now the ingress works for SSH on port 22, but the HTTP ports are closed; let's see what is causing this.
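For context, the fix was a single line in the controller Deployment; a minimal sketch (deployment name, labels, ServiceAccount name, and image are assumptions — the ServiceAccount and its role bindings come from the linked RBAC example):

```yaml
# Sketch: controller Deployment referencing the RBAC ServiceAccount
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-ingress
  namespace: gitea-test
spec:
  replicas: 1
  selector:
    matchLabels:
      run: haproxy-ingress
  template:
    metadata:
      labels:
        run: haproxy-ingress
    spec:
      # Without this line the pod runs as the namespace's "default"
      # ServiceAccount, which has none of the RBAC grants, hence the
      # "cannot get resource services" error above
      serviceAccountName: ingress-controller
      containers:
        - name: haproxy-ingress
          image: quay.io/jcmoraisjr/haproxy-ingress
```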

Poster
Owner

OK, now I finally understand what I was missing. You need more moving parts than I thought when running your own ingress controller (which is needed for SSH; the GCE and nginx ingresses built into GKE are for HTTP(S) only).

You have an ingress pod, and the ingress controller automatically pushes the config to it. Then you use a NodePort to expose the deployment locally, and finally a LoadBalancer that connects to the NodePort. I'm going to make a diagram as part of our documentation. This is very different from using a built-in ingress on GKE, where you don't run your own controller or need the NodePort and LoadBalancer; you just get a public IP directly.
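Those last two moving parts can be collapsed into one manifest: in Kubernetes, a Service of type `LoadBalancer` is built on top of NodePort, so a single Service in front of the controller pods covers both. A sketch (service name, labels, and the assumption that SSH is forwarded alongside HTTP(S)):

```yaml
# Sketch: cloud load balancer fronting the ingress controller pods
apiVersion: v1
kind: Service
metadata:
  name: haproxy-ingress
spec:
  # type: LoadBalancer implicitly allocates a NodePort on every node
  # and points the cloud load balancer at those node ports
  type: LoadBalancer
  selector:
    run: haproxy-ingress
  ports:
    - name: http
      port: 80
    - name: https
      port: 443
    - name: ssh
      port: 22
```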

Owner

The entire point of using Ingress on GKE is so we don't have to pay for load balancers. So if it's just more complex, but incurring the same cost, then what's the benefit of Ingress for Gitea?

Everything else doesn't need SSH, so the built-in Ingress sounds much more important to look at first imo.

Poster
Owner

The benefit is that you end up with only one load balancer instead of one for each service. And yes, the built-in ingress is perfect for everything that doesn't need SSH

Owner

If we don't need one for normal Ingress (without SSH), then we only need one for Gitea, which leaves us with the same one load balancer that we have without setting up Ingress for it.

So are you saying it's needed for Ingress either way? It's not clear from what you wrote, but it sounds like it's only needed for the SSH one.

Poster
Owner

A LoadBalancer is needed on GKE when using an Ingress controller that's not built-in. The built-in Ingress controller is for HTTP(S), and can serve traffic directly without a LoadBalancer. Here is a tutorial about deploying ingress-nginx on GKE, with a LoadBalancer in front of the ingress controller (https://cloud.google.com/community/tutorials/nginx-ingress-gke). And the docs about HTTP(S) load balancing with Ingress on GKE: https://cloud.google.com/kubernetes-engine/docs/concepts/ingress

However, I'm not sure Ingress is charged any differently from a LoadBalancer on GKE. The pricing page (https://cloud.google.com/compute/pricing#lb) only mentions "ingress" as in inbound traffic/data, not the Ingress resource. What appears to be charged for is forwarding rules.

Where did you see that Ingress was cheaper than a LoadBalancer on GKE?

Owner

I assume this is a very long-winded way of saying "yes, a load balancer is indeed necessary in front of nginx-ingress"?

From all I read so far, it always sounded like Ingress is free (except for normal pod resource usage), and that that's also one of the reasons why people use it. I mean, that's the entire premise for us looking at it in the first place. We talked about this in person on so many occasions, that I really have no idea where this sudden "what, you don't know Ingress costs the same?" question is coming from.

Owner

I think I read about it here: https://www.doxsey.net/blog/kubernetes--the-surprisingly-affordable-platform-for-personal-projects

But it does indeed use a custom setup and not Ingress, and it also says that Ingress costs the same as a load balancer.

So then I don't know what all the fuss is about. If we need one load balancer for every Ingress, then why use Ingress at all?

Poster
Owner

The GKE pricing is really confusing; I'm trying to find a definitive answer on the Ingress pricing. My previous understanding was that you can run one Ingress to replace multiple LoadBalancers too.

I've just found this [blog article](https://ryanmccue.ca/4-simple-ways-to-help-save-on-your-gke-bill/) that says using an Ingress saves money because you can have multiple services behind one Ingress, as opposed to one LoadBalancer per service, but it doesn't cite a source. I am contacting GKE support to ask them how they charge for their own Ingress controller, and for running our own.

Right, in that article they're using a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) to run nginx in pods directly, listening on all network interfaces using `hostNetwork: true`.

I am now checking if I can use `hostNetwork: true` on the ingress controller itself, together with a firewall rule. If that works we can do without a LoadBalancer.
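A minimal sketch of what I'm testing (DaemonSet name, labels, ServiceAccount, and image are assumptions):

```yaml
# Sketch: ingress controller as a DaemonSet on the host network
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: haproxy-ingress
spec:
  selector:
    matchLabels:
      run: haproxy-ingress
  template:
    metadata:
      labels:
        run: haproxy-ingress
    spec:
      # Bind directly to each node's network interfaces, so no
      # LoadBalancer (or NodePort) Service is needed in front;
      # a GCP firewall rule must still allow the traffic in
      hostNetwork: true
      serviceAccountName: ingress-controller
      containers:
        - name: haproxy-ingress
          image: quay.io/jcmoraisjr/haproxy-ingress
```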

Poster
Owner

I got haproxy-ingress to work without a LoadBalancer on GKE, using a DaemonSet for the Ingress controller. That way the haproxy Ingress controller runs on every Kubernetes node (4 in our case) and is accessible from the outside through each node's public IP

Owner

I think now you lost the purpose of it. If you put it on 4 different nodes, then you don't need Ingress at all. The point of Ingress is routing traffic to services, no? No need for Ingress if you're just doing DNS round-robin to all nodes of the cluster. But with all the drawbacks of having to have all those nodes be online 100% of the time.

The way I understand this now is that you use Ingress *instead* of a load balancer (literally what the article says), and the benefits of that are:

  1. As you can define your own routing rules for Ingress, you can use a single one to route to all your services via the rules.
  2. It can route dynamically to *services* instead of nodes, so all the normal k8s high-availability features are in effect.

Does that make sense?

Owner

Oh, and Let's Encrypt of course. And a service can still be nginx for example, which in turn can do more fancy stuff (possibly needed for LE?).

Poster
Owner

Running a LoadBalancer in front of the ingress controller is one solution that I found works on GKE, I did not say it was the only way

> I think now you lost the purpose of it. If you put it on 4 different nodes, then you don't need Ingress at all. The point of Ingress is routing traffic to services, no? No need for Ingress if you're just doing DNS round-robin to all nodes of the cluster. But with all the drawbacks of having to have all those nodes be online 100% of the time.

If you put what on 4 different nodes? I don't understand this whole paragraph. What I have done is deploy the haproxy-ingress controller to all nodes as a DaemonSet, making it accessible on the 4 nodes' public IPs, routing traffic to the gitea-server service when a specific DNS name is used.

Once you have an Ingress controller running and have added an Ingress resource (which configures the controller), you still need to expose the controller using a public IP. Now I have that working with a DaemonSet, so we don't need a LoadBalancer. There is probably another way.

Here are some docs about exposing an Ingress controller behind either a DaemonSet or a LoadBalancer: https://github.com/nginxinc/kubernetes-ingress/blob/master/docs/installation.md#4-get-access-to-the-ingress-controller

> The way I understand this now is that you use Ingress instead of a load balancer (literally what the article says), and the benefits of that being that:
>
> * As you can define your own routing rules for Ingress, you can use a single one to route to all your services via the rules.
> * It can route dynamically to services instead of nodes, so all the normal k8s high-availability features are in effect.
>
> Does that make sense?

The first part makes sense to me, not the second one. I think a LoadBalancer also routes to services

Poster
Owner

Update: SSH doesn't work with a DaemonSet in front of the haproxy-ingress controller; the nodes already have an SSH daemon running on port 22.

So I would say that for Gitea, the current setup cannot be improved by using Ingress...

Poster
Owner

I pushed docs about Ingress, based on the successful test deployment of Gitea that I've done (https://gitea-test.kosmos.org/); here they are: 3cdc07cdf3

This will be useful once we deploy more HTTP(S) services on GKE, but for Gitea itself it will not be useful since we need TCP (SSH) support.

Adding a new HTTP(S) service will consist of adding a DNS entry for the desired host, deploying the service behind a NodePort, and pointing an Ingress entry for the host to that NodePort. Since we can use one Ingress for multiple services, that will also help us keep the number of port forwardings below 5, the limit after which extra forwarding rules are charged (https://cloud.google.com/compute/pricing#lb).
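To illustrate, adding a hypothetical `myapp` service behind the shared controller would then only require an Ingress rule like this sketch (hostname, service name, and the current `networking.k8s.io/v1` schema are assumptions):

```yaml
# Sketch: one more host routed through the shared Ingress controller
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-staging
spec:
  rules:
    - host: myapp.example.org
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp   # the NodePort Service for the new deployment
                port:
                  number: 80
  tls:
    - hosts:
        - myapp.example.org
      secretName: myapp-tls   # created and renewed by cert-manager
```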

raucao closed this pull request 2 years ago