3 Commits

Author SHA1 Message Date
9bf21e8317 Merge pull request 'Slow down Gitea 404s to mess with scrapers/bots' (#626) from chore/gitea_scraping into master
Reviewed-on: #626
Reviewed-by: Greg <greg@kosmos.org>
2026-04-11 17:08:16 +00:00
aaed9a56d1 Slow down Gitea 404s to mess with scrapers/bots
Seems to have helped quite a lot for dealing with AI scrapers using
up all available server resources
2026-04-11 15:37:38 +04:00
41e6b29b97 Add AGENTS.md
2026-04-11 15:36:54 +04:00
2 changed files with 56 additions and 0 deletions

41
AGENTS.md Normal file

@@ -0,0 +1,41 @@
# AGENTS.md
Welcome, AI Agent! This file contains essential context and rules for interacting with the Kosmos Chef repository. Read this carefully before planning or executing any changes.
## 🏢 Project Overview
This repository contains the infrastructure automation code used by Kosmos to provision and configure bare metal servers (KVM hosts) and Ubuntu virtual machines (KVM guests).
We use **Chef Infra**, managed locally via **Knife Zero** (agentless Chef), and **Berkshelf** for dependency management.
## 📂 Directory Structure & Rules
* **`site-cookbooks/`**: 🟢 **EDITABLE.** This directory contains all custom, internal cookbooks written specifically for Kosmos services (e.g., `kosmos-postgresql`, `kosmos_gitea`, `kosmos-mastodon`). *Active development happens here.*
* **`cookbooks/`**: 🔴 **DO NOT EDIT.** This directory contains third-party/community cookbooks that are vendored. These are managed by Berkshelf. Modifying them directly will result in lost changes.
* **`roles/`**: 🟢 **EDITABLE.** Contains Chef roles written in Ruby (e.g., `base.rb`, `kvm_guest.rb`, `postgresql_primary.rb`). These define run-lists and role-specific default attributes for servers.
* **`environments/`**: Contains Chef environment definitions (like `production.rb`).
* **`data_bags/`**: Contains data bag configurations, often encrypted. Be cautious and do not expose secrets. (Note: Agents should not manage data bag secrets directly unless provided with the `.chef/encrypted_data_bag_secret`).
* **`nodes/`**: Contains JSON state files for bootstrapped nodes. *Agents typically do not edit these directly unless cleaning up a deleted node.*
* **`Berksfile`**: Defines community cookbook dependencies.
* **`Vagrantfile` / `.kitchen/`**: Used for local virtualization and integration testing.
## 🛠️ Tooling & Workflows
1. **Dependency Management (Berkshelf)**
   If a new community cookbook is required:
   - Add it to the `Berksfile` at the root.
   - Instruct the user to run `berks install` and `berks vendor cookbooks/ --delete` (or run them via the `bash` tool if permitted).
2. **Provisioning (Knife Zero)**
   - Bootstrapping and converging nodes is done using `knife zero`.
   - *Example:* `knife zero converge name:server-name.kosmos.org`
3. **Code Style & Conventions**
   - Chef recipes, resources, and roles are written in **Ruby**.
   - Follow standard Chef and Ruby (RuboCop) idioms. Look at neighboring files in `site-cookbooks/` or `roles/` to match formatting and naming conventions.
## 🚨 Core Directives for AI Agents
1. **Infrastructure as Code**: Manual server configurations are highly discouraged. All changes must be codified in a cookbook or role.
2. **Test Safety Nets**: Look for a `.kitchen.yml` within the relevant `site-cookbooks/<name>` directory to find out whether local integration tests are available.
3. **No Assumptions**: Do not assume standard test commands. Check `README.md` and repository config files first.
4. **Secret Handling**: Avoid hardcoding passwords or API keys in recipes or roles. Assume sensitive information is managed via Chef `data_bags`.
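
To illustrate the `roles/` entry above, a role file that defines a run-list and role-specific default attributes could look roughly like the following. This is a minimal sketch and not a file from this repository: the role name, the `kosmos_example` recipe, and the attribute keys are hypothetical placeholders. Only `role[base]` refers to a role the text actually mentions.

```ruby
# roles/example_web.rb -- hypothetical role, shown for illustration only
name "example_web"
description "Illustrative role: a run-list plus role-specific default attributes"

# Items are applied in order; role[base] refers to the base.rb role
# mentioned above, the recipe name is a placeholder.
run_list(
  "role[base]",
  "recipe[kosmos_example::default]"
)

# Role defaults can still be overridden by higher-precedence attribute
# levels (for example normal or override attributes).
default_attributes(
  "kosmos_example" => {
    "port" => 8080
  }
)
```

A file like this is evaluated by Chef rather than run directly, so it would only take effect once a node's run-list references the role and the node is converged.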


@@ -18,6 +18,8 @@ server {
  client_max_body_size 121M;
  proxy_intercept_errors on;
  location ~ ^/(avatars|repo-avatars)/.*$ {
    proxy_buffers 1024 8k;
    proxy_pass http://_gitea_web;
@@ -52,5 +54,18 @@ server {
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    error_page 404 = @slow_404;
  }
  # Slow down 404 responses to make scraping random URLs less attractive
  location @slow_404 {
    internal;
    default_type text/plain;
    content_by_lua_block {
      ngx.sleep(10)
      ngx.status = 404
      ngx.say("Not Found")
      ngx.exit(ngx.HTTP_NOT_FOUND)
    }
  }
}
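
For a rough sense of why the 10-second `ngx.sleep` makes guessing random URLs less attractive, the cost to a client can be estimated with simple arithmetic. The sketch below is not part of this change set. It assumes every miss is held for the full delay and that the client keeps a fixed number of requests in flight, and the method name and parameters are made up for illustration.

```ruby
# Lower bound on the wall-clock seconds a client spends waiting on 404s.
# Assumptions (hypothetical model, not measured behaviour):
#   - every miss is held for `delay` seconds (10 in the config above)
#   - the client keeps at most `concurrency` requests in flight at once
def min_seconds_on_404s(misses, concurrency, delay = 10)
  raise ArgumentError, "concurrency must be positive" unless concurrency.positive?

  # Misses are served in waves of `concurrency`; each wave costs one delay.
  waves = (misses.to_f / concurrency).ceil
  waves * delay
end

puts min_seconds_on_404s(1_000, 10) # prints 1000
```

Under these assumptions, a scraper probing 1,000 nonexistent URLs over 10 connections waits at least 1,000 seconds, compared with close to nothing when 404s return immediately. On the server side, `ngx.sleep` yields instead of blocking the nginx worker, so most of the cost lands on the client.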