Proxy testing with IP Namespaces and GitLab CI/CD

CC-BY-NC https://www.flickr.com/photos/thomashawk/106559730

At work, I have a CLI tool I’ve been working on. It talks to the web and is used by customers all over the planet, some of them on networks with tighter restrictions than my own. Often those customers have an HTTP proxy of some sort and that means the CLI application needs to negotiate with it differently than it would directly with a web server.

So I need to test it somehow with a proxy environment. Installing a proxy service like Squid doesn’t sound like too big a deal but it needs to run in several configurations, at a very minimum these three:

  • no-proxy
  • authenticating HTTP proxy
  • non-authenticating HTTP proxy

I’m going to ignore HTTPS proxies for now as they’re not actually a common configuration for customers, but I reckon it’s possible to do with mkcert or Let’s Encrypt without too much work.

There are two other useful pieces of information to cover. Firstly, I use GitLab CI to run the CI/CD test jobs for the three proxy configurations in parallel. Secondly, and this is important, I must make sure that, once the test Squid proxy service is running, the web requests in the test only pass through the proxy and do not leak out of the GitLab runner. I can do this by using a really neat Linux feature called network namespaces, managed with the ip netns command.

Network namespaces allow me to set up different network environments on the same machine, similar to IP subnets or AWS security groups. Then I can launch specific processes in those namespaces, and network access from those processes will be limited by the configuration of the namespace. That is to say, the Squid proxy can have full access but the test process can only talk to the proxy. Cool, right?

The GitLab CI/CD YAML looks like this (edited to protect the innocent):

stages:
  - integration

.integration_common: &integration_common |
  apt-get update
  apt-get install -y iproute2

.network_ns: &network_ns |
  ip netns add $namespace
  ip link add v-eth1 type veth peer name v-peer1
  ip link set v-peer1 netns $namespace
  ip addr add 192.168.254.1/30 dev v-eth1
  ip link set v-eth1 up
  ip netns exec $namespace ip addr add 192.168.254.2/30 dev v-peer1
  ip netns exec $namespace ip link set v-peer1 up
  ip netns exec $namespace ip link set lo up
  ip netns exec $namespace ip route add default via 192.168.254.1

noproxynoauth-cli:
  image: ubuntu:18.04
  stage: integration
  script:
    - *integration_common
    - test/end2end/cli

proxyauth-cli:
  image: ubuntu:18.04
  stage: integration
  script:
    - *integration_common
    - apt-get install -y squid apache2-utils
    - mkdir -p /etc/squid3
    - htpasswd -cb /etc/squid3/passwords testuser testpass
    - *network_ns
    - squid3 -f test/end2end/conf/squid.conf.auth && sleep 1 || tail -20 /var/log/syslog | grep squid
    - http_proxy=http://testuser:testpass@192.168.254.1:3128/ https_proxy=http://testuser:testpass@192.168.254.1:3128/ ip netns exec $namespace test/end2end/cli
    - ip netns del $namespace || true
  variables:
    namespace: proxyauth

proxynoauth-cli:
  image: ubuntu:18.04
  stage: integration
  script:
    - *integration_common
    - apt-get install -y squid
    - *network_ns
    - squid3 -f test/end2end/conf/squid.conf.noauth && sleep 1 || tail -20 /var/log/syslog | grep squid
    - http_proxy=http://192.168.254.1:3128/ https_proxy=http://192.168.254.1:3128/ ip netns exec $namespace test/end2end/cli
    - ip netns del $namespace || true
  variables:
    namespace: proxynoauth

So, besides the stages declaration, there are five blocks here: three jobs in the integration stage and two common script blocks. The first common script block installs iproute2, which gives us the ip command.

The second script block is where the magic happens. It configures a virtual, routed subnet in the parameterised $namespace.
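If the /30 looks a bit arbitrary, here's a quick sanity check of the addressing using Python's standard ipaddress module. It's only an illustrative sketch and not part of the pipeline; the variable names are mine.

```python
import ipaddress

# The subnet shared by the two ends of the veth pair in the CI script.
subnet = ipaddress.ip_network("192.168.254.0/30")

# A /30 holds four addresses: network, two usable hosts and broadcast,
# which is exactly enough for one address at each end of the pair.
hosts = [str(h) for h in subnet.hosts()]

# First host goes on v-eth1 (runner side), second on v-peer1 (namespace side).
host_side, namespace_side = hosts

print(f"runner side:    {host_side}")
print(f"namespace side: {namespace_side}")
```

The namespace's default route points at the runner-side address, which is also where the test finds Squid on port 3128.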

Following that we have the three test jobs corresponding to the three proxy (or not) configurations I listed earlier. Two of them install Squid, and one of those creates a test user for authenticating with the proxy. They all run the test script, which in this case is test/end2end/cli. When those three configs are modularised and laid out like this, with the common network namespace script as well, it provides a good deal of clarity to the test maintainer. I like it a lot.
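For a feel of what the tool under test has to do with those http_proxy values, here's a minimal sketch in Python of splitting a proxy URL into a host, a port and a Basic Proxy-Authorization value. My real CLI isn't shown here, so the proxy_settings function is purely hypothetical, and it assumes Basic auth, which is what the Squid config below uses.

```python
import base64
from urllib.parse import urlsplit, unquote


def proxy_settings(proxy_url):
    """Return (host, port, proxy_authorization) for a proxy URL.

    proxy_authorization is None when the URL carries no credentials.
    """
    parts = urlsplit(proxy_url)
    auth = None
    if parts.username is not None:
        # Credentials in a URL may be percent-encoded, so decode them first.
        user = unquote(parts.username)
        password = unquote(parts.password or "")
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        auth = f"Basic {token}"
    return parts.hostname, parts.port, auth


# The two proxy URLs used by the jobs above.
with_auth = proxy_settings("http://testuser:testpass@192.168.254.1:3128/")
without_auth = proxy_settings("http://192.168.254.1:3128/")
print(with_auth)
print(without_auth)
```

The authenticating job should end up sending a Proxy-Authorization header with each request, and the non-authenticating job should send none.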

So then the last remaining things are the respective squid configurations: proxyauth and proxynoauth. There’s a little bit more junk in these than there needs to be as they’re taken from the stock examples, but they look something like this:

visible_hostname proxynoauth
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 443 # https
acl CONNECT method CONNECT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost manager
http_access deny manager
http_access allow localnet
http_access allow localhost
http_access deny all
http_port 3128

and for authentication:

visible_hostname proxyauth
acl localnet src 10.0.0.0/8 # RFC1918 possible internal network
acl localnet src 172.16.0.0/12 # RFC1918 possible internal network
acl localnet src 192.168.0.0/16 # RFC1918 possible internal network
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 443 # https
acl CONNECT method CONNECT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost manager
http_access deny manager

auth_param basic program /usr/lib/squid3/basic_ncsa_auth /etc/squid3/passwords
auth_param basic realm proxy
acl authenticated proxy_auth REQUIRED

http_access allow authenticated
http_access deny all
http_port 3128

And there you have it – network-restricted proxy testing with different proxy configurations. It’s the first time I’ve used ip netns without it being wrapped up in Docker, LXC, containerd or some other libvirt thing, but the feeling of power from my new-found network-god skills is quite something :)

Be aware that you might need to choose different subnet ranges if your regular LAN conflicts. Please let me know in the comments if you find this useful or if you had to modify things to work in your environment.
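If you're not sure whether a candidate range clashes, a few lines of Python with the standard ipaddress module will tell you. This is just a sketch: the find_conflicts helper and the lan_networks list are made up for illustration, so substitute whatever your runner actually routes.

```python
import ipaddress


def find_conflicts(candidate, in_use):
    """Return the networks from in_use that overlap the candidate subnet."""
    cand = ipaddress.ip_network(candidate)
    return [net for net in in_use if cand.overlaps(ipaddress.ip_network(net))]


# Hypothetical networks already routed on the machine running the tests.
lan_networks = ["192.168.0.0/16", "10.0.0.0/8"]

clash = find_conflicts("192.168.254.0/30", lan_networks)
clear = find_conflicts("172.31.254.0/30", lan_networks)
print(clash)
print(clear)
```

If you do move to another range, remember that the non-authenticating Squid config only allows clients matching the localnet ACL, so the new range needs to fall inside one of those entries or be added to them.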

Apache Forward-Proxy REMOTE_ADDR propagation

I had an interesting problem this morning with the Apache forward-proxy supporting the WTSI sequencing farm.

It would be useful for the intranet service for tracking runs to know which (GA2) sequencer is requesting pages but because they’re on a dedicated subnet they have to use a forward-proxy for fetching pages (and then only from intranet services).

Now I’m very familiar with using the X-Forwarded-For header and the HTTP_X_FORWARDED_FOR environment variable (and their friends), which do something very similar for reverse-proxies. Forward-proxies, though, usually want to disguise the fact there’s an arbitrary number of clients behind them, usually with irrelevant RFC1918 private IP addresses too.

So what I want to do is slightly unusual – take the remote_addr of the client and stuff it into a different header. I could use X-Forwarded-For but it doesn’t feel right. The Via header (the one the ProxyVia directive controls) is also not right here as that’s really for the proxy servers themselves. So, I figured mod_headers on the proxy would allow me to add additional headers to the request, even though it’s forwarded on. Also following a tip I saw here using my favourite mod_rewrite, and after a bit of fiddling, I came up with this:

#########
# copy remote addr to an internal variable
#
RewriteEngine  On
RewriteCond  %{REMOTE_ADDR}  (.*)
RewriteRule   .*  -  [E=SEQ_ADDR:%1]

#########
# set X-Sequencer header from the internal variable
#
RequestHeader  set  X-Sequencer  %{SEQ_ADDR}e

These rules sit in the container managing my proxy, after ProxyRequests and ProxyVia and before a small set of ProxyMatch restrictions.

The RewriteCond traps the contents of the REMOTE_ADDR environment variable (it’s not an HTTP header – it comes from the end of the network socket as determined by the server). The RewriteRule unconditionally copies the last RewriteCond match %1 into a new environment variable SEQ_ADDR. After this mod_headers sets the X-Sequencer request header (for the proxied request) to the value of the SEQ_ADDR environment variable.

This works very nicely though I’d have hoped a more elegant solution would be this:

RequestHeader set X-Sequencer %{REMOTE_ADDR}e

but this doesn’t seem to work and I’m not sure why. Anyway, by comparing $ENV{HTTP_X_SEQUENCER} to a shared lookup table, the sequencing apps running on the intranet can now track which sequencer is making requests. Yay!
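The lookup on the application side is nothing fancy. Here's a rough sketch of the idea in Python rather than the language the real apps are written in; the identify_sequencer function, the addresses and the sequencer names are all invented for illustration, and it assumes a CGI/WSGI-style environment where the X-Sequencer header shows up as HTTP_X_SEQUENCER.

```python
def identify_sequencer(environ, sequencers):
    """Map the X-Sequencer header set by the proxy to a sequencer name.

    Returns None when the header is missing or the address is unknown.
    """
    addr = environ.get("HTTP_X_SEQUENCER", "").strip()
    return sequencers.get(addr)


# Hypothetical shared lookup table: sequencer address -> sequencer name.
SEQUENCERS = {
    "10.10.1.11": "GA2-01",
    "10.10.1.12": "GA2-02",
}

known = identify_sequencer({"HTTP_X_SEQUENCER": "10.10.1.12"}, SEQUENCERS)
unknown = identify_sequencer({"HTTP_X_SEQUENCER": "10.10.9.99"}, SEQUENCERS)
direct = identify_sequencer({}, SEQUENCERS)
print(known, unknown, direct)
```

Because the proxy uses RequestHeader set rather than add, any X-Sequencer value supplied by the client is replaced before the request reaches the intranet service.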