{"id":5,"date":"2008-09-30T14:29:45","date_gmt":"2008-09-30T14:29:45","guid":{"rendered":"http:\/\/psyphi.net\/blog\/?p=5"},"modified":"2010-01-24T17:18:52","modified_gmt":"2010-01-24T17:18:52","slug":"apache-forward-proxy-remote_addr","status":"publish","type":"post","link":"https:\/\/psyphi.net\/blog\/2008\/09\/apache-forward-proxy-remote_addr\/","title":{"rendered":"Apache Forward-Proxy REMOTE_ADDR propagation"},"content":{"rendered":"<p>I had an interesting problem this morning with the <a title=\"Apache Web Server\" href=\"http:\/\/httpd.apache.org\/\">Apache<\/a> forward-proxy supporting the <a title=\"Wellcome Trust Sanger Institute\" href=\"http:\/\/www.sanger.ac.uk\/\">WTSI<\/a> sequencing farm.<\/p>\n<p>It would be useful for the intranet service for tracking runs to know which (GA2) sequencer is requesting pages, but because they&#8217;re on a dedicated subnet they have to use a forward-proxy for fetching pages (and then only from intranet services).<\/p>\n<p>Now I&#8217;m very familiar with using the X-Forwarded-For header and HTTP_X_FORWARDED_FOR environment variable (and their friends), which do something very similar for reverse-proxies, but forward-proxies usually want to disguise the fact there&#8217;s an arbitrary number of clients behind them, usually with irrelevant RFC1918 private IP addresses too.<\/p>\n<p>So what I want to do is slightly unusual &#8211; take the remote_addr of the client and stuff it into a different header. I could use X-Forwarded-For but it doesn&#8217;t feel right. The Via header (as controlled by ProxyVia) is also not right here as that&#8217;s really for the proxy servers themselves. So, I figured mod_headers on the proxy would allow me to add additional headers to the request, even though it&#8217;s forwarded on. 
Also, following a tip I saw <a href=\"http:\/\/www.mail-archive.com\/plug@lists.linux.org.ph\/msg17244.html\">here<\/a> that uses my favourite mod_rewrite, and after a bit of fiddling, I came up with this:<\/p>\n<pre>\r\n<code>#########\r\n# copy remote addr to an internal variable\r\n#\r\nRewriteEngine  On\r\nRewriteCond  %{REMOTE_ADDR}  (.*)\r\nRewriteRule   .*  -  [E=SEQ_ADDR:%1]\r\n\r\n#########\r\n# set X-Sequencer header from the internal variable\r\n# (the trailing \"e\" tells mod_headers to read an environment variable)\r\n#\r\nRequestHeader  set  X-Sequencer  %{SEQ_ADDR}e<\/code>\r\n<\/pre>\n<p>These rules sit in the container managing my proxy, after ProxyRequests and ProxyVia and before a small set of ProxyMatch restrictions.<\/p>\n<p>The RewriteCond traps the contents of the REMOTE_ADDR server variable (it&#8217;s not an HTTP header &#8211; it comes from the far end of the network socket as determined by the server). The RewriteRule unconditionally copies the last RewriteCond match %1 into a new environment variable SEQ_ADDR. After this, mod_headers sets the X-Sequencer request header (for the proxied request) to the value of the SEQ_ADDR environment variable.<\/p>\n<p>This works very nicely, though I&#8217;d hoped a more elegant solution would be this:<\/p>\n<pre>\r\n<code>RequestHeader set X-Sequencer %{REMOTE_ADDR}e<\/code>\r\n<\/pre>\n<p>but this doesn&#8217;t seem to work and I&#8217;m not sure why; my best guess is that mod_headers only reads the request&#8217;s environment table, and REMOTE_ADDR (unlike SEQ_ADDR) hasn&#8217;t been placed there at that point. Anyway, by comparing $ENV{HTTP_X_SEQUENCER} to a shared lookup table, the sequencing apps running on the intranet can now track which sequencer is making requests. Yay!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I had an interesting problem this morning with the Apache forward-proxy supporting the WTSI sequencing farm. 
It would be useful for the intranet service for tracking runs to know which (GA2) sequencer is requesting pages but because they&#8217;re on a dedicated subnet they have to use a forward-proxy for fetching pages (and then only from &hellip; <a href=\"https:\/\/psyphi.net\/blog\/2008\/09\/apache-forward-proxy-remote_addr\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Apache Forward-Proxy REMOTE_ADDR propagation&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[17],"tags":[6,7,8,1080],"class_list":["post-5","post","type-post","status-publish","format-standard","hentry","category-sysadmin","tag-apache","tag-mod_rewrite","tag-proxy","tag-webdev"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/posts\/5","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/comments?post=5"}],"version-history":[{"count":8,"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/posts\/5\/revisions"}],"predecessor-version":[{"id":24,"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/posts\/5\/revisions\/24"}],"wp:attachment":[{"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/media?parent=5"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/categories
?post=5"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/psyphi.net\/blog\/wp-json\/wp\/v2\/tags?post=5"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}