
Setup Google Font reverse proxy cache for privacy reasons - with Apache's mod_cache at http://google-font-proxy.reyboz.it
Closed, Resolved | Public | 3 Points

Description

Problem

There is no reason in the world to give Google the IP addresses of each of our visitors just to serve cute fonts.

Solution

For privacy reasons we set up a Google Fonts reverse proxy cache with Apache's mod_cache.

This is a drop-in replacement for Google Fonts. For example, you can update your URLs right now from fonts.googleapis.com to google-font-proxy.reyboz.it.
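
As a rough sketch of that replacement, the host can be swapped with sed. The function name rewrite_font_host and the sample <link> tag are illustrative and not part of this task; only the two hostnames come from the description.

```shell
#!/bin/sh
# Rewrite Google Fonts stylesheet URLs so they point at the proxy.
# PROXY_HOST is the hostname from this task; adapt it to your own proxy.
PROXY_HOST="google-font-proxy.reyboz.it"

rewrite_font_host() {
	# Reads text on stdin and writes the rewritten text to stdout.
	sed "s|//fonts\.googleapis\.com/|//${PROXY_HOST}/|g"
}

# Example: a typical stylesheet link (illustrative input)
printf '%s\n' '<link href="https://fonts.googleapis.com/css2?family=Roboto" rel="stylesheet">' \
	| rewrite_font_host
```

The pattern only touches fonts.googleapis.com; links to fonts.gstatic.com inside the returned CSS are rewritten by the proxy itself.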

See it in action:

The nice part is that, after the first request, the fonts are literally stored locally. We can do this because most Google Fonts are released under a Free license.

NOTE: Yes, different user agents should receive different font files. If this does not happen, feel free to file a Task.
WARNING: If you notice a problem with the licenses, file a Task. There shouldn't be any, because we use Free fonts.

VirtualHost

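apt install apache2
# "CacheEnable disk" needs the cache_disk storage provider
a2enmod cache cache_disk
# needed by ProxyPass, SSLProxyEngine and Header/RequestHeader
a2enmod proxy proxy_http ssl headers
a2enmod substitute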
/etc/apache2/sites-available/reyboz.google-font-proxy.conf
#
# See https://gitpull.it/T776
#

<VirtualHost *:80>

	ServerName google-font-proxy.reyboz.it

	Include /etc/apache2/my-includes/google-font-proxy.conf

</VirtualHost>

<VirtualHost *:443>

	ServerName google-font-proxy.reyboz.it

	SSLEngine on
	# fullchain.pem already bundles the certificate and its chain
	# (SSLCertificateChainFile is obsolete since Apache 2.4.8)
	SSLCertificateFile      /etc/letsencrypt/live/google-font-proxy.reyboz.it/fullchain.pem
	SSLCertificateKeyFile   /etc/letsencrypt/live/google-font-proxy.reyboz.it/privkey.pem

	Include /etc/apache2/my-includes/google-font-proxy.conf
	Include /etc/apache2/my-includes/ssl-hardening.conf
</VirtualHost>
/etc/apache2/my-includes/google-font-proxy.conf
#
# This is a shared Apache HTTPd configuration for a Google Font proxy
# License: public domain (CC 0)
# Author: Valerio Bozzolan (2021)
#
# https://gitpull.it/T776
#

# where you want to put a nice homepage
DocumentRoot /home/www-data/reyboz.it/google-font-proxy/www

#
# Your canonical domain
#
# useful to share the same cache entries across different hostnames:
# if you reuse this file for multiple ServerName/ServerAlias, you save some kilobytes
CacheKeyBaseURL "http://google-font-proxy.reyboz.it/"

# allow to proxy via https://
SSLProxyEngine On

# fix mod_cache for proxies
CacheQuickHandler off

# disable useless and slow features (.htaccess lookups);
# AllowOverride only takes effect inside <Directory>, not <Location>
<Directory />
	AllowOverride none
</Directory>

# forward static font files and cache locally
<Location /s/>

	# hi Google! Give me the font file
	ProxyPass        https://fonts.gstatic.com/s/
	ProxyPassReverse https://fonts.gstatic.com/s/

	# cache future requests
#	CacheEnable disk
</Location>

<LocationMatch "^/(?<SUFFIX>css|css2|icon)$">

	# append the value "User-Agent" to the Vary HTTP header
	# but only if the dont-vary environment is not set.
	# don't know why we have to check the dont-vary env
	Header append Vary User-Agent env=!dont-vary

	# hi Google! Give me the CSS file
	ProxyPass        https://fonts.googleapis.com
	ProxyPassReverse https://fonts.googleapis.com

	#
	# Replace some URLs to fonts.gstatic.com
	#
	# Note that Substitute will not work without the next line
	# https://stackoverflow.com/questions/32603182/apache-httpd-substitute-wont-work
	#
	AddOutputFilterByType INFLATE;SUBSTITUTE;DEFLATE text/css
	Substitute "s|https://fonts.gstatic.com/s/|/s/|n"

	# cache future requests
	CacheEnable disk

</LocationMatch>

#
# Forward JavaScript requests to Google and cache locally
#
# Example:
#  https://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js
#
<Location /ajax>

	# cache future requests
	CacheEnable disk
	CacheHeader on

	# hi Google! Give me the JavaScript file
	ProxyPass        https://ajax.googleapis.com/ajax
	ProxyPassReverse https://ajax.googleapis.com/ajax

</Location>

# cache even with Cache-Control: private
CacheStorePrivate On

# please cache also these cases or it will not work
CacheIgnoreNoLastMod    On
CacheIgnoreCacheControl On

# add X-Cache with HIT or REVALIDATE or MISS
# useful for debug purposes
CacheHeader on

# remove "Link: <https://fonts.gstatic.com>; rel=preconnect; crossorigin"
Header unset Link

#
# Cache directory root
#
# no need to change the default cache pathname in Debian
#
# Check if the system is cleaning it with:
##  systemctl status apache-htcacheclean
#
#CacheRoot /var/cache/apache2/mod_cache_disk

# optionally enable verbose debugging
#LogLevel debug
#CustomLog "/var/log/apache2/cached-requests.log" common env=cache-hit
#CustomLog "/var/log/apache2/uncached-requests.log" common env=cache-miss
#CustomLog "/var/log/apache2/revalidated-requests.log" common env=cache-revalidate
#CustomLog "/var/log/apache2/invalidated-requests.log" common env=cache-invalidate
#LogFormat "%{cache-status}e " cachelog
#CustomLog /var/log/apache2/cachelog.log cachelog
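
To check the X-Cache header added by "CacheHeader on", something like the following may help. It is only a sketch: the helper name cache_status is made up here, and it assumes the header has the usual mod_cache shape "X-Cache: HIT from <hostname>". The sample response is canned so the snippet runs offline.

```shell
#!/bin/sh
# Extract the cache status (HIT, MISS or REVALIDATE) from HTTP response
# headers read on stdin. The helper name is hypothetical.
cache_status() {
	# Strip carriage returns, then print the first word after "X-Cache:"
	tr -d '\r' | awk 'tolower($1) == "x-cache:" { print $2; exit }'
}

# Demo with a canned response instead of a live request
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/css\r\nX-Cache: MISS from google-font-proxy.reyboz.it\r\n\r\n' \
	| cache_status
```

Against the live proxy, pipe real headers into it instead, for example: curl -s -o /dev/null -D - 'https://google-font-proxy.reyboz.it/css2?family=Roboto' | cache_status. A first request would normally report MISS, and a repeated one should report HIT if caching works.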

Proposed extra, still to be tested, by @andreaganduglia: instead of forwarding the original User-Agent, send only one of a few fixed ones (also removing the Accept-Language header and any Cookie or Referer header):

ProxyAddHeaders Off
RequestHeader unset Cookie
RequestHeader unset Accept-Language
RequestHeader unset Referer

# Generic User-Agent
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.0.0 Safari/537.36"

SetEnvIf User-Agent ^(.*)(Chrome/)[0-9]{1,}(.*)$ ua_chrome
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36" env=ua_chrome

SetEnvIf User-Agent ^(.*)(Firefox/)[0-9]{1,}(.*)$ ua_firefox
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0" env=ua_firefox

# Safari versions are dotted (e.g. Version/15.6), so dots must be allowed
SetEnvIf User-Agent ^(.*)(Version/)[0-9.]{1,}\s+Safari/(.*)$ ua_safari
RequestHeader set User-Agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6 Safari/605.1.15" env=ua_safari

SetEnvIf User-Agent ^(.*)(MSIE)\s+[0-9]{1,}(.*)$ ua_ie
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Trident/7.0; rv:11.0) like Gecko" env=ua_ie

TODO: Check if the "referer" is normally sent to Google Fonts in both Firefox and Chrome with default settings.
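
The reduction to a handful of User-Agent buckets can be sketched outside Apache, which makes the patterns easier to try against sample strings. This is an approximation, not the Apache configuration itself: the function name ua_bucket is invented here, grep -E stands in for the regex engine used by SetEnvIf, and the Safari pattern assumes dotted versions such as Version/15.6.

```shell
#!/bin/sh
# Classify a User-Agent string into one of the fixed buckets.
# Checks run in the same order as the RequestHeader lines, so a later
# match overrides an earlier one. The function name is hypothetical.
ua_bucket() {
	bucket=generic
	printf '%s' "$1" | grep -Eq 'Chrome/[0-9]+' && bucket=chrome
	printf '%s' "$1" | grep -Eq 'Firefox/[0-9]+' && bucket=firefox
	printf '%s' "$1" | grep -Eq 'Version/[0-9.]+[[:space:]]+Safari/' && bucket=safari
	printf '%s' "$1" | grep -Eq 'MSIE[[:space:]]+[0-9]+' && bucket=ie
	printf '%s\n' "$bucket"
}

# Try a few sample strings
ua_bucket 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
ua_bucket 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6 Safari/605.1.15'
ua_bucket 'curl/7.74.0'
```

Anything that matches no pattern stays in the generic bucket, which corresponds to the unconditional "Generic User-Agent" line.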

Let's Encrypt

$ certbot certonly --webroot --webroot-path=/var/www/html -d google-font-proxy.reyboz.it
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Obtaining a new certificate
Performing the following challenges:
http-01 challenge for google-font-proxy.reyboz.it
Using the webroot path /var/www/html for all unmatched domains.
Waiting for verification...
Cleaning up challenges

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/google-font-proxy.reyboz.it/fullchain.pem
   Your key file has been saved at:
   /etc/letsencrypt/live/google-font-proxy.reyboz.it/privkey.pem
   Your cert will expire on 2021-06-29. To obtain a new or tweaked
   version of this certificate in the future, simply run certbot
   again. To non-interactively renew *all* of your certificates, run
   "certbot renew"
 - If you like Certbot, please consider supporting our work by:

   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le

Cache directory

Already created by Debian here:

/var/cache/apache2/mod_cache_disk

Systemd service

This service is needed for a general cleanup of the cache directory to keep it at a maximum size.

It was already provided by Debian:

systemctl enable --now apache-htcacheclean
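
To see whether the cleanup keeps the directory within bounds, the disk usage can be checked by hand. A minimal sketch, assuming the Debian default path mentioned above; the helper name cache_usage_kb is hypothetical.

```shell
#!/bin/sh
# Print the size of a cache directory in kilobytes.
# Defaults to the Debian mod_cache_disk path from this task.
cache_usage_kb() {
	dir="${1:-/var/cache/apache2/mod_cache_disk}"
	du -sk "$dir" | awk '{ print $1 }'
}

# Demo on a throwaway directory, so this runs without Apache installed
demo_dir=$(mktemp -d)
printf 'dummy cached body\n' > "$demo_dir/entry"
cache_usage_kb "$demo_dir"
rm -rf "$demo_dir"
```

On the real server, calling cache_usage_kb without arguments reads the default path and may need root privileges.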

Event Timeline

valerio.bozzolan created this task.
valerio.bozzolan renamed this task from "Setup Google Font reverse proxy with mod_cache for privacy reasons" to "Setup Google Font reverse proxy with mod_cache for privacy reasons at http://google-font-proxy.reyboz.it". (Mar 31 2021, 07:59)
valerio.bozzolan closed this task as Resolved.
valerio.bozzolan updated the task description.
valerio.bozzolan set the point value for this task to 3.
valerio.bozzolan renamed this task from "Setup Google Font reverse proxy with mod_cache for privacy reasons at http://google-font-proxy.reyboz.it" to "Setup Google Font reverse proxy cache for privacy reasons - with Apache's mod_cache at http://google-font-proxy.reyboz.it". (Apr 1 2021, 12:37)
valerio.bozzolan updated the task description.
valerio.bozzolan updated the task description.

I've updated the configuration, since Google Fonts now has both /css and /css2.

Hello, by adding these lines you can strip the X-Forwarded-For, X-Forwarded-Host and X-Forwarded-Server headers, as well as Cookie, Accept-Language and Referer, and reduce the User-Agent granularity down to four fixed values.

ProxyAddHeaders Off
RequestHeader unset Cookie
RequestHeader unset Accept-Language
RequestHeader unset Referer

# Generic User-Agent
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.0.0 Safari/537.36"

SetEnvIf User-Agent ^(.*)(Chrome/)[0-9]{1,}(.*)$ ua_chrome
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36" env=ua_chrome

SetEnvIf User-Agent ^(.*)(Firefox/)[0-9]{1,}(.*)$ ua_firefox
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0" env=ua_firefox

# Safari versions are dotted (e.g. Version/15.6), so dots must be allowed
SetEnvIf User-Agent ^(.*)(Version/)[0-9.]{1,}\s+Safari/(.*)$ ua_safari
RequestHeader set User-Agent "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6 Safari/605.1.15" env=ua_safari

SetEnvIf User-Agent ^(.*)(MSIE)\s+[0-9]{1,}(.*)$ ua_ie
RequestHeader set User-Agent "Mozilla/5.0 (Windows NT 10.0; Trident/7.0; rv:11.0) like Gecko" env=ua_ie

@andreaganduglia Thank you so much for this contribution! Is this something you made up or did you take your cue from somewhere? Feel free to link any source!

No source. I'm the source. It's my solution to fix some issues in your great work. This way the proxy headers are disabled and no user details are sent to Google Fonts. Those four UAs are enough to use the Google Fonts API, while the real UA stays hidden.

Thank you! Can I assume your contributions are released into the public domain (CC 0), credited to @andreaganduglia?

Yes, they are. You can merge them into your code if you want.