With two young children starting to make increasing use of the Internet, my attention has turned in recent times to the thorny subject of Content Filtering. This post looks at the technical approach I settled upon, although one cannot help mentioning, at least in passing, some of the wider issues involved.
As a parent I do not believe in raising children in some sort of bubble, totally devoid of anything that could possibly “harm” them. That applies to the Internet too – my hope is to raise children who are able to understand and deal with things, rather than require protection from them. To that end, Internet access for my children involves their parents first and foremost! They use a laptop, after asking permission, in the kitchen, in view of everyone else. I’m interested in what they are doing on it (genuinely so, not as some excuse to snoop!) and they want me to help and guide them. Email? Sure, make full use of it. But all emails sent to your address also get forwarded to me too guys… Why? So I can see what you’re receiving! Very open. Very honest. Nothing underhand. Those are the rules in this house.
And that approach actually covers probably 90% of what is required. However there’s still a small part that needs attention. As most adults know, there’s some weird stuff in some corners of the Internet. Really weird. Disturbingly weird. Stuff which I do not want my young children to see, even if accidentally. Being a very liberal sort, and totally anti-censorship with regard to what consenting adults view, I do not support any move to remove such stuff from the Internet. Weird, sick, depraved, whatever… Some of it may not be at all nice, but it’s there and it can be found. I just don’t want young children to accidentally find it. So what is a network engineer father to do…?
Content filtering – 4 approaches
Broadly speaking there are four ways of approaching content filtering in the home environment:
- Workstation filtering
- Network filtering
- ISP filtering
- DNS blocking
The first three are all variations on the same theme: they differ only in where the filtering is done.
There are many software packages out there which will filter content locally on the PC being used to browse the web. In a similar manner to that used by the more familiar virus detection software, one can purchase and run content filtering software which aims to identify and block various categories of content. The difficulty faced with this approach is that it’s not at all easy to identify what to block! Just to take the most obvious candidate category for blocking: pornography. The software can, and will, have lists of the names of the popular, known web-sites with porn. And with some enormous proportion of the Internet being porn, that will already be a long long list! Then we have the challenge of the fact that every day goodness knows how many hundreds of new porn sites will appear, and old ones disappear. The list of sites cannot be fully up to date. So the software will also need to include elements of heuristic detection: identifying porn indirectly and blocking it. So we’re now into looking and scanning all the traffic to and fro for words or patterns which might identify it as porn. And so on. It’s a computationally intensive exercise, and requires frequent updating with new lists of patterns, URLs, IP addresses and so on.
The task is very similar to virus detection, with the frequent updates required, slowing down of communication to an extent, higher CPU usage, and so on.
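To make the two stages concrete, here is a minimal sketch in Python of a list lookup followed by a crude heuristic scan. It is not how any particular product works: the host names, word weights and threshold are all invented for illustration, and real packages use far larger lists and much smarter detection.

```python
from urllib.parse import urlsplit

# Hypothetical data: real products ship far larger, frequently updated lists.
BLOCKED_HOSTS = {"bad.example", "worse.example"}
SUSPECT_WORDS = {"xxx": 3, "explicit": 2, "adult": 1}
SCORE_LIMIT = 3  # assumed threshold, chosen only for this example


def host_is_listed(url: str) -> bool:
    """Return True if the URL's host, or any parent domain, is on the list."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    return any(".".join(parts[i:]) in BLOCKED_HOSTS for i in range(len(parts)))


def heuristic_score(text: str) -> int:
    """Sum the weights of the suspect words found in the page text."""
    return sum(SUSPECT_WORDS.get(word, 0) for word in text.lower().split())


def should_block(url: str, page_text: str) -> bool:
    """Block on a list hit first; otherwise fall back to scanning the content."""
    return host_is_listed(url) or heuristic_score(page_text) >= SCORE_LIMIT
```

The list check is cheap, but the content scan has to run over every page fetched, which is where the CPU cost comes from.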
The software out there is worth consideration for some – I’m not saying it’s a bad approach. But it has unavoidable limitations. Two obvious ones which apply to me are:
- Locked to a particular PC – If I install the software on one PC, then I cannot let the kids use another PC as it will be unprotected, unless I pay for and install the software there too.
- Linux. A big issue for me is that we only have a single Windows PC in the house (hooray!) All the others (too many… way too many…) including that used by the children, run Linux. And such packages are few and far between for Linux…
An approach I used for a while, with success, was to filter on the gateway device on my network. A quick summary here: at home my Internet connection terminates on a gateway firewall/router system. This system performs all manner of network-related functions. The key one is to run my Linux-based firewall. A host of other jobs get handled by this box too: VPN termination, media serving, DHCP, IPv6 routing, the list is long. Given that all of our Internet traffic traverses this system it is ideally suited to perform a filtering function.
To that end, for a while I ran DansGuardian on my firewall. This is a sophisticated bit of software, and not entirely trivial to set up and get working. Apart from needing quite a lot of configuration itself, it also requires a web proxy to be running on the firewall. I ran Squid to fulfil that requirement. And then there’s the requirement to “hook” users into it. That involves either configuring the workstation to use a designated web proxy (possibly with authentication, depending upon what exactly you want to achieve) or using iptables on the firewall to intercept traffic from a given workstation and force it via the proxy. Various approaches, all quite interesting, but only if you find networks interesting… Many would find it simply “complicated”.
Once up and running, however, there are then further challenges to be faced. Firstly there’s the question of overhead. That is, how much load does it place on the gateway device, and hence how much delay or slowness does it introduce to the web browsing. My kids may not need the snappiest, lightning-fast response times possible, but nor do they want to wait tens of seconds to see a page, or have some YouTube video constantly stop and start. Let me be clear here (and make sure I’m fair to DansGuardian): if the device running it is powerful (in terms of CPU, memory, disk and so on) then it’s great. Really good. Trouble is, however, that a lot of boxes used as gateway routers/firewalls are not, by their nature, so highly specified. And that applied to me. My installation was, frankly, not fast enough. Much of the time it would work OK-ish, but often there would be very long delays indeed.
If you have a powerful box you can dedicate to such filtering, then do go ahead and consider it.
One other issue I also had to tackle was that of updates: as described for the workstation solution, filtering software, wherever it is located, needs to be kept up to date. DansGuardian does not come with an update mechanism, nor a source of updates. There are sources of such updates out there if you search, but again, it’s an extra piece of work to do this and get it all set up correctly, auto-updating silently every day. As before, not a criticism of the software that has been made freely available – but something that does need to be taken into account.
Many ISPs offer a filtering service to their customers. This is of course attractive, as it entirely removes the need to perform the filtering and blocking locally to the home network. The work is offloaded to the ISP. While there may be a charge associated with this, it may be worth considering. The main, and maybe for many significant, disadvantage to it is the all-or-nothing approach. If you have many PCs (and hence different users) within the home network, you may only want to block certain stuff from certain PCs. I may not want my kids viewing DominatrixFrenchMaids.com, but (purely for research purposes, of course) their father may need to. (God knows, such a site probably exists, but I dare not look…) More realistically, there are other sites which are more genuinely OK for adults, but not for young children. If one has an interest in 20th Century history, a sad reflection on humankind is that there are some horrible things which can be seen… For older children and adults, that’s fine and indeed educational. But not below a certain age. I’d like to maintain the illusion of a nice world for at least a little while longer.
So ISP filtering is attractive in terms of removing the work from the home. But it does come, in general, with a certain amount of inflexibility.
This last technique is somewhat different from the others. Most people have at least some awareness that the names we use on the Internet (www.ipsidixit.net) actually map on to so-called IP addresses. For example www.ipsidixit.net is mapped via the DNS (Domain Name System) to the IP address 220.127.116.11. (And to IPv6 2001:4b98:dc0:41:216:3eff:feaa:964a – I’m soooo hip and trendy…)
Yet no one in their right mind (nor even a network engineer) bothers with the numerical version. You just bang in the name and have your computer use DNS to resolve it to an IP address.
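For the curious, this is roughly what the computer does on your behalf. The sketch below uses Python's standard socket.getaddrinfo, which asks whatever resolver the machine is configured to use; the helper name resolve is my own invention for this example.

```python
import socket


def resolve(name: str) -> list[str]:
    """Return the sorted, de-duplicated addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, None)
    # Each entry ends with a socket address tuple whose first item is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve("www.ipsidixit.net") on a connected machine would list the IPv4 and IPv6 addresses currently published for the name. Which addresses come back depends entirely on the DNS server answering, and that is the hook the rest of this section relies upon.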
Most PCs will use one or more DNS servers specified and operated by their ISP. In “normal” use, for example, my ISP (free.fr) provides two DNS servers for workstations to use.
However one does not need to use their suggestions. One can, in general, use other DNSs operated by third-parties.
The point is, then, that if one used a DNS service which had a constantly updated blacklist of sites which are “undesirable”, one could block access to them by simply declining to resolve them to their correct address. This then offers the benefits of ISP blocking in so far as the load is shifted outside of the home network, but with the added flexibility that only workstations that require protection need use the “filtering” DNS. Other workstations can use the normal DNS.
I found that OpenDNS provide such a service, and have started to make use of it. It’s free (they have some paid options too – but the free one seems fine for me). I have no association with OpenDNS, and am only “promoting” them as what they offer seems neat and useful. If others have knowledge of other similar services, please do post them in a comment – I’m not trying to make this exclusive to OpenDNS! In fact I’d like to compare OpenDNS to some others.
The service they offer is to provide DNS addresses which can have a selectable level of filtering applied. The spectrum is covered, from porn, violence, drug use, etc. through to shopping sites, social networking sites, etc. You get to choose which categories to block and which to allow.
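Conceptually, a category-based filtering resolver behaves something like the sketch below. This is purely my own illustration of the idea, not OpenDNS's implementation: the category table, the domain names and the stand-in upstream resolver are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical category data; a real service maintains this centrally.
CATEGORY_OF = {
    "casino.example": "gambling",
    "shop.example": "shopping",
}


def filtered_lookup(
    name: str,
    blocked_categories: set[str],
    upstream: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Resolve name unless its category is one the account has blocked."""
    key = name.lower().rstrip(".")
    if CATEGORY_OF.get(key) in blocked_categories:
        return None  # decline to resolve; a real service may return a block page
    return upstream(key)


# Stand-in for a real upstream resolver, using documentation-range addresses.
RECORDS = {"casino.example": "192.0.2.10", "shop.example": "192.0.2.20"}
```

With blocked_categories set to {"gambling"}, a lookup of casino.example is declined while shop.example resolves as usual. An adult's profile with an empty set would resolve both.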
And it does seem to work really rather well indeed. Below I am going to detail how I set it up within my network, integrating it within the DNS caching system already used.
The main weakness of the system is that with some knowledge and effort it can be circumvented (as, of course, can most systems). One could take the trouble to manually find the Name <–> IP mapping for a domain and enter the address directly into a browser, thereby bypassing the DNS. However such a bypass would be very cumbersome to use, since even if you use an IP to land on a page, probably any link off that page will in turn require DNS, and would then need to be manually decoded, etc. Workable, but hard work. By the time my kids are knowledgeable enough to work all that out, they will probably be old enough to look after themselves!
Integrating OpenDNS into a Linux firewall, already running DNSMasq
My home network has DNSMasq running on a central gateway/server/firewall box. DNSMasq is responsible for DHCP (i.e. allocating IP addresses on my home network) and also DNS caching. To that end, it announces, via DHCP, that it is the DNS server to be used by devices. Then it, in turn, resolves names via upstream DNS servers, and caches the lookups locally.
In the DHCP configuration it has a pool of addresses available for any device to use, but most of the devices on the network have pre-allocated addresses reserved for them within the DNSMasq configuration. These are allocated based upon the Ethernet MAC address of a device. This is a very common technique to use with DHCP.
At present a device is handed an IP address together with the address of a DNS server to use (which is actually the DNSMasq device itself). We want to change the config so that certain devices (the children’s PC) are instead given the addresses of the OpenDNS filtering servers along with their IP address. Then DNS requests from that PC will no longer be sent to the gateway device, but will instead be routed out externally to OpenDNS, where they can be answered or blocked as appropriate.
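The decision being asked of the DHCP server can be modelled in a few lines of Python. This is only a sketch of the logic, not of how DNSMasq does it internally. The gateway address and the two laptop MACs are the (obfuscated) ones used in my network, while the filtering server addresses are placeholders from the documentation range rather than real OpenDNS addresses.

```python
LOCAL_DNS = ["192.168.0.22"]  # the gateway running DNSMasq
FILTERING_DNS = ["203.0.113.1", "203.0.113.2"]  # placeholder addresses, hypothetical
FILTERED_MACS = {"00:c0:9f:12:34:56", "00:90:4b:12:34:56"}  # children's laptop


def dns_servers_for(mac: str) -> list[str]:
    """Return the DNS servers to hand out in the DHCP reply for this MAC."""
    if mac.lower() in FILTERED_MACS:
        # Filtering servers first, the local one last so local names still resolve.
        return FILTERING_DNS + LOCAL_DNS
    return LOCAL_DNS
```

Any device whose MAC is not in the set simply carries on using the local caching DNS as before.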
The DNSMasq config to achieve this is slightly fiddly, so I am providing it here more or less in its entirety (a few names omitted and some light obfuscation of MACs etc.), but only highlighting the parts that particularly pertain to the OpenDNS filtering setup.
    # Configuration file for dnsmasq.
    domain-needed
    resolv-file=/etc/resolv.conf
    no-resolv
    no-poll
    # Add other name servers here, with domain specs if they are for
    # non-public domains.
    server=/localnet/192.168.0.22

This part is not related to OpenDNS in any way: I don't use my ISP's DNS for normal use - I instead use Google's Public DNS.

    # Google Public DNS servers
    server=8.8.8.8
    server=8.8.4.4
    # Add local-only domains here, queries in these domains are answered
    # from /etc/hosts or DHCP only.
    local=/localnet/
    interface=eth1
    expand-hosts
    domain=localnet
    # For general purpose use, use this range.
    dhcp-range=192.168.0.128,192.168.0.160,12h

This is for OpenDNS. We use the dhcp-mac config to tag these special devices for filtering:

    # MAC list for openDNS filtering
    dhcp-mac=opendns,00:c0:9f:12:34:56 # Laptop on-board
    dhcp-mac=opendns,00:90:4b:12:34:56 # Laptop wifi

Here we're back to normal dhcp-host preallocation for known unfiltered devices:

    # Most ip addresses are pre-allocated here
    dhcp-host=00:50:ba:12:34:56,aname,192.168.0.2,720m
    dhcp-host=00:18:8B:12:34:56,anothername,192.168.0.3,5m
    dhcp-host=00:90:4b:12:34:56,laptop_wifi,192.168.0.4,720m
    dhcp-host=00:26:37:12:34:56,galaxy,192.168.0.5,60m
    dhcp-host=00:18:41:12:34:56,magic,192.168.0.6,60m
    dhcp-host=00:26:82:12:34:56,eva9150,192.168.0.7,720m
    dhcp-host=00:c0:9f:12:34:56,laptop_eth,192.168.0.8,720m
    dhcp-host=00:14:29:12:34:56,camera,192.168.0.10,120m
    dhcp-host=00:21:5A:12:34:56,printer,192.168.0.11,720m
    dhcp-host=00:40:63:12:34:56,aservername,192.168.0.22,infinite

The devices tagged "opendns" above get special DHCP options pointing them to the OpenDNS filtering servers.

    # OpenDNS content filtering servers
    # Specify the two OpenDNS first, then ourselves third for local stuff
    dhcp-option=opendns,6,208.67.222.222,208.67.220.220,192.168.0.22

Note also the "192.168.0.22" on the end - this is optional, if you still want the filtered devices to be able to resolve local names.

    dhcp-authoritative
    cache-size=150
    clear-on-reload
Anyone who has an existing DNSMasq configuration should find the above more than enough to change it to point arbitrary devices at the OpenDNS systems.
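If the list of filtered devices grows, the tagging lines could be generated rather than typed. The following is a small sketch that emits lines in the same dhcp-mac / dhcp-option form used above. The function name is my own invention and the addresses in the usage note are placeholders, so treat it as a starting point rather than a finished tool.

```python
def filtering_config(tag: str, macs: list[str], dns_servers: list[str]) -> list[str]:
    """Build the dnsmasq lines that tag the given MACs and set their DNS servers."""
    # One tagging line per device that should be filtered.
    lines = [f"dhcp-mac={tag},{mac.lower()}" for mac in macs]
    # DHCP option 6 is the list of DNS servers handed to tagged devices.
    lines.append(f"dhcp-option={tag},6,{','.join(dns_servers)}")
    return lines
```

For example, filtering_config("opendns", ["00:C0:9F:12:34:56"], ["203.0.113.1", "192.168.0.22"]) returns one dhcp-mac line for the laptop followed by a single dhcp-option line, ready to paste into the configuration file.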
Nothing, but nothing, replaces a conscientious adult supervising, guiding and helping children get to grips with the Internet. However even with that it’s still all too easy for some stuff to pop up which is better left hidden! This article highlights some of the general technical approaches one can take, and in particular that of DNS filtering with a service such as OpenDNS, optionally using a Linux device to semi-automatically allocate filtering to some devices but not others.