Analysis of GoC .gc.ca Domains
Ben was able to use a public list of domains published on Data.gov to begin his analysis, but unfortunately the Canadian Open Data site doesn't seem to provide one. Fortunately, I was able to build a list from the main GoC site, and I then posted just the gc.ca domains to this Wikipedia page, where hopefully the list will be maintained and expanded. The USA list is broken down by agency, but unfortunately I don't have this type of metadata for the domains I've gathered.
I surveyed over 220 government domains using this tool and was surprised at some of my findings, even though they are very much in step with those in the USA.
Summary of Canadian Government sites:
- 27 still had no non-www support - yikes, this is a one-line fix in an Apache config file
- There was limited support for next-generation IPv6 (see TBS statement)
- 76 supported HTTPS for encrypted transactions
- Only 4 use the Akamai CDN to speed up delivery
- None seem to be using the big USA-based cloud servers
- Google Analytics is only used by 13 sites, though some are probably using more traditional log analysis
- Drupal (17) is also the most popular CMS, followed by WordPress (7), SharePoint (3) and finally Joomla (1). More comments & links below.
- Web servers reported: Microsoft-IIS 5.0 to 7.5 (104), Apache 2.0.63 to 2.2.3 (69), Zeus 4.2 to 4.3 (13), Lotus-Domino (6), Oracle-Application-Server 9.0.2 to 10.1.3.4.x (6), Zope 2.7.8-final (1), nginx (1)
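Web server counts like the ones in the last item can be produced by grouping the HTTP Server response header by product name. Here is a minimal sketch in Python, assuming header values shaped like `Microsoft-IIS/7.5`. The function names and sample values are hypothetical illustrations, not the survey tool's actual code or data.

```python
from collections import Counter


def server_product(header_value):
    """Return the product part of a Server header value.

    For example, 'Apache/2.2.3 (Red Hat)' becomes 'Apache'.
    """
    value = header_value.strip()
    if not value:
        # Some sites send an empty Server header or none at all.
        return "unknown"
    # Keep the first whitespace-separated token, then drop any '/version' suffix.
    return value.split()[0].split("/")[0]


def tally_servers(header_values):
    """Count how many sites report each server product."""
    return Counter(server_product(value) for value in header_values)


# Hypothetical sample values, not taken from the survey data.
sample_headers = [
    "Microsoft-IIS/7.5",
    "Microsoft-IIS/6.0",
    "Apache/2.2.3 (Red Hat)",
    "Zeus/4.3",
    "nginx",
    "",
]
tally = tally_servers(sample_headers)
```

Keep in mind that the Server header is self-reported and can be changed or suppressed by an administrator, so a tally like this is only a rough indicator.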
3rd Party Services
Content Delivery Networks (CDNs) are increasingly being used to improve response times for websites.
The following use the CDN Akamai:
Google Analytics is a powerful tool, but has only recently been adopted by several government departments due to Google's service agreements. Only these sites (as far as this script can tell) are using analytics tools that allow them to get a good understanding of how citizens are actually using these sites.
Sites using Google Analytics:
Open Source CMS Solutions
Drupal is used by:
WordPress is used by:
Joomla is used by http://crr.ca, but I was a bit surprised not to see any TYPO3 instances.
Other CMS Solutions
SharePoint was used by 3 sites, but no other proprietary CMS was listed in this survey. It is used more extensively for Intranets.
Interwoven has been known to be deployed at DFAIT, Agriculture Canada, Industry Canada and Canada Post. There are several CMS solutions that essentially bake their output into flat HTML files, which makes them very difficult to sniff out. It might be possible to guess by searching for signature URLs or unique files, but it may not be possible in all instances.
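To illustrate the signature idea, here is a rough Python sketch that looks for tell-tale path fragments in a page's HTML. The marker list is an assumption for illustration only: it uses a few well-known default paths from open source CMSs, and equivalent markers for proprietary products would have to be collected by hand. `guess_cms` is a hypothetical helper, not part of the survey script.

```python
# Illustrative markers only: default asset paths commonly seen on sites
# running these CMSs. A site can rename or hide any of them.
SIGNATURES = {
    "WordPress": ("/wp-content/", "/wp-includes/"),
    "Drupal": ("/sites/default/files/", "/misc/drupal.js"),
    "Joomla": ("/media/system/js/",),
}


def guess_cms(html):
    """Return a sorted list of CMS names whose markers appear in the HTML."""
    lowered = html.lower()
    return sorted(
        name
        for name, markers in SIGNATURES.items()
        if any(marker in lowered for marker in markers)
    )


# Hypothetical page fragments for demonstration.
drupal_page = '<script src="/misc/drupal.js"></script>'
plain_page = "<p>Static page with no obvious markers</p>"

drupal_guess = guess_cms(drupal_page)
plain_guess = guess_cms(plain_page)
```

A match is only a hint, and an empty result says nothing about what is running behind the scenes, which is exactly the limitation with baked HTML described above.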
I had never heard of the Canada Science and Technology Museums Corporation but it is a crown corporation running several government sites with what is probably a custom built software solution. If there are other solutions like this that don't have a critical mass of global users, then they are unlikely to show in this list as well.
I decided to write a quick script that reads the generator meta tag to pick up other, less common CMS solutions. Using this I found sites reporting the use of CommonSpot Content Server, PRISM(TM), DotNetNuke & FrontPage. This doesn't necessarily reflect the back end, however.
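For anyone wanting to try something similar, here is a minimal Python sketch of reading the generator meta tag with the standard library's `html.parser`. It is not the script I used; the class and function names and the sample HTML are hypothetical, and fetching the pages is left out.

```python
from html.parser import HTMLParser


class GeneratorParser(HTMLParser):
    """Collect the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None when an attribute has no value.
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content")
        if name == "generator" and content:
            self.generators.append(content.strip())


def find_generators(html):
    """Return the generator strings declared in an HTML document."""
    parser = GeneratorParser()
    parser.feed(html)
    return parser.generators


# Hypothetical sample page for demonstration.
sample_html = """
<html><head>
<meta charset="utf-8">
<meta name="Generator" content="Drupal 7 (http://drupal.org)">
</head><body></body></html>
"""
generators = find_generators(sample_html)
```

The tag is optional and easy to edit or strip from a theme, so a missing or odd value proves very little on its own.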
I'd like to see a few other reviews added to this script. I'd like to know:
- Are RSS, RDFa or Atom feeds available?
- Automated validation for accessibility & HTML compliance
- Which version of HTML is being supported?
- Page load times
- Which versions of known CMSs are in use
- Links to social media sites like Twitter, Facebook & YouTube
- Mobile readiness (either a responsive theme or the presence of an m.example.com)
- Finding related sub-domains linked to from scanned pages
However, this script already provides access to a lot of information which institutions do not have a means to keep track of. Keeping track of the tools used for sites inside & outside of organizational firewalls is often quite difficult.
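Some of these checks would be small additions. As one example, the feed check could look for `<link rel="alternate">` tags that advertise RSS or Atom feeds in a page's head. This is a minimal Python sketch under that assumption; the names and sample HTML are hypothetical, and it would miss feeds that are only linked from the page body.

```python
from html.parser import HTMLParser

# MIME types conventionally used to advertise feeds for autodiscovery.
FEED_TYPES = {
    "application/rss+xml": "RSS",
    "application/atom+xml": "Atom",
}


class FeedLinkParser(HTMLParser):
    """Collect (kind, href) pairs for feeds advertised via <link> tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "alternate feed".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        kind = FEED_TYPES.get((attributes.get("type") or "").lower())
        href = attributes.get("href")
        if "alternate" in rel_tokens and kind and href:
            self.feeds.append((kind, href))


def find_feeds(html):
    """Return the feeds a page advertises in its <link> tags."""
    parser = FeedLinkParser()
    parser.feed(html)
    return parser.feeds


# Hypothetical sample page for demonstration.
sample_page = """
<head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" href="/rss.xml">
<link rel="alternate" type="application/atom+xml" href="/atom.xml">
</head>
"""
feeds = find_feeds(sample_page)
```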
As noted in Ben Balter's original post:
Please note: This data is to be treated as preliminary and is provided “as is” with no guarantee as to its validity. The source code for all tools used, including the resulting data, is available in GitHub. If you find a systemic error, I encourage you to fork the code and I will try my best to recrawl the list to improve the data’s accuracy.
About The Author
Mike Gifford is the founder of OpenConcept Consulting Inc, which he started in 1999. Since then, he has been particularly active in developing and extending open source content management systems to allow people to get closer to their content. Before starting OpenConcept, Mike had worked for a number of national NGOs including Oxfam Canada and Friends of the Earth.