We suffered a short server outage yesterday afternoon, around 16:40 UTC. It turned out that one of the two hard disks in the server’s RAID1 array had developed a fault, so we had to take the server offline and replace the disk. The server was back up and running by 17:10 UTC.
Users may have noticed that the server was a bit slow in the early evening as the RAID rebuilt itself, copying data to the replacement disk. This process completed at around 21:00 UTC, and everything seems to be running normally again.
I was first alerted to the issue when the server sent me an automated message at around 04:00 UTC, warning of a failing hard drive. I ran some tests to confirm we had a problem and to find out more about what was actually wrong. At around 08:00 I reported the issue to Bytemark, our upstream provider, the company that actually houses our rack-mount server in York, UK. Once again, Bytemark sorted this issue very promptly and efficiently, literally within hours of us first reporting it. 🙂
We have just completed a fairly major server OS and software upgrade today. Frankly this was long overdue and was delayed because we knew it would be quite a messy business and could result in some server downtime.
Fortunately, the mail server was only down for a few minutes at a time – though it did go down several times during the afternoon. Unfortunately, however, the web-server upgrade took rather longer to fix and the web server was offline for several hours – for which we apologise. Anyway, we are all up and running. We chose a weekend – and rather a warm one at that – so hopefully the inconvenience to our customers was minimal. 🙂
You may have noticed that when you access your site, there is a change in the way the URL looks in the address bar of your browser. In a long-overdue upgrade, all our customer sites have been moved from HTTP (hypertext transfer protocol) to HTTPS (hypertext transfer protocol secure).
What is the difference?
In an ordinary HTTP session, data flowing between the server and the user’s computer is sent in plain text. This means that hackers connected in between may be able to see what is being transmitted.
Using HTTPS, the server and the user’s computer agree on a “code” between them, and then they encrypt the data flowing between them using that “code”, so that no one in between can read it. This keeps your information much safer from hackers.
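To give a feel for how two computers can agree on a “code” while someone is listening in, here is a toy sketch of a Diffie-Hellman style key agreement. It is an illustration only, not what our server actually runs: the numbers `P` and `G` and the function names are made up for the example, and real HTTPS uses far larger values and also checks the server’s certificate.

```python
import secrets

# Toy public parameters, chosen for illustration only (an assumption, far
# too small for real use). Both sides, and any eavesdropper, know them.
P = 23  # public prime modulus
G = 5   # public generator


def make_keypair():
    """Return (private, public) numbers for one side of the exchange."""
    private = secrets.randbelow(P - 2) + 1  # secret value in 1 .. P-2
    public = pow(G, private, P)             # safe to send over the wire
    return private, public


def shared_secret(own_private, other_public):
    """Combine our private number with the other side's public number."""
    return pow(other_public, own_private, P)


server_private, server_public = make_keypair()
browser_private, browser_public = make_keypair()

# Only the public numbers travel across the network.
server_code = shared_secret(server_private, browser_public)
browser_code = shared_secret(browser_private, server_public)

print(server_code == browser_code)  # prints True
```

Both sides end up with the same number without ever sending it, which is the “code” used to encrypt the rest of the conversation.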
Customer sites have been re-configured so that all requests for the old HTTP pages are redirected to the new HTTPS version instead. This redirection should happen automatically, with no intervention from the user. A few sites required a little page-by-page tweaking, too. But as far as I am aware, all customer sites have now been upgraded successfully.
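The redirection rule itself is simple, and can be sketched in a few lines of Python. This is only an illustration of the logic, not the server’s actual configuration; the function name and the `example.org` address are made up.

```python
from urllib.parse import urlsplit


def https_redirect_target(url):
    """Return the HTTPS address to redirect to, or None if no redirect is needed.

    Simplification: an explicit port in the URL (such as :80) is kept as-is,
    which a real configuration would need to handle.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Same host, path and query string; only the scheme changes.
    return parts._replace(scheme="https").geturl()


print(https_redirect_target("http://example.org/blog/?p=1"))
# prints https://example.org/blog/?p=1
```

Everything after the scheme is preserved, so old bookmarks and links continue to land on the same page.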
Our own sites, deoss.net and deoss.com, have also been upgraded to HTTPS. All our other projects will follow shortly.
Wikipedia has a very good page that describes HTTPS and its relative advantages:-
Apologies. We suffered a server outage for about ten minutes. The server locked up completely. We were able to reboot it remotely, and initial diagnosis shows no obvious issues. However, we are busy going through the logs to see what happened. We will also contact our upstream provider to see if they know of any upstream issues.
Meantime, as of 2017-06-14 01:09 UTC, everything seems OK and working as expected.
Yes, the server move is complete. The new server is fantastic. And we have a new server configurator, which will make the job of managing all the projects on the DEOSS community server much easier and safer.
Now, we know there are a few bugs and outstanding issues with some projects. These are being fixed right now. And there will be some software upgrades to individual projects. But in principle, the hard work is done! 🙂
In a nutshell, we have to move out of our existing server, which currently resides in a secure data centre in Manchester, and move into a new server, located in York. This was not my idea, I hasten to add. I have nothing against Manchester! 😉
It has come about due to restructuring on the part of our upstream service provider, Bytemark. Obviously this is causing me massive disruption that I am trying very hard not to pass on to my users. However some disruption is inevitable, I’m afraid.
To sweeten the pill, Bytemark has done me a rather good deal where we get a faster and more highly-specified server for very little extra money. I am also taking the opportunity to improve the facilities that the server offers. Over the years customers have requested various features that the old server was either unable to provide, or provided rather poorly. For example:-
- A simple, fast webmail service.
- TLS security on emails (so that emails can be automatically encrypted).
- Password-protected secure SMTP, so email users can send email via our server as well as receive it.
- Better support for mobile devices.
- The new server is much faster, with a better CPU and a lot more RAM. We have also taken steps to optimise the delivery of PHP-based content. This should all result in a much faster delivery of web pages.
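For the technically curious, the TLS and password-protected SMTP items in the list above amount to something like the following sketch, written with Python’s standard library. It is not our server’s configuration: the function names are made up, and port 587 (the usual mail submission port) is an assumption. Your mail program does the equivalent for you once it has your account settings.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a simple plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_submission(msg, host, username, password, port=587):
    """Send msg over an encrypted, password-authenticated SMTP session.

    The port default is an assumption, not a documented server setting.
    """
    context = ssl.create_default_context()
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls(context=context)  # switch the connection to TLS
        smtp.login(username, password)  # prove who we are before sending
        smtp.send_message(msg)
```

The password is only sent after `starttls` has encrypted the connection, which is the point of offering the two features together.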
I am also taking the opportunity to weed out a lot of cruft that has accumulated over the years. To make most efficient use of the available computing resources, a number of redundant or barely-used services will be withdrawn:-
- Old sites/projects that customers no longer use. Up until now we have maintained closed projects at our expense, keeping them hidden in the background, just in case users wish to reinstate them in the future. This has proven time-consuming and expensive. It has also taken valuable computing resources from current projects and poses at least a theoretical security risk. Therefore, no such projects will be carried across to the new server. Instead, archived copies of dead projects will be made and kept off-site.
- EGW (EGroupWare). This is a massive project with many excellent features. However, customers only ever used its webmail feature. Many customers also commented that they found EGW to be overly complex and cumbersome. Webmail, for those who want to use it, will now be provided by the much simpler and faster SquirrelMail. I must add that we have not lost interest in EGW and still have the greatest respect for its lead developer, Ralf Becker. We will be running experimental and demonstration installations on private servers. This is to serve future clients that need such an extensive and sophisticated CRM (customer relationship management) system.
- EyeOS. This is a web-based virtual operating system that showed great promise in its early days. But hardly anyone actually used it. On top of that, it was a pain to maintain, and often even minor upgrades would break it. As with EGW, we will be running experimental and demonstration installations on private servers, for possible future deployment.
- POSITIVE. This was a project running at deoss.org aimed at NHS customers considering moving to open source software. Again it showed great promise in its early days. But it has seen very little activity recently. Customers who wanted to “go open source” have mostly done so. Those who haven’t probably never will. So POSITIVE will be closed down within the next few weeks. The deoss.org domain will be redirected to deoss.com for the time being, with a view to the domain being reinstated for a new generation of open source projects in the future. Meanwhile, the plethora of small experimental projects at deoss.org will be moved to and/or merged with projects at garfnet.org.uk over the next few weeks.
The whole move has to be completed by midnight 2014-10-31. However I am hoping to have the majority of this work done much earlier than this.
A small but annoying bug was reported over Christmas by Android users, where pictures were not showing properly on web pages served by The DEOSS Community Server. The odd thing was it only affected some browsers. Users with Firefox for Android were unaffected.
After some investigation it transpired that mod_security disliked the way some browsers presented their HTTP requests. Anyway, the problem has now been fixed. There was no server downtime.
All the WordPress-based sites here on the DEOSS Community Server have just been upgraded to version 3.8. The upgrade appears to have gone very smoothly and there was no downtime on any of the affected sites.
Firstly, Merry Christmas. It seems to come round faster every year, doesn’t it? Now, I must apologise for the lack of recent blog updates. Actually quite a lot has happened on the DEOSS Community Server lately and it has been hard to keep track of it all. So here’s a very brief summary:-
- Automated plugin updates via rsync over ssh for all WordPress-based sites. Keeping all those plugins updated was becoming very tiresome and was causing delays in updating. A new backend script on the control server means that all updates can be made promptly and easily.
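As a rough idea of what such a script involves (this is not the actual backend script, and the paths, account and function name below are made up for illustration), each site’s plugin directory can be brought into line with a master copy by a single rsync command run over ssh:

```python
import shlex


def rsync_plugin_command(source_dir, ssh_target, remote_dir):
    """Build the rsync command that mirrors source_dir to a site over ssh.

    -a keeps permissions and timestamps, -z compresses in transit, and
    --delete removes plugin files on the site that are no longer in the
    master copy (use with care). All arguments here are hypothetical.
    """
    return [
        "rsync", "-az", "--delete", "-e", "ssh",
        source_dir.rstrip("/") + "/",                 # trailing slash: copy contents
        f"{ssh_target}:{remote_dir.rstrip('/')}/",
    ]


cmd = rsync_plugin_command(
    "/srv/master/plugins",
    "deploy@site.example",
    "/var/www/site/wp-content/plugins",
)
print(shlex.join(cmd))  # the command a script could hand to subprocess.run
```

Looping over a list of sites and running the resulting command for each one is all that is needed to update every site in one go.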
- Upgrade of all WordPress sites to 3.7.1.
- Routine updates to the Debian operating system. These are generally performed on a weekly basis.
- Tweaks to the anti-DoS (denial of service attack) system. Some users were reporting 500 errors when trying to perform certain functions on their sites. These issues were mostly due to overzealous settings in an Apache module called mod_security. Mod_security is an excellent module but its documentation is not as clear as it might be. It is also a complex bit of kit, with configuration settings that have far-ranging implications for a server such as ours. Therefore configuring it is something of a dark art. Nevertheless, hopefully all the issues have now been resolved.
That’s it for now. More routine work is currently being undertaken this week. No downtime is anticipated.
The good news is that DEOSS is now back online and running reasonably well. However, I don’t anticipate that the bad guys will be giving up anytime soon.
Consequently, I have been forced to impose some very tight security restrictions in order to fight off this ongoing DDoS attack. The server logs indicate that we are currently experiencing one attack every 4 seconds – though at one point it was 10 attacks per second! To fight them off I have deployed a WAF (web application firewall) called ModSecurity – amongst a number of other measures. It is a very powerful tool and is holding up well. However, its configuration is very poorly documented and its settings are virtually incomprehensible!
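For anyone wondering how figures like these are arrived at, the arithmetic is straightforward once the blocked requests have been pulled out of the logs. The sketch below is illustrative only: the function names and sample numbers are made up, and it does not attempt to parse any real log format.

```python
from collections import Counter
from datetime import datetime


def average_rate(event_count, window_seconds):
    """Average number of attacks per second over a time window."""
    return event_count / window_seconds


def peak_per_second(timestamps):
    """Largest number of attacks that fell within any single second."""
    buckets = Counter(ts.replace(microsecond=0) for ts in timestamps)
    return max(buckets.values(), default=0)


# Hypothetical example: 900 blocked requests in one hour.
rate = average_rate(900, 3600)
print(rate, "per second, i.e. one every", 1 / rate, "seconds")

# Hypothetical burst: three hits in the same second, one in the next.
sample = [
    datetime(2013, 1, 1, 12, 0, 0, 100000),
    datetime(2013, 1, 1, 12, 0, 0, 500000),
    datetime(2013, 1, 1, 12, 0, 0, 900000),
    datetime(2013, 1, 1, 12, 0, 1, 200000),
]
print(peak_per_second(sample))
```

The average smooths over quiet spells, which is why the peak figure can be so much higher than the current one.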
In addition, I have had to create my own directives to respond to the particular attack we are experiencing. This means that while I fine-tune the new security measures to suit all the various web applications on DEOSS, you may experience some “403 errors”. This is the default response of the WAF to anything it considers a threat. So if you are a DEOSS customer and you experience unexpected “403s” then please let me know about them, through the normal channels.