I had to disable comments on my blog a couple of weeks ago when they started filling up with spam. Movable Type, the blogging engine I use, is working on a fix, so hopefully I’ll be able to re-enable commenting at a later date. It seems that spammers try to increase their PageRank on Google by spamming your comments with links to their sites, since Google ranks pages higher if they have lots of links to them. I started getting so many that it was bringing my poor little server to a crawl.
That sucks for the conversational aspect of blogs, but Movable Type is one of the most popular blogging engines out there so I expect a fix fairly soon, probably in January.
We got a season’s pass for Rabbit Hill this year. Emily’s 8 and Mack is 5, so they are both old enough to get into skiing. Last year we went a couple of times with Emily, overestimated her confidence level and then took her to Lake Louise in a trip that didn’t go too well. She got scared and wasn’t able to ski too much, just because of the sheer size of the place.
This year, we’ll go lots to Rabbit Hill, get everybody in skiing shape, and then make a late-season decision about a trip out to Marmot Basin or something.
Anyways, last Sunday was Mack’s first time out. Jennifer is pretty much in the transition between beginner and intermediate skier, and I’m a non-skier, preferring my snowboard, so as ski teachers we suck. Mack did OK on the tow rope and got a little bit of the idea of snowplowing, but we really only went out to get our passes and try out the process of going there, getting skis, doing some runs, and coming home. Mack did very well and was pretty enthusiastic, so we’re looking forward to more tries. This Saturday we’re getting Mack into a 1-1/2 hr lesson for rank beginners. That should help.
We have about 500 users on our GroupWise system. It runs on a NetWare 6 2-node cluster, with load sharing and auto-failover. There is a third box to handle web-access and the SMTP server component, called the GWIA in GroupWise parlance. There is a small SAN with 100 GB of storage shared between the two nodes of the cluster holding the users’ mailboxes. That gives us a little over 200 MB per user for mail.
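For the curious, the per-user figure is just division. Here is a throwaway shell sketch, assuming 1 GB = 1024 MB and an even split across all mailboxes (the function name is made up for illustration):

```shell
# per_user_mb TOTAL_GB USERS
# Whole megabytes available per user, assuming 1 GB = 1024 MB
# and that the storage is split evenly (an illustrative simplification).
per_user_mb() {
    echo $(( $1 * 1024 / $2 ))
}

per_user_mb 100 500   # prints 204
```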
This is a business mail system, used for business communications. People in our office also use it for personal mail, but I only really care about the business mail, and so does the company. We work on a lot of projects. Most of the email is project related, so if it affects scope, budget, schedule, design details or contract requirements, it is a record. Our official policy is that records are paper, and paper records have a well defined storage and retention policy. That means that for emails with the types of information listed, they should be printed and sent to the paper files.
Unfortunately there are far too many emails flying around to make this practical. Some users get hundreds of emails per day. We can’t expect them to reasonably print and correctly file all that stuff. They do OK with a half dozen or a dozen pages of paper per day, but not hundreds. We would also be choked under the cost of physical storage.
The next best solution is to designate the pertinent messages as electronic records, which, thanks to our recent records management project, have a well-defined storage location and retention schedule. The trick now is to get the users to start electronically filing the messages to the appropriate locations, and get them out of GroupWise. As long as they stay in GroupWise they haven’t been properly filed, and are not accessible to the rest of the project team. They also could be at risk of loss, although we do have comprehensive backup solutions for our mail.
We can’t afford to have GroupWise grow out of control, because we pay for bandwidth to the colocation facility that hosts our mail and also pay for backup by the kilobyte. We also pay for colocation space, and a big storage array to get into terabyte storage size would cause us to need to move over into the next bigger size of space at the colocation site.
Our options for ensuring record retention are either to make it as easy as possible for users to electronically file messages, and ride their backs to keep them saving their stuff, or to provide a back-end system that keeps and logs all mail to a storage server and automatically turfs email older than a certain age from the production mail server. I guess I have to lean towards the first solution, even though I’d like to autodelete mail older than six months. I guess I’ll be writing some enhancements to MailSaver sooner than I anticipated.
This article is not about BSD Unix, it’s about MailSaver, but I’m going to start with some history that includes BSD Unix to explain my paranoia about email.
The following might be highly inaccurate but it’s good enough for my story.
There used to be UNIX from AT&T. People at Berkeley ported it to run on Intel 386-based computers. It was called 386BSD. Then there was a big lawsuit in which AT&T contended that 386BSD was infringing on its intellectual property. Then the proprietary parts of 386BSD were removed and reimplemented and the FreeBSD project was born. Another project called NetBSD also came about at that time, and they focussed on trying to get BSD working on lots of different computer platforms.
One of the NetBSD developers was Theo de Raadt. He worked on NetBSD for a while, and then had a falling out with the rest of the NetBSD core team. They accused him of all sorts of verbal and written abusive behaviour and essentially pressured him to leave. He rebutted their claims, and backed up his rebuttal with scads of carefully and systematically preserved email messages. I read all this correspondence a few years ago, just to indulge my interest in computer operating systems and so forth, and became a disciple of the school of thought that the messages in email are worth preserving.
Aside from messages like “Where’s my stapler” and “Bring perogies to the pot luck dinner next week” I haven’t gotten rid of any email messages since that day. I keep the body text and header info for every message I send or receive.
We use GroupWise, and have for years. It got to be very time consuming to use the “Save As” feature in GroupWise to save messages one at a time in order to preserve them. GroupWise’s “Save As” feature also saves messages in RTF format, which I hate. I wanted something to save messages all at once, in batches, and to put them in a nice simple format like plain text. Initially using the IMAP compatibility of GroupWise, and writing code in Perl, I created the first version of MailSaver. It worked great for me, and I used it like that for a couple of years.
Then, a few people in the office started to want to save project-related email with the rest of the project documentation on our engineering project directories. This prompted me to write MailSaver 1, which was a standalone program in VB that essentially worked as a custom GroupWise client to save messages from folders directly. When I started, I asked my company’s CTO to allow me to retain ownership of MailSaver and release it into the wild under a BSD-style license, to which he graciously agreed. This version worked great in GroupWise 5.x and we used it a lot for a couple more years.
Then, we planned an upgrade to GroupWise 6, and we were going to cut off all old GroupWise 5 mailboxes, and give everyone shiny new empty GroupWise 6 accounts. Everybody wanted to preserve old mail from GroupWise 5, and the existing MailSaver was not quite up to the task. There were new features requested, so I wrote MailSaver 2. MailSaver 2 was ported from VB6 to VB.NET, and it was a nightmare to get working properly. It also required the giant Microsoft .NET runtime in order to work. It was a 300 kB program that needed a 20 MB install. Users used it but nobody liked it too much.
After getting GroupWise 6.0 rolled out, we discovered that there were some insurmountable problems in MailSaver 2 that I couldn’t fix because of incompatibilities between .NET and the GroupWise Object API, which is the extensibility framework for the GroupWise client. I decided to rewrite it yet again as MailSaver 3, going back to VB6 and, at the same time, doing away with the stand-alone application and integrating MailSaver right into GroupWise. This resulted in MailSaver 3 pretty much as it currently stands.
I should mention that MailSaver has saved my butt a couple of times working as a consulting engineer. Here’s an example: Last year I was working on a project where we upgraded a sewage treatment plant. My team specified some air flow instruments for the aeration system. The contractor incorrectly installed the instruments, and they didn’t work. Then the site engineer contacted me telling me that I had failed to tell the contractor the necessary information to install the instruments correctly. While I was on the phone with the site engineer, I searched through my MailSaver mail archive. I immediately sent him the message that the contractor had emailed me one day asking if there were any special piping considerations required when installing these meters, and the message I sent back to the contractor a few minutes later including the manual in PDF format, with a citation of the exact page and paragraph, showing instructions on exactly how to install the instrument. The site engineer said “thanks” and went after the contractor instead.
A couple of bugfixes have been written for MailSaver 3, and after Novell created Novell Forge, I moved MailSaver to be hosted there, and it started to be used by people outside Associated Engineering. I’ve had a couple of hundred downloads, and a little feedback. Then, today Novell featured MailSaver in their GroupWise Cool Solutions Newsletter, and I presume a bunch of people have downloaded it. I’ve received several feature suggestions already, and some positive feedback too.
I’m glad other people are finding it useful. If you’re curious, you can find it here.
I’ve blogged before about the trials and tribulations of scripting rsync on NetWare, so that I can back up my NetWare servers to a big-assed storage array running Linux. Today I actually got some working stuff that I feel happy about rolling out, freeing us to start turning off tape drives and relying on the storage server for backups.
Rsync on NetWare can be run in two different ways. The first way is to run it as a command-line tool, kind of like xcopy or cp. The other way is to run it as a server daemon that offers up a share point, kind of like smbd. Initially, I wanted to run rsync on my NetWare servers in server-daemon mode, and just script the backups to happen by adding cron jobs running Bash shell scripts on the Linux storage server. Then in the lab, this seemed to completely pound the CPU of my NetWare test boxes into the ceiling, making the NetWare box unresponsive during the backup. I didn’t like that, so I tried running rsync in daemon mode on the storage server, and pushing the backups up from the NetWare box by scripting rsync in an “ncf” (NetWare batch) file. This was great except for one thing. When you run three sequential rsync commands in an ncf file, they don’t wait for each other to complete. You end up with three concurrently running rsync processes. If I thought daemon-mode rsync on NetWare was hard on the CPU, this was way worse.
Then I tried writing a Java wrapper around the rsync calls on NetWare, hoping to be able to cause the rsync jobs to wait and only run one at a time. I also wanted to use the Java program to capture the return value from rsync so I could automatically generate status emails to send a “success” or “error” message to the administrator. As mentioned previously, this works great on any other platform that supports Java and rsync, but on NetWare, you can’t make anything wait for bloody rsync to complete. When it starts, it goes off into its own little environment and you can’t communicate with it anymore.
Finally, yesterday when I got back to working on this after a hiatus to work on some project stuff, I decided to screw the Java method, go back to the original concept of running rsync in server-daemon mode on NetWare, and use Bash scripts and cron on the storage server to pull the backups from the NetWare servers. I wrote a generic script to do the backups, capture the return code from rsync, and send a result email to myself. Then I configured it to do three different servers in the production environment. Guess what? It just works. Bash is your friend. Or maybe Bash is just my friend.
On to the next thing, which is using DirXML (Novell’s Nsure Identity Manager) to replicate our user credentials for all offices from our enterprise LDAP directory down to each local office. This project will allow our users to log in to any of our networks with the same credentials. This will also allow us to write our own web applications to target a common authentication database no matter where we choose to deploy them.
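To give a feel for the shape of such a script, here is a minimal sketch, not the actual production script. It assumes an rsync daemon on each NetWare server that exposes a module, and a `mail`-style command on the storage server that accepts `-s SUBJECT ADDRESS`. The host names, module names, paths, admin address, and the `backup_one` function name are all made up for illustration.

```shell
#!/bin/sh
# Sketch of a pull-style backup run on the Linux storage server.
# Every host, module, path, and address below is hypothetical.

RSYNC_BIN="${RSYNC_BIN:-rsync}"      # overridable, e.g. for a dry run
MAIL_BIN="${MAIL_BIN:-mail}"         # assumed to accept: -s SUBJECT ADDRESS
ADMIN="${ADMIN:-admin@example.com}"  # placeholder address

# backup_one HOST MODULE DEST
# Pulls rsync://HOST/MODULE/ into DEST/, mails the outcome to $ADMIN,
# and returns rsync's exit status.
backup_one() {
    host="$1"; module="$2"; dest="$3"

    # The shell blocks here until rsync exits, so jobs never overlap.
    output=$("$RSYNC_BIN" -a --delete "rsync://$host/$module/" "$dest/" 2>&1)
    status=$?

    if [ "$status" -eq 0 ]; then
        subject="Backup success: $host/$module"
    else
        subject="Backup error ($status): $host/$module"
    fi

    # rsync's own output becomes the body of the status email.
    printf '%s\n' "$output" | "$MAIL_BIN" -s "$subject" "$ADMIN"
    return "$status"
}

# Example (hypothetical hosts), run nightly from cron on the storage server:
#   for h in nw1 nw2 nw3; do backup_one "$h" vol1 "/backup/$h/vol1"; done
```

Because the shell waits for each command to finish before starting the next, looping over three servers runs the jobs strictly one at a time, which is exactly what the ncf file and the Java wrapper could not manage on NetWare.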
I’ve been spending a lot of time doing an internal engineering project review. It was a big task with thousands of pages of documentation to go through and several people to interview. I just e-mailed off the PDF of the report a little while ago, and now I should be free to concentrate on the next thing.
With that focus, I’ve already rewritten our rsync backup: the Java program that pushed data from a NetWare server is now a Bash script that pulls it off the NetWare server onto our Linux storage server. Woo.
We massively cleaned house on the weekend in preparation for getting all the Christmas decorations out. Then, I dragged all the boxes out of the storage room and we spent a bunch of time setting up decorations in the house. It’s very Christmasy already and the kids are getting excited. It just lends a more “homey” air to the house. I’m starting to really look forward to Christmas this year.
The weather even co-operated on the Christmas feel, giving us our first really cold temperatures this winter, and a couple dozen centimetres of snow. Brr.