Archive for the ‘Work’ Category.

LVM Extension Process

dd if=/dev/zero of=/mnt/remotestorage/linuxupdates-2 bs=1k count=1048576
losetup /dev/loop2 /mnt/remotestorage/linuxupdates-2
pvcreate /dev/loop2
vgextend updates_store /dev/loop2
lvextend -L+1G /dev/updates_store/store
umount /mnt/vg
e2fsck -f /dev/updates_store/store
resize2fs /dev/updates_store/store
mount /dev/updates_store/store /mnt/vg

For each further extension, the number in the filename (linuxupdates-2) increments, the loopback device (loop2 here) increments along with it, and the ‘count’ in the dd command is a multiple of the one listed (1048576 blocks of 1k, which is 1GB).
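
The pattern can be captured in a small helper that only builds the command strings for the Nth extension, so they can be looked over before anything is run as root. This is a rough sketch in Python, not part of the process above: the function name and its parameters are made up for illustration, and it assumes the backing file and the loop device always share the same number, as they do in the example.

```python
def lvm_extend_commands(n, size_gb=1,
                        backing_dir="/mnt/remotestorage",
                        vg="updates_store", lv="store",
                        mount_point="/mnt/vg"):
    """Return the shell commands for the n-th extension as a list of strings.

    Nothing is executed here; the caller reviews the commands and runs
    them by hand.
    """
    backing_file = f"{backing_dir}/linuxupdates-{n}"
    loop_dev = f"/dev/loop{n}"           # assumed to track the file number
    lv_path = f"/dev/{vg}/{lv}"
    count = size_gb * 1048576            # number of 1k blocks: 1048576 per GB
    return [
        f"dd if=/dev/zero of={backing_file} bs=1k count={count}",
        f"losetup {loop_dev} {backing_file}",
        f"pvcreate {loop_dev}",
        f"vgextend {vg} {loop_dev}",
        f"lvextend -L+{size_gb}G {lv_path}",
        f"umount {mount_point}",
        f"e2fsck -f {lv_path}",
        f"resize2fs {lv_path}",
        f"mount {lv_path} {mount_point}",
    ]


if __name__ == "__main__":
    # Print the commands for a third, 2GB extension.
    print("\n".join(lvm_extend_commands(3, size_gb=2)))
```

Calling lvm_extend_commands(2) gives back the same nine commands listed above.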

Ecommerce SLA Update

The latest numbers. You can see that things are improving a little on average as our transaction count goes up. We are still significantly short of the 5-second goal, but meeting it will require me to move the posting process out of the linear flow.
Total Transactions: 71
Min: 3
Max: 34
Average: 7.55

SLA   Target   Actual   Diff   Real %
50%       36       17     19   23.94%
90%       64       63      1   88.73%
98%       70       70      0   98.59%
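
For anyone checking the arithmetic, here is a small Python sketch that rebuilds the rows of the table. The rounding rule is an assumption inferred from the numbers (the target looks like the percentage of the total rounded to the nearest whole transaction), and the function name is made up for illustration.

```python
def sla_row(total, percent, actual):
    """Return (target, actual, diff, real_percent) for one SLA tier.

    Assumption: the target count is `percent` of `total`, rounded to the
    nearest whole transaction, with halves rounding up.
    """
    target = (percent * total + 50) // 100    # integer round-half-up
    diff = target - actual                    # how many transactions short
    real_percent = round(100 * actual / total, 2)
    return target, actual, diff, real_percent


if __name__ == "__main__":
    total = 71
    for percent, actual in [(50, 17), (90, 63), (98, 70)]:
        target, actual, diff, real = sla_row(total, percent, actual)
        print(f"{percent}%  {target:>3} {actual:>3} {diff:>3}  {real:.2f}%")
```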

Ecommerce Performance Tuning

As we all know (if you’ve ever used it or heard me complain), the ecommerce process (recording the payment, etc.) has been running pretty slowly lately. My boss and I recently had a discussion where we agreed we wanted to spend some time fixing this. I put some better logging in place to see where the bottlenecks might be and, based on the information gathered, I have been able to make some upgrades and modifications to enhance the overall performance of the system.
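
To give an idea of what that kind of logging can look like, here is a minimal sketch of per-step timing. It is written in Python rather than the Perl the site actually runs on, it is not the real logging code, and the step names in the demo are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("ecommerce.timing")


@contextmanager
def timed(step, timings):
    """Add the wall-clock time of the wrapped block to timings[step]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[step] = timings.get(step, 0.0) + elapsed
        log.info("%s took %.3f s", step, elapsed)


def slowest_first(timings):
    """Return (step, seconds) pairs with the biggest bottleneck first."""
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    timings = {}
    # time.sleep stands in for the real work of each (hypothetical) step.
    with timed("record_payment", timings):
        time.sleep(0.01)
    with timed("send_emails", timings):
        time.sleep(0.05)
    for step, seconds in slowest_first(timings):
        print(f"{step}: {seconds:.3f} s")
```

Keeping the totals in a plain dictionary makes it easy to sort the steps and see at a glance which one to work on first.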

It turned out that the process of sending out emails was really the killer initially. They were taking (cumulatively, as there are an average of about 3 that go out per transaction) around one minute on average, and up to 2 or 3 minutes in the worst case. I have redone the way those emails are being sent and have trimmed that down to an average of under 2 seconds (for all of them). That alone made a huge difference, but it isn’t all that has changed.

There is a query that I run to determine the journal number (basically a batch number) of the transaction that was just posted. It turns out that it was taking anywhere from 3 – 10 seconds to run. Upon further investigation, the query wasn’t making use of any of the indexes on the table (and it’s a large one, so many, many records). I made a small modification there that reduced its run time to under a second.
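
As an illustration of how to check whether a query uses an index, here is a small Python sketch that asks the database for its query plan. Everything in it is an assumption made for the example: SQLite stands in for our actual database, the table and column names are invented, and wrapping the column in a function is just one common way a query ends up ignoring an index, not necessarily what was wrong with mine.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table with an index on student_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE postings ("
        " id INTEGER PRIMARY KEY,"
        " student_id TEXT,"
        " journal_no INTEGER)"
    )
    conn.execute("CREATE INDEX idx_postings_student ON postings (student_id)")
    return conn


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


# Wrapping the indexed column in a function hides it from the index...
UNINDEXED = "SELECT journal_no FROM postings WHERE upper(student_id) = ?"
# ...while comparing the bare column lets the planner use it.
INDEXED = "SELECT journal_no FROM postings WHERE student_id = ?"

if __name__ == "__main__":
    conn = make_demo_db()
    print("function on column:", query_plan(conn, UNINDEXED, ("S123",)))
    print("bare column:       ", query_plan(conn, INDEXED, ("S123",)))
```

The first plan should report a scan of the whole table and the second a search using the index; the exact wording depends on the SQLite version.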

Right now I have good data on about 32 transactions that have occurred since I made the bulk of the changes on Tuesday and Wednesday. Here’s a breakdown of the numbers and how they fall into a Service Level Agreement that we are working up:

  • Trad Student Account Payments: 7 @ 18, 34, 8, 8, 5, 13, 10 seconds
  • Non Trad Student Account Payments: 12 @ 6, 9, 7, 5, 4, 3, 4, 6, 9, 12, 11, 5 seconds
  • Non Trad Web Reg Payments: 13 @ 7, 7, 5, 8, 8, 7, 7, 8, 9, 16, 6, 7, 8 seconds

The Service Level Agreement (SLA) is currently defined as the following:

  • 50% of the transactions should occur within 5 seconds
  • 90% of the transactions should occur within 10 seconds
  • 98% of the transactions should occur within 20 seconds

When you look at the numbers I provided previously, we are currently short of that SLA:

  • 5s target: Need 16, only have 7 for a real percentage of 21.88%
  • 10s target: Need 29, only have 26 for a real percentage of 81.25%
  • 20s target: Need 31 (98% of 32 is 31.36, which rounds down to 31), have 31, but that is only a real percentage of 96.88%

Only the 20-second target is currently within limits. We aren’t far off on the others, but we aren’t done with performance enhancements either. When you look at the averages, things look a little better:

  • Max: 34 seconds (fluke, very high system load at the time)
  • Min: 3 seconds
  • Average: 8.75 seconds
  • Median: 7.50 seconds
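
Here is a short Python sketch of how numbers like these can be pulled out of a list of timings. To keep it small it only uses the seven Trad Student Account timings listed above, so its results describe that one group and not the full set. It assumes “within N seconds” means N seconds or less, and the function names are made up for illustration.

```python
from statistics import mean, median

# (seconds, percent) pairs from the SLA definition
SLA_TIERS = [(5, 50), (10, 90), (20, 98)]


def summarize(timings):
    """Return the min, max, average and median of the timings in seconds."""
    return {
        "min": min(timings),
        "max": max(timings),
        "average": mean(timings),
        "median": median(timings),
    }


def within(timings, seconds):
    """Count the transactions that finished in `seconds` or less."""
    return sum(1 for t in timings if t <= seconds)


if __name__ == "__main__":
    trad = [18, 34, 8, 8, 5, 13, 10]   # Trad Student Account Payments
    stats = summarize(trad)
    print(f"Min {stats['min']}, Max {stats['max']}, "
          f"Average {stats['average']:.2f}, Median {stats['median']}")
    for seconds, percent in SLA_TIERS:
        have = within(trad, seconds)
        print(f"{seconds}s target ({percent}%): have {have} of {len(trad)} "
              f"({100 * have / len(trad):.2f}%)")
```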

These numbers are with the posting process (what actually applies the payment to the account) still being a part of the total time. We’ve discussed having that process run sometime after a payment has been made, but within a minute or two of the actual payment. If we go that route, we should be able to subtract at least 80% of the time that we are currently seeing (as far as the student is concerned).

Work Happenings

What say you to an update about work stuff first? No? Too bad, read another post then (more will be coming soon that aren’t work related). Sometimes it is hard for me to really step back and see just how much stuff has changed over, say, the last six months. We are in the middle of yet another web registration window. It seems that one just barely closes, and then bam, another one opens up. The real trick this time is that they decided to do summer school registration online and at the same time as fall 2006 registration. Until this time around we had only done one semester at a time. Until this time around I had no idea how badly multi-semester registration could really screw up my code. Until this time around … argh! Eh, whatever, I’m still in the process of correcting all the little problems that showed up as a result of this. The problems are almost all related to money, but aren’t they always?

More interesting things are going on, like the reason I’m in the office at 4:00 AM today. Ok, so it isn’t that much more interesting as I’m here because I couldn’t sleep and just happened to crank out some pretty nice work (if I must say so myself). Specifically, we have been having performance issues with the Web Payment stuff (pay your student account online, etc). The problems came at the very end of the process when we actually process the credit card info, send out some emails, and log all the pertinent data in the database. The user would click the submit button that began the process, then depending on how impatient they are, would wait around for up to 45 seconds while the page loaded. The really impatient ones would really screw the process up by leaving the page early. I know, I know, that shouldn’t affect anything other than what they see, but it did. Since I was in the office tonight, I decided to take some action.

We had been talking about putting some logging in place to record how long various parts of the process took to run (so we’d know where to work on things if they were in our control). So, I dropped some logging in on the testing site. Wow! The places I thought we were having issues were actually the ones performing the best. It turned out that sending the emails out was the slowest part of the whole thing, and we send quite a few emails (payment notifications, error messages, receipts, etc.). So it was time to look into optimizing that section. I was using a nice CPAN module (yes, this is all in Perl … oh why wouldn’t they let me use PHP!?!) that made sending the emails pretty easy; it just didn’t make it fast. Since we have Sendmail installed on this machine, I figured why not pass the email off to Sendmail and let it worry about everything else. Huge speed increase! Sending a single email went from taking around seven (ya, 7) seconds down to under a single second. Makes all the difference in the world.
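
For the curious, handing a message off to the local Sendmail looks roughly like the sketch below. It is in Python instead of the Perl the site uses, so it is not the actual code. The path to the sendmail binary is an assumption, and the function names and addresses are made up for illustration.

```python
import subprocess
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_sendmail(msg, sendmail_path="/usr/sbin/sendmail"):
    """Pipe the message to the local sendmail binary.

    -t   take the recipients from the message headers
    -oi  do not treat a line containing only "." as the end of the message
    The path is an assumption; adjust it for the machine in question.
    """
    subprocess.run([sendmail_path, "-t", "-oi"],
                   input=msg.as_bytes(), check=True)


if __name__ == "__main__":
    receipt = build_message("payments@example.edu", "student@example.edu",
                            "Payment receipt", "Thank you for your payment.")
    print(receipt)   # call send_via_sendmail(receipt) to actually send it
```

The web process only has to wait for the local hand-off to finish; after that, delivery is Sendmail’s problem.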

I have been doing a little more than just programming lately. We purchased a wiki for use here in IT called Confluence. It’s pretty nice, even though it is built on Java. Since I was the only one in the department who had any real wiki experience, the project was given to me! I was able to talk Tech Services into giving me a machine for the wiki, on which I have installed Gentoo. It’s sitting back in “DC3” (a rack in the shop area, not a proper Data Center at all) just humming away right now. The software gave me a few problems at first. We can use our Active Directory server for authentication and group memberships, so that’s really nice, but the version of Confluence that we started with was really, really slow when I turned on AD authentication. A couple of weeks later (the wiki wasn’t really being used yet, still in testing) a new version came out that contained a ton of LDAP/AD enhancements. Those enhancements brought the speed back up to where it should be. I’m just glad it wasn’t my poor sysadmin skills that were causing the problem. So anyways, the wiki is now being used a little … and soon it will be the next big thing!

The big news around here right now is the office renovations. They are doing everything possible to fit more people in the limited space that we currently have. The problem is that we just keep growing and have no place to put any of the new hires. The construction should begin pretty soon now that the permits seem to have been properly obtained from the city. We shall see, and so will you since I’m going to try and take some pictures as the process evolves.

Look for additional posts soon that will cover the non-work related happenings around these parts.