Throttling

Despite our best efforts, the new server slows down to a crawl in the middle of the afternoon, Pacific time.

The old one often hung or crashed, but the new one is more stable and just struggles through the traffic. Well, in a week or so, we'll have the new new server online running BC, and we hope to see much better performance during peak time.

But for now, it looks necessary to reduce the amount of service when the site is really really busy. So, we're going to throttle back the information sidebars on the main page and turn off the search box even for logged-in users during peak times. This usually amounts to 2 p.m. to about 4 p.m. PT, most days.

Should be temporary. After the server upgrade, we also have some software upgrades to try that may improve things.

Hugs,
Erin

Comments

miracle workers

Erin, you and Bob are true miracle workers. Considering the loads BCTS is receiving, it's amazing things have held together as well as they have! That is a lot of activity. All I can say is I am grateful for all the work you two put in here for us. I might not be very computer savvy, but I can read a graph (Scary)!

hugs!

grover

throttling

I second the above comment also! Richard

Richard

Performance and Tuning

It's a bit of a black art, tweaking the performance and tuning of multi-user database systems. Once upon a time, that was my job, and I'd be lying if I said it was always fun. It was rather depressing when something which should have worked to improve things didn't. And then there were times when you found a minor change which broke open a huge bottleneck, but then people would argue with you that certainly THAT couldn't have been the reason. (Usually, programmers. :) I was actually pretty good at it, but I was working on large systems where we had pretty decent diagnostic tools. Really expensive stuff, but in an environment where an outrageously overpriced $25K bit of software to monitor database calls could earn its purchase price the afternoon we installed it and spotted the first problem.

Depending on your system architecture, there are all sorts of tricks you can use to speed things up. With commercial relational databases like the ones I worked on, quite often creating the right index would make all the difference in the world. I'm not familiar with MySQL, but if that's what the CMS is using to store its rules and page contents, I'd look there first. Other times, hardware (specifically, how a database is distributed on disk) is the solution. Splitting a large database up among different physical disks, so that frequently executed, complicated joins of multiple tables can read those tables simultaneously, was one way to eke out more performance, usually far cheaper than trying to speed up the processor and memory. Of course, you need intimate knowledge of how the application is using the data, and which operations you should "cheat" and favor, something that might not be possible in a black-box Content Management System like Drupal. Even harder if the tools don't even exist.
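
For what it's worth, if MySQL works anything like the commercial systems I knew, checking for a missing index would look something like this (purely a sketch; the database, table, and column names here are invented, not Drupal's real schema):

    # Ask the database how it plans to run a suspect query; a full
    # table scan in the plan means there's no usable index.
    mysql -u root -p drupal -e "EXPLAIN SELECT * FROM comments WHERE node_id = 42;"

    # If that's the case, building the index is a one-liner.
    mysql -u root -p drupal -e "CREATE INDEX comments_node_id ON comments (node_id);"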

I wish I knew something about MySQL and Drupal. I suspect the answer is in there, somewhere. I do know that the processors in today's x86 systems are way faster than the heavy-duty minicomputers that major companies were using not all that long ago to run their businesses. Not only that, but the amount of main system memory has grown enormously in recent years. You've got to be bottlenecking at the disk-access level. That's the physically slowest part of a computer, and a non-optimized or missing index makes things a million times worse. I wish you the best of luck in finding a fast, cheap, and easy way to improve performance. Indexing, and tinkering with the physical distribution of a database's tables across disks -- those are nifty tricks if you have direct control over those sorts of things, and even better if you can figure out where to use them. It's not always possible in a proprietary system.

Optimization

Bob moved the database onto one disk and the code onto the other, and that sped things up. Then we got a machine with faster drives and more RAM, and that sped things up again. The newest machine has even more RAM, twice as many processors, and a faster bus. The bottleneck is still going to be disk access.

Ideally, we'll someday have three servers, one for the database and two for a distributed or mirrored front end. I'll look into how to split the database among different disks. I can think of one way to do it. Thanks for the suggestion. :)
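
Maybe something like symlinking the busiest tables onto the second spindle (just a sketch; the paths and table name are placeholders, and I'd have to check that our MySQL build supports symlinked MyISAM tables):

    # Stop the server so the table files are quiescent
    # (the init script name varies by distro).
    /etc/init.d/mysql stop

    # Move one busy table's data file to the second disk and
    # leave a symlink behind in the data directory.
    mv /var/lib/mysql/drupal/comments.MYD /disk2/mysql/comments.MYD
    ln -s /disk2/mysql/comments.MYD /var/lib/mysql/drupal/comments.MYD

    /etc/init.d/mysql start

I gather some table operations (OPTIMIZE, ALTER) can quietly replace a symlink with a real file, so it would need re-checking after maintenance.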

Hugs,
Erin

= Give everyone the benefit of the doubt because certainty is a fragile thing that can be shattered by one overlooked fact.

Small Computer Systems/Serial Interface

HeHe, you probably knew that I was gonna say this before you finished posting, but I still say for your dedicated database server, you want 10K or 15K RPM SCSI LVD hot-swap drives with hardware RAID 5 (striping with parity) .... 4 drives will give you the combined space of 3 of them, with the 4th's worth of capacity going to redundant parity data... Add in a 5th drive as a hot spare, and you don't need to do anything if any one drive dies, except eventually travel to the data center and swap the dead drive for a new one to act as the new spare :)
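
If anyone ever does that with Linux software RAID instead of a hardware controller, the rough equivalent with mdadm would be something like this (device names made up, of course):

    # Four active drives in a RAID 5 array, plus a fifth as a hot
    # spare; mdadm treats the last listed device as the spare.
    mdadm --create /dev/md0 --level=5 --raid-devices=4 --spare-devices=1 \
        /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

    # Filesystem on the array, mounted where MySQL keeps its data.
    mkfs.ext3 /dev/md0
    mount /dev/md0 /var/lib/mysql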

But I do understand cost efficiency and your limited budget, so I will get off my SCSI soap box :)

-HuGgLeS-
-P/KAF


"She was like a butterfly, full of color and vibrancy when she chose to open her wings, yet hardly visible when she closed them."
— Geraldine Brooks


It Doesn't Always Work That Way

What you describe, Piper, is a very common way to speed up a file server, where the access is unpredictable, and the usage patterns are random. That's where a RAID can really shine.

In a specific application, however, especially one based on indexed or unindexed tables, a RAID setup can give much poorer performance than 4 separate disks. The reason is that a RAID acts like one big disk. As you know, the slowest part of a disk is the head-stepper motor, especially when it has to do random seeks across the whole disk. In a RAID, all the head motors have to step together to read little parts of the same file or table.

If instead you're dealing with a specific application that you understand, and you know it needs to access different tables at the same time to do something, putting those separate things on physically separate, unsynchronized hard drives lets all the heads do their own separate thing. That does away with the head thrashing that comes from reading records or indexes from different tables, as part of the same query, off one big single drive. (On some high-end commercial systems, you can declaratively force indexes and small, commonly used tables to stay cached in RAM, which can give really impressive performance. Without actually knowing anything about it, I'm going to guess that might not be a feature available in free, open-source MySQL.)
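
(Postscript: a little searching suggests MySQL 4.1 and later actually do have a version of this for MyISAM tables: you can size a separate, named key cache and pin a table's indexes into it. A sketch, with the cache, database, and table names invented:)

    # Carve out a dedicated 64 MB key cache, assign the busy table's
    # indexes to it, and preload them into RAM.
    mysql -u root -p -e "SET GLOBAL hot_cache.key_buffer_size = 67108864;"
    mysql -u root -p drupal -e "CACHE INDEX comments IN hot_cache;"
    mysql -u root -p drupal -e "LOAD INDEX INTO CACHE comments;"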

Piper != RDBMS GuRu

I have to admit that I know VERY little about relational databases and RDBMSes. In fact, in the past I have basically used MySQL more like a flat-file database than a true relational database (or so Kim had told me).

I'm learning as I go, and always open to learning more. But generally, every linux MySQL installation I've run across has had only one directory for storing all the .db files that hold the databases/indexes/etc... Therefore, using what I know, Erin's Drive A for the OS and Drive B for MySQL is the best that could be achieved. But I still think the site would gain from the redundancy of RAID and the speed of 15K RPM SCSI LVD drives.

-HuGgLeS-
-P/KAF


"She was like a butterfly, full of color and vibrancy when she chose to open her wings, yet hardly visible when she closed them."
— Geraldine Brooks


RAM-Disk?

Hmmm... If you're stuck using a single directory for the RDBMS, then it pretty much has to be on one drive, whether it's a single physical disk or a RAID. If the database could use more than one directory, then obviously you could mount physical disks anywhere in the directory tree you wanted in linux. I do wish I knew more about MySQL. One of these days I might get serious about setting up my linux machine...

What's in the RDBMS for Drupal's setup? It doesn't contain the stories, does it? Are all the comments in it? What about the blog entries? If the Drupal database is just pointers to the text files, and those in turn are just on disk as normal files, then that means the database itself is a fairly compact and tidy thing. Which means, assuming it's already optimally indexed and still bottlenecking at the drive hardware, there'd be a real easy way to speed the thing up in linux:

Copy it to a RAM-disk on system start-up, and let MySQL use that as its primary database. That dispenses with the relatively ginormous delays (multiple milliseconds -- that's a huge interval in a machine doing everything else at gigahertz speeds!) inherent in moving the disk heads for every read. Unless you can mirror or replicate the database to a physical disk, so all writes get recorded on something less ephemeral, this is of course a recipe for disaster. So, it would have to be a RAM-disk in conjunction with some sort of mirror onto a physical drive, or ideally, database replication onto a physical disk. Reads would be lightning fast, and there's way more reads than writes in a CMS, I'm guessing.

I know most linux builds contain support for RAM drives. MySQL contains support for replication, but I don't know if the copy can be on the same server. If so, I'd guess that'd be the quickest, and cheapest, way to speed things up.
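
The copy can, in fact, live on the same server: two mysqld instances will run on one box if they use different ports, sockets, and data directories, with the master on the RAM-disk and the slave persisting to real disk. A bare sketch (the size, paths, and port are invented, and the replication wiring itself, the server IDs and CHANGE MASTER TO, is left out):

    # RAM-disk big enough to hold the whole database.
    mkdir /mnt/ramdb
    mount -t tmpfs -o size=512m tmpfs /mnt/ramdb

    # Seed it from the on-disk copy at boot.
    cp -a /var/lib/mysql/. /mnt/ramdb/

    # Second instance serving from RAM on its own port and socket;
    # the original on-disk instance stays up as the replication slave.
    mysqld_safe --datadir=/mnt/ramdb --port=3307 \
        --socket=/tmp/mysql-ram.sock &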

DB = everything

Everything, stories and comments included, lives in the database. It's the weak point of Drupal.

Hugs,
Erin

= Give everyone the benefit of the doubt because certainty is a fragile thing that can be shattered by one overlooked fact.

document management

The story database is actually a document management system. A fully indexed document management system will use about 30% of the disk space just for indexing. The only document management systems I had exposure to would kill the budget here, but the concept remains the same, i.e., the actual document database needs to be heavily indexed. It is also probably worthwhile to cache the last few days on the web front end(s) and let the more detailed searches run elsewhere. People will expect crisp response on recent data and will accept slower response on more involved searches.
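
A crude version of that front-end cache can be rigged with nothing more than cron and wget, snapshotting the busiest pages to static files every few minutes (the URL and path here are invented):

    # crontab entry: refresh a static copy of the front page every
    # five minutes; the web server hands out this copy during peaks.
    */5 * * * * wget -q -O /var/www/cache/front.html http://localhost/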

I have said it before and will say it here again: this site performs extremely well compared to major commercial sites with equivalent loads and orders of magnitude more resources. If you need to throttle or reduce functionality, so be it; let those of us who enjoy the site just be thankful it is there.

It's all Greek to me!

I was just reading this thread and didn't understand one darn thing! lol

Does everyone here have a programming background? When I saw "throttling" I kinda looked forward to an automobile thread. :(

Mr. Ram