brendan
Everything posted by brendan
-
Okay, I think this is finally beginning to come together and make sense. I don't know yet what the solution is, but I'm starting to understand what the problem was.

First of all, even though The Informant clearly was not the culprit in this particular server crisis, I see what you are saying re: long-term bandwidth use, and I have already completely disabled The Informant's checking of my CGI file. So, problem solved.

Secondly, as for the meaning of the log file, thanks for clearing things up. What you're saying about CPU not being the same thing as bytes transferred makes sense, even though I didn't think of it that way before.

Thirdly, I did look at the raw log file via CPanel, and there were no attempts to access any other mt.cgi files during the problem time period. Therefore, it was definitely /cgi-bin/mt/mt.cgi that was the culprit.

As I've said, and as the log demonstrates, I wasn't using mt.cgi in an unusual way when the site went down; I wasn't even doing a full rebuild or anything. I was just trying to save a couple of regular posts to my blog -- nothing special. And the problem can't be blamed on the normal way that MT functions, because as I've said, if that were the case, I would have had this problem long, long ago. My blog didn't just spout 8,000 posts overnight. However, I think you hit the nail on the head when you said:

I suspect that your guess is exactly right -- one instance of mt.cgi interfering with another and causing a hang. That makes sense, because even prior to the server slowness issues (which I unwittingly caused, but which also affected me as they were occurring), I was having problems with my Internet connection in my apartment, and because I tend to be an impatient person when it comes to technology, I may very well have started multiple instances of mt.cgi in rapid succession when the previous ones did not seem to be working. The problem then snowballed as not only my Internet connection but also the server became slow, and I did the same thing again. The log clearly demonstrates that I did so at least two or three times, and I think the problem may have actually started even earlier than 3:00 AM EDT. So it was probably one interference/hang piled on top of another, until the server couldn't handle it anymore.

So basically, this is not an issue of MT security, but rather, of MT getting hung and/or some sort of interference occurring. The central question, then, is not how to "secure" my MT installation (although that's no reason to shy away from upgrading to the newest version, installing the best security possible, etc.), but rather, how to prevent mt.cgi from "hanging" in the future. Part of this may be modifying my own behavior (i.e., not being so trigger-happy to click on my "login" bookmark multiple times when it doesn't seem to be working the first time), but I will also want to look into technological solutions, or at least partial solutions. In addition, it would be extremely helpful to know how long it takes before mt.cgi "times out" -- i.e., how long I need to wait before it is safe to load a new instance of the script.

I'm starting a new job tomorrow morning, so I have to go to bed, but after work tomorrow evening, I will look into these issues further. I'm relieved to finally know (or think I know!) what problem I am trying to solve, at least in general terms. As you're all well aware, it's very hard to troubleshoot when you have only the foggiest, vaguest idea of what the "trouble" is... But now it's starting to come into focus. I think my mt.cgi script "hung," probably because of multiple instances. So now I just need to know what specific conditions led to that occurrence, and how to prevent it in the future. Thanks!
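
One possible technological safeguard against overlapping instances is an exclusive lock that a second request fails to obtain while the first is still running. The following is only a minimal Python sketch of that idea: mt.cgi itself is a Perl script, and the lock path and function names here are hypothetical, not part of Movable Type.

```python
import fcntl
import os
import tempfile

# Hypothetical lock-file location; MT itself does not create this file.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "mt-cgi-sketch.lock")


def try_acquire(path=LOCK_PATH):
    """Return an open file holding an exclusive lock, or None if it is already held."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def release(handle):
    """Release the lock and close the underlying file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()
```

With something like this in place, an impatient second click would be turned away immediately rather than starting another instance alongside one that is still running.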
-
Rick, forgive me for duplicating my PM, but I'm not sure which is the most efficient. My question: What exactly *is* the log that you made available to me? I know that it doesn't show my entire website, because I was also accessing my /gallery/ directory during that time, and the log didn't show that. Is the log limited to my /cgi-bin/mt/ directory? If so, perhaps we're looking at the *wrong* mt.cgi. I have two other installations, /cgi-bin/mt3/ and /cgi-bin/mt_berkeley/. Bill H. said he was unsure *which* mt.cgi caused the server load early this morning. Perhaps it was one of the others (in which case this must definitely have been some kind of attack, since I never use those installations anymore). Do you have logs for those directories?
-
I just had a "duh" moment of my own, with regard to the source of these "GET" codes. I have a program on my PowerBook called The Informant which checks the status of several websites once a minute, to make sure that both my Internet connection and my crucial website pages are up and running. I will stop it from checking mt.cgi... but 1,662 bytes once a minute cannot possibly be overloading the server, and I have been running that program for months with no problems. So the culprit must be something else.

So let's look at the log from 3:00 - 4:30 AM last night, with those "GET" commands excluded:

Those all look like perfectly normal commands.

At 3:27, I loaded the main editing page for my blog, which caused all the /mt-static/ files to load as well. That's the first 20 entries on the log. Total bytes: 56,575. Nothing extraordinary.
At 3:29/3:30, I posted to the blog. Two entries on the log, total bytes: 25,271.
At 3:40, I posted to the blog again. Two more entries on the log, total bytes: 26,450.
At 3:45, I loaded mt.cgi again. One entry on the log, 13,720 bytes.
At 3:46, I loaded mt.cgi twice (I think this was because I was having trouble with my Internet connection). Two entries on the log, 27,029 bytes.
At 3:47, I made a failed attempt to post to the blog. The attempt used only 5 bytes. By this time, the server slowdown had begun. (I filed a help ticket at 3:50, ticket # BNA-69338, complaining that "My PHP photo galleries and my blog CGI scripts have been excruciatingly slow." That was 12 or 13 minutes before my site was suspended for supposedly *causing* the slowdown that I was experiencing.)
Five seconds later, I loaded mt.cgi again. One entry on the log, 20,653 bytes.

After that, all of my attempts to either load mt.cgi or post to the blog failed, because of the server slowdown. Attempted post at 3:51... 152 bytes. Attempted load at 3:59... 152 bytes. In both cases, I believe those bytes came from the error message "Got an error: Bad ObjectDriver config: Connection error: Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)," which I documented on my help ticket.

That's it. At 4:02, I attempted to load mt.cgi and I saw that my site had been suspended. (You can tell because this was my first "302" code.)

Total number of bytes in the hour prior to my site being shut down: 170,007. If you add in the 83,660 bytes from my automated software pinging mt.cgi once a minute, the total is 253,667 bytes in an hour. Nothing extraordinary, nothing that would shut down a server.

So, after reading the log, I'm afraid I'm more perplexed than ever. Do you have access to logs of the data that led your techs to conclude that my mt.cgi was the resource hog on the server? It couldn't have been this data. I'd love to look at those logs and see what they saw, to get a better idea of what happened. Either there's something that this log isn't picking up, or there was a mistake and my mt.cgi was not the culprit.
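
The totals above can be re-added with a few lines of Python. This is just a check of the arithmetic, using the byte counts exactly as quoted in this post; the variable names and the per-second average are additions for illustration.

```python
# Byte counts as quoted above for the manual requests between 3:27 and 3:59 AM.
manual_requests = [
    ("3:27 editing page plus /mt-static/ files", 56_575),
    ("3:29/3:30 post", 25_271),
    ("3:40 post", 26_450),
    ("3:45 load", 13_720),
    ("3:46 two loads", 27_029),
    ("3:47 failed post", 5),
    ("load five seconds later", 20_653),
    ("3:51 failed post", 152),
    ("3:59 failed load", 152),
]

# Figure quoted above for the once-a-minute automated checks.
ping_bytes = 83_660

manual_total = sum(size for _, size in manual_requests)
grand_total = manual_total + ping_bytes
average_rate = grand_total / 3600  # bytes per second, averaged over one hour

print(manual_total)  # 170007
print(grand_total)   # 253667
print(round(average_rate))  # 70
```

Averaged over the hour, that works out to roughly 70 bytes per second of transferred data.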
-
Thanks, David. My account was suspended at around 4:00 AM EDT this morning. Looking at the logs at that time, the only unusual behavior is that bizarre pattern of a "GET" command emanating from my apartment complex's IP address once a minute: Can you decipher anything from that pattern? What do the numbers (first 200 1662, then 200 140, then 302 309) mean? Also, can you do me a favor and upload another copy of that log file into my account sometime after 8:00 PM EDT (i.e., more than 10 minutes from now)? I am shutting down the computers in my apartment one by one, to try to figure out whether one of them is sending out a ping to my MT installation, unbeknownst to me.
-
Okay, I was going to keep the log discussion to the PM, but now that David has joined in ... basically, what the logs are telling me is that there is a "GET" command to mt.cgi emanating from my IP address (or rather, the IP address of my entire apartment complex -- but I doubt I'm being hacked by a neighbor, though I suppose it's possible!) once every minute!! As for a full rebuild... I have not done a full rebuild in recent days. What is the time zone of the timestamps on the log? In other words, what does "[15/Jun/2005:18:15:40 -0400]" translate to? Is 18:15 GMT or EDT or what?
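
For reference, assuming the log uses the usual Apache-style timestamp, the trailing "-0400" is an offset from UTC, so the clock time shown is local time four hours behind GMT (which is what EDT is in June). A minimal Python sketch that parses the timestamp quoted above and converts it to UTC:

```python
from datetime import datetime, timezone

# Timestamp exactly as quoted in the post above.
stamp = "[15/Jun/2005:18:15:40 -0400]"

# %z reads the "-0400" offset, producing a timezone-aware datetime.
local = datetime.strptime(stamp, "[%d/%b/%Y:%H:%M:%S %z]")
utc = local.astimezone(timezone.utc)

print(local.isoformat())  # 2005-06-15T18:15:40-04:00
print(utc.isoformat())    # 2005-06-15T22:15:40+00:00
```

Under that assumption, 18:15 in the log would be 6:15 PM Eastern, or 22:15 GMT.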
-
All I'm saying is, this isn't normal, because if it was, my account would have been suspended a long time ago. 85% server usage is not normal behavior, even for MovableType. I'm looking at that log you uploaded into my root, and something is seriously wrong... definitely not normal MT behavior. I'll PM you about it...
-
Thanks! I will take a look at those logs. I'm aware that MT rebuilding is a resource-intensive process (though less so under MT 3.x, as you say), but surely a normal rebuild should never take up 85% of the server's resources?! If it did, this would have happened to me many times before; I blog all the time! There must have been some other contributing circumstance... hopefully the logs will help me figure out what. Also, it's actually not true that the "entire site" is rebuilt every time a new post goes up. For example, publishing a new post does not rebuild every single individual entry in the blog -- that would take a good long time, and on those rare occasions when I do a manual rebuild of "all individual entries" (which I did not do last night, FWIW) it takes MUCH longer than publishing a single post does. If I'm not mistaken, publishing a post rebuilds the following things: 1) all index pages; 2) the individual entry for that post, and for the post immediately preceding it and (if applicable) the one immediately following it; 3) any relevant date-based archives (e.g. monthly, daily) for the time period of the post and the time period immediately preceding and following; and 4) the category archive for that particular post. In my case, the date-based and category archives are generated dynamically, so they are never "rebuilt" by MT. Thus, the only things that are automatically rebuilt when I publish a post are all my indexes and three individual archive pages: the one for the post I'm publishing and the ones before it and after it. Perhaps I can reduce the load slightly by cutting down on the number of index pages I have set to automatically rebuild... but as I said, surely 85% cannot be normal, even for someone with a large blog and multiple index pages. If it was, MovableType would be banned by every host on the Web!
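
The rebuild set described above can be modelled roughly as follows. This is a sketch of the description in this post, not of MT's actual code; the function name, page labels, index names, and entry count are all hypothetical.

```python
def pages_rebuilt_on_publish(entry_ids, published_id, index_pages, dynamic_archives=True):
    """Model the rebuild set described above; entry_ids is ordered oldest to newest."""
    pos = entry_ids.index(published_id)
    # The published entry plus its immediate neighbours, where they exist.
    neighbours = entry_ids[max(pos - 1, 0):pos + 2]
    pages = [f"index:{name}" for name in index_pages]
    pages += [f"entry:{entry_id}" for entry_id in neighbours]
    if not dynamic_archives:
        # Static date-based and category archives would be rebuilt as well.
        pages += ["archive:date-based", "archive:category"]
    return pages


# Hypothetical blog: 8,000 entries and three index templates (names invented).
entries = list(range(1, 8001))
rebuilt = pages_rebuilt_on_publish(entries, 8000, ["main", "rss", "atom"])
print(len(rebuilt))  # 5: three indexes, the new entry, and the one before it
```

On this model, publishing the newest post touches a handful of pages no matter how many entries the blog holds, which is why a single publish should cost far less than a full rebuild of all individual entries.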
-
Hi all, I need advice about my MT installation, and how to "secure" mt.cgi against possible attacks. Honestly, I am not sure how mt.cgi could even be attacked, since it's supposed to be executable only by people with access to my blog, but last night (wee hours of this morning, actually) it was using between 46% and 84% of server resources, and as a result, my account was suspended. I assume that such resource-hogging by mt.cgi must be the result of an attack of some sort. I certainly wasn't doing anything that would have caused such numbers, and I can't imagine that my fellow bloggers were, either. (Please note that it is specifically mt.cgi, not a comment or trackback script, that was hogging the server. I disabled my comment and trackback scripts in January after being suspended several times as a result of comment-spam floods, and I have been using an external comment provider (HaloScan) ever since. So unless spammers have some way of using mt.cgi directly, this is not a comment-spam issue.) TCH has told me that I must come up with "a plan to secure [my] MT installation" before my account will be unsuspended. The problem is, I don't know what I am "securing" it against! I feel like I'm flying blind. Comment-spam I understood, but this is bizarre. An mt.cgi attack? Huh? So anyway, I have several questions. First of all, what could possibly cause mt.cgi to reach such high server-resource levels? Does my suspicion that it was an "attack" of some sort make any sense? And what can I do to prevent it from recurring? I believe I am using either MT 3.15 or 3.16 (I can't recall which, and I can't check because my account is offline!); I know it is definitely 3.1x, the only question is what the "x" is. I have a MySQL back-end. I'm not sure what other information is relevant, but ask and ye shall receive. I would very much appreciate any advice that y'all may have. 
(And yes, I realize that the account-suspension aspect of my problem must be dealt with through the help desk, and that is already happening. I am just explaining that to provide context. What I am looking for here is help with "securing" my MT installation.)
