GreekChat.com Forums  

  #46  
Old 03-03-2018, 08:48 PM
John John is offline
Administrator
 
Join Date: Aug 1999
Location: NJ, USA
Posts: 2,098
Happened again 8pm tonight. All sorted out now.

Yesterday I set things up so I receive text and email notifications immediately after unexpected server reboots, which should let me correct any resulting issues faster if it happens again.

Also yesterday I discovered that all these server hard reboots are causing plenty of other problems that I'll probably be needing to sort out sometime soon as well. I'll post details regarding these other issues either later tonight or tomorrow.
__________________
John Hammell
Network Admin, GreekChat.com
  #47  
Old 03-04-2018, 05:40 AM
FSUZeta FSUZeta is offline
Super Moderator
 
Join Date: Sep 2003
Location: naples, florida
Posts: 15,256
You're the best John.
__________________
I live in Fantasyland and I have waterfront property.
  #48  
Old 03-04-2018, 12:49 PM
John John is offline
Administrator
 
Join Date: Aug 1999
Location: NJ, USA
Posts: 2,098
Another hard reboot around an hour ago... All fixed up again.

Quote:
Originally Posted by FSUZeta View Post
You're the best John.
Sure doesn't feel like it during times like this. I suppose on the positive side it's good that GC hasn't had any technical issues like this in probably over 10 years. And once all the issues are sorted out more permanently GC should end up with a server setup that is much more resilient to issues such as what we're facing now.
__________________
John Hammell
Network Admin, GreekChat.com
  #49  
Old 03-04-2018, 01:06 PM
AZTheta AZTheta is offline
GreekChat Member
 
Join Date: Aug 2009
Location: N 37.811092 W -107.664643
Posts: 5,035
What you do is totally voodoo to me.

I do appreciate it!
  #50  
Old 03-05-2018, 12:38 AM
John John is offline
Administrator
 
Join Date: Aug 1999
Location: NJ, USA
Posts: 2,098
Quote:
Originally Posted by AZTheta View Post
What you do is totally voodoo to me.
My first instinct was to respond that it's not as complicated as it seems... but it probably is. I've just been doing this stuff for so long that much of it has become second nature.

Quote:
Originally Posted by John View Post
I'll post details regarding these other issues either later tonight or tomorrow.
So...

The forum software we use here at GC, similar to most forum type software, uses the MySQL database software. MySQL, at least when this version of the forum software we are on was developed, defaulted to the MyISAM database storage engine.

And it turns out that the MyISAM database storage engine is not particularly resilient to sudden power loss as has happened with GC's server quite a few times in the past month.

Essentially, if the database server was in the process of saving any pertinent information when the power was disrupted, only part of the data may have been saved and the rest lost or corrupted. That may or may not corrupt various important data in the database.
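To make the partial-save problem concrete, here is a minimal Python sketch. It does not reproduce how MyISAM actually stores data; the record layout, the function names (encode_record, read_records) and the checksum are all assumptions made up for illustration. It simulates a write that is cut off halfway and shows that a per-record checksum at least lets the damage be detected.

```python
import struct
import zlib

# Hypothetical record layout (NOT MyISAM's real format):
# 4-byte payload length, payload bytes, 4-byte CRC32 of the payload.

def encode_record(payload: bytes) -> bytes:
    """Serialize one record with a length prefix and a trailing checksum."""
    header = struct.pack("<I", len(payload))
    footer = struct.pack("<I", zlib.crc32(payload))
    return header + payload + footer

def read_records(blob: bytes):
    """Return (good_payloads, damaged).

    damaged is True when the data ends in a truncated record or a record
    whose checksum does not match its payload.
    """
    good, pos = [], 0
    while pos < len(blob):
        if pos + 4 > len(blob):
            return good, True  # not even a full length header
        (length,) = struct.unpack_from("<I", blob, pos)
        end = pos + 4 + length + 4
        if end > len(blob):
            return good, True  # record was cut off mid-write
        payload = blob[pos + 4 : pos + 4 + length]
        (crc,) = struct.unpack_from("<I", blob, pos + 4 + length)
        if crc != zlib.crc32(payload):
            return good, True  # bytes are present but do not match
        good.append(payload)
        pos = end
    return good, False

# Two complete records, then a third that is only half written,
# as if the power failed partway through saving it.
table = encode_record(b"user=alice") + encode_record(b"user=bob")
third = encode_record(b"user=carol")
torn_table = table + third[: len(third) // 2]

intact, damaged = read_records(torn_table)
print(intact, damaged)
```

The sketch only detects the torn record; it cannot bring the lost half back, which is the situation described above.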

Up until March 1st this, as far as I can tell, wasn't a big issue since problems seemed to always impact non-essential areas of the database. But on March 1st the two reboots crashed the user database table. According to the forum software developer, this sort of crash (despite being "repaired" using MySQL's repair functions) may have corrupted some GCer account records. Those records may not be recoverable, in which case the owners of impacted accounts would need to start a new account.

I'm definitely not okay with that, so will be doing everything I can to ensure GC data is minimally impacted once all the server issues are sorted out. Nobody has emailed me so far about problems accessing their GC account, so maybe no account corruptions so far.

Also, I don't know for certain that the MySQL repair functions leave data without issues untouched. So maybe there is data corruption that is currently undetected. This is something that I'll be looking into.

---

What I'll be doing:

1. Stabilizing the GC hosting environment.

Currently I'm waiting for the datacenter to replace a faulty/failing power strip/distribution unit. After that I'll test the server hardware to determine if these problems are due to the server going bonkers or if it's the datacenter's PDU that caused the problems.

2. I've been researching what changes to make and I will either reinstall the current server or set up a new server in such a way that GC's database will be resilient (or at least significantly more resilient) to future power disruptions.

3. Possible data corruption. I'll try to determine if there is data corruption. If not, then we should be good from that point. However, if there is data corruption I might restore the last trusted database backup (which is from just before the first hard reboot back in December) and will merge all of the new stuff from then to current back into that known good copy of the database.

What that will do is limit any potential resulting data corruption issues to only the past 3 months rather than the entire history of GC.

Unsure about that part but it's something I'm considering.
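The restore-and-merge idea in step 3 could look roughly like the Python sketch below. Everything in it is an assumption for illustration: the row shape (dicts with hypothetical id and updated keys), the function name merge_since_backup, and the cutoff date, which is only a placeholder since the exact backup date isn't stated here.

```python
from datetime import datetime

def merge_since_backup(backup_rows, current_rows, cutoff):
    """Start from the trusted backup, then layer on rows from the current
    database that were created or modified after the cutoff.

    Rows older than the cutoff always come from the backup, even if the
    current database has a (possibly damaged) copy of them.
    """
    merged = {row["id"]: row for row in backup_rows}
    for row in current_rows:
        if row["updated"] > cutoff:
            merged[row["id"]] = row
    return [merged[key] for key in sorted(merged)]

# Placeholder cutoff: stands in for the time of the last trusted backup.
cutoff = datetime(2017, 12, 1)

backup = [
    {"id": 1, "updated": datetime(2017, 11, 1), "body": "old post"},
    {"id": 2, "updated": datetime(2017, 11, 20), "body": "trusted copy"},
]
current = [
    {"id": 1, "updated": datetime(2017, 11, 1), "body": "old post"},
    # Same row as in the backup, but damaged by a bad write.
    {"id": 2, "updated": datetime(2017, 11, 20), "body": "tru\x00ted c"},
    # Posted after the backup was taken.
    {"id": 3, "updated": datetime(2018, 1, 15), "body": "new post"},
]

result = merge_since_backup(backup, current, cutoff)
for row in result:
    print(row["id"], repr(row["body"]))
```

Rows changed after the cutoff still come from the current database, so any corruption in them would carry over. That matches the goal of limiting possible damage to the recent period rather than removing it entirely.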

---

And one last piece of info in this extra long message:

Code:
# ls -f | wc -l
1160443
That's a Linux command to count the entries in a directory (the -f option skips sorting and also includes hidden entries, so the total counts the . and .. entries too). That number (1,160,443) is roughly the number of database error emails that have been sent to an email account on the server because of all these issues. They are probably just from the times immediately after the reboots and before I repaired the database. I haven't seen that number increase for hours, so chances are there aren't any (or aren't many) lingering database problems due to the reboots.
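For anyone curious, the same count can be taken from a script. This is a small Python sketch, not something running on the server; count_entries is a made-up name and the demo uses a throwaway directory. Like ls -f, os.scandir does not sort, which matters with over a million files, but unlike ls -f it leaves out the . and .. entries.

```python
import os
import tempfile

def count_entries(path):
    """Count directory entries without building or sorting a full list."""
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)

# Small demonstration on a temporary directory holding three files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.eml", "b.eml", "c.eml"):
        open(os.path.join(demo_dir, name), "w").close()
    demo_count = count_entries(demo_dir)

print(demo_count)
```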

All those emails also aren't likely unique errors. There may just be a few dozen errors, each repeated thousands of times. If it becomes necessary for me to look through the errors I'll write a software program to sort through all that and return just one message for each unique error.
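A program like that might be sketched as follows in Python. This is only an illustration under assumptions: the sample messages are invented, and the idea that errors differ only by embedded numbers (query IDs, timestamps and so on) is a guess about what the real emails look like.

```python
import re
from collections import Counter

def normalize(message: str) -> str:
    """Replace every run of digits with N so that messages differing
    only by an ID or a timestamp collapse into one pattern."""
    return re.sub(r"\d+", "N", message).strip()

def summarize(messages):
    """Return (normalized_error, count) pairs, most frequent first."""
    counts = Counter(normalize(m) for m in messages)
    return counts.most_common()

# Invented sample messages, standing in for the real email bodies.
sample = [
    "Query 1042 failed: Table 'user' is marked as crashed",
    "Query 1043 failed: Table 'user' is marked as crashed",
    "Query 77 failed: Table 'session' is marked as crashed",
]

summary = summarize(sample)
for error, count in summary:
    print(count, error)
```

Table names are deliberately left intact so the summary still shows which table each error refers to. For the real mailbox, the message bodies would have to be read from the individual files first.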

---

That's it for now. Thanks for staying tuned in to GC!
__________________
John Hammell
Network Admin, GreekChat.com
  #51  
Old 03-05-2018, 11:06 AM
AZTheta AZTheta is offline
GreekChat Member
 
Join Date: Aug 2009
Location: N 37.811092 W -107.664643
Posts: 5,035
^^^ I'm very glad it makes sense to you. Still voodoo to me. I don't speak that language!

Thanks again for everything you do, it is appreciated.
  #52  
Old 03-05-2018, 02:29 PM
rockwallgreek rockwallgreek is offline
GreekChat Member
 
Join Date: Sep 2011
Posts: 95
John, I do not understand even a little bit of what you said, but I am very thankful for all you do!!
  #53  
Old 03-05-2018, 04:23 PM
John John is offline
Administrator
 
Join Date: Aug 1999
Location: NJ, USA
Posts: 2,098
Quote:
Originally Posted by rockwallgreek View Post
John, I do not understand even a little bit of what you said, but I am very thankful for all you do!!
Let's say you had one chance to write down an important message with some requirement that the pen must not be removed from the paper until complete. You could not make corrections or finish the message if you remove the pen from the paper before you complete it.

Then, while writing, the paper is abruptly yanked away. Now your message is only half written with part of it not legible and that's how it must remain.

That's sort of what happens when there is a power outage with the web server. Anything in the process of being saved when the power is cut might end up a mess / corrupted and only partially saved to the database. Corrupted data could result in some things not working correctly on the website or maybe not at all.

Although I'm not certain, so far it seems that we may be in the clear regarding any data corruption.
__________________
John Hammell
Network Admin, GreekChat.com
  #54  
Old 03-05-2018, 09:56 PM
aephi alum aephi alum is offline
Moderator
 
Join Date: Jul 2001
Location: The Big Easy
Posts: 9,796
^ I like that explanation, John.

I've worked with MySQL, and my personal preference for engine is InnoDB, not MyISAM. Mainly because InnoDB supports foreign keys and transactions. I'm guessing you don't have control over which engine is used.

Do you have any tools available to you to analyze DB performance? (e.g. NewRelic)

Thank you again for everything you do for us.
__________________
AEΦ ... Multa Corda, Una Causa ... Celebrating Over 100 Years of Sisterhood
Have no place I can be since I found Serenity, but you can't take the sky from me...
Only those who risk going too far, find out how far they can go.
  #55  
Old 03-06-2018, 12:39 AM
John John is offline
Administrator
 
Join Date: Aug 1999
Location: NJ, USA
Posts: 2,098
Quote:
Originally Posted by aephi alum View Post
my personal preference for engine is InnoDB
InnoDB does seem really good. I've been reading up on it this week. That's what I'll likely be switching to, specifically for the transactions feature. Seems that may help significantly with data integrity in the face of all these power issues.

Once the power issues are sorted out hopefully there won't be any related problems again for a long time. But, if there are problems at least I'll know InnoDB may be able to handle it much better.
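The all-or-nothing behavior that transactions provide can be shown with a short Python sketch. It uses Python's built-in sqlite3 module as a stand-in, not MySQL or InnoDB, and the account table and save_batch function are hypothetical. A raised exception plays the role of the interruption; a real power cut is handled differently, by the database recovering on restart, so this only illustrates the property itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO account (id, name) VALUES (1, 'alice')")
conn.commit()

def save_batch(connection, rows, fail_after=None):
    """Insert rows as one transaction; optionally simulate an interruption
    before the row at index fail_after is written."""
    try:
        with connection:  # commits on success, rolls back on an exception
            for index, (row_id, name) in enumerate(rows):
                if fail_after is not None and index == fail_after:
                    raise RuntimeError("simulated interruption")
                connection.execute(
                    "INSERT INTO account (id, name) VALUES (?, ?)",
                    (row_id, name),
                )
    except RuntimeError:
        pass  # the partial batch has already been rolled back

def account_names(connection):
    query = "SELECT name FROM account ORDER BY id"
    return [row[0] for row in connection.execute(query)]

# Interrupted after the first of two rows: neither row is kept.
save_batch(conn, [(2, "bob"), (3, "carol")], fail_after=1)
names_after_failure = account_names(conn)

# Same batch without an interruption: both rows are kept.
save_batch(conn, [(2, "bob"), (3, "carol")])
names_after_success = account_names(conn)

print(names_after_failure, names_after_success)
```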

In addition, I'm going to test ZFS with my setup and if it works out well I'll place the MySQL data folder on a zpool for the additional data integrity benefits.

Quote:
Originally Posted by aephi alum View Post
I'm guessing you don't have control over which engine is used
On the server, yes, but not so much with regard to the software.

Back when this version of the forum software was developed they decided to go with MyISAM. InnoDB was available then, but MyISAM was the default for MySQL at the time so maybe that's why they went with it.

I recall back then that InnoDB wasn't necessarily recommended for vBulletin. I'm unsure exactly why, but one of the issues mentioned was that full text search was available in MyISAM but not in InnoDB (I read this week that InnoDB does now have that feature). Apparently, though, full text search only affected the search engine of the forum software, but instead of fixing the search to work with InnoDB they just went with MyISAM tables.

Anyhow, there is a path to switching over to InnoDB which works with the current software that I'll be looking more into. (Although, I'm not planning to keep GC on this forum software much longer, but that is an entirely different topic that I'll be starting a new thread about soon.)
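As a rough idea of what a switch involves at the database level, MySQL can change a table's storage engine with an ALTER TABLE ... ENGINE=InnoDB statement. The Python sketch below only builds those statements as text; it does not connect to anything. The table names are hypothetical examples, the helper names are made up, and this is not necessarily the path the forum software vendor recommends.

```python
def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded ones."""
    return "`" + name.replace("`", "``") + "`"

# Query to list the tables still on MyISAM in one schema. The %s
# placeholder style is an assumption; it depends on the database driver.
FIND_MYISAM_TABLES = (
    "SELECT table_name FROM information_schema.tables "
    "WHERE table_schema = %s AND engine = 'MyISAM'"
)

def conversion_statements(table_names):
    """Build one ALTER TABLE statement per table, for review before running."""
    return [
        f"ALTER TABLE {quote_identifier(name)} ENGINE=InnoDB;"
        for name in table_names
    ]

# Hypothetical table names, for illustration only.
statements = conversion_statements(["user", "post", "thread"])
for statement in statements:
    print(statement)
```

Generating the statements first, instead of running them directly, leaves room to take a backup and to hold back any tables that rely on full text indexes if the installed MySQL version's InnoDB doesn't support them.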

Quote:
Originally Posted by aephi alum View Post
Do you have any tools available to you to analyze DB performance? (e.g. NewRelic)
I've read about NewRelic but never used it before. The server isn't too overloaded so I never really found it necessary to explore squeezing more performance out of the DB in that way.
__________________
John Hammell
Network Admin, GreekChat.com
  #56  
Old 03-23-2018, 05:46 PM
NinjaPoodle NinjaPoodle is offline
Super Moderator
 
Join Date: Jul 2001
Location: On the beach. Well....not really but near it. :0)
Posts: 12,937
Thumbs up

John, I just saw the IP addresses.


Thank you!!
__________________
Sigma Gamma Rho Sorority, Inc. ** Greater Service, Greater Progress
Since 1922
  #57  
Old 03-23-2018, 08:07 PM
John John is offline
Administrator
 
Join Date: Aug 1999
Location: NJ, USA
Posts: 2,098
Quote:
Originally Posted by NinjaPoodle View Post
John, I just saw the IP addresses.
From the message I posted with the server log file? All those IPs, replaced with # symbols, were my IPs when I was logged in to the server working on stuff. I've been logged in a bunch more times since while dealing with the reboot issues, etc.

We're still not in the clear yet with the server reboot problems, though. The datacenter is taking excessively long with replacing the power distribution unit. GC's server is still powered through that failing PDU but they did reduce the load on it by moving any servers off of it that they could. It was a month ago when I notified their staff of the issue and they were able to confirm the PDU is failing.

Until they replace the PDU it causes uncertainty as to whether any of the reboot issues are due to GC's server hardware or if it's solely due to their PDU being on its way out.

I have a temporary server that I'll be moving GC to soon. Then after the datacenter PDU is taken care of I'll get things set back up on the current web server again.
__________________
John Hammell
Network Admin, GreekChat.com
  #58  
Old 03-23-2018, 09:23 PM
NinjaPoodle NinjaPoodle is offline
Super Moderator
 
Join Date: Jul 2001
Location: On the beach. Well....not really but near it. :0)
Posts: 12,937
On the blue header bar of each message, above the join date and next to the "report spam" icon, is the computer icon. When you move the cursor over it, it shows the IP addy.
__________________
Sigma Gamma Rho Sorority, Inc. ** Greater Service, Greater Progress
Since 1922
  #59  
Old 03-25-2018, 02:57 AM
John John is offline
Administrator
 
Join Date: Aug 1999
Location: NJ, USA
Posts: 2,098
GC was offline for a while Saturday evening. This time the culprit was not a power disruption & reboot directly, although those were the indirect cause.

Each time the server reboots due to the power issues, it saves some messages into the system error log, which is part of the BIOS. There's not much space there for logs, so this one could only hold 512 messages. And it turns out that once that log fills up, the system waits at a specific startup screen simply to notify about the full system error log. I had to press the F1 key to get it going again.

Not quite what I was expecting when I saw that the server was completely unresponsive. But glad it was a relatively easy fix.
__________________
John Hammell
Network Admin, GreekChat.com
  #60  
Old 03-25-2018, 01:13 PM
NinjaPoodle NinjaPoodle is offline
Super Moderator
 
Join Date: Jul 2001
Location: On the beach. Well....not really but near it. :0)
Posts: 12,937
Thanks for the update!
__________________
Sigma Gamma Rho Sorority, Inc. ** Greater Service, Greater Progress
Since 1922

All times are GMT -4. The time now is 03:06 AM.


Powered by vBulletin® Version 3.8.7
Copyright ©2000 - 2018, vBulletin Solutions, Inc.