Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Nov 1st, '13, 14:05

Around 6:45 AM (CST, UTC-6:00) I noticed that Apache had stopped on all my machines. No errors in the logs, no warnings. One minute it is working, the next minute the Apache daemon has just stopped. All it took was a simple service restart and everything was fine. The weird thing is, it happened on all my Mageia machines, 32-bit and 64-bit, at the exact same moment. Even a dev machine on the LAN (no Internet access, not even on the same network as the other Apache servers) that does not serve any public websites did the same thing, so I know it wasn't some weird attack. It all appears to be tied to the time of day for some reason. I have never seen this happen before; curious if anyone else saw this?
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby doktor5000 » Nov 2nd, '13, 00:12

Not running any webserver here, but did you check the logs with journalctl around that time? Maybe msec, some cronjob, an at job or something like that ran then.
Or the NSA shut down all your boxes :p
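
For the journalctl check, something along these lines might do. A sketch only: the wrapper name journal_window and the unit name httpd.service are assumptions, and the time window is just the one from the first post.

```shell
#!/bin/sh
# Show journal entries between two timestamps.
# $1 = start, $2 = end; any further arguments go straight to journalctl
# (for example -u httpd.service, assuming that is the unit name).
journal_window() {
    start=$1
    end=$2
    shift 2
    journalctl --since "$start" --until "$end" "$@"
}
```

Called as journal_window "2013-11-01 06:30" "2013-11-01 06:50" it covers the minutes around the outage; pending at jobs can be listed separately with atq.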
Cauldron is not for the faint of heart!
Caution: Hot, bubbling magic inside. May explode or cook your kittens!
----
Disclaimer: Beware of allergic reactions in answer to unconstructive complaint-type posts
doktor5000
 
Posts: 18052
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Nov 2nd, '13, 00:19

Hehe, yeah, checked everything. This is the only thing I found on the dev box (because it had all the logging enabled). The messages around 6:41 are from when I restarted Apache; for the few minutes before that there is just nothing :(
I take it that since no one else has seen this, it is only me. The setup is basically Apache, PHP and MariaDB with all defaults, not much in the way of custom settings or modules.
Code: Select all
[Fri Nov 01 04:02:03.881629 2013] [auth_digest:notice] [pid 1631] AH01757: generating secret for digest authentication ...
[Fri Nov 01 04:02:04.102834 2013] [mpm_prefork:notice] [pid 1631] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Fri Nov 01 04:02:04.102917 2013] [core:notice] [pid 1631] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Fri Nov 01 04:02:04.217217 2013] [mpm_prefork:notice] [pid 1631] AH00171: Graceful restart requested, doing restart
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Fri Nov 01 04:02:04.322557 2013] [auth_digest:notice] [pid 1631] AH01757: generating secret for digest authentication ...
[Fri Nov 01 04:02:05.046285 2013] [mpm_prefork:notice] [pid 1631] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Fri Nov 01 04:02:05.046351 2013] [core:notice] [pid 1631] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Fri Nov 01 06:41:17.565662 2013] [core:notice] [pid 1631] AH00052: child pid 22379 exit signal Segmentation fault (11)
[Fri Nov 01 06:42:00.400760 2013] [mpm_prefork:notice] [pid 1631] AH00170: caught SIGWINCH, shutting down gracefully
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Fri Nov 01 06:42:02.085394 2013] [auth_digest:notice] [pid 23919] AH01757: generating secret for digest authentication ...
[Fri Nov 01 06:42:03.076773 2013] [mpm_prefork:notice] [pid 23919] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Fri Nov 01 06:42:03.076921 2013] [core:notice] [pid 23919] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 1st, '13, 18:43

It has happened again, this time also on a server hosted elsewhere (in addition to all of mine).

I notice that it is Dec 1 now; last time this happened it was Nov 1, just one month ago.

Is no one else seeing this? It happens on both the 32-bit and 64-bit Mageia 3 installs. I have 4 completely different systems (different brands, makes, RAM, etc.) where Apache just stops at the same time: no logs, no errors, nothing.

I am going to run some experiments to see if ticking over from one month to the next is the issue (basically set the clocks back and watch what happens over a 24 hour period). I sure would feel better if someone else was seeing this. :?

Are the official Mageia websites not running Mageia :?: :lol:
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby doktor5000 » Dec 1st, '13, 19:15

As you didn't state it explicitly yet, did you check all the various crontab locations and maybe even at jobs?
doktor5000
 
Posts: 18052
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 1st, '13, 19:21

doktor5000 wrote:As you didn't state it explicitly yet, did you check all the various crontab locations and maybe even at jobs?

Great idea, never thought of that.
This is what is set on the machine (on all of them, by default).
The monthly cron runs on the 1st of every month at 4:42 AM.

These are the two jobs that run, I'll have to examine them further to see what they do. :D

/etc/cron.monthly/0anacron-timestamp
/etc/cron.monthly/update-microcode
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 1st, '13, 19:25

Ok, not much to the first job

0anacron-timestamp
Code: Select all
#!/bin/sh
# Updates the timestamp of last monthly run for anacron
date +%Y%m%d > /var/spool/anacron/cron.monthly


Second job though, might be it!
update-microcode
Code: Select all
#!/bin/sh
#
# check if there is a new microcode for your CPU and update it

# Intel 686 and above, AMD family 16 and above
vendor=`grep "^vendor_id" /proc/cpuinfo | head -n1 | awk -F ": " '{ print $2 }'`
family=`grep "^cpu family" /proc/cpuinfo | head -n1 | awk -F ": " '{ print $2 }'`

if [ "$vendor" = "GenuineIntel" ] && [ $family -ge 6 ]; then
   /usr/sbin/update-intel-microcode
elif [ "$vendor" = "AuthenticAMD" ] && [ $family -ge 16 ]; then
   minor=`uname -r |  cut -d . -f 3 | cut -d - -f 1`
   if [ $minor -ge 29 ]; then
      /usr/sbin/update-amd-microcode
   fi
fi


I have a system with the time rolled back to see if it happens again when the next month ticks over.
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby doktor5000 » Dec 1st, '13, 21:55

Are those the only cronjobs you checked?
What about
Code: Select all
crontab -l root
or the contents of
Code: Select all
/etc/crontab
or
Code: Select all
/etc/anacrontab

What does
Code: Select all
find /etc/cron* -type f

show?
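
The file checks above can be collected into one pass. A minimal sketch, assuming a POSIX shell; the function name show_cron_files and its optional root prefix are made up for illustration and are not part of any cron package.

```shell
#!/bin/sh
# Print every system-wide cron definition with a header per file,
# so nothing has to be opened one by one.
# $1 is an optional root prefix (hypothetical, only there so the
# function can be tried against a copy of /etc).
show_cron_files() {
    root=${1:-}
    find "$root"/etc/cron* "$root"/etc/anacrontab -type f 2>/dev/null | sort |
    while IFS= read -r f; do
        printf '===== %s =====\n' "$f"
        cat "$f"
    done
}
```

Run as show_cron_files for the live system. Per-user crontabs are not covered by this; those are listed with crontab -l for the invoking user.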
doktor5000
 
Posts: 18052
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 2nd, '13, 05:22

doktor5000 wrote:Are those the only cronjobs you checked?
What about
Code: Select all
crontab -l root


crontab: usage error: no arguments permitted after this option
usage: crontab [-u user] file
crontab [-u user] [ -e | -l | -r ]
crontab -n [ hostname ]
crontab -c
(default operation is replace, per 1003.2)
-e (edit user's crontab)
-l (list user's crontab)
-r (delete user's crontab)
-i (prompt before deleting user's crontab)
-n (set host in cluster to run users' crontabs)
-c (get host in cluster to run users' crontabs)
or the contents of
Code: Select all
/etc/crontab


bash: cd: /etc/crontab: Not a directory
or
Code: Select all
/etc/anacrontab


bash: cd: /etc/anacrontab: Not a directory
What does
Code: Select all
find /etc/cron* -type f

show?

/etc/cron.d/php
/etc/cron.daily/0anacron-timestamp
/etc/cron.daily/rpm
/etc/cron.daily/makewhatis.cron
/etc/cron.daily/logrotate
/etc/cron.daily/mlocate.cron
/etc/cron.daily/tmpwatch
/etc/cron.deny
/etc/cron.hourly/0anacron
/etc/cron.monthly/0anacron-timestamp
/etc/cron.monthly/update-microcode
/etc/crontab
/etc/cron.weekly/0anacron-timestamp
/etc/cron.weekly/makewhatis.cron
/etc/cron.weekly/makewhatis-en.cron
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby doktor5000 » Dec 2nd, '13, 12:24

KnightMB wrote:
doktor5000 wrote:Are those the only cronjobs you checked?
What about
Code: Select all
crontab -l root

Well, it should have been just crontab -l. Did you read the usage?

KnightMB wrote:
or the contents of
Code: Select all
/etc/crontab


bash: cd: /etc/crontab: Not a directory

Right, it's a regular file. You can't cd to a file.

or
Code: Select all
/etc/anacrontab


bash: cd: /etc/anacrontab: Not a directory

Also a regular file.

For the other crontab files you found, you should probably take a look inside them.
I've given you pointers where to look, you need to figure this out for yourself.
doktor5000
 
Posts: 18052
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 2nd, '13, 14:41

doktor5000 wrote:Well, it should have been just crontab -l. Did you read the usage?

Code: Select all
# crontab -l
no crontab for root

Code: Select all
#cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root nice -n 19 run-parts --report /etc/cron.hourly
02 4 * * * root nice -n 19 run-parts --report /etc/cron.daily
22 4 * * 0 root nice -n 19 run-parts --report /etc/cron.weekly
42 4 1 * * root nice -n 19 run-parts --report /etc/cron.monthly
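
To line those entries up against log timestamps, a small awk filter can print the fixed-time run-parts jobs as clock times. This is only a sketch: runparts_schedule is a made-up name, and it only handles plain numeric minute and hour fields like the ones in the file above. Worth noting against the log posted earlier: the graceful restart on the dev box was logged at 04:02:04, which lines up with the 04:02 cron.daily entry rather than the 04:42 cron.monthly one. That is a timing observation, not a confirmed cause.

```shell
#!/bin/sh
# List the run-parts entries of a system crontab as
# "HH:MM dom=.. mon=.. dow=.. directory".
# $1 = path to the crontab file (normally /etc/crontab).
runparts_schedule() {
    awk '
        # skip comment lines and the short VAR=value lines
        /^[ \t]*#/ || NF < 7 { next }
        /run-parts/ {
            # the hourly entry has no fixed hour, so leave it out
            if ($2 == "*") next
            printf "%02d:%02d dom=%s mon=%s dow=%s %s\n", $2, $1, $3, $4, $5, $NF
        }
    ' "$1"
}
```

Usage would be runparts_schedule /etc/crontab, which prints one line per daily, weekly and monthly job.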


Code: Select all
# cat /etc/anacrontab
# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.

SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=6-22

#period in days   delay in minutes   job-identifier   command
1       5       cron.daily              nice -n 19 run-parts /etc/cron.daily
7       25      cron.weekly             nice -n 19 run-parts /etc/cron.weekly
@monthly 45     cron.monthly            nice -n 19 run-parts /etc/cron.monthly

For the other crontab files you found, you should probably take a look inside them.
I've given you pointers where to look, you need to figure this out for yourself.

Thanks for the help. So far, rolling the clock back has not reproduced it. I've tried a couple of times just to see if one of those cron jobs was doing it. It might be a coincidence, or whatever happens might need a full month to build up; I am not sure yet. One machine can be busy serving websites while another sits idle for a month, and the same thing happens to both, but I don't know why yet.
Last edited by doktor5000 on Dec 2nd, '13, 21:19, edited 1 time in total.
Reason: added code tags, to improve on clarity, removed longer quotes
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby doktor5000 » Dec 2nd, '13, 21:20

Please next time use code tags as explained in ftp://ftp5.gwdg.de/pub/linux/mandriva/m ... e_tags.ogv
doktor5000
 
Posts: 18052
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 4th, '13, 01:28

No luck trying to reproduce this; setting the clock back does not appear to do anything. It might be that it just takes 30 days for this to happen, so I might experiment with setting the clock to the 1st of the month, waiting a day, then advancing it to the end of the month to see what happens.
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby filip » Dec 4th, '13, 08:33

I would try running jobs listed in /etc/cron.monthly manually.
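
A rough way to do that one job at a time, so the first job that breaks the web server stands out. This is a sketch and not what run-parts itself does internally; run_cron_dir and its second argument are made-up names, and the service check in the usage line assumes the unit is called httpd.

```shell
#!/bin/sh
# Run every executable file in a cron directory in name order and
# report each job's exit status, followed by the result of an
# optional check command that is run after each job.
# $1 = directory, $2 = check command (defaults to "true").
run_cron_dir() {
    dir=$1
    check=${2:-true}
    for job in "$dir"/*; do
        # only regular, executable files count as jobs
        [ -f "$job" ] && [ -x "$job" ] || continue
        "$job"
        status=$?
        if sh -c "$check" >/dev/null 2>&1; then state=ok; else state=FAILED; fi
        printf '%s exit=%s check=%s\n' "$job" "$status" "$state"
    done
}
```

For example, as root: run_cron_dir /etc/cron.monthly 'systemctl is-active httpd'. The same can be pointed at /etc/cron.daily.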
filip
 
Posts: 478
Joined: May 4th, '11, 22:10
Location: Kranj, Slovenia

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 4th, '13, 08:44

filip wrote:I would try running jobs listed in /etc/cron.monthly manually.

I have tried this; it didn't affect Apache at all. It was still running even after doing it a couple of times in a row. :(
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby jiml8 » Dec 4th, '13, 08:51

You should also look in /var/spool/cron. You may find some crontabs there.
jiml8
 
Posts: 1254
Joined: Jul 7th, '13, 18:09

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 4th, '13, 08:57

jiml8 wrote:You should also look in /var/spool/cron. You may find some crontabs there.

I checked, folder was empty. No hidden files either.
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby filip » Dec 4th, '13, 09:44

Is there any possibility of cron.hourly, cron.daily, cron.weekly and cron.monthly colliding?

Is there any particular reason that both cron and anacron are installed?
filip
 
Posts: 478
Joined: May 4th, '11, 22:10
Location: Kranj, Slovenia

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Dec 4th, '13, 09:50

That's what the system comes with by default.

I checked all the cron dates, none of them overlap.

It might not be cron related?
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby filip » Dec 4th, '13, 13:20

KnightMB wrote:I checked all the cron dates, none of them overlap.

But do the jobs finish quickly enough when they run?
filip
 
Posts: 478
Joined: May 4th, '11, 22:10
Location: Kranj, Slovenia

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Jan 1st, '14, 22:48

Well, it has been another month and it happened again :x

At least this time, I had as much logging turned on as I could find. It appears that all of them log "Graceful restart requested, doing restart", and from then on Apache is down until you notice and restart the process.

Here is the log from a machine set up specifically to sit there and do nothing but run Apache (Mageia 3, 32-bit) to test this issue. It has been sitting idle for basically a month.
Code: Select all
[Wed Jan 01 04:02:03.350884 2014] [mpm_prefork:notice] [pid 2070] AH00171: Graceful restart requested, doing restart
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Wed Jan 01 04:02:03.704711 2014] [auth_digest:notice] [pid 2070] AH01757: generating secret for digest authentication ...
[Wed Jan 01 04:02:04.838923 2014] [mpm_prefork:notice] [pid 2070] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 04:02:04.839004 2014] [core:notice] [pid 2070] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 04:02:08.862654 2014] [core:notice] [pid 2070] AH00052: child pid 28695 exit signal Segmentation fault (11)
[Wed Jan 01 05:20:21.314645 2014] [core:notice] [pid 2070] AH00052: child pid 28694 exit signal Segmentation fault (11)
[Wed Jan 01 05:20:24.321166 2014] [core:notice] [pid 2070] AH00052: child pid 28693 exit signal Segmentation fault (11)
[Wed Jan 01 05:20:34.336042 2014] [core:notice] [pid 2070] AH00052: child pid 28691 exit signal Segmentation fault (11)
[Wed Jan 01 05:22:28.571947 2014] [mpm_prefork:notice] [pid 2070] AH00170: caught SIGWINCH, shutting down gracefully
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Wed Jan 01 05:22:30.092606 2014] [auth_digest:notice] [pid 30092] AH01757: generating secret for digest authentication ...
[Wed Jan 01 05:22:31.162031 2014] [mpm_prefork:notice] [pid 30092] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 05:22:31.162172 2014] [core:notice] [pid 30092] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'


This one is a server out in the production environment. I shortened the log because the in-between part is megabytes of segfaults, which I guess come from web visitors trying to connect while Apache is in the crashed state. System: Mageia 3 (64-bit).
Code: Select all
[Wed Jan 01 06:27:19.759375 2014] [mpm_prefork:notice] [pid 24675] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:27:19.987618 2014] [auth_digest:notice] [pid 24675] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:27:20.438882 2014] [mpm_prefork:notice] [pid 24675] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:27:20.438907 2014] [core:notice] [pid 24675] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:27:20.476375 2014] [mpm_prefork:notice] [pid 24675] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:27:20.518603 2014] [auth_digest:notice] [pid 24675] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:27:21.077690 2014] [mpm_prefork:notice] [pid 24675] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:27:21.077718 2014] [core:notice] [pid 24675] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:27:21.077746 2014] [core:notice] [pid 24675] AH00052: child pid 27251 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.078656 2014] [core:notice] [pid 24675] AH00052: child pid 27252 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.079592 2014] [core:notice] [pid 24675] AH00052: child pid 27253 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.080576 2014] [core:notice] [pid 24675] AH00052: child pid 27254 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.082380 2014] [core:notice] [pid 24675] AH00052: child pid 27255 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087158 2014] [core:notice] [pid 24675] AH00052: child pid 27256 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087196 2014] [core:notice] [pid 24675] AH00052: child pid 27257 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087210 2014] [core:notice] [pid 24675] AH00052: child pid 27363 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087222 2014] [core:notice] [pid 24675] AH00052: child pid 27364 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087236 2014] [core:notice] [pid 24675] AH00052: child pid 27365 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087248 2014] [core:notice] [pid 24675] AH00052: child pid 27366 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:22.090640 2014] [core:notice] [pid 24675] AH00052: child pid 27367 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:23.093979 2014] [core:notice] [pid 24675] AH00052: child pid 27407 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:23.094039 2014] [core:notice] [pid 24675] AH00052: child pid 27408 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:24.096929 2014] [core:notice] [pid 24675] AH00052: child pid 27409 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:24.097017 2014] [core:notice] [pid 24675] AH00052: child pid 27410 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:25.100087 2014] [core:notice] [pid 24675] AH00052: child pid 27411 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:25.100158 2014] [core:notice] [pid 24675] AH00052: child pid 27412 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:26.102446 2014] [core:notice] [pid 24675] AH00052: child pid 27413 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:26.102502 2014] [core:notice] [pid 24675] AH00052: child pid 27414 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:27.104996 2014] [core:notice] [pid 24675] AH00052: child pid 27450 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:27.105051 2014] [core:notice] [pid 24675] AH00052: child pid 27451 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:28.108331 2014] [core:notice] [pid 24675] AH00052: child pid 27584 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:28.108406 2014] [core:notice] [pid 24675] AH00052: child pid 27589 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:29.111239 2014] [core:notice] [pid 24675] AH00052: child pid 28059 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:29.111284 2014] [core:notice] [pid 24675] AH00052: child pid 28060 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:30.114444 2014] [core:notice] [pid 24675] AH00052: child pid 28255 exit signal Segmentation fault (11)
..........................
[Wed Jan 01 08:40:34.815761 2014] [core:notice] [pid 24675] AH00052: child pid 24392 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815770 2014] [core:notice] [pid 24675] AH00052: child pid 24400 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815901 2014] [core:notice] [pid 24675] AH00052: child pid 24401 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815912 2014] [core:notice] [pid 24675] AH00052: child pid 24402 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815921 2014] [core:notice] [pid 24675] AH00052: child pid 24403 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815928 2014] [core:notice] [pid 24675] AH00052: child pid 24404 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615193 2014] [core:notice] [pid 24675] AH00052: child pid 24405 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615265 2014] [core:notice] [pid 24675] AH00052: child pid 24406 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615273 2014] [core:notice] [pid 24675] AH00052: child pid 24408 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615334 2014] [mpm_prefork:notice] [pid 24675] AH00170: caught SIGWINCH, shutting down gracefully
[Wed Jan 01 08:40:37.132632 2014] [auth_digest:notice] [pid 24616] AH01757: generating secret for digest authentication ...
[Wed Jan 01 08:40:38.094225 2014] [mpm_prefork:notice] [pid 24616] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 08:40:38.094330 2014] [core:notice] [pid 24616] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'


Another server out in the wild that had the same thing happen today. I shortened the segfaults in between here as well, since they went on for megabytes and megabytes in the log file until Apache was restarted. System runs Mageia 3 (64-bit).
Code: Select all
[Wed Jan 01 06:41:55.954521 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:56.561629 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:57.612990 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:57.613048 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:57.636383 2014] [core:notice] [pid 27705] AH00052: child pid 11047 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.636453 2014] [core:notice] [pid 27705] AH00052: child pid 11048 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.670088 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:57.788093 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:58.072115 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:58.072155 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:58.072185 2014] [core:notice] [pid 27705] AH00052: child pid 11046 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.073615 2014] [core:notice] [pid 27705] AH00052: child pid 11049 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.076334 2014] [core:notice] [pid 27705] AH00052: child pid 11050 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.078804 2014] [core:notice] [pid 27705] AH00052: child pid 11051 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.086791 2014] [core:notice] [pid 27705] AH00052: child pid 11064 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.089276 2014] [core:notice] [pid 27705] AH00052: child pid 11066 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.215423 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:58.314932 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:59.053476 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:59.053518 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:59.053653 2014] [core:notice] [pid 27705] AH00052: child pid 11065 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.055276 2014] [core:notice] [pid 27705] AH00052: child pid 11067 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.058051 2014] [core:notice] [pid 27705] AH00052: child pid 11068 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.060458 2014] [core:notice] [pid 27705] AH00052: child pid 11069 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.063513 2014] [core:notice] [pid 27705] AH00052: child pid 11093 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.068588 2014] [core:notice] [pid 27705] AH00052: child pid 11095 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:00.071305 2014] [core:notice] [pid 27705] AH00052: child pid 11094 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:00.071402 2014] [core:notice] [pid 27705] AH00052: child pid 11096 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:00.071649 2014] [core:notice] [pid 27705] AH00052: child pid 11097 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:01.074137 2014] [core:notice] [pid 27705] AH00052: child pid 11192 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:02.076406 2014] [core:notice] [pid 27705] AH00052: child pid 11193 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:03.078976 2014] [core:notice] [pid 27705] AH00052: child pid 11194 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:04.082269 2014] [core:notice] [pid 27705] AH00052: child pid 11195 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:05.085299 2014] [core:notice] [pid 27705] AH00052: child pid 11216 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:06.088397 2014] [core:notice] [pid 27705] AH00052: child pid 11236 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:07.090995 2014] [core:notice] [pid 27705] AH00052: child pid 11237 exit signal Segmentation fault (11)
...........................................................
[Wed Jan 01 08:36:45.226272 2014] [core:notice] [pid 27705] AH00052: child pid 2355 exit signal Segmentation fault (11)
[Wed Jan 01 08:36:45.226303 2014] [core:notice] [pid 27705] AH00052: child pid 2358 exit signal Segmentation fault (11)
[Wed Jan 01 08:36:45.226386 2014] [mpm_prefork:notice] [pid 27705] AH00170: caught SIGWINCH, shutting down gracefully
[Wed Jan 01 08:36:47.216329 2014] [auth_digest:notice] [pid 2572] AH01757: generating secret for digest authentication ...
[Wed Jan 01 08:36:48.131619 2014] [mpm_prefork:notice] [pid 2572] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 08:36:48.131731 2014] [core:notice] [pid 2572] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'


What all of them have in common, besides running Mageia 3, is the "AH00171: Graceful restart requested, doing restart" message, which I am hoping will give a clue to what is causing this. It only shows up once a month, when the crash happens. I don't know what is issuing the request or why it does so every month, but I am hoping that finding the cause will also lead to the solution :)
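
Since the raw logs run to megabytes, a summary per message code is easier to compare across machines. A sketch, assuming the Apache 2.4 error_log format shown in the excerpts above; summarize_events is a made-up name and the four AH codes are simply the ones visible in those excerpts.

```shell
#!/bin/sh
# Summarise MPM lifecycle events in an Apache 2.4 error log.
# $1 = path to the error log. Codes counted:
#   AH00171 graceful restart requested
#   AH00170 graceful shutdown (SIGWINCH)
#   AH00163 resuming normal operations
#   AH00052 child exited on a signal (here: segfault)
summarize_events() {
    awk '
        match($0, /AH00(171|170|163|052)/) {
            code = substr($0, RSTART, RLENGTH)
            count[code]++
            # remember the timestamp of the first occurrence,
            # i.e. the text inside the leading [...] pair
            if (!(code in first))
                first[code] = substr($0, 2, index($0, "]") - 2)
        }
        END {
            for (code in count)
                printf "%s %d first=%s\n", code, count[code], first[code]
        }
    ' "$1" | sort
}
```

Running it on each machine's error_log and error_log.1 gives one line per code with a count and the first timestamp, which makes it quick to see whether the first AH00171 always precedes the first AH00052.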
KnightMB
 
Posts: 76
Joined: Nov 21st, '12, 21:27

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby doktor5000 » Jan 2nd, '14, 00:28

KnightMB wrote:What all of them have in common, besides running Mageia 3, is the "AH00171: Graceful restart requested, doing restart" message, which I am hoping will give a clue to what is causing this.

It would be interesting to see the last 50-100 lines before the graceful restart.
And what do you run on top of that Apache, what does it serve?
Do you maybe have logrotate installed? It can also ask Apache to restart gracefully so that it can process the logs.
Code: Select all
[doktor5000@Mageia3 ~]$ urpmf /etc/logrotate.d | sort -u | grep apache
apache:/etc/logrotate.d/httpd
apache-mod_security:/etc/logrotate.d/mod_security


Some related links:
http://forums.cpanel.net/f5/apache-perf ... ost1093141
http://www.gossamer-threads.com/lists/g ... ded#235829
http://www.serverschool.com/server-soft ... log-files/
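
To see what the logrotate config would actually do to Apache, the rotation frequency and the postrotate script can be pulled out of /etc/logrotate.d/httpd. A sketch only: show_rotation_hooks is a made-up name, and what the file contains on a given Mageia install is not something this thread shows.

```shell
#!/bin/sh
# Print the rotation frequency keywords and the postrotate script
# of a logrotate config file.
# $1 = path to the config (e.g. /etc/logrotate.d/httpd).
show_rotation_hooks() {
    awk '
        /^[ \t]*(daily|weekly|monthly|yearly)[ \t]*$/ { print "frequency: " $1 }
        /^[ \t]*postrotate/ { inhook = 1; next }
        /^[ \t]*endscript/  { inhook = 0 }
        inhook { sub(/^[ \t]+/, ""); print "postrotate: " $0 }
    ' "$1"
}
```

If that shows a monthly rotation with a reload or graceful restart in postrotate, logrotate -d /etc/logrotate.d/httpd does a dry run, and logrotate -f /etc/logrotate.d/httpd forces a rotation right away, which would be a way to try reproducing the crash without waiting for the 1st.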
doktor5000
 
Posts: 18052
Joined: Jun 4th, '11, 10:10
Location: Leipzig, Germany

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Jan 4th, '14, 09:37

Sorry about the delay (New Year Holidays + catch up work :mrgreen: )

I'll start with previous logs:

Test Machine (Mageia 3, 32bit)
error_log.1
Code: Select all
[Wed Jan 01 04:02:03.350884 2014] [mpm_prefork:notice] [pid 2070] AH00171: Graceful restart requested, doing restart
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message


Production System Mageia 3 (64bit distro)
error_log.1
Code: Select all
[Wed Jan 01 03:36:39.417590 2014] [cgi:error] [pid 3355] [client 192.241.146.164:56874] script not found or unable to stat: /var/www/cgi-bin/php
[Wed Jan 01 03:36:39.530072 2014] [cgi:error] [pid 31950] [client 192.241.146.164:56898] script not found or unable to stat: /var/www/cgi-bin/php5
[Wed Jan 01 03:36:39.628475 2014] [cgi:error] [pid 4202] [client 192.241.146.164:56915] script not found or unable to stat: /var/www/cgi-bin/php-cgi
[Wed Jan 01 03:36:39.727677 2014] [cgi:error] [pid 27893] [client 192.241.146.164:56929] script not found or unable to stat: /var/www/cgi-bin/php.cgi
[Wed Jan 01 03:36:39.828519 2014] [cgi:error] [pid 25936] [client 192.241.146.164:56951] script not found or unable to stat: /var/www/cgi-bin/php4
[Wed Jan 01 06:27:19.759375 2014] [mpm_prefork:notice] [pid 24675] AH00171: Graceful restart requested, doing restart


Another Production System Mageia 3 (64bit distro)
error_log.1
Code:
[Wed Jan 01 01:34:46.914779 2014] [:error] [pid 11655] [client 213.186.127.10:55228] script '/var/www/html/index.php' not found or unable to stat
[Wed Jan 01 03:36:39.360683 2014] [cgi:error] [pid 21684] [client 192.241.146.164:47617] script not found or unable to stat: /var/www/cgi-bin/php
[Wed Jan 01 03:36:39.457281 2014] [cgi:error] [pid 22177] [client 192.241.146.164:47642] script not found or unable to stat: /var/www/cgi-bin/php5
[Wed Jan 01 03:36:39.558182 2014] [cgi:error] [pid 18639] [client 192.241.146.164:47657] script not found or unable to stat: /var/www/cgi-bin/php-cgi
[Wed Jan 01 03:36:39.654239 2014] [cgi:error] [pid 24494] [client 192.241.146.164:47673] script not found or unable to stat: /var/www/cgi-bin/php.cgi
[Wed Jan 01 03:36:39.752752 2014] [cgi:error] [pid 25675] [client 192.241.146.164:47690] script not found or unable to stat: /var/www/cgi-bin/php4
[Wed Jan 01 05:24:30.880643 2014] [:error] [pid 23429] [client 213.186.127.3:33044] script '/var/www/html/index.php' not found or unable to stat
[Wed Jan 01 06:41:55.954521 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:56.561629 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:57.612990 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:57.613048 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:57.636383 2014] [core:notice] [pid 27705] AH00052: child pid 11047 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.636453 2014] [core:notice] [pid 27705] AH00052: child pid 11048 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.670088 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
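To compare the restart times across machines without reading each error_log by eye, something like the following might help. It is only a rough sketch: the regular expression is written against the line format shown in the excerpts above, the names `LINE_RE`, `CODES` and `find_events` are made up for illustration, and the sample lines are copied from the second production system's log.

```python
import re
from datetime import datetime

# Matches the Apache 2.4 error_log prefix seen in the excerpts above, e.g.
# [Wed Jan 01 06:41:55.954521 2014] [mpm_prefork:notice] [pid 27705] AH00171: ...
LINE_RE = re.compile(
    r"^\[(?P<ts>[A-Z][a-z]{2} [A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2}\.\d{6} \d{4})\] "
    r"\[(?P<module>[^:\]]*):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+)\] "
    r"(?P<msg>.*)$"
)

# Message codes of interest, taken from the log excerpts in this thread.
CODES = {"AH00171": "graceful restart", "AH00052": "child exited on signal"}


def find_events(lines):
    """Return (timestamp, pid, code, label) for each line carrying one of CODES."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. the AH00558 line, which has no timestamp prefix
        code = m.group("msg").split(":", 1)[0]
        if code in CODES:
            # Assumes English day/month names (C or en_* locale).
            ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S.%f %Y")
            events.append((ts, int(m.group("pid")), code, CODES[code]))
    return events


# Sample lines copied from the second production system's error_log.1.
SAMPLE = """\
[Wed Jan 01 05:24:30.880643 2014] [:error] [pid 23429] [client 213.186.127.3:33044] script '/var/www/html/index.php' not found or unable to stat
[Wed Jan 01 06:41:55.954521 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:57.636383 2014] [core:notice] [pid 27705] AH00052: child pid 11047 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.670088 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
""".splitlines()

if __name__ == "__main__":
    for ts, pid, code, label in find_events(SAMPLE):
        print(ts.isoformat(sep=" "), pid, code, label)
```

To run it against a real log, pass `open("error_log.1")` to `find_events` instead of `SAMPLE`. The log location differs between setups, so no path is assumed here.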

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Jan 4th, '14, 09:43

doktor5000 wrote:
KnightMB wrote: What all of them have in common, besides being Mageia 3, is the "AH00171: Graceful restart requested, doing restart" message, which I am hoping will give a clue to what is causing this.

It would be interesting to see the last 50-100 lines before the graceful restart.
And what do you run on top of that apache, what does it serve?
Do you maybe have logrotate installed, which can also ask Apache to gracefully restart so that it can process the logs?
Code:
[doktor5000@Mageia3 ~]$ urpmf /etc/logrotate.d | sort -u | grep apache
apache:/etc/logrotate.d/httpd
apache-mod_security:/etc/logrotate.d/mod_security


Some related links:
http://forums.cpanel.net/f5/apache-perf ... ost1093141
http://www.gossamer-threads.com/lists/g ... ded#235829
http://www.serverschool.com/server-soft ... log-files/

I am starting to think it is the log rotation as well. It doesn't cause any issues when it runs daily during the month, but something seems to go wrong when the month changes. I am not sure whether Apache isn't handling the month transition properly or whether logrotate is requesting the graceful restart in some wrong way. I've tried to simulate this by setting a machine's date back a few days and letting it tick over into the new month, but could never reproduce it that way. The only way I have reproduced it, on my test machine, is to literally leave the machine running all month. Sucks that I can't just run the test over and over and have to wait all month for it to happen :lol:

The test machine serves nothing; it just runs the Apache "it works!" page and gets no traffic except when I check it from time to time to see if it is still up.

The other two machines get heavy traffic: one serves over two dozen different websites, the other serves a forum, website, bug database, etc.

Re: Did Everyone's Apache Server Go Dead at the same Time?

Postby KnightMB » Jan 4th, '14, 09:55

After reading over your links, I'll try an experiment: I am going to set the logrotate cron job to not run on the 1st day of the month. Next month I will then know whether this is related to logrotate requesting a graceful restart that fails for some reason. :D
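One way the experiment could be set up is a small wrapper that the daily cron job calls instead of logrotate itself. This is just a sketch under assumptions: the logrotate binary and config paths are guesses at a typical layout, and `should_skip` and `main` are hypothetical names.

```python
import subprocess
import sys
from datetime import date

# Assumed paths; adjust to whatever your cron.daily script actually calls.
LOGROTATE_CMD = ["/usr/sbin/logrotate", "/etc/logrotate.conf"]


def should_skip(today):
    """True on the 1st of the month, the day the unexplained stops occurred."""
    return today.day == 1


def main(today=None):
    today = today or date.today()
    if should_skip(today):
        print(f"{today.isoformat()}: first of the month, skipping logrotate")
        return 0
    # Run logrotate as usual and pass its exit status back to cron.
    return subprocess.call(LOGROTATE_CMD)


if __name__ == "__main__":
    sys.exit(main())
```

One caveat: according to the logrotate man page, monthly logs are rotated the first time logrotate runs in a month, so skipping the 1st would presumably move that rotation to the 2nd rather than remove it. If Apache then stops on the 2nd instead, that would still point at logrotate.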