Mageia forum

by **KnightMB** » Nov 1st, '13, 14:05

Around 6:45 AM (CST, UTC-6:00) I noticed that Apache had just stopped on all my machines. No errors in the logs, no warnings. One minute it is working, the next minute the Apache daemon has just stopped. All it took was a simple service restart and everything was fine. The weird thing is, it happened on all my Mageia machines, 32-bit and 64-bit distro, at the exact same moment. I even had a dev machine on the LAN (no Internet access, not even on the same network as the other Apache servers) that does not serve any public websites do the same thing, so I know it wasn't some weird attack. It all appears to be centered around the time for some reason. I've never seen this happen before; curious if anyone else saw this?

by **doktor5000** » Nov 2nd, '13, 00:12

Not running any webserver here, but did you check the logs with journalctl around that time, maybe there has been msec or some cronjob or an at job or something like that?
Or the NSA shut down all your boxes :p

by **KnightMB** » Nov 2nd, '13, 00:19

Hehe, yeah, checked everything. This is the only thing I found on the dev box (because it had all the logging enabled). The messages around 6:41 are from when I restarted Apache; in the few minutes before that, there is just nothing.

I take it that since no one else has seen this, it is only me. The setup is basically Apache, PHP, and MariaDB with all defaults, not much in the way of custom settings or modules.

Code: Select all
[Fri Nov 01 04:02:03.881629 2013] [auth_digest:notice] [pid 1631] AH01757: generating secret for digest authentication ...
[Fri Nov 01 04:02:04.102834 2013] [mpm_prefork:notice] [pid 1631] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Fri Nov 01 04:02:04.102917 2013] [core:notice] [pid 1631] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Fri Nov 01 04:02:04.217217 2013] [mpm_prefork:notice] [pid 1631] AH00171: Graceful restart requested, doing restart
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Fri Nov 01 04:02:04.322557 2013] [auth_digest:notice] [pid 1631] AH01757: generating secret for digest authentication ...
[Fri Nov 01 04:02:05.046285 2013] [mpm_prefork:notice] [pid 1631] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Fri Nov 01 04:02:05.046351 2013] [core:notice] [pid 1631] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Fri Nov 01 06:41:17.565662 2013] [core:notice] [pid 1631] AH00052: child pid 22379 exit signal Segmentation fault (11)
[Fri Nov 01 06:42:00.400760 2013] [mpm_prefork:notice] [pid 1631] AH00170: caught SIGWINCH, shutting down gracefully
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Fri Nov 01 06:42:02.085394 2013] [auth_digest:notice] [pid 23919] AH01757: generating secret for digest authentication ...
[Fri Nov 01 06:42:03.076773 2013] [mpm_prefork:notice] [pid 23919] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Fri Nov 01 06:42:03.076921 2013] [core:notice] [pid 23919] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'

by **KnightMB** » Dec 1st, '13, 18:43

It has happened again, this time on a hosted server elsewhere (including all of mine).

I notice that it is Dec 1 now; the last time this happened it was Nov 1, just one month ago.

Is no one else seeing this? It happens on both the 32-bit and 64-bit Mageia 3 installs. I've got 4 completely different systems (different brands, makes, RAM, etc.) where it happens at the same time: Apache just stops, no logs, no errors, nothing.

I am going to run some experiments to see if ticking over from one month to the next is the issue (basically set the clocks back and watch what happens over a 24-hour period). I sure would feel better if someone else was seeing this.

Are the official Mageia websites not running Mageia :?:

by **doktor5000** » Dec 1st, '13, 19:15

As you didn't state it explicitly yet, did you check all the various crontab locations and maybe even at jobs?

by **KnightMB** » Dec 1st, '13, 19:21

doktor5000 wrote:As you didn't state it explicitly yet, did you check all the various crontab locations and maybe even at jobs?

Great idea, never thought of that.
This is what was set on the machine (well, on all of them by default).
The monthly cron runs on the 1st of every month at 4:42 AM.

These are the two jobs that run, I'll have to examine them further to see what they do.

/etc/cron.monthly/0anacron-timestamp
/etc/cron.monthly/update-microcode

by **KnightMB** » Dec 1st, '13, 19:25

Ok, not much to the first job

0anacron-timestamp

Code: Select all
#!/bin/sh
# Updates the timestamp of last monthly run for anacron
date +%Y%m%d > /var/spool/anacron/cron.monthly

Second job though, might be it!
update-microcode

Code: Select all
#!/bin/sh
#
# check if there is a new microcode for your CPU and update it
# Intel 686 and above, AMD family 16 and above

vendor=`grep "^vendor_id" /proc/cpuinfo | head -n1 | awk -F ": " '{ print $2 }'`
family=`grep "^cpu family" /proc/cpuinfo | head -n1 | awk -F ": " '{ print $2 }'`

if [ "$vendor" = "GenuineIntel" ] && [ $family -ge 6 ]; then
    /usr/sbin/update-intel-microcode
elif [ "$vendor" = "AuthenticAMD" ] && [ $family -ge 16 ]; then
    minor=`uname -r | cut -d . -f 3 | cut -d - -f 1`
    if [ $minor -ge 29 ]; then
        /usr/sbin/update-amd-microcode
    fi
fi

I have a system with the time rolled back to see if it happens again when the next month ticks over.

by **doktor5000** » Dec 1st, '13, 21:55

Are those the only cron jobs you checked?
What about

Code: Select all: crontab -l root

or the contents of

Code: Select all: /etc/crontab

or

Code: Select all: /etc/anacrontab

What does

Code: Select all: find /etc/cron* -type f

show?

by **KnightMB** » Dec 2nd, '13, 05:22

doktor5000 wrote:Are those the only cron jobs you checked?
What about
Code: Select all
crontab -l root

crontab: usage error: no arguments permitted after this option
usage: crontab [-u user] file
crontab [-u user] [ -e | -l | -r ]
crontab -n [ hostname ]
crontab -c
(default operation is replace, per 1003.2)
-e (edit user's crontab)
-l (list user's crontab)
-r (delete user's crontab)
-i (prompt before deleting user's crontab)
-n (set host in cluster to run users' crontabs)
-c (get host in cluster to run users' crontabs)

or the contents of
Code: Select all
/etc/crontab

bash: cd: /etc/crontab: Not a directory

or
Code: Select all
/etc/anacrontab

bash: cd: /etc/anacrontab: Not a directory

What does
Code: Select all
find /etc/cron* -type f

show?

/etc/cron.d/php
/etc/cron.daily/0anacron-timestamp
/etc/cron.daily/rpm
/etc/cron.daily/makewhatis.cron
/etc/cron.daily/logrotate
/etc/cron.daily/mlocate.cron
/etc/cron.daily/tmpwatch
/etc/cron.deny
/etc/cron.hourly/0anacron
/etc/cron.monthly/0anacron-timestamp
/etc/cron.monthly/update-microcode
/etc/crontab
/etc/cron.weekly/0anacron-timestamp
/etc/cron.weekly/makewhatis.cron
/etc/cron.weekly/makewhatis-en.cron

by **doktor5000** » Dec 2nd, '13, 12:24

KnightMB wrote:
doktor5000 wrote:Are those the only cron jobs you checked?
What about
Code: Select all
crontab -l root

Well, it was only crontab -l - Did you read the usage?

KnightMB wrote:
or the contents of
Code: Select all
/etc/crontab

bash: cd: /etc/crontab: Not a directory

Right, it's a regular file. You can't cd to a file.

or
Code: Select all
/etc/anacrontab

bash: cd: /etc/anacrontab: Not a directory

Also a regular file.

For the other crontab files you found, you should probably take a look inside them.
I've given you pointers where to look, you need to figure this out for yourself.

by **KnightMB** » Dec 2nd, '13, 14:41

doktor5000 wrote:Well, it was only crontab -l - Did you read the usage?

Code: Select all
# crontab -l
no crontab for root

Code: Select all
# cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root nice -n 19 run-parts --report /etc/cron.hourly
02 4 * * * root nice -n 19 run-parts --report /etc/cron.daily
22 4 * * 0 root nice -n 19 run-parts --report /etc/cron.weekly
42 4 1 * * root nice -n 19 run-parts --report /etc/cron.monthly

Code: Select all
# cat /etc/anacrontab
# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.

SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=6-22

#period in days delay in minutes job-identifier command
1 5 cron.daily nice -n 19 run-parts /etc/cron.daily
7 25 cron.weekly nice -n 19 run-parts /etc/cron.weekly
@monthly 45 cron.monthly nice -n 19 run-parts /etc/cron.monthly

doktor5000 wrote:For the other crontab files you found, you should probably take a look inside them.
I've given you pointers where to look, you need to figure this out for yourself.

Thanks for the help. So far, rolling the clock back has not reproduced it. I've tried a couple of times just to see if one of those cron jobs was doing it. It might be a coincidence, or whatever happens may need a full month to happen; I am not sure yet. One machine can be busy serving websites and another can sit idle for a month, and the same thing happens to both, but I don't know why yet.
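
The run times in the /etc/crontab output above can be read off mechanically, which makes them easier to compare against log timestamps. A minimal sketch, assuming a system crontab in the usual "minute hour day-of-month month day-of-week user command" layout; the function name `cron_runparts_schedule` is made up for illustration:

```shell
# Summarise when each run-parts directory is scheduled in a system crontab.
# $1 is the path to the crontab file (for example /etc/crontab).
cron_runparts_schedule() {
    # Skip comment lines, keep only lines that call run-parts, and print
    # the hour, minute, day-of-month and day-of-week fields next to the
    # directory (the last word on the line).
    awk '$1 !~ /^#/ && /run-parts/ {
        printf "hour=%s min=%s dom=%s dow=%s %s\n", $2, $1, $3, $5, $NF
    }' "$1"
}

# Example call:
#   cron_runparts_schedule /etc/crontab
```

For the file shown above this would list cron.daily at hour 4, minute 02 and cron.monthly at hour 4, minute 42 on day-of-month 1. Jobs started by anacron follow /etc/anacrontab instead and are not covered by this sketch.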

by **doktor5000** » Dec 2nd, '13, 21:20

Please next time use code tags as explained in ftp://ftp5.gwdg.de/pub/linux/mandriva/m ... e_tags.ogv

by **KnightMB** » Dec 4th, '13, 01:28

No luck trying to reproduce this; setting the clock back does not appear to do anything. It might be that it just takes 30 days for this to happen, so I might experiment with setting the clock to the 1st of the month, waiting a day, then accelerating it to the end of the month to see what happens.

by **filip** » Dec 4th, '13, 08:33

I would try running jobs listed in /etc/cron.monthly manually.

by **KnightMB** » Dec 4th, '13, 08:44

filip wrote:I would try running jobs listed in /etc/cron.monthly manually.

I have tried this, didn't affect Apache at all, was still running even after doing it a couple of times in a row.

by **jiml8** » Dec 4th, '13, 08:51

You should also look in /var/spool/cron. You may find some crontabs there.

by **KnightMB** » Dec 4th, '13, 08:57

jiml8 wrote:You should also look in /var/spool/cron. You may find some crontabs there.

I checked, folder was empty. No hidden files either.

by **filip** » Dec 4th, '13, 09:44

Is there any possibility of a collision between cron.hourly, cron.daily, cron.weekly and cron.monthly?

Is there any particular reason that both cron and anacron are installed?

by **KnightMB** » Dec 4th, '13, 09:50

That's what the system comes with by default.

I checked all the cron dates, none of them overlap.

It might not be cron related?

by **filip** » Dec 4th, '13, 13:20

KnightMB wrote:I checked all the cron dates, none of them overlap.

But do they finish quickly enough when they run?

by **KnightMB** » Jan 1st, '14, 22:48

Well, it has been another month and it happened again.

At least this time, I had as much logging turned on as I could find. It appears all of them log "Graceful restart requested, doing restart", and from that point Apache is down until you notice and restart the process.

Here is the log from a machine set up specifically to sit there and do nothing but run Apache (Mageia 3, 32-bit distro) to test this issue. It has been sitting idle for a month, basically.

Code: Select all
[Wed Jan 01 04:02:03.350884 2014] [mpm_prefork:notice] [pid 2070] AH00171: Graceful restart requested, doing restart
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Wed Jan 01 04:02:03.704711 2014] [auth_digest:notice] [pid 2070] AH01757: generating secret for digest authentication ...
[Wed Jan 01 04:02:04.838923 2014] [mpm_prefork:notice] [pid 2070] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 04:02:04.839004 2014] [core:notice] [pid 2070] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 04:02:08.862654 2014] [core:notice] [pid 2070] AH00052: child pid 28695 exit signal Segmentation fault (11)
[Wed Jan 01 05:20:21.314645 2014] [core:notice] [pid 2070] AH00052: child pid 28694 exit signal Segmentation fault (11)
[Wed Jan 01 05:20:24.321166 2014] [core:notice] [pid 2070] AH00052: child pid 28693 exit signal Segmentation fault (11)
[Wed Jan 01 05:20:34.336042 2014] [core:notice] [pid 2070] AH00052: child pid 28691 exit signal Segmentation fault (11)
[Wed Jan 01 05:22:28.571947 2014] [mpm_prefork:notice] [pid 2070] AH00170: caught SIGWINCH, shutting down gracefully
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
[Wed Jan 01 05:22:30.092606 2014] [auth_digest:notice] [pid 30092] AH01757: generating secret for digest authentication ...
[Wed Jan 01 05:22:31.162031 2014] [mpm_prefork:notice] [pid 30092] AH00163: Apache/2.4.4 (Unix) PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 05:22:31.162172 2014] [core:notice] [pid 30092] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'

This one is a server out in the production environment (Mageia 3, 64-bit distro). I shortened the log because the in-between part is megabytes of segfaults, which I guess come from web visitors trying to connect while Apache is in the crashed state.

Code: Select all
[Wed Jan 01 06:27:19.759375 2014] [mpm_prefork:notice] [pid 24675] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:27:19.987618 2014] [auth_digest:notice] [pid 24675] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:27:20.438882 2014] [mpm_prefork:notice] [pid 24675] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:27:20.438907 2014] [core:notice] [pid 24675] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:27:20.476375 2014] [mpm_prefork:notice] [pid 24675] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:27:20.518603 2014] [auth_digest:notice] [pid 24675] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:27:21.077690 2014] [mpm_prefork:notice] [pid 24675] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:27:21.077718 2014] [core:notice] [pid 24675] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:27:21.077746 2014] [core:notice] [pid 24675] AH00052: child pid 27251 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.078656 2014] [core:notice] [pid 24675] AH00052: child pid 27252 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.079592 2014] [core:notice] [pid 24675] AH00052: child pid 27253 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.080576 2014] [core:notice] [pid 24675] AH00052: child pid 27254 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.082380 2014] [core:notice] [pid 24675] AH00052: child pid 27255 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087158 2014] [core:notice] [pid 24675] AH00052: child pid 27256 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087196 2014] [core:notice] [pid 24675] AH00052: child pid 27257 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087210 2014] [core:notice] [pid 24675] AH00052: child pid 27363 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087222 2014] [core:notice] [pid 24675] AH00052: child pid 27364 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087236 2014] [core:notice] [pid 24675] AH00052: child pid 27365 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:21.087248 2014] [core:notice] [pid 24675] AH00052: child pid 27366 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:22.090640 2014] [core:notice] [pid 24675] AH00052: child pid 27367 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:23.093979 2014] [core:notice] [pid 24675] AH00052: child pid 27407 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:23.094039 2014] [core:notice] [pid 24675] AH00052: child pid 27408 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:24.096929 2014] [core:notice] [pid 24675] AH00052: child pid 27409 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:24.097017 2014] [core:notice] [pid 24675] AH00052: child pid 27410 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:25.100087 2014] [core:notice] [pid 24675] AH00052: child pid 27411 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:25.100158 2014] [core:notice] [pid 24675] AH00052: child pid 27412 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:26.102446 2014] [core:notice] [pid 24675] AH00052: child pid 27413 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:26.102502 2014] [core:notice] [pid 24675] AH00052: child pid 27414 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:27.104996 2014] [core:notice] [pid 24675] AH00052: child pid 27450 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:27.105051 2014] [core:notice] [pid 24675] AH00052: child pid 27451 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:28.108331 2014] [core:notice] [pid 24675] AH00052: child pid 27584 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:28.108406 2014] [core:notice] [pid 24675] AH00052: child pid 27589 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:29.111239 2014] [core:notice] [pid 24675] AH00052: child pid 28059 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:29.111284 2014] [core:notice] [pid 24675] AH00052: child pid 28060 exit signal Segmentation fault (11)
[Wed Jan 01 06:27:30.114444 2014] [core:notice] [pid 24675] AH00052: child pid 28255 exit signal Segmentation fault (11)
..........................
[Wed Jan 01 08:40:34.815761 2014] [core:notice] [pid 24675] AH00052: child pid 24392 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815770 2014] [core:notice] [pid 24675] AH00052: child pid 24400 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815901 2014] [core:notice] [pid 24675] AH00052: child pid 24401 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815912 2014] [core:notice] [pid 24675] AH00052: child pid 24402 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815921 2014] [core:notice] [pid 24675] AH00052: child pid 24403 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:34.815928 2014] [core:notice] [pid 24675] AH00052: child pid 24404 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615193 2014] [core:notice] [pid 24675] AH00052: child pid 24405 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615265 2014] [core:notice] [pid 24675] AH00052: child pid 24406 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615273 2014] [core:notice] [pid 24675] AH00052: child pid 24408 exit signal Segmentation fault (11)
[Wed Jan 01 08:40:35.615334 2014] [mpm_prefork:notice] [pid 24675] AH00170: caught SIGWINCH, shutting down gracefully
[Wed Jan 01 08:40:37.132632 2014] [auth_digest:notice] [pid 24616] AH01757: generating secret for digest authentication ...
[Wed Jan 01 08:40:38.094225 2014] [mpm_prefork:notice] [pid 24616] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 08:40:38.094330 2014] [core:notice] [pid 24616] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'

Another server out in the wild had the same thing happen today. I shortened the segfaults in between on this one as well, since they went on for megabytes and megabytes in the log file until Apache was restarted. The system runs Mageia 3 (64-bit distro).

Code: Select all
[Wed Jan 01 06:41:55.954521 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:56.561629 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:57.612990 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:57.613048 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:57.636383 2014] [core:notice] [pid 27705] AH00052: child pid 11047 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.636453 2014] [core:notice] [pid 27705] AH00052: child pid 11048 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.670088 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:57.788093 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:58.072115 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:58.072155 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:58.072185 2014] [core:notice] [pid 27705] AH00052: child pid 11046 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.073615 2014] [core:notice] [pid 27705] AH00052: child pid 11049 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.076334 2014] [core:notice] [pid 27705] AH00052: child pid 11050 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.078804 2014] [core:notice] [pid 27705] AH00052: child pid 11051 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.086791 2014] [core:notice] [pid 27705] AH00052: child pid 11064 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.089276 2014] [core:notice] [pid 27705] AH00052: child pid 11066 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:58.215423 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:58.314932 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:59.053476 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:59.053518 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:59.053653 2014] [core:notice] [pid 27705] AH00052: child pid 11065 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.055276 2014] [core:notice] [pid 27705] AH00052: child pid 11067 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.058051 2014] [core:notice] [pid 27705] AH00052: child pid 11068 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.060458 2014] [core:notice] [pid 27705] AH00052: child pid 11069 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.063513 2014] [core:notice] [pid 27705] AH00052: child pid 11093 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:59.068588 2014] [core:notice] [pid 27705] AH00052: child pid 11095 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:00.071305 2014] [core:notice] [pid 27705] AH00052: child pid 11094 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:00.071402 2014] [core:notice] [pid 27705] AH00052: child pid 11096 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:00.071649 2014] [core:notice] [pid 27705] AH00052: child pid 11097 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:01.074137 2014] [core:notice] [pid 27705] AH00052: child pid 11192 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:02.076406 2014] [core:notice] [pid 27705] AH00052: child pid 11193 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:03.078976 2014] [core:notice] [pid 27705] AH00052: child pid 11194 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:04.082269 2014] [core:notice] [pid 27705] AH00052: child pid 11195 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:05.085299 2014] [core:notice] [pid 27705] AH00052: child pid 11216 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:06.088397 2014] [core:notice] [pid 27705] AH00052: child pid 11236 exit signal Segmentation fault (11)
[Wed Jan 01 06:42:07.090995 2014] [core:notice] [pid 27705] AH00052: child pid 11237 exit signal Segmentation fault (11)
...........................................................
[Wed Jan 01 08:36:45.226272 2014] [core:notice] [pid 27705] AH00052: child pid 2355 exit signal Segmentation fault (11)
[Wed Jan 01 08:36:45.226303 2014] [core:notice] [pid 27705] AH00052: child pid 2358 exit signal Segmentation fault (11)
[Wed Jan 01 08:36:45.226386 2014] [mpm_prefork:notice] [pid 27705] AH00170: caught SIGWINCH, shutting down gracefully
[Wed Jan 01 08:36:47.216329 2014] [auth_digest:notice] [pid 2572] AH01757: generating secret for digest authentication ...
[Wed Jan 01 08:36:48.131619 2014] [mpm_prefork:notice] [pid 2572] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 08:36:48.131731 2014] [core:notice] [pid 2572] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'

What all of them have in common, besides being Mageia 3, is the "AH00171: Graceful restart requested, doing restart" line, which I am hoping will give a clue as to what is causing this. It only shows up once a month, when the crash happens. I don't know what is causing the request or why it happens every month, but I am hoping that finding the cause will also lead to the solution.
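
To check how often that notice really appears, and at what time of day, the timestamps can be pulled out of an error log directly. A minimal sketch, assuming the log uses the bracketed timestamp format shown in the excerpts above; the function name `graceful_restart_times` is made up for illustration, and the log path in the example call is an assumption:

```shell
# Print the timestamp of every graceful-restart notice (AH00171) in an
# Apache error log. $1 is the path to the log file.
graceful_restart_times() {
    # Matching lines look like:
    #   [Wed Jan 01 04:02:03.350884 2014] [mpm_prefork:notice] ... AH00171: ...
    # Keep only the text inside the first pair of square brackets.
    sed -n '/AH00171/ s/^\[\([^]]*\)].*/\1/p' "$1"
}

# Example call (log location is an assumption, adjust to your system):
#   graceful_restart_times /var/log/httpd/error_log
```

On the idle test machine the notice is stamped 04:02:03, which can be compared against the `02 4 * * *` cron.daily entry in the /etc/crontab posted earlier in the thread.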

by **doktor5000** » Jan 2nd, '14, 00:28

KnightMB wrote:What you notice is common between all of them besides being Mageia 3 is that the "AH00171: Graceful restart requested, doing restart" which I am hoping will give a clue to finding out what is causing this.

What would be interesting is what are the last 50-100 lines before the graceful restart?
And what do you run on top of that apache, what does it serve?
Do you maybe have logrotate installed, which can also ask Apache to gracefully restart so that it can process the logs?

Code: Select all
[doktor5000@Mageia3 ~]$ urpmf /etc/logrotate.d | sort -u | grep apache
apache:/etc/logrotate.d/httpd
apache-mod_security:/etc/logrotate.d/mod_security

Some related links:
http://forums.cpanel.net/f5/apache-perf ... ost1093141
http://www.gossamer-threads.com/lists/g ... ded#235829
http://www.serverschool.com/server-soft ... log-files/
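
Whether logrotate is the one asking Apache to restart can usually be read from the `postrotate` section of its config file. A minimal sketch, assuming the config lives at /etc/logrotate.d/httpd as the urpmf output above suggests; the helper name `show_postrotate` is made up for illustration, and what that section contains on a given install is not shown in this thread:

```shell
# Print the postrotate ... endscript section(s) of a logrotate config file.
# $1 is the path to the config; nothing is assumed about its contents.
show_postrotate() {
    sed -n '/postrotate/,/endscript/p' "$1"
}

# Example call (path taken from the urpmf output above):
#   show_postrotate /etc/logrotate.d/httpd
```

If a reload or restart command shows up there, `logrotate -d` on the same file runs logrotate in debug mode, which reports what it would do without rotating anything.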

by **KnightMB** » Jan 4th, '14, 09:37

Sorry about the delay (New Year holidays + catch-up work :mrgreen:)

I'll start with previous logs:

Test Machine (Mageia 3, 32bit)
error_log.1

Code: Select all
[Wed Jan 01 04:02:03.350884 2014] [mpm_prefork:notice] [pid 2070] AH00171: Graceful restart requested, doing restart
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message

Production System Mageia 3 (64bit distro)
error_log.1

Code: Select all
[Wed Jan 01 03:36:39.417590 2014] [cgi:error] [pid 3355] [client 192.241.146.164:56874] script not found or unable to stat: /var/www/cgi-bin/php
[Wed Jan 01 03:36:39.530072 2014] [cgi:error] [pid 31950] [client 192.241.146.164:56898] script not found or unable to stat: /var/www/cgi-bin/php5
[Wed Jan 01 03:36:39.628475 2014] [cgi:error] [pid 4202] [client 192.241.146.164:56915] script not found or unable to stat: /var/www/cgi-bin/php-cgi
[Wed Jan 01 03:36:39.727677 2014] [cgi:error] [pid 27893] [client 192.241.146.164:56929] script not found or unable to stat: /var/www/cgi-bin/php.cgi
[Wed Jan 01 03:36:39.828519 2014] [cgi:error] [pid 25936] [client 192.241.146.164:56951] script not found or unable to stat: /var/www/cgi-bin/php4
[Wed Jan 01 06:27:19.759375 2014] [mpm_prefork:notice] [pid 24675] AH00171: Graceful restart requested, doing restart

Another Production System Mageia 3 (64bit distro)
error_log.1

Code: Select all
[Wed Jan 01 01:34:46.914779 2014] [:error] [pid 11655] [client 213.186.127.10:55228] script '/var/www/html/index.php' not found or unable to stat
[Wed Jan 01 03:36:39.360683 2014] [cgi:error] [pid 21684] [client 192.241.146.164:47617] script not found or unable to stat: /var/www/cgi-bin/php
[Wed Jan 01 03:36:39.457281 2014] [cgi:error] [pid 22177] [client 192.241.146.164:47642] script not found or unable to stat: /var/www/cgi-bin/php5
[Wed Jan 01 03:36:39.558182 2014] [cgi:error] [pid 18639] [client 192.241.146.164:47657] script not found or unable to stat: /var/www/cgi-bin/php-cgi
[Wed Jan 01 03:36:39.654239 2014] [cgi:error] [pid 24494] [client 192.241.146.164:47673] script not found or unable to stat: /var/www/cgi-bin/php.cgi
[Wed Jan 01 03:36:39.752752 2014] [cgi:error] [pid 25675] [client 192.241.146.164:47690] script not found or unable to stat: /var/www/cgi-bin/php4
[Wed Jan 01 05:24:30.880643 2014] [:error] [pid 23429] [client 213.186.127.3:33044] script '/var/www/html/index.php' not found or unable to stat
[Wed Jan 01 06:41:55.954521 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart
[Wed Jan 01 06:41:56.561629 2014] [auth_digest:notice] [pid 27705] AH01757: generating secret for digest authentication ...
[Wed Jan 01 06:41:57.612990 2014] [mpm_prefork:notice] [pid 27705] AH00163: Apache/2.4.4 (Unix) OpenSSL/1.0.1e PHP/5.4.19 configured -- resuming normal operations
[Wed Jan 01 06:41:57.613048 2014] [core:notice] [pid 27705] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Wed Jan 01 06:41:57.636383 2014] [core:notice] [pid 27705] AH00052: child pid 11047 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.636453 2014] [core:notice] [pid 27705] AH00052: child pid 11048 exit signal Segmentation fault (11)
[Wed Jan 01 06:41:57.670088 2014] [mpm_prefork:notice] [pid 27705] AH00171: Graceful restart requested, doing restart

by **KnightMB** » Jan 4th, '14, 09:43

doktor5000 wrote:
KnightMB wrote:What you notice is common between all of them besides being Mageia 3 is that the "AH00171: Graceful restart requested, doing restart" which I am hoping will give a clue to finding out what is causing this.

What would be interesting is what are the last 50-100 lines before the graceful restart?
And what do you run on top of that apache, what does it serve?
Do you maybe have logrotate installed, which can also ask Apache to gracefully restart so that it can process the logs?
Code: Select all
[doktor5000@Mageia3 ~]$ urpmf /etc/logrotate.d | sort -u | grep apache
apache:/etc/logrotate.d/httpd
apache-mod_security:/etc/logrotate.d/mod_security

Some related links:
http://forums.cpanel.net/f5/apache-perf ... ost1093141
http://www.gossamer-threads.com/lists/g ... ded#235829
http://www.serverschool.com/server-soft ... log-files/

I am starting to think it is the log rotation as well. It doesn't cause any issues when it runs daily during the month, but it seems that when the month changes, something goes wrong. I am not sure if it is Apache that isn't handling the month transition properly, or if logrotate is calling for a graceful restart but somehow in a wrong way. I've tried to simulate this before by taking a machine, setting the date back a few days, and letting it tick over the month again, but could never duplicate it. What has to be done is literally leave the machine on all month, which is how I was able to duplicate it with my test machine. Sucks that I can't just perform the test over and over; I have to wait all month for it to happen :lol:

The test machine serves nothing, it just runs the apache "it works!" page and that is it, no traffic except when I check it from time to time to see if it is still up.

The other two machines get heavy traffic, one serves over 2 dozen different websites, the other serves a forum, website, bug database, etc.

by **KnightMB** » Jan 4th, '14, 09:55

After reading over your links, I'll try an experiment. I am going to set the cron job for the logrotate to not run on the 1st day of the month. Next month I will know then if it is related to the logrotate doing a graceful restart that fails for some reason.
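
One way to set up this experiment without touching /etc/crontab is a guard that makes the daily logrotate job exit early on the 1st. A minimal sketch; the helper name `is_first_of_month` and the idea of placing the guard at the top of /etc/cron.daily/logrotate are assumptions for illustration, not something confirmed in this thread:

```shell
# Succeed (exit status 0) when the given day-of-month string is the 1st.
# Expects the zero-padded form that `date +%d` prints, e.g. "01" or "15".
is_first_of_month() {
    [ "$1" = "01" ]
}

# Hypothetical guard near the top of /etc/cron.daily/logrotate:
#   is_first_of_month "$(date +%d)" && exit 0
```

With the guard in place, log rotation is simply skipped on the 1st and runs again with the normal daily job on the 2nd.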

Did Everyone's Apache Server Go Dead at the same Time?
