ircd died without obvious reason

These are old archives. They are kept for historic purposes only.
TigerKatziTatzi
Posts: 36
Joined: Fri Apr 08, 2005 12:10 pm

ircd died without obvious reason

Post by TigerKatziTatzi »

Aloha, two things.

For the past couple of days we have had problems with dying leafs: the IRCd just terminates itself. So far we haven't found anything really related to it. We only find a core dump after a crash (file size about 60 MB), but nothing in it gives us a hint about where it might come from. For the past couple of days we have also had a kind of competition on our net, with 2-3 people playing with floodbots, so the only thing we see is high activity from opsb, defender, and our open proxy monitoring scripts, plus a temporarily longer gline list. This morning two leafs died at the same time. The bot activity had been 10 minutes before they died. The gline list had only 35 entries, so nothing really high (we sometimes have up to 300 glines for several hours).

ircd.log only shows entries from other days. This brings me to my second point.

[Fri Jun 10 19:22:45 2005] - TROUBLE: buffer allocation error! Increase BUFFERPOOL!
Questions:
- Is it only possible to change this at compile time, or can it be changed with the config settings for sendq?
- Where can this 'trouble' occur: for server sendq only, or for all sendqs?
- Can the ircd die because of this problem, if it is no longer able to handle it?
- Is the error message caused by too few pools or by the total maxsendq (16x3000000)? From 'Increase BUFFERPOOL' I understand that the number of pools needs to be increased (>16).
w00t
Posts: 1136
Joined: Thu Mar 25, 2004 3:31 am
Location: Nowra, Australia

Post by w00t »

I can't remember for sure (I'd check, but I'm going to bed soon):
I thought there was a question related to this in ./Config.

[slightly o/t]Should they really die because of this? :/[/slightly o/t]
-ChatSpike IRC Network [http://www.chatspike.net]
-Denora Stats [http://denora.nomadirc.net]
-Omerta [http://www.barafranca.com]
Syzop
UnrealIRCd head coder
Posts: 2117
Joined: Sat Mar 06, 2004 8:57 pm
Location: .nl
Contact:

Post by Syzop »

- Is the error message caused by too few pools or by the total maxsendq (16x3000000)? From 'Increase BUFFERPOOL' I understand that the number of pools needs to be increased (>16).
Correct, BUFFERPOOL is the total size of all the queues.

So to enlarge it, just put a higher number for either one (BUFFERPOOL = NUM_POOLS * MAXSENDQLENGTH)... I would suggest increasing the number of pools (e.g. from 18 to 36 or 72).

It's a (kinda odd) prevention measure against the ircd eating all your memory and taking your entire system down with it (.. instead, the ircd often goes down :P).
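To illustrate the idea of such a cap, here is a minimal sketch in C of a byte budget that refuses new reservations once a fixed limit is reached. The struct and function names are hypothetical and chosen only for illustration; this is not UnrealIRCd's actual code, and what the ircd does after a refused allocation is not shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical budget tracker; not UnrealIRCd's actual data structure. */
struct pool_budget {
    size_t limit;  /* total bytes allowed, e.g. NUM_POOLS * MAXSENDQLENGTH */
    size_t used;   /* bytes currently reserved across all queues */
};

/* Reserve `bytes` if the cap allows it; return false otherwise.
 * Invariant: used <= limit, so (limit - used) cannot underflow. */
bool budget_reserve(struct pool_budget *b, size_t bytes)
{
    if (bytes > b->limit - b->used)
        return false;  /* a caller could log an allocation error here */
    b->used += bytes;
    return true;
}

/* Give bytes back once queued data has been sent; clamps at zero. */
void budget_release(struct pool_budget *b, size_t bytes)
{
    b->used -= (bytes < b->used) ? bytes : b->used;
}
```

The point of the sketch is only that the limit is global: one very busy queue can use up the budget for every other queue, regardless of how many users are connected.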

The default (18*3M=54M) should probably be enlarged, something like 128M seems more reasonable. It might also be wise to move it to the config file (use the one from ./Config if it isn't specified).
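The arithmetic above can be checked with a small C sketch. The function names are hypothetical, the parameters follow the formula BUFFERPOOL = NUM_POOLS * MAXSENDQLENGTH from this post, and "M" is assumed to mean 10^6 bytes; how the values are actually defined in the source is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Total buffer pool in bytes: BUFFERPOOL = NUM_POOLS * MAXSENDQLENGTH. */
uint64_t buffer_pool_bytes(uint64_t num_pools, uint64_t max_sendq_length)
{
    return num_pools * max_sendq_length;
}

/* Smallest number of pools whose total is at least target_bytes
 * (integer ceiling division). */
uint64_t pools_needed(uint64_t target_bytes, uint64_t max_sendq_length)
{
    assert(max_sendq_length > 0);
    return (target_bytes + max_sendq_length - 1) / max_sendq_length;
}
```

With the numbers from this thread, buffer_pool_bytes(18, 3000000) gives 54000000 (the 54M default), doubling to 36 pools gives 108000000, and pools_needed(128000000, 3000000) gives 43 pools to reach roughly 128M.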
TigerKatziTatzi
Posts: 36
Joined: Fri Apr 08, 2005 12:10 pm

Post by TigerKatziTatzi »

We are using the standard 18*3M. We have 4 leafs compiled with 18*5M. Those haven't crashed like the others, only during DDoS. But that problem hadn't shown up before either. So I guess we have to recompile everything and give it a try.

But the question is still open: can the ircd terminate because of this?

To give you some numbers:
The leafs which crashed and terminated themselves had a load of only 400 - 550 users each. Total user load was about 3.5k on 7 leafs.
This makes no sense to me, because at the beginning of this net we had loads of >950 users per leaf, with a total user load of 4.5k-5k, and no problems appeared. Our network is a pure chat network. No botters.
TigerKatziTatzi
Posts: 36
Joined: Fri Apr 08, 2005 12:10 pm

Post by TigerKatziTatzi »

Here is some additional info about the dying leafs......


It looks like it can be tracked down to a spamfilter 'bug' when triggering on 'users'.

-moderator edit: please don't post all this info here, post it on bugs.unrealircd.org instead-

According to this post from a couple of months ago:
http://forums.unrealircd.com/viewtopic. ... ght=#10340

I guess we have this problem again in a minor form.
We wanted to do some double-checking tests, as described in that post, but getting a new net started is more work than you might think.
We will keep an eye on it and investigate further, but our guess is probably right. Meanwhile we have found similar output on different boxes and in their core dumps.

Spamfilter and trigger for user.

A bug tracker report will be written as well.