bWolfie

Map server crash

Question

Hello,

Recently my map server crashed. I'm hoping somebody can help me based on this gdb debug information.
First it reported an overflow in script, in script_reg_destroy at if( p->value )

Then it crashed with this log on 'bt full'.

https://pastebin.com/q50NEiDD


29 answers to this question


  • 0

Are you using any plugins or mods?

Which Hercules commit are you using? Judging by the stack, your source looks different from stock, or maybe a plugin intercepts some variable changes?

The crash happens because sd is malformed. It looks like there was a null pointer somewhere else earlier, but the server did not crash at that point.

 

  • 0

I'm using quite a few plugins. I've tried to disable them one by one but it is difficult to find what is the cause.

  • 0

Also, if possible, try building the server with the sanitize flags enabled.

I'm not sure how to install the missing packages on CentOS. You also need gcc 5 or newer.

  • 0
3 hours ago, 4144 said:

Also, if possible, try building the server with the sanitize flags enabled.

I'm not sure how to install the missing packages on CentOS. You also need gcc 5 or newer.

Thanks for your responses. I am unable to use the enable-sanitize=full option. It tells me 'configure: error: zlib library not found or incompatible...', despite the fact I have installed that dependency (version 1.2.7).

The crashes are referencing variables that aren't being used by any script (but were used in the past).
And I never made any edit to any src in pc or script other than adding some script command via plugin.

This part here is where pc_setglobalreg(sd, num, val); is called. Are there any immediate quick-fix options, like clearing my char_reg_num/str_db? Or would that not make a difference?

name=0x7fffffffe0d0 "newbquest", value=0x1, ref=0x0) at script.c:3573
Edited by Myriad

  • 0

This configure error means some packages are not installed. Try installing these packages: libasan liblsan libubsan

Judging by the crash stack, you are not on the latest Hercules, or your Hercules is modded.

I already asked about the commit. Which Hercules commit id are you using? Without it, it is impossible to check what went wrong here.

This frame points at an empty line, and in a different function, which means the stack is completely wrong:

#5  0x00000000004580cf in chrif_parse (fd=14340) at chrif.c:1645

 

  • 0

I wasn't able to enable sanitize after installing those packages. I cleared my char_reg_num_db and things have been okay for 36 hours now. I will post again if things become an issue.

  • 0

I don't know what to do. Seems all sorts of things can cause a crash. Now pc_setregistry did it.

https://pastebin.com/Nc5e03gH

For reference, I am using src mods (Gepard Shield) and some plugins of my own (various edits).

The #BG_TIE variable is being set using

            pc_setglobalreg(sd, script->add_str("#BG_TIE"), pc_readglobalreg(sd, script->add_str("#BG_TIE")) +  1);
            pc_setglobalreg(sd, reference_uid(script->add_str("#BG_TIE"), month), pc_readglobalreg(sd, reference_uid(script->add_str("#BG_TIE"), month)) +  1);
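
For readers unfamiliar with reference_uid in the two calls above: it appears to combine a variable's string id with an array index into a single 64-bit key. Below is a minimal sketch of that idea, assuming the id sits in the low 32 bits and the index in the high 32 bits. The names make_uid, uid_id and uid_idx are hypothetical and are not Hercules functions; the real macro in script.h should be checked before relying on this layout.

```c
#include <stdint.h>

/* Hypothetical helper (not the Hercules definition): pack a variable id
 * into the low 32 bits and an array index into the high 32 bits. */
static int64_t make_uid(uint32_t id, uint32_t idx)
{
    return (int64_t)(((uint64_t)idx << 32) | (uint64_t)id);
}

/* Recover the variable id from a packed uid. */
static uint32_t uid_id(int64_t uid)
{
    return (uint32_t)((uint64_t)uid & 0xFFFFFFFFu);
}

/* Recover the array index from a packed uid. */
static uint32_t uid_idx(int64_t uid)
{
    return (uint32_t)(((uint64_t)uid >> 32) & 0xFFFFFFFFu);
}
```

Under this assumed layout, index 0 yields the plain id, so the first call above (no index) and a call with index 0 would address the same register, while each month value addresses a separate one.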



#1: pc.c/9816

p = ers_alloc(pc->num_reg_ers, struct script_reg_num);

#2 script.c/3573
[code:c]
        case '\'':
            set_reg_instance_num(st, num, name, val);
            return 1;
        default:
            if (ref) {
                script->set_reg_pc_ref_num(st, ref, num, name, val);
            } else {
                pc_setglobalreg(sd, num, val); //<<<< Here
            }
            return 1;
[/code]

#3 intif.c/1349

script->set_reg(NULL,sd,reference_uid(script->add_str(key), index), key, (const void *)h64BPTRSIZE(ival), NULL);

#4 intif.c/2892

case 0x3804: intif->pRegisters(fd); break;

#5 chrif.c/1645

        if (cmd < 0x2af8 || cmd >= 0x2af8 + ARRAYLENGTH(chrif->packet_len_table) || chrif->packet_len_table[cmd-0x2af8] == 0) {
            int result = intif->parse(fd); // Passed on to the intif // <<<here

#6 socket.c/1418

sockt->session[i]->func_parse(i);



#7 core.c/557

sockt->perform(next);
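
Frame #5 above shows the range check that runs before the packet length table is indexed; anything outside the table (such as 0x3804 from frame #4) is handed to intif->parse(). A minimal sketch of that kind of guard follows. The table contents and the names is_local_packet and ARRAY_LEN are hypothetical; only the base id 0x2af8 is taken from the quoted line.

```c
#include <stddef.h>

#define PACKET_BASE 0x2af8  /* base id taken from the quoted frame #5 */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical length table: 0 marks an id with no handler in this range. */
static const int packet_len_table[] = { 60, 3, -1, 0, 6 };

/* Returns 1 when cmd has a non-zero entry in the table, 0 when it has to be
 * passed on to another parser. The range is checked before indexing, so an
 * out-of-range cmd never reads outside the table. */
static int is_local_packet(int cmd)
{
    if (cmd < PACKET_BASE)
        return 0;
    if ((size_t)(cmd - PACKET_BASE) >= ARRAY_LEN(packet_len_table))
        return 0;
    return packet_len_table[cmd - PACKET_BASE] != 0;
}
```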

  • 0

The issue is not in this call stack.

Somewhere earlier you had a null pointer issue that did not crash the server, and the server kept using a wrong sd pointer after that.

Try removing the plugins and Gepard, then try to crash the server.

Or use the sanitize flags to see the real issue.

  • 0

I managed to get --enable-sanitize=full working by installing the packages miniz and zopfli (not sure which one did it). Hopefully I will be able to debug this now.

Edit 1: Despite installing the same packages, I could only get sanitize to work on my production server for some reason, which makes testing hard. I assume this has something to do with the edits I made to pc.c, am I correct? Or could it be something calling pc->function?
 

=================================================================
==2032== ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f23a8ea442c at pc 0x730c95 bp 0x7fffb665a940 sp 0x7fffb665a930
READ of size 4 at 0x7f23a8ea442c thread T0
    #0 0x730c94 (/home/user/Hercules/map-server+0x730c94)
    #1 0x823e0b (/home/user/Hercules/map-server+0x823e0b)
    #2 0x8cbb5a (/home/user/Hercules/map-server+0x8cbb5a)
    #3 0x8d25e2 (/home/user/Hercules/map-server+0x8d25e2)
    #4 0x715134 (/home/user/Hercules/map-server+0x715134)
    #5 0xa63330 (/home/user/Hercules/map-server+0xa63330)
    #6 0xa63458 (/home/user/Hercules/map-server+0xa63458)
    #7 0x7281a7 (/home/user/Hercules/map-server+0x7281a7)
    #8 0x729b78 (/home/user/Hercules/map-server+0x729b78)
    #9 0x6d1fea (/home/user/Hercules/map-server+0x6d1fea)
    #10 0x409ef1 (/home/user/Hercules/map-server+0x409ef1)
    #11 0x7f23b015b444 (/usr/lib64/libc-2.17.so+0x22444)
    #12 0x40a622 (/home/user/Hercules/map-server+0x40a622)
0x7f23a8ea442c is located 139791134507871 bytes to the right of global variable '<null>' (0x4d) of size 128
ASAN:SIGSEGV
==2032== AddressSanitizer: while reporting a bug found another one.Ignoring.

 

Edited by Myriad

  • 0

gcc should also be at least version 5.0. 4.9 may partially work.

  • 0

Yep, I updated to GCC 7.3 and still can't enable it. It works on one of my servers, but I can't use that one because it's live.

On the other servers I tried, I just keep getting:

./configure --enable-sanitize=full
.
..
...
checking for library containing inflateEnd... no
configure: error: zlib library not found or incompatible... stopping

  • 0
29 minutes ago, Myriad said:

Yep, I updated to GCC 7.3 and still can't enable it. It works on one of my servers, but I can't use that one because it's live.

On the other servers I tried, I just keep getting:


./configure --enable-sanitize=full
.
..
...
checking for library containing inflateEnd... no
configure: error: zlib library not found or incompatible... stopping

 

Then install the zlib1g-dev library, since the error clearly states it's missing.

  • 0
1 hour ago, Asheraf said:

Then install the zlib1g-dev library, since the error clearly states it's missing.

yum list installed
....
zlib.x86_64                        1.2.7-17.el7                        installed
zlib-debuginfo.x86_64              1.2.7-17.el7                        @base-debuginfo
zlib-devel.x86_64                  1.2.7-17.el7                        @base
zopfli.x86_64                      1.0.1-1.el7                         @epel

I'm using CentOS 7, where that package is not available. I have installed zlib-devel.

Edit: I am now trying debian and that flag has worked.

Edited by Myriad

  • 0

@Myriad can you show config.log after a failed configure run?

That file contains the actual error explaining why it can't find zlib. It could be wrong flags, missing files, or anything else.

 

  • 0

It looks like it was compiled without debug info?

Debug info is needed: use the configure flag --enable-debug.

Or it is probably because you ran it under gdb at the same time.

Anyway, what is the code at pc.c:9909?

It looks like the error is in this line.

 

  • 0

When compiling my server, I did:

make clean
./configure --enable-debug=gdb --disable-lto --enable-sanitize=full
make sql plugins

pc.c:9909 is in pc_eventtimer, at the line npc->event(sd,p,0);

  • 0

Do you have a very old Hercules, or a heavily modified one?

I have already asked several times which Hercules commit you are using. If you are not using git, at least give the date of your Hercules sources.

Or better, show the whole pc_eventtimer function.

 

  • 0

Yup it's quite modified. My Hercules is up to date, as in I have merged current master with my source.
Src: 5118dceb5c1c5cad7d0c06137a9b1eee2acbe4e8
Scripts: 20f045c8de9f5fb6dde5bb8c8da6306facf2517c

static int pc_eventtimer(int tid, int64 tick, int id, intptr_t data)
{
	struct map_session_data *sd=map->id2sd(id);
	char *p = (char *)data;
	int i;
	if(sd==NULL)
		return 0;

	ARR_FIND( 0, MAX_EVENTTIMER, i, sd->eventtimer[i] == tid );
	if( i < MAX_EVENTTIMER )
	{
		sd->eventtimer[i] = INVALID_TIMER;
		sd->eventcount--;
		npc->event(sd,p,0); // pc.c: 9909 here
	}
	else
		ShowError("pc_eventtimer: no such event timer\n");

	if (p) aFree(p);
	return 0;
}

 

  • 0

You probably showed the wrong function, or you are running a different server binary, or maybe there is some corruption.

According to the trace, npc->event was called from pc_eventtimer.

The trace also says sd in pc_eventtimer is NULL, but the function checks sd for NULL and returns early, so it cannot reach the npc->event call in that case.
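
The pattern discussed here (a timer callback receives an id, looks the session up again, and bails out on NULL) can be sketched in isolation as follows. This is a simplified stand-in, not Hercules code: struct session, session_table, id2session and on_timer are hypothetical names that only mirror the shape of map->id2sd() and pc_eventtimer.

```c
#include <stddef.h>

#define MAX_SESSIONS 4  /* hypothetical table size */

struct session {
    int id;
    int eventcount;
};

/* A NULL entry means the slot is free (player logged out). */
static struct session *session_table[MAX_SESSIONS];

/* Stand-in for a lookup such as map->id2sd(): returns NULL when the id
 * no longer refers to a live session. */
static struct session *id2session(int id)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (session_table[i] != NULL && session_table[i]->id == id)
            return session_table[i];
    }
    return NULL;
}

/* Timer-style callback: it is given an id, not a pointer, so a session
 * that disappeared in the meantime is detected instead of dereferenced.
 * Returns 1 if the event ran, 0 if the session was gone. */
static int on_timer(int id)
{
    struct session *sd = id2session(id);
    if (sd == NULL)
        return 0;
    sd->eventcount--;
    return 1;
}
```

With a guard like this in place, reaching the event call with a NULL session should be impossible, which is why a trace claiming otherwise points to a mismatched binary or corrupted memory.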

 

Another thing: try disabling the memory manager, because it hides memory errors.

make clean
./configure --enable-debug=gdb --disable-lto --enable-manager=no --enable-sanitize=full
make sql plugins
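
As a rough illustration of why a custom memory manager can hide errors from a sanitizer: when freed blocks are recycled inside the manager's own storage, a stale pointer still points at valid memory, so neither the C runtime nor a sanitizer sees anything wrong. The toy pool below is a sketch under that assumption and is not the Hercules memory manager; struct slot, pool_alloc and pool_free are hypothetical names.

```c
#include <stddef.h>

#define POOL_SLOTS 8  /* hypothetical pool size */

struct slot {
    int value;
    int in_use;
};

/* Backing storage lives for the whole program and is never handed back
 * to the system allocator. */
static struct slot pool[POOL_SLOTS];

/* Hand out the first free slot, or NULL when the pool is exhausted. */
static struct slot *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* "Free" only clears a flag; the memory itself stays valid and readable,
 * so later use of the old pointer goes unnoticed. */
static void pool_free(struct slot *s)
{
    if (s != NULL)
        s->in_use = 0;
}
```

In this sketch, a pointer kept after pool_free() still reads the old value, and the next pool_alloc() returns the same address, so two owners silently share one slot. With plain malloc/free and sanitize enabled, the same mistake would be reported as a use-after-free.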

 

  • 0

Thank you. Seems there were some errors in two buildins used in my scripts.

1. npcshopdelitem(), which was breaking every time it was used.
2. setmapflag(), breaking sometimes when setting 'mf_zone'.

I will update again if further issues persist.

  • 0

I received this crashlog. I don't know what caused it, some plugins were referenced: https://pastebin.com/W2wXWDrg

Tried checking over all those plugins for null pointer. Disabled afk, unit, status, trade and itembonus, then it happened later, referencing again the ones I didn't disable.

Edited by Myriad

  • 0

It looks like you have disabled the sanitize flags and enabled the memory manager.

Enable sanitize and disable the memory manager, and you will probably get a more useful crash report. And please fix gcc: the last crash report from sanitize was almost useless because of missing libs or packages. It had no correct stack and no additional info.

 

 
