From: Tom Lane
To: Alexander Lakhin
cc: pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault
References: <18374-ebb8113ce4d02f0d@postgresql.org> <3120721.1709395887@sss.pgh.pa.us>
Date: Sat, 02 Mar 2024 13:49:58 -0500
Message-ID: <3140126.1709405398@sss.pgh.pa.us>

Alexander Lakhin writes:
> 02.03.2024 19:11, Tom Lane wrote:
>> Hmph.  That's not an out-of-memory crash, that's a stack-too-deep
>> crash.

> (gdb) p $rsp
> $1 = (void *) 0x7ffcc83d4ff0
> (gdb) frame 13269
> #13269 0x000056289bc2685a in main (argc=8, argv=0x56289d3b4930) at main.c:198
> 198                     PostmasterMain(argc, argv);
> (gdb) p $rsp
> $2 = (void *) 0x7ffcc84834d0
> (gdb) p $rsp - 0x7ffcc83d4ff0
> $3 = (void *) 0xae4e0

> (Far less than ulimit -s == 8 MB.)

Yeah, I'm seeing something similar, also with ulimit -s = 8192 kbytes:

(gdb) i reg
...
rbp            0xb0a324            0xb0a324
rsp            0x7ffd07ce4fd0      0x7ffd07ce4fd0
...
(gdb) x/64 0x7ffd07ce4fd0
0x7ffd07ce4fd0: Cannot access memory at address 0x7ffd07ce4fd0

So it's definitely out-of-stack, yet

(gdb) p stack_base_ptr
$3 = 0x7ffd07dbf570 "\b"
(gdb) p 0x7ffd07dbf570 - 0x7ffd07ce4fd0
$4 = 894368

I'd have expected a diff in the vicinity of 8MB, but it isn't.

I think what must be happening is that the kernel is refusing to
expand our stack any more once we've hit the "ulimit -v" limit.
This is quite nasty, because it breaks all our assumptions about
having X amount of stack still available once check_stack_depth
triggers.

I tried inserting check_stack_depth() into MemoryContextStatsInternal,
and *that did not stop the crash*, confirming that we don't think
we're anywhere near the stack limit.

Ugh.

			regards, tom lane