From: Oliver Mattos
Date: Mon, 13 Nov 2017 23:20:44 +0000
Subject: Re: Query planner gaining the ability to replanning after start of query execution.
To: Tom Lane
Cc: Arne Roland, "pgsql-performance@postgresql.org"

> You can't just restart from scratch, because we may already have shipped rows to the client

For v1, replanning wouldn't be an option if rows have already been shipped, or for DML statements.

> parallel plans and most importantly cursors?

Parallel plans look doable with the same approach, but for cursors I'd probably stop replanning as soon as the first row is delivered to the client, as above. One could imagine more complex approaches, like a limited-size buffer of 'delivered' rows, which would allow a new plan to be selected and the delivered rows to be excluded from the new plan's result set via a special extra prepending + duplicate-filtering executor node. The memory and computation costs of that node would be factored into the replanning decision like those of any other node.

> errors if the physical location of a row correlates strongly with a column

This is my largest concern. These cases already lead to large errors today: a query like SELECT * FROM foo WHERE created_date = today LIMIT 1 might scan all the data, only to find all of today's records in the last physical block. It's hard to say whether replacing one bad estimate with another will lead to better or worse results overall... My hope is that in most cases a bunch of plans will be tried, all will end up with their cost estimates revised up a lot, and then one will be settled on as rows start getting passed to upper layers.

> underling node might return a totally inaccurate number of rows for index scans

One might imagine using the last returned row as an extra histogram point when estimating how many rows are left in an index scan. That should at least make the estimate more accurate than it is without feedback.
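
As a rough illustration of the index-scan feedback idea, here is a minimal Python sketch. It is not PostgreSQL code: the function name, its parameters, and the linear extrapolation are all assumptions made for illustration. It treats the key of the last returned row as one extra data point and assumes rows are spread evenly over the remaining key range, which is exactly the assumption that can fail when physical location or key value correlates with the filter.

```python
def estimate_remaining_rows(range_lo, range_hi, last_key,
                            rows_returned, planner_estimate):
    """Hypothetical helper: re-estimate the rows left in an ascending index scan.

    range_lo, range_hi  -- bounds of the key range being scanned (numeric)
    last_key            -- key of the most recently returned row
    rows_returned       -- rows produced by the scan so far
    planner_estimate    -- the planner's original total row estimate
    """
    span = range_hi - range_lo
    consumed = last_key - range_lo

    # With no progress through the key range there is no feedback to use,
    # so fall back to whatever is left of the original estimate.
    if span <= 0 or consumed <= 0 or rows_returned == 0:
        return max(planner_estimate - rows_returned, 0)

    # Fraction of the key range already covered, capped at 1.
    fraction_done = min(consumed / span, 1.0)

    # Assumption: uniform row density, so the observed density is
    # extrapolated linearly over the whole key range.
    projected_total = rows_returned / fraction_done
    return max(projected_total - rows_returned, 0)


# The planner guessed 1000 rows, but 50 rows covered a quarter of the range.
print(estimate_remaining_rows(0, 100, 25, 50, 1000))
```

A real implementation would presumably blend this figure with the existing histogram rather than replace it outright, and would have to account for the cost of doing the re-estimation at all.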