From: Jan Karremans <karremans.ja@gmail.com>
Message-Id: <ADA1E0D9-0DA4-4E90-8596-D04DBFD20E4B@gmail.com>
Content-Type: multipart/alternative;
	boundary="Apple-Mail=_4846DF2C-C35F-4A5D-AD06-6CC7D8874551"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3864.600.51.1.1\))
Subject: Re: scaling up from t1n to 60 million records
Date: Tue, 19 May 2026 16:32:39 +0200
In-Reply-To: 
 <CY8PR05MB1010861EAD48ED098786C9690C4002@CY8PR05MB10108.namprd05.prod.outlook.com>
Cc: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
To: Martin Mueller <martinmueller@northwestern.edu>
References: 
 <CY8PR05MB1010861EAD48ED098786C9690C4002@CY8PR05MB10108.namprd05.prod.outlook.com>
Archived-At: 
 <https://www.postgresql.org/message-id/ADA1E0D9-0DA4-4E90-8596-D04DBFD20E4B%40gmail.com>
Precedence: bulk


--Apple-Mail=_4846DF2C-C35F-4A5D-AD06-6CC7D8874551
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

Dear Martin,

I think you should be fine just going ahead with this.
You might check the size of your tables, but I expect them all to be =
well within safe ranges.

Cheers,
Jan

> On 19 May 2026, at 16:27, Martin Mueller =
<martinmueller@northwestern.edu> wrote:
>=20
> I use Postgres with a GUI frontend (Aquafold) as a very large =
spreadsheet on steroids that analyzes rare or defective spellings in a =
corpus of 65,000 texts and 1.5 billion words.  I typically extract data =
from the corpus with Python scripts, turn them into tables and load them =
into the database.
>=20
> On my Mac with 32 GB of memory, performance is OK with queries that =
typically within seconds extract data rows from tables with up to ten =
million rows.  If the result set is large, I suspect that most of the =
machine's time is spent displaying result sets. I have used indexing =
sparingly. While it helps, the time savings often don't matter much.=20
>=20
> I am thinking about scaling up to a table with about 60 million rows.  =
Are there things to do or watch out for? Or should I proceed on the =
assumption that 60 million records are within scope and that the =
added time cost is roughly linear?
> =20
> Martin Mueller
> Professor emeritus of English and Classics
> Northwestern University
> =20
> =20
> =20


--Apple-Mail=_4846DF2C-C35F-4A5D-AD06-6CC7D8874551
Content-Transfer-Encoding: 7bit
Content-Type: text/html;
	charset=us-ascii

<html aria-label="message body"><head><meta http-equiv="content-type" content="text/html; charset=us-ascii"></head><body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">Dear Martin,<div><br></div><div>I think you should be fine just going ahead with this.</div><div>You might check the size of your tables, but I expect them all to be well within safe ranges.</div><div><br></div><div>Cheers,</div><div>Jan<br id="lineBreakAtBeginningOfMessage"><div><br><blockquote type="cite"><div>On 19 May 2026, at 16:27, Martin Mueller &lt;martinmueller@northwestern.edu&gt; wrote:</div><br class="Apple-interchange-newline"><div>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

<div><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;">I use Postgres with a GUI frontend (Aquafold) as a very large spreadsheet on steroids that analyzes rare or defective spellings in a corpus of 65,000 texts and 1.5 billion words. &nbsp;I typically extract data
 from the corpus with Python scripts, turn them into tables and load them into the database.</span></div><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;"><br>
</span></div><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;">On my Mac with 32 GB of memory, performance is OK with queries that typically within seconds extract data rows from tables with up to ten million rows. &nbsp;If the result set is large, I suspect that most of the
 machine's time is spent displaying result sets. I have used indexing sparingly. While it helps, the time savings often don't matter much.&nbsp;</span></div><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;"><br>
</span></div><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;">I am thinking about scaling up to a table with about 60 million rows. &nbsp;Are there things to do or watch out for? Or should I proceed on the assumption that 60 million records are within scope and that the
 added time cost is roughly linear?</span></div><p class="MsoNormal" style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
&nbsp;</p><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-family: Calibri, sans-serif; font-size: 14pt;">Martin Mueller</span></div><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-family: Calibri, sans-serif; font-size: 14pt;">Professor emeritus of English and Classics</span></div><div style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-family: Calibri, sans-serif; font-size: 14pt;">Northwestern University</span></div><p class="MsoNormal" style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;">&nbsp;</span></p><p class="MsoNormal" style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;">&nbsp;</span></p><p class="MsoNormal" style="margin: 0in; font-family: Aptos, sans-serif; font-size: 12pt;">
<span style="font-size: 14pt;">&nbsp;</span></p>
</div>

</div></blockquote></div><br></div></body></html>
--Apple-Mail=_4846DF2C-C35F-4A5D-AD06-6CC7D8874551--