public inbox for pgsql-general@postgresql.org  
From: Daniel Verite <daniel@manitou-mail.org>
To: Laurenz Albe <laurenz.albe@cybertec.at>
Cc: Ron Johnson <ronljohnsonjr@gmail.com>
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Choosing default collation/ctype
Date: Mon, 04 May 2026 21:34:00 +0200
Message-ID: <627add7e-94df-49ca-aa12-ae3900b7945f@manitou-mail.org> (raw)
In-Reply-To: <63e4b5165442ada9f187a0e14bbfe04795088bcd.camel@cybertec.at>

	Laurenz Albe wrote:

> > Then choose UTF8.
> 
> Right!  And I recommend "C" for the collation.

Yet the "C" collation is unsuitable for handling characters
beyond ASCII.
For instance, it does not treat accented letters as letters,
so upper('été') is 'éTé' instead of 'ÉTÉ', and 'é' ~ '\w' is false.
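
To see this for yourself, something like the following should do in a
UTF-8 database. It is only a sketch: the explicit COLLATE clauses are
there so that the result does not depend on the database default, and
the outputs in the comments are the ones described above, not a
captured session.

```sql
-- Case mapping under "C": only ASCII letters are converted.
SELECT upper('été' COLLATE "C");        -- 'éTé' rather than 'ÉTÉ'

-- Character classification under "C": 'é' is not a word character.
SELECT 'é' ~ ('\w' COLLATE "C");        -- false
```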

C.UTF-8 solves that, and since Postgres 17 it's available on all operating
systems through the builtin provider.
So if you target Postgres 17+, C.UTF-8 from the builtin provider is
a better choice for UTF-8 databases than "C".
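
As a minimal sketch of what that could look like on Postgres 17 or
later (the database name "mydb" is made up for the example, and
template0 is used so that the locale settings can differ from those
of the template):

```sql
-- Hypothetical database name; requires Postgres 17+.
CREATE DATABASE mydb
    TEMPLATE template0
    ENCODING 'UTF8'
    LOCALE_PROVIDER builtin
    BUILTIN_LOCALE 'C.UTF-8';

-- Run inside a UTF-8 database: pg_c_utf8 is the predefined collation
-- that uses the builtin C.UTF-8 locale.
SELECT upper('été' COLLATE pg_c_utf8);  -- expected: 'ÉTÉ'
```

Sorting is still by code point, as with "C"; the difference is in
case mapping and character classification.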


Best regards,
-- 
Daniel Vérité 
https://postgresql.verite.pro/





