public inbox for pgsql-general@postgresql.org
Re: Choosing default collation/ctype
6+ messages / 4 participants
* Re: Choosing default collation/ctype
@ 2026-05-03 20:52 Igor Korot <ikorot01@gmail.com>
From: Igor Korot @ 2026-05-03 20:52 UTC
To: Ron Johnson <ronljohnsonjr@gmail.com>; +Cc: pgsql-general@lists.postgresql.org
Hi,
On Sun, May 3, 2026 at 3:09 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> On Sun, May 3, 2026 at 3:52 PM Igor Korot <ikorot01@gmail.com> wrote:
>>
>> Hi, ALL,
>> In the CREATE DATABASE statement I can use encoding/collation/ctype.
>>
>> I can retrieve the encoding list with:
>>
>> [code]
>> SELECT pg_encoding_to_char( conforencoding ) AS name FROM pg_conversion
>> [/code]
>>
>> And then I can get a list of collations/ctypes with:
>>
>> [code]
>> SELECT collname, collencoding, collprovider, collctype FROM pg_collation
>> [/code]
>>
>> And then add logic to my UI to switch collations/ctypes based on encoding.
>>
>> However, what I wonder is:
>>
>> Is there a way to select a default collation/ctype for a specific encoding?
>>
>> Or maybe I'm overthinking it and I should let the user choose and, if
>> nothing is chosen, just keep those two as "Default" and let the server
>> pick them. However, that would be weird, especially from my POV as a user.
>
>
> You know your data, not us. The first question I'd try to answer is "how much client text data is not compatible with bog-standard UTF8?"
I don't.
Just trying to create a generic tool to use for people everywhere...
Thank you.
>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
* Re: Choosing default collation/ctype
@ 2026-05-03 21:05 Ron Johnson <ronljohnsonjr@gmail.com>
parent: Igor Korot <ikorot01@gmail.com>
From: Ron Johnson @ 2026-05-03 21:05 UTC
To: pgsql-general@lists.postgresql.org
On Sun, May 3, 2026 at 4:52 PM Igor Korot <ikorot01@gmail.com> wrote:
> Hi,
>
> On Sun, May 3, 2026 at 3:09 PM Ron Johnson <ronljohnsonjr@gmail.com>
> wrote:
> >
> > On Sun, May 3, 2026 at 3:52 PM Igor Korot <ikorot01@gmail.com> wrote:
> >>
> >> Hi, ALL,
> >> In the CREATE DATABASE statement I can use encoding/collation/ctype.
> >>
> >> I can retrieve the encoding list with:
> >>
> >> [code]
> >> SELECT pg_encoding_to_char( conforencoding ) AS name FROM pg_conversion
> >> [/code]
> >>
> >> And then I can get a list of collations/ctypes with:
> >>
> >> [code]
> >> SELECT collname, collencoding, collprovider, collctype FROM pg_collation
> >> [/code]
> >>
> >> And then add logic to my UI to switch collations/ctypes based on encoding.
> >>
> >> However, what I wonder is:
> >>
> >> Is there a way to select a default collation/ctype for a specific encoding?
> >>
> >> Or maybe I'm overthinking it and I should let the user choose and, if
> >> nothing is chosen, just keep those two as "Default" and let the server
> >> pick them. However, that would be weird, especially from my POV as a user.
> >
> >
> > You know your data, not us. The first question I'd try to answer is "how much client text data is not compatible with bog-standard UTF8?"
>
> I don't.
> Just trying to create a generic tool to use for people everywhere...
>
Then choose UTF8.
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
* Re: Choosing default collation/ctype
@ 2026-05-03 21:17 Laurenz Albe <laurenz.albe@cybertec.at>
parent: Ron Johnson <ronljohnsonjr@gmail.com>
From: Laurenz Albe @ 2026-05-03 21:17 UTC
To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org
On Sun, 2026-05-03 at 17:05 -0400, Ron Johnson wrote:
> > Just trying to create a generic tool to use for people everywhere...
>
> Then choose UTF8.
Right! And I recommend "C" for the collation.
(The user can override the default in column definitions where necessary.)
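A minimal sketch of such an override, assuming a hypothetical table named customer and a server built with ICU support (the "de-x-icu" collation is only present in that case):

```sql
-- Hypothetical table: only the name column departs from the database default.
CREATE TABLE customer (
    id   bigint PRIMARY KEY,
    name text COLLATE "de-x-icu"   -- linguistic (German) ordering for this column
);

-- A collation can also be chosen per expression at query time.
SELECT name FROM customer ORDER BY name COLLATE "C";
```

Columns without a COLLATE clause keep using the database default.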
Yours,
Laurenz Albe
* Re: Choosing default collation/ctype
@ 2026-05-03 21:51 Igor Korot <ikorot01@gmail.com>
parent: Ron Johnson <ronljohnsonjr@gmail.com>
From: Igor Korot @ 2026-05-03 21:51 UTC
To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org
Hi, Ron.
On Sun, May 3, 2026 at 4:05 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> On Sun, May 3, 2026 at 4:52 PM Igor Korot <ikorot01@gmail.com> wrote:
>>
>> Hi,
>>
>> On Sun, May 3, 2026 at 3:09 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>> >
>> > On Sun, May 3, 2026 at 3:52 PM Igor Korot <ikorot01@gmail.com> wrote:
>> >>
>> >> Hi, ALL,
>> >> In the CREATE DATABASE statement I can use encoding/collation/ctype.
>> >>
>> >> I can retrieve the encoding list with:
>> >>
>> >> [code]
>> >> SELECT pg_encoding_to_char( conforencoding ) AS name FROM pg_conversion
>> >> [/code]
>> >>
>> >> And then I can get a list of collations/ctypes with:
>> >>
>> >> [code]
>> >> SELECT collname, collencoding, collprovider, collctype FROM pg_collation
>> >> [/code]
>> >>
>> >> And then add logic to my UI to switch collations/ctypes based on encoding.
>> >>
>> >> However, what I wonder is:
>> >>
>> >> Is there a way to select a default collation/ctype for a specific encoding?
>> >>
>> >> Or maybe I'm overthinking it and I should let the user choose and, if
>> >> nothing is chosen, just keep those two as "Default" and let the server
>> >> pick them. However, that would be weird, especially from my POV as a user.
>> >
>> >
>> > You know your data, not us. The first question I'd try to answer is "how much client text data is not compatible with bog-standard UTF8?"
>>
>> I don't.
>> Just trying to create a generic tool to use for people everywhere...
>
>
> Then choose UTF8.
Let me give you a quick rundown of what I'm trying to do:
In my code I have 3 combo boxes: encoding, collation and ctype.
Initially they all have a value of "Default".
Let's say a user selects "KOI8-R" as the encoding.
What I will do is populate the collation and ctype combo boxes with
the values available for that encoding.
But I want to go a little further and preselect in those boxes
the default collation/ctype for the "KOI8-R" encoding.
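A rough sketch of that lookup, assuming the encoding combo box holds the server's own encoding name (KOI8R rather than "KOI8-R"). In pg_collation, a collencoding of -1 marks a collation usable with any encoding. The catalogs do not record a default collation per encoding; when LC_COLLATE/LC_CTYPE are omitted, CREATE DATABASE takes them from the template database, so the second query (assuming the usual template1) shows what "Default" would resolve to:

```sql
-- Collations usable with the selected encoding.
-- collencoding = -1 means "works with any encoding".
SELECT collname, collprovider, collcollate, collctype
FROM pg_collation
WHERE collencoding IN (-1, pg_char_to_encoding('KOI8R'))
ORDER BY collname;

-- Settings a new database inherits when nothing is specified
-- (template1 is the template unless TEMPLATE says otherwise).
SELECT pg_encoding_to_char(encoding) AS encoding_name, datcollate, datctype
FROM pg_database
WHERE datname = 'template1';
```

Note that the inherited locale is not guaranteed to be compatible with the encoding the user picked, so the UI may still need to fall back to a value from the first query.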
Now are you saying I should choose the one that has "UTF8"
in it?
Thank you.
>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
* Re: Choosing default collation/ctype
@ 2026-05-04 19:34 Daniel Verite <daniel@manitou-mail.org>
parent: Laurenz Albe <laurenz.albe@cybertec.at>
From: Daniel Verite @ 2026-05-04 19:34 UTC
To: Laurenz Albe <laurenz.albe@cybertec.at>; +Cc: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org
Laurenz Albe wrote:
> > Then choose UTF8.
>
> Right! And I recommend "C" for the collation.
Yet the "C" collation is unsuitable for handling character types
beyond ASCII.
For instance, it considers that accented letters are not letters,
so upper('été') is 'éTé' instead of 'ÉTÉ', and 'é' ~ '\w' is false.
C.UTF-8 solves that, and since Postgres 17, it's available for all operating
systems with the builtin provider.
So if you target Postgres 17+, C.UTF-8 from the builtin provider is
a better choice for UTF-8 databases than "C".
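A minimal sketch of the difference, assuming a UTF8 database on Postgres 17 or later, where the builtin provider's C.UTF-8 locale is exposed as the predefined collation pg_c_utf8. The database name app_db is only a placeholder, and setting the libc LC_COLLATE/LC_CTYPE to C is an assumption that may not suit every server:

```sql
-- "C": accented letters are not treated as letters.
SELECT upper('été' COLLATE "C");        -- 'éTé'
SELECT 'é' COLLATE "C" ~ '\w';          -- false

-- Builtin C.UTF-8: Unicode-aware case mapping and character classes.
SELECT upper('été' COLLATE pg_c_utf8);  -- 'ÉTÉ'
SELECT 'é' COLLATE pg_c_utf8 ~ '\w';    -- true

-- Making it the database default. template0 is required whenever the
-- encoding or locale settings differ from those of template1.
CREATE DATABASE app_db
    TEMPLATE template0
    ENCODING 'UTF8'
    LOCALE_PROVIDER builtin
    BUILTIN_LOCALE 'C.UTF-8'
    LC_COLLATE 'C'
    LC_CTYPE 'C';
```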
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
* Re: Choosing default collation/ctype
@ 2026-05-04 20:45 Laurenz Albe <laurenz.albe@cybertec.at>
parent: Daniel Verite <daniel@manitou-mail.org>
From: Laurenz Albe @ 2026-05-04 20:45 UTC
To: Daniel Verite <daniel@manitou-mail.org>; +Cc: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org
On Mon, 2026-05-04 at 21:34 +0200, Daniel Verite wrote:
> Laurenz Albe wrote:
>
> > > Then choose UTF8.
> >
> > Right! And I recommend "C" for the collation.
>
> Yet the "C" collation is unsuitable for handling character types
> beyond ASCII.
> For instance, it considers that accented letters are not letters,
> so upper('été') is 'éTé' instead of 'ÉTÉ', and 'é' ~ '\w' is false.
>
> C.UTF-8 solves that, and since Postgres 17, it's available for all operating
> systems with the builtin provider.
> So if you target Postgres 17+, C.UTF-8 from the builtin provider is
> a better choice for UTF-8 databases than "C".
Yes, "builtin" and the "C" collation is the best default value.
Yours,
Laurenz Albe