public inbox for pgsql-general@postgresql.org  
Re: Choosing default collation/ctype
6+ messages / 4 participants

* Re: Choosing default collation/ctype
@ 2026-05-03 20:52 Igor Korot <ikorot01@gmail.com>
  2026-05-03 21:05 ` Re: Choosing default collation/ctype Ron Johnson <ronljohnsonjr@gmail.com>
  0 siblings, 1 reply; 6+ messages in thread

From: Igor Korot @ 2026-05-03 20:52 UTC
  To: Ron Johnson <ronljohnsonjr@gmail.com>; +Cc: pgsql-general@lists.postgresql.org

Hi,

On Sun, May 3, 2026 at 3:09 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> On Sun, May 3, 2026 at 3:52 PM Igor Korot <ikorot01@gmail.com> wrote:
>>
>> Hi, ALL,
>> In the CREATE DATABASE statement I can use encoding/collation/ctype.
>>
>> I can retrieve the encoding list with:
>>
>> [code]
>> SELECT pg_encoding_to_char( conforencoding ) AS name FROM pg_conversion
>> [/code]
>>
>> And then I can get a list of collations/ctypes with:
>>
>> [code]
>> SELECT collname, collencoding, collprovider, collctype FROM pg_collation
>> [/code]
>>
>> And then add logic in my UI to switch collations/ctypes based on encoding.
>>
>> However, what I wonder is:
>>
>> Is there a way to select a default collation/ctype for a specific encoding?
>>
>> Or maybe I'm overthinking it and I should let the user choose and if
>> nothing - just keep those 2 as "Default" and let the server pick it
>> up. However, that would be weird, especially from my POV as a user.
>
>
> You know your data, not us.  The first question I'd try to answer is "how much client text data is not compatible with bog-standard UTF8?"

I don't.
Just trying to create a generic tool to use for people everywhere...

Thank you.

>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!

* Re: Choosing default collation/ctype
  2026-05-03 20:52 Re: Choosing default collation/ctype Igor Korot <ikorot01@gmail.com>
@ 2026-05-03 21:05 ` Ron Johnson <ronljohnsonjr@gmail.com>
  2026-05-03 21:17   ` Re: Choosing default collation/ctype Laurenz Albe <laurenz.albe@cybertec.at>
  2026-05-03 21:51   ` Re: Choosing default collation/ctype Igor Korot <ikorot01@gmail.com>
  0 siblings, 2 replies; 6+ messages in thread

From: Ron Johnson @ 2026-05-03 21:05 UTC
  To: pgsql-general@lists.postgresql.org

On Sun, May 3, 2026 at 4:52 PM Igor Korot <ikorot01@gmail.com> wrote:

> Hi,
>
> On Sun, May 3, 2026 at 3:09 PM Ron Johnson <ronljohnsonjr@gmail.com>
> wrote:
> >
> > On Sun, May 3, 2026 at 3:52 PM Igor Korot <ikorot01@gmail.com> wrote:
> >>
> >> Hi, ALL,
> >> In the CREATE DATABASE statement I can use encoding/collation/ctype.
> >>
> >> I can retrieve the encoding list with:
> >>
> >> [code]
> >> SELECT pg_encoding_to_char( conforencoding ) AS name FROM pg_conversion
> >> [/code]
> >>
> >> And then I can get a list of collations/ctypes with:
> >>
> >> [code]
> >> SELECT collname, collencoding, collprovider, collctype FROM pg_collation
> >> [/code]
> >>
> >> And then add logic in my UI to switch collations/ctypes based on
> encoding.
> >>
> >> However, what I wonder is:
> >>
> >> Is there a way to select a default collation/ctype for a specific
> encoding?
> >>
> >> Or maybe I'm overthinking it and I should let the user choose and if
> >> nothing - just keep those 2 as "Default" and let the server pick it
> >> up. However, that would be weird, especially from my POV as a user.
> >
> >
> > You know your data, not us.  The first question I'd try to answer is "how much
> client text data is not compatible with bog-standard UTF8?"
>
> I don't.
> Just trying to create a generic tool to use for people everywhere...
>

Then choose UTF8.

-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

* Re: Choosing default collation/ctype
  2026-05-03 20:52 Re: Choosing default collation/ctype Igor Korot <ikorot01@gmail.com>
  2026-05-03 21:05 ` Re: Choosing default collation/ctype Ron Johnson <ronljohnsonjr@gmail.com>
@ 2026-05-03 21:17   ` Laurenz Albe <laurenz.albe@cybertec.at>
  2026-05-04 19:34     ` Re: Choosing default collation/ctype Daniel Verite <daniel@manitou-mail.org>
  1 sibling, 1 reply; 6+ messages in thread

From: Laurenz Albe @ 2026-05-03 21:17 UTC
  To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org

On Sun, 2026-05-03 at 17:05 -0400, Ron Johnson wrote:
> > Just trying to create a generic tool to use for people everywhere...
> 
> Then choose UTF8.

Right!  And I recommend "C" for the collation.
(The user can override the default in column definitions where necessary.)
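
A minimal sketch of that setup, for readers who want to try it. The database name `appdb`, the table `customer`, and the `"de-x-icu"` collation are illustrative assumptions, not from the thread; ICU collations exist only on servers built with ICU support.

```sql
-- Hypothetical database name. template0 allows an encoding and locale that
-- differ from the ones template1 was created with.
CREATE DATABASE appdb
    TEMPLATE   template0
    ENCODING   'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE   'C';

-- Connected to appdb: override the database default for one column only.
-- "de-x-icu" is an assumed example and requires an ICU-enabled server.
CREATE TABLE customer (
    id   integer PRIMARY KEY,
    name text COLLATE "de-x-icu"
);

-- A collation can also be chosen per expression in a query.
SELECT name FROM customer ORDER BY name COLLATE "C";
```

`TEMPLATE template0` is there because the default template, `template1`, may have been created with a different encoding or locale. On PostgreSQL 15 or later, a cluster initialized with ICU as its default provider may also need `LOCALE_PROVIDER libc` in the statement for the "C" settings to take effect.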

Yours,
Laurenz Albe

* Re: Choosing default collation/ctype
  2026-05-03 20:52 Re: Choosing default collation/ctype Igor Korot <ikorot01@gmail.com>
  2026-05-03 21:05 ` Re: Choosing default collation/ctype Ron Johnson <ronljohnsonjr@gmail.com>
  2026-05-03 21:17   ` Re: Choosing default collation/ctype Laurenz Albe <laurenz.albe@cybertec.at>
@ 2026-05-04 19:34     ` Daniel Verite <daniel@manitou-mail.org>
  2026-05-04 20:45       ` Re: Choosing default collation/ctype Laurenz Albe <laurenz.albe@cybertec.at>
  0 siblings, 1 reply; 6+ messages in thread

From: Daniel Verite @ 2026-05-04 19:34 UTC
  To: Laurenz Albe <laurenz.albe@cybertec.at>; +Cc: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org

Laurenz Albe wrote:

> > Then choose UTF8.
> 
> Right!  And I recommend "C" for the collation.

Yet the "C" collation is unsuitable for handling character types
beyond ASCII.
For instance, it considers that accented letters are not letters,
so upper('été') is 'éTé' instead of 'ÉTÉ', and 'é' ~ '\w' is false.

C.UTF-8 solves that, and since Postgres 17, it's available for all operating
systems with the builtin provider.
So if you target Postgres 17+, C.UTF-8 from the builtin provider is
a better choice for UTF-8 databases than "C".
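
A minimal sketch of what that could look like on PostgreSQL 17 or later. The database name `appdb` is a hypothetical placeholder, and the statement assumes the locale settings inherited from `template0` are compatible with UTF8; if they are not, `LC_COLLATE` and `LC_CTYPE` would have to be given as well.

```sql
-- PostgreSQL 17+: UTF8 database using the builtin provider's C.UTF-8 locale.
CREATE DATABASE appdb
    TEMPLATE        template0
    ENCODING        'UTF8'
    LOCALE_PROVIDER builtin
    BUILTIN_LOCALE  'C.UTF-8';

-- The two checks from the message above, run in any UTF8 database on
-- PostgreSQL 17+. pg_c_utf8 is the predefined collation backed by the
-- builtin C.UTF-8 locale; "C" is the plain C collation.
SELECT upper('été' COLLATE "C")       AS upper_c,
       upper('été' COLLATE pg_c_utf8) AS upper_c_utf8;

SELECT 'é' ~ '\w' COLLATE "C"       AS word_char_c,
       'é' ~ '\w' COLLATE pg_c_utf8 AS word_char_c_utf8;
```

Comparing the two columns of each query should reproduce the difference described above without creating a new database.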


Best regards,
-- 
Daniel Vérité 
https://postgresql.verite.pro/

* Re: Choosing default collation/ctype
  2026-05-03 20:52 Re: Choosing default collation/ctype Igor Korot <ikorot01@gmail.com>
  2026-05-03 21:05 ` Re: Choosing default collation/ctype Ron Johnson <ronljohnsonjr@gmail.com>
  2026-05-03 21:17   ` Re: Choosing default collation/ctype Laurenz Albe <laurenz.albe@cybertec.at>
  2026-05-04 19:34     ` Re: Choosing default collation/ctype Daniel Verite <daniel@manitou-mail.org>
@ 2026-05-04 20:45       ` Laurenz Albe <laurenz.albe@cybertec.at>
  0 siblings, 0 replies; 6+ messages in thread

From: Laurenz Albe @ 2026-05-04 20:45 UTC
  To: Daniel Verite <daniel@manitou-mail.org>; +Cc: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org

On Mon, 2026-05-04 at 21:34 +0200, Daniel Verite wrote:
> Laurenz Albe wrote:
> 
> > > Then choose UTF8.
> > 
> > Right!  And I recommend "C" for the collation.
> 
> Yet the "C" collation is unsuitable for handling character types
> beyond ASCII.
> For instance, it considers that accented letters are not letters,
> so upper('été') is 'éTé' instead of 'ÉTÉ', and 'é' ~ '\w' is false.
> 
> C.UTF-8 solves that, and since Postgres 17, it's available for all operating
> systems with the builtin provider.
> So if you target Postgres 17+, C.UTF-8 from the builtin provider is
> a better choice for UTF-8 databases than "C".

Yes, "builtin" and the "C" collation is the best default value.

Yours,
Laurenz Albe

* Re: Choosing default collation/ctype
  2026-05-03 20:52 Re: Choosing default collation/ctype Igor Korot <ikorot01@gmail.com>
  2026-05-03 21:05 ` Re: Choosing default collation/ctype Ron Johnson <ronljohnsonjr@gmail.com>
@ 2026-05-03 21:51   ` Igor Korot <ikorot01@gmail.com>
  1 sibling, 0 replies; 6+ messages in thread

From: Igor Korot @ 2026-05-03 21:51 UTC
  To: Ron Johnson <ronljohnsonjr@gmail.com>; pgsql-general@lists.postgresql.org

Hi, Ron.

On Sun, May 3, 2026 at 4:05 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> On Sun, May 3, 2026 at 4:52 PM Igor Korot <ikorot01@gmail.com> wrote:
>>
>> Hi,
>>
>> On Sun, May 3, 2026 at 3:09 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>> >
>> > On Sun, May 3, 2026 at 3:52 PM Igor Korot <ikorot01@gmail.com> wrote:
>> >>
>> >> Hi, ALL,
>> >> In the CREATE DATABASE statement I can use encoding/collation/ctype.
>> >>
>> >> I can retrieve the encoding list with:
>> >>
>> >> [code]
>> >> SELECT pg_encoding_to_char( conforencoding ) AS name FROM pg_conversion
>> >> [/code]
>> >>
>> >> And then I can get a list of collations/ctypes with:
>> >>
>> >> [code]
>> >> SELECT collname, collencoding, collprovider, collctype FROM pg_collation
>> >> [/code]
>> >>
>> >> And then add logic in my UI to switch collations/ctypes based on encoding.
>> >>
>> >> However, what I wonder is:
>> >>
>> >> Is there a way to select a default collation/ctype for a specific encoding?
>> >>
>> >> Or maybe I'm overthinking it and I should let the user choose and if
>> >> nothing - just keep those 2 as "Default" and let the server pick it
>> >> up. However, that would be weird, especially from my POV as a user.
>> >
>> >
>> > You know your data, not us.  The first question I'd try to answer is "how much client text data is not compatible with bog-standard UTF8?"
>>
>> I don't.
>> Just trying to create a generic tool to use for people everywhere...
>
>
> Then choose UTF8.

Let me give you a quick run of what I'm trying to do:

In my code I have 3 combo boxes: encoding, collation and ctype.

Initially they all have a value of "Default".

Let's say a user selects "KOI8-R" as the encoding.

What I will do is populate the collation and ctype combo boxes with
the values available for that encoding.
But I want to go a little further and preselect in those boxes
the default collation/ctype for the "KOI8-R" encoding.

Now, are you saying I should choose the one that has "UTF8"
in it?
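
A sketch of the catalog queries such a UI could run, assuming the encoding is passed by its PostgreSQL name (`KOI8R` here). As far as I know, PostgreSQL keeps no per-encoding default collation; when `CREATE DATABASE` omits the locale options, they are copied from the template database, so the first query shows what "Default" would resolve to.

```sql
-- What "Default" resolves to: CREATE DATABASE copies these settings
-- from the template database when they are not specified.
SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,
       datctype
FROM   pg_database
WHERE  datname IN ('template0', 'template1');

-- libc collations usable with one encoding. collencoding = -1 marks a
-- collation that works with any encoding (for example "C" and "POSIX").
SELECT collname, collcollate, collctype
FROM   pg_collation
WHERE  collprovider = 'c'
  AND  collencoding IN (-1, pg_char_to_encoding('KOI8R'))
ORDER  BY collname;
```

For libc collations, `collcollate` and `collctype` hold the operating-system locale names, which is the form `LC_COLLATE` and `LC_CTYPE` take in `CREATE DATABASE`, so they are probably a better source for the combo boxes than `collname`.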

Thank you.

>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!