Subject: Re: Choosing default collation/ctype
From: Laurenz Albe
To: Daniel Verite
Cc: Ron Johnson, pgsql-general@lists.postgresql.org
Date: Tue, 05 May 2026 13:31:00 +0200

On Tue, 2026-05-05 at 13:16 +0200, Daniel Verite wrote:
> Laurenz Albe wrote:
>
> > > So if you target Postgres 17+, C.UTF-8 from the builtin provider is
> > > a better choice for UTF-8 databases than "C" .
> >
> > Yes, "builtin" and the "C" collation is the best default value.
>
> But my point was that, no, it's not.
> Let's show a concrete example with Postgres 18:
>
> [...]
>
> It is not the correct uppercasing.

That is true. But if you are using "C.UTF-8", the semantics of upper()
can change between versions if Unicode is upgraded. That carries a
residual risk of OS upgrades breaking indexes on upper(col).

I'd say that the small benefit of better case conversion isn't worth
the risk. I'd choose "C", and use a natural language collation
explicitly on columns where these things matter.

Yours,
Laurenz Albe