Re: Problems with accents in filenames

From: Julian Reschke <julian.reschke_at_gmx.de>
Date: 2003-11-24 16:55:05 CET

kfogel@collab.net wrote:

> Julian Reschke <julian.reschke@gmx.de> writes:
>
>>>I think Vincent knows this, and that what he means by "somewhat
>>>compatible" is that for some strings in some languages, the UTF-8
>>>encoding and the UTF-16 encoding will be exactly the same sequence of
>>>bytes.
>>
>>That honestly sounds *very* unlikely: In particular that would mean
>>that those strings do not contain any ASCII characters.
>
>
> Oh? I must have misunderstood how UTF-16 works, then. (And Vincent
> should clarify for himself what he meant, I guess.)

UTF-8 encodes each Unicode character as a sequence of one to four
bytes. In particular, it encodes each ASCII character as one single
byte (high bit clear), and every non-ASCII character as multiple
bytes, all with the high bit set.

UTF-16 encodes "most" Unicode characters (those in the Basic
Multilingual Plane) as one pair of octets, and the remaining ones as
two such pairs. So for ASCII characters, every second byte will be
zero.
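
To make the difference concrete, here is a small Python sketch. The
helper name `encodings` and the sample strings are made up for this
illustration, and big-endian UTF-16 without a byte order mark is an
arbitrary choice; little-endian just swaps the two bytes of each pair.

```python
def encodings(text):
    """Return (UTF-8 bytes, UTF-16 big-endian bytes) for text."""
    return text.encode("utf-8"), text.encode("utf-16-be")


# Hypothetical samples: plain ASCII, a Latin-1 letter, the euro sign.
samples = ["abc", "\u00e9", "\u20ac"]

for s in samples:
    utf8, utf16 = encodings(s)
    # Print both encodings as hex so the byte patterns can be compared.
    print(ascii(s), "utf-8:", utf8.hex(), "utf-16:", utf16.hex())
    # The two byte sequences differ for every sample here.
    assert utf8 != utf16
```

For "abc", UTF-8 gives 61 62 63 while UTF-16 gives 00 61 00 62 00 63,
so a string containing any ASCII character already has different byte
sequences in the two encodings.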

Julian

-- 
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
Received on Mon Nov 24 16:55:59 2003

This is an archived mail posted to the Subversion Dev mailing list.
