5 PEP 277: Unicode file name support for Windows NT

On Windows NT, 2000, and XP, the system stores file names as Unicode strings. Traditionally, Python has represented file names as byte strings, which is inadequate because it renders some file names inaccessible.

Python now allows using arbitrary Unicode strings (within the limitations of the file system) for all functions that expect file names, most notably the open() built-in function. If a Unicode string is passed to os.listdir(), Python now returns a list of Unicode strings. A new function, os.getcwdu(), returns the current directory as a Unicode string.
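As a minimal sketch of the behaviour described above, the following helpers pass a Unicode path to os.listdir() and obtain the current directory as a Unicode string. The helper names (list_unicode_names, current_directory_unicode) and the text_type alias are illustrative and not part of the standard library. os.getcwdu() exists only in Python 2, so the fallback to os.getcwd() is an assumption made so the sketch also runs on later versions.

```python
import os

# Python 2 names the Unicode type `unicode`; later versions name it `str`.
try:
    text_type = unicode
except NameError:
    text_type = str


def list_unicode_names(directory):
    """Return the entries of `directory` as a list of Unicode strings."""
    # Passing a Unicode path asks os.listdir() for Unicode results.
    return os.listdir(text_type(directory))


def current_directory_unicode():
    """Return the current directory as a Unicode string."""
    # Assumption: fall back to os.getcwd() where os.getcwdu() is absent.
    getcwd = getattr(os, "getcwdu", os.getcwd)
    return getcwd()
```

Calling list_unicode_names(u".") should yield Unicode entries for the current directory, subject to the limitations of the underlying file system.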

Byte strings still work as file names, and on Windows, Python transparently converts them to Unicode using the mbcs encoding.

On other systems, Python also accepts Unicode strings as file names but converts them to byte strings before passing them to the system, which can cause a UnicodeError to be raised. Applications can test whether arbitrary Unicode strings are supported as file names by checking os.path.supports_unicode_filenames, a Boolean value.
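The check described above might be combined with explicit error handling along the following lines. This is a sketch only: the helper names (unicode_names_supported, try_create) are hypothetical, and treating a UnicodeError as "name not usable here" is one possible policy rather than a prescribed one.

```python
import os


def unicode_names_supported():
    """Report whether arbitrary Unicode strings can be used as file names."""
    return bool(os.path.supports_unicode_filenames)


def try_create(directory, name):
    """Try to create an empty file called `name`; return True on success."""
    path = os.path.join(directory, name)
    try:
        open(path, "w").close()
    except UnicodeError:
        # The name could not be converted to the byte string the system expects.
        return False
    return True
```

An application could consult unicode_names_supported() before offering arbitrary names to the user, and still call try_create() defensively, since a given name may convert cleanly even when the flag is false.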

Under MacOS, os.listdir() may now return Unicode filenames.

See Also:

PEP 277, Unicode file name support for Windows NT
Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark Hammond.
