PEP: 332
Title: Byte vectors and String/Unicode Unification
Version: 1888
Last-Modified: 2004-08-27 06:44:37 -0700 (Fri, 27 Aug 2004)
Author: Skip Montanaro <skip at pobox.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Aug-2004
Python-Version: 2.5
Post-History:

Abstract

This PEP outlines the introduction of a raw bytes sequence object and the unification of the current str and unicode objects.

Rationale

Python's current string objects are overloaded. They hold both character data (ASCII and non-ASCII) and sequences of raw bytes that have no reasonable interpretation as displayable character sequences. This overlap hasn't been a big problem in the past, but as Python moves closer to requiring source code to be properly encoded, using strings to represent raw byte sequences will become more problematic. In addition, as Python's Unicode support has improved, it has become easier to consider strings as ASCII-encoded Unicode objects.
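
The distinction between character data and raw bytes can be illustrated with a small sketch. It is not part of the proposal: it assumes a Python 3 interpreter, where a separate ``bytes`` type already exists, and the helper name ``is_ascii_text`` is hypothetical.

```python
# Illustrative only: assumes Python 3, where text (str) and raw bytes
# (bytes) are separate types. The helper name is hypothetical.

def is_ascii_text(data):
    """Return True if the byte sequence decodes cleanly as ASCII text."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True

greeting = b"hello, world"        # bytes that happen to be ASCII text
jpeg_magic = b"\xff\xd8\xff\xe0"  # leading bytes of a JPEG/JFIF file

print(is_ascii_text(greeting))    # True
print(is_ascii_text(jpeg_magic))  # False
```

The second value is the kind of data the rationale describes: a byte sequence with no reasonable interpretation as displayable characters, which a single overloaded string type cannot distinguish from text.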

Proposed Implementation

The number in parentheses indicates the Python version in which the feature will be introduced.

Bytes Object API

TBD.

Issues