Let us walk on the 3-isogeny graph
pip._vendor.webencodings Namespace Reference

Namespaces

namespace  labels
 
namespace  mklabels
 
namespace  tests
 
namespace  x_user_defined
 

Data Structures

class  Encoding
 
class  IncrementalDecoder
 
class  IncrementalEncoder
 

Functions

 ascii_lower (string)
 
 lookup (label)
 
 _get_encoding (encoding_or_label)
 
 decode (input, fallback_encoding, errors='replace')
 
 _detect_bom (input)
 
 encode (input, encoding=UTF8, errors='strict')
 
 iter_decode (input, fallback_encoding, errors='replace')
 
 _iter_decode_generator (input, decoder)
 
 iter_encode (input, encoding=UTF8, errors='strict')
 
 _iter_encode_generator (input, encode)
 

Variables

str VERSION = '0.5.1'
 
dict PYTHON_NAMES
 
dict CACHE = {}
 
 UTF8 = lookup('utf-8')
 
 _UTF16LE = lookup('utf-16le')
 
 _UTF16BE = lookup('utf-16be')
 

Detailed Description

    webencodings
    ~~~~~~~~~~~~

    This is a Python implementation of the `WHATWG Encoding standard
    <http://encoding.spec.whatwg.org/>`_. See README for details.

    :copyright: Copyright 2012 by Simon Sapin
    :license: BSD, see LICENSE for details.

Function Documentation

◆ _detect_bom()

_detect_bom (   input)
protected
Return (bom_encoding, input), with any BOM removed from the input.

Definition at line 161 of file __init__.py.

def _detect_bom(input):
    """Return (bom_encoding, input), with any BOM removed from the input."""
    if input.startswith(b'\xFF\xFE'):
        return _UTF16LE, input[2:]
    if input.startswith(b'\xFE\xFF'):
        return _UTF16BE, input[2:]
    if input.startswith(b'\xEF\xBB\xBF'):
        return UTF8, input[3:]
    return None, input

Referenced by pip._vendor.webencodings.decode(), and IncrementalDecoder.decode().

◆ _get_encoding()

_get_encoding (   encoding_or_label)
protected
Accept either an encoding object or label.

:param encoding: An :class:`Encoding` object or a label string.
:returns: An :class:`Encoding` object.
:raises: :exc:`~exceptions.LookupError` for an unknown label.

Definition at line 91 of file __init__.py.

def _get_encoding(encoding_or_label):
    """
    Accept either an encoding object or label.

    :param encoding: An :class:`Encoding` object or a label string.
    :returns: An :class:`Encoding` object.
    :raises: :exc:`~exceptions.LookupError` for an unknown label.

    """
    if hasattr(encoding_or_label, 'codec_info'):
        return encoding_or_label

    encoding = lookup(encoding_or_label)
    if encoding is None:
        raise LookupError('Unknown encoding label: %r' % encoding_or_label)
    return encoding

Referenced by IncrementalEncoder.__init__(), pip._vendor.webencodings.decode(), and pip._vendor.webencodings.encode().
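The pass-through check above is plain duck typing: anything that already has a codec_info attribute is treated as an encoding object. A minimal standard-library sketch of the same idea follows. FakeEncoding and get_encoding_sketch are hypothetical names, and codecs.lookup() accepts Python codec names rather than WHATWG labels, so this is an illustration and not the package's behaviour.

```python
import codecs
from collections import namedtuple

# Hypothetical stand-in for the Encoding class: a name plus a codec_info.
FakeEncoding = namedtuple("FakeEncoding", ["name", "codec_info"])


def get_encoding_sketch(encoding_or_label):
    """Return an encoding object for either an object or a codec name."""
    # Objects that already carry codec_info pass through unchanged.
    if hasattr(encoding_or_label, "codec_info"):
        return encoding_or_label
    try:
        # Assumption: Python codec names stand in for WHATWG labels here.
        info = codecs.lookup(encoding_or_label)
    except LookupError:
        raise LookupError(
            "Unknown encoding label: %r" % encoding_or_label) from None
    return FakeEncoding(info.name, info)


utf8 = get_encoding_sketch("utf-8")
same = get_encoding_sketch(utf8)  # already an object, returned as-is
```

Because callers may pass either form, a function built on this helper can offer a default such as encoding=UTF8 while still accepting a plain string.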

◆ _iter_decode_generator()

_iter_decode_generator (   input,
  decoder 
)
protected
Return a generator that first yields the :obj:`Encoding`,
then yields output chunks as Unicode strings.

Definition at line 214 of file __init__.py.

def _iter_decode_generator(input, decoder):
    """Return a generator that first yields the :obj:`Encoding`,
    then yields output chunks as Unicode strings.

    """
    decode = decoder.decode
    input = iter(input)
    for chunk in input:
        output = decode(chunk)
        if output:
            assert decoder.encoding is not None
            yield decoder.encoding
            yield output
            break
    else:
        # Input exhausted without determining the encoding
        output = decode(b'', final=True)
        assert decoder.encoding is not None
        yield decoder.encoding
        if output:
            yield output
        return

    for chunk in input:
        output = decode(chunk)
        if output:
            yield output
    output = decode(b'', final=True)
    if output:
        yield output

Referenced by pip._vendor.webencodings.iter_decode().
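The repeated "if output:" checks exist because an incremental decoder may return an empty string while it buffers an incomplete byte sequence. The sketch below shows that behaviour with the standard library's incremental decoder instead of the package's IncrementalDecoder; chunks_to_text is a hypothetical name and BOM handling is left out.

```python
import codecs


def chunks_to_text(chunks, codec_name="utf-8"):
    """Yield decoded text for an iterable of byte chunks."""
    decoder = codecs.getincrementaldecoder(codec_name)(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)
        # Empty while the decoder is holding back a partial character.
        if text:
            yield text
    # Flush whatever is still buffered once the input is exhausted.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# U+00E9 is two bytes in UTF-8; here it is split across two chunks.
pieces = list(chunks_to_text([b"\xc3", b"\xa9", b"!"]))
```

With this input, the first chunk produces no output and the character only appears once its second byte arrives.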

◆ _iter_encode_generator()

_iter_encode_generator (   input,
  encode 
)
protected

Definition at line 262 of file __init__.py.

def _iter_encode_generator(input, encode):
    for chunk in input:
        output = encode(chunk)
        if output:
            yield output
    output = encode('', final=True)
    if output:
        yield output

Referenced by pip._vendor.webencodings.iter_encode().

◆ ascii_lower()

ascii_lower (   string)
Transform (only) ASCII letters to lower case: A-Z is mapped to a-z.

    :param string: A Unicode string.
    :returns: A new Unicode string.

    This is used for `ASCII case-insensitive
    <http://encoding.spec.whatwg.org/#ascii-case-insensitive>`_
    matching of encoding labels.
    The same matching is also used, among other things,
    for `CSS keywords <http://dev.w3.org/csswg/css-values/#keywords>`_.

    This is different from the :meth:`~py:str.lower` method of Unicode strings,
    which also affects non-ASCII characters,
    sometimes mapping them into the ASCII range:

        >>> keyword = u'Bac\N{KELVIN SIGN}ground'
        >>> assert keyword.lower() == u'background'
        >>> assert ascii_lower(keyword) != keyword.lower()
        >>> assert ascii_lower(keyword) == u'bac\N{KELVIN SIGN}ground'

Definition at line 35 of file __init__.py.

def ascii_lower(string):
    r"""Transform (only) ASCII letters to lower case: A-Z is mapped to a-z.

    :param string: A Unicode string.
    :returns: A new Unicode string.

    This is used for `ASCII case-insensitive
    <http://encoding.spec.whatwg.org/#ascii-case-insensitive>`_
    matching of encoding labels.
    The same matching is also used, among other things,
    for `CSS keywords <http://dev.w3.org/csswg/css-values/#keywords>`_.

    This is different from the :meth:`~py:str.lower` method of Unicode strings,
    which also affects non-ASCII characters,
    sometimes mapping them into the ASCII range:

        >>> keyword = u'Bac\N{KELVIN SIGN}ground'
        >>> assert keyword.lower() == u'background'
        >>> assert ascii_lower(keyword) != keyword.lower()
        >>> assert ascii_lower(keyword) == u'bac\N{KELVIN SIGN}ground'

    """
    # This turns out to be faster than unicode.translate()
    return string.encode('utf8').lower().decode('utf8')

Referenced by pip._vendor.webencodings.lookup().
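The source comment mentions a translate-based alternative. As an illustration only, the sketch below puts the byte round-trip next to a str.translate() version and checks that they agree. ascii_lower_bytes and ascii_lower_translate are hypothetical names, and no claim about relative speed is made here.

```python
import string

# Map each ASCII uppercase code point to its lowercase counterpart.
_ASCII_LOWER_TABLE = {
    ord(upper): ord(lower)
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
}


def ascii_lower_bytes(text):
    """Lower-case via UTF-8 bytes; bytes.lower() only touches A-Z."""
    return text.encode("utf8").lower().decode("utf8")


def ascii_lower_translate(text):
    """Same result through a translation table (hypothetical alternative)."""
    return text.translate(_ASCII_LOWER_TABLE)


keyword = "Bac\N{KELVIN SIGN}ground"
both_agree = ascii_lower_bytes(keyword) == ascii_lower_translate(keyword)
```

Both versions leave the Kelvin sign untouched, whereas str.lower() would turn it into an ASCII "k".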

◆ decode()

decode (   input,
  fallback_encoding,
  errors = 'replace' 
)
Decode a single string.

:param input: A byte string.
:param fallback_encoding:
    An :class:`Encoding` object or a label string.
    The encoding to use if :obj:`input` does not have a BOM.
:param errors: Type of error handling. See :func:`codecs.register`.
:raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
:return:
    A ``(output, encoding)`` tuple of a Unicode string
    and an :obj:`Encoding`.

Definition at line 139 of file __init__.py.

def decode(input, fallback_encoding, errors='replace'):
    """
    Decode a single string.

    :param input: A byte string.
    :param fallback_encoding:
        An :class:`Encoding` object or a label string.
        The encoding to use if :obj:`input` does not have a BOM.
    :param errors: Type of error handling. See :func:`codecs.register`.
    :raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
    :return:
        A ``(output, encoding)`` tuple of a Unicode string
        and an :obj:`Encoding`.

    """
    # Fail early if `encoding` is an invalid label.
    fallback_encoding = _get_encoding(fallback_encoding)
    bom_encoding, input = _detect_bom(input)
    encoding = bom_encoding or fallback_encoding
    return encoding.codec_info.decode(input, errors)[0], encoding

References pip._vendor.webencodings._detect_bom(), and pip._vendor.webencodings._get_encoding().
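The rule that a BOM takes precedence over the fallback encoding can be sketched with the standard library alone. decode_sketch is a hypothetical name, the codec names are Python's rather than WHATWG labels, and the function returns a codec name where the real one returns an Encoding object.

```python
import codecs

# BOMs in the order the _detect_bom listing checks them.
_BOMS = (
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF8, "utf-8"),
)


def decode_sketch(data, fallback, errors="replace"):
    """Return (text, codec_name); a BOM overrides the fallback codec."""
    codecs.lookup(fallback)  # fail early on an unknown fallback
    for bom, name in _BOMS:
        if data.startswith(bom):
            # Strip the BOM before decoding with the codec it announces.
            return data[len(bom):].decode(name, errors), name
    return data.decode(fallback, errors), fallback


with_bom = decode_sketch(b"\xef\xbb\xbf\xc3\xa9", "latin-1")
without_bom = decode_sketch(b"\xe9", "latin-1")
```

The first call ignores the latin-1 fallback because the input announces UTF-8; the second has no BOM and so uses the fallback.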

◆ encode()

encode (   input,
  encoding = UTF8,
  errors = 'strict' 
)
Encode a single string.

:param input: A Unicode string.
:param encoding: An :class:`Encoding` object or a label string.
:param errors: Type of error handling. See :func:`codecs.register`.
:raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
:return: A byte string.

Definition at line 172 of file __init__.py.

def encode(input, encoding=UTF8, errors='strict'):
    """
    Encode a single string.

    :param input: A Unicode string.
    :param encoding: An :class:`Encoding` object or a label string.
    :param errors: Type of error handling. See :func:`codecs.register`.
    :raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
    :return: A byte string.

    """
    return _get_encoding(encoding).codec_info.encode(input, errors)[0]

References pip._vendor.webencodings._get_encoding().

◆ iter_decode()

iter_decode (   input,
  fallback_encoding,
  errors = 'replace' 
)
"Pull"-based decoder.

:param input:
    An iterable of byte strings.

    The input is first consumed just enough to determine the encoding
    based on the presence of a BOM,
    then consumed on demand when the return value is.
:param fallback_encoding:
    An :class:`Encoding` object or a label string.
    The encoding to use if :obj:`input` does not have a BOM.
:param errors: Type of error handling. See :func:`codecs.register`.
:raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
:returns:
    An ``(output, encoding)`` tuple.
    :obj:`output` is an iterable of Unicode strings,
    :obj:`encoding` is the :obj:`Encoding` that is being used.

Definition at line 186 of file __init__.py.

def iter_decode(input, fallback_encoding, errors='replace'):
    """
    "Pull"-based decoder.

    :param input:
        An iterable of byte strings.

        The input is first consumed just enough to determine the encoding
        based on the presence of a BOM,
        then consumed on demand when the return value is.
    :param fallback_encoding:
        An :class:`Encoding` object or a label string.
        The encoding to use if :obj:`input` does not have a BOM.
    :param errors: Type of error handling. See :func:`codecs.register`.
    :raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
    :returns:
        An ``(output, encoding)`` tuple.
        :obj:`output` is an iterable of Unicode strings,
        :obj:`encoding` is the :obj:`Encoding` that is being used.

    """

    decoder = IncrementalDecoder(fallback_encoding, errors)
    generator = _iter_decode_generator(input, decoder)
    encoding = next(generator)
    return generator, encoding

References pip._vendor.webencodings._iter_decode_generator().
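The "consumed just enough, then on demand" behaviour comes from priming a generator with next(): the generator runs up to its first yield, which carries the encoding, and the rest of the input is read only as the caller iterates. A simplified sketch follows. iter_decode_sketch and _generator are hypothetical names, only a UTF-8 BOM is recognised, and the BOM is assumed to sit wholly inside the first chunk, which the real IncrementalDecoder does not require.

```python
import codecs
import itertools


def _generator(chunks, fallback):
    chunks = iter(chunks)
    first = next(chunks, b"")
    # Simplifying assumption: a BOM, if any, is wholly in the first chunk.
    if first.startswith(codecs.BOM_UTF8):
        name, first = "utf-8", first[len(codecs.BOM_UTF8):]
    else:
        name = fallback
    yield name  # the first item is the codec name, not text
    decoder = codecs.getincrementaldecoder(name)(errors="replace")
    for chunk in itertools.chain([first], chunks):
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def iter_decode_sketch(chunks, fallback):
    """Return (text_iterator, codec_name)."""
    generator = _generator(chunks, fallback)
    name = next(generator)  # runs the generator up to its first yield
    return generator, name


consumed = []


def source():
    """Byte-chunk source that records what has been pulled from it."""
    for chunk in (b"\xef\xbb\xbfh", b"\xc3", b"\xa9"):
        consumed.append(chunk)
        yield chunk


output, name = iter_decode_sketch(source(), "latin-1")
pulled_before = len(consumed)  # only the first chunk has been read so far
text = "".join(output)         # iterating pulls the remaining chunks
```

Returning the encoding separately lets a caller inspect it before any text has been decoded.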

◆ iter_encode()

iter_encode (   input,
  encoding = UTF8,
  errors = 'strict' 
)
“Pull”-based encoder.

:param input: An iterable of Unicode strings.
:param encoding: An :class:`Encoding` object or a label string.
:param errors: Type of error handling. See :func:`codecs.register`.
:raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
:returns: An iterable of byte strings.

Definition at line 246 of file __init__.py.

def iter_encode(input, encoding=UTF8, errors='strict'):
    """
    “Pull”-based encoder.

    :param input: An iterable of Unicode strings.
    :param encoding: An :class:`Encoding` object or a label string.
    :param errors: Type of error handling. See :func:`codecs.register`.
    :raises: :exc:`~exceptions.LookupError` for an unknown encoding label.
    :returns: An iterable of byte strings.

    """
    # Fail early if `encoding` is an invalid label.
    encode = IncrementalEncoder(encoding, errors).encode
    return _iter_encode_generator(input, encode)

References pip._vendor.webencodings._iter_encode_generator().
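The "fail early" comment depends on the encoder being built in an ordinary function before the generator is created; if the lookup lived inside the generator body, an unknown encoding would raise only on the first iteration. A sketch of that split using the standard library's incremental encoder follows. iter_encode_sketch and _generate are hypothetical names, and Python codec names stand in for WHATWG labels.

```python
import codecs


def _generate(strings, encode):
    for chunk in strings:
        output = encode(chunk)
        if output:
            yield output
    # Let stateful encoders flush anything they are still holding.
    output = encode("", final=True)
    if output:
        yield output


def iter_encode_sketch(strings, codec_name="utf-8", errors="strict"):
    """Return an iterator of byte strings for an iterable of strings."""
    # Resolved eagerly: an unknown codec raises here, before any iteration.
    encode = codecs.getincrementalencoder(codec_name)(errors).encode
    return _generate(strings, encode)


encoded = b"".join(iter_encode_sketch(["h", "\u00e9"]))
```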

◆ lookup()

lookup (   label)
Look for an encoding by its label.
This is the spec’s `get an encoding
<http://encoding.spec.whatwg.org/#concept-encoding-get>`_ algorithm.
Supported labels are listed there.

:param label: A string.
:returns:
    An :class:`Encoding` object, or :obj:`None` for an unknown label.

Definition at line 61 of file __init__.py.

def lookup(label):
    """
    Look for an encoding by its label.
    This is the spec’s `get an encoding
    <http://encoding.spec.whatwg.org/#concept-encoding-get>`_ algorithm.
    Supported labels are listed there.

    :param label: A string.
    :returns:
        An :class:`Encoding` object, or :obj:`None` for an unknown label.

    """
    # Only strip ASCII whitespace: U+0009, U+000A, U+000C, U+000D, and U+0020.
    label = ascii_lower(label.strip('\t\n\f\r '))
    name = LABELS.get(label)
    if name is None:
        return None
    encoding = CACHE.get(name)
    if encoding is None:
        if name == 'x-user-defined':
            from .x_user_defined import codec_info
        else:
            python_name = PYTHON_NAMES.get(name, name)
            # Any python_name value that gets to here should be valid.
            codec_info = codecs.lookup(python_name)
        encoding = Encoding(name, codec_info)
        CACHE[name] = encoding
    return encoding

References pip._vendor.webencodings.ascii_lower().
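The lookup steps (strip ASCII whitespace only, lower-case ASCII only, map label to name, cache the result) can be sketched as below. LABELS_EXCERPT is a hypothetical three-entry excerpt standing in for the full label table, lookup_sketch is a hypothetical name, and the function returns Python's CodecInfo where the real one returns an Encoding object.

```python
import codecs

# Hypothetical excerpt; the real table lives in the labels namespace.
LABELS_EXCERPT = {
    "utf-8": "utf-8",
    "utf8": "utf-8",
    "latin1": "windows-1252",
}
_cache = {}


def lookup_sketch(label):
    """Return a CodecInfo for a known label, or None."""
    # Strip only U+0009, U+000A, U+000C, U+000D and U+0020.
    label = label.strip("\t\n\f\r ")
    # Lower-case A-Z only, leaving non-ASCII characters untouched.
    label = label.encode("utf8").lower().decode("utf8")
    name = LABELS_EXCERPT.get(label)
    if name is None:
        return None
    if name not in _cache:
        _cache[name] = codecs.lookup(name)
    return _cache[name]


found = lookup_sketch(" UTF8\n")
# A no-break space (U+00A0) is not ASCII whitespace, so it is not stripped.
not_found = lookup_sketch("\u00a0utf-8")
```

Calling str.strip() with no argument would also remove other Unicode whitespace, which is why the character set is spelled out.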

Variable Documentation

◆ _UTF16BE

_UTF16BE = lookup('utf-16be')
protected

Definition at line 136 of file __init__.py.

◆ _UTF16LE

_UTF16LE = lookup('utf-16le')
protected

Definition at line 135 of file __init__.py.

◆ CACHE

dict CACHE = {}

Definition at line 32 of file __init__.py.

◆ PYTHON_NAMES

dict PYTHON_NAMES
Initial value:
PYTHON_NAMES = {
    'iso-8859-8-i': 'iso-8859-8',
    'x-mac-cyrillic': 'mac-cyrillic',
    'macintosh': 'mac-roman',
    'windows-874': 'cp874'}

Definition at line 26 of file __init__.py.
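As the lookup() listing shows, this table is consulted with PYTHON_NAMES.get(name, name), so names without an entry are passed to codecs.lookup() unchanged. A small sketch of that fallthrough, with the table copied from the initial value above and python_codec_for as a hypothetical helper name:

```python
import codecs

PYTHON_NAMES = {
    'iso-8859-8-i': 'iso-8859-8',
    'x-mac-cyrillic': 'mac-cyrillic',
    'macintosh': 'mac-roman',
    'windows-874': 'cp874'}


def python_codec_for(name):
    """Resolve an encoding name to a CodecInfo via the override table."""
    # Names absent from the table are used as Python codec names directly.
    return codecs.lookup(PYTHON_NAMES.get(name, name))


thai = python_codec_for('windows-874')
plain = python_codec_for('utf-8')  # no table entry, passed through
```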

◆ UTF8

UTF8 = lookup('utf-8')

Definition at line 133 of file __init__.py.

◆ VERSION

str VERSION = '0.5.1'

Definition at line 22 of file __init__.py.