Email Package Issue Summary
This is an attempt to categorize the open (at the time of inclusion)
bug reports relevant to the email package. It is a work in progress.
Issues that appear to be fixable in the current package
Note that regardless of whether these actually get fixed in the current
package, unit tests for them must be included in email6.
- issue7472: email.encoders.encode_7or8bit(): typo “iso-2202”. “iso-2022” is correct. (closed)
- issue4487: Add utf8 alias for email charsets (closed)
- issue4768: email.generator.Generator object bytes/str crash - b64encode() bug? (closed)
- This can be “fixed” in the current package. Victor’s patch may
or may not be appropriate, depending on the decision made about
issue 4769 (closed).
- issue5277: email message.get_params() and related methods sometimes fail. (closed)
- Bug in header parsing. Patch and test available, just needs to
be reviewed and applied.
- issue5610: email feedparser.py CRLFLF bug: $ vs \Z (closed)
- How to handle mixed line endings properly during parsing.
- issue6465: email.feedparser regular expression bug (NLCRE_crack) (closed)
- Similar to issue 5610 (closed), but regarding a CRLF split across input
chunks.
- issue6598: calling email.utils.make_msgid frequently has a non-trivial probability of generating colliding ids
- Issue has patch and tests, just needs to be applied.
- issue7304: email.message.Message.set_payload and as_string given charset ‘us-ascii’ plus 8bit data produces invalid message
- issue1555570: email parser incorrectly breaks headers with a CRLF at 8192 (closed)
- Corner case in feedparser when handling CRLF that gets broken
across its chunk-read boundary. Tony Nelson provides a fix
and a test.
- issue7143: get_payload(decode=True) eats last newline in base64 encoded payload (closed)
- If a base64 encoded part ends with an (encoded) newline, that newline
is incorrectly stripped by get_payload.
- issue8769: Straightforward usage of email package fails to round-trip (closed)
- If an encoded word is long and contains a “,” or “;”, it gets wrapped
at that character, thereby breaking the encoded word into invalid
chunks. Andrew has a patch that may or may not be general enough
to apply.
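Since each of these fixes needs a unit test, the chunk-boundary cases (issue 6465 and issue 1555570) could be exercised along the lines of the sketch below. The helper name parse_in_chunks and the sample message are assumptions made for illustration; only FeedParser and message_from_string come from the email package.

```python
from email import message_from_string
from email.feedparser import FeedParser


def parse_in_chunks(text, split_at):
    """Feed text to a FeedParser in two pieces, split at index split_at."""
    parser = FeedParser()
    parser.feed(text[:split_at])
    parser.feed(text[split_at:])
    return parser.close()


# Sample message (made up for this sketch) with CRLF line endings.
raw = "Subject: test\r\nTo: user@example.com\r\n\r\nbody line\r\n"

# Split between the CR and the LF that end the first header line.
split_at = raw.index("\r\n") + 1

chunked = parse_in_chunks(raw, split_at)
whole = message_from_string(raw)

# A parser that handles the split correctly sees the same message either way.
same_headers = chunked.items() == whole.items()
same_body = chunked.get_payload() == whole.get_payload()
```

A real test would repeat this for every possible split position, and for the 8192-character read boundary mentioned in issue 1555570.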
Email relevant issues in modules the email package uses
Parser/Generator
- issue4661: email.parser: impossible to read messages encoded in a different encoding (closed)
- This issue could be the poster child for the need to rewrite the email
package.
- issue724459: Add documentation about line endings in email messages.
- Discussion of general Python philosophy about handling line endings:
use \n internally, and any module that writes to the wire should
convert to CRLF (smtplib, imaplib, etc). The issue is a request for
a doc enhancement, and it certainly applies to the design of the
email package. This issue needs to be addressed at the fundamental
design level.
- issue1349106: email.Generator does not separate headers with “\r\n” (closed)
- IMO, despite (or because of) issue 724459, the generator should
have an API for creating standards compliant output using CRLF
regardless of platform. This is to support consumers of the package
that do communicate on the wire, and may also be necessary in order
to fully support handling mixed line endings (see issue 975330).
The default, however, should continue to be \n, because that is the
line-ending convention that Python programs generally expect.
- issue975330: Inconsistent newline handling in email module
- The new API must be consistent in how newlines are handled in text
parts, regardless of what encoding happens. This issue interacts
directly with issue 724459.
- issue6942: email.generator.Generator memory consumption
- A performance/resource usage enhancement proposal.
- issue1243730: Big speedup in email message parsing
- Another performance enhancement, by eliminating some uses of
re in favor of direct string manipulation.
- issue740495: API enhancement: poplib.MailReader()
- Request, essentially, for an additional API for FeedParser that
would make life easier when using data returned from poplib:
accepting a list of lines.
- issue1440472: email.Generator is not idempotent (closed)
- The parser/generator are not currently inverses. We intend to fix
this in email6.
- issue1459867: Message.as_string should use “mangle_from_=unixfrom”? (closed)
- In addition, __str__ and as_string don’t respect the unixfrom
flag value set on the Message object.
- issue1443866: email 3.0+ stops parsing headers prematurely (closed)
- Current parser treats non-header lines that start in column 1 as
the end of the headers. It is not clear that this is in fact
wrong as long as a defect is recorded, but we should consider
how smart it is possible/reasonable to be about detecting the
start of the body when the message is ill-formed.
- issue1443875: email/charset.py convert() patch
- Essentially a request to allow non-strict decoding (‘replace’)
at the application program’s request.
- issue1672568: silent error in email.message.Message.get_payload (closed)
- Example of where “never fail on query” and “don’t let errors pass
silently” may conflict in the parser. Should be considered in the
API design.
- issue1243654: Faster output if message already has a boundary (closed)
- Optimization, but raises the issue of what should happen if a message’s
boundary is already defined.
- issue8008: Allow Arbitrary OpenID providers in this bug tracker (closed)
- The current method of handling string input (turning it into a
StringIO so it looks like a file to FeedParser) has memory (and speed)
consequences when the input string is large.
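The line-ending discussion under issue 1349106 and issue 975330 can be made concrete with a small sketch of what a caller-selectable line separator looks like. It assumes the linesep argument that Generator.flatten accepts in Python 3.2 and later, which postdates the issues listed here; the helper name flatten_to_text and the sample message are hypothetical.

```python
import io
from email import message_from_string
from email.generator import Generator


def flatten_to_text(msg, linesep=None):
    """Serialize msg to a string, optionally forcing a line separator."""
    out = io.StringIO()
    # linesep=None lets the generator fall back to its default, "\n".
    Generator(out).flatten(msg, linesep=linesep)
    return out.getvalue()


# Sample message (made up for this sketch) using "\n" internally.
source = "Subject: hello\nTo: user@example.com\n\nline one\nline two\n"
msg = message_from_string(source)

# Default output keeps "\n", which is what ordinary Python code expects.
default_text = flatten_to_text(msg)

# Output intended for the wire uses CRLF in headers and body alike.
wire_text = flatten_to_text(msg, linesep="\r\n")
```

For a simple message like this one, the default output is identical to the parsed input, which is the parser/generator inverse property that issue 1440472 asks for.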
General issues from consumers of the email package
- issue747320: rfc2822 formatdate functionality duplication
- A general call for removing code duplication, but the point is made
that some RFCs have slightly different formats. However, logging
in particular should use email instead of having its own implementation. There is
a further note that there are other places in the stdlib where
duplication of email services occurs, but no pointers in the issue.
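As a rough illustration of the service that other modules tend to duplicate, the sketch below calls email.utils.formatdate on a fixed timestamp. The variable names are made up for the example, and the choice of the Unix epoch is only there to make the output reproducible.

```python
from email.utils import formatdate

# Fixed timestamp (the Unix epoch) so the result does not depend on the clock.
EPOCH = 0

# RFC 2822 style date in UTC, as used in mail headers.
mail_style = formatdate(EPOCH)
# → 'Thu, 01 Jan 1970 00:00:00 -0000'

# Same instant with the literal "GMT" zone, the variant HTTP headers use.
http_style = formatdate(EPOCH, usegmt=True)
# → 'Thu, 01 Jan 1970 00:00:00 GMT'
```

The two results differ only in the zone field, which is the kind of small format difference between RFCs that the issue mentions.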
httplib
- issue4403: regression from 2.6: smtplib.py requiring ascii for sending messages (closed)
- Here smtplib needs to put bytes on the wire correctly, and the
email package is the logical way to do this. At the very least
there are smtplib doc issues here, and perhaps some use cases.
- issue5053: http.client.HTTPMessage.getallmatchingheaders() always returns []
- http.client has a method, getallmatchingheaders, that
could be replaced by the current get_all method of Message.
The resolution of this issue will be affected by the transition to
headers always being Header objects, since that in fact changes
the API that http.client is exposing if http.client switches to
using get_all. It would arguably not be a bad thing for this API
to return header objects, but it means we need to think about the
proposed compatibility layer in a wider context than just the email
package itself.
- issue7370: BaseHTTPServer reinventing rfc822 date formatting
- Suggested refactoring. The patch uses rfc822, but it should instead
use the email module’s formatdate (but see issue 5207 as well).
- issue8318: Deprecation of multifile inappropriate or incomplete (closed)
- The use case that triggered this (parsing range objects) may or may not be
handled by the current MIME implementation, but certainly needs to be. We
may wish to consider other possible applications of multifile, which is a
bit more general than MIME parsing, and whether or not we want to provide
an API to support those uses. In any case the multifile docs need to be
updated with transition instructions.
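To show what a get_all based replacement for getallmatchingheaders (issue 5053) would hand back today, here is a minimal sketch using a plain Message parsed from text. The header block is invented for the example, and the values come back as strings only because the current package does not yet return Header objects.

```python
from email import message_from_string

# Header block (made up for this sketch) with a repeated field.
raw = "Set-Cookie: a=1\nSet-Cookie: b=2\nContent-Type: text/plain\n\n"
msg = message_from_string(raw)

# get_all matches field names case-insensitively and keeps message order.
cookies = msg.get_all("set-cookie")

# A missing field yields the supplied default instead of raising.
missing = msg.get_all("x-missing", [])
```

Any compatibility layer would need to decide whether callers such as http.client keep receiving plain strings here or start receiving header objects.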
nntplib