email-normalize

email-normalize is a Python 3 library that returns a normalized email address, stripping mailbox-provider-specific behaviors such as “plus addressing” (foo+bar@gmail.com).

The email-normalize API has two primary components: a single function, email_normalize.normalize(), and the email_normalize.Normalizer class. Both use Python’s asyncio library.

The normalize() function is intended for use in non-async applications and the Normalizer is intended for async applications. normalize() uses Normalizer under the hood.
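As a rough illustration of how a blocking function can wrap an asyncio coroutine, the sketch below uses asyncio.run(). The names normalize_async and normalize_blocking are hypothetical stand-ins and the body does no DNS work; this is not the library's actual implementation.

```python
import asyncio


async def normalize_async(email_address: str) -> str:
    """Hypothetical stand-in for an async normalization coroutine."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    local_part, _, domain_part = email_address.rpartition('@')
    return f'{local_part}@{domain_part.lower()}'


def normalize_blocking(email_address: str) -> str:
    """Run the coroutine to completion on a fresh event loop."""
    return asyncio.run(normalize_async(email_address))


print(normalize_blocking('foo@BAR.io'))  # foo@bar.io
```

asyncio.run() raises RuntimeError when called from a thread that already has a running event loop, which is one reason an async application would await the coroutine directly rather than call a blocking wrapper.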

Documentation

normalize Function

email_normalize.normalize(email_address)

Normalize an email address

This function hides the asyncio foundation of this library behind a blocking call. If you intend to use this library as part of an asyncio-based application, it is recommended that you use the Normalizer class instead.

Note

If the MX records could not be resolved, the mx_records attribute of the result will be an empty list and the mailbox_provider attribute will be None.

Usage Example

import email_normalize

result = email_normalize.normalize('foo@bar.io')

Parameters

email_address (str) – The address to normalize

Return type

email_normalize.Result

Normalizer Class

class email_normalize.Normalizer(name_servers=None, cache_limit=1024, cache_failures=True, failure_ttl=300)

Class for normalizing an email address and resolving MX records.

Normalization is processed by splitting the local and domain parts of the email address and then performing DNS resolution for the MX records associated with the domain part of the address. The MX records are processed against a set of mailbox provider specific rules. If a match is found for the MX record hosts, the rules are applied to the email address.
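The flow described above can be sketched without any DNS access by passing the MX records in directly. The RULES table, the apply_rules name, and the specific rules shown are assumptions made for illustration; the library's real rule set and provider names may differ.

```python
from typing import Callable, Dict, List, Optional, Tuple

MXRecords = List[Tuple[int, str]]

# Hypothetical rule table: MX host suffix -> (provider name, local-part rule).
RULES: Dict[str, Tuple[str, Callable[[str], str]]] = {
    # Assumed rule: drop everything after '+' and remove periods.
    'google.com': ('Google', lambda local: local.split('+')[0].replace('.', '')),
    # Assumed rule for a made-up provider: drop everything after '+'.
    'example-mail.test': ('ExampleMail', lambda local: local.split('+')[0]),
}


def apply_rules(address: str, mx_records: MXRecords) -> Tuple[str, Optional[str]]:
    """Return (normalized address, provider name or None)."""
    local_part, _, domain_part = address.rpartition('@')
    domain_part = domain_part.lower()
    for _priority, host in sorted(mx_records):
        host = host.lower().rstrip('.')
        for suffix, (provider, rule) in RULES.items():
            if host == suffix or host.endswith('.' + suffix):
                return f'{rule(local_part.lower())}@{domain_part}', provider
    # No rule matched (or no MX records): leave the local part untouched.
    return f'{local_part}@{domain_part}', None


print(apply_rules('Gavin.M.Roy+ignore-spam@gmail.com',
                  [(5, 'gmail-smtp-in.l.google.com')]))
```

Matching on the MX host rather than the address domain is what lets custom domains hosted by a supported provider receive the same rules.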

This class implements a least frequent recently used cache that respects the DNS TTL returned when performing MX lookups. Data is cached at the module level.
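To make the caching idea concrete, here is a minimal sketch of a bounded cache whose entries expire after a per-entry TTL and which evicts the least-used entry when full. The TTLCache class and its methods are hypothetical and simplified; they are not the library's cache.

```python
import time
from typing import Dict, List, Optional, Tuple

MXRecords = List[Tuple[int, str]]


class TTLCache:
    """Hypothetical bounded cache; not the library's implementation."""

    def __init__(self, limit: int = 1024) -> None:
        self.limit = limit
        # domain -> [expires_at, hit_count, records]
        self._entries: Dict[str, list] = {}

    def get(self, domain: str, now: Optional[float] = None) -> Optional[MXRecords]:
        now = time.monotonic() if now is None else now
        entry = self._entries.get(domain)
        if entry is None:
            return None
        if entry[0] <= now:
            del self._entries[domain]  # TTL elapsed, force a fresh lookup
            return None
        entry[1] += 1
        return entry[2]

    def set(self, domain: str, records: MXRecords, ttl: float,
            now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        full = len(self._entries) >= self.limit
        if domain not in self._entries and full and self._entries:
            # Evict the entry that has been read the fewest times.
            victim = min(self._entries, key=lambda d: self._entries[d][1])
            del self._entries[victim]
        self._entries[domain] = [now + ttl, 0, records]
```

Under this sketch, a failed lookup could be stored as an empty list with a fixed TTL, which is roughly the behavior the cache_failures and failure_ttl parameters describe.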

Usage Example

import email_normalize


async def normalize(email_address: str) -> email_normalize.Result:
    normalizer = email_normalize.Normalizer()
    return await normalizer.normalize(email_address)

Parameters
  • name_servers (list(str) or None) – Optional list of hostnames to use for DNS resolution

  • cache_limit (int) – The maximum number of domain results that are cached. Defaults to 1024.

  • cache_failures (bool) – Toggle the behavior of caching DNS resolution failures for a given domain. When enabled, failures will be cached for failure_ttl seconds. Defaults to True.

  • failure_ttl (int) – Duration in seconds to cache DNS failures. Only applies when cache_failures is set to True. Defaults to 300 seconds.

async mx_records(domain_part)

Resolve MX records for a domain, returning a list of tuples with the MX priority and value.

Parameters

domain_part (str) – The domain to resolve MX records for

Return type

MXRecords

async normalize(email_address)

Return a Result instance containing the original address, the normalized address, the MX records found, and the detected mailbox provider.

Note

If the MX records could not be resolved, the mx_records attribute of the result will be an empty list and the mailbox_provider will be None.

Parameters

email_address (str) – The address to normalize

Return type

Result
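In an async application, several addresses can be normalized concurrently by awaiting the coroutine calls together. The sketch below uses a hypothetical StubNormalizer with an async normalize() method so that it runs without DNS access; an object with the same method shape could presumably be substituted.

```python
import asyncio
from typing import Iterable, List


class StubNormalizer:
    """Hypothetical stand-in exposing an async normalize() method."""

    async def normalize(self, email_address: str) -> str:
        await asyncio.sleep(0)  # stands in for the awaited MX lookup
        return email_address.lower()


async def normalize_many(normalizer, addresses: Iterable[str]) -> List[str]:
    # gather() runs the coroutines concurrently and keeps input order.
    return await asyncio.gather(
        *(normalizer.normalize(address) for address in addresses))


print(asyncio.run(normalize_many(StubNormalizer(), ['A@x.io', 'b@Y.io'])))
```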

MXRecords Type

email_normalize.MXRecords

A typing alias for a list of tuples containing the priority and host name for each record returned during the MX lookup.

typing.List[typing.Tuple[int, str]]

Example

[
    (5, 'gmail-smtp-in.l.google.com'),
    (10, 'alt1.gmail-smtp-in.l.google.com'),
    (20, 'alt2.gmail-smtp-in.l.google.com'),
    (30, 'alt3.gmail-smtp-in.l.google.com'),
    (40, 'alt4.gmail-smtp-in.l.google.com')
]
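Because each record is a (priority, host) tuple, ordinary tuple comparison is enough to pick the most preferred host, since lower MX priority values are preferred. The preferred_host helper below is a hypothetical name used only for this sketch.

```python
from typing import List, Optional, Tuple

MXRecords = List[Tuple[int, str]]


def preferred_host(mx_records: MXRecords) -> Optional[str]:
    """Return the host with the lowest priority value, or None if empty."""
    if not mx_records:
        return None
    # Tuples compare by priority first, then by host name.
    return min(mx_records)[1]


records = [
    (10, 'alt1.gmail-smtp-in.l.google.com'),
    (5, 'gmail-smtp-in.l.google.com'),
]
print(preferred_host(records))  # gmail-smtp-in.l.google.com
```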

Result Class

class email_normalize.Result(address, normalized_address, mx_records, mailbox_provider=None)

Instances of the Result class contain data from the email normalization process.

Parameters
  • address (str) – The address that was normalized

  • normalized_address (str) – The normalized version of the address

  • mx_records (MXRecords) – A list of tuples representing the priority and host of the MX records found for the email address. If empty, indicates a failure to look up the domain part of the email address.

  • mailbox_provider (str) – String that represents the mailbox provider name; it is None if the mailbox provider could not be detected or was unsupported.

Note

If during the normalization process the MX records could not be resolved, the mx_records attribute will be an empty list and the mailbox_provider attribute will be None.

Example

@dataclasses.dataclass(frozen=True)
class Result:
    address = 'Gavin.M.Roy+ignore-spam@gmail.com'
    normalized_address = 'gavinmroy@gmail.com'
    mx_records = [
        (5, 'gmail-smtp-in.l.google.com'),
        (10, 'alt1.gmail-smtp-in.l.google.com'),
        (20, 'alt2.gmail-smtp-in.l.google.com'),
        (30, 'alt3.gmail-smtp-in.l.google.com'),
        (40, 'alt4.gmail-smtp-in.l.google.com')
    ]
    mailbox_provider = 'Gmail'
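Callers typically need to distinguish a failed MX lookup from an undetected provider. The sketch below uses a stand-in dataclass that mirrors the four documented fields so it runs on its own; the summarize helper and its message wording are hypothetical.

```python
import dataclasses
from typing import List, Optional, Tuple


@dataclasses.dataclass(frozen=True)
class Result:
    """Stand-in mirroring the documented fields, for illustration only."""
    address: str
    normalized_address: str
    mx_records: List[Tuple[int, str]]
    mailbox_provider: Optional[str] = None


def summarize(result: Result) -> str:
    if not result.mx_records:
        # An empty list signals that the MX lookup failed.
        return f'{result.address}: MX lookup failed'
    provider = result.mailbox_provider or 'unknown provider'
    return f'{result.address} -> {result.normalized_address} ({provider})'


print(summarize(Result('foo@bar.io', 'foo@bar.io', [])))
```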

Currently Supported Mailbox Providers

  • Apple

  • Fastmail

  • Google

  • Microsoft

  • ProtonMail

  • Rackspace

  • Yahoo

  • Yandex

  • Zoho

Installation

email-normalize is available via the Python Package Index.

pip3 install email-normalize
