Email message size limits

Background

Prompted by a request from staff at a client’s head office, a couple of days ago i posed this question to a couple of the mailing lists i’m on: what is your size limit on individual email messages?

I was blown away by the speed, quantity, and quality of the responses i received from the AusNOG and SAGE-AU communities.  Within an hour i had some hard data and a useful recommendation to take to my client.

Results

I’ve published the statistics and the raw figures to separate sheets in the same Google Docs workbook; a few explanatory comments about the results are necessary:

  • A number of responses indicated two values, often broken down by receive/send or internal/external criteria (with the latter being the smaller).  This is indicated as “Tier 1” vs. “Tier 2” in the raw results.  I’ve used the “Tier 1” figure to calculate the results.
  • Answers which were ambiguous or indicated no limit were not included in the calculations, nor was one answer of 5 GB, since it skewed the results unrealistically.

Statistics[1]

  • Number of responses: 64
  • Number of numerically quantifiable responses: 57
  • Mean: 30.105 MB
  • Median: 25 MB
  • Mode: 20 MB
  • Standard Deviation: 20.929 MB

Bottom line

I’d say that anyone using something in the range of 10 – 50 MB could consider themselves reasonably “normal”; both those figures are within one standard deviation of the mean.
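The "within one standard deviation" claim can be checked directly from the published mean and standard deviation. This is only a sketch of that arithmetic; the variable and function names are made up for illustration, and the two figures are the ones listed under Statistics.

```python
# Figures taken from the Statistics section above (in MB).
MEAN_MB = 30.105
STD_DEV_MB = 20.929

# One standard deviation either side of the mean.
LOWER_MB = MEAN_MB - STD_DEV_MB
UPPER_MB = MEAN_MB + STD_DEV_MB


def within_one_sd(size_mb: float) -> bool:
    """Return True if a size limit lies within one standard deviation of the mean."""
    return LOWER_MB <= size_mb <= UPPER_MB


if __name__ == "__main__":
    print(f"one-SD band: {LOWER_MB:.3f} MB to {UPPER_MB:.3f} MB")
    for limit in (5, 10, 25, 50, 100):
        print(f"{limit} MB within one SD: {within_one_sd(limit)}")
```

The band works out to roughly 9 to 51 MB, so 10 MB and 50 MB both sit just inside it.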

Commentary

Here are some of the more interesting comments i received, along with the size they indicated.  In most cases, these are direct quotes, but i’ve edited them for spelling, clarity, and punctuation where necessary.  I’ve highlighted two responses that i found striking, given their closeness to the actual results.  (I also suggest reading the AusNOG discussion – both threads – some excellent points were made.)

Size(s) (MB) Comments
8 If people need to send more than that, email is the wrong answer.
15 We’ve found in the past that increasing above 15 MB resulted in a large number of bounce backs from organizations rejecting messages that were too large. The biggest issue we have is explaining this to our customers and getting them to believe it, mainly because they don’t understand that a simple 8 MB JPEG can blow out to 20-25 MB because of MIME encoding etc. We try our best to advise them of this, but we do get quite a lot of arguing and feedback requesting we increase it anyway. However, slowly they’re realizing: when their large messages start bouncing back they ask us to set the limit back to what it was before.
25 I imagine a general consensus will be 25 MB upper limit due to Google Apps.
25 Most of my clients have gone Google Apps.
30 Our general view is that if a limitation is lower than what a customer gets on gmail (which is currently 25 MB) and related free services, then you will need to support at least that limit. A limit of 30 MB doesn’t have to be in place long before users actively notice that the limits are typically elsewhere, and start talking about how good their system is. Non-technical high-ups will struggle with paying for a business service that offers less than their personal accounts.
30 Microsoft did a risk assessment for us and noted that having large message sizes and large mailbox sizes (10G to 60G) is a high risk.
40 … and we still get complaints.
50 People still run into [our limit.  We] had ‘someone’s IT guy’ tell us the ‘industry standard’ was 10 MB. I expect you’re getting a wide range of answers, and that there really isn’t an ‘industry standard’.
50 Unfortunately, I still get called every time an email bounces due to remote size limits.
100 We didn’t see any notable impact because of this change [to 100 MB], no delays, additional load or problems caused by the larger emails. Note: These clients had 20, 50, 100meg or faster Internet pipes.
5? I’m actually looking at reducing email size limits to force users into using technologies designed for file sharing and governance – Sharepoint, Skydrive Pro, etc. Reducing limits to 5 MB has all sorts of flow on effects: not even talking about freeing up link bandwidth, Exchange store sizes, etc. I’ve found that email enables poor habits. Emailing a 10 MB doc to the user 2 rooms down via a hosted exchange? Floods the link twice, plus stores the attachment in your local OST, the recipients local OST, and two copies in the exchange store. Now, modify it, and send it back. Ouch.
20? If I had to pick a single size that’s used, it would probably be 20 MB – but there’s no end of variation. 10 MB is common, although mainly for historic reasons, and the number of people with such a low limit is dropping. 25 MB and even 50 MB aren’t uncommon. 100 MB is rare, but out there – mainly in situations where mail is being sent to a specific recipient and they have also upped their [overall] limit. I’ve even seen one company who wanted their limit set to 1 GB…
unlimited/10 I can not express enough the frustration in a customer saying they want to send a bigger email and wanting us to up our limit, explaining the internet is just too hard a task sometimes. In one specific case it was an 11 MB email, the customer response was “It’s only an extra 1 MB can you just let it through this once”, so I pointed him to an SMTP with no limit on it; next day he is forwarding a bounce back from the receiving end who blocked him based on size.
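
Several of the comments above turn on the gap between the size of a file and the size of the message that carries it. As a rough sketch, the overhead of base64 (the usual MIME transfer encoding for binary attachments) can be estimated as below. The function name is hypothetical, and the calculation ignores headers, the message body, and any other parts of the message.

```python
import base64
import os


def mime_encoded_size(raw_bytes: int, line_length: int = 76) -> int:
    """Estimate the size in bytes of an attachment after base64 encoding.

    base64 turns every 3 input bytes into 4 output characters, and the
    encoded text is wrapped at 76 characters per line, each line ending
    in CRLF (2 bytes).
    """
    encoded_chars = 4 * ((raw_bytes + 2) // 3)
    lines = (encoded_chars + line_length - 1) // line_length
    return encoded_chars + 2 * lines


if __name__ == "__main__":
    # Cross-check the estimate against the standard library on a small sample.
    sample = os.urandom(1000)
    actual = len(base64.encodebytes(sample).replace(b"\n", b"\r\n"))
    print("estimate matches stdlib:", actual == mime_encoded_size(len(sample)))

    attachment = 8 * 1024 * 1024  # the 8 MB JPEG from the comment above
    encoded = mime_encoded_size(attachment)
    print(f"8 MB attachment -> {encoded / (1024 * 1024):.2f} MB encoded")
    print(f"growth factor: {encoded / attachment:.3f}")
```

By this estimate base64 alone adds a bit over a third, which is enough to push an 8 MB attachment past a 10 MB limit. It does not by itself reach the 20-25 MB quoted by the respondent, so other factors would have to be involved in that case.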

Decision

For those who are interested in the decision: my client and i were both previously part of the “10 MB is the industry standard” camp, but found the argument about gmail compatibility compelling, and have decided to increase to 25 MB, much to the delight of the staff member pushing for the change.

Notes

  1. Disclaimer: I am not a statistician; this is not a scientifically- or statistically-valid survey; all online polls are inherently bogus due to the respondents self-selecting; i have no idea whether this sample is statistically significant or valid; i did not attempt to authenticate or validate the responses in any way; YMMV; no warranties expressed or implied, etc.


Source: libertysys.com.au

Email trivialities – a couple of first encounters

Last week i had my first encounter with a person who was unable to read an email which used Usenet-style quoting.  For those not familiar with the whole debate, which probably started before i first encountered the Internet (way back in 1989), the following references offer some insight:

  1. http://en.wikipedia.org/wiki/Posting_style
  2. http://en.wikipedia.org/wiki/Usenet_quoting
  3. http://lipas.uwasa.fi/~ts/http/quote.html
  4. http://www.catb.org/jargon/html/email-style.html
  5. http://mailformat.dan.info/quoting/bottom-posting.html 
  6. http://mailformat.dan.info/quoting/top-posting.html

(Number 5 is particularly good because it explains why inline quoting is not just for geeks, but is a judicious practice for all sensible people, and number 6 preserves a particularly sad and hilarious example of why posting styles are important.)

Most people in the Linux world seem to be inline quoters, although this is changing gradually with more and more members of the Ubuntu community coming from non-technical backgrounds.  (Even my own wife cannot accept inline quoting as the one true method – despite my urgings…)  Those familiar with the debate and committed to inline quoting would likely be familiar with the frustration of dealing constantly with those who just accept Microsoft Outlook’s defaults and top-post everything, even when you have already responded in a point-by-point manner.

I’ve actually changed a number of my email habits over the last few years in order to make interacting with these people easier, including switching my default mail format to HTML, and switching from PGP and S/MIME to DKIM for cryptographic authentication.  But last week was the first time i’d ever come across someone who was so ignorant of inline quoting that he literally could not understand my email.  He thought i had just sent his email straight back to him.  (Why would anyone do that, i wonder?  Is top-posting not wasteful enough of bandwidth already?)  I believe that the inefficient nature of top-posting (the fact that no trimming of the quoted text is ever done) has taught people like him to simply ignore everything after the first visual divider in the email.

Anyway, i’m curious to know what others think (drop me a line, if you like) – how do you deal with this while still being responsible yourself?  I would have thought that the fact that my email looked superficially similar to my correspondent’s and was inexplicable should have clued him into the fact that something was afoot and signalled to his brain that he should read more closely.  Maybe he’s dyslexic?  Or just plain lazy?  I have to deal with this person on an ongoing basis – should i simply ignore it and move on?

In related news, today i had my first encounter with Hebrew spam.  Neither SpamAssassin nor Thunderbird’s spam filter picked it up, but Thunderbird’s phishing detection marked it as a scam (correctly, as far as i can tell).  It got a Bayes rating of 60%, so i imagine i won’t see another one once my server runs its nightly Bayes training.  Having studied ancient Hebrew at postgraduate level, i was actually interested in the email from a linguistic perspective, but alas, modern Hebrew has no vowel pointing, so my woeful vocabulary (and the fact that languages change over 3000 years or so) rules out me actually reading it.  Interestingly enough, the spam was to the public email address of my Free Software project, Photo Importer, which means that the spammer has a crawling engine which actually extracts email addresses from .tar.gz archives which it has downloaded from the web.



Spam insights from Project Honeypot

Project Honeypot just published a report of their experience in processing 1 billion spam messages.  Highlights for the impatient:

  • For the past 5 years, spam “bots have grown at a compound annual growth rate of more than 378%. In other words, the number of bots has nearly quadrupled every year.”
  • The top 5 countries which host bots are: China (11.4%), Brazil (9.2%), United States (7.5%), Turkey (6.3%), and Germany (6.0%).
  • Top 5 countries with the best ratio of security professionals to spam sources: Finland, Canada, Belgium, Australia (yay!), and the Netherlands.
  • The corresponding bottom 5: China, Azerbaijan, South Korea, Colombia, and Macedonia.
  • Top Spam harvesting countries: United States, Spain, the Netherlands, United Arab Emirates, and Hong Kong.
  • Fraud is rising as a cause for spamming:

    On the other hand “Fraud” spammers — those committing phishing or so-called “419” advanced fee scams — tend to send to and discard harvested addresses almost immediately. The increased average speed of spammers appears to be mostly attributable to the rise in spam as a vehicle for fraud rather than an increasing efficiency among traditional product spammers.

    As an anecdote to reinforce this, on one site i administer, i set up a dedicated subdomain which was purely designed to catch spam.  I placed some addresses in that domain on a web page, and within 1 day they had been harvested and 1 spam had been sent to each email address.  No email to that subdomain has been seen since.
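
A side note on the growth figure quoted in the first highlight: "378%" can be read two ways, and they give different yearly multipliers. The sketch below just does the arithmetic for both readings; the function names are made up for illustration, and which reading the report intended is an assumption either way.

```python
YEARS = 5
QUOTED_PERCENT = 378.0


def multiplier_as_growth_rate(percent: float) -> float:
    """Strict reading: a growth rate of p% multiplies the count by 1 + p/100 each year."""
    return 1.0 + percent / 100.0


def multiplier_as_ratio(percent: float) -> float:
    """Loose reading: each year's count is p% of the previous year's."""
    return percent / 100.0


def cumulative_growth(yearly_multiplier: float, years: int) -> float:
    """Total growth factor after compounding the yearly multiplier."""
    return yearly_multiplier ** years


if __name__ == "__main__":
    for label, fn in (("growth rate", multiplier_as_growth_rate),
                      ("ratio", multiplier_as_ratio)):
        m = fn(QUOTED_PERCENT)
        total = cumulative_growth(m, YEARS)
        print(f"{label}: x{m:.2f} per year, x{total:.0f} over {YEARS} years")
```

The loose reading (x3.78 per year) is the one that matches "nearly quadrupled"; the strict reading would be closer to quintupling. Either way the compounded growth over five years is enormous.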

Check out Project Honeypot’s full analysis.


"Just say no!" to e-cards

Richard Bliss recently blogged at Novell and on his personal blog with some great advice: don’t click on e-cards from your friends, and think about asking them not to send them at all, since the risks of clicking on e-cards vastly outweigh the benefits. Here’s another thought: real money spent on real cards, envelopes, and stamps shows that you’ve actually made an investment in reminding your friends and family of your regard for them. It’s much better for your online security, too! (Of course, one could argue that it is less environmentally friendly, but you can find cards and envelopes made from recycled materials.)

