Linking to Documents and Files: The Devil’s in the Details

It’s a given that when you link to anything that is not a web page you should indicate the type of file you are linking to and its size. For example, here’s a link to a creative brief example (PDF 29KB) that I used for a project.

File Data In or Out?

However, the question I have been pondering is how you should write the link to that document or file. More specifically, should the information identifying the file type and size be included in the link or not?

Let’s run that example again, first with the identifying information not included in the link (the linked text is shown in square brackets):

… here’s a link to a [creative brief example] (PDF 29KB) that I used for a project.

And here’s how it looks with the information included in the link:

… here’s a link to a [creative brief example (PDF 29KB)] that I used for a project.

Here’s how the two examples would look in a list:

  1. [Creative brief example] (PDF 29KB)
  2. [Creative brief example (PDF 29KB)]
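
In markup terms, the only difference between the two approaches is where the closing anchor tag falls. Here is a minimal sketch; the path /files/creative-brief.pdf is a placeholder I have assumed for illustration, not the real location of the file:

```html
<!-- Approach 1: file type and size sit outside the link -->
<p>
  Here's a link to a
  <a href="/files/creative-brief.pdf">creative brief example</a>
  (PDF 29KB) that I used for a project.
</p>

<!-- Approach 2: file type and size are part of the link text -->
<p>
  Here's a link to a
  <a href="/files/creative-brief.pdf">creative brief example (PDF 29KB)</a>
  that I used for a project.
</p>
```

Either way the annotation stays in the visible text beside the link, so readers can see it before they click.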

As you can see from my lead-in, I prefer the first approach: I find it makes these links easier to read (less ‘clutter’), more scannable on the page, and more visually appealing.

However, I’m not certain that I’m in the majority. I’d be very interested to know what others think and why.

PDF or Portable Document Format?

On a related note, I’m also not sure about how the file type should be written. Yes, PDF is a no-brainer. But what about Word documents, Excel spreadsheets, and other less well-known file formats?

Using the three-letter file extension is certainly tidy and consistent from a presentational standpoint. However, does it provide enough information for your users, who may be less computer literate?

Is it better to write (DOC 250KB) or (Word 250KB)? What about PowerPoint: (PPT 500KB) or (PowerPoint 500KB)?

My preference is to use the file extension, largely for the sake of consistency and simplicity. But I’m happy to be persuaded otherwise.

Like I said at the beginning — the devil’s in the details…

24 thoughts to “Linking to Documents and Files: The Devil’s in the Details”

  1. It’s all about the audience. Geeks will be able to figure anything out. Office workers will know .ppt and .doc. For the layman, you probably will want a whole paragraph of explanation.

  2. I’d go for example one with the bit in brackets not in the link. I’d also use the full name of the application rather than its file extension if it weren’t an abbreviation like PDF (which I’d mark up with the abbr tag for the first instance on a page as well).
    I’ve also seen it recommended that your anchor tag include the type attribute to correctly identify the document, e.g. for a PDF it would be type="application/pdf".

  3. 1. The information identifying the file type and size should not be included in the link because the link should be strictly and primarily focused on the content. The information about file type and size is secondary because it matters less when the reader is deciding whether he should follow the link or not. If the reader is scanning the page, he should be able to concentrate on the content and nothing else.
    2. doc or Word, xls or Excel? I’m not sure. I agree with Gabe: it’s about the audience. For most people, “Word” is easier to understand than “doc”.

  4. It’s odd to think that I want to put a period before DOC and PPT, even in the parenthetical, but not for PDFs.
    Some or another document (.DOC – 56KB)
    Some or another presentation (.PPT – 3.4MB)
    This here file (PDF – 4.5MB)
    Of course, I’ve always been a fan of including an icon with such notations just to be sure.

  5. The icon distracts from the content of the link. It gives the file type information even more weight. If you scan a page, you first see a lot of PDF icons, and that’s not the important thing. What matters is what the link says.

  6. *All* — thanks for the helpful feedback! I’m glad to see that the consensus is to keep the information identifying the file out of the link.
    I’m inclined to agree with Robert regarding the use of icons, plus how useful are they for file types that are less well known?
    *John* — I’m not familiar with using ‘type’ within links. I did a quick search but couldn’t find anything useful. Do you have any more information on this attribute?

  7. The best way to be clear here, in my opinion at least, is to use the acronym tag e.g.
    my link PDF (28kb).
    Always looks pretty good on the page, and makes it a little more content rich, while offering the information to those who need it in a tidy way. Use different color or underline style for links and acronyms and your readers quickly differentiate between the 2 different sorts of information.
    Came across a similar quandary recently – how to signpost an HTML multimedia presentation in a hyperlink which is using the expected .html file extension, but for a medium usually associated with PowerPoint .ppt or with OOo Impress. Not sure I quite got it right. Any old hooo, the presentation itself explores An Integrated Approach to Search Engine Optimisation (HTML presentation – opens in a new window) and uses Eric A. Meyer’s brilliant S5.

  8. Addendum – ahh, the <acronym> tag doesn’t seem to function in the example I give above – it does in a regular hypertext environment, and did in my comment preview.
    Ho hum.

  9. PDFs are relatively common on the web, and “PDF” is used to describe these files more than “Acrobat” is.
    Word and Excel documents are rare on the web. It is better to be very explicit about the file type when that type is uncommon. Therefore, I recommend (Microsoft Word document) and (Excel spreadsheet). This reflects how people talk about these files. This might seem wordy, but since very few links will be to Word and Excel documents, the verbosity is not going to be a significant issue.
    People talk about “PDF files” but they don’t talk about “DOC files” or “XLS files.” That is why (PDF) looks okay but (DOC) or (XLS) seems to be less informative without the period: (.DOC) or (.XLS).
    Most people have their computers set up to hide the .DOC and .XLS extension; (.DOC) and (.XLS) are not going to be meaningful to the majority of people.

  10. *Brian* — I’m inclined to agree with you. I would also treat “MP3” in a similar fashion to PDF.
    It’s interesting that you included “Microsoft” in relation to the Word document but not for the Excel spreadsheet. Do you think you need to include the vendor name in this information?
    For some files the abbreviation may actually be more meaningful than the full description. For example, I know what an “AAC” file is, but I had to look up what it stood for (Advanced Audio Coding, by the way).
    It would seem that there is no ‘one size fits all’ approach…

  11. I personally like having the small PDF icon after a link. But that does depend on where it occurs in the page. If the link occurs at the end of the content and there are just a few links, then I like it. If you have lots of links and they are scattered about the page then I think that can be distracting. I don’t see too much value in icons for other types of files unless you are really sure that your target audience will recognize those icons.

  12. Probably “Microsoft” isn’t necessary for Word documents but I think it is clearer to include it. The reason I left “Microsoft” off the spreadsheet label was that “spreadsheet” is so informative.
    I agree with you about MP3–more people know “MP3” than “PDF.” I would recommend “AAC music file” for AAC files.
    Adobe has a mini icon for links to Acrobat files available on their website. But it comes with a disturbingly long list of terms and conditions.

  13. I’m going to buck the trend here and say that I would probably include the file description in the link. It might be less visually pleasing for some but I think it wins out from an accessibility standpoint.
    Fully sighted users are likely to scan the line that the link is on and gather the information such as file type and size; however, for a user browsing with a screen reader, this link is a potential exit point from the flow of the page and thus any information after the link could well be missed. Thus I would provide them with the information inside the link itself.
    You guys might be interested in the Firefox add-on Link Alert, which gives an icon-based tooltip when hovering over links to files other than standard HTML pages.

  14. I’m inclined towards keeping the information within the link.
    You could wrap it in span tags and apply a class to remove the link styles, but I think it should be clickable because a lot of places do include it within the link and many people will be used to being able to click the (PDF – 200kb) bit in order to start their download.
    As others have mentioned I would stick to the 3 letter extension but use abbr tags to mark it up as well.

  15. I definitely prefer the first method, since the filesize is additional info rather than essential, most of the time. I think it’s best to clearly define what is the actual link and what is just for informational purposes. I also think it’s a good idea to precede or follow such links with a technology-specific icon, where applicable. I followed this practice almost exclusively on a recent build for my day job: http://sbimpactmag.com

  16. Including the ancillary information in the link makes the link bigger, and thus easier to click. Thus, Fitts’s law says we should include the annotations in the link.
    If the reader isn’t interested in the link–she is skipping over it–then she won’t be interested in reading the ancillary information either. By including the annotations in the link text, we make them easier to skip over.
    At the same time, the file type and size notations are overbearing. This is because they are ALL CAPS. The traditional way to handle this in print publishing is to set abbreviations like PDF and KB in small caps.
    On my weblog–which is currently offline–I use the following rules:
    font-size: .8em;
    vertical-align: .1em;
    font-family: "trebuchet ms", "arial narrow";
    letter-spacing: 1px;
    Trebuchet and Arial Narrow are both much narrower than Verdana, my body typeface. This, combined with the smaller font-size, de-emphasizes the annotations. I added one pixel of letter spacing; it greatly enhanced the legibility of the narrow, all-caps text. The parentheses become especially slight, which is ideal.

  17. Great post! I would like to implement this on my blog (as I plan on providing some cheat sheets and downloadable references in the near future) – however, it might get a little tedious to do it manually. Are you aware of any way to automate this process with any of the various popular CMSs that are out there?

  18. If the filetype and size information is useful but possibly interrupts the flow of text, why not put it in the link title – so that it appears when you hover the mouse over it, and in the status bar?

  19. Ross, people do not usually hover over links to get the title text. If they did that, then they could just look at the status bar (or tooltip in some browsers) and see the .PDF/.DOC/.XLS/.MP3 extension at the end of the link URI.
    If you remove the file size, and use some simulated small caps, you will end up with a very compact annotation that is not so obtrusive.
    I do not bother with the file size. “KB” and “MB” are jargon that the average computer user does not understand–I’ve heard many people say “I have 80 megabytes of memory” when referring to their 80 GB disks. Even technical people who know the difference between a kilobyte and a megabyte will rarely need to know the file size of a Word document. If the documents were really large (over a megabyte), then I might include an annotation like “1.3 MB”.
    There is also a problem of keeping the link annotations in sync with the linked file. Even if I make just a few small edits to a Word document, its file size may change dramatically. Having outdated, very inaccurate annotations would be worse than having no annotation at all. K.I.S.S.

  20. @seremar: yeah, you could set up some code in your CMS to automatically calculate the filesize of any PDF etc. link and output that in HTML. Basic example in PHP: first you would do a preg_match_all on the content to be filtered, looking for something like a rel of pdf, or just the file extension .pdf, then something like this:

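    // $file_location is the server path of the linked file found by the match above
    if (file_exists($file_location)) {
        $size = filesize($file_location); // size in bytes
        if ($size >= 1048576) {
            // 1 MB or more: report megabytes to one decimal place
            $size = round($size / 1048576, 1) . ' MB';
        } else {
            // Under 1 MB: report kilobytes to one decimal place
            $size = round($size / 1024, 1) . ' KB';
        }
        echo '<span>' . $size . '</span>';
    }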
    
  21. *Ross* — interesting idea (which I hadn’t thought of). However, I agree with Brian in that when someone is mousing over a link they are about to click it.
    This file type information will likely come too late in the majority of cases. I want them to be able to make a decision before they choose to click the link.
    *Brian* — I would be inclined to include the file size for all sizes of files so that users know whether the file they are about to open is big or small.
    You’re right about keeping file size info current if the file gets updated. It’s a real pain, but one that’s worth it, in my opinion.
    *Jim* — nice solution!

  22. I use the extension because some extensions aren’t software specific, e.g. JPG, PNG, etc. Also, I tend to include filesize and extension details in italics, but I’m just grateful if someone warns me before I click something and get my browser destroyed by Acrobat.

Comments are closed.