What is MIME type "application/warc"?

A MIME type is a string that tells browsers and other tools how to handle a particular kind of file.

application/warc is a MIME type for web archive files. It designates containers that store archived web content, including copies of web pages, images, and metadata.

This format records complete HTTP exchanges (requests, responses, and their headers) to capture a website’s state at a given time. It is widely used in digital preservation and in research projects that analyze historical versions of web content.

WARC files are commonly distributed either uncompressed, with the .warc extension, or gzip-compressed, with the .warc.gz extension.

For detailed technical specifications, see the WARC File Specifications.

Associated file extensions

Usage Examples

HTTP Header

When serving content with this MIME type, set the Content-Type header:


    Content-Type: application/warc    
  

HTML

In HTML, the type attribute on elements such as <a> gives the browser a hint about the MIME type of the linked resource:


    <a href="archive.warc" type="application/warc">Download file</a>
  

Server-side (Node.js)

Setting the Content-Type header in Node.js:


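    const http = require('http');

    // Minimal server: every response is labelled as a WARC file.
    http.createServer((req, res) => {
      res.setHeader('Content-Type', 'application/warc');
      // Placeholder body; a real server would send the WARC file's bytes.
      res.end('Content here');
    }).listen(3000);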
  

FAQs

What is the application/warc MIME type used for?

The application/warc MIME type designates the Web ARChive format, the international standard (ISO 28500) for preserving web content. Unlike a simple HTML save, this format captures the entire HTTP exchange: request headers, response headers, and the payload (images, scripts, text). The result is a faithful historical snapshot of what the server returned at capture time.
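To make the record structure more concrete, here is a minimal Node.js sketch that reads the header of one uncompressed WARC record. It assumes the usual layout of a WARC/ version line, "Name: value" fields, and a blank line before the content block. The function name parseWarcHeader and the abbreviated sample record are invented for illustration; the sample omits fields a real record would carry, and the sketch ignores folded header lines.

```javascript
// Parse the header of a single uncompressed WARC record held in a string.
// Assumed layout: version line, "Name: value" fields, then CRLF CRLF.
function parseWarcHeader(recordText) {
  const headerEnd = recordText.indexOf('\r\n\r\n');
  if (headerEnd === -1) {
    throw new Error('No end-of-header marker (CRLF CRLF) found');
  }
  const [versionLine, ...fieldLines] = recordText.slice(0, headerEnd).split('\r\n');
  if (!versionLine.startsWith('WARC/')) {
    throw new Error('Not a WARC record: missing WARC/ version line');
  }
  const fields = {};
  for (const line of fieldLines) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip malformed lines in this sketch
    // Field names are stored lower-cased so lookups are case-insensitive.
    fields[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
  }
  // bodyOffset is a string index; it equals the byte offset only for ASCII headers.
  return { version: versionLine.slice('WARC/'.length), fields, bodyOffset: headerEnd + 4 };
}

// Hypothetical, abbreviated sample record (not taken from a real crawl).
const sampleRecord = [
  'WARC/1.0',
  'WARC-Type: response',
  'WARC-Target-URI: https://example.com/',
  'Content-Type: application/http; msgtype=response',
  'Content-Length: 19',
  '',
  'HTTP/1.1 200 OK\r\n\r\n',
].join('\r\n');

console.log(parseWarcHeader(sampleRecord).fields['warc-type']); // prints "response"
```

The content block that follows the header holds the captured HTTP message itself, which is why a replay tool can reconstruct the original response.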

How do I open a .warc file?

Standard web browsers like Chrome or Firefox cannot render .warc files natively. To view the archived content, you must use specialized replay software such as ReplayWeb.page or the Webrecorder Player, which emulate the original network environment to display the pages correctly.

How do I configure Apache or Nginx to serve WARC files?

To ensure browsers and tools identify the file correctly, you must update your MIME configuration. For Apache, add AddType application/warc .warc to your configuration or .htaccess file. For Nginx, add application/warc warc; to your mime.types file or within the types block.

Can I create WARC files using command-line tools?

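Yes. GNU Wget can write WARC output natively. For example, the command wget --mirror --warc-file=myarchive https://example.com crawls a website and saves the results directly into an application/warc container. By default Wget gzip-compresses its WARC output, so the file is named myarchive.warc.gz; pass --no-warc-compression to get an uncompressed myarchive.warc instead.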

What is the difference between .warc and .warc.gz?

A .warc file contains uncompressed archive data, while .warc.gz is the same data compressed using the Gzip algorithm. While the underlying content type remains application/warc, the compressed version is standard for storage and transfer; most replay tools can read .warc.gz files without needing manual decompression.
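As a rough illustration of how a tool might accept both forms, the Node.js sketch below checks for the two gzip magic bytes (0x1f 0x8b) and decompresses only when they are present. The helper names isGzip and readWarcBytes and the in-memory sample data are hypothetical. It also loads everything into memory, which is an assumption that only suits small files; large archives would call for streaming.

```javascript
const zlib = require('zlib');

// Gzip data starts with the two magic bytes 0x1f 0x8b.
function isGzip(buffer) {
  return buffer.length >= 2 && buffer[0] === 0x1f && buffer[1] === 0x8b;
}

// Return uncompressed WARC bytes whether or not the input was gzipped.
function readWarcBytes(buffer) {
  return isGzip(buffer) ? zlib.gunzipSync(buffer) : buffer;
}

// Hypothetical in-memory data standing in for the contents of a file.
const plainWarc = Buffer.from(
  'WARC/1.0\r\nWARC-Type: warcinfo\r\nContent-Length: 0\r\n\r\n\r\n\r\n'
);
const gzippedWarc = zlib.gzipSync(plainWarc);

// Both inputs yield the same uncompressed bytes.
console.log(readWarcBytes(plainWarc).equals(readWarcBytes(gzippedWarc))); // prints true
```

Checking the leading bytes rather than the file extension means a misnamed file is still handled correctly.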

Are there security risks associated with WARC files?

Potentially, yes. Since a WARC file captures the exact state of a website, it can also capture malicious scripts or malware present on that site at the time of archiving. When replaying a file, the archived JavaScript executes in your browser, so you should only open archives from trusted sources or use sandboxed viewers.

How does application/warc differ from the older ARC format?

The WARC format is a more flexible successor to the legacy ARC format used by the Internet Archive in the 1990s. While ARC files only stored the response content, application/warc stores both the request and response headers, handles duplicate records more efficiently, and supports arbitrary metadata, making it better suited for modern digital preservation.

General FAQ

What is a MIME type?

A MIME (Multipurpose Internet Mail Extensions) type, also called a media type, is a standardized label that indicates the nature and format of a document, file, or assortment of bytes. The rules for defining and registering media types are set out in IETF's RFC 6838.

MIME types are important because they help browsers and servers understand how to process a file. When a browser receives a file from a server, it uses the MIME type to determine how to display or handle the content, whether it's an image to display, a PDF to open in a viewer, or a video to play.

MIME types consist of a type and a subtype, separated by a slash (e.g., text/html, image/jpeg, application/pdf). Some MIME types also include optional parameters.
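The type/subtype structure can be sketched as a small parser. This is a simplified illustration, not a full implementation of the media type grammar: the function name parseMimeType is hypothetical, and quoted parameter values that themselves contain semicolons are not handled.

```javascript
// Split a MIME type string into type, subtype, and optional parameters.
function parseMimeType(value) {
  const [essence, ...paramParts] = value.split(';');
  const [type, subtype] = essence.trim().toLowerCase().split('/');
  if (!type || !subtype) {
    throw new Error(`Not a valid MIME type: ${value}`);
  }
  const parameters = {};
  for (const part of paramParts) {
    const eq = part.indexOf('=');
    if (eq === -1) continue; // ignore parameters without a value
    parameters[part.slice(0, eq).trim().toLowerCase()] = part.slice(eq + 1).trim();
  }
  return { type, subtype, parameters };
}

console.log(parseMimeType('application/warc'));
console.log(parseMimeType('text/html; charset=utf-8'));
```

For application/warc the parameters object is empty; for the second call it holds the charset parameter.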

How do I find the MIME type for a file?

You can check the file extension or use a file identification tool such as file --mime-type on the command line. Many programming languages also provide libraries to detect MIME types.
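An extension-based check can be approximated in a few lines of Node.js. The table below is a tiny hand-written sample, and the names MIME_BY_EXTENSION and guessMimeType are invented for this sketch. Falling back to application/octet-stream for unknown extensions is an assumption of the sketch, not a rule from this page.

```javascript
const path = require('path');

// Small illustrative lookup table; real tools ship far larger ones.
const MIME_BY_EXTENSION = {
  '.warc': 'application/warc',
  '.html': 'text/html',
  '.jpg': 'image/jpeg',
  '.pdf': 'application/pdf',
};

// Guess a MIME type from the final extension of a filename.
function guessMimeType(filename) {
  const ext = path.extname(filename).toLowerCase();
  return MIME_BY_EXTENSION[ext] || 'application/octet-stream';
}

console.log(guessMimeType('crawl-2024.warc')); // prints "application/warc"
console.log(guessMimeType('notes.unknown'));   // prints "application/octet-stream"
```

An extension is only a hint, so a content-based tool such as file --mime-type is more reliable when a file may be misnamed.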

Why are multiple MIME types listed for one extension?

Different applications and historical conventions may use alternative MIME identifiers for the same kind of file. Showing them all helps ensure compatibility across systems.