What is MIME type "application/warc-fields"?

A MIME type is a string that tells browsers and other tools how to handle a particular kind of file.

application/warc-fields is a MIME type that indicates a block of structured metadata from a web archive. It defines a format of named fields, written as "name: value" lines, that describe an archived record or the archive itself.


Within a WARC file, this MIME type typically labels the content block of warcinfo and metadata records, which store details such as the crawler software, the operator, and the archive format. Its structured format lets different programs read and process archive data consistently.
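
As an illustration of how a program might read this format, the sketch below parses a block of "name: value" lines into pairs. The function name parseWarcFields and the sample values are assumptions made for this example, and the parser is deliberately simplified: it does not handle folded (continuation) lines.

```javascript
// Parse an application/warc-fields block into a list of [name, value] pairs.
// Simplified sketch: continuation (folded) lines are not handled.
function parseWarcFields(text) {
  const fields = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.trim() === '') continue; // skip blank lines
    const colon = line.indexOf(':');
    if (colon === -1) continue; // not a "name: value" line
    const name = line.slice(0, colon).trim();
    const value = line.slice(colon + 1).trim();
    fields.push([name, value]);
  }
  return fields;
}

// Hypothetical content block; the values are made up for illustration.
const sample =
  'software: ExampleCrawler/1.0\r\n' +
  'format: WARC File Format 1.1\r\n' +
  'operator: Example Archive Team\r\n';

console.log(parseWarcFields(sample));
```

Only the first colon on each line is treated as the separator, so values that contain colons themselves, such as URLs, are kept whole.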



Blocks of this type are found inside files with the extensions .warc and .warc.gz; the archive file as a whole is usually identified as application/warc. These file types are common in web crawling and digital preservation projects.


More details are available via the WARC Specifications reference.

Associated file extensions

Usage Examples

HTTP Header

When serving content with this MIME type, set the Content-Type header:


    Content-Type: application/warc-fields    
  

HTML

In HTML, you can specify the MIME type in various elements:


    <a href="file.dat" type="application/warc-fields">Download file</a>    
  

Server-side (Node.js)

Setting the Content-Type header in Node.js:


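    const http = require('http');

    // Minimal server that labels every response as application/warc-fields.
    http.createServer((req, res) => {
      res.setHeader('Content-Type', 'application/warc-fields');
      // Placeholder body; a real response would contain "name: value" lines.
      res.end('Content here');
    }).listen(3000);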
  

FAQs

What is the specific purpose of the application/warc-fields MIME type?

The application/warc-fields MIME type identifies a block of structured named fields within a Web ARChive (WARC) record. It is used for descriptive metadata, such as the crawler software, the operator, and the format of the archive, so that software can read this information without parsing the captured payloads.

How does application/warc-fields differ from application/warc?

While application/warc typically refers to the entire WARC container file, application/warc-fields denotes the format of the named-field blocks carried by metadata or warcinfo records inside that container. In many server configurations, however, only the general application/warc type is mapped to the file extension.

Can web browsers natively open files with this MIME type?

No, standard web browsers like Chrome or Firefox cannot render application/warc-fields data directly. To view these files, you need specialized replay tools like ReplayWeb.page, OpenWayback, or the Wayback Machine software.

How do I configure Apache to serve WARC files with this type?

To associate the type with .warc files, add the following line to your .htaccess or httpd.conf file: AddType application/warc-fields .warc. For compressed files, you might use AddType application/warc-fields .warc.gz alongside appropriate content-encoding headers.

How do I set up Nginx to handle application/warc-fields?

In your mime.types file, or in the types { } block of nginx.conf, add the entry: application/warc-fields warc;. If you are serving compressed archives, ensure your server is configured to handle the .gz extension correctly, often by setting the Content-Encoding header to gzip.

What software creates files with the application/warc-fields type?

This format is generated by web crawling tools used for digital preservation, such as Heritrix, Wget (with the --warc-file option), and Browsertrix. These tools capture web content and generate the necessary header fields to catalog the data.

Is application/warc-fields text-based or binary?

The fields themselves are text-based, similar to HTTP headers, but they are often contained within binary-safe WARC files or compressed .warc.gz archives. You should generally avoid editing these files in standard text editors, because changing the text invalidates the byte count stored in each record's Content-Length field.
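
To see why byte counts matter, the following sketch builds a small block of fields and compares its character count with its byte count. The helper names buildWarcFields and warcFieldsByteLength, and the field values, are hypothetical and chosen only for this example; UTF-8 encoding is assumed.

```javascript
// Build a block of fields from name/value pairs, one "name: value" line each.
function buildWarcFields(pairs) {
  return pairs.map(([name, value]) => `${name}: ${value}\r\n`).join('');
}

// A length header must count bytes, not characters, so use Buffer.byteLength.
function warcFieldsByteLength(block) {
  return Buffer.byteLength(block, 'utf8');
}

const block = buildWarcFields([
  ['software', 'ExampleCrawler/1.0'],   // hypothetical crawler name
  ['description', 'Café menu capture'], // non-ASCII characters use several bytes
]);

console.log(block.length);                // number of UTF-16 code units
console.log(warcFieldsByteLength(block)); // number of bytes to record
```

Because the two numbers differ as soon as a non-ASCII character appears, an edit that looks harmless in a text editor can leave the stored length out of step with the actual content.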

General FAQ

What is a MIME type?

A MIME (Multipurpose Internet Mail Extensions) type, also called a media type, is a standardized label that indicates the nature and format of a document, file, or assortment of bytes. The rules for defining and registering MIME types are set out in IETF's RFC 6838.

MIME types are important because they help browsers and servers understand how to process a file. When a browser receives a file from a server, it uses the MIME type to determine how to display or handle the content, whether it's an image to display, a PDF to open in a viewer, or a video to play.

MIME types consist of a type and a subtype, separated by a slash (e.g., text/html, image/jpeg, application/pdf). Some MIME types also include optional parameters.
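
The type, subtype, and parameter structure can be shown with a short sketch. The function name parseMimeType is an assumption for this example, and the parsing is simplified: it does not handle quoted parameter values that contain a semicolon.

```javascript
// Split a MIME type string into type, subtype, and optional parameters.
// Simplified sketch: quoted parameter values containing ";" are not handled.
function parseMimeType(input) {
  const [essence, ...paramParts] = input.split(';');
  const [type, subtype] = essence.trim().toLowerCase().split('/');
  if (!type || !subtype) {
    throw new Error(`Not a valid MIME type: ${input}`);
  }
  const parameters = {};
  for (const part of paramParts) {
    const eq = part.indexOf('=');
    if (eq === -1) continue; // ignore malformed parameters
    const name = part.slice(0, eq).trim().toLowerCase();
    const value = part.slice(eq + 1).trim().replace(/^"|"$/g, '');
    parameters[name] = value;
  }
  return { type, subtype, parameters };
}

console.log(parseMimeType('application/warc-fields'));
console.log(parseMimeType('text/html; charset=UTF-8'));
```

Type and subtype are lowercased because they are case-insensitive, while parameter values are left as written.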

How do I find the MIME type for a file?

You can check the file extension or use a file identification tool such as file --mime-type on the command line. Many programming languages also provide libraries to detect MIME types.
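
Checking the file extension can be as simple as a lookup table. The sketch below uses a small hand-written table and a hypothetical guessMimeType function; a real project would normally rely on a fuller database or on content-based detection.

```javascript
// Small illustrative lookup table; it is not a complete MIME database.
const MIME_BY_EXTENSION = {
  '.html': 'text/html',
  '.jpg': 'image/jpeg',
  '.pdf': 'application/pdf',
  '.warc': 'application/warc',
};

// Guess a MIME type from the file name's extension, with a generic fallback.
function guessMimeType(fileName) {
  const dot = fileName.lastIndexOf('.');
  const ext = dot === -1 ? '' : fileName.slice(dot).toLowerCase();
  return MIME_BY_EXTENSION[ext] || 'application/octet-stream';
}

console.log(guessMimeType('crawl-2024.warc'));
console.log(guessMimeType('README'));
```

An extension is only a hint, since a file can be renamed freely, which is why tools that inspect the file's contents tend to be more reliable.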

Why are multiple MIME types listed for one extension?

Different applications and historical conventions may use alternative MIME identifiers for the same kind of file. Showing them all helps ensure compatibility across systems.