Monday, April 10, 2006
Media Type Dilution Round 2
I'm open to the suggestion that I am not fully aware of how Media Types are used in their entirety. I am, however, suggesting that there is a growing problem with the use of Media Types in determining dispatch requirements.
Functional Interpretation
Although I have not seen dispatching based on Media Types generically described in the RFCs that I reviewed, I have found a description in the W3C document "Client handling of MIME headers" [1] and in RFC3023 Section 13 (Appendix A) [2].
This seems to me to be a major use and benefit of defining appropriate Media Types, especially with respect to the assignment of Sub-types within the appropriate top-level types.
Common browsers and operating systems both allow dispatching based on Media Types, in addition to the "similar-to-dispatching" uses I mentioned previously in ATOM, RSS Media, and Autodiscovery.
Even at the operating system or browser level, this sometimes runs into problems for the user: a particular audio or video Media Container may contain a Media Encoding that cannot be decoded by the application that has been associated with the Media Type.
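As a rough sketch of this failure mode, assume a dispatcher that matches only on the Media Type string. The table contents, handler names, and encoding labels below are all hypothetical and are only meant to illustrate the mismatch, not to describe any real browser or operating system.

```python
# Hypothetical dispatch table: Media Type string -> handler name.
HANDLERS = {
    "application/ogg": "ogg_player",
    "video/mp4": "mp4_player",
}

# Hypothetical set of Media Encodings each handler can decode.
DECODERS = {
    "ogg_player": {"vorbis"},
    "mp4_player": {"mpeg4-part2", "mpeg4-part3"},
}


def dispatch(media_type):
    """Pick a handler using only the Media Type, as a fixed string match."""
    return HANDLERS.get(media_type.strip().lower())


def can_play(media_type, encodings):
    """Report whether the dispatched handler can decode every contained encoding."""
    handler = dispatch(media_type)
    if handler is None:
        return False
    return set(encodings) <= DECODERS[handler]


if __name__ == "__main__":
    # Dispatch succeeds for both files, but only the first one is playable.
    print(dispatch("application/ogg"), can_play("application/ogg", ["vorbis"]))
    print(dispatch("application/ogg"), can_play("application/ogg", ["vorbis", "theora"]))
```

In this sketch the second file is handed to the same player as the first, because the Media Type says nothing about the encodings inside the container.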
Terminology Interpretation
RFC4288 Section 4.1 "Functionality Requirement" states that "Media types MUST function as an actual media format."
The next sentence states "Registration of things that are better thought of as a transfer encoding...are not allowed." and then goes on to give base64 as an example of a transfer encoding.
It is still unclear which, if any, of the following categories (using my prior terminology) are construed as a "transfer encoding": Type, Media Container, and Media Encoding. To me, these "categories" are the logical way of distinguishing audio and video media simply because these are the broad categories determining how and if a particular file can be played on any particular system.
"Type" clearly corresponds to the top-level Media Type (audio, video, etc.). Media Container (such as Ogg, MPEG-4 Part 14, "AVI", "MOV") clearly does *not* correspond to RFC4288 Section 4.1 "Transfer Encoding" and clearly corresponds to an actual media format. So the remaining question is: do the media encodings commonly in use correspond to "transfer encodings", in which case they would not qualify as distinct MIME types? As examples of Media Encoding I offer: Vorbis (audio), Theora (video), MPEG-4 Part 2 (video), MPEG-4 Part 3 (audio), MPEG-4 Part 10 (video). A deeper and more exhaustive survey of current Media Type registrations would be useful here.
Part of the issue here may be that "transfer encoding" in the sense of RFC4288 Section 4.1 seems to refer to encodings that are "lossless," "reversible," and "universally decodable". Such is not the case with many audio and/or video encodings. Based on this interpretation, I would argue that "transfer encoding" does not cover the common audio and/or video encodings, and therefore that these audio and video encodings qualify for Media Type registration.
Possible Solution
The danger is that if every combination of Media Containers and Media Encodings were registered as distinct Media Types, the result would be an astronomical increase in the number of Media Types and it would increase the number, if not the complexity, of the dispatch rules within operating systems and browsers.
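To give a feel for the growth, here is a back-of-the-envelope count using only the containers and encodings named above. It assumes, unrealistically, that any container may hold any non-empty set of encodings, so the result is an upper bound for this small sample rather than a real figure. The short lowercase labels are my own shorthand.

```python
from itertools import combinations

# Shorthand labels for the containers and encodings mentioned in this post.
CONTAINERS = ["ogg", "mp4", "avi", "mov"]
ENCODINGS = ["vorbis", "theora", "mpeg4-part2", "mpeg4-part3", "mpeg4-part10"]


def count_combined_types(containers, encodings):
    """Count container + non-empty encoding-set pairs (order of encodings ignored)."""
    subsets = sum(
        1
        for size in range(1, len(encodings) + 1)
        for _ in combinations(encodings, size)
    )
    return len(containers) * subsets


if __name__ == "__main__":
    print(count_combined_types(CONTAINERS, ENCODINGS))
```

Each additional encoding roughly doubles the count, which is the growth I am worried about.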
It seems one way around this might be to use a "+suffix" mechanism (obviously specified such that it is backward compatible with RFC3023), or some other "extension". It could be specified that only Media Containers qualify as a Media Sub-type, and that one or more (optional) suffixes would indicate the contained media encodings. Containers that can contain or do commonly contain only audio would be registered under the audio top-level type. Containers that can contain or do commonly contain video (with synchronized audio) would be registered under the video top-level type. Media Containers that serve both functions (including audio only) would be registered under both the audio and video top-level types.
With the added "Encoding extension" mechanism, there may be complications in that current dispatching mechanisms are "fixed", meaning that the media dispatchers expect a fixed string match of Media Type to dispatchee. This does not easily permit interpretation of suffixes added in arbitrary order. To solve this problem, the order of suffixes could be programmatically determined (for example, either absolute alphabetical or alphabetical within major groups of audio, video, other). This way, only one permutation of each combination of Media Encodings would need to be added.
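A minimal sketch of the "absolute alphabetical" option follows. The string form "top-level/container+encoding+encoding" is purely my assumption for illustration; nothing like it is registered, and a real specification would have to settle how it coexists with the existing RFC3023 "+xml" convention.

```python
def canonical_media_type(top_level, container, encodings):
    """Build a type string with encoding suffixes in one fixed (alphabetical) order."""
    # Lowercase, drop duplicates, then sort so every permutation maps to one string.
    suffixes = sorted({e.strip().lower() for e in encodings})
    subtype = "+".join([container.lower()] + suffixes)
    return top_level.lower() + "/" + subtype


def normalize(media_type):
    """Re-order the suffixes of an existing (hypothetical) suffixed type string."""
    top_level, _, subtype = media_type.partition("/")
    container, *encodings = subtype.split("+")
    return canonical_media_type(top_level, container, encodings)


if __name__ == "__main__":
    # Both orderings collapse to the same key for a fixed-string dispatcher.
    print(normalize("video/ogg+vorbis+theora"))
    print(normalize("video/ogg+theora+vorbis"))
```

A dispatcher that still does a fixed string match could then key its table on the normalized form, so only one entry per combination of encodings is needed.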
Summary
I'm not sure if this is viewed as a problem by others, but it seems to be a serious issue that is beginning to be "worked around" outside of, and parallel to, the Media Type definitions in several different and incompatible ways. Stricter interpretation and differentiation of top-level types, and/or formal recognition of Media Containers and Media Encodings within the Media Type system, seem like significant and appropriate improvements to Media Type registration.
References
[1] "Client handling of MIME headers"
http://www.w3.org/2001/tag/doc/mime-respect-20030709
"The architecture of the Web depends on applications making dispatching and security decisions for resources based on their Internet Media Types and other MIME headers."
http://www.w3.org/2001/tag/doc/mime-respect.html
The current version states "For example, HTTP and MIME use the value of the "Content-Type" header field to indicate the Internet media type of the representation, which influences the dispatching of handlers and security-related decisions made by recipients of the message."
Section 3.1
A media type is not simply an indication of data format; it also refers to a preferred interpretation of that data format. This preferred interpretation may impact the recipient's functional decisions, such as whether the data is rendered, stored, or executed. In practice, media types are often used as the key for selecting an appropriate handler to interpret the data received. It is possible for a single data format to be associated with multiple media types and for a single media type to describe a superset of many different data formats.
--- end excerpt
[2] http://www.ietf.org/rfc/rfc3023.txt