Copyright (c) CDDB, Inc.
@(#)cddb.howto 1.27 98/12/09
In this document:
- WHAT IS THE CDDB
- CDDB USE RESTRICTIONS
- TWO FORMS OF ACCESS TO THE CDDB
- CDDB DISCID
- REMOTE CDDB ACCESS
- CDDB SUBMISSION
- QUESTIONS?
- APPENDIX A - CDDB DISCID ALGORITHM
- APPENDIX B - CDDB FILE FORMAT
- APPENDIX C - CDDB SERVER PROTOCOL
- APPENDIX D - OFFICIAL CDDB SOFTWARE DISTRIBUTION SITES
CDDB DISCID
-----------
Both forms of CDDB access require that the software compute a "disc
ID" which is an identifier that is used to access the CDDB. The disc
ID is an 8-digit hexadecimal (base-16) number, computed using data from
a CD's Table-of-Contents (TOC) in MSF (Minute Second Frame) form. The
algorithm is listed below in Appendix A.
It is crucial that your software compute the disc ID correctly. If it
does not generate the correct disc ID, it will not be compatible with CDDB.
Moreover, if your software submits CDDB entries with bad disc IDs to the
CDDB archives, it could compromise the integrity of the CDDB.
We suggest installing one of the disc ID checker programs listed on the
CDDB web page at http://www.cddb.com/downloads, and then testing the disc
ID code in your software by comparing the disc ID generated by the program
with that of your software for as large a number of CDs as possible. Bugs
in disc ID calculation can be subtle, and history shows that it sometimes
takes hundreds of discs to find problems.
REMOTE CDDB ACCESS
------------------
In order to perform remote access of CDDB servers, your software must be
able to communicate with a remote CD server system via HTTP. There are a
number of public CDDB servers operating on the Internet. The current list
of public servers may be obtained programmatically via the CDDB protocol
"sites" command. The permanent server site, cddb.cddb.com, has been
established in order to provide a reliable source of server site information
via the "sites" command. This address may be safely hard-wired into client
software for this purpose, as it is guaranteed to exist on a permanent basis.
Furthermore, the "cddb.cgi" program is guaranteed to always reside at the
following path: /~cddb/cddb.cgi
Thus, the URL for accessing the server at cddb.cddb.com is:
http://cddb.cddb.com/~cddb/cddb.cgi
You should make the CDDB server host (or hosts) user-selectable in your
software. DO NOT hard-wire the list of CD database servers into your code.
The list of active servers changes over time.
The CDDB server protocol is described below in Appendix C.
The CDDB entry returned from the server via a "cddb read" command is in
the format described in Appendix B below.
Some additional notes for accessing CDDB over the Internet:
Your application should always specify the highest documented protocol
level in the "proto=" field of the HTTP command. The highest level currently
specified is "4". Lower protocol levels will work, but are only provided
for compatibility with older CDDB applications. If you do not use the
highest available protocol level, certain important features will not be
available to your application.
Make sure to use the proper arguments in the "hello=" command. The user
and hostname arguments should be that of the user's email address, not
some fixed hard-coded value. The application name and version should be
that of your application, not that of another existing application.
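
As a rough illustration of the two notes above, the following C sketch
assembles a request URL carrying the "hello=" and "proto=" fields. The
"cmd=" parameter name, the "&" separators and the use of "+" between words
are assumptions that are not spelled out in this document, and
cddb_build_url and its parameters are hypothetical names.

```c
#include <stdio.h>

/* Builds the request URL for one CDDB command sent over HTTP.
 * ASSUMPTION: the "cmd=...&hello=...&proto=..." layout and the '+' word
 * separator are not defined in this document.
 * `cmd` must already have its words joined with '+' (e.g. "cddb+lscat").
 * `user` and `userhost` come from the user's email address; `app` and
 * `version` identify your own application.
 * Returns 0 on success, -1 if the buffer is too small. */
int cddb_build_url(char *buf, size_t buflen, const char *host,
                   const char *cmd, const char *user, const char *userhost,
                   const char *app, const char *version)
{
    int n = snprintf(buf, buflen,
                     "http://%s/~cddb/cddb.cgi?cmd=%s&hello=%s+%s+%s+%s&proto=4",
                     host, cmd, user, userhost, app, version);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The protocol level is fixed at 4 here because that is the highest level this
document specifies; a real client would also need to escape any characters
that are not legal in a URL.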
We consider the use of the "cddb query" command mandatory for all CDDB
clients. It is not valid to issue a "cddb read" command without issuing
a prior "cddb query" and receiving a good response, as it may yield incorrect
results. In addition, it is required that clients support close matches
(aka "fuzzy" matches, or response code 211) and multiple exact matches
(response code 210) in response to a query.
The proper way to handle multiple exact/fuzzy matches is to present the
entire list of matches to the user and to let the user choose between them.
Matches are listed in the order of best fit for the user's disc, so they
should be presented to the user in the order they are listed by the server.
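
The query handling described above might be sketched in C as follows. The
response codes come from this document; the enum and function names are
hypothetical, and the sketch only classifies the first line of a response.

```c
#include <ctype.h>

enum cddb_query_result {
    CDDB_EXACT,        /* 200: one exact match on the same line       */
    CDDB_MULTI_EXACT,  /* 210: list of exact matches follows          */
    CDDB_INEXACT,      /* 211: list of close ("fuzzy") matches follows */
    CDDB_NO_MATCH,     /* 202: nothing found                          */
    CDDB_ERROR         /* anything else, or a malformed line          */
};

/* Classifies the first line of a "cddb query" response by its
 * 3-digit response code. */
enum cddb_query_result cddb_classify_query(const char *line)
{
    int code;

    /* Short-circuit evaluation stops at the string terminator. */
    if (!isdigit((unsigned char)line[0]) ||
        !isdigit((unsigned char)line[1]) ||
        !isdigit((unsigned char)line[2]))
        return CDDB_ERROR;

    code = (line[0] - '0') * 100 + (line[1] - '0') * 10 + (line[2] - '0');
    switch (code) {
    case 200: return CDDB_EXACT;
    case 210: return CDDB_MULTI_EXACT;
    case 211: return CDDB_INEXACT;
    case 202: return CDDB_NO_MATCH;
    default:  return CDDB_ERROR;
    }
}
```

For CDDB_MULTI_EXACT and CDDB_INEXACT the client would go on to read the
match list and show it to the user in server order before issuing
"cddb read".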
The suggested algorithm for obtaining the list of server sites is
as follows. The application should attempt to get the list from
cddb.cddb.com with the "sites" command the first time the user runs the
program. After the initial download of the site list, the application
should periodically attempt to download the site list, or at least
provide the user with some method of downloading the list on-demand.
Should the user be unable to subsequently download the list of sites
due to temporary network perturbation, the application should attempt
to download the site list from one of the sites in its current list. All
of the official CDDB server sites will contain a valid list of servers,
though cddb.cddb.com is the only site which is guaranteed to always exist.
We do strongly suggest that you provide your users with the capability of
choosing CDDB server sites as described above. However, for some applications
this may not be feasible. If you do not wish to offer this functionality,
you may safely hard-code "cddb.cddb.com" in your application as the sole
CDDB site to access. This will deprive your users of the option to choose
a site near their locale for optimal response, but that is your choice.
PLEASE NOTE: older versions of the CDDB specification describe two methods
of accessing the CDDB servers: HTTP mode and CDDBP mode. CDDBP mode is
being deprecated in favor of HTTP mode, so new applications should be sure
to only implement the HTTP mode of access. All text describing CDDBP
mode has been removed from this document.
CDDB SUBMISSION
---------------
Your software may allow users to enter CDDB data and then submit it to the
CDDB archives. The method of submission is to transmit the entry to the
database through a CGI program at the following URL:
http://hostname.cddb.com/~cddb/submit.cgi
where "hostname.cddb.com" is one of the hosts listed by the CDDB server
"sites" command, or cddb.cddb.com itself.
Submissions are made through the CGI program as follows. You must only use
the "POST" method of sending data; "GET" is not supported. There are several
HTTP "Entity-Header" fields that must be included in the data followed by a
blank line, followed by the "Entity-Body" (a.k.a. the CDDB entry) in the
format described in Appendix B below. The required header fields are:
Category: CDDB_category
Discid: CDDB_discid
User-Email: user@domain
Submit-Mode: test_or_submit
Content-Length: length_of_CDDB_entry
Where:
- "CDDB_category" is one of the valid CDDB categories listed by the CDDB
server "cddb lscat" command. Invalid categories will result in the entry
being rejected.
- "CDDB_discid" is the 8-digit hex CDDB disc ID of the entry as described in
the "CDDB Discid" section above. This must be the same disc ID that appears
in the "DISCID=" section of the entry being submitted. If not, the entry
will be rejected.
- "user@domain" is the valid email address of the user submitting the entry.
This is required in case a submission failure notice must be sent to the
user.
- "test_or_submit" is the word "test" or "submit" (without the surrounding
quotes) to indicate whether the submission is a test submission or a real
submission to the database, respectively. See below for an explanation of
test submissions.
- "length_of_CDDB_entry" is the size in bytes of the CDDB entry being
submitted. This number does not include the length of the header or the
blank line separating the HTTP header and the CDDB entry.
There are several additional optional HTTP header fields that may also
be specified:
Charset: character_set_of_CDDB_entry
X-Cddbd-Note: message for user
Where:
- "character_set_of_CDDB_entry" is one of ISO-8859-1 or US-ASCII (lower case
may be used if desired). This specifies to the CDDB server which character
set the CDDB entry has been encoded in. If your application knows the
user's character set, then you should specify it here. Only these two
character sets are supported currently. DO NOT specify the character set
if your application does not have any way of verifying the user's character
set (i.e. do not guess; it's better not to specify it at all).
- "message for user" is an arbitrary message to be included at the top of
any rejection notice that may be sent to the submitting user.
An example submission showing the HTTP command, "Entity-Header" and "Entity-
Body" follows:
POST /~cddb/submit.cgi HTTP/1.0
Category: rock
Discid: 2a09310a
User-Email: joe@joeshost.joesdomain.com
Submit-Mode: submit
Charset: ISO-8859-1
X-Cddbd-Note: Problems with Super CD Player? Send email to support@supercd.com.
Content-Length: 820

# xmcd
# Copyright (c) 1998 CDDB Inc.
#
# Track frame offsets:
[ data omitted in this example for brevity ]
PLAYORDER=
Note the blank line between the "Content-Length" header field and the
"# xmcd" which marks the beginning of the CDDB entry.
When your application submits an entry through the CGI program, it will
respond with a 3-digit response code indicating whether or not the entry has
been forwarded to the CDDB server for inclusion in the database, followed
by a textual description of the response code. For example:
200 OK, submission has been sent.
400 Internal error: failed to forward submission.
500 Missing required header information.
These are but a few of the possible responses. See the description of the
CDDB server protocol in Appendix C for more information on handling response
codes.
The body of the CDDB entry being submitted should be sent verbatim as
described in Appendix B. DO NOT encode the data in any way before transmitting
it; data must be sent as raw text. For example, Windows programmers should not
use the Windows URL encode function prior to calling the submit CGI program.
Doing so may lead to corrupt data being sent and also possibly to rejected
submissions.
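
A minimal C sketch of assembling such a submission is shown here. It writes
the required header fields, the blank line, and then the entry exactly as
given, with no encoding applied. The function name is hypothetical, and the
CR/LF header line endings are an assumption based on usual HTTP practice
rather than on this document.

```c
#include <stdio.h>
#include <string.h>

/* Writes a complete submission request into buf: request line, the
 * required header fields, a blank line, then the raw CDDB entry.
 * ASSUMPTION: header lines end in CR/LF, as is usual for HTTP.
 * Content-Length counts only the bytes of the entry itself.
 * Returns the request length, or -1 if the buffer is too small. */
int cddb_build_submission(char *buf, size_t buflen, const char *category,
                          const char *discid, const char *email,
                          const char *mode, const char *entry)
{
    int n = snprintf(buf, buflen,
                     "POST /~cddb/submit.cgi HTTP/1.0\r\n"
                     "Category: %s\r\n"
                     "Discid: %s\r\n"
                     "User-Email: %s\r\n"
                     "Submit-Mode: %s\r\n"
                     "Content-Length: %lu\r\n"
                     "\r\n"
                     "%s",
                     category, discid, email, mode,
                     (unsigned long)strlen(entry), entry);

    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

Passing "test" as the mode keeps the entry out of the database while the
client is still being developed.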
You may implement a button or somesuch in your software's user interface
to initiate submissions. Rejected submissions are automatically returned
via email to the sender specified in the "User-Email" header field with an
explanation of the reason for the rejection.
Please do not allow a user to submit CD database entries that have
completely unfilled contents (i.e., blank information in the disc
artist/title as well as the track titles). Please design your client
with this in mind. An example minimum requirement that a CD player client
should meet is listed below:
1. Don't allow the "send" or "submit" feature to be activated if
the CD database information form is not edited at all.
2. Check that the disc artist/title contains something (that the user
typed in).
3. Don't submit a default string if a field is not filled in
(e.g. If track 3 is not filled in, submit a blank "TTITLE3=" line.)
If you must use a default string, please use "track N" where N
is the track number.
Before you release your software, please be sure that it produces
submissions that adhere to the CDDB file format, and that the frame
offset, disc length, and disc ID information are correctly computed.
For testing, please make your software send submissions with the
"Submit-Mode" HTTP header field set to "test".
CDDB submissions sent in test mode will be sanity-checked by the CDDB server
and pass/fail confirmation sent back to the submitter, but will not actually
be deposited in the CD database. Please DO NOT send submissions in "submit"
mode until your application has been approved by the CDDB support group.
When you feel your application is ready to support submissions, please contact
us at support@cddb.com. We will provide you with our qualification
procedure, which involves submitting a number of entries of different types.
Once qualified, your application will be permitted to submit to the database.
QUESTIONS?
----------
Please send any questions or comments to support@cddb.com.
APPENDIX A - CDDB DISCID ALGORITHM
----------------------------------
The following is a C code example that illustrates how to generate the
CDDB disc ID. Examples in other programming languages may be found on
the CDDB web site at http://www.cddb.com/downloads. A text description
of the algorithm follows, which should contain the necessary information
to code the algorithm in any programming language.

#include <stdio.h>

struct toc {
    int min;
    int sec;
    int frame;
};

/* Entries 0 .. tot_trks-1 are the tracks; entry tot_trks is the lead-out. */
struct toc cdtoc[100];

int
read_cdtoc_from_drive(void)
{
    int tot_trks = 0;

    /* Do whatever is appropriate to read the TOC of the CD
     * into the cdtoc[] structure array, and set tot_trks to
     * the number of tracks found.
     */
    return (tot_trks);
}

int
cddb_sum(int n)
{
    int ret;

    /* For backward compatibility this algorithm must not change */
    ret = 0;
    while (n > 0) {
        ret = ret + (n % 10);
        n = n / 10;
    }
    return (ret);
}

unsigned long
cddb_discid(int tot_trks)
{
    int i,
        t = 0,
        n = 0;

    /* For backward compatibility this algorithm must not change */
    i = 0;
    while (i < tot_trks) {
        n = n + cddb_sum((cdtoc[i].min * 60) + cdtoc[i].sec);
        i++;
    }
    t = ((cdtoc[tot_trks].min * 60) + cdtoc[tot_trks].sec) -
        ((cdtoc[0].min * 60) + cdtoc[0].sec);

    /* Casts keep the shifts out of signed-overflow territory. */
    return ((unsigned long)(n % 0xff) << 24 | (unsigned long)t << 8 | tot_trks);
}

int
main(void)
{
    int tot_trks;

    tot_trks = read_cdtoc_from_drive();
    printf("The discid is %08lx\n", cddb_discid(tot_trks));
    return (0);
}

This code assumes that your compiler and architecture support 32-bit
integers.
The cddb_discid function computes the discid based on the CD's TOC data
in MSF form. The frames are ignored for this purpose. The function is
passed a parameter of tot_trks (which is the total number of tracks on
the CD), and returns the discid integer number.
It is assumed that cdtoc[] is an array of data structures (records)
containing the fields min, sec and frame, which are the minute, second
and frame offsets (the starting location) of each track. This
information is read from the TOC of the CD. There are actually
tot_trks + 1 "active" elements in the array, the last one being the
offset of the lead-out (also known as track 0xAA).
The function loops through each track in the TOC, and for each track
it takes the (M * 60) + S (total offset in seconds) of the track and
feeds it to cddb_sum() function, which simply adds the value of each digit
in the decimal string representation of the number. A running sum of this
result for each track is kept in the variable n.
At the end of the loop:
1. t is calculated by subtracting the (M * 60) + S offset of the first
track from the (M * 60) + S offset of the lead-out (yielding the length
of the disc in seconds).
2. The result of (n modulo FFh) is left-shifted by 24 bits.
3. t is left shifted by 8.
The bitwise OR of results 2 and 3 and the tot_trks number is
used as the discid.
The discid is represented in hexadecimal form for the purpose of
xmcd cddb file names and the DISCID= field in the xmcd cddb file itself.
If the hexadecimal string is less than 8 characters long, it is
zero-padded to 8 characters (e.g., 3a8f07 becomes 003a8f07). All
alpha characters in the string should be in lower case, where
applicable.
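
The algorithm and the formatting rule can be tried on a small made-up TOC
with the C sketch below. It is a variant that takes frame offsets instead of
MSF fields, on the assumption that an offset in frames divided by 75 (with
the remainder discarded) equals (M * 60) + S. The function names and the
sample values are hypothetical.

```c
#include <stdio.h>

/* Adds up the decimal digits of n, as cddb_sum() does. */
static int digit_sum(int n)
{
    int ret = 0;

    while (n > 0) {
        ret += n % 10;
        n /= 10;
    }
    return ret;
}

/* Computes the disc ID from per-track frame offsets and the lead-out
 * frame offset. ASSUMPTION: offset / 75 equals (M * 60) + S. */
unsigned long discid_from_frames(const int *offsets, int tot_trks, int leadout)
{
    int i, t, n = 0;

    for (i = 0; i < tot_trks; i++)
        n += digit_sum(offsets[i] / 75);
    t = leadout / 75 - offsets[0] / 75;

    return ((unsigned long)(n % 0xff) << 24) |
           ((unsigned long)t << 8) |
           (unsigned long)tot_trks;
}

/* Formats the disc ID as 8 lower-case hex digits, zero-padded. */
void discid_to_string(unsigned long id, char out[9])
{
    snprintf(out, 9, "%08lx", id & 0xffffffffUL);
}
```

For three tracks at frame offsets 150, 15000 and 30000 with the lead-out at
45000, the digit sums are 2 + 2 + 4 = 8 and the length is 600 - 2 = 598
seconds (256h), so the disc ID is 08025603.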
Important note for clients using the MS-Windows MCI interface:
The Windows MCI interface does not provide the MSF location of the
lead-out. Thus, you must compute the lead-out location by taking the
starting position of the last track and adding the length of the last track
to it. However, the MCI interface returns the length of the last track
as ONE FRAME SHORT of the actual length found in the CD's TOC. In most
cases this does not affect the disc ID generated, because we truncate
the frame count when computing the disc ID anyway. However, if the
lead-out track has an actual frame count of 0, the computed quantity
(based on the MSF data returned from the MCI interface) would result in
the seconds being one short and the frame count being 74. For example,
a CD with the last track at an offset of 48m 32s 12f and having a
track length of 2m 50s 63f has a lead-out offset of 51m 23s 0f. Windows
MCI incorrectly reports the length as 2m 50s 62f, which would yield a
lead-out offset of 51m 22s 74f, which causes the resulting truncated
disc length to be off by one second. This will cause an incorrect disc
ID to be generated. You should thus add one frame to the length of the
last track when computing the location of the lead-out.
The easiest way for Windows clients to compute the lead-out given information
in MSF format is like this:
(offset_minutes * 60 * 75) + (offset_seconds * 75) + offset_frames +
(length_minutes * 60 * 75) + (length_seconds * 75) + length_frames + 1 = X
Where X is the offset of the lead-out in frames. To find the lead-out in
seconds, simply divide by 75 and discard the remainder.
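
The formula above can be written as a short C sketch. The function names
are hypothetical; the inputs are the last track's offset and the length as
reported by MCI, both in MSF form.

```c
/* Converts a minute/second/frame triple to a frame count (75 frames/s). */
long msf_to_frames(int m, int s, int f)
{
    return ((long)m * 60 + s) * 75 + f;
}

/* Lead-out offset in frames: offset of the last track plus the length
 * reported by MCI, plus the one frame that MCI leaves out. */
long mci_leadout_frames(int off_m, int off_s, int off_f,
                        int len_m, int len_s, int len_f)
{
    return msf_to_frames(off_m, off_s, off_f) +
           msf_to_frames(len_m, len_s, len_f) + 1;
}
```

With the example from the text (offset 48m 32s 12f, reported length
2m 50s 62f) the result is 231225 frames, which is 3083 seconds, i.e.
51m 23s. Without the extra frame the division would give 3082 seconds.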
APPENDIX B - CDDB FILE FORMAT
-----------------------------
Database entries must be in the ISO-8859-1 character set (the 8-bit ASCII
extension also known as "Latin alphabet #1" or ISO-Latin-1). Lines must
always be terminated by a newline/linefeed (ctrl-J, or 0Ah) character
or a carriage return character (ctrl-M, or 0Dh) followed by a newline/linefeed
character. All lines in a database entry must be less than or equal to 80
bytes in length, including the terminating character(s). Database entries
with lines that are longer will be considered invalid. There must be no
blank lines in a database entry.
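
A client could check these line rules before submitting an entry. The
following C sketch is one possible check, with a hypothetical function name;
it looks only at line termination, line length and blank lines, not at the
character set.

```c
#include <string.h>

/* Returns 1 if every line in the entry ends in LF or CR LF, is at most
 * 80 bytes long including the terminator, and is not blank.
 * Returns 0 otherwise. */
int cddb_entry_lines_ok(const char *entry)
{
    const char *p = entry;

    while (*p != '\0') {
        const char *nl = strchr(p, '\n');
        size_t len, content;

        if (nl == NULL)
            return 0;                 /* last line has no terminator */

        len = (size_t)(nl - p) + 1;   /* bytes including the LF */
        content = len - 1;            /* bytes before the LF */
        if (content > 0 && p[content - 1] == '\r')
            content--;                /* do not count the CR as content */

        if (len > 80 || content == 0)
            return 0;                 /* too long, or a blank line */
        p = nl + 1;
    }
    return 1;
}
```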
Lines that begin with # are comments. Comments should appear only at the
top of the file before any keywords. Comments in the body of the file are
subject to removal when submitted for inclusion to the database. Comments
may consist only of characters in the set:
{ tab (09h); space (20h) through tilde (7Eh) inclusive }
Comments should be ignored by applications using the database file, with
several exceptions described below.
The beginning of the first line in a database entry should consist of the
string "# xmcd". This string identifies the file as an xmcd format CD
database file. More text can appear after the "xmcd", but is unnecessary.
The comments should also contain the string "# Track frame offsets:" followed
by the list of track offsets (the # of frames from the beginning of the CD)
obtained from the table of contents on the CD itself, with any amount of white
space between the "#" and the offset. There should be no other comments
interspersed between the list of track offsets. This list must follow the
initial identifier string described above. Following the offset list should
be at least one blank comment.
After the offset list, the following string should appear:
"# Disc length: N seconds"
where the number of seconds in the CD's play length is substituted for "N".
The number of seconds should be computed by dividing the total number of
1/75th second frames in the CD by 75 and truncating any remainder. This number
should not be rounded.
Note for Windows programmers:
The disc length provided by the Windows MCI interface should not be used here.
Instead, the lead-out (address of the N+1th track) should be used. Since the
MCI interface does not provide the address of the lead-out, it should be
computed by adding the length of the last track to the offset of the last
track and truncating (not rounding) any remaining fraction of a second. Note
that the MCI interface yields an incorrect track length which must be
corrected by adding one frame to the total frame count when performing the
disc length computation. For more information, see Appendix A.
After the disc length, the following string should appear:
"# Revision: N"
where the database entry revision (decimal integer) is substituted for "N".
Files missing a revision are assumed to have a revision level of 0.
The revision is used for database management when comparing two entries in
order to determine which is the most recent. Client programs which allow the
user to modify a database entry should increment the revision when the user
submits a modified entry for inclusion in the database.
After the revision, the following string should appear:
"# Submitted via: client_name client_version optional_comments"
where the name of the client submitting the entry is substituted for
"client_name", the version of the client is substituted for "client_version",
and "optional_comments" is any sequence of legal characters. Clients which
allow users to modify database entries read from the database should update
this string with their own information before submitting.
The "client_version" field has a very specific format which should be observed:
[leading text]version_number[release type][level]
Where:
Leading text: is any string which does not include numbers.
Version number and level: is any (possibly) decimal-separated list of
positive numbers.
Release type: is a string of the form:
alpha, a, beta, b, patchlevel, patch, pl
Level: is a positive number.
For example:
release:2.35.1alpha7
v4.0PL0
2.4
The only required portion of the version field is the version number. The
other parts are optional, though it is strongly recommended that the release
type field be filled in if relevant. Strict version checking may be
applied by software which evaluates the submitter revision, so it is wise
to make it clear when a release is beta, etc.
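
The comment section described above could be generated along the lines of
this C sketch. The function name and parameters are hypothetical, and the
placement of the blank "#" comments is one layout that satisfies the rules
above, not the only one.

```c
#include <stdio.h>

/* Writes the comment section of an entry into buf: identifier string,
 * track frame offsets, disc length, revision and submitter line.
 * The disc length is the lead-out frame offset divided by 75, truncated.
 * Returns 0 on success, -1 if the buffer is too small. */
int cddb_format_header(char *buf, size_t buflen, const int *offsets,
                       int ntrks, int leadout_frames, int revision,
                       const char *client, const char *version)
{
    size_t used;
    int n, i;

    n = snprintf(buf, buflen, "# xmcd\n#\n# Track frame offsets:\n");
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    used = (size_t)n;

    /* One comment per track, with a tab between "#" and the offset. */
    for (i = 0; i < ntrks; i++) {
        n = snprintf(buf + used, buflen - used, "#\t%d\n", offsets[i]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, buflen - used,
                 "#\n# Disc length: %d seconds\n#\n"
                 "# Revision: %d\n# Submitted via: %s %s\n#\n",
                 leadout_frames / 75, revision, client, version);
    if (n < 0 || (size_t)n >= buflen - used)
        return -1;
    return 0;
}
```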
Following the comments is the disc data. Each line of disc data consists
of the format "KEYWORD=data", where "KEYWORD" is a valid keyword as described
below and "data" is any string consisting of characters in the set:
{ space (20h) through tilde (7Eh) inclusive; no-break-space (A0h) through
y-umlaut (FFh) inclusive }
Newlines (0Ah), tabs (09h) and backslashes (5Ch) may be represented by the
two-character sequences "\n", "\t" and "\\" respectively. Client programs must
translate these sequences to the appropriate characters when displaying
disc data.
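
A sketch of this translation in C is given here. The function name is
hypothetical, and leaving an unrecognized backslash sequence unchanged is an
assumption, since the text does not say how such sequences should be
treated.

```c
/* Translates the two-character sequences "\n", "\t" and "\\" in `in`
 * to newline, tab and backslash, writing the result to `out`.
 * `out` must have room for at least strlen(in) + 1 bytes.
 * ASSUMPTION: any other backslash sequence is copied through as-is. */
void cddb_unescape(const char *in, char *out)
{
    while (*in != '\0') {
        if (in[0] == '\\' && in[1] == 'n') {
            *out++ = '\n';
            in += 2;
        } else if (in[0] == '\\' && in[1] == 't') {
            *out++ = '\t';
            in += 2;
        } else if (in[0] == '\\' && in[1] == '\\') {
            *out++ = '\\';
            in += 2;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}
```

Scanning left to right matters: an escaped backslash followed by the letter
"n" must come out as a backslash and an "n", not as a newline.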
All of the applicable keywords must be present in the file. They must appear
in the file in the order shown below. Multiple occurrences of the same keyword
indicate that the data contained on those lines should be concatenated; this
applies to any of the textual fields. There is no practical limit to the size
of any of the textual fields or a database entry itself, though the CDDB server
software may place a restriction on especially large entries. Valid keywords
are as follows:
DISCID: The data following this keyword should contain the 8-byte disc ID
indicated by the track offsets in the comment section. The algorithm
for generating the disc ID is explained in Appendix A.
DTITLE: Technically, this may consist of any data, but by convention contains
the artist and disc title (in that order) separated by a "/" with a
single space on either side to separate it from the text. If the "/"
is absent, it is implied that the artist and disc title are the same.
TTITLEN: There must be one of these for each track in the CD. The track
number should be substituted for the "N", starting with 0. This field
should contain the title of the Nth track on the CD.
EXTD: This field contains the "extended data" for the CD. This is intended
to be used as a place for interesting information related to the CD,
such as credits, et cetera. If there is more than one of these lines
in the file, the data is concatenated. This allows for extended data
of arbitrary length.
EXTTN: This field contains the "extended track data" for track "N". There
must be one of these for each track in the CD. The track number
should be substituted for the "N", starting with 0. This field is
intended to be used as a place for interesting information related to
the Nth track, such as the author and other credits, or lyrics. If
there is more than one of these lines in the file, the data is
concatenated. This allows for extended data of arbitrary length.
PLAYORDER:
This field contains a comma-separated list of track numbers which
represent a programmed track play order. This field is generally
stripped of data in non-local database entries. Applications that
submit entries for addition to the main database should strip this
keyword of data.
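
The DTITLE convention can be handled with a small helper such as the C
sketch below. The function name is hypothetical, and splitting at the first
" / " is an assumption, since the text does not say what to do when the
separator occurs more than once.

```c
#include <stdio.h>
#include <string.h>

/* Splits DTITLE data into artist and disc title at the first " / ".
 * If the separator is absent, both outputs receive the whole string.
 * Output is truncated if a destination buffer is too small. */
void cddb_split_dtitle(const char *dtitle, char *artist, size_t alen,
                       char *title, size_t tlen)
{
    const char *sep = strstr(dtitle, " / ");

    if (sep == NULL) {
        snprintf(artist, alen, "%s", dtitle);
        snprintf(title, tlen, "%s", dtitle);
    } else {
        /* Copy only the bytes before the separator into artist. */
        snprintf(artist, alen, "%.*s", (int)(sep - dtitle), dtitle);
        snprintf(title, tlen, "%s", sep + 3);
    }
}
```

The data passed in should already be the concatenation of all DTITLE lines
in the entry.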
APPENDIX C - CDDB SERVER PROTOCOL
---------------------------------
Notation:
-> : client to server
<- : server to client
terminating marker: `.' character in the beginning of a line
CDDB lscat:
----------
Client command:
-> cddb lscat
Server response:
<- code Okay category list follows
<- category
<- category
<- (more categories...)
<- .
code:
210 Okay category list follows (until terminating marker)
category:
CD category. Example: rock
CDDB query:
-----------
Client command:
-> cddb query discid ntrks off1 off2 ... nsecs
discid:
CD disc ID number. Example: f50a3b13
ntrks:
Total number of tracks on CD.
off1, off2, ...:
Frame offset of the starting location of each track.
nsecs:
Total playing length of CD in seconds.
Server response:
<- code categ discid dtitle
or
<- code close matches found
<- categ discid dtitle
<- categ discid dtitle
<- (more matches...)
<- .
code:
200 Found exact match
210 Found exact matches, list follows (until terminating marker)
211 Found inexact matches, list follows (until terminating marker)
202 No match found
403 Database entry is corrupt
409 No handshake
categ:
CD category. Example: rock
discid:
CD disc ID number of the found entry. Example: f50a3b13
dtitle:
The Disc Artist and Disc Title (The DTITLE line). For example:
Pink Floyd / The Dark Side of the Moon
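
One way to read the lines of a match list is sketched below in C. The
struct, the function names and the buffer sizes are hypothetical; the field
layout ("categ discid dtitle") and the terminating marker are as described
above.

```c
#include <stdio.h>

struct cddb_match {
    char categ[32];
    char discid[9];
    char dtitle[256];
};

/* Returns 1 if the line is the terminating marker: a '.' alone on a
 * line, optionally followed by CR and/or LF. */
int cddb_is_terminator(const char *line)
{
    return line[0] == '.' &&
           (line[1] == '\0' || line[1] == '\r' || line[1] == '\n');
}

/* Parses one "categ discid dtitle" line of a match list.
 * The dtitle field runs to the end of the line and may contain spaces.
 * Returns 1 on success, 0 if the line does not have all three fields. */
int cddb_parse_match(const char *line, struct cddb_match *m)
{
    return sscanf(line, "%31s %8s %255[^\r\n]",
                  m->categ, m->discid, m->dtitle) == 3;
}
```

A client would call cddb_is_terminator() on each line first and stop
reading the list as soon as it returns 1.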
CDDB read:
----------
Client command:
-> cddb read categ discid
categ:
CD category. Example: rock
discid:
CD disc ID number. Example: f50a3b13
Server response:
<- code categ discid
<- # xmcd CD database file
<- # ...
<- (CDDB data...)
<- .
or
<- code categ discid No such CD entry in database
code:
210 OK, CDDB database entry follows (until terminating marker)
401 Specified CDDB entry not found.
402 Server error.
403 Database entry is corrupt.
409 No handshake.
417 Access limit exceeded, explanation follows (until marker)
categ:
CD category. Example: rock
discid:
CD disc ID number. Example: f50a3b13
Help information:
-----------------
Client command:
-> help
or
-> help cmd
cmd:
CDDB command. Example: quit
or
-> help cmd subcmd
cmd:
CDDB command. Example: cddb
subcmd:
CDDB command argument. Example: query
Server response:
<- code Help information follows
<- (help data ...)
<- .
or
<- code no help information available
code:
210 OK, help information follows (until terminating marker)
401 No help information available
Server sites:
--------------
Client command:
-> sites
Server response:
<- code OK, site information follows (until terminating `.')
<- (data)
<- .
code:
210 Ok, site information follows
401 No site information available.
The data format is as follows:
site port latitude longitude description
The fields are as follows:
site:
The Internet address of the remote site.
port:
The port at which the server resides on that site.
latitude:
The latitude of the server site. The format is as follows:
CDDD.MM
Where "C" is the compass direction (N, S), "DDD" is the
degrees, and "MM" is the minutes.
longitude:
The longitude of the server site. Format is as above, except
the compass direction must be one of (E, W).
description:
A short description of the geographical location of the site.
Example:
ca.us.cddb.com 888 N037.23 W122.01 Fremont, CA USA
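
A line in this format could be parsed as in the C sketch below, which may
help a client offer the user a nearby site. The struct, the function names
and the buffer sizes are hypothetical, and returning south and west as
negative degrees is a convention chosen for the sketch.

```c
#include <stdio.h>

struct cddb_site {
    char host[128];
    int  port;
    char latitude[8];    /* e.g. "N037.23" */
    char longitude[8];   /* e.g. "W122.01" */
    char description[128];
};

/* Parses one "site port latitude longitude description" line.
 * The description runs to the end of the line and may contain spaces.
 * Returns 1 on success, 0 if any field is missing. */
int cddb_parse_site(const char *line, struct cddb_site *s)
{
    return sscanf(line, "%127s %d %7s %7s %127[^\r\n]",
                  s->host, &s->port, s->latitude, s->longitude,
                  s->description) == 5;
}

/* Converts a "CDDD.MM" coordinate to decimal degrees.
 * ASSUMPTION: S and W are returned as negative values.
 * Returns 0.0 if the string does not match the format. */
double cddb_coord_to_degrees(const char *coord)
{
    char dir;
    int deg, min;
    double v;

    if (sscanf(coord, "%c%3d.%2d", &dir, &deg, &min) != 3)
        return 0.0;
    v = deg + min / 60.0;
    return (dir == 'S' || dir == 'W') ? -v : v;
}
```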
Server status:
--------------
Client command:
-> stat
Server response:
<- code OK, status information follows (until terminating `.')
<- (data)
<- .
code:
210 Ok, status information follows
The possible data is as follows:
current proto: <current_level>
An integer representing the server's current operating protocol
level.
max proto: <max_level>
The maximum supported protocol level.
gets: <yes | no>
Whether or not the client is allowed to get log information,
according to the string "yes" or "no".
updates: <yes | no>
Whether or not the client is allowed to initiate a database
update, according to the string "yes" or "no".
posting: <yes | no>
Whether or not the client is allowed to post new entries,
according to the string "yes" or "no".
quotes: <yes | no>
Whether or not quoted arguments are enabled, according to
the string "yes" or "no".
current users: <num_users>
The number of users currently connected to the server.
max users: <num_max_users>
The number of users that can concurrently connect to the server.
strip ext: <yes | no>
Whether or not extended data is stripped by the server before
being presented to the user.
Database entries: <num_db_entries>
The total number of entries in the database.
Database entries by category:
This field is followed by a list of categories and the number
of entries in that category. Each entry is of the following
format:
<white space>category: <num_db_entries>
The list of entries is terminated by the first line that does
not begin with white space.
Pending file transmissions:
This field is followed by a list of sites that are fed new
database entries at periodic intervals, and the number of
entries that have yet to be transmitted to that site.
Each entry is of the following format:
<white space>site: <num_db_entries>
The list of entries is terminated by the first line that does
not begin with white space.
This list may grow as needed, so clients must expect possible
unrecognizable data. Also, additional fields may be added to
the currently existing lines, although no existing fields will
be removed or change position.
Server version:
---------------
Client command:
-> ver
Server response:
<- code servername version copyright
or
<- code Version information follows
code:
200 Version information.
211 OK, version information follows (until terminating marker)
version:
Server version. Example: v1.0PL0
copyright:
Copyright string. Example: Copyright (c) 1996 Steve Scherf
Server users:
-------------
Client command:
-> whom
Server response:
<- code User list follows
code:
210 OK, user list follows (until terminating marker)
401 No user information available.
Reserved errors:
----------------
The following error codes are reserved, and will never be returned as a
response to a CDDB protocol command. They are intended to be used internally
by clients that have a need for generating pseudo-responses.
600-699