
Legislative Data Standards


As Congress moves to XML and to persistent, thoughtfully formed URLs for its documents, web sites can begin to use these as standards for indexing and labeling related online documents. The possibilities are endless.

Internet and web technology can, if applied in a RESTful way, finally give legislative documents such as bills and votes unique, universal and understandable names, locations and identifiers. So far, the confusion over bill numbers, long and short names, and the tradition of referring to legislation by its main sponsors' last names has meant that legislation is identified by popular consensus. As long as you are familiar with the popular name, you can find related information. However, there is an effort to use Internet naming conventions (URIs and URLs) to bring the order needed on an interconnected web, where people want to be connected instantly with source documents as well as related material.

Once the standards are issued, the exciting work of using them as metadata across the vast web will start in earnest. Proprietary standards have already been proposed for citing legislative documents on the Internet, but they are doomed to failure due to the lack of general consensus. The Library of Congress has issued permanent URLs for legislation, though only for the summary page that leads to the versions of a bill. I have been hoping that the standard will be extended first to the different versions of a bill and then to the parts of the bill.


Here are some important resources:


Suggesting and Testing a Standard for Citing/Linking to Parts of Bills and Laws

As a proof of concept, I will be taking 111 H.R. 1/Public Law 111-5, the Recovery Act, and creating an XHTML version with anchors. The hope is that the anchors can be used for metadata and for actual links to parts of the law, which is important for bills and laws that are divided into multiple titles (and can be really long).

An example (not working)

In the first case, the anchor follows the <enum>s to determine the section, skipping the quoted ones. In the second, the AP indicates an appropriations section whose <header>s are filled with long text and which has no <enum>s. The number after the AP indicates the number of paragraphs.
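Since the example links above are not working, here is a minimal Python sketch of how such anchors might be assembled. The function names, the hyphen separator, and the sample enumerators are all assumptions made for illustration; they are not part of any issued standard, and the sketch expects the caller to have already left out any <enum> found inside quoted text.

```python
import re


def anchor_from_enums(enums):
    """Join visible enumerators (outermost first) into one anchor id.

    Hypothetical format: the hyphen separator is an assumption.
    """
    parts = []
    for enum in enums:
        # Drop punctuation such as "(a)" or "1001." so only letters and digits remain.
        cleaned = re.sub(r"[^0-9A-Za-z]", "", enum)
        if cleaned:
            parts.append(cleaned)
    return "-".join(parts)


def appropriations_anchor(number):
    """Anchor for an appropriations paragraph that has a <header> but no <enum>.

    Hypothetical format: "AP" followed directly by the number.
    """
    return f"AP{number}"


print(anchor_from_enums(["A", "I", "1001.", "(a)"]))  # A-I-1001-a
print(appropriations_anchor(12))                      # AP12
```

Because the anchor is built only from enumerators a reader can see in the bill, the same string can be written by hand from a printed citation.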

The reason not to simply number all paragraphs is to avoid basing citations on data the reader cannot see. Having a person count all the paragraphs in a 400+ page bill seems wrong when the bill already uses visible enumeration that is used for real citations. It might also be possible to use only section numbers, which are unique within every bill, but then it would be impossible to anchor or cite smaller pieces. Sections also fall in the middle of the <enum> hierarchy.

Each anchored area would be in a <div> or <li> that could be easily addressed by CSS, JavaScript and XPath. Also, the same variables used by the anchors could be used for XPath functions.
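As a rough illustration of addressing an anchored area with XPath, the Python sketch below looks up an element by its id in a tiny XHTML-like fragment. The fragment, the id values, and the helper name find_by_anchor are made up for this example, and the fragment omits the XHTML namespace to keep the XPath short.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: each anchored area is a <div> or <li> carrying an id.
XHTML = """
<body>
  <div id="A-I">
    <h2>Title I</h2>
    <ul>
      <li id="A-I-101">Section 101 text</li>
      <li id="A-I-102">Section 102 text</li>
    </ul>
  </div>
</body>
""".strip()


def find_by_anchor(root, anchor):
    """Return the first element whose id equals the anchor, or None."""
    # The same anchor string used in the URL is dropped into the XPath predicate.
    return root.find(f".//*[@id='{anchor}']")


root = ET.fromstring(XHTML)
node = find_by_anchor(root, "A-I-102")
print(node.tag, node.text)  # li Section 102 text
```

The CSS equivalent of the same lookup would be an id selector such as #A-I-102, so one identifier serves links, styling, and XPath queries alike.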

Note that UK statutes appear to be referenced by URIs that identify sections of legislation using slashes rather than anchors. Because legislation in the United States is composed as separate documents, I feel it is important to make clear that the document has a URL that stands on its own and that sections of the bill have anchors. This is consistent with the normal way static documents work on the web in HTML, with anchors to instantly scroll to the linked section. As XPointer is adopted into browsers, direct links to any portion of the document would be possible (as opposed to HTML anchors, which must be specifically added).
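The difference between the two styles can be seen with Python's standard URL tools. The URL below is a made-up example on example.org, and the anchor value is hypothetical; the point is only that an anchor-style link splits cleanly into a stand-alone document URL plus a section identifier.

```python
from urllib.parse import urldefrag

# Hypothetical anchor-style citation: one document URL plus a fragment.
citation = "https://example.org/bills/111/hr1.xhtml#A-I-1001"

# urldefrag separates the stand-alone document URL from the anchor.
document_url, anchor = urldefrag(citation)

print(document_url)  # https://example.org/bills/111/hr1.xhtml
print(anchor)        # A-I-1001


def cite(document_url, anchor):
    """Build a link to one part of a bill from its document URL and anchor."""
    return f"{document_url}#{anchor}"


print(cite(document_url, "A-I-1002"))  # https://example.org/bills/111/hr1.xhtml#A-I-1002
```

With a slash-style URI the section would instead be part of the path, so nothing in the URL itself marks where the document ends and the section begins.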

JavaScript could also be used with CSS to show only the linked portion of the bill. Breadcrumbs or outline indicators would let the viewer turn on adjacent portions or more.
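The show-and-hide logic itself is small. The sketch below is written in Python rather than JavaScript purely to illustrate the decision a script would make; it assumes a hypothetical anchor format in which nested parts are joined by hyphens, and the function names are invented for this example.

```python
def breadcrumbs(anchor, sep="-"):
    """List the anchor's ancestors, outermost first, ending with the anchor itself."""
    parts = anchor.split(sep)
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


def is_visible(element_id, linked, sep="-"):
    """Show the linked portion, the containers around it, and everything inside it."""
    return (
        element_id == linked
        or linked.startswith(element_id + sep)   # element is an ancestor of the link target
        or element_id.startswith(linked + sep)   # element is nested inside the link target
    )


print(breadcrumbs("A-I-1001"))             # ['A', 'A-I', 'A-I-1001']
print(is_visible("A-I", "A-I-1001"))       # True
print(is_visible("A-I-1002", "A-I-1001"))  # False
```

Each breadcrumb entry is itself a valid anchor, so clicking one could simply re-run the same visibility test with a shorter target and reveal the adjacent portions.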


