Parsing the URL of a Web Page
URL stands for Uniform Resource Locator. A URL identifies a particular Internet resource: a Web page, a gopher server, a library catalog, an image, or a text file. It's the address of a resource and can tell you a lot about that resource.
The basic structure of a URL is hierarchical, and the hierarchy moves from left to right:
protocol://machine.second-level-domain.top-level-domain/directory/subdirectory/filename
So, for instance, the URL
http://www.lawrence.edu/academics/it/url.html
tells us that...
- the protocol used to access and transmit the resource is HTTP (i.e., it's a web page);
- the resource is on the machine "www" in the second-level domain "lawrence", which is part of ".edu", the top-level domain for educational organizations (see below for more on domains);
- the resource is in the directory "academics", in the subdirectory "it", and is an HTML file called "url.html".
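The same breakdown can be sketched in a few lines of Python using the standard library's urllib.parse module. This is only an illustration: the function name describe_url and the dictionary keys are made up for this example, and the sample URL is pieced together from the three points above. It also assumes a three-part host name (machine, second-level domain, top-level domain), which not every URL has.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the pieces described above (hypothetical helper)."""
    parts = urlsplit(url)
    # Host name labels, e.g. ["www", "lawrence", "edu"]
    labels = (parts.hostname or "").split(".")
    # Non-empty path segments, e.g. ["academics", "it", "url.html"]
    segments = [s for s in parts.path.split("/") if s]
    # Treat the last segment as a file only if the path does not end in "/"
    has_file = bool(segments) and not parts.path.endswith("/")
    filename = segments.pop() if has_file else ""
    return {
        "protocol": parts.scheme,
        # Assumption: the first label is a machine name only in a 3+ label host
        "machine": labels[0] if len(labels) >= 3 else "",
        "second_level_domain": labels[-2] if len(labels) >= 2 else "",
        "top_level_domain": labels[-1],
        "directories": segments,
        "filename": filename,
    }

info = describe_url("http://www.lawrence.edu/academics/it/url.html")
for name, value in info.items():
    print(name, "=", value)
```

Reading the result from top to bottom follows the left-to-right hierarchy of the URL itself.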
Tricks to know
- The server name(s) may give you a hint as to the origin of the page: http://www.student.carleton.edu or http://stu.beloit.edu
- The domain name may be one you recognize (harvard.edu) or one you don't (savetz.com)
- The top-level domain is a good clue: .edu vs. .com vs. .org
- The first directory after the top-level domain is sometimes helpful: http://www.colby.edu/personal/
- The tilde (~) usually means that the page is a personal one: http://www.lawrence.edu/~gilbertp/
- Cut back the URL, removing one piece at a time from the right-hand end, to see where a particular page comes from.
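Two of these tricks, spotting a tilde and cutting back the URL, are mechanical enough to sketch in Python. The function names cut_back and looks_personal are invented for this illustration, and the tilde check is only a rule of thumb, as noted above.

```python
from urllib.parse import urlsplit, urlunsplit

def cut_back(url):
    """Return the URL, then each shorter version, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    results = [url]
    while segments:
        segments.pop()  # drop the rightmost file or directory
        path = "/" + "/".join(segments) + ("/" if segments else "")
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results

def looks_personal(url):
    """Heuristic only: a path segment starting with ~ suggests a personal page."""
    segments = urlsplit(url).path.split("/")
    return any(s.startswith("~") for s in segments)

for shorter in cut_back("http://www.lawrence.edu/academics/it/url.html"):
    print(shorter)
print(looks_personal("http://www.lawrence.edu/~gilbertp/"))
```

Each URL printed by the loop is one step closer to the site's home page. Visiting them in turn shows which department or organization a page belongs to.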
More on domains
The Internet is made up of many computer networks. These networks are classified into categories called domains. Domains are classified either by the type of organization that hosts the network or by geographical location.
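The host-type domains include:
- .com: commercial organizations
- .edu: educational institutions
- .gov: government agencies
- .mil: military
- .net: network providers
- .org: other organizations, often nonprofit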
State domain codes are generally the two-letter state abbreviations. An example:
http://dpi.wi.gov/ Wisconsin Department of Public Instruction
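As a rough illustration, the domain clues can be read off a host name in Python. Everything named here is an assumption made for the example: the function domain_clues, the HOST_TYPES table (a handful of well-known host-type domains, not a complete list), and SAMPLE_STATES (a deliberately partial set of state abbreviations). Treating any other two-letter ending as a country code is a simplification.

```python
from urllib.parse import urlsplit

# Assumption: a few well-known host-type domains, not an exhaustive list
HOST_TYPES = {
    "com": "commercial",
    "edu": "educational",
    "gov": "government",
    "mil": "military",
    "net": "network",
    "org": "organization",
}

# Assumption: a partial sample of state abbreviations, for illustration only
SAMPLE_STATES = {"wi", "mn", "ia"}

def domain_clues(url):
    """Guess the domain type, and a state code if one appears (hypothetical helper)."""
    labels = (urlsplit(url).hostname or "").lower().split(".")
    tld = labels[-1]
    if tld in HOST_TYPES:
        kind = HOST_TYPES[tld]
    elif len(tld) == 2:
        kind = "country code"  # simplification: any other two-letter ending
    else:
        kind = "unknown"
    # A state code, if present, sits just to the left of the top-level domain
    state = labels[-2] if len(labels) >= 3 and labels[-2] in SAMPLE_STATES else ""
    return {"top_level_domain": tld, "type": kind, "state": state}

print(domain_clues("http://dpi.wi.gov/"))
```

For the Wisconsin example, the sketch reports a government domain with the state code "wi". A real tool would need full lists of states and country codes rather than these samples.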
Country domain codes (and more on domains generally) can be found here: