Various researchers and surveys
suggest that the Internet/World Wide Web has between
8 billion and 100 billion
Web pages. For a
comparative perspective on the size of the Internet, see the
Internet Archive's "How Big Is 100 Terabytes?"
web page (http://www.archive.org/xterabytes.html).
Estimates of the number of
documents on the "hidden" Internet range upward of 1 trillion. The "hidden" Internet consists of files
and documents in:
Databases that can only be
reached through a query (database search-and-retrieval)
interface (for example, periodical articles in
Infotrac databases such as Magazine Index or Academic Index).
Accessing these databases generally requires some kind of
license and/or permission and, consequently, a login and a
password.
Virtual corporate networks,
some of which may have 100,000 or more Web pages
and other files intended only for employee, intra-company, or
inter-company access.
Formats that cannot be
readily read or accessed across the Internet.
Size Doesn't Matter
Whatever the size of the
Internet/Web, it does not matter to most people, because
most people who have access to the Internet now turn to it
first for information. Not books. Not magazines. Not radio. Not
even TV. (Note: Radio and TV
are usually the first choice for information on things happening
right now, but even that may not continue, as news web sites
sometimes provide information (even audio and video) before radio
and television do.)
A great site for getting a handle
on how people are using the Internet/Web around the world is
NUA (http://www.nua.com/surveys/).
Tools to Search the Internet/World Wide Web
To find information on the
Internet, you need tools that:
Collect information (generally
the text and graphics from all types of files, including Web
pages) from the Internet.
Store that information in a
database.
Index all the words and parts
of the Web pages and other documents.
Provide a query interface that
allows you to search those downloaded files.
Provide active links to the
original or other linked documents.
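The store, index, and query steps listed above can be illustrated with a
minimal sketch. It is not how any particular search tool is built. The
page URLs and text in STORED_PAGES are hypothetical stand-ins for files
that a collection step has already downloaded, and the function names
(tokenize, build_index, search) are chosen only for this example.

```python
import re
from collections import defaultdict

# Hypothetical pages standing in for files already collected and stored.
STORED_PAGES = {
    "http://example.com/a": "Searching library catalogs and databases",
    "http://example.com/b": "Searching the Web with a search engine",
    "http://example.com/c": "Radio and television news",
}


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Index step: map every word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Query step: return the URLs (active links) containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


INDEX = build_index(STORED_PAGES)
print(search(INDEX, "searching databases"))
```

The search never touches the pages themselves. It only looks up words in
the index built from the stored copies and returns the matching URLs.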
Most tools for searching the
Internet/WWW are nothing more than databases of, or indexes to,
Internet/Web files and pages.
These tools offer varying levels
of searching sophistication and functionality, though they are
generally much less sophisticated than library catalogs,
bibliographic utilities, and bibliographic databases.
Remember, when you are "searching
the Internet" through one of these tools, you are actually
searching its database of downloaded and indexed Internet/Web
pages and files, not the actual Internet itself.
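One practical consequence is that results reflect the pages as they were
when the tool last collected them. The sketch below illustrates this under
simple assumptions: live_web is a hypothetical dictionary standing in for
the actual Internet, and take_snapshot and search_snapshot are names
invented for this example.

```python
# Hypothetical "live" Internet: URL -> the page's current text.
live_web = {"http://example.com/news": "election results pending"}


def take_snapshot(web):
    """Copy every page's text as it stands at collection time."""
    return dict(web)


def search_snapshot(snapshot, word):
    """Return the URLs whose stored copy contains the given word."""
    word = word.lower()
    return sorted(
        url for url, text in snapshot.items() if word in text.lower().split()
    )


# The tool collects its copy of the pages once.
snapshot = take_snapshot(live_web)

# Afterwards the live pages change, and a new page appears.
live_web["http://example.com/news"] = "election results final"
live_web["http://example.com/new-page"] = "brand new page"

# Searching the stored copy still matches the old wording...
stale_hits = search_snapshot(snapshot, "pending")
# ...and misses the updated wording until the pages are collected again.
missing_hits = search_snapshot(snapshot, "final")

print(stale_hits)
print(missing_hits)
```

Until the tool collects the pages again, its database keeps answering from
the older copies.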