Making Sense of Search Results by Automatic Web-page Classifications
Ben Choi
Computer Science, College of Engineering and Science
Louisiana Tech University, USA
pro@BenChoi.org
Abstract: This paper reports the development of a system that automatically organizes Internet web pages into meaningful categories. The aim of the system is to help Internet users find useful information in less time. A central problem in using the Internet is finding the information we need, and with the explosive growth of the Internet, information overload is getting worse. The proposed system classifies web pages based on three types of information: (1) organizational information among web pages (inter-web-page relationships), such as the URL of a page and the links within it; (2) meta-web-page information, such as data contained in META tags and the formatting data of a web page; and (3) web-page-content information, such as keywords and phrases in the content of a web page. Our results show that combining all three types of information provides better accuracy.
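As a rough illustration of the three information types named in the abstract, the following sketch pulls each of them out of a single page using only the Python standard library. It is not the system described in this paper: the names `PageFeatureParser` and `extract_features`, the `top_n` parameter, and the use of raw word frequency as a stand-in for "keywords" are all assumptions made for this example, and no classifier is included.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageFeatureParser(HTMLParser):
    """Hypothetical helper: collects links, META data, and visible text."""

    def __init__(self):
        super().__init__()
        self.links = []        # href values of <a> tags
        self.meta = {}         # META name -> content
        self.text_parts = []   # text outside <script>/<style>
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.text_parts.append(data)


def extract_features(url, html, top_n=5):
    """Return the three feature groups for one page (illustrative only)."""
    parser = PageFeatureParser()
    parser.feed(html)
    parser.close()

    parsed = urlparse(url)
    # Assumption: "keywords" are approximated by the most frequent
    # lowercase words of three or more letters.
    words = re.findall(r"[a-z]{3,}", " ".join(parser.text_parts).lower())

    return {
        # (1) organizational information: the URL and the links in the page
        "organizational": {
            "host": parsed.netloc,
            "path_tokens": [p for p in parsed.path.split("/") if p],
            "links": parser.links,
        },
        # (2) meta-web-page information: data from META tags
        "meta": parser.meta,
        # (3) web-page-content information: frequent content words
        "content": [w for w, _ in Counter(words).most_common(top_n)],
    }
```

A classifier could then consume the three groups separately or together, which is the comparison the abstract's final sentence refers to; how the actual system weights or combines them is not specified here.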
Full Paper: