#1
I am new to data mining, but I need to develop a tool for mining Apache log files for my final-year project. I have several questions.

1. How can I load Apache log files such as access_log and referer_log into SQL Server 2000? Is there a tool that can load all the data in one go?

Format of the access_log:
218.102.21.133 - - [01/Sep/2003:00:00:03 +0800] "GET /cslab/pics/d_hours.gif HTTP/1.1" 304 - "http://www.cs.cityu.edu.hk/cslab/left.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

Format of the referer_log:
http://www.cs.cityu.edu.hk/~fypms/student_menu.cgi? show_proposal.cgi -> /~fypms/student.cgi

2. Is it possible to get statistics such as:
a. the greatest number of hits
b. where the potential applicants came from
c. where they go after visiting the main page
d. the users' access pattern over a certain period of time (i.e., revisits to the same pages)

3. Can ASP handle the above work with SQL Server 2000?

Would anyone give me a hand with this?
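The access_log sample in this post appears to follow Apache's combined log format, so one way to prepare it for loading is to split each line into named fields first. The following is only a minimal sketch in Python, not a tested loader: the function name `parse_access_line` and the field names are made up for illustration, and it assumes every line has the same nine-field layout as the sample.

```python
import re
from datetime import datetime

# Pattern for one line in (assumed) Apache combined log format:
# host ident user [time] "request" status size "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    row = m.groupdict()
    # Timestamps look like 01/Sep/2003:00:00:03 +0800
    row["time"] = datetime.strptime(row["time"], "%d/%b/%Y:%H:%M:%S %z")
    row["status"] = int(row["status"])
    # Apache writes "-" when no response body was sent (e.g. a 304)
    row["size"] = None if row["size"] == "-" else int(row["size"])
    # Split the request line into method, path, and protocol
    parts = row.pop("request").split(" ")
    row["method"], row["path"], row["protocol"] = (parts + [None] * 3)[:3]
    return row
```

Each returned dict could then be written out as a delimited row or inserted through whatever database driver is available; lines that return None are worth logging rather than silently dropping.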
#2
The first question is a data preparation issue that the MS Analysis Services DM component doesn't directly address. I'm forwarding this to the DTS group in case they have a suggestion.

Once the data is loaded, the first two statistics (2a and 2b) could be answered by simple SQL queries. The other two (2c and 2d) are closer to sequence analysis problems. The MS Analysis Services 2000 DM component doesn't directly support sequence analysis, but you could use the decision tree and clustering algorithms to analyze the log without modelling the ordering. I believe DBMiner (www.dbminer.com) has an implementation of sequence analysis as an aggregated provider for MS Analysis Services.

--
Peter Kim
This posting is provided "AS IS" with no warranties, and confers no rights.

"jenny" <anonymous (AT) discussions (DOT) microsoft.com> wrote in message news:48cc01c3fedb$91cd2720$a101280a (AT) phx (DOT) gbl...
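As a rough illustration of the "simple SQL queries" for the hit counts and referrers, here is a sketch using Python's built-in sqlite3 module as a stand-in for SQL Server 2000. The table name `access_log`, its columns, and the helper functions are all hypothetical, and the SQL dialect differs: SQL Server 2000 would use `SELECT TOP n` where SQLite uses `LIMIT`.

```python
import sqlite3


def build_demo_db(rows):
    """Create an in-memory table (hypothetical schema) and load (host, path, referer) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE access_log (host TEXT, path TEXT, referer TEXT)")
    conn.executemany("INSERT INTO access_log VALUES (?, ?, ?)", rows)
    return conn


def top_pages(conn, limit=10):
    """Pages ordered by number of hits, most requested first."""
    return conn.execute(
        "SELECT path, COUNT(*) AS hits FROM access_log "
        "GROUP BY path ORDER BY hits DESC, path LIMIT ?",
        (limit,),
    ).fetchall()


def top_referers(conn, limit=10):
    """Referring URLs ordered by visit count; '-' means no referrer was sent."""
    return conn.execute(
        "SELECT referer, COUNT(*) AS visits FROM access_log "
        "WHERE referer <> '-' "
        "GROUP BY referer ORDER BY visits DESC, referer LIMIT ?",
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    # Made-up rows, for illustration only
    demo = [
        ("10.0.0.1", "/index.html", "http://example.com/a"),
        ("10.0.0.2", "/index.html", "-"),
        ("10.0.0.1", "/about.html", "http://example.com/a"),
    ]
    conn = build_demo_db(demo)
    print(top_pages(conn))
    print(top_referers(conn))
```

The same GROUP BY and ORDER BY shape should carry over to SQL Server; only the row-limiting syntax and the connection code would need to change.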