dbTalk Databases Forums  

Re: Dealing with Web Mining on Apache Log

microsoft.public.sqlserver.dts


  #1  
Peter Kim [MS]
 

Re: Dealing with Web Mining on Apache Log - 03-01-2004, 07:57 PM






The first question is a data preparation issue that the MS Analysis Services
DM component doesn't directly address. I'm forwarding this to the DTS group in
case they have a suggestion.
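
As a rough illustration of the preparation step, here is a minimal Python sketch that splits one access_log line (in the format quoted below) into fields that could then be loaded into a table. The regular expression, the function name parse_access_line and the field names are my own assumptions for illustration, not part of DTS or Analysis Services, and the pattern only covers the combined log format shown in the question.

```python
import re
from datetime import datetime

# Assumed pattern for the Apache "combined" log format shown in the question.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of fields for one access_log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    # "01/Sep/2003:00:00:03 +0800" becomes a timezone-aware datetime.
    # Note: %b expects English month abbreviations (the default C locale).
    row["time"] = datetime.strptime(row["time"], "%d/%b/%Y:%H:%M:%S %z")
    row["status"] = int(row["status"])
    # "-" means no body was sent, for example on a 304 response.
    row["size"] = None if row["size"] == "-" else int(row["size"])
    # Split "GET /path HTTP/1.1" when it has the usual three tokens.
    parts = row["request"].split()
    if len(parts) == 3:
        row["method"], row["path"], row["protocol"] = parts
    else:
        row["method"] = row["path"] = row["protocol"] = None
    return row


# The sample line from the question, joined back onto one line.
SAMPLE = ('218.102.21.133 - - [01/Sep/2003:00:00:03 +0800] '
          '"GET /cslab/pics/d_hours.gif HTTP/1.1" 304 - '
          '"http://www.cs.cityu.edu.hk/cslab/left.html" '
          '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"')

if __name__ == "__main__":
    print(parse_access_line(SAMPLE))
```

Lines that do not match return None, so malformed entries can be logged or skipped before loading.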

Once the data is loaded, the first two statistics you list (2a and 2b) can be
answered with simple SQL queries. The other two are really sequence analysis
problems. The MS Analysis Services 2000 DM component doesn't directly support
sequence analysis, but you could use the decision tree and clustering
algorithms to analyze the log without modelling the order of requests. I
believe DBMiner (www.dbminer.com) has an implementation of sequence analysis
as an aggregated provider of MS Analysis Services.
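
To make this concrete, here is a small hedged sketch in Python. It uses the standard sqlite3 module as a stand-in for SQL Server 2000, so the GROUP BY queries show the idea rather than exact T-SQL. The table name access_log, its columns, the sample rows and the helper next_pages are all hypothetical. The last part is only a naive "next page" count that treats each host as one visitor and assumes rows are already in time order. It is not a real sequence-mining algorithm.

```python
import sqlite3
from collections import Counter

# Hypothetical, already-parsed rows in time order: (host, path, referer).
ROWS = [
    ("10.0.0.1", "/index.html", "http://search.example/?q=cs"),
    ("10.0.0.1", "/cslab/left.html", "http://www.example.edu/index.html"),
    ("10.0.0.2", "/index.html", "http://search.example/?q=cs"),
    ("10.0.0.3", "/index.html", "-"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_log (host TEXT, path TEXT, referer TEXT)")
conn.executemany("INSERT INTO access_log VALUES (?, ?, ?)", ROWS)

# 2a: pages ordered by number of hits.
top_pages = conn.execute(
    "SELECT path, COUNT(*) AS hits FROM access_log "
    "GROUP BY path ORDER BY hits DESC, path"
).fetchall()

# 2b: where visitors came from ("-" means no referer was sent).
top_referers = conn.execute(
    "SELECT referer, COUNT(*) AS hits FROM access_log "
    "WHERE referer <> '-' GROUP BY referer ORDER BY hits DESC, referer"
).fetchall()


def next_pages(rows, start_path):
    """Count which path each host requested immediately after start_path."""
    last_path = {}  # host -> most recent path requested by that host
    counts = Counter()
    for host, path, _referer in rows:
        if last_path.get(host) == start_path:
            counts[path] += 1
        last_path[host] = path
    return counts


if __name__ == "__main__":
    print(top_pages)
    print(top_referers)
    print(next_pages(ROWS, "/index.html"))
```

The two queries are plain aggregations and should translate to SQL Server with little change. The next-page count is where ordering starts to matter, which is why 2c and 2d call for more than simple queries.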

--
Peter Kim
This posting is provided "AS IS" with no warranties, and confers no rights.

"jenny" <anonymous (AT) discussions (DOT) microsoft.com> wrote

Quote:
I am new to Data Mining but I need to develop a tool for
mining the apache log file for my final year project.
I have several questions to ask.

1. How can I insert Apache log files such as access_log and
referer_log into SQL Server 2000? Is there a tool that can load
all the data at once?

Format of the access_log:
218.102.21.133 - - [01/Sep/2003:00:00:03 +0800] "GET /cslab/pics/d_hours.gif HTTP/1.1" 304 - "http://www.cs.cityu.edu.hk/cslab/left.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

Format of the referer_log:
http://www.cs.cityu.edu.hk/~fypms/student_menu.cgi?show_proposal.cgi -> /~fypms/student.cgi

2. Is it possible to get statistics such as:
a. the greatest number of hits
b. where the potential applicants came from
c. where they go after visiting the main page
d. the users' access patterns over a certain period of
time (that is, revisits to the same pages)

3. Can ASP handle the above work with sql server 2000?

Would anyone give me a hand on it?



  #2  
Peter Kim [MS]
 

Re: Dealing with Web Mining on Apache Log - 03-02-2004, 01:17 PM






Actually, I found an MSDN entry showing how to use SQL DTS to load web log
files to your SQL Server data warehouse. Hope you find it useful:

http://msdn.microsoft.com/library/de...house_dayx.asp

--
Peter Kim
This posting is provided "AS IS" with no warranties, and confers no rights.