
Good day Chad,
I was wondering whether there are any plans to use the information stored in the pl_querystring field of the PageLog table?
The reason is rather simple: I would save roughly 530KB of disk space if I dropped the field.
Also, this is a somewhat redundant piece of information, since it is already included in another field along with the script names.
Any suggestions/comments/ideas?
Regards.
pl_querystring field usage ?
You make a good point. I am planning on using the querystring field but the information could come from the pl_scripturl field. I was hoping to utilize the querystring data sooner, but I don't think this will even make v2.0 because of the long list of stuff that ranks a lot higher.
Regards,
~Chad
pl_querystring field usage ?
Good enough,
So what is your suggestion? Would dropping the field be safe at this point in time?
BTW, I'm actually using v1.3 Beta, however I applied a massive load of changes to it such as loading the arrays from lookup tables, using reference pointers instead of fully qualified strings, and stuff like that.
I also added a few reports and expanded some of the existing ones. Even though I haven't finished applying my changes, the group of scripts I'm using now is fairly solid.
On another line of thought, did you have the time to create new "RegExp" expression strings to resolve the OS/Browsers/Robots?
BTW, quite surprisingly, the overall disk usage of my Access database is not too bad. I have a bit over 18,000 records and the DB is about 6.5 MB in size. I know there is still room for improvement, but I'm getting there! 8)
Finally, in v2, besides the database/table structure changes, will there be new reports, options, etc.? Are you keeping it a secret until you release the beta?
Bye for now... :)
pl_querystring field usage ?
If you want the extra space savings, then dropping it will not hurt. After all, if worst comes to worst, you could always re-generate the data later from the pl_scripturl field.
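As a rough sketch of that re-generation step, written in Python rather than the ASP the package itself uses: the function name is hypothetical, and it assumes pl_scripturl holds the script URL with the query string still attached after a "?".

```python
from urllib.parse import urlsplit


def querystring_from_scripturl(script_url: str) -> str:
    """Return the query string portion of a logged script URL.

    Assumption: the stored value looks like "/path/page.asp?key=value".
    Returns an empty string when the URL has no query string.
    """
    return urlsplit(script_url).query
```

Running something like this over the existing rows would rebuild the dropped column if it is ever needed again.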
I am planning a ton of new features and changes for version 2.0. Nothing is really finalized yet in terms of what is in and what is out, but the software is going to look considerably different when I am done with it. Here are some of the things that are on the proposed list:
- redesign of the db schema (a lot of normalization to decrease database disk space usage and improve performance)
- web based configuration / user management
- new reports including country detection, exit / entry pages, click paths, etc.
- real graphing (pie, chart, bars, etc)
- MySQL support
- moving definitions to the database and allowing custom definitions via web interface
- reporting interface redesign
- exporting data
- xml support
- moving all queries to stored procedures
- data summarizing / archiving for space optimizations
- alternate reverse dns solution
And a bunch of other little things. The list is quite long actually. :) Nothing is set in stone at this point; I am still redesigning the database architecture.
Regards,
~Chad
pl_querystring field usage ?
Wow!
With all that stuff on your plate, this will be a HUGE project!
I personally initiated some tasks in order to accomplish a few of the items you mentioned, but it will never be like what you're describing.
Normalization is the key. If you start well, you'll have an easy-to-expand product. If you "miss the boat", the upgrades will look more like patches than anything else and will be a nightmare to maintain/expand. It's worth the effort, especially considering the type of utility you're dealing with.
Just from a quick look at the options/features you want to implement, V2 will be something else, no doubt about it.
I wish you the best of luck with that BIG project.
Looking forward to that "little toy"! :D
Before I go, here are some ideas, which are implemented, ongoing or in the approval phase, that you could perhaps consider for V2:
1) Something similar to the "click paths" you mentioned.
- Actually, I list the timestamps, the URLs, and the approximate time spent on each page.
2) More and more of the reports I created have links/clickable icons that feed the report selection area with the corresponding underlying IP address, which can be carried over as a filter for the following report(s). I'm also carrying the SessionID as a hidden field, for the same reason.
3) All the UPDATE/DELETE/INSERT queries are now invoked like this:
objConn.Execute strSql, AffectedCnt, adCmdText+adExecuteNoRecords
- The third argument is the key part. adExecuteNoRecords prevents ADO from creating a recordset, since these types of queries never return one. I've seen a substantial improvement when invoking the command multiple times, in a loop for instance.
4) Reference arrays are loaded via GetRows(), which allows them to be sorted as required. Lookups use a binary search: one version for strings, using "strcomp()", and another for numerics.
5) Under the report selection list, I added a checkbox to allow reverse-order listing, so the lower values are listed first. The maximum values are computed accordingly so that the graphs render properly.
6) The "instance" is used as an alternate database pointer when dealing with MS Access, which allows flipping between databases. This makes it possible to download the main DB and do some maintenance, then upload it back, along with some one-off queries if appropriate, and transfer the data from the alternate database over to the newly uploaded one. Logging can be stopped completely if appropriate.
7) Segregation between the logging and reporting logic blocks. This makes report adjustments easier to deal with, as they do not impact the logging section.
8) Send a notification e-mail when the database reaches a pre-established threshold, whether by record count or disk usage. The evaluation is performed once a day, when the very first page of the day is logged, right after the auto-delete completes, which is also invoked once a day.
9) The entry/exit pages report is already something I'm working on. Good timing I guess. :)
10) I had to implement some cleanup routines to remove "almost" duplicated records. This situation occurs when track.asp or track.js is on a page that is resized via JavaScript in the onload event of the page's BODY tag. For instance, I have a popup window used to inform people about the link they clicked on and what to expect when going to that site. Since the description may vary from one entry to another, I had no choice but to resize the window based on its content; otherwise it looked funny at times. Everything has a price, I guess.
11) I think I will convert the pl_querystring field to a numeric, allowing me to store the estimated time it took to load the page along with the Timer() value, providing more precision for the eventual calculations.
12) Whenever a page is logged, the number of users currently active on the site is also logged. This is in anticipation of showing the peaks and lows during a given day/period.
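To illustrate item 1, here is a rough sketch of how the approximate time spent per page can be derived from consecutive timestamps within one session. It is written in Python rather than the ASP/VBScript my scripts use, and the function name and the (timestamp, url) layout are made up for the example. The last page of a session has no following hit, so its duration is left unknown.

```python
from datetime import datetime


def time_spent_per_page(hits):
    """Estimate seconds spent on each page of a single session.

    hits: list of (timestamp, url) tuples, already sorted by timestamp.
    Returns a list of (timestamp, url, seconds), where seconds is None
    for the last page because no later hit exists to measure against.
    """
    result = []
    for i, (ts, url) in enumerate(hits):
        if i + 1 < len(hits):
            # Time on this page = gap until the next logged page.
            seconds = (hits[i + 1][0] - ts).total_seconds()
        else:
            seconds = None
        result.append((ts, url, seconds))
    return result
```

The figure is only an approximation: it measures the gap between two logged hits, not how long the visitor actually looked at the page.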
My plate is not as loaded as yours, but I have a few items as well. ;)
Bye for now...