Google Analytics: big difference in "visitor" count
I am trying to pull out the (unique) visitor count for a directory using three different methods:

* using a profile
* using a dynamic advanced segment
* using a custom report filter
On a smaller site, the three methods give the same result. On a large site (> 5M visits/month), there is a big discrepancy between the profile on the one hand and the advanced segment and the filter on the other. This might be because of sampling, but the difference is smaller when it comes to pageviews. Is the estimation of visitors worse, and the discrepancy bigger, when using sampled data? Also, when extracting data through the API (using filters or profiles), I still get different data even if GA doesn't indicate that the data is sampled, i.e. I'm looking at unsampled data.

Another strange thing is that pageviews are higher in the profile than in the filter, while the visitor count is higher for the filter than for the profile. I also applied a filter on the profile to force it to use sampled data, and again got results quite similar to the filter and segment data.
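|           | Profile | Filter | Segment | Filter@profile |
|-----------|---------|--------|---------|----------------|
| Unique    | 25550   | 37778  | 36433   | 37971          |
| Pageviews | 202761  | 184130 | n/a     | 202761         |

What I am trying to achieve is to find a way to get accurate data on unique visitors when I've run out of profiles to use.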
More data on the discrepancies can be found in this Google Docs spreadsheet: https://docs.google.com/spreadsheet/ccc?key=0aqzq0ujqny0xdg1drfpaewjvewhhdxzremrlz3pfb0e
Google Analytics (free version) tracks 10 million page interactions [0] per month [1] (pageviews and events; any tracker method that starts with "track" counts as an interaction), so presumably the data for a larger site is already heavily sampled (I guess each of your 5 million visitors has more than 2 interactions) [2]. Ad hoc reports use 1 million data points at most, so you get a sample of a sample. Naturally, aggregated values suffer more from smaller sample sizes.
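To make that arithmetic concrete, here is a rough sketch in Python. The 10 million and 1 million limits are the figures quoted above. The 4 interactions per visit, the function name `sampling_budget`, and the idea of treating both limits as simple proportional cut-offs are my own assumptions for illustration, not documented GA behaviour.

```python
# Back-of-the-envelope check of the limits quoted in this answer.
MONTHLY_HIT_LIMIT = 10_000_000  # free-version interactions per month (as quoted)
AD_HOC_SAMPLE_CAP = 1_000_000   # max data points per ad hoc report (as quoted)


def sampling_budget(visits_per_month, hits_per_visit):
    """Estimate how much of the raw traffic an ad hoc report can see.

    Assumes (hypothetically) that both limits act as plain proportional
    cut-offs on the number of hits.
    """
    total_hits = visits_per_month * hits_per_visit
    # Share of all hits that fits under the monthly limit (1.0 = everything).
    collected_fraction = min(1.0, MONTHLY_HIT_LIMIT / total_hits)
    collected_hits = total_hits * collected_fraction
    # Share of the collected hits that a single ad hoc report can use.
    report_fraction = min(1.0, AD_HOC_SAMPLE_CAP / collected_hits)
    return {
        "total_hits": total_hits,
        "collected_fraction": collected_fraction,
        "report_fraction": report_fraction,
        "overall_fraction": collected_fraction * report_fraction,
        # Whole-number percentage (rounded down) that would keep the
        # month's traffic under the limit.
        "sample_rate_pct": int(collected_fraction * 100),
    }


# 5 million visits is from the question; 4 hits per visit is a guess.
budget = sampling_budget(visits_per_month=5_000_000, hits_per_visit=4)
for name, value in budget.items():
    print(f"{name}: {value}")
```

With these assumed numbers only about half of the hits fit under the monthly limit, and an ad hoc report then works from roughly a tenth of those. The `sample_rate_pct` field is the kind of percentage you might consider for `_setSampleRate()` [2] to spread that loss evenly over the month.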
I'm also pretty sure the data limits apply to API access (Google says there is "no assurance that excess hits will be processed"), so for a large site the API returns sampled (or incomplete) data too, and you cannot be looking at unsampled data.

As for the differences, I'd say that different ad hoc reports use different samples and therefore end up with different results. With GA you shouldn't rely on absolute numbers anyway, but rather on general trends.
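If you want to see why a unique-visitor count can react worse to sampling than a pageview count, here is a small simulation. It is only a sketch under my own assumptions: visits are sampled at random, totals are scaled up naively by 1 / sample rate, and the traffic shape (1 to 5 visits per visitor, 1 to 10 pageviews per visit) is made up. I don't know how GA actually estimates visitors from a sample, so treat this as one possible mechanism and not as a description of GA.

```python
import random


def simulate(num_visitors=20_000, sample_rate=0.1, seed=42):
    """Compare naive scale-up estimates of pageviews and unique visitors."""
    rng = random.Random(seed)

    # One (visitor_id, pageviews) tuple per visit; the ranges are assumptions.
    visits = []
    for visitor_id in range(num_visitors):
        for _ in range(rng.randint(1, 5)):
            visits.append((visitor_id, rng.randint(1, 10)))

    true_pageviews = sum(pv for _, pv in visits)
    true_uniques = num_visitors

    # Keep each visit independently with probability sample_rate.
    sampled = [v for v in visits if rng.random() < sample_rate]

    # Pageviews are additive, so dividing by the rate is a fair estimate.
    est_pageviews = sum(pv for _, pv in sampled) / sample_rate
    # Unique visitors are not additive: a visitor with several visits is
    # more likely to show up in the sample, so this scale-up overshoots.
    est_uniques = len({vid for vid, _ in sampled}) / sample_rate

    return true_pageviews, est_pageviews, true_uniques, est_uniques


true_pv, est_pv, true_u, est_u = simulate()
print(f"pageviews: true={true_pv}, estimated={est_pv:.0f}")
print(f"uniques:   true={true_u}, estimated={est_u:.0f}")
```

In this toy setup the pageview estimate stays close to the true value, while the naively scaled unique count ends up far too high. Two reports that draw different samples would also disagree with each other on uniques more than on pageviews.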
[1] Analytics Premium tracks 50 million interactions per month (and comes with support from Google), but costs 150,000 USD per year.
[2] Google suggests using "_setSampleRate()" on large sites to make sure you have sampled data for each day of the month, instead of random hit or miss after you exceed the data limits.
Data limits:
http://support.google.com/analytics/bin/answer.py?hl=en&answer=1070983
_setSampleRate: