Baffling magic behaviour with dict variable & for loop in Python


Junior Python programmer here, and I've been beating my head against a brick wall over some unexpected loop and dictionary behavior. I'm looping through a CSV file of log entries and parsing the data into a categories dict. When I initialize the categories dict each time through the loop, it works as expected.

Like so:

    log_entries = autovivification() # http://stackoverflow.com/questions/635483/what-is-the-best-way-to-implement-nested-dictionaries-in-python

    def scrublooper(log_file):

        for ll in log_file:
            # initialize categories dict every round through loop
            categories = {'requests': {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 0, 'pages': 0, 'content_files': 0}, 'filter_action': {'re': 0, 'pl': 0, 'bs': 0}}
            lld = logdomain(ll)
            domain, hostname, lan_host = lld.domain, lld.hostname, lld.lan_host

            mimetypes = url_searcher(settings.mimetypes, lld.mime_type)

            if mimetypes:
                category = mimetypes[2]

                if not log_entries[lan_host].has_key(domain):
                    log_entries[lan_host][domain] = categories

                log_entries[lan_host][domain]['requests'][category] += 1

    print log_entries['192.168.5.210']['google.com']['requests']
    print log_entries['192.168.5.210']['webtrendslive.com']['requests']
    print log_entries['192.168.5.210']['osnews.com']['requests']
    print log_entries['192.168.5.210']['question-defense.com']['requests']
    print log_entries['192.168.5.210']['optimost.com']['requests']

The output is what I expect:

    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 95, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 1, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 2, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 18, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 3, 'pages': 0, 'content_files': 0}

However, here is the problem: I don't want to initialize the categories dict every time through the loop. In this simplified example it doesn't matter, but further down the road in my program it will cause significant performance degradation (30%).

I need to initialize the categories dict once:

    log_entries = autovivification()
    categories = {'requests': {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 0, 'pages': 0, 'content_files': 0}, 'filter_action': {'re': 0, 'pl': 0, 'bs': 0}}

    def scrublooper(log_file):

        for ll in log_file:
            lld = logdomain(ll)
            # etc, etc, etc

However, when I initialize the categories dict anywhere outside the loop (whether inside the scrublooper function or right after the log_entries variable), the output is:

    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 685, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 685, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 685, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 685, 'pages': 0, 'content_files': 0}
    {'content_visual': 0, 'content_programsupdates': 0, 'content_text': 685, 'pages': 0, 'content_files': 0}

All the 'content_text' values have been incremented equally! What is happening here? I'm sure I'm violating some Python principle, but I don't know which one or how to find out. It took me hours to figure out that the problem was connected to the categories dict.

I'd be much obliged for an explanation.

I'm not familiar with the tools you're using, but when you create the dictionary outside of the loop, you're creating only one dictionary.

    if not log_entries[lan_host].has_key(domain):
        log_entries[lan_host][domain] = categories

This code makes log_entries[lan_host][domain] point to that single dictionary. Python doesn't copy values on assignment or anything like that, so every domain ends up bound to the same object. These lines both refer to the same dictionary:

    log_entries['192.168.5.210']['google.com']
    log_entries['192.168.5.210']['webtrendslive.com']
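
A minimal sketch of the same effect, stripped of the log-parsing details. The names `template` and `entries` and the reduced dict layout are made up for illustration, and the snippet uses Python 3 syntax. The only point is that assignment binds another reference to the same object rather than copying it.

```python
# One dictionary object, created once (a cut-down stand-in for `categories`).
template = {'requests': {'content_text': 0}}

entries = {}
# Both keys are bound to the *same* object; nothing is copied here.
entries['google.com'] = template
entries['osnews.com'] = template

# Incrementing through one key is visible through the other key.
entries['google.com']['requests']['content_text'] += 1

same_object = entries['google.com'] is entries['osnews.com']
print(same_object)                                          # True
print(entries['osnews.com']['requests']['content_text'])    # 1
```

The `is` check confirms there is one dict with two names, which is why every domain in the question reports the same total.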

P.S. I can't be sure, but my gut says that not wanting to initialize a new dictionary for performance reasons is excessive.
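
If building the dict on every pass really is a concern, one possible approach (a sketch under assumptions, not necessarily what the asker ended up using) is to keep a single template and give each domain its own deep copy only when that domain is first seen. `CATEGORIES_TEMPLATE`, `new_categories`, and `record` are hypothetical names, a plain dict stands in for the autovivification class, and the snippet uses Python 3 syntax.

```python
import copy

# Template built once; never stored directly in log_entries.
CATEGORIES_TEMPLATE = {
    'requests': {'content_visual': 0, 'content_programsupdates': 0,
                 'content_text': 0, 'pages': 0, 'content_files': 0},
    'filter_action': {'re': 0, 'pl': 0, 'bs': 0},
}

def new_categories():
    # deepcopy duplicates the nested dicts too; a shallow copy
    # would still share the inner 'requests' dict between domains.
    return copy.deepcopy(CATEGORIES_TEMPLATE)

def record(log_entries, lan_host, domain, category):
    host_entries = log_entries.setdefault(lan_host, {})
    # Copy only when a domain appears for the first time.
    if domain not in host_entries:
        host_entries[domain] = new_categories()
    host_entries[domain]['requests'][category] += 1

log_entries = {}
record(log_entries, '192.168.5.210', 'google.com', 'content_text')
record(log_entries, '192.168.5.210', 'google.com', 'content_text')
record(log_entries, '192.168.5.210', 'osnews.com', 'content_text')

print(log_entries['192.168.5.210']['google.com']['requests']['content_text'])  # 2
print(log_entries['192.168.5.210']['osnews.com']['requests']['content_text'])  # 1
```

The copy happens once per new domain rather than once per log line, so the counters stay independent without rebuilding the dict on every iteration.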

