Reading Freebase data dump in Python, reads too few lines?
I am trying to use the Freebase data dump, but I seem to have problems reading the files with Python. It looks like the program can't read all the lines.
    # freebase_topic and freebase_quad hold the paths to the dump files
    def test2():
        count = 0
        for line in open(freebase_topic):
            count += 1
        return count

    def test3():
        count = 0
        for line in open(freebase_quad):
            count += 1
        return count

    if __name__ == "__main__":
        print "freebase topic - nr lines:", test2()
        print "freebase quad - nr lines:", test3()

Results in this:
    freebase topic - itr time: 1.21000003815
    freebase topic - nr lines: 1643010
    freebase quad - iter time: 0.797000169754
    freebase quad - nr lines: 3155131

This can't be all. It looks like too few lines to contain the whole of Freebase, and I can't see how it is possible to iterate over one 33 GB file and one 5 GB file in 2 seconds.
What is wrong? I am downloading the files again in case something went wrong during the download process, but that takes ages with my connection, so I am asking here in the meantime. The file size is correct, and I have printed some of the lines and they look correct.
This is a problem that has occurred to me as well:
open('file', 'rb') should solve it.
chr(26) sometimes causes the file to end early in text mode ('r'), which is the default.
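To check whether this is what is happening, one option is to count lines in binary mode and compare the number of bytes actually read with the file size on disk. The sketch below is only an illustration: the helper names `count_lines_and_bytes` and `read_whole_file` are made up for this answer, and it assumes the early stop really is caused by a chr(26) byte being taken as end-of-file in text mode (something usually associated with Windows).

```python
import os


def count_lines_and_bytes(path):
    """Count lines and bytes by iterating over the file in binary mode.

    Hypothetical helper, not part of any library.
    """
    lines = 0
    total_bytes = 0
    # 'rb' turns off text-mode handling, so a chr(26) byte is read as
    # ordinary data instead of being taken as an end-of-file marker.
    with open(path, "rb") as handle:
        for line in handle:
            lines += 1
            total_bytes += len(line)
    return lines, total_bytes


def read_whole_file(path):
    """Return True if the bytes seen while iterating match the size on disk."""
    _, total_bytes = count_lines_and_bytes(path)
    return total_bytes == os.path.getsize(path)
```

If `read_whole_file(freebase_topic)` returns True, the loop reached the real end of the file. Running the same byte count with the original `open(path)` call and getting a much smaller total would point to the text-mode issue.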