python - Loading directory of JSON files takes much less time on 2nd run
I'm loading a directory of 3.7M JSON files on a 64-bit Ubuntu 14.04 server using Python. The strange thing is that the first time I run this simple script it takes multiple hours, whereas the second run takes 100 seconds. How can that be? Is the OS doing some kind of disk optimization in the background?
code
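    import glob
    import json

    parsed_links = {}
    paths = glob.glob('parsed_links/*')
    for i, p in enumerate(paths):
        try:
            with open(p, 'r') as f:
                parsed = json.load(f)
            # index each parsed record by its 'link' field
            parsed_links[parsed['link']] = parsed
        except Exception as e:
            # unreadable or malformed files are silently skipped
            pass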
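One way to check the caching hypothesis is to time the same load twice in a row. On Linux, recently read file contents and directory metadata are kept in memory (the page cache), so a second pass may avoid most disk reads. The sketch below is only an illustration under that assumption: `timed_load` is a hypothetical helper name, not part of the original script, and it narrows the exception handling to the errors this loop is likely to hit.

```python
import glob
import json
import os
import time


def timed_load(directory):
    """Load every JSON file in `directory`, keyed by its 'link' field.

    Hypothetical helper: returns (parsed_links, elapsed_seconds).
    """
    start = time.time()
    parsed_links = {}
    for path in glob.glob(os.path.join(directory, '*')):
        try:
            with open(path, 'r') as f:
                parsed = json.load(f)
            parsed_links[parsed['link']] = parsed
        except (IOError, ValueError, KeyError, TypeError):
            # Skip unreadable files, invalid JSON, and records without 'link'.
            continue
    return parsed_links, time.time() - start
```

Calling `timed_load('parsed_links')` twice and comparing the two elapsed times would show how much of the first run was spent waiting on disk rather than parsing JSON.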