Python - set returning duplicates?


I'm pulling a list of URLs off the Census website and putting them in a set to make sure I don't end up with duplicates, then exporting the list of non-duplicate URLs to a .csv file. However, the set continues to return duplicate values, which shouldn't be possible. Here's my code:

    import bs4
    from bs4 import BeautifulSoup
    import requests
    import csv

    source_link = "https://www.census.gov/data/tables/2016/demo/popest/state-total.html"
    s = requests.get(source_link)
    usable_html = s.text
    setupsoup = BeautifulSoup(usable_html, 'lxml')
    silver = csv.writer(open("wgucsv.csv", "r+"))
    silver.writerow(["url"])

    for set(gold) in setupsoup.findAll('a', href=True):
        gold.add['href']
        print (gold)
        silver.writerow(gold)

As a bonus question, I need a way to convert the resulting relative URLs to absolute URLs, preferably before sorting them into the non-duplicated list. I thought adding them to a set would filter out duplicates on its own.

If you want to add them to a set, try:

    gold = set()
    for x in setupsoup.findAll('a', href=True):
        gold.add(x['href'])  # store the URL string, not the whole <a> tag

Or, more simply:

    gold = set(x['href'] for x in setupsoup.findAll('a', href=True))
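For the bonus question, here is a minimal sketch of one way to do it, assuming the standard library's urljoin is acceptable for resolving relative links against the page URL. The helper names (unique_absolute_urls, write_urls_csv, scrape_unique_links) are made up for illustration and are not part of any library. The sketch also builds the whole collection first and writes the CSV once afterwards; printing or writing the set inside the loop, as in the question, outputs the growing set on every iteration, which can look like duplicates.

```python
import csv
from urllib.parse import urljoin


def unique_absolute_urls(base_url, hrefs):
    """Resolve each href against base_url and drop duplicates, keeping first-seen order."""
    seen = set()
    ordered = []
    for href in hrefs:
        absolute = urljoin(base_url, href)  # already-absolute hrefs pass through unchanged
        if absolute not in seen:
            seen.add(absolute)
            ordered.append(absolute)
    return ordered


def write_urls_csv(urls, fileobj):
    """Write a header row, then one URL per row."""
    writer = csv.writer(fileobj)
    writer.writerow(["url"])
    for url in urls:
        # Wrap the string in a list so it lands in one cell;
        # passing a bare string would write one cell per character.
        writer.writerow([url])


def scrape_unique_links(source_link):
    """Hypothetical wrapper: fetch a page and return its de-duplicated absolute links."""
    # Imported here so the helpers above work without these third-party packages.
    import requests
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(requests.get(source_link).text, "lxml")
    hrefs = [a["href"] for a in soup.find_all("a", href=True)]
    return unique_absolute_urls(source_link, hrefs)
```

To produce the file, call scrape_unique_links(source_link) and pass the result to write_urls_csv along with a file opened as open("wgucsv.csv", "w", newline=""). Mode "w" creates the file if it does not exist, whereas "r+" raises an error in that case. Resolving the links before de-duplicating matters because a relative and an absolute href can point at the same page.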
