SN 988: National Public Data

Beep boop - this is a robot. A new show has been posted to TWiT…

What are your thoughts about today’s show? We’d love to hear from you!

This is the first time I’ve ever been able to see the “revoked.grc” page, so maybe this comment is a little late but… mentioning IE and not Edge? What year is it!? :rofl:

Also is my comment going to be deleted if I write the email address down?

On the one hand I get why he’s keeping it obscure, but on the other hand is he worried that he’s going to be spammed to hell and back if it’s a public address? Maybe add a secret word into each podcast so he knows it’s from actual listeners?

I commented something similar on TWiT, but it’s a shame you can only use the NPD pentester site if you live in the US. If you don’t input a US state, you get no results.

Yeah, he’s been made aware (I see commentary in the newsgroups.) I expect he’ll comment on it in the next show.


I have the data on my NAS. (It’s huge: two files, one 165G and one 120G.) From some basic searching, I cannot yet confirm there is anything in the data other than US addresses. The data does have some quality issues; some of it was clearly entered manually. For example, I found someone with a Chinese-seeming name listed as being in Toronto Ontario, NH. I am just about to write a small parser to extract the unique entries in the state field, and maybe also in the city field (if it doesn’t take too much memory). There is no country field to look at in the data I have seen.
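For anyone wanting to try something similar, here is a minimal sketch of that kind of unique-value extraction. It is not the poster's utility; the column layout (state at index 5) and the sample rows are made up for illustration. Only the distinct values and their counts are kept in memory, so the size of the input file should not matter much.

```python
from collections import Counter

def count_column_values(lines, col):
    """Tally each distinct value found in one comma-separated column.

    `lines` can be any iterable of strings, including an open file,
    so the input is processed one line at a time.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\r\n").split(",")
        if col < len(fields):
            counts[fields[col].strip()] += 1
    return counts

# Hypothetical rows: id, first, last, street, city, state, zip.
# The real dump's layout may be different.
sample = [
    "1,JANE,DOE,12 MAIN ST,CONCORD,NH,03301\n",
    "2,JOHN,ROE,9 ELM AVE,AUSTIN,TX,73301\n",
    "3,ANN,LEE,4 OAK RD,TORONTO ONTARIO,NH,00000\n",
]
states = count_column_values(sample, col=5)
print(sorted(states.items()))
```

With a real file, passing `open(path, encoding="utf-8", errors="replace")` as `lines` would stream it lazily instead of loading it all at once.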


So my utility produced some interesting results. The first thing to note is that I screwed up a little. It seemed all the data was in upper case, so I didn’t force the state entry into a particular case. It turns out there are variations in case, so I got some duplicates. Also, it’s supposed to be CSV data, but there is no “protection” against a comma being in a field. (Normal CSV protects such fields by requiring them to be in quotes, which prevents parsing ambiguities.) As a result, you’ll notice some of the records didn’t get parsed, or got parsed improperly, because there is no automated way to resolve the ambiguity. This means some strange entries for city and/or state. Finally, I will list some examples below of non-American addresses that have a state anyway, because the data was forced into an American address template.
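As a rough illustration of the two problems described here, the sketch below triages lines by field count and upper-cases the state before tallying. The expected field count, the column index, the category names and the sample rows are all assumptions for the example, not the poster's actual rules. Python's `csv` module would not help with the stray commas, since the fields are not quoted in the first place.

```python
from collections import Counter

EXPECTED_FIELDS = 7  # assumption: id, first, last, street, city, state, zip
STATE_COL = 5        # assumption: position of the state column

def classify(line, expected=EXPECTED_FIELDS):
    """Split a line on commas and label it by how many fields came out."""
    fields = line.rstrip("\r\n").split(",")
    if len(fields) == expected:
        return "ok", fields
    if len(fields) > expected:
        # An unquoted comma inside some field; we cannot tell which one.
        return "ambiguous", fields
    return "short", fields

def normalize_state(value):
    """Fold case and trim spaces so 'Nh' and 'NH' count as one value."""
    return value.strip().upper()

sample = [
    "1,JANE,DOE,12 MAIN ST,CONCORD,Nh,03301",
    "2,JOHN,ROE,9 ELM AVE, APT 2,AUSTIN,TX,73301",  # extra comma
    "3,ANN",                                        # truncated
    "4,SAM,POE,1 PINE ST,NASHUA,NH,03060",
]

outcomes = Counter()
states = Counter()
for line in sample:
    kind, fields = classify(line)
    outcomes[kind] += 1
    if kind == "ok":
        states[normalize_state(fields[STATE_COL])] += 1

print(dict(outcomes))
print(dict(states))
```

Only the lines with the expected field count feed the state tally here; the ambiguous ones are counted separately rather than guessed at.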

lineCount = 2695681513

recordCount = 2644837538
unparsedRecords = 50843975
questionableParseRecords = 33889268

blankCityAndStateCount = 6759714
blankCityCount = 6804685
blankStateCount = 8710960

states size = 155
cities size = 210040
citiesByState size = 280134

All States:
  35
  AA
  AB
  AE
  AK
  AL
  ALLEGANY
  ANNE ARUNDEL
  AP
  AR
  AS
  AZ
  Ar
  BALTIMORE CITY
  BC
  BR
  BRONX
  CA
  CAROLINE
  CD
  CECIL
  CHO
  CI
  CN
  CO
  CT
  Ca
  Co
  DC
  DE
  DORCHESTER
  DURHAM
  Dc
  De
  EA
  FAYETTEVILLE
  FF
  FL
  FM
  Fl
  G
  GA
  GL
  GREENSBORO
  GU
  Ga
  HARRIS
  HI
  Hi
  IA
  ID
  IL
  IN
  Il
  In
  JA
  KENT
  KEY BISCAYNE
  KS
  KY
  Ke
  Ks
  LA
  Lo
  MA
  MB
  MD
  ME
  MH
  MI
  MN
  MO
  MP
  MS
  MT
  Ma
  Md
  Me
  Mi
  Mn
  Mo
  Ms
  Mt
  NA
  NC
  ND
  NE
  NH
  NJ
  NM
  NV
  NY
  Nc
  Nd
  Ne
  Nh
  Nj
  Nv
  Ny
  OA
  OH
  OK
  ON
  OR
  OT
  Oh
  Or
  PA
  PR
  PW
  Pa
  Pr
  QC
  QUEEN ANNES
  RE
  RI
  RO
  Ri
  S.
  SAINT MARYS
  SAN DIEGO
  SC
  SD
  SILER CITY
  SK
  SP
  ST
  Sc
  Sd
  TALBOT
  TN
  TX
  Tn
  Tx
  UT
  Ut
  VA
  VI
  VT
  Va
  WA
  WE
  WI
  WV
  WY
  Wa
  We
  Wi
  Wv
  co
  ny
  pa
  st
  tx
  va

Non-American City Examples (city,state):
  calgary ab ca,WA
  calgary ab can,MA
  calgary ab cana,TX
  calgary ab canad,CA
  calgary ab canad,MN
  calgary ab canad,TX
  calgary ab canada,CA
  calgary ab,CA
  calgary ab,NV
  calgary albe,OH
  calgary alber,CA
  calgary alber,DE
  calgary alber,MN
  calgary alber,MT
  calgary alberta te,AZ
  calgary alberta,AB
  calgary alberta,CA
  calgary alberta,CO
  calgary alberta,CT
  calgary alberta,IL
  calgary alberta,LA
  calgary alberta,MN
  calgary alberta,MS
  calgary alberta,NC
  calgary alberta,NY
  calgary alberta,PA
  calgary alberta,TN
  calgary alberta,TX
  calgary alberta,WI
  calgary alborta,TX
  calgary alerb,NE
  calgary alta can,MN
  calgary canad,NJ
  calgary canada t,AZ
  calgary canada,DC
  calgary canada,TX
  calgary canada,VA
  calgary on taj,MA
  calgary thj,CA
  calgary,AK
  calgary,AL
  calgary,AZ
  calgary,CA
  calgary,CO
  calgary,CT
  calgary,FL
  calgary,IL
  calgary,KY
  calgary,MA
  calgary,MI
  calgary,MN
  calgary,MT
  calgary,NJ
  calgary,NV
  calgary,NY
  calgary,OH
  calgary,PA
  calgary,TX
  calgary,VA
  calgary,WA
  calgaryabtb,CA
  calgaryalberta,CA
  calgaryalberta,FL
  calgarycanada,CA
  calgate,WI
  calgery alberta,PA
  calgery,MT
  calgory alberta te,AL
  calgory,AR
  calgory,CA
  toron,NJ
  torono,OH
  toronot,OH
  toronoto,AZ
  toronoto,OH
  toronto ca,GA
  toronto canad,CA
  toronto canad,IL
  toronto canada,AZ
  toronto canada,CA
  toronto canada,FL
  toronto canada,IL
  toronto canada,MA
  toronto canada,NJ
  toronto canada,NY
  toronto canada,RI
  toronto nd,DE
  toronto o,MN
  toronto on ca,CA
  toronto on ca,NY
  toronto on ca,TX
  toronto on cana,CA
  toronto on canad,CA
  toronto on canad,GA
  toronto on canad,NC
  toronto on mgm r,MA
  toronto on mn,MA
  toronto on msa p,CA
  toronto on,CA
  toronto on,CO
  toronto on,MA
  toronto ont m,MA
  toronto ontaio,VA
  toronto ontar,CA
  toronto ontar,FL
  toronto ontar,IL
  toronto ontar,OH
  toronto ontario,AZ
  toronto ontario,CA
  toronto ontario,CD
  toronto ontario,FL
  toronto ontario,MA
  toronto ontario,MN
  toronto ontario,NH
  toronto ontario,NJ
  toronto ontario,OR
  toronto ontario,SC
  toronto ontario,TX
  toronto onterio,MA
  toronto,AL
  toronto,AZ
  toronto,CA
  toronto,CO
  toronto,CT
  toronto,FL
  toronto,GA
  toronto,IA
  toronto,IL
  toronto,IN
  toronto,KS
  toronto,MA
  toronto,MD
  toronto,ME
  toronto,MI
  toronto,MN
  toronto,MO
  toronto,MS
  toronto,NC
  toronto,NJ
  toronto,NY
  toronto,OH
  toronto,OK
  toronto,ON
  toronto,PA
  toronto,SC
  toronto,SD
  toronto,TX
  toronto,UT
  toronto,VA
  toronto,VT
  toronto,WV
  torontocanada,CA
  torontocanada,MA
  torontomwg,NY
  toronton,OH
  torontoon,TX
  torontoontaria,OH
  torontoontario,CA
  torontoontario,NJ
  toronyo,OH
  toroto,IL
  baguio city p,NY
  baguio city philippines,CA
  baguio city,CA
  baguio city,PA
  baguio city,PR
  baguiocityp,NY
  zurich switz,NJ
  zurich switzerla,MN
  zurich switzerla,NC
  zurich switzerland,NH
  zurich,CT
  zurich,IL
  zurich,KS
  zurich,MA
  zurich,MT
  zurich,NJ
  zurora,IL
  zurrich,CO
  zuruch,IL
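One way to sort through the state values listed above is to bucket them against known two-letter codes. The sketch below is only an illustration: the bucket names are made up, while the code sets are the standard USPS abbreviations (states, DC, territories and freely associated states, military "states") and the Canada Post province and territory abbreviations.

```python
US_STATES = set(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN "
    "MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA "
    "WA WV WI WY DC".split()
)
# US territories plus the freely associated states that have USPS codes.
US_TERRITORIES = set("AS GU MP PR VI FM MH PW".split())
# Armed Forces Americas / Europe / Pacific.
MILITARY = set("AA AE AP".split())
CANADA = set("AB BC MB NB NL NS NT NU ON PE QC SK YT".split())

def classify_state(value):
    """Return a bucket name (hypothetical labels) for a raw state value."""
    code = value.strip().upper()
    if code in US_STATES:
        return "us_state"
    if code in US_TERRITORIES:
        return "us_territory"
    if code in MILITARY:
        return "military"
    if code in CANADA:
        return "canada"
    return "other"

for raw in ["Nh", "ON", "QC", "AE", "PR", "ANNE ARUNDEL", "35"]:
    print(f"{raw!r}: {classify_state(raw)}")
```

Values that land in "other" (county names, numbers, stray fragments) are probably where the mis-parsed records show up.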

Very impressive. I suspected that even inside the state field there would be stuff like ON, QC, etc.; it’s just that the NPD web site didn’t allow that. I actually took a quick look, for funsies, to see if I could poke at their JS to play with the API, but I wasn’t successful.