User:Jules/howtogeoIP

"How do IP Geolocation service providers fill their databases?"

I found some information on quora.com regarding the matter.

Methods

Rob Friedman, Executive Vice President & Co-Founder Digital Envoy / Digital Element Leader ...

IP geolocation databases are generally gathered based on the following:
1. IP spidering--traceroutes and other automated methods designed to map the routing infrastructure of the Internet. These techniques can be fairly complex and time-consuming, given the task (4+ billion IP addresses that are constantly allocated, deallocated, or moved). Plus, with IPv6, this becomes orders of magnitude more difficult.
2. Data supplied by users tied to IP addresses--some companies take anonymous user data (postal codes/city) tied to IP addresses and use that to help populate their databases. Obviously, this data needs to be carefully scrubbed to make sure it's reliable.
3. Sharing relationships with ISPs. Companies such as mine (Digital Element, http://www.digitalelement.com/) are often contacted by ISPs to make sure our data is accurate, because they don't want their users to be incorrectly targeted by services such as Hulu or ESPN and possibly blocked from content when they should otherwise be able to get it. This data is usually highly accurate, assuming it is kept up to date, because ISPs have perfect knowledge of the location of their own IP addresses.
4. Registry data--looking at ARIN, RIPE, etc. [Generally not that accurate.]
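
As an illustration of method 2 (user-supplied data that has to be scrubbed), here is a minimal sketch in Python. It assumes reports arrive as (IP, postal code) pairs, groups them by /24 prefix, and keeps a prefix only when enough reports agree. The function name, the /24 grouping and both thresholds are assumptions made for this example; nothing in the answer says how any provider actually does it.

```python
import ipaddress
from collections import Counter, defaultdict


def aggregate_reports(reports, min_samples=3, min_agreement=0.6):
    """Group (ip, postal_code) reports by /24 and keep confident majorities.

    min_samples and min_agreement are illustrative thresholds, not values
    taken from any real provider.
    """
    by_prefix = defaultdict(Counter)
    for ip, postal_code in reports:
        # strict=False lets us pass a host address and get its /24 network.
        prefix = ipaddress.ip_network(f"{ip}/24", strict=False)
        by_prefix[prefix][postal_code] += 1

    result = {}
    for prefix, counts in by_prefix.items():
        total = sum(counts.values())
        code, hits = counts.most_common(1)[0]
        # Drop prefixes with too few reports or no clear majority.
        if total >= min_samples and hits / total >= min_agreement:
            result[prefix] = code
    return result


# Invented sample data using documentation address ranges.
sample_reports = [
    ("192.0.2.1", "75001"),
    ("192.0.2.7", "75001"),
    ("192.0.2.9", "75002"),
    ("198.51.100.5", "10115"),
]
cleaned = aggregate_reports(sample_reports)
```

With this sample data, the first prefix is kept because two of its three reports agree, and the second is dropped because a single report is not enough to trust.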

Jeremy Grosser

Most GeoIP services use one of MaxMind's databases to resolve IPs to locations.

For residential and commercial business services, MaxMind is likely collecting this data from ISPs that usually assign a subnet to a service area or region. These subnets are often assigned to a central office or multiplexing point within the network, not an individual customer's location, making GeoIP's accuracy somewhat misleading.

ISPs providing satellite or cellular data services complicate things further, making it difficult or impossible to determine an accurate location for a given IP without realtime cooperation of the network provider.

IP addresses may also be assigned to a smaller organization that MaxMind does not have a data sharing agreement with. In these cases, the GeoIP location will likely resolve to the physical address in the network's whois record, which may not have any relation at all to its actual location. (indeed)

Greg Villain

Multiple sources:
- RIRs
- Payment / Shopping cart info channels
- Crowd Sourced feedback chain

ARIN, RIPE, LACNIC, AFRINIC, JPNIC and all Routing Registries that allocate IP ranges set a country in the IP allocation record depending on where the ISP operates.
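
A country lookup built from such allocation records could be sketched as follows. This is only an illustration: the record list is invented (documentation address ranges with made-up country codes), the function names are hypothetical, and the ranges are assumed not to overlap.

```python
import bisect
import ipaddress

# Hypothetical allocation records: (network, country code).
# Documentation ranges with invented countries, not real registry data.
ALLOCATIONS = [
    ("192.0.2.0/24", "FR"),
    ("198.51.100.0/24", "DE"),
    ("203.0.113.0/24", "JP"),
]


def build_index(allocations):
    """Turn networks into sorted (first, last, country) integer ranges."""
    rows = []
    for network, country in allocations:
        net = ipaddress.ip_network(network)
        rows.append((int(net.network_address), int(net.broadcast_address), country))
    rows.sort()
    starts = [row[0] for row in rows]
    return starts, rows


def country_for(ip, index):
    """Return the country of the range containing ip, or None."""
    starts, rows = index
    value = int(ipaddress.ip_address(ip))
    # Find the last range that starts at or before the address.
    pos = bisect.bisect_right(starts, value) - 1
    if pos >= 0 and rows[pos][0] <= value <= rows[pos][1]:
        return rows[pos][2]
    return None


index = build_index(ALLOCATIONS)
```

Calling `country_for("198.51.100.42", index)` returns the country attached to the second record, while an address outside every listed range returns `None`. As the answers above point out, this only gives the country stored in the allocation record, which is not necessarily where the address is actually used.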

The other way is to tie online shoppers' IPs to the address they choose for shipping. I think MaxMind offers online payment security products that allow them to get this info across their products.

Lastly, they all have some sort of free version of their products, which also offers a feedback chain to crowdsource corrections for whatever data is found to be incorrect.

If you are looking at building one, the answer is usually to use at least two of them, one cheap and one expensive (MaxMind is the cheapest, Quova and DE the most expensive), and add your own layer that supersedes both records, based on whatever address info you can get from your users for their IPs. (Then match lat/long using Google Maps, for instance.)
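
The layering described above can be sketched in a few lines of Python. Plain dictionaries stand in for the three data sources; real products have their own lookup APIs, which are not shown here. The function name and the order between the two commercial sources (expensive before cheap) are assumptions for this example; the answer only says that your own layer supersedes both.

```python
def layered_lookup(ip, own_data, premium_db, cheap_db):
    """Return (source_name, record) from the first layer that knows the IP.

    own_data is consulted first because it supersedes both commercial
    databases; trying premium before cheap is an assumption of this sketch.
    """
    layers = (("own", own_data), ("premium", premium_db), ("cheap", cheap_db))
    for source_name, source in layers:
        record = source.get(ip)
        if record is not None:
            return source_name, record
    return None, None


# Invented sample data: dictionaries standing in for real databases.
own_data = {"192.0.2.10": {"city": "Lyon"}}
premium_db = {"192.0.2.10": {"city": "Paris"}, "198.51.100.7": {"city": "Berlin"}}
cheap_db = {"198.51.100.7": {"city": "Hamburg"}, "203.0.113.9": {"city": "Tokyo"}}

first = layered_lookup("192.0.2.10", own_data, premium_db, cheap_db)
second = layered_lookup("198.51.100.7", own_data, premium_db, cheap_db)
third = layered_lookup("203.0.113.9", own_data, premium_db, cheap_db)
```

Returning the source name along with the record makes it easy to see later which layer answered, which helps when deciding whether the commercial data for an IP needs a correction.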

Summing up the sources

IP spidering (traceroute) -> demands too many resources at a wide scale
Data provided by users -> feedback from users (free versions of products) or data taken from online payment services (crossing IP with shipping address)
ISP relationship -> ISPs want their users to be able to access geolocated online services correctly
RIR -> they distribute blocks of IP addresses to ISPs and record a country for each allocation
Whois registry -> very imperfect

The Digital Element method

http://www.google.com/patents/US6757740