Add Wikidata queries to return data and maps URLs for grid operators #22
base: main
Conversation
|
Great idea, @davidhicks. We would first need a list of all grid operators. After that, we could search each operator's website for maps. |
|
Also check out: https://github.com/open-energy-transition/osm-wikidata-toolset @davidhicks @diazr-david
|
For alltheplaces, a spider has to be written individually for each transmission/distribution operator. There is an experimental branch that can semi-automatically create spiders for common patterns, such as ArcGIS Feature Server layers, but this doesn't save much time for transmission/distribution operator spiders. There are also helpers for simple, general GeoJSON and other vector file formats. What I find, though, is that every transmission/distribution operator applies wildly different tagging/metadata, and manual cleanup is almost always required. This can be anything from mapping tower/pole materials to OSM equivalents through to converting units (MVA, V, etc.) for transformers. I estimate it takes 15-30 minutes to write a typical alltheplaces spider for a transmission/distribution operator, depending on how much data cleanup is needed. |
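To give a feel for the cleanup step described above, here is a minimal sketch in plain Python. It does not use any alltheplaces helper, the source values in `MATERIAL_MAP` are hypothetical examples of what an operator dataset might contain, and the function names are made up for illustration. It assumes the target is a voltage expressed in volts and an OSM-style `material` value.

```python
import re
from typing import Optional

# Multipliers for the voltage units handled by this sketch (V and kV only).
_VOLT_MULTIPLIERS = {"v": 1, "kv": 1_000}

# Hypothetical source values mapped to OSM-style material values.
MATERIAL_MAP = {
    "timber": "wood",
    "steel": "steel",
    "concrete": "concrete",
}


def to_volts(raw: str) -> Optional[int]:
    """Parse strings such as '132 kV' or '400V' into integer volts.

    Returns None when the value is not recognised, so it can be
    reviewed manually instead of being guessed.
    """
    match = re.fullmatch(
        r"\s*([0-9]+(?:\.[0-9]+)?)\s*(k?v)\s*", raw, flags=re.IGNORECASE
    )
    if match is None:
        return None
    value = float(match.group(1))
    unit = match.group(2).lower()
    return round(value * _VOLT_MULTIPLIERS[unit])


def to_osm_material(raw: str) -> Optional[str]:
    """Map a source material label to an OSM-style value, or None."""
    return MATERIAL_MAP.get(raw.strip().lower())
```

Returning `None` for anything unrecognised keeps the operator-specific oddities visible rather than silently mis-tagging them, which matches the point that manual cleanup is almost always needed.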
|
Hello @davidhicks Regarding creating spiders for ATP:
How do you address license issues? It's true there is no common standard for grid mapping (sometimes CGMES is used, but operators won't publish their public maps in a data format designed for industry). OpenStreetMap is intended to reconcile them all. |
|
@davidhicks Here is a list of all the transmission system operators in Wikidata that you can download. We are considering cleaning this data soon. Many system operators from low- and middle-income countries are missing from Wikidata: https://query.wikidata.org/#SELECT%20%3Foperator%20%3FoperatorLabel%20%3Fcountry%20%3FcountryLabel%20WHERE%20%7B%0A%20%20%3Foperator%20wdt%3AP31%20wd%3AQ112046.%20%20%20%20%20%20%20%23%20Instance%20of%20transmission%20system%20operator%0A%20%20OPTIONAL%20%7B%20%3Foperator%20wdt%3AP17%20%3Fcountry.%20%7D%20%23%20Country%20%28if%20available%29%0A%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%0A%20%20%7D%0A%7D%0AORDER%20BY%20%3FcountryLabel
@flacombe To avoid any licensing issues, I would like us to obtain a list of all the maps first. Linking to the maps is not a problem. Mapping in OpenStreetMap is a different topic, not directly related to the Awesome Electrical Grid Mapping project. If this data is available on alltheplaces, we can also just link to it here. Whether and how we allow the data to be used as a hint layer is something we do need to discuss. |
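For readability, the link above decodes to the SPARQL query shown in the sketch below (whitespace lightly normalised). The helper names are made up for illustration, and the sketch assumes the public endpoint `https://query.wikidata.org/sparql` with a `format=json` parameter. It only builds and decodes URLs and does not send any request.

```python
from urllib.parse import quote, unquote, urlencode

# The query from the link above, percent-decoded.
SPARQL = """SELECT ?operator ?operatorLabel ?country ?countryLabel WHERE {
  ?operator wdt:P31 wd:Q112046.  # Instance of transmission system operator
  OPTIONAL { ?operator wdt:P17 ?country. }  # Country (if available)

  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }
}
ORDER BY ?countryLabel"""


def gui_link(query: str) -> str:
    """Build a shareable query.wikidata.org link (query in the URL fragment)."""
    return "https://query.wikidata.org/#" + quote(query, safe="")


def decode_gui_link(link: str) -> str:
    """Recover the SPARQL text from a query.wikidata.org link."""
    return unquote(link.split("#", 1)[1])


def endpoint_url(query: str) -> str:
    """Build a URL for fetching results as JSON (assumed endpoint and parameter)."""
    params = urlencode({"query": query, "format": "json"})
    return "https://query.wikidata.org/sparql?" + params
```

Keeping the query as plain text in the repository and generating the links from it would make later changes easier to review than editing percent-encoded URLs by hand.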
In short, the ATP repository is just a collection of freely licensed spiders (not containing any extracted data). Each feature extracted by a spider is linked to the ATP spider and the URL from which the feature was extracted. Some spiders add other dataset-related metadata, for example, the date of the last data update as advised by the data source, or the preferred attribution string the data source provides for citing the data. ATP spiders only extract data that is publicly accessible without having to agree to a contract, register for an account, or obtain an API key or password. This is different from, for example, yt-dlp (for video extraction), which allows users to specify account details to extract video that is not publicly accessible. Running the spiders and publishing the extracted data is a separate activity. In jurisdictions that reject the sweat-of-the-brow doctrine, which I think is now most of them, data generated from an ATP crawl is not subject to copyright because it is a collation of facts and doesn't meet the threshold of creativity and/or originality. ATP spiders further reinforce this point by:
It is, however, ultimately up to the person running ATP spiders to decide which spiders they are comfortable running and publishing data for, depending on that person's jurisdiction. I don't think ATP data extraction is as legally messy as the Wikimedia Commons restrictions on images/video, where, for example, an entirely separate law unrelated to copyright prohibits commercial use of imagery of some Australian national parks (see https://commons.wikimedia.org/wiki/Kata_Tjuta for example). But as you can see from this example, an Australian law (probably never enforced in practice) doesn't result in Wikimedia Commons preventing the upload of images of Kata Tjuta, or in foreigners being prevented from selling such images overseas. Wikimedia Commons leaves it up to users of the images to decide what they can and can't do with them. For more information on how complex Wikimedia Commons imagery restrictions can be on a jurisdiction-by-jurisdiction basis, see https://commons.wikimedia.org/wiki/Commons:Freedom_of_panorama
OSM has very strict data import rules that, as far as I'm aware, require written confirmation from a data source to OK the import of a data set into OSM. It doesn't matter if the data source states "This is public domain data"; OSM will still expect written confirmation from the data source. There are already some tools and people using ATP data to help find missing or incorrect/outdated features and tags in OSM, and then correcting them based on manual human review of ATP data (and other sources the human may have available). This seems to be accepted by the OSM community as not being an import because there is a human in the loop who is not just blindly or automatically importing random data into OSM.
Source data and ATP extracted data are not always correct. Plenty of errors in source data will carry forward into the ATP extracted data (for example, geographic coordinates that are off by 100 m). |
I've been updating Wikidata, so this list of transmission system operators and distribution network operators should be improving each day, with additional external data/map URLs being added as they're found. |
|
@davidhicks Awesome work. |