Many-to-One mappings

One nagging problem with `countrycode` (e.g., https://github.com/vincentarelbundock/countrycode/issues/182 and https://github.com/vincentarelbundock/countrycode/issues/180) is that the current approach to `codelist` strictly requires bidirectional one-to-one mappings.

This is problematic in cases where we want:

- Russia -> RUS (iso)
- USSR -> RUS (iso)
- RUS -> Russia

I have been looking for a solution for a long time without much success. Today, I pushed a (nearly working) branch with a potential path forward: https://github.com/vincentarelbundock/countrycode/tree/manytoone

The concept:

1. A unique regex identifies *every single* geographic unit covered by any of the schemes in `countrycode`. This means, for example, that we need different regexes for Russia and USSR because Correlates of War treats them separately.
2. Each *destination* code must be associated with one and only one regex: many-to-one.
3. *Origin* codes can be associated with more than one regex: many-to-one.
4. This requires that we keep separate lists of origin and destination codes. The differences between origin and destination codes are handled explicitly in a centralized location: `dictionary/merge.R`.
5. Instead of using `codelist` internally, we use `codelist_map`, which is a list of lists of data.frames. For example, if we want to convert from `cowc` to `iso3c`, we use `codelist_map$cowc$iso3c`, which is a data.frame with only two columns.
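
To make item 5 concrete, here is a rough sketch of what a nested `codelist_map` lookup could look like. This is not the code in the `manytoone` branch: the column names, the toy rows, and the `convert()` helper are all assumptions made for illustration. It also uses exact string matching where the package would use regexes for country names.

```r
# Hypothetical sketch: toy data and names are assumptions, not the
# actual contents of the `manytoone` branch.
# Structure: codelist_map[[origin]][[destination]] is a two-column data.frame.
codelist_map <- list(
  country.name = list(
    iso3c = data.frame(
      country.name = c("Russia", "USSR", "United States"),
      iso3c        = c("RUS", "RUS", "USA"),  # two origins share one destination
      stringsAsFactors = FALSE
    )
  ),
  iso3c = list(
    country.name = data.frame(
      iso3c        = c("RUS", "USA"),
      country.name = c("Russia", "United States"),  # RUS maps back to Russia only
      stringsAsFactors = FALSE
    )
  )
)

# Hypothetical helper: look up `x` in the origin column of the relevant
# two-column table and return the matching destination values.
# Unmatched inputs come back as NA.
convert <- function(x, origin, destination, map = codelist_map) {
  tab <- map[[origin]][[destination]]
  tab[[destination]][match(x, tab[[origin]])]
}

convert(c("Russia", "USSR"), "country.name", "iso3c")
convert("RUS", "iso3c", "country.name")
```

Because each origin/destination pair gets its own table, the "Russia and USSR both go to RUS, but RUS goes back to Russia" case no longer has to fit in a single bidirectional `codelist`.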

The key point, for me, is number 4 above. Right now, too much still happens in the `get_*` functions. The `get_*` functions should just be scrapers, and users should have access to a well-documented script to see how we reconciled origin vs. destination codes.

Curious what @cjyetman thinks of this.

Many-to-One mappings #186

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Many-to-One mappings #186

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions