iesave: code breaks if too many unique values in a string variable #358

@luizaandrade

If a string/categorical variable has too many unique values, levelsof fails with the error message "cannot compute". I just ran into this with a variable that had more than 700k unique values.

It now runs with the workaround of replacing the following lines

* Number of levels and complete observations
qui levelsof `var'
local varlevels = r(r)
local varcomplete = r(N)

with

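* Number of levels
preserve
	keep `var'
	qui duplicates drop
	* count stores its result in r(N), not r(r);
	* missing values are excluded to match the levelsof default
	qui count if !missing(`var')
	local varlevels = r(N)
restore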

* Number of complete observations
qui count if !missing(`var')		
local varcomplete	= r(N)

There may be a more elegant approach, though. If no one can think of one, I can open a PR with this one.
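
One possible alternative, as a rough sketch rather than a tested patch for iesave: egen's tag() function marks one observation per distinct non-missing value, so the levels can be counted without levelsof and without preserve/restore. The auto dataset and the local var set to make are stand-ins for illustration only; inside iesave, `var' would already be defined by the surrounding loop.

```stata
* Stand-in data and variable, for illustration only (not from iesave)
sysuse auto, clear
local var make

* Tag exactly one observation per distinct non-missing value of `var'
tempvar tag
egen byte `tag' = tag(`var')

* Number of levels: one tagged observation per distinct value
qui count if `tag' == 1
local varlevels = r(N)

* Number of complete observations
qui count if !missing(`var')
local varcomplete = r(N)

display "levels: `varlevels', complete: `varcomplete'"
```

This keeps the data in memory untouched apart from a temporary byte variable. Whether it is faster than the preserve/duplicates drop approach on a dataset with 700k+ unique values has not been checked here.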
