pycvcqv


Find homogeneity with confidence.

Python port of cvcqv (https://github.com/MaaniBeigy/cvcqv)

Introduction

pycvcqv provides functions to quantify homogeneity with confidence intervals. It builds confidence intervals for the Coefficient of Variation (cv) and the Coefficient of Quartile Variation (cqv) using well-established methods from the literature (Kelley, McKay, Miller, Vangel, Mahmoudvand-Hassani, Equal-Tailed, Shortest-Length, Normal Approximation, Bonett, and the Abu-Shawiesh-Akyuz-Kibria adjusted-degrees-of-freedom, large-sample, and augmented-large-sample intervals) as well as bootstrap resampling techniques (Normal, Basic, Percentile, BCa).

Coefficient of Variation

cv is a measure of relative dispersion representing the degree of variability relative to the mean (Albatineh et al., 2014). Since cv is unitless, it is useful for comparing variables that have different units. It is also a measure of homogeneity (Albatineh et al., 2014).
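As a rough illustration of the definition, the point estimate can be reproduced with the standard library alone. The helper name `cv_sketch` is hypothetical and is not part of pycvcqv; the sketch assumes the sample (n - 1) standard deviation, which matches the point estimate reported for the 20-value sample in the Usage section.

```python
from statistics import mean, stdev


def cv_sketch(values, multiplier=1.0):
    """Sample standard deviation divided by the mean (hypothetical helper)."""
    return multiplier * stdev(values) / mean(values)


sample = [
    0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
    4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9,
]

# Express the ratio as a percentage, as multiplier=100 does in the examples.
cv_percent = cv_sketch(sample, multiplier=100)
print(round(cv_percent, 2))  # 57.77
```

Because the ratio divides by the mean, it is only meaningful for data whose mean is well away from zero.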

Coefficient of Quartile Variation

cqv is a measure of relative dispersion based on the interquartile range (IQR). Since cqv is unitless, it is also useful for comparing variables that have different units. It is also a measure of homogeneity (Bonett, 2006; Altunkaynak, 2018).
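The usual form of this statistic is (Q3 - Q1) / (Q3 + Q1), where Q1 and Q3 are the first and third quartiles. The sketch below uses a hypothetical helper, `cqv_sketch`, and assumes linear-interpolation quartiles (the "inclusive" method in Python's `statistics` module); that choice reproduces the point estimates shown in this README, but it is not taken from the pycvcqv source.

```python
from statistics import quantiles


def cqv_sketch(values, multiplier=1.0):
    """(Q3 - Q1) / (Q3 + Q1) with linearly interpolated quartiles."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return multiplier * (q3 - q1) / (q3 + q1)


sample = [0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4, 4.6, 5.4, 5.4]

cqv_percent = cqv_sketch(sample, multiplier=100)
print(round(cqv_percent, 4))  # 51.7241
```

Other quartile definitions can give slightly different values on small samples, so results may not match if a different interpolation rule is used.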

Install

pip install pycvcqv

Usage

import pandas as pd
from pycvcqv import coefficient_of_variation, cqv

# cv of a single sample, scaled to percent and rounded to 2 digits
coefficient_of_variation(
    data=[
        0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
        4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9
    ],
    multiplier=100,
    ndigits=2
)
# {'cv': 57.77, 'lower': 41.43, 'upper': 98.38}

# cqv point estimate of a single sample, scaled to percent
cqv(
    data=[0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4, 4.6, 5.4, 5.4],
    multiplier=100,
)
# 51.7241

# Column-wise estimates for a DataFrame
data = pd.DataFrame(
    {
        "col-1": pd.Series([0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5]),
        "col-2": pd.Series([5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9]),
    }
)
coefficient_of_variation(data=data, num_threads=3)
#   columns      cv      lower      upper
# 0   col-1  0.6076     0.3770     1.6667
# 1   col-2  0.1359     0.0913     0.2651
cqv(data=data, num_threads=-1)
#   columns      cqv
# 0   col-1  0.3889
# 1   col-2  0.0732

Confidence-interval methods for cv

coefficient_of_variation accepts a method argument that selects the confidence-interval estimator.

from pycvcqv import coefficient_of_variation

x = [
    0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
    4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9,
]

for method in (
    "kelley", "mckay", "miller", "vangel",
    "mahmoudvand_hassani", "equal_tailed",
    "shortest_length", "normal_approximation",
    "aak_adj", "aak_ls", "aak_als",
    "norm", "basic", "perc", "bca",
):
    print(method, coefficient_of_variation(
        data=x,
        method=method,
        multiplier=100,
        ndigits=3,
        num_replicates=10000,
        random_state=42,
    ))

Output (95% CI, multiplier=100, ndigits=3; bootstrap methods use num_replicates=10000 and random_state=42):

| method | est | lower | upper | description |
| --- | --- | --- | --- | --- |
| kelley | 57.774 | 41.303 | 97.950 | cv with Kelley 95% CI |
| mckay | 57.774 | 41.441 | 108.483 | cv with McKay 95% CI |
| miller | 57.774 | 34.053 | 81.495 | cv with Miller 95% CI |
| vangel | 57.774 | 40.955 | 103.931 | cv with Vangel 95% CI |
| mahmoudvand_hassani | 57.774 | 43.476 | 82.857 | cv with Mahmoudvand-Hassani 95% CI |
| equal_tailed | 57.774 | 43.937 | 84.383 | cv with Equal-Tailed 95% CI |
| shortest_length | 57.774 | 42.015 | 81.013 | cv with Shortest-Length 95% CI |
| normal_approximation | 57.774 | 44.533 | 85.272 | cv with Normal Approximation 95% CI |
| aak_adj | 57.774 | 48.029 | 72.516 | cv with Abu-Shawiesh-Akyuz-Kibria Adjusted-DoF 95% CI |
| aak_ls | 57.774 | 46.310 | 72.075 | cv with Abu-Shawiesh-Akyuz-Kibria Large-Sample 95% CI |
| aak_als | 57.774 | 45.839 | 75.092 | cv with Abu-Shawiesh-Akyuz-Kibria Augmented-LS 95% CI |
| norm | 57.774 | 38.850 | 78.379 | cv with Normal Approximation Bootstrap 95% CI |
| basic | 57.774 | 37.716 | 77.166 | cv with Basic Bootstrap 95% CI |
| perc | 57.774 | 38.382 | 77.832 | cv with Bootstrap Percentile 95% CI |
| bca | 57.774 | 41.556 | 83.032 | cv with Adjusted Bootstrap Percentile (BCa) 95% CI |
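To give a feel for what an analytic interval involves, here is a minimal sketch of one commonly cited form of Miller's asymptotic interval, cv ± z · sqrt(cv² (0.5 + cv²) / (n − 1)). The function name `miller_ci_sketch` and its parameters are hypothetical, and the formula is an assumption about the method rather than a copy of pycvcqv's implementation. On the sample above it lands within about 0.01 percentage points of the miller row.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def miller_ci_sketch(values, conf_level=0.95, multiplier=1.0):
    """Symmetric normal-theory interval around the sample cv (assumed formula)."""
    n = len(values)
    cv = stdev(values) / mean(values)
    # Two-sided standard normal critical value, about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sqrt(cv**2 * (0.5 + cv**2) / (n - 1))
    return {
        "cv": multiplier * cv,
        "lower": multiplier * (cv - half_width),
        "upper": multiplier * (cv + half_width),
    }


x = [
    0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
    4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9,
]

result = miller_ci_sketch(x, multiplier=100)
print({key: round(value, 2) for key, value in result.items()})
```

The interval is symmetric around the estimate by construction, which is why the miller row has a lower bound well below those of the other analytic methods.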

Confidence-interval methods for cqv

cqv accepts a method argument that selects the confidence-interval estimator. When method is omitted, only the point estimate is returned (the legacy behavior).

from pycvcqv import cqv

x = [
    0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
    4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9,
]

for method in ("bonett", "norm", "basic", "perc", "bca"):
    print(method, cqv(
        data=x,
        method=method,
        multiplier=100,
        ndigits=3,
        num_replicates=10000,
        random_state=42,
    ))

Output (95% CI, multiplier=100, ndigits=3; bootstrap methods use num_replicates=10000 and random_state=42):

| method | est | lower | upper | description |
| --- | --- | --- | --- | --- |
| bonett | 45.625 | 24.785 | 77.329 | cqv with Bonett 95% CI |
| norm | 45.625 | 19.937 | 70.403 | cqv with Normal Approximation Bootstrap 95% CI |
| basic | 45.625 | 21.081 | 73.923 | cqv with Basic Bootstrap 95% CI |
| perc | 45.625 | 17.327 | 70.169 | cqv with Bootstrap Percentile 95% CI |
| bca | 45.625 | 22.006 | 76.331 | cqv with Adjusted Bootstrap Percentile (BCa) 95% CI |
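The idea behind the perc row can be sketched in a few lines of standard-library Python: resample the data with replacement, recompute cqv on each resample, and read the interval off the empirical percentiles. The helper names, the linear-interpolation quartiles, and the use of `random.Random` are all assumptions made for illustration. pycvcqv's own resampling and random number generator are not shown here, so the bounds will not match the table digit for digit.

```python
import random
from statistics import quantiles


def cqv_point(values):
    """(Q3 - Q1) / (Q3 + Q1) with linearly interpolated quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q3 + q1)


def percentile_bootstrap_cqv(values, num_replicates=2000, conf_level=0.95, seed=42):
    """Percentile bootstrap interval for cqv (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(values)
    # Recompute the statistic on resamples drawn with replacement.
    replicates = sorted(
        cqv_point(rng.choices(values, k=n)) for _ in range(num_replicates)
    )
    alpha = 1 - conf_level
    lower = replicates[int(alpha / 2 * num_replicates)]
    upper = replicates[int((1 - alpha / 2) * num_replicates) - 1]
    return lower, upper


x = [
    0.2, 0.5, 1.1, 1.4, 1.8, 2.3, 2.5, 2.7, 3.5, 4.4,
    4.6, 5.4, 5.4, 5.7, 5.8, 5.9, 6.0, 6.6, 7.1, 7.9,
]

lower, upper = percentile_bootstrap_cqv(x)
print(round(100 * cqv_point(x), 3), round(100 * lower, 3), round(100 * upper, 3))
```

Fixing the seed makes the sketch repeatable, in the same spirit as passing random_state=42 in the example above.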

Credits

🚀 Your next Python package needs a bleeding-edge project structure. This project was generated with python-package-template.
