
Bibliographic record number: 1101848

TVOR: Finding Discrete Total Variation Outliers Among Histograms


Banić, Nikola; Elezović, Neven
TVOR: Finding Discrete Total Variation Outliers Among Histograms // IEEE Access, 9 (2021), 1807-1832 doi:10.1109/ACCESS.2020.3047342 (international peer review, article, scientific)


CROSBI ID: 1101848

Title
TVOR: Finding Discrete Total Variation Outliers Among Histograms

Authors
Banić, Nikola ; Elezović, Neven

Source
IEEE Access (2169-3536) 9 (2021); 1807-1832

Type, subtype and category of work
Journal papers, article, scientific

Keywords
age heaping ; anomaly detection ; discrete total variation ; expected value ; fitting ; histogram ; Myers' index ; outlier detection ; Pearson's chi-squared test ; total variation ; Whipple's index

Abstract
Pearson's chi-squared test can detect outliers in the data distribution of a given set of histograms. However, in fields such as demographics (e.g., birth years), outliers may be more easily found in terms of histogram smoothness, where techniques such as Whipple's or Myers' indices successfully handle only specific anomalies. This paper proposes detecting smoothness outliers among histograms by using the relation between their discrete total variations (DTV) and their respective sample sizes. This relation is mathematically derived to be applicable in all cases and is simplified by an accurate linear model. The deviation of a histogram's DTV from the value predicted by the model is used as the outlier score, and the proposed method is named Total Variation Outlier Recognizer (TVOR). TVOR requires no prior assumptions about the distribution of the histograms' samples, has no hyperparameters that require tuning, is not limited to specific patterns, and is applicable to histograms with the same bins. Each bin can have an arbitrary interval that can also be unbounded. TVOR finds DTV outliers more easily than Pearson's chi-squared test; for distribution outliers, the opposite holds. TVOR is tested on real census data, where it successfully finds suspicious histograms. The source code is available at https://github.com/DiscreteTotalVariation/TVOR.
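
The scoring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the linked repository), and it rests on assumptions the abstract does not confirm: DTV is taken here as the sum of absolute differences between adjacent raw bin counts, the "linear model" is an ordinary least-squares line of DTV against sample size, and the function names and toy histograms are hypothetical.

```python
def discrete_total_variation(hist):
    """Sum of absolute differences between adjacent bin counts (assumed DTV definition)."""
    return sum(abs(b - a) for a, b in zip(hist, hist[1:]))


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dtv_outlier_scores(histograms):
    """Score each histogram by how far its DTV lies above the fitted line."""
    sizes = [sum(h) for h in histograms]
    dtvs = [discrete_total_variation(h) for h in histograms]
    slope, intercept = fit_line(sizes, dtvs)
    return [d - (slope * s + intercept) for s, d in zip(sizes, dtvs)]


# Toy data (hypothetical): four smooth histograms over the same five bins
# and one jagged histogram that mimics heaping on alternating bins.
histograms = [
    [10, 12, 14, 12, 10],
    [20, 24, 28, 24, 20],
    [30, 36, 42, 36, 30],
    [40, 48, 56, 48, 40],
    [20, 34, 18, 34, 10],  # jagged: much larger DTV for its sample size
]
scores = dtv_outlier_scores(histograms)
print(scores.index(max(scores)))  # → 4
```

In this toy set the jagged histogram has the same sample size as the second smooth one but a far larger DTV, so its residual from the fitted line is the largest. With real data the fit would be influenced by the outliers themselves, which a fuller treatment would need to address.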

Original language
English

Scientific fields
Mathematics, Computing



ASSOCIATIONS OF THE WORK


Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Neven Elezović (author)

Nikola Banić (author)

Links to the full text of the work:

doi ieeexplore.ieee.org

Cite this publication:

Banić, Nikola; Elezović, Neven
TVOR: Finding Discrete Total Variation Outliers Among Histograms // IEEE Access, 9 (2021), 1807-1832 doi:10.1109/ACCESS.2020.3047342 (international peer review, article, scientific)
Banić, N. & Elezović, N. (2021) TVOR: Finding Discrete Total Variation Outliers Among Histograms. IEEE Access, 9, 1807-1832 doi:10.1109/ACCESS.2020.3047342.
@article{article,
  author = {Bani\'{c}, Nikola and Elezovi\'{c}, Neven},
  title = {TVOR: Finding Discrete Total Variation Outliers Among Histograms},
  journal = {IEEE Access},
  year = {2021},
  volume = {9},
  pages = {1807-1832},
  issn = {2169-3536},
  doi = {10.1109/ACCESS.2020.3047342},
  keywords = {age heaping, anomaly detection, discrete total variation, expected value, fitting, histogram, Myers' index, outlier detection, Pearson's chi-squared test, total variation, Whipple's index}
}

Journal indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus

