Reinforcement learning in non-Markov conservative environment using an inductive qualitative model

Jović, Franjo; Slavek, Ninoslav; Blažević, Damir

doi:10.1142/S0218213011000425

Bibliographic record number: 536292

Reinforcement learning in non-Markov conservative environment using an inductive qualitative model // International journal on artificial intelligence tools, 20 (2011), 5; 887-909 doi:10.1142/S0218213011000425 (international peer review, article, scientific)

CROSBI ID: 536292

Title
Reinforcement learning in non-Markov conservative environment using an inductive qualitative model

Authors
Jović, Franjo ; Slavek, Ninoslav ; Blažević, Damir

Source
International journal on artificial intelligence tools (0218-2130) 20 (2011), 5; 887-909

Type, subtype and category of work
Journal papers, article, scientific

Keywords
retail process; data normalization; periodicity elimination

Abstract
The majority of real-world processes, such as power plants, banking, and retail businesses, are non-Markov processes, being conservative systems with stochastic supply and demand. For example, a retail process possesses long-term memory of the customer's experience and market price drift that deviates from the Markov property. Modeling the reward in this process is directed towards the actions that have to be executed daily in order to support it. These actions are further severely distracted by the hidden periodicity of customer behavior on a monthly and weekly basis. Alternative solutions in the retail business are achieved using a retail potential market model and a pricing policy based on demography. The policy of non-Markov behavior has not been intensively studied, although the literature indicates the non-Markov nature of many real process models, such as bank rating migrations. A solution is proposed that is based on day-to-day data collection from point-of-sale (POS) locations, synthesizes the reward function from separate sale component rewards using qualitative models, and indicates the most outstanding sale groups that form the reward model. The POS data were normalized to eliminate periodicities and non-Markov features of the process data. Reinforcement learning was additionally supported by artificial corrections of the normalized reward function, and the models thus obtained were used to recognize the most promising and the most defective hidden retail product groups. Model data were analyzed for the statistical significance of the obtained results by comparing normalized and non-normalized sales data distributions. The method is simple and effective, and is applicable to each POS separately, to a complex retail business network, and to other conservative environments. The obtained qualitative correlations between the model and the reward function lie between 0.72 and 0.95, even for the simple cases presented.

Original language
English

Scientific fields
Computing

RELATED ENTITIES

Projects:
165-1652017-2016 - Holografski logički analizator (Holographic logic analyzer) (Slavek, Ninoslav, MZO) (CroRIS)

Institutions:
Faculty of Electrical Engineering, Computer Science and Information Technology Osijek

Profiles:

Damir Blažević (author)

Franjo Jović (author)

Ninoslav Slavek (author)

Links to the full text:

doi

www.worldscientific.com
dx.doi.org

Journal indexed in:

Current Contents Connect (CCC)
Web of Science Core Collection (WoSCC)

Science Citation Index Expanded (SCI-EXP)
SCI-EXP, SSCI and/or A&HCI

Scopus

Included in other bibliographic databases:

ISI Alerting Services
CompuMath Citation Index
Current Contents/Engineering, Computing, and Technology

CROSBI Croatian Scientific Bibliography
