Extending Markov to Hidden Markov
Published: 2019-06-19



When we talked about Markov processes in previous blog posts, we assumed that we knew the states of the process. That is often true, and hence the Markov model is a great tool for predicting and modeling systems where discrete events happen in discrete time-steps. There are some special cases, though, where we are interested in the states underlying the observed events, and events do not map to states in a one-to-one or one-to-many fashion as has been the requirement so far.

Consider an example we have described before – observing a customer's purchase events at a retail store. If we want to predict the next purchased product from sequences of past purchases, a Markov model may do a good job. But what if you want to remind the customer about a purchase? A reminder is only helpful if the customer is out of stock on that product. However, an observed purchase event doesn't necessarily directly correlate to an out-of-stock state. Sometimes the customer will buy when he is out of stock; other times he will buy because he is at the store, wants to stock up, or has a discount offer. Sometimes the customer may not buy even when he is out of stock, because he didn't get time to do so. In this example, the true state of interest is out-of-stock, but the observed state is purchase or no purchase.

By way of another example, we compared a customer's clicks on a bank's website to a Markov "memoryless" process. This is good enough if we want to improve the web layout, but not good enough if we want to figure out why the customer is visiting the website in the first place. The intent behind the clicks is the state we are interested in, but all we observe is webpage visits. Maybe she is interested in finding interest rates, or maybe she is looking for the nearest ATM, or maybe she wants to read up about a new pension plan. Cases like these call for the Hidden Markov Model (HMM), where unknown hidden states are of interest but correspond to multiple observed states[1].

An HMM may be represented as a directed graph with hidden edges[2].

 

[Figure: representation of Hidden Markov Models]

 

Apart from the Transition Matrix, which governs the probability of transition from one hidden state to another, an HMM also involves an Emission Matrix, which governs the probability of observing an observed state given the underlying hidden state. The goal of HMM learning is the estimation of both these matrices.
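To make this concrete, here is a minimal sketch in Python of what the two matrices might look like for the out-of-stock example above. The state labels and all probability values are made-up assumptions for illustration, not numbers from any real data:

```python
import numpy as np

# Hypothetical two-state version of the retail example.
# Hidden states:   0 = in_stock, 1 = out_of_stock
# Observed states: 0 = no_purchase, 1 = purchase
# All probability values are made up purely for illustration.

# Transition Matrix A: A[i, j] = P(next hidden state = j | current hidden state = i)
A = np.array([[0.9, 0.1],    # in_stock     -> in_stock / out_of_stock
              [0.4, 0.6]])   # out_of_stock -> in_stock / out_of_stock

# Emission Matrix B: B[i, k] = P(observed state = k | hidden state = i)
B = np.array([[0.95, 0.05],  # in_stock:     purchase is rare
              [0.70, 0.30]]) # out_of_stock: purchase more likely, but not certain

# Every row is a probability distribution, so each row must sum to 1.
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```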

Learning HMM

Learning an HMM isn't as simple as learning a Markov Model, but we will give a schematic overview in this post.

First, as with a Markov Process, we need to know the number of observed states, which is obvious from the data. However, we also need to make assumptions about the order of the Markov process and the number of hidden states – neither is available from the data. Cross-validation and the Akaike Information Criterion discussed in a previous post come in handy here: we need to train multiple HMMs with varying numbers of hidden states and varying orders of the Markov process, and select the simplest model which explains the training data well.
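As a sketch of that selection step, the snippet below compares candidate models by AIC. The log-likelihood values are hypothetical placeholders – in practice each would come from actually fitting an HMM with that many hidden states – and the parameter count assumes a first-order HMM:

```python
def hmm_param_count(n_hidden, n_observed):
    """Free parameters of a first-order HMM: initial distribution,
    transition matrix, and emission matrix. Each row of length m sums
    to 1, so it contributes m - 1 free parameters."""
    return (n_hidden - 1) + n_hidden * (n_hidden - 1) + n_hidden * (n_observed - 1)

# Hypothetical training log-likelihoods for HMMs with 2, 3 and 4 hidden
# states; in practice each value comes from fitting that candidate model.
log_likelihoods = {2: -1520.3, 3: -1498.7, 4: -1497.9}

n_observed = 2
aic = {k: 2 * hmm_param_count(k, n_observed) - 2 * ll
       for k, ll in log_likelihoods.items()}

best = min(aic, key=aic.get)  # lowest AIC wins: fit is penalized by complexity
print(aic)
print("choose", best, "hidden states")
```

With these made-up numbers the 4-state model fits slightly better than the 3-state one, but not by enough to justify its extra parameters, so AIC picks 3 – the simplest model which explains the data well.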

We can gain an understanding of the HMM training algorithm through the following mental exercises:

Exercise 1 – If we knew the Emission and Transition Probabilities, and an observed sequence, can we compute the probability of observing that sequence?

Let's say our observed sequence is S1-S2-S3-... and the underlying hidden sequence is H1-H2-H3-...; then the probability of observing the given observed sequence is

P(H1) * B(S1|H1) * A(H2|H1) * B(S2|H2) * A(H3|H2) * B(S3|H3) * ...

where B(Si|Hj) is the Emission Probability of observing state Si when the hidden state is Hj, and A(Hj|Hi) is the Transition Probability of transitioning to hidden state Hj from hidden state Hi, assuming a Markov Process of order one. However, since we don't know the true underlying sequence, we can compute this over all combinations of underlying sequences[3] and sum over the computed probabilities.
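Here is a minimal brute-force sketch of that computation in Python, reusing the made-up two-state retail example from above. It literally enumerates all K^L hidden sequences; real implementations compute the same number efficiently with the forward algorithm:

```python
import itertools

import numpy as np

# Same hypothetical two-state setup as before (all numbers made up).
pi = np.array([0.8, 0.2])                    # P(H1): initial hidden-state distribution
A  = np.array([[0.9, 0.1], [0.4, 0.6]])      # A[i, j] = P(H_next = j | H = i)
B  = np.array([[0.95, 0.05], [0.70, 0.30]])  # B[i, k] = P(S = k | H = i)

def sequence_probability(obs, pi, A, B):
    """P(observed sequence) as the sum, over every possible hidden sequence,
    of P(H1) * B(S1|H1) * A(H2|H1) * B(S2|H2) * ... -- K**L terms in total."""
    n_hidden = len(pi)
    total = 0.0
    for hidden in itertools.product(range(n_hidden), repeat=len(obs)):
        p = pi[hidden[0]] * B[hidden[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[hidden[t - 1], hidden[t]] * B[hidden[t], obs[t]]
        total += p
    return total

obs = [1, 0, 1]  # e.g. purchase, no purchase, purchase
print(sequence_probability(obs, pi, A, B))
```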

Exercise 2 – If we knew the Emission and Transition Probabilities, and an observed sequence, can we make the best guess about the underlying hidden sequence?

If we compute the probability of observing the given state sequence under all possible combinations of hidden state sequences, one of the hidden sequences will correspond to the highest probability of the observed sequence. In the absence of any other information, that is our best guess of the underlying sequence under the Maximum Likelihood Estimation method.
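The same brute-force enumeration, with a max in place of the sum, gives that best guess (again using the hypothetical numbers from before). The Viterbi Algorithm computes exactly this argmax efficiently via dynamic programming:

```python
import itertools

import numpy as np

# Same hypothetical setup as in Exercise 1.
pi = np.array([0.8, 0.2])
A  = np.array([[0.9, 0.1], [0.4, 0.6]])
B  = np.array([[0.95, 0.05], [0.70, 0.30]])

def most_likely_hidden(obs, pi, A, B):
    """Best guess of the hidden sequence: the one maximizing the joint
    probability of (hidden sequence, observed sequence), by brute force."""
    best_p, best_seq = -1.0, None
    for hidden in itertools.product(range(len(pi)), repeat=len(obs)):
        p = pi[hidden[0]] * B[hidden[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[hidden[t - 1], hidden[t]] * B[hidden[t], obs[t]]
        if p > best_p:
            best_p, best_seq = p, hidden
    return best_seq, best_p

obs = [1, 0, 1]
# With these made-up numbers the all-out_of_stock sequence (1, 1, 1) wins.
print(most_likely_hidden(obs, pi, A, B))
```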

Exercise 3 – Given a number of observed sequences, and assumptions on the number of observed and hidden states, can we make the best estimate of the Emission and Transition Probabilities which will explain our sequences?

This, of course, is HMM training. Here we build on the previous steps: starting with random probabilities, we compute the joint probability of observing all the observed sequences (again, via Maximum Likelihood Estimation), and search for the set of Emission and Transition probabilities which maximizes this joint probability.
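The sketch below makes that search idea concrete in the most naive way possible – random restarts scored by joint log-likelihood. This is purely illustrative: actual HMM training uses the Baum-Welch (EM) algorithm to perform this maximization far more intelligently, and the training sequences here are made up:

```python
import itertools

import numpy as np

def random_stochastic(rows, cols, rng):
    """Random matrix whose rows are probability distributions."""
    m = rng.random((rows, cols))
    return m / m.sum(axis=1, keepdims=True)

def log_likelihood(sequences, pi, A, B):
    """Joint log-probability of all training sequences, each scored by the
    brute-force sum over hidden sequences from Exercise 1."""
    total = 0.0
    for obs in sequences:
        p = 0.0
        for hidden in itertools.product(range(len(pi)), repeat=len(obs)):
            q = pi[hidden[0]] * B[hidden[0], obs[0]]
            for t in range(1, len(obs)):
                q *= A[hidden[t - 1], hidden[t]] * B[hidden[t], obs[t]]
            p += q
        total += np.log(p)
    return total

rng = np.random.default_rng(0)
sequences = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]  # made-up training data
n_hidden, n_observed = 2, 2

# Naive search: try many random parameter sets, keep the best-scoring one.
best_ll, best_params = -np.inf, None
for _ in range(2000):
    pi = random_stochastic(1, n_hidden, rng)[0]
    A = random_stochastic(n_hidden, n_hidden, rng)
    B = random_stochastic(n_hidden, n_observed, rng)
    ll = log_likelihood(sequences, pi, A, B)
    if ll > best_ll:
        best_ll, best_params = ll, (pi, A, B)

print("best joint log-likelihood found:", best_ll)
```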

We have intentionally skipped the mathematics of the algorithms – the Viterbi Algorithm for decoding the most likely hidden sequence, and the Baum-Welch Algorithm for HMM training – but the interested reader is encouraged to go to the classical papers on the subject. In any case, many software packages provide easy implementations of HMM training (e.g., depmixS4 in R).

In the last three posts, we discussed practical cases where sequence modeling through a Markov process may come in handy, and provided an overview of training Markov Models. Markov models are often easy to train, interpret, and implement, and can be relevant to many business problems with the right design and state identification.


[1] If hidden states correspond to observed states in a one-to-one map, what happens?

[2] This is not a completely true representation for this graph, because the underlying process graph is very simple, and making it more realistic would mean an ugly picture. However, the idea is that K hidden states map to N observed states in many-to-many fashion.

[3] For K hidden states and an observed sequence of length L, we will have K^L combinations.

