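A statistical machine learning project. The hidden Markov model (HMM) is a statistical machine learning model. In this assignment the data (labels) are incomplete, so we infer the state sequence by MAP estimation and then predict future states. The assignment was run as a Kaggle project, and the accuracy of our submitted predictions placed in the top five.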

[Continuous-state hidden Markov model]

A bot moves around in a 2D plane following some probabilistic pattern unknown to you. You do not observe the bot's location at every time step. What you do observe at every time step is the angle of the bot to the x axis (see figure below).
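To make the observed quantity concrete, here is a minimal sketch of how an angle to the x axis could be computed from a location. It assumes the angle is measured at the origin, in radians, and without noise; the task statement confirms none of these, and the function name `angle_to_x_axis` is hypothetical.

```python
import math


def angle_to_x_axis(x: float, y: float) -> float:
    """Angle in radians of the point (x, y), measured counter-clockwise
    from the positive x axis (assumed convention, not confirmed by the task)."""
    return math.atan2(y, x)


# A point on the diagonal of the first quadrant.
print(angle_to_x_axis(1.0, 1.0))
```

Note that an angle alone does not determine the location: every point on the same ray from the origin gives the same value, which is why the labeled locations matter.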

On every run, we place the bot at its starting location (the same for all runs) and let it run for 1000+1 steps. We perform 10000 runs, each with 1000+1 steps.

You are given the angle observed at each step for the first 6000 runs. You are additionally given the exact location of the bot at some random time steps in every run.

Your goal: predict the final location of the bot at the 1001st step for runs 6001 to 10000.

Here is what you are provided with:

The observations of the bot: You are given a 6000x1000 matrix where each row holds the observations made in one run (for runs 1 to 6000). That is, row 20, column 40 gives the angle of the bot to the x axis on the 20th run and 40th step. This data is provided in the Obervations.csv file in comma-separated values format.
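A minimal sketch of reading the observation matrix with NumPy and looking up one entry follows. It assumes the file has no header row, and it uses a tiny made-up in-memory table in place of the real file; the helper names `load_observations` and `angle_at` are hypothetical.

```python
import io

import numpy as np


def load_observations(source):
    """Read a comma-separated matrix of angles; rows are runs, columns are steps."""
    return np.loadtxt(source, delimiter=",", ndmin=2)


def angle_at(obs, run, step):
    """Look up an angle using the 1-based run and step numbers from the text."""
    return obs[run - 1, step - 1]


# Tiny stand-in for the real file: 2 runs x 3 steps (made-up values).
demo = io.StringIO("0.10,0.20,0.30\n0.40,0.50,0.60\n")
obs = load_observations(demo)
print(obs.shape)
print(angle_at(obs, run=2, step=3))
```

With the real data, a file path can be passed to `load_observations` instead of the in-memory buffer. The text counts rows and columns from 1, while NumPy indexes from 0, hence the subtraction in `angle_at`.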

A few labeled examples: You are also provided the location of the bot on some random subset of steps on every run. Label.csv consists of 600000 example locations, stored as 600000 rows and 4 columns. Each row is one example location of the bot.

For instance, a row in Label.csv of the form “201,333,1.2,-0.8” means that on run 201, step 333, the bot was at location (1.2, -0.8).

Label.csv only has locations for runs 1 to 6000.
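The label rows can be parsed into a lookup keyed by run and step. The sketch below assumes the column order run, step, x, y shown in the example row and no header row; `load_labels` is a hypothetical helper, and the in-memory buffer stands in for the real file.

```python
import csv
import io


def load_labels(source):
    """Map (run, step) -> (x, y) from rows of the form run,step,x,y."""
    labels = {}
    for row in csv.reader(source):
        if not row:  # skip blank lines
            continue
        run, step = int(row[0]), int(row[1])
        labels[(run, step)] = (float(row[2]), float(row[3]))
    return labels


# The example row from the task statement.
demo = io.StringIO("201,333,1.2,-0.8\n")
labels = load_labels(demo)
print(labels[(201, 333)])
```

For the real file, an open file handle such as `open("Label.csv", newline="")` can be passed in place of the buffer.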

Task: For each of the remaining 4000 runs (6001 to 10000), predict the location of the bot at step 1001. The competition will be hosted on in-class Kaggle, which will be initialized soon. You can use Label.csv to evaluate your solutions. The score will be calculated as the RMSE of your predictions.
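For local evaluation, a root-mean-square error can be computed as sketched below. The exact formula used by the competition is not stated here, so this sketch assumes the error is pooled over all x and y values; the function name `rmse` and the sample numbers are made up for illustration.

```python
import numpy as np


def rmse(predicted, actual):
    """Root-mean-square error pooled over every coordinate (assumed scoring rule)."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


# Two predicted (x, y) locations against two known locations.
predicted = [[1.0, 1.0], [2.0, 2.0]]
actual = [[1.0, 1.0], [2.0, 4.0]]
score = rmse(predicted, actual)
print(score)
```

Here only one of the four coordinates is off, by 2, so the mean squared error is 4/4 = 1 and the score is 1.0. Labeled locations held out from Label.csv could play the role of `actual`.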

Output format: a CSV file with headings (8000 prediction rows):

id, value

6001x, [x value for run 6001]

6001y, [y value for run 6001]

6002x, [x value for run 6002]

6002y, [y value for run 6002]
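The layout above can be produced with a short writer such as the following sketch. It assumes the header is written as `id,value` with no space after the comma, which may differ from what the competition expects; `write_submission` is a hypothetical helper and the predicted values are made up.

```python
import csv
import io


def write_submission(out, run_ids, predictions):
    """Write one 'x' row and one 'y' row per run, after an id,value header."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "value"])
    for run, (x, y) in zip(run_ids, predictions):
        writer.writerow([f"{run}x", x])
        writer.writerow([f"{run}y", y])


# Two runs with made-up predictions, written to an in-memory buffer.
buf = io.StringIO()
write_submission(buf, [6001, 6002], [(1.2, -0.8), (0.5, 0.25)])
print(buf.getvalue())
```

For the full task, `run_ids` would cover 6001 to 10000, giving the 8000 prediction rows, and `out` would be a file opened with `newline=""`.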