
Date: 2024-03-03



CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then next to NLTK (https://www.nltk.org/),
you will also need to install the conllu package (https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from https://universaldependencies.org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from
a training corpus from Universal Dependencies. Do not forget to include the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
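The counting and smoothing steps listed above can be sketched as follows. This is a minimal illustration, not the provided smoothing.py: the function names, the toy corpus and the `bins` value are assumptions made for this example, and the hand-written `witten_bell` helper is only a standard-library stand-in for NLTK's WittenBellProbDist, which the practical recommends.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"


def count_events(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions.

    Each sentence is a list of (word, tag) pairs (an assumed input format).
    """
    transitions = defaultdict(Counter)  # transitions[previous_tag][tag]
    emissions = defaultdict(Counter)    # emissions[tag][word]
    for sentence in tagged_sentences:
        # Surround the tag sequence with the sentence boundary markers.
        tags = [START] + [tag for _, tag in sentence] + [END]
        for previous_tag, tag in zip(tags, tags[1:]):
            transitions[previous_tag][tag] += 1
        for word, tag in sentence:
            emissions[tag][word] += 1
    return transitions, emissions


def witten_bell(counts, event, bins):
    """Witten-Bell estimate of P(event) given a Counter of observed events.

    bins is the assumed number of possible events; it must exceed the
    number of distinct observed events so that unseen events get some mass.
    """
    total = sum(counts.values())       # N: observed tokens
    seen_types = len(counts)           # T: observed types
    unseen_types = bins - seen_types   # Z: types never observed
    if unseen_types <= 0:
        raise ValueError("bins must exceed the number of observed types")
    if counts[event] > 0:
        return counts[event] / (total + seen_types)
    return seen_types / (unseen_types * (total + seen_types))


# Toy corpus, invented for illustration only.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN")],
]
toy_transitions, toy_emissions = count_events(toy_corpus)
p_seen = witten_bell(toy_emissions["NOUN"], "dog", bins=1000)
p_unseen = witten_bell(toy_emissions["NOUN"], "fish", bins=1000)
```

In this sketch the seen and unseen estimates for a given tag sum to one over the assumed 1000 possible words, which is the property that makes the smoothed values usable as emission probabilities.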
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, ..., n, in this order:
$$\hat{t}_i = \operatorname*{argmax}_{t_i} P(t_i \mid \hat{t}_{i-1}) \cdot P(w_i \mid t_i)$$
assuming $\hat{t}_0$ is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker
⟨/s⟩ is not even used here.
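A minimal sketch of this greedy procedure is given below. The probability tables, the `FLOOR` constant and all function names are assumptions invented for illustration; in a real solution the probabilities would come from the smoothed estimates. The sketch works with log probabilities, which leaves the argmax unchanged because the logarithm is monotone.

```python
import math

START = "<s>"

# Toy probabilities, invented for illustration only.
TRANS = {
    ("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
    ("DET", "NOUN"): 0.9, ("DET", "DET"): 0.1,
    ("NOUN", "NOUN"): 0.5, ("NOUN", "DET"): 0.5,
}
EMIT = {
    ("DET", "the"): 0.9, ("NOUN", "the"): 0.01,
    ("NOUN", "dog"): 0.5, ("DET", "dog"): 0.01,
}
FLOOR = 1e-6  # crude stand-in for a smoothed probability of unseen events


def log_trans(previous_tag, tag):
    return math.log(TRANS.get((previous_tag, tag), FLOOR))


def log_emit(tag, word):
    return math.log(EMIT.get((tag, word), FLOOR))


def eager_tag(words, tags):
    """Choose each tag from the previously chosen tag and the current word."""
    chosen = []
    previous_tag = START
    for word in words:
        best_tag = max(
            tags,
            key=lambda tag: log_trans(previous_tag, tag) + log_emit(tag, word),
        )
        chosen.append(best_tag)
        previous_tag = best_tag  # the choice is never revised later
    return chosen


toy_result = eager_tag(["the", "dog"], ["DET", "NOUN"])
```

Because each choice is final, a single pass over the sentence suffices, at the cost of never recovering from an early mistake.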
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
$$\hat{t}_1 \cdots \hat{t}_n = \operatorname*{argmax}_{t_1 \cdots t_n} \left( \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i) \right) \cdot P(t_{n+1} \mid t_n)$$
where the tokens of the input sentence are $w_1 \cdots w_n$, and $t_0$ = ⟨s⟩ and $t_{n+1}$ = ⟨/s⟩ are
the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
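The recurrence behind this maximisation can be sketched with log probabilities as follows. The two abstract tags, the probability tables, the `FLOOR` constant and the function names are all assumptions made for this example, not part of the provided code.

```python
import math

START, END = "<s>", "</s>"

# Toy probabilities over two abstract tags, invented for illustration only.
TRANS = {
    ("<s>", "A"): 0.6, ("<s>", "B"): 0.4,
    ("A", "A"): 0.45, ("A", "B"): 0.05, ("A", "</s>"): 0.5,
    ("B", "A"): 0.05, ("B", "B"): 0.45, ("B", "</s>"): 0.5,
}
EMIT = {("A", "x"): 0.5, ("B", "x"): 0.5, ("A", "y"): 0.01, ("B", "y"): 0.9}
FLOOR = 1e-6  # crude stand-in for a smoothed probability of unseen events


def log_trans(previous_tag, tag):
    return math.log(TRANS.get((previous_tag, tag), FLOOR))


def log_emit(tag, word):
    return math.log(EMIT.get((tag, word), FLOOR))


def viterbi_tag(words, tags):
    """Return the tag sequence with the highest joint log probability."""
    if not words:
        return []
    # best[i][tag]: best log probability of tagging words[:i+1] ending in tag.
    best = [{tag: log_trans(START, tag) + log_emit(tag, words[0]) for tag in tags}]
    back = [{}]  # back[i][tag]: the previous tag on that best path
    for word in words[1:]:
        scores, pointers = {}, {}
        for tag in tags:
            previous = max(tags, key=lambda p: best[-1][p] + log_trans(p, tag))
            scores[tag] = best[-1][previous] + log_trans(previous, tag) + log_emit(tag, word)
            pointers[tag] = previous
        best.append(scores)
        back.append(pointers)
    # The transition to the end-of-sentence marker takes part in the final choice.
    last_tag = max(tags, key=lambda tag: best[-1][tag] + log_trans(tag, END))
    sequence = [last_tag]
    for pointers in reversed(back[1:]):
        sequence.append(pointers[sequence[-1]])
    sequence.reverse()
    return sequence


toy_result = viterbi_tag(["x", "y"], ["A", "B"])
```

With these toy numbers a greedy left-to-right choice would pick A for the first token (0.6 · 0.5 against 0.4 · 0.5), whereas the best complete sequence starts with B, because the second token strongly favours B and B rarely follows A.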
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
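individually. That is, for each i, we compute:
$$\hat{t}_i = \operatorname*{argmax}_{t_i} \sum_{t_1 \cdots t_{i-1} t_{i+1} \cdots t_n} \left( \prod_{k=1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n)$$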
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
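$$\hat{t}_i = \operatorname*{argmax}_{t_i} \left( \sum_{t_1 \cdots t_{i-1}} \prod_{k=1}^{i} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot \left( \sum_{t_{i+1} \cdots t_n} \left( \prod_{k=i+1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n) \right)$$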
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
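A minimal sketch of the forward and backward computation in log space is shown below. It is not the content of logsumexptrick.py: the `logsumexp` helper, the toy probability tables, the `FLOOR` constant and the function names are assumptions written for this example.

```python
import math

START, END = "<s>", "</s>"

# Toy probabilities over two abstract tags, invented for illustration only.
TRANS = {
    ("<s>", "A"): 0.6, ("<s>", "B"): 0.4,
    ("A", "A"): 0.45, ("A", "B"): 0.05, ("A", "</s>"): 0.5,
    ("B", "A"): 0.05, ("B", "B"): 0.45, ("B", "</s>"): 0.5,
}
EMIT = {("A", "x"): 0.5, ("B", "x"): 0.5, ("A", "y"): 0.01, ("B", "y"): 0.9}
FLOOR = 1e-6  # crude stand-in for a smoothed probability of unseen events


def log_trans(previous_tag, tag):
    return math.log(TRANS.get((previous_tag, tag), FLOOR))


def log_emit(tag, word):
    return math.log(EMIT.get((tag, word), FLOOR))


def logsumexp(values):
    """Log of the sum of exp(v), shifting by the maximum to avoid underflow."""
    highest = max(values)
    if highest == float("-inf"):
        return highest
    return highest + math.log(sum(math.exp(v - highest) for v in values))


def posterior_tag(words, tags):
    """Return, for each token, the tag that is individually most probable."""
    if not words:
        return []
    n = len(words)
    # forward[i][tag]: log of the summed probability of all prefixes ending in tag.
    forward = [{tag: log_trans(START, tag) + log_emit(tag, words[0]) for tag in tags}]
    for word in words[1:]:
        last = forward[-1]
        forward.append({
            tag: logsumexp([last[p] + log_trans(p, tag) for p in tags]) + log_emit(tag, word)
            for tag in tags
        })
    # backward[i][tag]: log of the summed probability of everything after
    # position i, including the transition to the end-of-sentence marker.
    backward = [None] * n
    backward[n - 1] = {tag: log_trans(tag, END) for tag in tags}
    for i in range(n - 2, -1, -1):
        following = backward[i + 1]
        backward[i] = {
            tag: logsumexp([
                log_trans(tag, t) + log_emit(t, words[i + 1]) + following[t]
                for t in tags
            ])
            for tag in tags
        }
    return [
        max(tags, key=lambda tag: forward[i][tag] + backward[i][tag])
        for i in range(n)
    ]


toy_result = posterior_tag(["x", "y"], ["A", "B"])
```

The only structural difference from a Viterbi-style recurrence is that `max` over previous tags is replaced by `logsumexp`, and that a second pass runs from right to left.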
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
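The accuracy measure described here can be sketched as follows, assuming that a tagger is any function from a list of words to a list of tags and that a test sentence is a list of (word, tag) pairs. Both conventions, and the names used, are assumptions for this example.

```python
def tagging_accuracy(tagger, tagged_sentences):
    """Percentage of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for sentence in tagged_sentences:
        words = [word for word, _ in sentence]
        gold_tags = [tag for _, tag in sentence]
        predicted_tags = tagger(words)
        correct += sum(p == g for p, g in zip(predicted_tags, gold_tags))
        total += len(gold_tags)
    return 100.0 * correct / total if total else 0.0


def always_noun(words):
    """Trivial baseline used only to exercise the accuracy function."""
    return ["NOUN"] * len(words)


# Toy test set, invented for illustration only.
toy_test_set = [
    [("the", "DET"), ("dog", "NOUN")],
    [("cats", "NOUN"), ("sleep", "VERB")],
]
toy_accuracy = tagging_accuracy(always_noun, toy_test_set)
```

Counting over tokens rather than averaging per sentence keeps long and short sentences weighted by their length, which matches "percentages of tags ... guessed correctly".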
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and
bandwidth for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM model is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, and mistake this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.