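Preface
In an earlier Qiita article, I tried using AI to predict the era name (gengo) that will follow Heisei.
That led to an interview request from a TV station (Fuji TV).
They asked me to strengthen the era-name prediction AI I built last time,
bring possible source texts such as the Four Books and Five Classics and the Japanese classics into scope,
and have the AI make a serious prediction of the new era name.
The segment is apparently scheduled to air on April 1, the day the new era name is announced, just before the announcement.
In other words, the moment we say "Our prediction for the new era name is XX!",
everyone finds out "Too bad, that was wrong!". It is a rather bitter format.
On top of that, there is talk that names already predicted in rumors beforehand will not be adopted,
so the only way to get it right on TV is to air right before the announcement.
A terrifying battle.
(However you look at it, hitting the name exactly is far too hard.)
If you have not read the previous article yet, please see the following.
Previous article → 「平成の次の元号」をAIだけで決める、物語 (a story of deciding the era name after Heisei with AI alone)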
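Why I am posting this article
Really, all I should say is: please look forward to the April 1 broadcast.
That would be the end of it, except:
① The airtime on TV will probably be short, about one minute,
  and since the broadcast is aimed at a general audience,
  it will contain almost none of what makes the program itself interesting.
② I could upload what I tried, and the program itself, after the April 1 broadcast and announcement,
  but an article written after the April 1 announcement
  is no fun at all, because the correct answer is already known.
③ If I were to post after the announcement,
  a story evaluating the new era name ("How does the era name that was actually announced look to the AI?")
  would be more interesting than a stale story about a prediction.
→ So, as a last resort,
  I am publishing most of the story of the prediction AI and its program before the broadcast.
  Some Qiita readers may have been eagerly waiting for the continuation of the previous story,
  so I will write the sequel properly.
  It might also work as promotion for the show. Probably.
  Because a name that is predicted in advance is said to be rejected,
  there are parts of the final "conclusion"
  that I cannot publish. I ask for your understanding.
  Also, a "prediction" is meaningless if it is written after the announcement,
  so the parts that are too detailed to fit into the airtime
  are quietly recorded here.
  Please enjoy this purely as
  an ongoing "story" told with a program.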
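Overview of the previous prediction, and what this article aims for
The previous article was an attempt to answer one question:
can kanji with good meanings,
and well-balanced combinations of those kanji,
be found by mechanical computation alone?
Previous article → 「平成の次の元号」をAIだけで決める、物語
Most era names have their source in Chinese classics (kanseki) such as the Four Books and Five Classics,
but this is not a fixed rule.
According to the 1979 guidelines on the era-name selection procedure,
established in connection with the Era Name Act,
the rules are just the following six.
  1. It must have a good meaning that befits the ideals of the people.
  2. It must consist of two kanji.
  3. It must be easy to write.
  4. It must be easy to read.
  5. It must not have been used before as an era name or as a posthumous name (okurina).
  6. It must not be in common use.
So, without learning the "joseki", so to speak, of consulting kanseki such as the Four Books and Five Classics,
can this "good meaning that befits the ideals of the people"
be derived by machine learning
from ordinary Japanese text (wikipedia text data)?
That was the theme I took on last time.
For example, when building an AI for Go or shogi,
there are two approaches: "build it from the rules alone"
or "learn from the game records of professional players".
The question was how far one can get without feeding in any
"professional game records" (the kanseki themselves and the scholars' selection criteria).
This is a little different from building something in order to "hit" the era name.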
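※ In practice there were also practical reasons for the "rules only" approach: obtaining data for classics such as the Four Books and Five Classics was a hassle,
  and I disliked the "human judgment" that would creep in when choosing which books to use as models.
Fuji TV's request matched my own interests for this reason:
"What happens if I also feed this thing data that could serve as a 'source'?
What happens if I try harder (and actually aim to get it right)?"
It is an exploration of that curiosity.
Fuji TV, for its part, wanted to broadcast a further strengthened version rather than the Qiita article as it was.
I, for my part, wanted to build a version strengthened around sources and various other elements
while the topic is still timely, before the era name is announced, and to try to get it right.
So I asked Fuji TV to obtain the source texts,
and set about brushing up the previous program.
(The interview request turned out to be a good trigger.)
Leaving the actual conclusion as something to look forward to on April 1,
I record below what I tried, how I strengthened the program,
and part of the trial and error along the way.
(※ At the time of posting the program is nearly complete, but the final conclusion is not quite settled.)
A proper record makes this even longer,
but I hope you will treat it as a story of witnessing a historic moment together with people living in the same era,
and as a journal recording a challenge to a battle with the future.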
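Contents of this post
This is a strengthened version of, and sequel to, the previous article, and assumes you have read it.
The "I examined various things, but they did not work" parts
will not fit into the airtime at all, so recording them is also one purpose here.
- Seriously leave the selection of kanji to the AI alone, and compute words suitable for the new era name.
- The INPUT information used is as follows.
  - Wikipedia text data
    - (treated simply as a large volume of text)
  - List of era names already used (= learned as the model, "otehon")
  - List of kyoiku kanji (the 1006 kanji learned in elementary school) (= the candidate list)
    - "Easy to read" is assumed to mean elementary-school-level kanji.
    - (※ For joyo kanji there are 1945 characters. Swapping the data makes this work as well.)
  - Information for building the exclusion list of words with other meanings
    - MeCab dictionary (also used last time)
    - List of past emperors' names (New)
    - List of Wikipedia article titles (New)
  - Kanji reading API (New)
    - Detailed kanji information from IPA (Information-technology Promotion Agency)
    - https://mojikiban.ipa.go.jp/search/help/api
    - Used for reading checks and stroke-count checks
  - ★ Data for Chinese and Japanese classics such as the Four Books and Five Classics and the Kojiki ★
    - Data obtained by Fuji TV. Details are secret. (New)
    - Both kinds, since Japanese classics as well as kanseki are said to be possible sources.
    - Some classics recommended by scholars may also have been included.
- The only thing the author may adjust is "numeric settings"; no subjective judgment is allowed.
  - ← As last time, this is honestly the toughest constraint.
  - For example, excluding the character 「苦」 because "its meaning is bad, so it would never be used" is forbidden.
  - Cutting by a number, such as "narrow to the top N by some criterion" or "some score of 10 or higher", is OK.
  - In other words, "all characters are treated equally" across the roughly 1000 candidates.
  - ※ Implementing fixed rules in advance is allowed, e.g. "平 will not be used again" or "kanji numerals are not used because they invite misunderstanding".
  - Put another way: the result depends only on changes to numbers and text, and anyone who runs it gets the same result.
- Rough outline
  - Based on the training result from a large Japanese corpus (wikipedia) or from kanseki / classics,
  - find characters similar to the model data (past era names) = "good characters"
  - Narrow down the kanji likely to be adopted in an era name, taking stroke counts and the like into account
  - Search for combinations in the kanseki / classical sources
  - Filter by combined factors such as the MTSH check (the initial must not duplicate Meiji, Taisho, Showa, Heisei)
  - and whether the word carries some other meaning
  - Have the AI evaluate the "balance" and "goodness" of the resulting combinations
  - Present the highest-scoring ones for each source
- All code assumes Windows10 + Python3 + JupyterNotebook as the execution environment.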
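Policy for tackling the era-name prediction
First, there are two broad ways to proceed.
Last time, to find "characters with good meanings",
I ran machine learning on a Japanese corpus
and used the char2vec technique
to turn the meanings of kanji into vectors.
Building on that, I examined and ran each of the following two plans.
Plan ①: use kanseki and classics as the training INPUT
Use kanseki and classics as the source data for the char2vec training
(that is, base everything on kanseki and classics from the start).
Plan ②: go and find the source
Assume that what kanji mean to modern Japanese people is already captured by the Wikipedia text data,
and go looking for the "source", which I could not do last time
(that is, use kanseki and classics as search targets).
Plan ①: use kanseki and classics as the training INPUT
First, I treated "kanseki" (the Four Books and Five Classics and so on)
and the Japanese classics separately.
The languages differ to begin with, and the "kanseki" data
is written with Chinese-language characters, whose codes do not match those of Japanese kanji.
In the end I tried both.
I first gathered the target utf-8 text files
into one folder, concatenated them,
removed unneeded characters to make clean data,
and used that as the training INPUT file.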
%%time
import codecs
import glob
import re

def open_ch_file(filepath):
    input_text = ""
    with codecs.open(filepath, "r", "utf-8") as f:
        input_text += f.read()
    # Drop carriage returns and turn newlines into single spaces
    input_text = input_text.replace('\r', '')
    input_text = input_text.replace('\n', ' ')
    # Remove digits
    input_text = re.sub(r'[0-9]+', "", input_text)
    return input_text

folder_file_path = "XXXXX/XXXXX/*.txt"
file_list = glob.glob(folder_file_path)
input_text = ""
for filepath in file_list:
    input_text += open_ch_file(filepath)

# Split into single characters, separated by spaces
chars = [c for c in input_text if c != u' ']
with codecs.open('KANSEKI.txt', "w", "utf-8") as f:
    f.write(u' '.join(chars))
Next, run word2vec training on the collected file.
The code for this part is the same as last time except for the parameters.
%%time
import logging
from gensim.models import word2vec
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
sentences = word2vec.Text8Corpus('KANSEKI.txt')
model = word2vec.Word2Vec(sentences, size=40, window=12, min_count=3, hs=0, negative=5, iter=30)
model.save("mychar2vec_XXXXXX_model")
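The big difference from last time is the parameters used.
Since no hiragana or similar data is mixed in,
the window size could perhaps be set smaller.
Since the total amount of data is small,
I set the number of dimensions (size) and min_count smaller
and the number of iterations (iter) larger, among other changes,
and tried several times.
Because the volume is nowhere near that of Wikipedia,
each trial finishes quickly, which makes repeated attempts easy.
I checked the quality of the resulting model as follows.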
out = model.most_similar(positive = [u'書'], topn=10)
print(out)
out = model.most_similar(positive = [u'右'], topn=10)
print(out)
out = model.most_similar(positive = [u'天'], topn=10)
print(out)
[('èš', 0.6210867762565613), ('諱', 0.5740101337432861), ('è®', 0.5511521697044373), ('å€', 0.510854184627533), ('è', 0.5030744075775146), ('è§', 0.47281569242477417), ('è', 0.4715611934661865), ('å', 0.46267572045326233), ('è«', 0.45687738060951233), ('å©', 0.4497215151786804)]
[('å·Š', 0.6416146159172058), ('çº', 0.6342562437057495), ('殳', 0.6199294924736023), ('æ', 0.5840848684310913), ('借', 0.5733482241630554), ('æ', 0.5700335502624512), ('ç¶Š', 0.5488435626029968), ('衜', 0.5380955934524536), ('æ', 0.5014756917953491), ('è©', 0.47802940011024475)]
[('é', 0.8151514530181885), ('ç¥', 0.5380247235298157), ('å', 0.5238890647888184), ('äž', 0.4873164892196655), ('æ¿', 0.4687088131904602), ('å®', 0.4571455717086792), ('ç¥', 0.447026789188385), ('äœ', 0.43108344078063965), ('ç¶±', 0.4226115643978119), ('è¶³', 0.4220472574234009)]
「記」 comes up for 「書」,
and 「左」 comes up for 「右」,
so it is working to some extent,
but the results from second place down make little sense.
It is on a completely different level from the model built from wikipedia.
When I tried clustering the characters into 100 clusters,
「東西南北」 ended up in the same cluster,
and so did 「父母兄弟」,
so there were some good signs,
but it was not at a level usable for prediction.
When only "good classics that could become a source" are used as INPUT,
the data volume is simply too small
to build a trained model that covers a wide range of kanji.
The example above is output for kanseki; the Japanese classics were about the same or worse.
Conclusion for plan ①:
I judged that treating "classics" as "training data" is difficult because of the amount of data.
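Plan ②: go and find the source
To give the conclusion first, plan ② worked reasonably well.
Using the same method as last time, produce the kanji candidates likely to be adopted,
then search for the positions where those kanji are used in the classics.
The idea is to find and extract pairs that are used "close" enough to each other
to be called a "source".
In addition, both the step that produces kanji candidates
and the step that checks and evaluates the resulting era names
have been upgraded in various ways compared with last time.
Below I introduce the points that were added, with implementation for the key points only.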
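Extracting kanji candidates
Let me start right away with the most central function of this story.
The idea is to assume that the kanji used in era names are kanji that carry a "good meaning",
and to find, from the joyo kanji list or the kyoiku kanji list,
the characters that are close to that "good meaning".
For example, the following characters, each used three or more times in era names, are taken as the "model" (otehon).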
- æ°žå 倩治å¿åé·æ£æå®å»¶æŠåŸ³å¯ä¿æ¿ä»ååº·å»ºæ ¶ä¹ å¹³åŒè²äº«å®çп倧äºå¯¿äžåé€èгåäžæ¿
A function that computes the "distance to the model (cosine similarity)",
and a function that computes that distance for every kanji and returns the closest ones,
are implemented as follows.
# Score how close a given kanji is to the "model" (otehon) characters:
# the mean similarity over the top 20% most similar model characters.
# JPmodel (the trained char2vec model) and NG_STR are defined elsewhere.
def get_otehon_ave_ruijido(char, otehon_str):
    # Top 20% of the model characters, but always at least one
    jyoui_kosuu = max(1, round(len(otehon_str) / 5))
    distance_list = []
    for otehon_char in otehon_str:
        distance_list.append(JPmodel.similarity(char, otehon_char))
    distance_list.sort(reverse=True)
    distance_list = distance_list[0:jyoui_kosuu]
    ave = sum(distance_list) / len(distance_list)
    return ave

# Given the model characters, return the candidate kanji that resemble them.
# (min_ruijido sets how similar a character must be to be returned.)
def get_Gengou_Kouho_Kanji(otehon_str, target_str, min_ruijido):
    char_val_list = []
    for char in target_str:
        if char in NG_STR:
            continue
        # Pair each character with its average similarity to the model list
        char_val_list.append([char, get_otehon_ave_ruijido(char, otehon_str)])
    # Sort by score, highest first
    char_val_list = sorted(char_val_list, key=lambda x: x[1], reverse=True)
    gengou_kouho_kanji_str = ""
    for char_val in char_val_list:
        if char_val[1] >= min_ruijido:
            gengou_kouho_kanji_str += char_val[0]
    return gengou_kouho_kanji_str
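To make the scoring idea concrete without a trained model, here is a minimal, self-contained sketch of "mean cosine similarity over the closest 20% of the model characters". The two-dimensional vectors, the placeholder characters 甲 and 乙, and the helper names are all made up for illustration; the real code above relies on the gensim model instead.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_fraction_mean(char, model_chars, vectors, fraction=0.2):
    """Mean similarity between char and its closest `fraction` of model characters."""
    sims = sorted(
        (cosine_similarity(vectors[char], vectors[m]) for m in model_chars),
        reverse=True,
    )
    keep = max(1, round(len(sims) * fraction))  # always keep at least one
    top = sims[:keep]
    return sum(top) / len(top)

# Hypothetical 2-dimensional vectors, invented purely for this example.
TOY_VECTORS = {
    "永": (1.0, 0.0), "和": (0.8, 0.6), "安": (0.6, 0.8),
    "平": (0.0, 1.0), "成": (1.0, 0.0),
    "甲": (0.6, 0.8),   # points the same way as one model character
    "乙": (-1.0, 0.0),  # points away from most model characters
}
TOY_MODEL = "永和安平成"

score_near = top_fraction_mean("甲", TOY_MODEL, TOY_VECTORS)
score_far = top_fraction_mean("乙", TOY_MODEL, TOY_VECTORS)
```

Averaging only the closest fifth means a candidate scores well if it resembles some of the model characters strongly, even if it is unlike the rest.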
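Last time, the flow was to first pick characters similar to each "model" character
and then "evaluate" each character.
This time I built the "evaluation function" first,
so that every kanji in any target set,
such as the joyo kanji list or the kyoiku kanji list, can be "evaluated".
With the "kyoiku kanji" as the target, the following result was obtained,
in descending order of score.
Only the top 50 are shown here.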
- æ°žåŸ³å¿ ä»åä¹ æž å åº·å€©å®æ£å幞æŸç«¹åç°æ°å®ææŽè£å®®å®è±åå®å«å»¶èèµé·çé€èª åå®¶éæžçŽé·å¿å°å€ªæ©æå¯ºè³æ¬
Let me print the candidate kanji obtained above
together with the number of times each was used in past era names.
for char in GENGOU_KOUHO_KANJI_STR:
    if char in OTEHON_STR:
        # Used N times in the past, and part of the model list
        print(char, "：は", otehon_val_dict[char], "回過去使用（お手本リストに含む）")
    elif char in ALL_OTEHON:
        # Used N times in the past, but not in the model list (found by the AI)
        print(char, "：は", otehon_val_dict[char], "回過去使用（お手本リストに含まれない＝AIが発見）")
    else:
        # In neither list: a new character found by the AI
        print(char, ": はどちらにも含まれていない、AIが見つけた新字です★")
æ°ž ïŒã¯ 29 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
埳 ïŒã¯ 16 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
å¿ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
ä» ïŒã¯ 13 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
å : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
ä¹
ïŒã¯ 9 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
æž
: ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å
ïŒã¯ 28 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
康 ïŒã¯ 10 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
倩 ïŒã¯ 23 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
å® ïŒã¯ 17 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
æ£ ïŒã¯ 19 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
å ïŒã¯ 3 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
幞 : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
æŸ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
竹 : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
ç° : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
æ° : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å® ïŒã¯ 7 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
æ ïŒã¯ 7 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
æŽ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
è£ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å®® : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å® : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
è± : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å® : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å« : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å»¶ ïŒã¯ 16 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
è ïŒã¯ 1 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãŸããªãïŒAIãçºèŠïŒ
èµ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
é· : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
ç : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
é€ ïŒã¯ 3 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
èª : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å®¶ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
é : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
æž : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
çŽ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
é· ïŒã¯ 19 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
å¿ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
å° : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
倪 : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
æ© : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
æ ïŒã¯ 19 åéå»äœ¿çšïŒãææ¬ãªã¹ãã«å«ãïŒ
寺 : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
è³ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
æ¬ : ã¯ã©ã¡ãã«ãå«ãŸããŠããªããAIãèŠã€ããæ°åã§ãâ
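This is close to the candidate characters from last time.
The top 50, that is the top 5% of the whole set (about 1000 kyoiku kanji),
are the characters the AI judged to be
"similar in meaning to past era names (≒ the model)". This is that result.
Starting with the character from last time's winner,
it is a little surprising that characters like 「幸」, 「洋」, 「豊」, and 「賀」
have in fact never been used in an era name until now.
They do feel like era-name-ish characters (?).
Incidentally, when the INPUT was widened to the "joyo kanji", the TOP50
gained new characters such as 「賢」, 「松」, and 「江」.
To human sensibilities some of these are
a bit questionable for an era name,
but seen as "the AI's calculation
regarded these characters as close to era-name characters too",
that is interesting in its own right.
Somehow, place-like characters such as 「宮」, 「家」, and 「寺」
seem to have been judged close.
「天」 also indicates a good place in a sense,
and perhaps it is because characters like 「建」 are in the model?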
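Filtering ①: stroke count
To make the resulting characters satisfy the "easy to write" condition,
I decided to filter by stroke count.
Using the following API, which IPA offers on a trial basis,
I fetched the stroke counts for all joyo kanji in advance.
MJ character information retrieval API (MJ文字情報取得API)
https://mojikiban.ipa.go.jp/search/help/api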
import time
import pickle

target_list_dict = {}
for char in target_list:
    # Sleeping between requests is mandatory
    time.sleep(3)
    api_result = get_char_data(char)
    if api_result['status'] == "success":
        print(api_result['results'][0]['読み'])
        # Reuse the response instead of calling the API a second time
        target_list_dict[char] = api_result
    else:
        print("API-ERR")
        #print(api_result)
print("API-Finish")
with open('jyouyou_kanji.dic', mode='wb') as f:
    pickle.dump(target_list_dict, f)
print("pickle-Finish")
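Hitting the API over and over for the same kanji
would be inconsiderate to the API provider,
the Information-technology Promotion Agency,
so for the characters used this time I fetch each one only once in advance, as above,
save the result with pickle,
and from then on use it locally.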
import pickle
import pprint
with open('jyouyou_kanji.dic', mode='rb') as f:
KANJI_INFO_DIC = pickle.load(f)
pprint.pprint(KANJI_INFO_DIC["å"])
print(KANJI_INFO_DIC["ç"]['results'][0]['ç·ç»æ°'])
{'count': 1,
'find': True,
'results': [{'IPAmjææãã©ã³ãå®è£
': {'ãã©ã³ãããŒãžã§ã³': '005.01', 'å®è£
ããUCS': 'U+548C'},
'JISX0213': {'å
æåºå': '0', 'æ°Žæº': '1', 'é¢åºç¹äœçœ®': '1-47-34'},
'MJæåå³åœ¢': {'MJæåå³åœ¢ããŒãžã§ã³': '1.0',
'uri': 'http://mojikiban.ipa.go.jp/MJ008199.png'},
'MJæåå³åœ¢å': 'MJ008199',
'UCS': {'察å¿ããUCS': 'U+548C', '察å¿ã«ããŽãªãŒ': 'A'},
'äœåºãããçµ±äžæåã³ãŒã': 'J+548C',
'å
¥ç®¡å€åã³ãŒã': '',
'å
¥ç®¡æ£åã³ãŒã': '548C',
'å€§åæº': 1162,
'倧挢å': '3490',
'å€§æŒ¢èªæ': 1374,
'æžç±çµ±äžæåçªå·': '040260',
'æ°å€§åå
ž': 1886,
'æ¥æ¬èªæŒ¢åèŸå
ž': 1397,
'æŒ¢åæœç': {'人åçšæŒ¢å': True, 'åžžçšæŒ¢å': True},
'ç»èšçµ±äžæåçªå·': '00040260',
'ç·ç»æ°': 8,
'èªã¿': {'èšèªã¿': ['ãããã', 'ããããã', 'ãªãã', 'ãªããã', 'ããã'],
'é³èªã¿': ['ã¯', 'ãª', 'ã«']},
'éšéŠå
ç»æ°': [{'å
ç»æ°': 5, 'éšéŠ': 30}]}],
'status': 'success'}
9
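In this way, detailed data such as stroke counts for a given kanji
can now be retrieved at any time.
This turned out to be quite handy data, and not only for era-name prediction!?
Using this function, I apply a filter to the candidate characters obtained earlier:
"the stroke count must be at or below a certain number".
So what should that "certain number" be?
I looked at the stroke counts of past era names.
(※ The stroke-count lookup function is omitted, since it can be built from the example above.)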
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from statistics import mean, median, variance, stdev

# Era names in chronological order. The original listing abbreviates the
# middle of this list ("省略" = omitted), so only its two ends appear here;
# the full list is needed to reproduce the figures below.
gengoulist = ["大化", "白雉", "昭和", "平成"]

kakusuulist = []
for gengou in gengoulist:
    # Stroke counts of the first and second character of each era name
    kakusuulist.append(get_Gengou_kakusuu(gengou[0]))
    kakusuulist.append(get_Gengou_kakusuu(gengou[1]))

# Line chart
left = np.array(range(0, len(kakusuulist)))
height = np.array(kakusuulist)
plt.plot(left, height)

# Distinct result names, so the statistics functions are not shadowed
kakusuu_mean = mean(kakusuulist)
kakusuu_median = median(kakusuulist)
kakusuu_variance = variance(kakusuulist)
kakusuu_stdev = stdev(kakusuulist)
print('mean: {0:.2f}'.format(kakusuu_mean))
print('median: {0:.2f}'.format(kakusuu_median))
print('variance: {0:.2f}'.format(kakusuu_variance))
print('standard deviation: {0:.2f}'.format(kakusuu_stdev))
mean: 7.83
median: 8.00
variance: 12.90
standard deviation: 3.59
So the mean turned out to be roughly 8 strokes.
I graphed it half expecting that
the stroke count would decrease toward the right (the modern side),
but it does not seem to have changed much.
Perhaps stroke counts have only been taken seriously
quite recently.
Based on this result, I settled on a number and implemented the stroke-count filtering.
(The result is deliberately omitted.)
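As a rough illustration of the stroke-count filter, the sketch below keeps only the candidates whose stroke count is at or below a threshold. The small stroke table, the function name, and the threshold of 12 are assumptions for this example; the threshold actually used is not stated in this article.

```python
# A small hand-written stroke table for illustration; the real data would
# come from the pickled API results.
STROKES = {"永": 5, "和": 8, "慶": 15, "安": 6, "護": 20}

def filter_by_strokes(candidates, stroke_table, max_strokes):
    """Keep candidates with a known stroke count no larger than max_strokes."""
    return "".join(
        c for c in candidates
        if c in stroke_table and stroke_table[c] <= max_strokes
    )

# 12 is an arbitrary threshold chosen only for this example.
kept = filter_by_strokes("永和慶安護", STROKES, 12)
```

Characters missing from the table are dropped rather than kept, which errs on the side of excluding anything whose stroke count cannot be checked.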
Incidentally, based on the on'yomi reading,
I was also able to implement a simple check for
MTSH exclusion (so that the initial is not the same letter as Meiji, Taisho, Showa, or Heisei).
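The MTSH check can be sketched as follows. This version assumes the reading has already been converted to romaji; how the article's program derives the initial from the on'yomi returned by the API is not shown, so the function name and input format here are hypothetical.

```python
# Initials already taken by Meiji, Taisho, Showa and Heisei
EXCLUDED_INITIALS = {"M", "T", "S", "H"}

def passes_mtsh_check(romaji_reading):
    """True when the reading is non-empty and its initial is not M, T, S or H."""
    if not romaji_reading:
        return False
    return romaji_reading[0].upper() not in EXCLUDED_INITIALS

checks = {name: passes_mtsh_check(name) for name in ["Heisei", "Showa", "Kansei", "anei"]}
```

Past era names are used as inputs only because their readings are well known; a real run would feed in the readings of the candidate pairs.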
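Filtering ②: excluding words used elsewhere
This could be done after matching against the kanseki and classics,
but words already used elsewhere cannot be used, so they need to be excluded.
During the actual trial and error,
the combination 「延安」 appeared as a top candidate.
However, 「延安」 exists as the name of a city in China,
and wikipedia has an article titled 「延安市」.
Last time I only implemented exclusion based on the mecab dictionary,
but the mecab dictionary is not all that versatile
(and it is somewhat awkward to use).
So I fetched all wikipedia article titles
and considered a filter that treats
"the first two characters / the last two characters" of each title
as NG, because something else already exists under that name or it invites misunderstanding.
In the case above, 「延安」 is filtered out by this.
Most celebrities' surnames and given names can be filtered out as well.
Example: 田村正和 → 「田村」 and 「正和」 are registered in the exclusion list.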
import codecs

# wiki_title/jawiki-latest-all-titles-in-ns0
def makeWikiTitleList(filepath):
    WIKI_TITLE_LIST_MAE = []
    WIKI_TITLE_LIST_ATO = []
    with codecs.open(filepath, "r", "utf-8") as infile:
        for line in infile:
            # Strip the trailing newline so it is not counted as a character
            title = line.rstrip("\r\n")
            if len(title) > 1:
                # First two characters
                WIKI_TITLE_LIST_MAE.append(title[0:2])
                # Last two characters
                WIKI_TITLE_LIST_ATO.append(title[-2:])
    # Remove duplicates
    WIKI_TITLE_LIST_MAE = list(set(WIKI_TITLE_LIST_MAE))
    WIKI_TITLE_LIST_ATO = list(set(WIKI_TITLE_LIST_ATO))
    return WIKI_TITLE_LIST_MAE, WIKI_TITLE_LIST_ATO
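The rule is not "exclude only titles that are exactly two characters long"
but "exclude the first two and the last two characters".
This is actually quite a strong filter: it should catch most personal and place names,
as well as words that invite misunderstanding or could be taken to mean something else.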
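Applying the title filter then comes down to a membership test. The sketch below uses an in-memory list of two titles taken from the examples in the text; the function names are hypothetical, and sets are used instead of lists so that each lookup stays fast even with every wikipedia title loaded.

```python
def build_ng_sets(titles):
    """Collect the first two and last two characters of every title."""
    prefixes, suffixes = set(), set()
    for title in titles:
        if len(title) > 1:
            prefixes.add(title[:2])
            suffixes.add(title[-2:])
    return prefixes, suffixes

def is_allowed(candidate, prefixes, suffixes):
    """A two-kanji candidate passes only if no title starts or ends with it."""
    return candidate not in prefixes and candidate not in suffixes

# The two titles mentioned in the text above
NG_PREFIXES, NG_SUFFIXES = build_ng_sets(["延安市", "田村正和"])
results = {c: is_allowed(c, NG_PREFIXES, NG_SUFFIXES) for c in ["延安", "田村", "正和", "安延"]}
```

With these two titles, 「延安」, 「田村」, and 「正和」 are rejected, while a pair that no title starts or ends with passes through.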
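Matching against kanseki and classics
There are other filters too, such as one for past emperors' names,
and the evaluation preparation continued, but this is already far too long,
so I will skip the small details and move on to matching against kanseki and classics.
This is code that exhaustively searches the input text for
"places where candidate kanji occur within a given distance of each other".
The code has become a bit messy.
It is cluttered because I added informational output such as
"show the neighboring text as well, so the source can be identified"
and "show the similarity between the model (era-name kanji) and each candidate kanji".
(In practice I added a little more information than this.)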
# Given a text, return every place where two candidate characters occur
# close to each other. JPmodel is the trained char2vec model defined elsewhere.
def get_Gengou_Kouho_kinsetu(OTEHON_STR, GENGOU_KOUHO_KANJI_STR, target_str, max_kyori):
    kouho_resultlist = []
    kaisi_no = 0
    target_str_len = len(target_str)
    while kaisi_no < target_str_len:
        kaisi_char = target_str[kaisi_no]
        if kaisi_char in GENGOU_KOUHO_KANJI_STR:
            kaisi_to_end = 1
            # Both conditions: stay inside the text and within the maximum distance
            while kaisi_no + kaisi_to_end < target_str_len and kaisi_to_end <= max_kyori:
                end_no = kaisi_no + kaisi_to_end
                end_char = target_str[end_no]
                if end_char in GENGOU_KOUHO_KANJI_STR:
                    # Both characters are in the candidate string
                    distance = JPmodel.similarity(kaisi_char, end_char)
                    # Quote up to 5 characters of context on each side
                    inyou_kaisi_no = max(0, kaisi_no - 5)
                    inyou_end_no = min(target_str_len, end_no + 1 + 5)
                    kouho_resultlist.append([kaisi_char + end_char,
                                             kaisi_to_end,
                                             target_str[inyou_kaisi_no:inyou_end_no],
                                             get_otehon_ave_ruijido(kaisi_char, OTEHON_STR),
                                             get_otehon_ave_ruijido(end_char, OTEHON_STR),
                                             distance
                                             ])
                # Move on to the next position
                kaisi_to_end += 1
        kaisi_no += 1
    return kouho_resultlist
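Stripped of the similarity scores, the search itself can be sketched in a few lines. This simplified version is an illustration only, with hypothetical names, and returns just the pair, the distance between the two characters, and the surrounding excerpt. The sample text 地平天成 is the phrase commonly cited as a source of 平成.

```python
def find_nearby_pairs(text, candidates, max_distance, context=5):
    """Return (pair, distance, excerpt) for candidate characters found close together."""
    results = []
    for start, first in enumerate(text):
        if first not in candidates:
            continue
        # Last index to inspect: within max_distance and inside the text
        last = min(len(text) - 1, start + max_distance)
        for end in range(start + 1, last + 1):
            second = text[end]
            if second in candidates:
                excerpt = text[max(0, start - context):end + 1 + context]
                results.append((first + second, end - start, excerpt))
    return results

pairs = find_nearby_pairs("地平天成", "平成", max_distance=2)
```

Here 平 and 成 sit two characters apart, so with `max_distance=2` the pair 平成 is reported together with the whole four-character phrase as its excerpt.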
Using this function to search the various kanseki and classics
that Fuji TV prepared,
the plan is to find
"pairs of two good kanji found by the AI"
that "occur in the kanseki or classics at a place that can be called a source".
I was in fact able to find a number of candidates.
One somewhat interesting point: when Japanese classics are the target,
they contain the notation "XX天皇" (Emperor XX) very often to begin with,
so the results tended to be dominated by
past emperors' names and combinations using parts of them.
Using part of such a name is not really a "source",
so let me report that for Japanese classics
I used text data from which the "XX天皇" and "XX" parts
had all been removed beforehand.
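One way such a cleanup could look is sketched below. It assumes, purely for illustration, that an emperor's name is the two kanji immediately before 天皇; the article does not say how the names were actually identified, and the function name and the sample sentence are made up.

```python
import re

# CJK unified ideographs (basic block); an approximation of "a kanji"
KANJI = r"[\u4e00-\u9fff]"

def strip_emperor_names(text, name_length=2):
    """Remove every "XX天皇" and every other occurrence of those XX names."""
    pattern = re.compile("({0}{{{1}}})天皇".format(KANJI, name_length))
    names = set(pattern.findall(text))
    cleaned = pattern.sub("", text)
    for name in names:
        cleaned = cleaned.replace(name, "")
    return cleaned

cleaned_text = strip_emperor_names("仁徳天皇は仁徳と呼ばれた")
```

The names are collected first and removed in a second pass, so a bare "XX" that appears without 天皇 elsewhere in the text is dropped as well.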
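Finally, the candidates found this way
are given a "numeric score for how suitable they are as an era name".
The basis is how close each character is to the "model",
plus the same "distance between the two kanji" judgment as last time.
I also apply the wiki title filter described earlier,
with a check that gives zero points to (or excludes) anything that matches,
and checks such as the MTSH exclusion.
(Last time, narrowing before evaluation cut the candidates down considerably,
 so the process was rather like elimination. This time I
 take more candidates, keep the filter conditions looser,
 and finally output the top of the "numeric suitability score".)
I would like to write up the results from here,
but the final stage is still being checked,
and if I wrote down the results,
the selection side could end up avoiding them,
which would defeat a serious attempt to get it right.
So, regrettably, I will stop here for now.
To be continued
～ to be continued ～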
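Afterword and thoughts
■ If this article gets a lot of reactions (likes), I will definitely write the continuation at a later date.
※ As you can tell from how long both the previous article and this one are, writing about predicting the era name is a lot of work.
  If many people want to know what happens next, I will do my best to write it.
  If there are only a few, maybe "just watch the TV broadcast for the rest" will do.
  If you are curious, please leave a like before you forget.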
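■ Thoughts:
This time, beyond simple AI training with char2vec,
to raise the hit probability by "computation alone",
I implemented checks for overlap with existing terms, stroke-count checks,
and above all the search for source passages,
the kinds of things done in actual era-name selection,
so that as much as possible is handled entirely by the program.
I think I built up some fairly unglamorous parts.
Having built this much, rather than a prediction by AI,
I have come to feel it would be better to offer these programs
to the people who actually select the era name
and have them used as a system that makes
"discovering candidates" and "checking" easier.
I believe it would certainly make the selection work more efficient.
Even now, if the people involved asked, I would hand it over at any time.
What remains is the pronunciation check, which still lacks a criterion and is hard to get right.
("Easy to say": by what criterion? Or
 "sounds like something else by ear alone": how minor a word should be included?)
Even when searching the sources this time,
I found it quite hard not to directly use
characters I personally like or my own preferred combinations.
Deciding everything purely through
"numbers" and "adjustments inside the program" is quite difficult.
(I do tune parameters, so I cannot claim it is completely free of my own views;
 some probably creep in, but the intent is to exclude my own sensibilities and arbitrary predictions as far as possible.)
The "era-name predictions" circulating in the world
tend to be filled, perhaps unknowingly, with the predictor's own ideas, thoughts, and hopes for Japan.
Seen from the side, such predictions really start from the conclusion:
"the prediction process is opaque, but to the predictor it is the right prediction".
The AI prediction this time is the exact opposite:
"the prediction process is transparent, but to the predictor it is not the right prediction".
Even so, once I ran the computation,
whether it was last time's result
or this time's result 「(secret)」,
strangely, I grow fond of it after looking at it for a while.
Is that the energy that good kanji carry?
Probably, with the April 1 announcement too, many people
will at first feel doubt or find it odd,
and then grow fond of it as they keep looking at it.
That is what I think, and what I hope.
I hope it will be a good era that everyone is fond of.
That is all.
To be continued next time (???) ... whether it is depends on the reaction.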