
Machine Translation with PyTorch's Built-in Transformer

Contents

    • Introduction
    • Dataset
    • Environment Requirements
    • Experiment Code
    • Experiment Results
    • References

Introduction

This post uses PyTorch's built-in transformer layers for machine translation from German to English. For a from-scratch implementation, see the companion post on implementing a Transformer from scratch in PyTorch, which gives a deeper understanding of the architecture.
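By default, nn.Transformer uses a sequence-first layout, which is the convention the code below follows. As a minimal sketch (the sizes here are made up for illustration), the module maps a (src_len, batch, d_model) source and a (trg_len, batch, d_model) target to a (trg_len, batch, d_model) output:

import torch
import torch.nn as nn

# Minimal shape check for nn.Transformer's default (seq_len, batch, embed)
# convention; all sizes are illustrative only.
d_model, nhead = 512, 8
transformer = nn.Transformer(d_model, nhead, num_encoder_layers=3, num_decoder_layers=3)
src = torch.rand(10, 32, d_model)  # (src_len, batch_size, d_model)
tgt = torch.rand(7, 32, d_model)   # (trg_len, batch_size, d_model)
out = transformer(src, tgt)
print(out.shape)  # torch.Size([7, 32, 512])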

Dataset

Multi30k, a German-English parallel corpus of roughly 30,000 sentence pairs (image captions), with standard train/validation/test splits.

Environment Requirements

The experiment uses torch, torchtext, and spacy, where spacy handles tokenization. The spacy language models must also be downloaded into the virtual environment so that tokenization works:

# To install spacy languages do:
python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm
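Note that this post uses the old torchtext API (Field, BucketIterator, and Multi30k under torchtext.data / torchtext.datasets). In torchtext 0.9 that API moved to torchtext.legacy, and from 0.12 it was removed entirely, so on a newer torchtext the imports may need to change along these lines:

# Only applies to torchtext 0.9-0.11, where the old API lives under `legacy`;
# on torchtext >= 0.12 it is gone entirely and an older torchtext must be pinned.
from torchtext.legacy.data import Field, BucketIterator
from torchtext.legacy.datasets import Multi30k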

Experiment Code

The code comes from the GitHub repository in reference [3] below.

transformer_translation.py:

# Bleu score 32.02
import torch
import torch.nn as nn
import torch.optim as optim
import spacy
from utils import translate_sentence, bleu, save_checkpoint, load_checkpoint
from torch.utils.tensorboard import SummaryWriter
from torchtext.datasets import Multi30k
from torchtext.data import Field, BucketIterator

"""
To install spacy languages do:
python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm
"""
spacy_ger = spacy.load("de_core_news_sm")
spacy_eng = spacy.load("en_core_web_sm")


# Tokenize German
def tokenize_ger(text):
    return [tok.text for tok in spacy_ger.tokenizer(text)]


# Tokenize English
def tokenize_eng(text):
    return [tok.text for tok in spacy_eng.tokenizer(text)]


german = Field(tokenize=tokenize_ger, lower=True, init_token="<sos>", eos_token="<eos>")

english = Field(
    tokenize=tokenize_eng, lower=True, init_token="<sos>", eos_token="<eos>"
)

train_data, valid_data, test_data = Multi30k.splits(
    exts=(".de", ".en"), fields=(german, english)
)

german.build_vocab(train_data, max_size=10000, min_freq=2)
english.build_vocab(train_data, max_size=10000, min_freq=2)


class Transformer(nn.Module):
    def __init__(
        self,
        embedding_size,
        src_vocab_size,
        trg_vocab_size,
        src_pad_idx,
        num_heads,
        num_encoder_layers,
        num_decoder_layers,
        forward_expansion,
        dropout,
        max_len,
        device,
    ):
        super(Transformer, self).__init__()
        self.src_word_embedding = nn.Embedding(src_vocab_size, embedding_size)
        self.src_position_embedding = nn.Embedding(max_len, embedding_size)
        self.trg_word_embedding = nn.Embedding(trg_vocab_size, embedding_size)
        self.trg_position_embedding = nn.Embedding(max_len, embedding_size)

        self.device = device
        self.transformer = nn.Transformer(
            embedding_size,
            num_heads,
            num_encoder_layers,
            num_decoder_layers,
            forward_expansion,  # note: passed positionally as dim_feedforward
            dropout,
        )
        self.fc_out = nn.Linear(embedding_size, trg_vocab_size)
        self.dropout = nn.Dropout(dropout)
        self.src_pad_idx = src_pad_idx

    def make_src_mask(self, src):
        src_mask = src.transpose(0, 1) == self.src_pad_idx
        # (N, src_len)
        return src_mask.to(self.device)

    def forward(self, src, trg):
        src_seq_length, N = src.shape
        trg_seq_length, N = trg.shape

        src_positions = (
            torch.arange(0, src_seq_length)
            .unsqueeze(1)
            .expand(src_seq_length, N)
            .to(self.device)
        )

        trg_positions = (
            torch.arange(0, trg_seq_length)
            .unsqueeze(1)
            .expand(trg_seq_length, N)
            .to(self.device)
        )

        embed_src = self.dropout(
            (self.src_word_embedding(src) + self.src_position_embedding(src_positions))
        )
        embed_trg = self.dropout(
            (self.trg_word_embedding(trg) + self.trg_position_embedding(trg_positions))
        )

        src_padding_mask = self.make_src_mask(src)
        trg_mask = self.transformer.generate_square_subsequent_mask(trg_seq_length).to(
            self.device
        )

        out = self.transformer(
            embed_src,
            embed_trg,
            src_key_padding_mask=src_padding_mask,
            tgt_mask=trg_mask,
        )
        out = self.fc_out(out)
        return out


# We're ready to define everything we need for training our Seq2Seq model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

load_model = False
save_model = True

# Training hyperparameters
num_epochs = 10
learning_rate = 3e-4
batch_size = 32

# Model hyperparameters
src_vocab_size = len(german.vocab)
trg_vocab_size = len(english.vocab)
embedding_size = 512
num_heads = 8
num_encoder_layers = 3
num_decoder_layers = 3
dropout = 0.10
max_len = 100
forward_expansion = 4
src_pad_idx = english.vocab.stoi["<pad>"]

# Tensorboard to get nice loss plot
writer = SummaryWriter("runs/loss_plot")
step = 0

train_iterator, valid_iterator, test_iterator = BucketIterator.splits(
    (train_data, valid_data, test_data),
    batch_size=batch_size,
    sort_within_batch=True,
    sort_key=lambda x: len(x.src),
    device=device,
)

model = Transformer(
    embedding_size,
    src_vocab_size,
    trg_vocab_size,
    src_pad_idx,
    num_heads,
    num_encoder_layers,
    num_decoder_layers,
    forward_expansion,
    dropout,
    max_len,
    device,
).to(device)

optimizer = optim.Adam(model.parameters(), lr=learning_rate)

scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.1, patience=10, verbose=True
)

pad_idx = english.vocab.stoi["<pad>"]
criterion = nn.CrossEntropyLoss(ignore_index=pad_idx)

if load_model:
    load_checkpoint(torch.load("my_checkpoint.pth.tar"), model, optimizer)

# 'a', 'horse', 'is', 'walking', 'under', 'a', 'bridge', 'next', 'to', 'a', 'boat', '.'
sentence = "ein pferd geht unter einer brücke neben einem boot."

for epoch in range(num_epochs):
    print(f"[Epoch {epoch} / {num_epochs}]")

    if save_model:
        checkpoint = {
            "state_dict": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        }
        save_checkpoint(checkpoint)

    model.eval()
    translated_sentence = translate_sentence(
        model, sentence, german, english, device, max_length=50
    )

    print(f"Translated example sentence: \n {translated_sentence}")

    model.train()
    losses = []

    for batch_idx, batch in enumerate(train_iterator):
        # Get input and targets and get to cuda
        inp_data = batch.src.to(device)
        target = batch.trg.to(device)

        # Forward prop
        output = model(inp_data, target[:-1, :])

        # Output is of shape (trg_len, batch_size, output_dim) but Cross Entropy Loss
        # doesn't take input in that form. For example if we have MNIST we want to have
        # output to be: (N, 10) and targets just (N). Here we can view it in a similar
        # way that we have output_words * batch_size that we want to send in into
        # our cost function, so we need to do some reshaping.
        # Let's also remove the start token while we're at it
        output = output.reshape(-1, output.shape[2])
        target = target[1:].reshape(-1)

        optimizer.zero_grad()

        loss = criterion(output, target)
        losses.append(loss.item())

        # Back prop
        loss.backward()
        # Clip to avoid exploding gradient issues, makes sure grads are
        # within a healthy range
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1)

        # Gradient descent step
        optimizer.step()

        # plot to tensorboard
        writer.add_scalar("Training loss", loss, global_step=step)
        step += 1

    mean_loss = sum(losses) / len(losses)
    scheduler.step(mean_loss)

# running on entire test data takes a while
score = bleu(test_data[1:100], model, german, english, device)
print(f"Bleu score {score * 100:.2f}")

utils.py:

import torch
import spacy
from torchtext.data.metrics import bleu_score


def translate_sentence(model, sentence, german, english, device, max_length=50):
    # Load german tokenizer
    spacy_ger = spacy.load("de_core_news_sm")

    # Create tokens using spacy and everything in lower case (which is what our vocab is)
    if type(sentence) == str:
        tokens = [token.text.lower() for token in spacy_ger(sentence)]
    else:
        tokens = [token.lower() for token in sentence]

    # Add <SOS> and <EOS> in beginning and end respectively
    tokens.insert(0, german.init_token)
    tokens.append(german.eos_token)

    # Go through each german token and convert to an index
    text_to_indices = [german.vocab.stoi[token] for token in tokens]

    # Convert to Tensor
    sentence_tensor = torch.LongTensor(text_to_indices).unsqueeze(1).to(device)

    outputs = [english.vocab.stoi["<sos>"]]
    for i in range(max_length):
        trg_tensor = torch.LongTensor(outputs).unsqueeze(1).to(device)

        with torch.no_grad():
            output = model(sentence_tensor, trg_tensor)

        best_guess = output.argmax(2)[-1, :].item()
        outputs.append(best_guess)

        if best_guess == english.vocab.stoi["<eos>"]:
            break

    translated_sentence = [english.vocab.itos[idx] for idx in outputs]
    # remove start token
    return translated_sentence[1:]


def bleu(data, model, german, english, device):
    targets = []
    outputs = []

    for example in data:
        src = vars(example)["src"]
        trg = vars(example)["trg"]

        prediction = translate_sentence(model, src, german, english, device)
        prediction = prediction[:-1]  # remove <eos> token

        targets.append([trg])
        outputs.append(prediction)

    return bleu_score(outputs, targets)


def save_checkpoint(state, filename="my_checkpoint.pth.tar"):
    print("=> Saving checkpoint")
    torch.save(state, filename)


def load_checkpoint(checkpoint, model, optimizer):
    print("=> Loading checkpoint")
    model.load_state_dict(checkpoint["state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer"])
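For standalone use outside the training loop, a sketch like the following would restore a checkpoint and translate a sentence. This assumes model, german, english, optimizer, and device are constructed as in transformer_translation.py and that a checkpoint file already exists on disk:

# Hypothetical standalone usage of the helpers above; assumes the objects
# from transformer_translation.py and a saved checkpoint on disk.
load_checkpoint(torch.load("my_checkpoint.pth.tar"), model, optimizer)
model.eval()
tokens = translate_sentence(
    model,
    "ein pferd geht unter einer brücke neben einem boot.",
    german,
    english,
    device,
    max_length=50,
)
print(" ".join(tok for tok in tokens if tok != "<eos>"))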

Experiment Results

Translating the following German sentence:

sentence = "ein pferd geht unter einer brücke neben einem boot."

produces the translation:

['a', 'horse', 'walks', 'underneath', 'a', 'bridge', 'next', 'to', 'a', 'boat', '.', '<eos>']

The BLEU score is 31.73.
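For reference, this number comes from torchtext's corpus-level bleu_score, which takes tokenized candidate sentences and (nested) lists of reference sentences. A toy example with made-up sentences:

from torchtext.data.metrics import bleu_score

# Toy illustration of the metric reported above; the sentences are made up.
# Candidates: a list of token lists; references: one list of reference
# token lists per candidate (hence the extra level of nesting).
candidate = [["a", "horse", "walks", "under", "a", "bridge", "."]]
references = [[["a", "horse", "is", "walking", "under", "a", "bridge", "."]]]
# Well above zero here, since the sentences share many n-grams.
print(f"{bleu_score(candidate, references) * 100:.2f}")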

Training ran for 10 epochs; the full output is shown below:


# Result
=> Loading checkpoint
[Epoch 0 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'under', 'a', 'boat', 'next', 'to', 'a', 'boat', '.', '<eos>']
[Epoch 1 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'underneath', 'a', 'bridge', 'beside', 'a', 'boat', '.', '<eos>']
[Epoch 2 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'is', 'walking', 'beside', 'a', 'boat', 'under', 'a', 'bridge', '.', '<eos>']
[Epoch 3 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'under', 'a', 'bridge', 'next', 'to', 'a', 'boat', '.', '<eos>']
[Epoch 4 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'under', 'a', 'bridge', 'next', 'to', 'a', 'boat', '.', '<eos>']
[Epoch 5 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'beside', 'a', 'boat', 'next', 'to', 'a', 'boat', '.', '<eos>']
[Epoch 6 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'is', 'walking', 'underneath', 'a', 'bridge', 'under', 'a', 'boat', '.', '<eos>']
[Epoch 7 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'under', 'a', 'bridge', 'next', 'to', 'a', 'boat', '.', '<eos>']
[Epoch 8 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'beneath', 'a', 'bridge', 'next', 'to', 'a', 'boat', '.', '<eos>']
[Epoch 9 / 10]
=> Saving checkpoint
Translated example sentence: ['a', 'horse', 'walks', 'underneath', 'a', 'bridge', 'next', 'to', 'a', 'boat', '.', '<eos>']
Bleu score 31.73

References

[1] https://blog.csdn.net/weixin_43632501/article/details/98731800
[2] https://www.youtube.com/watch?v=M6adRGJe5cQ
[3] https://github.com/aladdinpersson/Machine-Learning-Collection/blob/master/ML/Pytorch/more_advanced/seq2seq_transformer/seq2seq_transformer.py
[4] https://blog.csdn.net/g11d111/article/details/100103208


