[Original] Getting Started with PyTorch (4): Regression
import torch
from torch.autograd import Variable
import torch.nn.functional as F
import matplotlib.pyplot as plt
torch.manual_seed(1)
x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = x.pow(2) + 0.2 * torch.rand(x.size())
x, y = Variable(x), V...
2022-03-25 10:38:13 259
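The preview above is cut off mid-line. As a hedged, self-contained sketch of the same curve-fitting exercise, written for current PyTorch (where `Variable` has been folded into `Tensor`, so the wrapper is no longer needed), the network shape and training loop below are illustrative choices, not the post's exact code:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)
x = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)  # shape (100, 1)
y = x.pow(2) + 0.2 * torch.rand(x.size())               # noisy y = x^2

# a small one-hidden-layer network is enough for this toy curve
net = torch.nn.Sequential(
    torch.nn.Linear(1, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 1),
)
optimizer = torch.optim.SGD(net.parameters(), lr=0.2)

for _ in range(300):
    loss = F.mse_loss(net(x), y)   # mean-squared error against the noisy targets
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(float(loss))  # should approach the noise floor of the 0.2*rand term
```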
[Original] Getting Started with PyTorch (3): Activation Functions
import torch
import torch.nn.functional as F
from torch.autograd import Variable
import matplotlib.pyplot as plt
# fake data
x = torch.linspace(-5, 5, 200)
x = Variable(x)
x_np = x.data.numpy()
y_relu = F.relu(x).data.numpy()
y_sigmoid = torch.sigmoi...
2022-03-19 16:48:20 366
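The snippet is truncated after the sigmoid line. The post clearly covers relu and sigmoid; tanh and softplus are common companions in this kind of tutorial and are added here as assumptions. In current PyTorch the curves can be computed directly on tensors, no `Variable` wrapper needed:

```python
import torch

x = torch.linspace(-5, 5, 200)

# a few standard activation curves over the same inputs
y_relu = torch.relu(x)
y_sigmoid = torch.sigmoid(x)
y_tanh = torch.tanh(x)
y_softplus = torch.nn.functional.softplus(x)

# relu zeroes the negative half; sigmoid squashes into (0, 1); tanh into (-1, 1)
print(y_relu.min().item(), y_sigmoid.min().item(), y_tanh.abs().max().item())
```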
[Original] Getting Started with PyTorch (2): Variable
import torch
from torch.autograd import Variable
tensor = torch.FloatTensor([[1, 2], [3, 4]])
variable = Variable(tensor, requires_grad=True)
print(tensor)
print(variable)
t_out = torch.mean(tensor * tensor)  # x^2
v_out = torch.mean(variable * variab...
2022-03-19 16:20:02 93
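The preview stops mid-expression. In current PyTorch, `Variable` is deprecated (merged into `Tensor`), so the same experiment, comparing a plain tensor with one that tracks gradients, can be sketched as follows; this is a modernized sketch, not the post's exact code:

```python
import torch

tensor = torch.FloatTensor([[1, 2], [3, 4]])
variable = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)

t_out = torch.mean(tensor * tensor)      # mean(x^2) on a plain tensor: no graph built
v_out = torch.mean(variable * variable)  # same value, but autograd records the op
v_out.backward()

# d/dx of mean(x^2) over 4 elements is 2x/4 = x/2
print(variable.grad)
```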
[Original] Getting Started with PyTorch (1): Numpy vs Torch
Apologies: I spent the past few months writing my thesis, so the reinforcement learning series has gone without updates. I will keep it up from now on. Starting a new series below: Getting Started with PyTorch.
import torch
import numpy as np
np_data = np.arange(6).reshape((2, 3))
torch_data = torch.from_numpy(np_data)
tensor2array = torch_data.numpy()
print('\nnumpy', np_data)
print('\ntorch', tor...
2022-03-19 16:01:12 983
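Continuing the truncated preview, a small self-contained round trip between NumPy and Torch (the point of the post) might look like this; the `abs` comparison at the end is an illustrative addition:

```python
import numpy as np
import torch

np_data = np.arange(6).reshape((2, 3))
torch_data = torch.from_numpy(np_data)   # zero-copy view of the numpy array
tensor2array = torch_data.numpy()        # back to numpy, same underlying buffer

print('numpy:', np_data)
print('torch:', torch_data)

# elementwise ops agree between the two libraries
data = [-1, -2, 1, 2]
print(np.abs(data), torch.abs(torch.FloatTensor(data)))
```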
[Original] PolicyIterationSolution
# coding: utf-8
# comments revised by geling, 2018-04-21
# policy output changed to multi-policy output by liuyubiao
import numpy as np
import pprint
import sys
import PolicyEvaluationSolution
if "../" not in sys.path:
    sys.path.append("../")
from lib.envs.gridworld import GridworldEnv
# produce multi-policy output
# define two global var...
2021-10-10 16:23:46 197
[Original] ValueIterationSolution
# coding: utf-8
# comments revised by geling, 2018-04-21
# policy output changed to multi-policy output by liuyubiao
import numpy as np
import pprint
import sys
if "../" not in sys.path:
    sys.path.append("../")
from lib.envs.gridworld import GridworldEnv
# produce multi-policy output
# define one global variable to record the number of iterations
i_num = 1
# given the four actions passed in...
2021-10-10 16:21:55 113
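The snippet depends on `lib.envs.gridworld.GridworldEnv`, which is not included here. As a hedged, self-contained sketch of the same value-iteration update, the code below builds a tiny 4x4 gridworld inline (terminals in two corners, reward -1 per move, as in the classic Sutton-Barto example) instead of importing the course's environment:

```python
import numpy as np

N = 4
TERMINALS = {0, N * N - 1}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(s, a):
    """Deterministic transition; moving off the grid leaves the state unchanged."""
    if s in TERMINALS:
        return s, 0.0
    r, c = divmod(s, N)
    nr = min(max(r + a[0], 0), N - 1)
    nc = min(max(c + a[1], 0), N - 1)
    return nr * N + nc, -1.0

def value_iteration(gamma=1.0, theta=1e-6):
    V = np.zeros(N * N)
    while True:
        delta = 0.0
        for s in range(N * N):
            if s in TERMINALS:
                continue
            # Bellman optimality backup: take the best action's one-step return
            best = max(r + gamma * V[s2] for s2, r in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    return V

V = value_iteration()
print(V.reshape(N, N))  # each entry is minus the distance to the nearest terminal
```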
[Original] PolicyEvaluationSolution
# coding: utf-8
import numpy as np
import sys
if "../" not in sys.path:
    sys.path.append("../")
from lib.envs.gridworld import GridworldEnv
# policy_eval is the policy-evaluation routine: it takes the policy to evaluate,
# the environment env, a discount factor, and a convergence threshold, and
# returns the value function v that converges under the current policy
def policy_eval(policy, env, discount_...
2021-10-10 16:19:30 332
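Since the course's `GridworldEnv` is again unavailable, here is a hedged, self-contained sketch of iterative policy evaluation for the equiprobable random policy on the same inline 4x4 gridworld (terminals in two corners, reward -1 per move); the converged values reproduce the well-known Sutton-Barto table (-14, -18, -20, -22, ...):

```python
import numpy as np

N = 4
TERMINALS = {0, N * N - 1}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(s, a):
    """Deterministic transition; moving off the grid leaves the state unchanged."""
    if s in TERMINALS:
        return s, 0.0
    r, c = divmod(s, N)
    nr = min(max(r + a[0], 0), N - 1)
    nc = min(max(c + a[1], 0), N - 1)
    return nr * N + nc, -1.0

def policy_eval(gamma=1.0, theta=1e-8):
    """Iterative evaluation of the equiprobable random policy (prob 0.25 per action)."""
    V = np.zeros(N * N)
    while True:
        delta = 0.0
        for s in range(N * N):
            if s in TERMINALS:
                continue
            # Bellman expectation backup: average the four actions' one-step returns
            v = sum(0.25 * (r + gamma * V[s2])
                    for s2, r in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < theta:
            break
    return V

V = policy_eval()
print(np.round(V.reshape(N, N)))
```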
[Original] Deep Reinforcement Learning: Academic Frontiers and Practical Applications - PPO
import torch.nn as nn
import torch
import torch.nn.functional as F
from torch.autograd import Variable
import random
import gym
import time
# PPO actor-critic model
class Model(nn.Module):
    def __init__(self, num_inputs, num_outputs):
        super(Model...
2021-09-30 22:14:56 398
[Original] Deep Reinforcement Learning: Academic Frontiers and Practical Applications - Dueling DQN
class DuelingDQN:
    def __init__(self, …, dueling=True, sess=None):
        # omitted
        self.dueling = dueling
        # omitted
        if sess is None:
            self.sess = tf.Session()
            self.sess.run(tf.global_variables_initializer())
        ...
2021-09-25 12:07:14 338
[Original] Reinforcement Learning: Principles and Python Implementation (4)
Blackjack: Blackjack-v0
%matplotlib inline
import numpy as np
np.random.seed(0)
import matplotlib.pyplot as plt
import gym
Using the environment:
env = gym.make("Blackjack-v0")
env.seed(0)
print('observation space = {}'.format(env.observation_space))
print('action space = {}'.format(env.action_...
2021-09-24 19:46:54 634
[Original] Reinforcement Learning: Principles and Python Implementation (3)
Sliding on ice: FrozenLake-v0
import numpy as np
np.random.seed(0)
import gym
Using the environment:
env = gym.make('FrozenLake-v0')
env.seed(0)
print('observation space = {}'.format(env.observation_space))
print('action space = {}'.format(env.action_space))
print('observation space size = {}'.format(env.unwrapped...
2021-09-24 17:52:56 839
[Original] Deep Reinforcement Learning: Academic Frontiers and Practical Applications - DDQN
class DoubleDQN:
    def learn(self):
        # this part is the same as in DQN
        if self.learn_step_counter % self.replace_target_iter == 0:
            self.sess.run(self.replace_target_op)
            print('\ntarget_params_replaced\n')
        if self.memory_count...
2021-09-24 17:24:01 549
[Original] Deep Reinforcement Learning: Academic Frontiers and Practical Applications - DQN
while True:
    env.render()
    action = RL.choose_action(observation)
    observation_, reward, done = env.step(action)
    RL.store_transition(observation, action, reward, observation_)
    if (step > x) and (step % y == 0):
        RL.learn()
    ...
2021-09-20 16:45:00 608
[Original] Reinforcement Learning: Principles and Python Implementation (2)
Solving the Bellman equations
import sympy
from sympy import symbols
sympy.init_printing()
Solving the Bellman expectation equation:
v_hungry, v_full = symbols('v_hungry v_full')
q_hungry_eat, q_hungry_none, q_full_eat, q_full_none = \
    symbols('q_hungry_eat q_hungry_none q_full_eat...
2021-09-18 21:07:01 370 2
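The snippet cuts off before the equations themselves, and the book's hungry/full dynamics are not reproduced here. As a hypothetical stand-in with made-up rewards and transition probabilities, the same sympy pattern of solving a Bellman expectation equation as an exact linear system looks like this:

```python
import sympy
from sympy import Eq, Rational, solve, symbols

v1, v2 = symbols('v1 v2')
gamma = Rational(9, 10)

# hypothetical 2-state MDP under a fixed policy:
#   state 1: reward 1, stays with prob 1/2, moves to state 2 with prob 1/2
#   state 2: reward 0, always returns to state 1
eq1 = Eq(v1, 1 + gamma * (Rational(1, 2) * v1 + Rational(1, 2) * v2))
eq2 = Eq(v2, 0 + gamma * v1)

sol = solve([eq1, eq2], [v1, v2])
print(sol)  # exact rational state values, no floating-point error
```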
[Original] Reinforcement Learning: Principles and Python Implementation (1)
Upgrade Python and pip before installing Gym. Upgrade pip: pip install --upgrade pip. Minimal install (Anaconda 3, administrator mode): pip install gym. Import the Gym library: import gym. Create an environment: env = gym.make('CartPole-v0'). Reset the environment: env.reset(). Advance the environment one step: env.step(). Render the current environment: env.render(). Close the environment: env.clos...
2021-09-14 22:09:45 639
Reinforcement learning source code (DP, MC, TD, DQN, PG, AC, A3C, DDPG).zip
2021-10-10