22.07.11
Unity ML-Agents
- PPO (Proximal Policy Optimization)
- SAC (Soft Actor-Critic)
- Curiosity
- MA-POCA (MultiAgent POsthumous Credit Assignment)
3D Ball example
C:\ml-agents-main\ml-agents-main\config\ppo
```yaml
behaviors:
  3DBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 64
      buffer_size: 12000
      learning_rate: 0.0003
      beta: 0.001
      epsilon: 0.2
      lambd: 0.99
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 128
      num_layers: 2
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 500000    # how many steps to train for
    time_horizon: 1000
    summary_freq: 12000  # how often (in steps) to report results
```
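With this config file in place, training is launched with the `mlagents-learn` CLI from the ml-agents checkout. The run ID below is an arbitrary example name, not from the original note:

```shell
# Run from the root of the ml-agents repository.
# --run-id names the output folder under results/; any name works.
mlagents-learn config/ppo/3DBall.yaml --run-id=3DBall_test
# When prompted, press Play in the Unity Editor to start training.
```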
- Episode: counted per Behavior
- Step: counted per Agent
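The hyperparameters above can be sanity-checked with quick arithmetic. This is a rough sketch; since ML-Agents counts steps across all agents, the actual number of updates and summaries during a run may differ slightly:

```python
# Rough arithmetic for the 3DBall PPO config above.
max_steps = 500_000    # total steps before training stops
buffer_size = 12_000   # experience collected before each PPO update
summary_freq = 12_000  # steps between logged summaries
batch_size = 64        # minibatch size per gradient step
num_epoch = 3          # passes over the buffer per update

updates = max_steps // buffer_size          # PPO update phases in the run
summaries = max_steps // summary_freq       # summary points in the run
minibatches_per_update = num_epoch * (buffer_size // batch_size)

print(updates, summaries, minibatches_per_update)  # prints: 41 41 561
```

So with these settings a full run performs roughly 41 policy updates, logs results about 41 times, and each update takes 561 gradient steps.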