policy dropout Attention-UNet
Policy-based PyTorch implementation for embedding bagging.
- Input
  - 4364-dim embedding
- Encoder
  - 124 x Attention-UNet with 62 heads
- Output
  - precision projection
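
The layout above can be captured as a small configuration object. The following is a minimal sketch in plain Python, not the repository's actual code: the class name `ModelConfig`, its field names, and the `head_dim` helper are hypothetical, and it assumes the 62 heads split the 4364-dim embedding directly. Under that assumption the split is uneven (4364 is not a multiple of 62), which standard multi-head attention such as `torch.nn.MultiheadAttention` would reject, so the real implementation may use a different internal width; that is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical container for the architecture values listed above."""
    embed_dim: int = 4364   # input embedding size
    num_blocks: int = 124   # number of Attention-UNet blocks in the encoder
    num_heads: int = 62     # attention heads per block

    def head_dim(self) -> Optional[int]:
        """Per-head width if embed_dim splits evenly across heads, else None."""
        if self.embed_dim % self.num_heads != 0:
            return None
        return self.embed_dim // self.num_heads


cfg = ModelConfig()
remainder = cfg.embed_dim % cfg.num_heads  # non-zero means an uneven split
print(cfg, "head_dim:", cfg.head_dim(), "remainder:", remainder)
```

Returning `None` instead of raising keeps the check usable as a quick sanity test before any model is built.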
Training config
optimizer=SGD, lr=0.358, scheduler=plateau, warmup=1392
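
As a rough illustration of how these values might combine, the sketch below computes a warmup learning rate in plain Python. It assumes that `warmup=1392` counts optimizer steps and that the ramp is linear up to `lr=0.358`; neither is stated in the config. The function name `warmup_lr` is hypothetical. The plateau scheduler (in PyTorch, typically `torch.optim.lr_scheduler.ReduceLROnPlateau`) would take over after warmup and is not modeled here.

```python
BASE_LR = 0.358       # lr from the training config
WARMUP_STEPS = 1392   # warmup from the training config (assumed to be steps)


def warmup_lr(step: int, base_lr: float = BASE_LR,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup (an assumption): ramp from base_lr/warmup_steps to base_lr."""
    if step < 0:
        raise ValueError("step must be non-negative")
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps


# Print the rate at a few points during and after warmup.
for step in (0, 695, 1391, 5000):
    print(step, warmup_lr(step))
```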