Rami Zouari ramizouari Tunisia I am a student at INSAT currently studying Software Engineering. I am an AI enthusiast, currently working scientifically on Machine Learning.

ramizouari/QIPAT 9

An image processing application & library built in C++20 and the Qt Framework.

Saief1999/big-data-pipeline 4

A pipeline using big data technologies

ramizouari/BNN 3

BNN is a library for binary neural networks, in which we will introduce state-of-the-art algorithms and approaches.

ramizouari/CPLibrary 3

Competitive Programming Library

ramizouari/Excellentia 3

A Platform for sharing computer science knowledge and improving problem solving

ramizouari/MachineLearning 3

My attempt to explore Machine Learning & Data Science

ramizouari/ArtificialIntelligence 2

This repository contains a list of academic problems that we solved using classical AI.

ramizouari/ChatRoom 2

a little chatroom

push event ramizouari/StochasticGames

ramizouari

commit sha 383c067519df9e1cbf0f54ad09109107c43693b8

Updating introduction

ramizouari

commit sha f76315bebaa608261f11c62c32716a9590282f05

Adding introductions and conclusions

push time in 21 days

push event ramizouari/StochasticGames

ramizouari

commit sha a5fee8d536b5aa496894bd7c2384cbd17b665a10

Chapter RL almost completed

ramizouari

commit sha ee404a550d0698921b276fb614eae85179d17978

Updating format to INSAT standard

push time in 23 days

push event ramizouari/StochasticGames

ramizouari

commit sha 129c6340398ddafeebc09d7d19fae4502cf1f72a

Adding acronym lists

push time in 24 days

push event ramizouari/StochasticGames

ramizouari

commit sha e55179b708c6ea43cf4fb8a3b3dc122af12330a9

Adding RL/SP chapter

push time in 25 days

push event ramizouari/StochasticGames

ramizouari

commit sha e0ce2fc26e398e747355e24a83bb12874125abc4

Adding Pipeline ardiagram

push time in a month

push event ramizouari/StochasticGames

ramizouari

commit sha 1979f22b44017a4531c69c7aa673ae0d611e1f0c

Refining proofs + A Adding class diagrams

push time in a month

push event ramizouari/StochasticGames

ramizouari

commit sha 7be204d14499567be7bb5a38de4b09bd2a59ec57

Major update to report

push time in a month

started facebookresearch/audiocraft

started time in 2 months

push event ramizouari/StochasticGames

ramizouari

commit sha 53ae1afa29680691b38529b622b81e01f5cc7d29

Updating open_spiel

push time in 3 months

push event ramizouari/open_spiel

ramizouari

commit sha b33d31714b174a4ea78440944a41275d922a1813

Fixing Evaluator save thread

push time in 3 months

push event ramizouari/StochasticGames

ramizouari

commit sha 85d1d0b059604c97e5e4cabda7ab2c98798856a1

Updating Report

push time in 3 months

push event ramizouari/StochasticGames

ramizouari

commit sha 7fd0832f66fd74d81162e37acaa189c9c07b7364

Adding report

push time in 3 months

push event ramizouari/open_spiel

ramizouari

commit sha 9c4d37cc1747ed948346b4d5871dd7248aeb871a

More robust RNG + Support for reseeding

push time in 3 months

push event ramizouari/StochasticGames

ramizouari

commit sha 9757fe57f6bcdba357c23411ee5e524ad51859a6

Updating open_spiel

push time in 3 months

push event ramizouari/open_spiel

ramizouari

commit sha 55684f1c4ecaa569c9ba31dc7150b4c75c9faf07

Adding payoff noise + view Replay Buffer infos

push time in 3 months

push event ramizouari/StochasticGames

ramizouari

commit sha 9f9f2dc5a85a0dfc6cc9f095d4c34de89c756b85

Update open_spiel

push time in 3 months

push event ramizouari/open_spiel

ramizouari

commit sha e41976de8b56452ff8d64e0af6026076d2060510

Adding service type to config

push time in 3 months

push event ramizouari/StochasticGames

ramizouari

commit sha 35dfd9ffa0b1913bb5ea62b4b6105284d090b324

Updating open_spiel

push time in 3 months

push event ramizouari/open_spiel

ramizouari

commit sha 38b0737d652c31f5d5236e9bd2022d7542e5b905

Adding template configuration

push time in 3 months

issue comment deepmind/open_spiel

Updating Alpha Zero

Hello @tewalds.

Thank you for your reply.

For the game that I am working on (Mean Payoff Game), I had to deploy training on an HPC cluster for faster trajectory generation. In my use case, trajectories were sent with Reverb, while model broadcasting and monitoring were done through an HTTP server on each service.

Also, the HTTP part is generic in the sense that it can be swapped for another protocol (or fall back to the multiprocessing queues that are used by default). One only has to change:

  1. Model broadcasting function
  2. The ReplayBuffer implementation (Local via queues / Reverb via gRPC, or a custom one)
  3. The model update part on the actors and evaluators.
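The three swap points listed above can be pictured as small interfaces. The Python sketch below is only an illustration under stated assumptions: `ReplayBuffer`, `LocalQueueReplayBuffer`, `ModelBroadcaster`, `InMemoryBroadcaster`, and `actor_step` are hypothetical names, not part of open_spiel or Reverb, and only a local queue backend is shown.

```python
import queue
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class ReplayBuffer(ABC):
    """Hypothetical transport-agnostic interface for trajectories."""

    @abstractmethod
    def add(self, trajectory: Any) -> None: ...

    @abstractmethod
    def take(self, max_items: int) -> List[Any]: ...


class LocalQueueReplayBuffer(ReplayBuffer):
    """Local backend built on a standard-library FIFO queue."""

    def __init__(self, capacity: int = 1024) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)

    def add(self, trajectory: Any) -> None:
        # Blocks when the buffer is full, which slows down the actors.
        self._queue.put(trajectory)

    def take(self, max_items: int) -> List[Any]:
        # Drains up to max_items trajectories in FIFO order without blocking.
        items: List[Any] = []
        while len(items) < max_items:
            try:
                items.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return items


class ModelBroadcaster(ABC):
    """Hypothetical interface for publishing and reading model versions."""

    @abstractmethod
    def publish(self, version: int, weights: bytes) -> None: ...

    @abstractmethod
    def latest(self) -> Tuple[int, bytes]: ...


class InMemoryBroadcaster(ModelBroadcaster):
    """Single-process stand-in; an HTTP backend would expose the same methods."""

    def __init__(self) -> None:
        self._version = 0
        self._weights = b""

    def publish(self, version: int, weights: bytes) -> None:
        self._version, self._weights = version, weights

    def latest(self) -> Tuple[int, bytes]:
        return self._version, self._weights


def actor_step(buffer: ReplayBuffer, broadcaster: ModelBroadcaster,
               current_version: int, trajectory: Any) -> int:
    """Pushes one trajectory and returns the model version to use next."""
    version, _weights = broadcaster.latest()
    buffer.add(trajectory)
    return version if version > current_version else current_version
```

With this shape, moving from local queues to another transport would mean adding a new subclass rather than touching the actor loop.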

I had to switch to Python for that part due to the lack of documentation for Reverb's C++ implementation, but of course it is doable in C++. I will need to contact the Reverb team for more insight into their C++ code.

Another limitation of the C++ part: I have still not found the correct format for calling the fit function from C++. I was only able to do inference. I was also not able to load individual checkpoints, but that can be mitigated by simply loading the whole SavedModel bundle on each update.

On the other hand, assuming the Reverb problem in C++ is resolved, what I can do is implement the learner in Python and the actors and evaluators in C++, and have them communicate using, for example, HTTP + Reverb.
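To make the HTTP half of that split concrete, here is a minimal Python-only sketch using the standard library. It assumes a design in which the learner exposes the current model version at a `/model` endpoint and actors poll it. The endpoint name, the `MODEL_STATE` fields, and the helper functions are all hypothetical, and the Reverb side is not shown.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical state held by the learner: model version and where it is saved.
MODEL_STATE = {"version": 0, "path": ""}


class ModelInfoHandler(BaseHTTPRequestHandler):
    """Serves the current model metadata as JSON on GET /model."""

    def do_GET(self):
        if self.path != "/model":
            self.send_error(404)
            return
        body = json.dumps(MODEL_STATE).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real service would log requests


def start_learner_server(host="127.0.0.1", port=0):
    """Starts the server in a daemon thread; port 0 picks a free port."""
    server = HTTPServer((host, port), ModelInfoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_model_info(base_url, timeout=5.0):
    """Actor side: asks the learner which model version is current."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/model", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

An actor written in another language would only need to issue the same GET request and parse the JSON reply, then reload its model when the version changes.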

Now, as that will constitute a big code addition, I think it would be best to split it up and submit one PR at a time. I will start with the TF2 update.

And as a performance measure, could you please tell me which games the new implementation should be able to learn?

ramizouari

comment created time in 3 months
