Diffstat (limited to 'benchmark/README.md')
-rw-r--r-- | benchmark/README.md | 25 |
1 files changed, 25 insertions, 0 deletions
diff --git a/benchmark/README.md b/benchmark/README.md
new file mode 100644
index 000000000..28d02ed28
--- /dev/null
+++ b/benchmark/README.md
@@ -0,0 +1,25 @@
+# Auto-GPT Benchmarks
+
+Built for the purpose of benchmarking the performance of agents regardless of how they work.
+
+Objectively know how well your agent is performing in categories like code, retrieval, memory, and safety.
+
+Save time and money while doing it through smart dependencies. The best part? It's all automated.
+
+## Scores:
+
+<img width="733" alt="Screenshot 2023-07-25 at 10 35 01 AM" src="https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/assets/9652976/98963e0b-18b9-4b17-9a6a-4d3e4418af70">
+
+## Ranking overall:
+
+- 1- [Beebot](https://github.com/AutoPackAI/beebot)
+- 2- [mini-agi](https://github.com/muellerberndt/mini-agi)
+- 3- [Auto-GPT](https://github.com/Significant-Gravitas/AutoGPT)
+
+## Detailed results:
+
+<img width="733" alt="Screenshot 2023-07-25 at 10 42 15 AM" src="https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/assets/9652976/39be464c-c842-4437-b28a-07d878542a83">
+
+[Click here to see the results and the raw data!](https://docs.google.com/spreadsheets/d/1WXm16P2AHNbKpkOI0LYBpcsGG0O7D8HYTG5Uj0PaJjA/edit#gid=203558751)!
+
+More agents coming soon !