Diffstat (limited to 'benchmark/README.md')
-rw-r--r-- benchmark/README.md | 25
1 files changed, 25 insertions, 0 deletions
diff --git a/benchmark/README.md b/benchmark/README.md
new file mode 100644
index 000000000..28d02ed28
--- /dev/null
+++ b/benchmark/README.md
@@ -0,0 +1,25 @@
+# Auto-GPT Benchmarks
+
+Built to benchmark the performance of agents, regardless of how they work.
+
+Get an objective measure of how well your agent performs in categories such as code, retrieval, memory, and safety.
+
+Save time and money while doing it through smart dependencies. The best part? It's all automated.
+
+## Scores:
+
+<img width="733" alt="Screenshot 2023-07-25 at 10 35 01 AM" src="https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/assets/9652976/98963e0b-18b9-4b17-9a6a-4d3e4418af70">
+
+## Ranking overall:
+
+1. [Beebot](https://github.com/AutoPackAI/beebot)
+2. [mini-agi](https://github.com/muellerberndt/mini-agi)
+3. [Auto-GPT](https://github.com/Significant-Gravitas/AutoGPT)
+
+## Detailed results:
+
+<img width="733" alt="Screenshot 2023-07-25 at 10 42 15 AM" src="https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/assets/9652976/39be464c-c842-4437-b28a-07d878542a83">
+
+[Click here to see the results and the raw data!](https://docs.google.com/spreadsheets/d/1WXm16P2AHNbKpkOI0LYBpcsGG0O7D8HYTG5Uj0PaJjA/edit#gid=203558751)
+
+More agents coming soon!