path: root/.github
Age | Commit message | Author | Files | Lines
11 days | Remove arena (#7134) | Swifty | 2 | -173/+0
* remove arena
* refactor: Remove Arena intake workflow
* Remove all mention of the arena
* remove evo.ninja
2024-04-10 | fix(ci): Disable annoying "PR too big" auto-message | Reinier van der Leer | 1 | -0/+1
2024-03-22 | ci(agent): Add macOS on M1 to AutoGPT CI matrix (#7041) | Reinier van der Leer | 1 | -2/+2
Use a `macos-14` runner to cover macOS on M1/arm64
- Add `macos-arm64` to `platform-os` matrix, and map it to `macos-14` runner
2024-03-22 | ci(agent): Disable Python dependency caching on Windows | Reinier van der Leer | 1 | -1/+3
On Windows, unpacking cached dependencies takes longer than just installing them with `poetry install`. :')
2024-03-22 | ci(agent): Fix Python dependency caching on macOS | Reinier van der Leer | 1 | -1/+1
2024-03-22 | ci(agent): Fix Docker CI for PR runs from forks (vol. 2) | Reinier van der Leer | 1 | -1/+1
- Fix docker image tag format error when `secrets.DOCKER_USER` is not set
2024-03-22 | ci(agent): Fix Docker CI for PR runs from forks | Reinier van der Leer | 1 | -1/+2
- Disable 'Log in to Docker hub' step for `pull_request` runs
2024-03-21 | ci(agent): Matrix CI tests across Linux, macOS and Windows (#7029) | Reinier van der Leer | 1 | -17/+59
* Matrix the AutoGPT Python CI's `test` job across Ubuntu, macOS and Windows
  - Set up MinIO in a step rather than specifying it under `jobs[test].services`, because services are only supported on Linux runners
  - Add Windows version of step to install Poetry
  - Add macOS compatibility patches to 'Install Poetry (Unix)' and `setup_git_auth` steps

  **Caveats:**
  - **No Docker on macOS or Windows**
    * Windows comes with Docker but only supports running Windows containers, while we're mainly interested in using Linux containers for code execution and/or running auxiliary services.
    * [The macOS runner doesn't come with Docker](https://github.com/actions/runner-images/issues/17). Setting it up is possible but takes ~3-4 minutes, and the performance of the Colima engine is poor: a `docker pull` that takes 2 seconds on Linux takes 45 seconds on macOS.
  - **No S3 service available on Windows**
    It seems that running a background process [isn't possible on Windows](https://github.com/actions/runner/issues/598#issuecomment-2011890429), and neither is running Linux-based Docker containers.
* Add `autogpt-agent` and OS-specific flags to Codecov upload step
* Improve caching of Python dependencies in CI by changing the cache key
  - Include hash of `poetry.lock` instead of `pyproject.toml` in key
  - Remove date component from key; it was included to avoid getting stuck to old cached versions of packages when we were still using `requirements.txt`. With `poetry.lock` that is no longer a concern.
* Fix skip check in test_s3_file_storage.py
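The cache-key change described in this commit can be illustrated with a minimal Python sketch. It is not the workflow's actual configuration: the `cache_key` helper, the `poetry-` key prefix, and the placeholder lockfile contents are assumptions made for illustration. It only shows why hashing the lockfile makes a date component unnecessary.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(runner_os: str, lockfile: Path) -> str:
    """Build a cache key from the runner OS and a hash of the lockfile contents.

    Hypothetical helper: the real workflow computes its key in YAML.
    """
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    # No date component: the key changes only when the lockfile changes.
    return f"poetry-{runner_os}-{digest}"


# Demo with a throwaway lockfile (contents are placeholders, not real lock data).
with tempfile.TemporaryDirectory() as tmp:
    lock = Path(tmp) / "poetry.lock"
    lock.write_text("package-a 1.0.0\n")
    key_before = cache_key("Linux", lock)
    key_same = cache_key("Linux", lock)

    lock.write_text("package-a 1.0.1\n")  # simulate a dependency bump
    key_after = cache_key("Linux", lock)
```

An unchanged lockfile reuses the same key on every run, while any change to the pinned dependencies produces a new key and therefore a fresh cache entry.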
2024-03-01 | ci: Disable annoying auto-message discouraging big PRs | Reinier van der Leer | 1 | -3/+0
2024-03-01 | fix(ci/arena): Fix requesting manual review | Reinier van der Leer | 1 | -1/+2
Three times the charm, right?
2024-03-01 | fix(ci/arena): Fix requesting manual review | Reinier van der Leer | 1 | -1/+1
2024-03-01 | fix(ci/arena): Fix requesting manual review | Reinier van der Leer | 1 | -1/+1
2024-03-01 | fix(ci/arena): Skip checking file against itself for duplicates | Reinier van der Leer | 1 | -6/+7
2024-03-01 | fix(ci/arena): Improve output format | Reinier van der Leer | 1 | -5/+6
2024-03-01 | fix(ci/arena): Reverse check for `pr.mergeable` | Reinier van der Leer | 1 | -2/+2
2024-03-01 | fix(ci/arena): Make check for `pr.mergeable` more specific | Reinier van der Leer | 1 | -1/+1
2024-03-01 | fix(ci/arena): Fix error accessing `context` & improve log output readability | Reinier van der Leer | 1 | -20/+20
2024-03-01 | fix(ci/arena): Fix syntax & formatting errors | Reinier van der Leer | 1 | -6/+6
2024-03-01 | feat(ci/arena): Add logging and debug output to workflow script | Reinier van der Leer | 1 | -0/+26
2024-03-01 | ci(arena): Fix `arena-intake` workflow | Reinier van der Leer | 1 | -4/+4
Sorry folks, it's been a while since I wrote javascript :')
2024-03-01 | ci(arena): Fix `arena-intake` workflow | Reinier van der Leer | 1 | -15/+22
2024-03-01 | ci: Add 'Arena intake' workflow to automatically check 'entering the arena' PRs | Reinier van der Leer | 1 | -0/+133
2024-02-29 | ci: Auto-label PRs based on the scope of their diff | Reinier van der Leer | 2 | -0/+34
2024-02-22 | Update CODEOWNERS | Reinier van der Leer | 1 | -5/+5
2024-02-21 | fix(ci/frontend): Add trigger on `push` including workflow file | Reinier van der Leer | 1 | -0/+1
2024-02-21 | fix(ci/frontend): Add and fix trigger on workflow file | Reinier van der Leer | 1 | -1/+1
2024-02-21 | ci: Revise Frontend CI | Reinier van der Leer | 2 | -46/+59
- Rename build-frontend.yml to frontend-ci.yml
- Add a `pull_request` trigger
- Disable committing and pushing to a `frontend_build_{hash}` branch
- (Re)enable auto-creating a pull request for the new frontend build
2024-02-20 | fix(ci/benchmark): Install benchmark dependencies | Reinier van der Leer | 1 | -0/+2
Otherwise `poetry -C benchmark run benchmark/reports/format.py` fails.
2024-02-20 | fix(ci/benchmark): Specify poetry env path for report conversion step | Reinier van der Leer | 1 | -1/+1
2024-02-20 | fix(ci/benchmark): Unbreak "Push reports to data branch" step | Reinier van der Leer | 1 | -1/+2
The `report_subfolder` variable was being populated with two identical lines, because there will be two untracked files in the folder, resulting in the same dirname. This caused later commands using that variable to fail. Fix is to `sort -u` before storing the value to `report_subfolder`.
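The duplicate-dirname problem and its `sort -u` fix can be mirrored in a few lines of Python. This is a sketch under assumptions: the `report_subfolders` helper and the file names are hypothetical, and the actual workflow step is a shell pipeline rather than Python.

```python
from pathlib import PurePosixPath


def report_subfolders(untracked_files: list[str]) -> list[str]:
    """Return the unique parent folders of the given paths, sorted (like `sort -u`)."""
    return sorted({str(PurePosixPath(p).parent) for p in untracked_files})


# Two untracked files in the same (hypothetical) report folder share one dirname,
# so without deduplication the variable would hold two identical lines.
folders = report_subfolders([
    "reports/run_1/report.json",
    "reports/run_1/radar_chart.png",
])
```

With the set-based deduplication, `folders` holds a single entry, which is what the later commands in that step expect.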
2024-02-19 | feat(ci/benchmark): Generate step summary from benchmark report | Reinier van der Leer | 1 | -0/+12
2024-02-17 | fix(ci/benchmark): Mitigate VCS conflicts with files in data branch | Reinier van der Leer | 1 | -0/+3
`agbenchmark` currently creates files like success_rate.json in the base REPORTS_FOLDER, which causes conflicts in the last step of the benchmark workflow. To prevent issues, these files must be removed prior to switching to the data branch.
2024-02-17 | fix(ci/benchmark): Add `set +e` because we expect (some) challenges to fail | Reinier van der Leer | 1 | -0/+2
2024-02-17 | ci: Allow telemetry for non-push events, as long as it's on `master` | Reinier van der Leer | 5 | -8/+3
Also disable telemetry for AutoGPT's unit/integration tests.
2024-02-17 | ci: Fix setting/passing `TELEMETRY_*` environment variables | Reinier van der Leer | 4 | -17/+11
2024-02-17 | ci: Update actions to newest versions | Reinier van der Leer | 10 | -36/+38
- `actions/stale` -> `v9`
- `actions/cache` -> `v4`
- `actions/checkout` -> `v4`
- `actions/setup-node` -> `v4`
- `docker/login-action` -> `v3`
- `actions/setup-python` -> `v5`
- `codecov/codecov-action` -> `v4`
- `actions/upload-artifact` -> `v4`
- `subosito/flutter-action` -> `v2`
- `docker/build-push-action` -> `v5`
- `docker/setup-buildx-action` -> `v3`
2024-02-17 | fix(ci/benchmark): Allow workflow to continue regardless of challenge outcomes | Reinier van der Leer | 1 | -0/+7
2024-02-16 | Rename autogpts-benchmark-nightly.yml to autogpts-benchmark.yml | Reinier van der Leer | 1 | -0/+0
2024-02-16 | ci(benchmark): Add nightly benchmark workflow | Reinier van der Leer | 1 | -0/+71
Added autogpts-benchmark-nightly.yml, which will run every night at 02:00 UTC with a selection of challenges.
2024-02-15 | feat(benchmark): Make report output folder configurable | Reinier van der Leer | 1 | -1/+1
- Make `AgentBenchmarkConfig.reports_folder` directly configurable (through `REPORTS_FOLDER` env variable). The default is still `./agbenchmark_config/reports`.
- Change all mentions of `REPORT_LOCATION` (which fulfilled the same function at some point in the past) to `REPORTS_FOLDER`.
2024-02-14 | ci: Enable telemetry in CI runs on `master` | Reinier van der Leer | 4 | -0/+13
2024-02-14 | ci: Pick 3 challenges to run with `--mock` in smoke test CI | Reinier van der Leer | 1 | -1/+1
2024-02-12 | ci(agent): Add `GIT_REVISION` label to Docker builds | Reinier van der Leer | 3 | -1/+3
2024-01-02 | AGBenchmark codebase clean-up (#6650) | Reinier van der Leer | 1 | -1/+1
* refactor(benchmark): Deduplicate configuration loading logic
  - Move the configuration loading logic to a separate `load_agbenchmark_config` function in `agbenchmark/config.py` module.
  - Replace the duplicate loading logic in `conftest.py`, `generate_test.py`, `ReportManager.py`, `reports.py`, and `__main__.py` with calls to `load_agbenchmark_config` function.
* fix(benchmark): Fix type errors, linting errors, and clean up CLI validation in __main__.py
  - Fixed type errors and linting errors in `__main__.py`
  - Improved the readability of CLI argument validation by introducing a separate function for it
* refactor(benchmark): Lint and typefix app.py
  - Rearranged and cleaned up import statements
  - Fixed type errors caused by improper use of `psutil` objects
  - Simplified a number of `os.path` usages by converting to `pathlib`
  - Use `Task` and `TaskRequestBody` classes from `agent_protocol_client` instead of `.schema`
* refactor(benchmark): Replace `.agent_protocol_client` by `agent-protocol-client`, clean up schema.py
  - Remove `agbenchmark.agent_protocol_client` (an offline copy of `agent-protocol-client`).
  - Add `agent-protocol-client` as a dependency and change imports to `agent_protocol_client`.
  - Fix type annotation on `agent_api_interface.py::upload_artifacts` (`ApiClient` -> `AgentApi`).
  - Remove all unused types from schema.py (= most of them).
* refactor(benchmark): Use pathlib in agent_interface.py and agent_api_interface.py
* refactor(benchmark): Improve typing, response validation, and readability in app.py
  - Simplified response generation by leveraging type checking and conversion by FastAPI.
  - Introduced use of `HTTPException` for error responses.
  - Improved naming, formatting, and typing in `app.py::create_evaluation`.
  - Updated the docstring on `app.py::create_agent_task`.
  - Fixed return type annotations of `create_single_test` and `create_challenge` in generate_test.py.
  - Added default values to optional attributes on models in report_types_v2.py.
  - Removed unused imports in `generate_test.py`
* refactor(benchmark): Clean up logging and print statements
  - Introduced use of the `logging` library for unified logging and better readability.
  - Converted most print statements to use `logger.debug`, `logger.warning`, and `logger.error`.
  - Improved descriptiveness of log statements.
  - Removed unnecessary print statements.
  - Added log statements to unspecific and non-verbose `except` blocks.
  - Added `--debug` flag, which sets the log level to `DEBUG` and enables a more comprehensive log format.
  - Added `.utils.logging` module with `configure_logging` function to easily configure the logging library.
  - Converted raw escape sequences in `.utils.challenge` to use `colorama`.
  - Renamed `generate_test.py::generate_tests` to `load_challenges`.
* refactor(benchmark): Remove unused server.py and agent_interface.py::run_agent
  - Remove unused server.py file
  - Remove unused run_agent function from agent_interface.py
* refactor(benchmark): Clean up conftest.py
  - Fix and add type annotations
  - Rewrite docstrings
  - Disable or remove unused code
  - Fix definition of arguments and their types in `pytest_addoption`
* refactor(benchmark): Clean up generate_test.py file
  - Refactored the `create_single_test` function for clarity and readability
  - Removed unused variables
  - Made creation of `Challenge` subclasses more straightforward
  - Made bare `except` more specific
  - Renamed `Challenge.setup_challenge` method to `run_challenge`
  - Updated type hints and annotations
  - Made minor code/readability improvements in `load_challenges`
  - Added a helper function `_add_challenge_to_module` for attaching a Challenge class to the current module
* fix(benchmark): Fix and add type annotations in execute_sub_process.py
* refactor(benchmark): Simplify const determination in agent_interface.py
  - Simplify the logic that determines the value of `HELICONE_GRAPHQL_LOGS`
* fix(benchmark): Register category markers to prevent warnings
  - Use the `pytest_configure` hook to register the known challenge categories as markers. Otherwise, Pytest will raise "unknown marker" warnings at runtime.
* refactor(benchmark/challenges): Fix indentation in 4_revenue_retrieval_2/data.json
* refactor(benchmark): Update agent_api_interface.py
  - Add type annotations to `copy_agent_artifacts_into_temp_folder` function
  - Add note about broken endpoint in the `agent_protocol_client` library
  - Remove unused variable in `run_api_agent` function
  - Improve readability and resolve linting error
* feat(benchmark): Improve and centralize pathfinding
  - Search path hierarchy for applicable `agbenchmark_config`, rather than assuming it's in the current folder.
  - Create `agbenchmark.utils.path_manager` with `AGBenchmarkPathManager` and exporting a `PATH_MANAGER` const.
  - Replace path constants defined in __main__.py with usages of `PATH_MANAGER`.
* feat(benchmark/cli): Clean up and improve CLI
  - Updated commands, options, and their descriptions to be more intuitive and consistent
  - Moved slow imports into the entrypoints that use them to speed up application startup
  - Fixed type hints to match output types of Click options
  - Hid deprecated `agbenchmark start` command
  - Refactored code to improve readability and maintainability
  - Moved main entrypoint into `run` subcommand
  - Fixed `version` and `serve` subcommands
  - Added `click-default-group` package to allow using `run` implicitly (for backwards compatibility)
  - Renamed `--no_dep` to `--no-dep` for consistency
  - Fixed string formatting issues in log statements
* refactor(benchmark/config): Move AgentBenchmarkConfig and related functions to config.py
  - Move the `AgentBenchmarkConfig` class from `utils/data_types.py` to `config.py`.
  - Extract the `calculate_info_test_path` function from `utils/data_types.py` and move it to `config.py` as a private helper function `_calculate_info_test_path`.
  - Move `load_agent_benchmark_config()` to `AgentBenchmarkConfig.load()`.
  - Changed simple getter methods on `AgentBenchmarkConfig` to calculated properties.
  - Update all code references according to the changes mentioned above.
* refactor(benchmark): Fix ReportManager init parameter types and use pathlib
  - Fix the type annotation of the `benchmark_start_time` parameter in `ReportManager.__init__`, was mistyped as `str` instead of `datetime`.
  - Change the type of the `filename` parameter in the `ReportManager.__init__` method from `str` to `Path`.
  - Rename `self.filename` with `self.report_file` in `ReportManager`.
  - Change the way the report file is created, opened and saved to use the `Path` object.
* refactor(benchmark): Improve typing surrounding ChallengeData and clean up its implementation
  - Use `ChallengeData` objects instead of untyped `dict` in app.py, generate_test.py, reports.py.
  - Remove unnecessary methods `serialize`, `get_data`, `get_json_from_path`, `deserialize` from `ChallengeData` class.
  - Remove unused methods `challenge_from_datum` and `challenge_from_test_data` from `ChallengeData` class.
  - Update function signatures and annotations of `create_challenge` and `generate_single_test` functions in generate_test.py.
  - Add types to function signatures of `generate_single_call_report` and `finalize_reports` in reports.py.
  - Remove unnecessary `challenge_data` parameter (in generate_test.py) and fixture (in conftest.py).
* refactor(benchmark): Clean up generate_test.py, conftest.py and __main__.py
  - Cleaned up generate_test.py and conftest.py
  - Consolidated challenge creation logic in the `Challenge` class itself, most notably the new `Challenge.from_challenge_spec` method.
  - Moved challenge selection logic from generate_test.py to the `pytest_collection_modifyitems` hook in conftest.py.
  - Converted methods in the `Challenge` class to class methods where appropriate.
  - Improved argument handling in the `run_benchmark` function in `__main__.py`.
* refactor(benchmark/config): Merge AGBenchmarkPathManager into AgentBenchmarkConfig and reduce fragmented/global state
  - Merge the functionality of `AGBenchmarkPathManager` into `AgentBenchmarkConfig` to consolidate the configuration management.
  - Remove the `.path_manager` module containing `AGBenchmarkPathManager`.
  - Pass the `AgentBenchmarkConfig` and its attributes through function arguments to reduce global state and improve code clarity.
* feat(benchmark/serve): Configurable port for `serve` subcommand
  - Added `--port` option to `serve` subcommand to allow for specifying the port to run the API on.
  - If no `--port` option is provided, the port will default to the value specified in the `PORT` environment variable, or 8080 if not set.
* feat(benchmark/cli): Add `config` subcommand
  - Added a new subcommand `config` to the AGBenchmark CLI, to display information about the present AGBenchmark config.
* fix(benchmark): Gracefully handle incompatible challenge spec files in app.py
  - Added a check to skip deprecated challenges
  - Added logging to allow debugging of the loading process
  - Added handling of validation errors when parsing challenge spec files
  - Added missing `spec_file` attribute to `ChallengeData`
* refactor(benchmark): Move `run_benchmark` entrypoint to main.py, use it in `/reports` endpoint
  - Move `run_benchmark` and `validate_args` from __main__.py to main.py
  - Replace agbenchmark subprocess in `app.py:run_single_test` with `run_benchmark`
  - Move `get_unique_categories` from __main__.py to challenges/__init__.py
  - Move `OPTIONAL_CATEGORIES` from __main__.py to challenge.py
  - Reduce operations on updates.json (including `initialize_updates_file`) outside of API
* refactor(benchmark): Remove unused `/updates` endpoint and all related code
  - Remove `updates_json_file` attribute from `AgentBenchmarkConfig`
  - Remove `get_updates` and `_initialize_updates_file` in app.py
  - Remove `append_updates_file` and `create_update_json` functions in agent_api_interface.py
  - Remove call to `append_updates_file` in challenge.py
* refactor(benchmark/config): Clean up and update docstrings on `AgentBenchmarkConfig`
  - Add and update docstrings
  - Change base class from `BaseModel` to `BaseSettings`, allow extras for backwards compatibility
  - Make naming of path attributes on `AgentBenchmarkConfig` more consistent
  - Remove unused `agent_home_directory` attribute
  - Remove unused `workspace` attribute
* fix(benchmark): Restore mechanism to select (optional) categories in agent benchmark config
* fix(benchmark): Update agent-protocol-client to v1.1.0
  - Fixes issue with fetching task artifact listings
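The port fallback described for the `serve` subcommand (explicit `--port`, then the `PORT` environment variable, then 8080) can be sketched as follows. The `resolve_port` function is a hypothetical illustration of that precedence, not the CLI's actual code, which the commit message does not show.

```python
from typing import Mapping, Optional


def resolve_port(cli_port: Optional[int], env: Mapping[str, str]) -> int:
    """Pick the API port: explicit option first, then PORT, then 8080.

    Hypothetical helper; `cli_port` stands in for the value of `--port`.
    """
    if cli_port is not None:
        return cli_port
    # Environment values are strings, so convert; 8080 is the documented default.
    return int(env.get("PORT", 8080))


explicit = resolve_port(9000, {"PORT": "8000"})  # option wins over environment
from_env = resolve_port(None, {"PORT": "8000"})  # falls back to PORT
default = resolve_port(None, {})                 # falls back to 8080
```

Passing the environment in as a mapping, rather than reading `os.environ` directly, keeps the precedence easy to test; a real entrypoint would presumably pass `os.environ`.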
2023-12-13 | ci: Fix docker release workflow | Reinier van der Leer | 1 | -1/+3
- Update autogpt-docker-release.yml to correctly sanitize image tags
- This unbreaks the release workflow
2023-12-11 | ci/cd: Strip `autogpt-` from tag name for Docker release | Reinier van der Leer | 1 | -1/+1
2023-12-11 | ci/cd: Only run AutoGPT Docker Release workflow on releases linked to `autogpt-*` tag | Reinier van der Leer | 1 | -0/+1
- Add a condition to the job in autogpt-docker-release.yml to only run on `refs/tags/autogpt-`
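Taken together, this commit and the tag-stripping commit listed just above it amount to a prefix check followed by a prefix strip. The sketch below shows that logic in Python under stated assumptions: `docker_release_tag` is a hypothetical name, the example refs are made up, and the real workflow expresses this in GitHub Actions YAML rather than Python.

```python
from typing import Optional

# Prefix named in the commit message.
TAG_PREFIX = "refs/tags/autogpt-"


def docker_release_tag(git_ref: str) -> Optional[str]:
    """Return the Docker tag for a release ref, or None if the job should be skipped."""
    if not git_ref.startswith(TAG_PREFIX):
        return None  # release is not linked to an `autogpt-*` tag
    # Strip everything up to and including `autogpt-` from the tag name.
    return git_ref[len(TAG_PREFIX):]


# Example refs are illustrative only.
released = docker_release_tag("refs/tags/autogpt-v0.5.0")
skipped = docker_release_tag("refs/tags/agbenchmark-v0.0.10")
```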
2023-12-07 | feat(agent/workspace): Add GCS and S3 FileWorkspace providers (#6485) | Reinier van der Leer | 2 | -11/+34
* refactor: Rename FileWorkspace to LocalFileWorkspace and create FileWorkspace abstract class
  - Rename `FileWorkspace` to `LocalFileWorkspace` to provide a more descriptive name for the class that represents a file workspace that works with local files.
  - Create a new base class `FileWorkspace` to serve as the parent class for `LocalFileWorkspace`. This allows for easier extension and customization of file workspaces in the future.
  - Update import statements and references to `FileWorkspace` throughout the codebase to use the new naming conventions.
* feat: Add S3FileWorkspace + tests + test setups for CI and Docker
  - Added S3FileWorkspace class to provide an interface for interacting with a file workspace and storing files in an S3 bucket.
  - Updated pyproject.toml to include dependencies for boto3 and boto3-stubs.
  - Implemented unit tests for S3FileWorkspace.
  - Added MinIO service to Docker CI to allow testing S3 features in CI.
  - Added autogpt-test service config to docker-compose.yml for local testing with MinIO.
* ci(docker): tee test output instead of capturing
* fix: Improve error handling in S3FileWorkspace.initialize()
  - Do not tolerate all `botocore.exceptions.ClientError`s
  - Raise the exception anyways if the error is not "NoSuchBucket"
* feat: Add S3 workspace backend support and S3Credentials
  - Added support for S3 workspace backend in the Autogpt configuration
  - Added a new sub-config `S3Credentials` to store S3 credentials
  - Modified the `.env.template` file to include variables related to S3 credentials
  - Added a new `s3_credentials` attribute on the `Config` class to store S3 credentials
  - Moved the `unmasked` method from `ModelProviderCredentials` to the parent `ProviderCredentials` class to handle unmasking for S3 credentials
* fix(agent/tests): Fix S3FileWorkspace initialization in test_s3_file_workspace.py
  - Update the S3FileWorkspace initialization in the test_s3_file_workspace.py file to include the required S3 Credentials.
* refactor: Remove S3Credentials and add get_workspace function
  - Remove `S3Credentials` as boto3 will fetch the config from the environment by itself
  - Add `get_workspace` function in `autogpt.file_workspace` module
  - Update `.env.template` and tests to reflect the changes
* feat(agent/workspace): Make agent workspace backend configurable
  - Modified `autogpt.file_workspace.get_workspace` function to either take a workspace `id` or `root_path`.
  - Modified `FileWorkspaceMixin` to use the `get_workspace` function to set up the workspace.
  - Updated the type hints and imports accordingly.
* feat(agent/workspace): Add GCSFileWorkspace for Google Cloud Storage
  - Added support for Google Cloud Storage as a storage backend option in the workspace.
  - Created the `GCSFileWorkspace` class to interface with a file workspace stored in a Google Cloud Storage bucket.
  - Implemented the `GCSFileWorkspaceConfiguration` class to handle the configuration for Google Cloud Storage workspaces.
  - Updated the `get_workspace` function to include the option to use Google Cloud Storage as a workspace backend.
  - Added unit tests for the new `GCSFileWorkspace` class.
* fix: Unbreak use of non-local workspaces in AgentProtocolServer
  - Modify the `_get_task_agent_file_workspace` method to handle both local and non-local workspaces correctly
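The error-handling fix for `S3FileWorkspace.initialize()` in this commit (tolerate only "NoSuchBucket", re-raise everything else) follows a common pattern, sketched below. To stay runnable without boto3, it uses a stand-in exception: `FakeClientError`, `ensure_bucket`, and the callables passed in are all hypothetical. Only the `response["Error"]["Code"]` shape is borrowed from `botocore.exceptions.ClientError`.

```python
from typing import Callable


class FakeClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError (hypothetical, for illustration)."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        # Same nesting as the real exception's `response` attribute.
        self.response = {"Error": {"Code": code}}


def ensure_bucket(check_bucket: Callable[[], None],
                  create_bucket: Callable[[], None]) -> str:
    """Create the bucket only when the check fails with NoSuchBucket."""
    try:
        check_bucket()
        return "exists"
    except FakeClientError as e:
        if e.response["Error"]["Code"] != "NoSuchBucket":
            raise  # any other client error is not tolerated
        create_bucket()
        return "created"


def missing_bucket() -> None:
    raise FakeClientError("NoSuchBucket")


def access_denied() -> None:
    raise FakeClientError("AccessDenied")


created: list[str] = []
outcome = ensure_bucket(missing_bucket, lambda: created.append("bucket"))
```

A permissions error such as the `access_denied` case propagates to the caller instead of being silently treated as a missing bucket.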
2023-12-03 | ci: Fix issue in Docker CI | Reinier van der Leer | 1 | -1/+2
* Stop Docker CI pushing images from PR workflow runs
2023-12-02 | refactor: Disable mypy and autoflake in CI and pre-commit hook | Reinier van der Leer | 1 | -8/+8
- Commented out the mypy and autoflake checks in the CI workflow and pre-commit config files.
- The intention is to enable the mypy check again later.